So after dealing with some of my issues (with others still open), I have one set of data that seems to be working: I can run select count(1) on it just fine.
However, when I try to run select count(distinct field1) from table, I get:

"Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes" Fragment 3:0

OK, this sort of makes sense: there must be a schema change somewhere in the data. This data isn't huge; it's 455 JSON files taking up around 16 GB of storage space.

So how do I troubleshoot this? The field I am trying to get a distinct count on is an IP, so my first attempt was to run select count(1) from table where field1 like '1%' or field1 like '2%' (and so on, all the way up to 9). The thought process here is that IPs must start with 1-9, at least the way they are output here. I get the same count as with select count(1) from table (no where clause), so the field I am aggregating on is not the one that's changing (I don't think).

So as a Driller, how do I troubleshoot this scenario? Is there a way to outline which files have different schemas? Any other tricks?

Appreciate any thoughts here!

John
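
P.S. To make the "which files have different schemas" part concrete, here is a rough sketch of the kind of check I have in mind, done outside of Drill in Python. It assumes each file holds one JSON object per line and that the files sit in a single directory with a .json extension; both are assumptions about the layout, and the function names (type_name, file_schema, find_schema_conflicts) are made up for illustration. It only flags top-level fields that show up with more than one JSON type. It does not look inside nested maps, and it does not report fields that are simply missing from some files.

```python
import json
from collections import defaultdict
from pathlib import Path


def type_name(value):
    """Map a decoded JSON value to a coarse type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "list"
    return "map"


def file_schema(path):
    """Return {field: set of type labels} for one file.

    Assumes one JSON object per line (an assumption about the data layout).
    """
    schema = defaultdict(set)
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            for field, value in record.items():
                schema[field].add(type_name(value))
    return dict(schema)


def find_schema_conflicts(directory):
    """Return {field: {type label: [file names]}} for fields seen with 2+ types."""
    seen = defaultdict(lambda: defaultdict(set))  # field -> type label -> file names
    for path in sorted(Path(directory).glob("*.json")):
        for field, types in file_schema(path).items():
            for label in types:
                seen[field][label].add(path.name)
    return {
        field: {label: sorted(files) for label, files in by_type.items()}
        for field, by_type in seen.items()
        if len(by_type) > 1
    }
```

Calling find_schema_conflicts on the data directory would return, for each conflicting field, the type labels it was seen with and the files each one appeared in, so a field that is a string in most files but a number or null in a few should stand out. Reading 16 GB line by line this way is slow but needs little memory.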
