Ok, this has nothing to do with multiple files either. Selecting from a single file produces a schema change. (Who would have thunk it)
0: jdbc:drill:zk=local> select type, count(*) from dfs.asa.`/streaming/venuepoint/events/2016-03-28` as s group by type;
Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Fragment 0:0

[Error Id: ad4aa637-e0c5-46fe-b074-814005bf8024 on swift:31010] (state=,code=0)

On Mon, Mar 28, 2016 at 8:51 PM, Stefán Baxter (JIRA) <[email protected]> wrote:

> Stefán Baxter created DRILL-4548:
> ------------------------------------
>
>              Summary: Drill can not select from multiple Avro files
>                  Key: DRILL-4548
>                  URL: https://issues.apache.org/jira/browse/DRILL-4548
>              Project: Apache Drill
>           Issue Type: Bug
>           Components: Storage - Avro
>    Affects Versions: 1.6.0, 1.7.0
>             Reporter: Stefán Baxter
>
>
> Hi,
>
> I have reworked/refactored our Avro-based logging system, trying to make
> the whole Drill + Avro->Parquet experience a bit more agreeable.
>
> Long story short, I'm getting this error when selecting from multiple Avro
> files even though these files share the EXACT same schema:
>
> Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes
> Fragment 0:0
> [Error Id: 00d49aa2-5564-497e-a330-e852d5889beb on swift:31010]
> (state=,code=0)
>
> We are using union types, but only to allow for null values, which seems to
> be supported by Drill as per this comment in the Drill code:
> // currently supporting only nullable union (optional fields) like
> ["null", "some-type"].
>
> This happens for a very simple group by + count(*) query that uses only
> two fields in Avro. Neither of them uses a union construct, and both of
> them contain string values in every case.
>
> I now think this has nothing to do with the union types, since the query
> uses only simple strings, unless there is a full schema validation done on
> the content of the files rather than on the identical Avro schema embedded
> in both files.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
