Andre, I don't believe that QueryRecord supports RecordPath expressions as "column names" (but someone should correct me if I'm mistaken :) I think it takes a tabular view of the record, meaning each top-level field is a "column" for the purposes of querying.
For JSON, the JsonPathReader lets you "hoist" fields from a nested record into what looks like top-level fields from the record API perspective. For Avro, there is no current AvroPathReader, which IIRC is partly due to the fact that there is no AvroPath implementation in the Avro project [1]. However the Jira case mentions an Apache-licensed library for doing such things, and certainly RecordPathCompiler handles it from the Record API side. I think the kicker is how to expose fields for projections (i.e. SELECTs); for arbitrarily nested records, that's a lot of "columns" and/or non-standard SQL. It can be done (Apache Drill does); but from what I remember, Drill has some crafted Calcite voodoo, where QueryRecord leverages Calcite as a library. A workaround could use UpdateRecord to manually hoist the field to the top-level by adding a new field to the outgoing schema at the top-level, and in UpdateRecord setting its value to the evaluation of your RecordPath "/header/deviceEventClassId". Then you could use QueryRecord on that "column". Regards, Matt [1] https://issues.apache.org/jira/browse/AVRO-992 On Mon, Jul 3, 2017 at 10:07 PM, Andre <[email protected]> wrote: > All, > > What is the expected to refer to nested avro element from within QueryRecord > SQL queries? > > When using LookupRecord I may refer to a record child element by the > RecordPath "/header/deviceEventClassId", however, it is not clear what > syntax should I use with QueryRecord? > > SELECT * FROM FLOWFILE WHERE "/header/deviceEventClassId" <> "1" > > Fails with column not found.
