Hi Paul,

Regarding your point "We can also handle a map projection: `a.b` which matches:

* A (possibly repeated) map
* A (possibly repeated) DICT with VARCHAR keys
* A UNION (because a union might contain a possibly-repeated map)
* A LIST (because the list can contain a union which might contain a possibly-repeated map)":

I am not sure why `a.b` is possible for a REPEATED MAP; it looks like a shortcut of some sort. I mean, it looks wrong with respect to data types, doesn't it? Consider an example in Java, pretending that `Map<String, Integer>` represents Drill's MAP: `Map<String, Integer>[] a = ...; Object result = a.get("b");` does not yield an array of `Integer` (an array has no `get` method at all). But this notation could have been an alias for some 'function', like `Integer[] array = collect(a, "b")`. This does not work for REPEATED MAP in Drill currently, though such behaviour is present in Hive. (I am not saying it is wrong to support this for a REPEATED MAP; it may be useful.) In the case of REPEATED DICT we _may_ choose not to support such a "shortcut", and instead provide UDFs with the needed functionality.

Regarding using keys in a filter: I think it is a good idea to provide UDFs for such needs. Hive, for example, has the following functions for (Hive's) MAP [1] (see "Collection Functions"):

* `array<K> map_keys(Map<K.V>)`
* `array<V> map_values(Map<K.V>)`

But yes, we must treat projections as generally as possible until the real schema is known, and this is a hard task.

[1] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-OperatorsonComplexTypes
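
To make the `collect` idea above more concrete, here is a minimal plain-Java sketch of what such a function could do for a repeated map, together with rough analogues of Hive's `map_keys` and `map_values`. Everything here is an assumption for illustration only: the class name `RepeatedMapSketch` and the methods `collect`, `mapKeys` and `mapValues` are hypothetical, a `List` of maps stands in for a REPEATED MAP, and none of this is Drill's or Hive's actual implementation. Returning `null` for a missing key is also just one possible choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RepeatedMapSketch {

    // Hypothetical stand-in for `a.b` on a REPEATED MAP: take the value stored
    // under `key` from every map; a map without the key contributes null.
    static List<Integer> collect(List<Map<String, Integer>> maps, String key) {
        List<Integer> values = new ArrayList<>();
        for (Map<String, Integer> map : maps) {
            values.add(map.get(key));
        }
        return values;
    }

    // Plain-Java analogue of Hive's map_keys; order follows the map's iteration order.
    static <K, V> List<K> mapKeys(Map<K, V> map) {
        return new ArrayList<>(map.keySet());
    }

    // Plain-Java analogue of Hive's map_values; order follows the map's iteration order.
    static <K, V> List<V> mapValues(Map<K, V> map) {
        return new ArrayList<>(map.values());
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new LinkedHashMap<>();
        first.put("b", 1);
        first.put("c", 2);
        Map<String, Integer> second = new LinkedHashMap<>();
        second.put("c", 3); // no "b" key in this element

        List<Map<String, Integer>> a = new ArrayList<>();
        a.add(first);
        a.add(second);

        System.out.println(collect(a, "b"));   // [1, null]
        System.out.println(mapKeys(first));    // [b, c]
        System.out.println(mapValues(first));  // [1, 2]
    }
}
```

The point of the sketch is that the "shortcut" changes the result type: projecting `b` from one map gives an `Integer`, while projecting it from a repeated map gives a list of them, which is why an explicit function (or UDF) may be the clearer contract.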