GitHub user paul-rogers commented on the issue:
https://github.com/apache/drill/pull/1170
BTW: thanks for tackling such a difficult, core issue in Drill. Drill
claims to be a) schema-free and b) SQL-compliant. SQL is based on operations
over relations with a fixed number of columns of fixed types. Reconciling these
two ideas is very difficult. Even the original Drill developers, who built a
huge amount of code very quickly and had intimate knowledge of the Drill
internals, did not find a good solution, which is why the problem is still
open.
There are two obvious approaches: 1) redefine SQL to operate over lists of
maps (with arbitrary name/value pairs that differ across rows), or 2) define
translation rules from schema-free input into the schema-full relations that
SQL requires.
This PR attempts to go down the first route: redefine SQL. To be
successful, we'd want to rely on research papers, if any, that show how to
reformulate relational theory on top of lists of maps rather than on relations
and domains.
The other approach is to define conversion rules: something much closer to
a straightforward implementation project. Can the user provide
conversion rules (in the form of a schema) when the conversion is ambiguous?
Would users rather encounter schema change exceptions or provide the conversion
rules? These are interesting open questions.
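To make the second approach a bit more concrete, here is a minimal Java sketch
of what user-supplied conversion rules might look like. Everything in it
(`ConversionRulesSketch`, `ColumnRule`, `toRelation`, and the stand-in
`SchemaChangeException`) is hypothetical and is not Drill's actual API. It
assumes the schema is an ordered map from column name to a rule, that missing
columns become null, and that columns not named in the schema are dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only: none of these names are Drill APIs.
final class ConversionRulesSketch {

    // Stand-in for the schema change exception a query would otherwise hit.
    static final class SchemaChangeException extends RuntimeException {
        SchemaChangeException(String message) { super(message); }
    }

    // One user-supplied rule: the declared column type plus a converter
    // that maps whatever value appears in the input to that type.
    static final class ColumnRule {
        final Class<?> type;
        final Function<Object, Object> converter;

        ColumnRule(Class<?> type, Function<Object, Object> converter) {
            this.type = type;
            this.converter = converter;
        }
    }

    // Converts schema-free rows (maps) into fixed-width tuples whose column
    // order and types follow the schema. The schema map must preserve
    // insertion order (for example, a LinkedHashMap).
    static List<Object[]> toRelation(List<Map<String, Object>> rows,
                                     Map<String, ColumnRule> schema) {
        List<Object[]> relation = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Object[] tuple = new Object[schema.size()];
            int i = 0;
            for (Map.Entry<String, ColumnRule> column : schema.entrySet()) {
                Object raw = row.get(column.getKey());
                tuple[i++] = convert(column.getKey(), column.getValue(), raw);
            }
            relation.add(tuple);
        }
        return relation;
    }

    // Applies one rule; a value the rule cannot handle surfaces as a
    // schema change error instead of silently changing the column type.
    private static Object convert(String name, ColumnRule rule, Object raw) {
        if (raw == null) {
            return null; // missing column: fill with null
        }
        Object value;
        try {
            value = rule.converter.apply(raw);
        } catch (RuntimeException e) {
            throw new SchemaChangeException(
                "Column " + name + ": cannot convert " + raw);
        }
        if (value != null && !rule.type.isInstance(value)) {
            throw new SchemaChangeException(
                "Column " + name + ": expected " + rule.type.getSimpleName());
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, ColumnRule> schema = new LinkedHashMap<>();
        schema.put("id", new ColumnRule(Long.class, v -> ((Number) v).longValue()));
        schema.put("name", new ColumnRule(String.class, Object::toString));

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("id", 1);          // Integer in this row
        first.put("name", "a");
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("id", 2.0);       // Double in this row, "name" is missing

        for (Object[] tuple : toRelation(Arrays.asList(first, second), schema)) {
            System.out.println(Arrays.toString(tuple));
        }
    }
}
```

With rules like these, the ambiguous cases (the same column arriving as an
Integer in one row and a Double in the next, or missing entirely) are resolved
by the user's schema rather than by a runtime failure. Input that the rules do
not cover still raises the schema change error.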
---