Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/1170

BTW: thanks for tackling such a difficult, core issue in Drill. Drill claims to be a) schema-free and b) SQL compliant. SQL is based on operations over relations with a fixed number of columns of fixed types. Reconciling these two ideas is very difficult. Even the original Drill developers, who built a huge amount of code very quickly and had intimate knowledge of the Drill internals, did not find a good solution, which is why the problem is still open.

There are two obvious approaches: 1) redefine SQL to operate over lists of maps (with arbitrary name/value pairs that differ across rows), or 2) define translation rules from schema-free input into the schema-full relations that SQL requires.

This PR attempts to go down the first route: redefine SQL. To be successful, we'd want to rely on research papers, if any, that show how to reformulate relational theory on top of lists of maps rather than on relations and domains.

The other approach is to define conversion rules: something much closer to a straightforward implementation project. Can the user provide conversion rules (in the form of a schema) when the conversion is ambiguous? Would users rather encounter schema change exceptions or provide the conversion rules? These are interesting open questions.
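
To make the second approach concrete, here is a minimal sketch of what user-provided conversion rules could look like. It is only an illustration, not Drill code: the `to_relation` function and the shape of the schema (column name mapped to a converter and a default) are assumptions invented for this example.

```python
from typing import Any, Callable

# Hypothetical user-supplied schema: column name -> (converter, default).
# This is an illustration only; it is not how Drill represents schemas.
Schema = dict[str, tuple[Callable[[Any], Any], Any]]


def to_relation(rows: list[dict[str, Any]], schema: Schema) -> list[tuple]:
    """Project schema-free maps onto a fixed list of typed columns."""
    relation = []
    for row in rows:
        out = []
        for name, (convert, default) in schema.items():
            value = row.get(name)
            # Missing or null field: use the declared default instead of
            # raising a schema change error.
            out.append(default if value is None else convert(value))
        # Fields not named in the schema (such as "note") are dropped.
        relation.append(tuple(out))
    return relation


# Rows whose fields and types differ from one row to the next.
rows = [
    {"id": 1, "price": "9.50"},
    {"id": "2", "price": 3, "note": "extra"},
    {"id": 3},
]
schema = {"id": (int, None), "price": (float, None)}
relation = to_relation(rows, schema)
```

With these rules every output row has the same two columns, so the ambiguity (string or number, present or absent) is resolved by the user's schema rather than by the engine guessing or failing.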
---