So we have two incompatible SQL parser configs in beam. One is in
BeamQueryPlanner
<https://github.com/apache/beam/blob/8ef71b6eb1d2d5c63974ec506a01faf3813efe74/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamQueryPlanner.java#L95>
 which is used by default and a second in BeamSqlParser
<https://github.com/apache/beam/blob/598774738e7a1236cf30f70a584311cee52d1818/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/parser/BeamSqlParser.java#L33>
which is used only in the BeamSqlCli execute path (not the explain path).
There are also a bunch of 'toLowerCase()' calls scattered around our code.
I'd like to get us on one parser config and remove the need for toLowerCase
calls.

To do this, I am proposing we standardize these all to go through
BeamQueryPlanner, use the Calcite Lex.JAVA config, and drop the
'toLowerCase()' calls. This will result in the parser preserving case,
being case sensitive, and using backticks for quoted identifiers. This is
the same as the default config in Apache Flink and is roughly compatible
with BigQuery. It effectively leaves the default path unchanged, except
case will now be preserved and checked consistently. The BeamSqlCli execute
path will remain unchanged for unquoted queries with all lower case names,
which is what we have tested. Comments? Objections?

Andrew

Reply via email to