I've adapted Calcite's EnumerableCalc code generation to generate the
BeamCalc DoFn. The primary purpose of this change is to take
advantage of Calcite's extensive SQL operator implementation. This deletes
~11000 lines of code from Beam (with ~350 added), significantly increases
the set of supported SQL operators, and improves performance and
correctness of currently supported operators. Here is my work in progress:
https://github.com/apache/beam/pull/6417

There are a few bugs in Calcite that this has exposed:

Fixed in Calcite master:

   - CALCITE-2321 <https://issues.apache.org/jira/browse/CALCITE-2321> -
   The type of a union of CHAR columns of different lengths should be VARCHAR
   - CALCITE-2447 <https://issues.apache.org/jira/browse/CALCITE-2447> -
   Some POWER, ATAN2 functions fail with NoSuchMethodException

Pending PRs:

   - CALCITE-2529 <https://issues.apache.org/jira/browse/CALCITE-2529> -
   linq4j should promote integer to floating point when generating function
   calls
   - CALCITE-2530 <https://issues.apache.org/jira/browse/CALCITE-2530> -
   TRIM function does not throw exception when the length of trim character
   is not 1(one)

More work:

   - CALCITE-2404 <https://issues.apache.org/jira/browse/CALCITE-2404> -
   Accessing structured-types is not implemented by the runtime
   - (none yet) - Support multi-character TRIM extension in Calcite

I would like to push these changes in despite these minor regressions. Do any
of these Calcite bugs block this functionality being added to Beam?

Andrew
