Hi devs,

I'm working on improving Storm SQL to make it useful for actual use cases.

I saw some comments [1] from Julian and Milinda in the early stage of Storm
SQL which stated that having Storm physical algebra would be better than
operating on Calcite logical algebra.

Could you shed a light on understanding what that comments meaning? I'd
like to get some resources, documents, and recommendations. I'm reading the
code of Samza SQL planner but would also like to read some documents for
that.

Thanks in advance for your support.

Explaining some backgrounds on current Storm SQL:
Storm SQL relies on Trident which is a high-level abstraction on
"micro-batch" transactional API. Trident supports streaming computations
like functions, filters, grouping, aggregations, joins so Calcite logical
algebra could be directly matched to Trident topology. Because of that we
still rely on converting calcite logical plan to Trident topology [2], but
it generates lots of nodes (operators) which should be optimized [3].
Trident itself will optimize them for merging nodes into one execution
component - Bolt - but it doesn't address the SQL optimizations like
pushdown filter.

Thanks!
Jungtaek Lim (HeartSaVioR)

[1]
https://issues.apache.org/jira/browse/STORM-1040?focusedCommentId=15036651&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15036651
[2]
https://github.com/apache/storm/blob/master/external/sql/storm-sql-core/src/jvm/org/apache/storm/sql/compiler/backends/trident/TridentLogicalPlanCompiler.java
[3]
https://docs.google.com/spreadsheets/d/1tTyIxbNafvVw0tQjWjiBQCKDCPbbnfbT5JZ7F5JEBNE

Reply via email to