[ https://issues.apache.org/jira/browse/CALCITE-963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932891#comment-16932891 ]
Scott Reynolds commented on CALCITE-963: ---------------------------------------- {quote} * I was surprised that HoistedVariables was added to so many API calls. Maybe it could belong to another piece of the preparation context{quote} I was tempted to add it to {{DataContext}}. It does have to land into {{CalciteSignature}} {code:java} public CalciteSignature(String sql, List<AvaticaParameter> parameterList, Map<String, Object> internalParameters, RelDataType rowType, List<ColumnMetaData> columns, Meta.CursorFactory cursorFactory, CalciteSchema rootSchema, List<RelCollation> collationList, long maxRowCount, Bindable<T> bindable, Meta.StatementType statementType, HoistedVariables variables) { super(columns, sql, parameterList, internalParameters, cursorFactory, statementType); this.rowType = rowType; this.rootSchema = rootSchema; this.collationList = collationList; this.maxRowCount = maxRowCount; this.bindable = bindable; this.variables = variables; } public Enumerable<T> enumerable(DataContext dataContext) { Enumerable<T> enumerable = bindable.bind(dataContext, variables); if (maxRowCount >= 0) { // Apply limit. In JDBC 0 means "no limit". But for us, -1 means // "no limit", and 0 is a valid limit. enumerable = EnumerableDefaults.take(enumerable, maxRowCount); } return enumerable; } {code} {quote}I was surprised that this work touched so much Enumerable code. An alternative approach would be to transform a RelNode tree, early in the planning process, transforming some RexLiteral instances into RexDynamicParam. The rest of the planning process would proceed as if the user had provided a statement with bind variables. I know you state that this was a non-goal, but why was it not a goal? It probably would have been a lot simpler, and it would have worked with conventions besides Enumerable.{quote} Ya I should add color to this. Given the discussions on the mailing list I would like to add an optimization to our use case. Our Fact Tables contain a single account hierarchy represented by a column {{sub_account}}. Often a Fact will not be associated with an {{sub_account}} and so instead of storing that as null and dealing with missing right hand row, we store a sentinel value in the Fact so our joins to the account dimension are simpler. So when a query comes in that for all Facts that do not contain a sub account, we want to change the cost calculation of that filter as {{sub_account_sid == 'IS_NULL'}} is not as selective as {{sub_account_sid = 'AC128485'}}. So my understanding is, by translating into a {{RexDynamicParam}}, we lose out on this optimization. > Hoist literals > -------------- > > Key: CALCITE-963 > URL: https://issues.apache.org/jira/browse/CALCITE-963 > Project: Calcite > Issue Type: Bug > Reporter: Julian Hyde > Priority: Major > Labels: pull-request-available > Attachments: HoistedVariables.png > > Time Spent: 10m > Remaining Estimate: 0h > > Convert literals into (internal) bind variables so that statements that > differ only in literal values can be executed using the same plan. > In [mail > thread|http://mail-archives.apache.org/mod_mbox/calcite-dev/201511.mbox/%3c56437bf8.70...@gmail.com%3E] > Homer wrote: > {quote}Imagine that it is common to run a large number of very similar > machine generated queries that just change the literals in the sql query. > For example (the real queries would be much more complex): > {code}Select * from emp where empno = 1; > Select * from emp where empno = 2; > etc.{code} > The plan that is likely being generated for these kind of queries is going to > be very much the same each time, so to save some time, I would like to > recognize that the literals are all that have changed in a query and use the > previously optimized execution plan and just replace the literals.{quote} > I think this could be done as a transform on the initial RelNode tree. It > would find literals (RexLiteral), replace them with bind variables > (RexDynamicParam) and write the value into a pool. The next statement would > go through the same process and the RelNode tree would be identical, but with > possibly different values for the bind variables. > The bind variables are of course internal; not visible from JDBC. When the > statement is executed, the bind variables are implicitly bound. > Statements would be held in a Guava cache. > This would be enabled by a config parameter. Unfortunately I don't think we > could do this by default -- we'd lose optimization power because we would no > longer be able to do constant reduction. -- This message was sent by Atlassian Jira (v8.3.4#803005)