[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358916#comment-16358916 ]

James Taylor commented on PHOENIX-3941:
---------------------------------------

FYI, to get this patch to apply to 5.x, [~elserj], I had to also apply PHOENIX-4342. Hopefully this is ok, but it would be good to do an inventory of any other missing patches for 5.x.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch,
> PHOENIX-3941_v3.patch, PHOENIX-3941_v4.patch
>
>
> Had a good offline conversation with [~ndimiduk] at PhoenixCon about local
> indexes. Depending on the query, we can often times prune the regions we need
> to scan over based on the where conditions against the data table pk. For
> example, with a multi-tenant table, we only need to scan the regions that are
> prefixed by the tenant ID.
> We can easily get this information from the compilation of the query against
> the data table (which we always do), through the
> statementContext.getScanRanges() structure. We'd just want to keep a pointer
> to the data table QueryPlan from the local index QueryPlan.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
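A minimal sketch of the pruning idea from the issue description, in Java. This is not the PHOENIX-3941 patch: `Region` and `LocalIndexRegionPruner` are hypothetical stand-ins, keys are plain strings rather than HBase byte arrays, and the scan range is passed in by hand instead of being read from statementContext.getScanRanges(). It only illustrates how a leading pk range on the data table can rule out regions before any local index scan is issued.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a region covering keys in [startKey, endKey); "" means unbounded. */
final class Region {
    final String startKey;
    final String endKey;

    Region(String startKey, String endKey) {
        this.startKey = startKey;
        this.endKey = endKey;
    }

    @Override
    public String toString() {
        return "[" + startKey + "," + endKey + ")";
    }
}

/** Hypothetical helper, not part of Phoenix. */
final class LocalIndexRegionPruner {
    /**
     * Keeps only the regions whose key span overlaps the data table scan
     * range [lower, upper). An empty bound means "unbounded" on that side.
     */
    static List<Region> prune(List<Region> regions, String lower, String upper) {
        List<Region> kept = new ArrayList<>();
        for (Region r : regions) {
            // Region ends at or before the range starts (end key is exclusive).
            boolean endsBeforeRange = !r.endKey.isEmpty() && !lower.isEmpty()
                    && r.endKey.compareTo(lower) <= 0;
            // Region starts at or after the range ends (upper bound is exclusive).
            boolean startsAfterRange = !upper.isEmpty()
                    && r.startKey.compareTo(upper) >= 0;
            if (!endsBeforeRange && !startsAfterRange) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region("", "tenantB"),
                new Region("tenantB", "tenantC"),
                new Region("tenantC", "tenantM"),
                new Region("tenantM", ""));
        // Assumed example: a leading pk filter tenant_id = 'tenantC'
        // expressed as the key range [tenantC, tenantD).
        System.out.println(prune(regions, "tenantC", "tenantD"));
    }
}
```

In this made-up layout only the region starting at tenantC overlaps the tenant's key range, so the other three regions would not need to be scanned for the local index.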
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356803#comment-16356803 ]

Hudson commented on PHOENIX-3941:
---------------------------------

FAILURE: Integrated in Jenkins build Phoenix-master #1929 (See [https://builds.apache.org/job/Phoenix-master/1929/])
PHOENIX-3941 Filter regions to scan for local indexes based on data (jtaylor: rev ba8bcefc9a5472365c1ca95a242c7abb19c0d27b)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/PostDDLCompiler.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/ScanRanges.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/execute/ScanPlan.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionlessQueryServicesImpl.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/ExplainPlan.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/execute/LiteralResultIterationPlan.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/SerialIterators.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/ExplainTable.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/DeleteCompiler.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/execute/BaseQueryPlan.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/query/ParallelIteratorsSplitTest.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/JoinCompiler.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/compile/QueryCompilerTest.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/iterate/ParallelIterators.java
* (add) phoenix-core/src/test/java/org/apache/phoenix/query/KeyRangeClipTest.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/execute/AggregatePlan.java

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch,
> PHOENIX-3941_v3.patch, PHOENIX-3941_v4.patch
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356568#comment-16356568 ]

James Taylor commented on PHOENIX-3941:
---------------------------------------

Attached final patch (with unit test fix for case where columns are in common but there aren't that many matching filters).

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch,
> PHOENIX-3941_v3.patch, PHOENIX-3941_v4.patch
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355037#comment-16355037 ]

Thomas D'Silva commented on PHOENIX-3941:
-----------------------------------------

+1

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch,
> PHOENIX-3941_v3.patch
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354654#comment-16354654 ]

Thomas D'Silva commented on PHOENIX-3941:
-----------------------------------------

Sure I can review it.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch,
> PHOENIX-3941_v3.patch
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354295#comment-16354295 ]

James Taylor commented on PHOENIX-3941:
---------------------------------------

[~tdsilva] - would you have some spare cycles to review? The only case we're not handling is when a join uses a local index. That follow up work will be done in PHOENIX-4585.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch,
> PHOENIX-3941_v3.patch
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354161#comment-16354161 ]

James Taylor commented on PHOENIX-3941:
---------------------------------------

Thanks, [~maryannxue]. I've filed PHOENIX-4585 for this follow up work. It's fine for PHOENIX-1556 to go in first.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351017#comment-16351017 ]

Maryann Xue commented on PHOENIX-3941:
--------------------------------------

[~jamestaylor], Unlike single queries, join queries are optimized through {{JoinCompiler#optimize()}}, which in turn calls {{QueryOptimizer#optimize()}} for each join table to find a *local* optimal plan. So by the time we compile the join query, we are already working on an index-replaced query. A straightforward solution might be to have the {{JoinCompiler#optimize()}} method return a map from tableRef to dataPlan, which QueryCompiler can use later on to fill in the information. Would you like me to do this part of the job? If yes, would you mind waiting for PHOENIX-1556 to get in first?

BTW, finding a local optimum for each of the join tables was the best we could do when we started and there was no stats info, but now that we have stats and the cost model ready, we are able to find a globally optimal plan for join queries. However, the compile time could start to explode as the number of join tables, or the number of indices for each table, goes up. It would also require quite a bit of work, but it's something we can keep in mind.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch
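One way to picture the suggestion in the comment above, as a rough Java sketch. Every name here (`PlanSketch`, `JoinOptimizeSketch`, `optimize`, `localIndexFor`) is hypothetical and does not correspond to the real JoinCompiler or QueryOptimizer signatures, and table references are reduced to plain strings. It only shows the shape of the idea: while each join table is swapped for its index plan, the original data table plan is recorded in a map so that a later compile step can still look it up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a compiled plan over a single table or index. */
final class PlanSketch {
    final String tableName;

    PlanSketch(String tableName) {
        this.tableName = tableName;
    }
}

/** Hypothetical illustration only; not the Phoenix JoinCompiler. */
final class JoinOptimizeSketch {
    /** The plan chosen per join table, plus the data table plan it replaced. */
    static final class Result {
        final List<PlanSketch> chosenPlans = new ArrayList<>();
        // Key: table scanned by the chosen plan. Value: the original data table plan.
        final Map<String, PlanSketch> dataPlanByTable = new LinkedHashMap<>();
    }

    /**
     * For each join table, picks the index plan if one is registered in
     * localIndexFor, and remembers the data table plan so it is not lost
     * once the query has been index-replaced.
     */
    static Result optimize(List<PlanSketch> dataPlans, Map<String, String> localIndexFor) {
        Result result = new Result();
        for (PlanSketch dataPlan : dataPlans) {
            String indexName = localIndexFor.get(dataPlan.tableName);
            PlanSketch chosen = (indexName == null) ? dataPlan : new PlanSketch(indexName);
            result.chosenPlans.add(chosen);
            result.dataPlanByTable.put(chosen.tableName, dataPlan);
        }
        return result;
    }
}
```

A caller compiling the join could then use `dataPlanByTable` to recover the data table plan for any index-replaced table, which is the information the v2 patch was losing for joins.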
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350902#comment-16350902 ]

James Taylor commented on PHOENIX-3941:
---------------------------------------

[~maryannxue] - would you mind taking a look at this v2 patch? I'm trying to keep the data plan with the query plan used for an index (so we can potentially prune local index regions when there are leading PK columns in common between the data table and index table). For joins, I'm losing the QueryCompiler.dataPlan along the way and I'm not sure how to fix it.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-3941_v1.patch, PHOENIX-3941_v2.patch
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257497#comment-16257497 ]

Lars Hofhansl commented on PHOENIX-3941:
----------------------------------------

Let's get this in for 4.14. Without this LOCAL INDEX is somewhat incomplete as read time is then bound by the slowest region of the table! This would narrow that to the slowest region with relevant data.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>              Labels: SFDC, localIndex
>             Fix For: 4.14.0
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061662#comment-16061662 ]

Lars Hofhansl commented on PHOENIX-3941:
----------------------------------------

Great idea. We can pre-prune by any prefix of the key in fact.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>              Labels: SFDC, localIndex
>             Fix For: 4.12.0
[jira] [Commented] (PHOENIX-3941) Filter regions to scan for local indexes based on data table leading pk filter conditions
[ https://issues.apache.org/jira/browse/PHOENIX-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049784#comment-16049784 ]

James Taylor commented on PHOENIX-3941:
---------------------------------------

FYI, [~lhofhansl] & [~rajeshbabu] - I think this will often help prevent large numbers of RPCs over big tables that use local indexes.

> Filter regions to scan for local indexes based on data table leading pk
> filter conditions
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-3941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3941
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>              Labels: SFDC, localIndex
>             Fix For: 4.12.0