[ https://issues.apache.org/jira/browse/PHOENIX-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei updated PHOENIX-5389: ------------------------------ Attachment: PHOENIX-5389_v1-4.x-HBase-1.4.patch > Push down PostFilter to Sub-JoinTable for SortMergeJoin and NoStarJoin > ---------------------------------------------------------------------- > > Key: PHOENIX-5389 > URL: https://issues.apache.org/jira/browse/PHOENIX-5389 > Project: Phoenix > Issue Type: Improvement > Affects Versions: 4.14.2 > Reporter: chenglei > Assignee: chenglei > Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: PHOENIX-5389_v1-4.x-HBase-1.4.patch > > > Given following tables: > {code:java} > create table order_table > (order_id varchar(15) not null primary key, > customer_id varchar(10), > item_id varchar(10), > price integer, > quantity integer, > date timestamp) > create table item_table > (item_id varchar(10) not null primary key, > name varchar, > price integer, > discount1 integer, > discount2 integer, > supplier_id varchar(10), > description varchar) > create table supplier_table > (supplier_id varchar(10) not null primary key, > name varchar, > phone varchar(12), > address varchar, > loc_id varchar(5)) > {code} > for following sql : > {code:java} > select /*+ USE_SORT_MERGE_JOIN */ o.order_id,i.item_id, s.supplier_id, > s.name from > supplier_table s inner join item_table i on s.supplier_id = i.supplier_id > inner join order_table o on i.item_id = o.item_id > where s.address = 'hai' or i.supplier_id != 'medi' > {code} > When using SortMergeJoin, the above join is composited as: > {{(supplier_table inner join item_table) inner join order_table}}, which > doing the {{supplier_table s inner join item_table}} first, and then join > {{order_table}}. > Obviously, the where condition {{s.address = 'hai' or i.supplier_id != > 'medi'}} could be pushed down to the first join {{(supplier_table inner > join item_table)}},because it only references the columns of > {{supplier_table}} and {{item_table}}. > but unfortunately, this where condition {{s.address = 'hai' or i.supplier_id > != 'medi'}} does not be pushed down as expected, and it becomes the > {{postFilter}} of the whole join {{(supplier_table s inner join item_table i) > inner join order_table}}. Of course, it is very ineffiecient. > This problem is caused by the following code line 431 in > QueryCompiler.compileJoinQuery: > {code:java} > 430 case SORT_MERGE: { > 431 JoinTable lhsJoin = > joinTable.getSubJoinTableWithoutPostFilters(); > 432 JoinSpec lastJoinSpec = joinSpecs.get(joinSpecs.size() - > 1); > 433 JoinType type = lastJoinSpec.getType(); > 434 JoinTable rhsJoin = lastJoinSpec.getJoinTable(); > {code} > At the beginning of SORT_MERGE, above line 431 create a new JoinTable for > {{(supplier_table inner join item_table)}} using following > {{JoinTable.getSubJoinTableWithoutPostFilters()}} method: > {code:java} > 441 public JoinTable getSubJoinTableWithoutPostFilters() { > 442 return joinSpecs.size() > 1 ? new JoinTable(table, > joinSpecs.subList(0, joinSpecs.size() - 1)) : > 443 new JoinTable(table); > 444 } > {code} > {{JoinTable.getSubJoinTableWithoutPostFilters()}} only creates the new > JoinTable and does not try to push down the where condition {{s.address = > 'hai' or i.supplier_id != 'medi'}} to the new created JoinTable, so cause > this problem. -- This message was sent by Atlassian JIRA (v7.6.14#76016)