[ https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936248#comment-13936248 ]
Harish Butani commented on HIVE-6668:
-------------------------------------

Looks right to me, you have to use the original work object because the clone creates new Operator objects.
Ran the test with the patch, confirmed that the ConditionalResolver resolves to a mapJoin.

Navis thanks for jumping on this.

> When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6668
>                 URL: https://issues.apache.org/jira/browse/HIVE-6668
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0, 0.14.0
>            Reporter: Yin Huai
>            Assignee: Navis
>            Priority: Blocker
>             Fix For: 0.13.0
>
>         Attachments: HIVE-6668.1.patch.txt
>
>
> I tried the following query today ...
> {code:sql}
> set mapred.job.map.memory.mb=2048;
> set mapred.job.reduce.memory.mb=2048;
> set mapred.map.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true;
> set mapred.reduce.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true;
> set mapred.reduce.tasks=60;
> set hive.stats.autogather=false;
> set hive.exec.parallel=false;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.map.aggr=true;
> set hive.optimize.bucketmapjoin=true;
> set hive.optimize.bucketmapjoin.sortedmerge=true;
> set hive.mapred.reduce.tasks.speculative.execution=false;
> set hive.auto.convert.join=true;
> set hive.auto.convert.sortmerge.join=true;
> set hive.auto.convert.sortmerge.join.noconditionaltask=false;
> set hive.auto.convert.join.noconditionaltask=false;
> set hive.auto.convert.join.noconditionaltask.size=100000000;
> set hive.optimize.reducededuplication=true;
> set hive.optimize.reducededuplication.min.reducer=1;
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> set hive.mapjoin.smalltable.filesize=45000000;
> set hive.optimize.index.filter=false;
> set hive.vectorized.execution.enabled=false;
> set hive.optimize.correlation=false;
> select
>   i_item_id,
>   s_state,
>   avg(ss_quantity) agg1,
>   avg(ss_list_price) agg2,
>   avg(ss_coupon_amt) agg3,
>   avg(ss_sales_price) agg4
> FROM store_sales
> JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
> JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
> JOIN customer_demographics on (store_sales.ss_cdemo_sk = customer_demographics.cd_demo_sk)
> JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
> where
>   cd_gender = 'F' and
>   cd_marital_status = 'U' and
>   cd_education_status = 'Primary' and
>   d_year = 2002 and
>   s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
> group by i_item_id, s_state with rollup
> order by
>   i_item_id,
>   s_state
> limit 100;
> {code}
> The log shows ...
> {code}
> 14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve driver alias (threshold : 45000000, length mapping : {store=94175, store_sales=48713909726, item=39798667, customer_demographics=1660831, date_dim=2275902})
> Stage-27 is filtered out by condition resolver.
> 14/03/14 17:05:02 INFO exec.Task: Stage-27 is filtered out by condition resolver.
> Stage-28 is filtered out by condition resolver.
> 14/03/14 17:05:02 INFO exec.Task: Stage-28 is filtered out by condition resolver.
> Stage-3 is selected by condition resolver.
> {code}
> Stage-3 is a reduce join. Actually, the resolver should pick the map join
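
To see why the reporter expected a map join, the size check can be reproduced by hand from the numbers in the log. The following is only a simplified sketch, not Hive's ConditionalResolverCommonJoin code: the class name `DriverAliasSketch` and the method `pickDriverAlias` are hypothetical, and it assumes all five aliases take part in a single join and that an alias qualifies as the driver (big table) when the combined size of every other alias does not exceed the threshold. The threshold value matches the `hive.mapjoin.smalltable.filesize` setting in the report.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; this is not Hive's resolver implementation.
public class DriverAliasSketch {

    /**
     * Returns an alias that could stay as the streamed (big) table: one for
     * which the combined size of all other aliases is within the threshold.
     * If several aliases qualify, the largest one is preferred (an assumption).
     */
    static Optional<String> pickDriverAlias(Map<String, Long> aliasToSize, long threshold) {
        long total = 0L;
        for (long size : aliasToSize.values()) {
            total += size;
        }

        String best = null;
        long bestSize = -1L;
        for (Map.Entry<String, Long> entry : aliasToSize.entrySet()) {
            long sumOfOthers = total - entry.getValue();
            if (sumOfOthers > threshold) {
                continue; // the remaining tables are too large to hold in memory
            }
            if (entry.getValue() > bestSize) {
                best = entry.getKey();
                bestSize = entry.getValue();
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Sizes copied from the "length mapping" in the log above.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("store", 94175L);
        sizes.put("store_sales", 48713909726L);
        sizes.put("item", 39798667L);
        sizes.put("customer_demographics", 1660831L);
        sizes.put("date_dim", 2275902L);

        // The four small tables sum to 43829575 bytes, below the 45000000 threshold.
        System.out.println(pickDriverAlias(sizes, 45_000_000L).orElse("none")); // prints "store_sales"
    }
}
```

Under these assumptions the four dimension tables total 43829575 bytes, which is under the 45000000 threshold, so `store_sales` qualifies as the driver alias. That is consistent with the comment above: the sizes themselves allow a map join, and the failure came from the resolver working with cloned Operator objects rather than the original work object.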