[jira] [Updated] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join

Phabricator (Updated) (JIRA) Mon, 20 Feb 2012 17:50:02 -0800

     [ 
https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Phabricator updated HIVE-2732:
------------------------------

    Attachment: HIVE-2732.D1809.1.patch

navis requested code review of "HIVE-2732 [jira] Reduce Sink deduplication 
fails if the child reduce sink is followed by a join".
Reviewers: JIRA

  DPAL-854 Reduce Sink deduplication fails if the child reduce sink is followed 
by a join

  set hive.optimize.reducededuplication=true;
  set hive.auto.convert.join=true;
  explain select * from (select * from src distribute by key sort by key) a 
join src b on a.key = b.key;

  fails with the following exception

  java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.SelectOperator 
cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator
        at 
org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:313)
        at 
org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.genMapJoinOpAndLocalWork(MapJoinProcessor.java:226)
        at 
org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.processCurrentTask(CommonJoinResolver.java:174)
        at 
org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.dispatch(CommonJoinResolver.java:287)
        at 
org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
        at 
org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194)
        at 
org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139)
        at 
org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:68)
        at 
org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:72)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:7019)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7312)
        at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
        at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
        at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
        at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

  If hive.auto.convert.join is set to false, it produces an incorrect plan 
where the two halves of the join are processed in two separate map reduce 
tasks, and the reducers of these two tasks both contain the join operator 
resulting in an exception.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D1809

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
  ql/src/test/queries/clientpositive/reduce_deduplicate_exclude_gby.q
  ql/src/test/results/clientpositive/reduce_deduplicate_exclude_gby.q.out

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/3855/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.

                
> Reduce Sink deduplication fails if the child reduce sink is followed by a join
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-2732
>                 URL: https://issues.apache.org/jira/browse/HIVE-2732
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Kevin Wilfong
>         Attachments: HIVE-2732.D1809.1.patch
>
>
> set hive.optimize.reducededuplication=true;
> set hive.auto.convert.join=true;
> explain select * from (select * from src distribute by key sort by key) a 
> join src b on a.key = b.key;
> fails with the following exception
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.SelectOperator 
> cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator
>       at 
> org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:313)
>       at 
> org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.genMapJoinOpAndLocalWork(MapJoinProcessor.java:226)
>       at 
> org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.processCurrentTask(CommonJoinResolver.java:174)
>       at 
> org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.dispatch(CommonJoinResolver.java:287)
>       at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>       at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194)
>       at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139)
>       at 
> org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:68)
>       at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:72)
>       at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:7019)
>       at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7312)
>       at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
>       at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
>       at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
>       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
>       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
>       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>       at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
>       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> If hive.auto.convert.join is set to false, it produces an incorrect plan 
> where the two halves of the join are processed in two separate map reduce 
> tasks, and the reducers of these two tasks both contain the join operator 
> resulting in an exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join

Reply via email to