[ https://issues.apache.org/jira/browse/HIVE-20281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567013#comment-16567013 ]

Jesus Camacho Rodriguez commented on HIVE-20281:
------------------------------------------------

[~ashutoshc], can you take a look?

The problem is that when we try to merge two (sub)trees, the operators to be 
removed are gathered into two sets: {{discardableOps}} and 
{{discardableInputOps}}. The former collects the operators we traverse during 
the check, while the latter collects the inputs to those operators (it also 
verifies that those inputs are the same). This distinction is useful later on, 
when we actually perform the merge operation. {{discardableInputOps}} should 
never contain operators that are already in {{discardableOps}}. However, for 
the extended shared work optimizer I had introduced a boolean that allows 
exactly that overlap. Because of those duplicate operators, we end up in an 
inconsistent state that leaves additional operators in the cache (the plan is 
still correct, by the way, though I am not sure whether this could lead to an 
incorrect plan in some cases). Looking back at the code, that boolean / 
distinction does not make sense; I think I assumed while coding that I needed 
to keep the operators in both sets.
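To illustrate the invariant described above, here is a minimal sketch (with hypothetical names, not the actual Hive code) of keeping the set of gathered input operators disjoint from the set of traversed operators, so that the operator cache and the plan cannot diverge through duplicates:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: discardableInputOps must stay disjoint from
// discardableOps, mirroring the invariant described in the comment above.
public class DiscardableSetsSketch {

    // Returns a copy of discardableInputOps with any operator removed
    // that is already tracked as a traversed (discardable) operator.
    static Set<String> excludeTraversedOps(Set<String> discardableInputOps,
                                           Set<String> discardableOps) {
        Set<String> result = new LinkedHashSet<>(discardableInputOps);
        result.removeAll(discardableOps);
        return result;
    }

    public static void main(String[] args) {
        Set<String> discardableOps = new LinkedHashSet<>(Set.of("SEL[1]", "FIL[12]"));
        Set<String> discardableInputOps = new LinkedHashSet<>(Set.of("TS[0]", "SEL[1]"));

        // "SEL[1]" is dropped because it is already in discardableOps,
        // leaving only "TS[0]" as a genuine input operator to remove.
        Set<String> cleaned = excludeTraversedOps(discardableInputOps, discardableOps);
        System.out.println(cleaned);
    }
}
```

With duplicates filtered out this way, no operator can be counted once as a traversed node and again as an input, which is the inconsistency the patch removes.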

> SharedWorkOptimizer fails with 'operator cache contents and actual plan 
> differ'
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-20281
>                 URL: https://issues.apache.org/jira/browse/HIVE-20281
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Critical
>         Attachments: HIVE-20281.patch
>
>
> HIVE-18201 seems to trigger a latent bug in SW optimizer. Test 
> {{subquery_in_having}} fails with:
> {code}
> 2018-07-31T08:42:57,328 DEBUG [b68f20cc-54d5-466d-b512-1540b3a43396 main] 
> optimizer.SharedWorkOptimizer: After SharedWorkExtendedOptimizer:
> TS[0]-SEL[1]-MAPJOIN[131]-FIL[12]-SEL[13]-GBY[14]-RS[15]-GBY[16]-SEL[17]-MAPJOIN[136]-MAPJOIN[137]-FIL[103]-SEL[104]-FS[105]
>      
> -FIL[113]-SEL[20]-RS[44]-MAPJOIN[133]-SEL[47]-GBY[48]-RS[49]-GBY[50]-SEL[51]-GBY[55]-RS[98]-MAPJOIN[136]
>                                                           
> -RS[88]-GBY[89]-SEL[120]-FIL[116]-SEL[91]-GBY[93]-RS[94]-GBY[95]-SEL[96]-RS[101]-MAPJOIN[137]
> TS[2]-FIL[112]-GBY[5]-RS[6]-GBY[7]-SEL[8]-RS[10]-MAPJOIN[131]
>                                          
> -RS[31]-MAPJOIN[132]-FIL[33]-SEL[34]-GBY[35]-RS[36]-GBY[37]-SEL[38]-GBY[42]-MAPJOIN[133]
> TS[21]-FIL[114]-SEL[22]-MAPJOIN[132]
> 2018-07-31T08:42:57,329 ERROR [b68f20cc-54d5-466d-b512-1540b3a43396 main] 
> ql.Driver: FAILED: SemanticException Error in shared work optimizer: operator 
> cache contentsand actual plan differ
> org.apache.hadoop.hive.ql.parse.SemanticException: Error in shared work 
> optimizer: operator cache contentsand actual plan differ
>         at 
> org.apache.hadoop.hive.ql.optimizer.SharedWorkOptimizer.transform(SharedWorkOptimizer.java:524)
>         at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:185)
>         at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:146)
>         at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12361)
>         at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356)
>         at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
>         at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:165)
>         at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:663)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
