[jira] [Assigned] (HIVE-17572) Warnings from SparkCrossProductCheck for MapJoins are confusing

2018-12-11 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-17572:
-

Assignee: (was: Andrew Sherman)

> Warnings from SparkCrossProductCheck for MapJoins are confusing
> ---
>
> Key: HIVE-17572
> URL: https://issues.apache.org/jira/browse/HIVE-17572
> Project: Hive
> Issue Type: Improvement
> Components: Spark
> Reporter: Sahil Takiar
> Priority: Major
>
> When the {{SparkCrossProductCheck}} detects a cross product in a map join, it 
> prints a confusing warning, e.g. {{Map Join MAPJOIN\[9\]\[bigTable=?\] 
> in task 'Stage-1:MAPRED' is a cross product}}
> I see a few ways this can be improved:
> * {{bigTable}} should actually specify the big table
> * I'm not sure why the stage id is printed instead of the work id. When a 
> cross product is detected in a shuffle join, the work id is shown (e.g. 
> {{Warning: Shuffle Join JOIN\[13\]\[tables = \[$hdt$_1, $hdt$_2, $hdt$_0\]\] 
> in Work 'Reducer 3' is a cross product}})
> * It shouldn't say {{MAPRED}}, which can be confusing to users
> * The {{MAPJOIN}} id doesn't need to be printed. It has no meaning 
> to the user, and the value keeps increasing the longer a session 
> lives
> On a somewhat related note, could we just put this warning in the explain 
> plan? Otherwise users may not even notice it.
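
To make the suggestions above concrete, here is a minimal sketch of what a reworded map-join warning could look like if it followed the shuffle-join wording quoted in the description. This is not Hive's actual code: the class name, the method name, and the sample work name 'Map 1' are hypothetical and chosen only for illustration.

```java
// Hypothetical helper, not part of Hive. It only illustrates the proposed
// message shape: name the big table, name the work instead of the stage,
// and drop both the MAPJOIN operator id and the MAPRED suffix.
final class CrossProductWarningSketch {

    // Builds a map-join warning that parallels the shuffle-join warning.
    // Both arguments are assumed to be supplied by the caller.
    static String mapJoinWarning(String bigTable, String workName) {
        return "Warning: Map Join MAPJOIN[bigTable=" + bigTable + "]"
                + " in Work '" + workName + "' is a cross product";
    }

    public static void main(String[] args) {
        // '$hdt$_0' reuses an alias from the shuffle-join example;
        // 'Map 1' is an assumed work name.
        System.out.println(mapJoinWarning("$hdt$_0", "Map 1"));
    }
}
```

With this shape, the map-join and shuffle-join warnings would read alike, and the message would not change as operator ids grow over the life of a session.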



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-17572) Warnings from SparkCrossProductCheck for MapJoins are confusing

2017-10-06 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-17572:
-

Assignee: Andrew Sherman

> Warnings from SparkCrossProductCheck for MapJoins are confusing
> ---
>
> Key: HIVE-17572
> URL: https://issues.apache.org/jira/browse/HIVE-17572
> Project: Hive
> Issue Type: Improvement
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Andrew Sherman
>
> When the {{SparkCrossProductCheck}} detects a cross product in a map join, it 
> prints a confusing warning, e.g. {{Map Join MAPJOIN\[9\]\[bigTable=?\] 
> in task 'Stage-1:MAPRED' is a cross product}}
> I see a few ways this can be improved:
> * {{bigTable}} should actually specify the big table
> * I'm not sure why the stage id is printed instead of the work id. When a 
> cross product is detected in a shuffle join, the work id is shown (e.g. 
> {{Warning: Shuffle Join JOIN\[13\]\[tables = \[$hdt$_1, $hdt$_2, $hdt$_0\]\] 
> in Work 'Reducer 3' is a cross product}})
> * It shouldn't say {{MAPRED}}, which can be confusing to users
> * The {{MAPJOIN}} id doesn't need to be printed. It has no meaning 
> to the user, and the value keeps increasing the longer a session 
> lives
> On a somewhat related note, could we just put this warning in the explain 
> plan? Otherwise users may not even notice it.


