[jira] [Updated] (HIVE-9097) Support runtime skew join for more queries [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9097:
-------------------------
    Attachment: HIVE-9097.1-spark.patch

The patch splits the original spark task into two tasks so that conditional map joins can be inserted to process skewed data. Changes to golden files are all in query plan.

Support runtime skew join for more queries [Spark Branch]
---------------------------------------------------------

    Key: HIVE-9097
    URL: https://issues.apache.org/jira/browse/HIVE-9097
    Project: Hive
    Issue Type: Improvement
    Reporter: Rui Li
    Assignee: Rui Li
    Attachments: HIVE-9097.1-spark.patch

After HIVE-8913, runtime skew join is enabled for spark. But currently the optimization only supports the simplest case where join is the leaf ReduceWork in a work graph. This is because the results from the original join and the conditional map join have to be unioned to feed to downstream works, which can be a little tricky for spark.

This JIRA is to research and find a way to relax the above restriction. A possible solution is to break the original task into two tasks on the join work, and insert the conditional task in between.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
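The idea in the description above (break the original task into two tasks at the join work, so that a conditional task can run in between) can be illustrated with a toy work graph. This is only a sketch under stated assumptions: it does not use Hive's classes, the function name `split_at_join` and the `read(...)` placeholder are hypothetical, and the work names are made up for the example.

```python
from collections import defaultdict


def split_at_join(edges, join):
    """Split a work graph into two tasks at the work named `join`.

    edges: list of (parent, child) pairs describing the work graph.
    Returns (first, second): `first` holds the edges of the task that ends
    at the join; `second` holds the edges of the task that runs afterwards,
    rooted at a placeholder source standing in for "read the join output".
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    # Collect every work reachable from the join (its descendants).
    downstream = set()
    stack = list(children[join])
    while stack:
        work = stack.pop()
        if work not in downstream:
            downstream.add(work)
            stack.extend(children[work])

    # Hypothetical stand-in for a work that scans what the join wrote.
    source = f"read({join})"

    first, second = [], []
    for parent, child in edges:
        if parent == join:
            # Cut the edge leaving the join; the second task reads from storage.
            second.append((source, child))
        elif child in downstream:
            second.append((parent, child))
        else:
            first.append((parent, child))
    return first, second


# Made-up example: the join in "Reducer 2" is not a leaf of the graph.
edges = [
    ("Map 1", "Reducer 2"),
    ("Map 3", "Reducer 2"),
    ("Reducer 2", "Reducer 4"),
]
first, second = split_at_join(edges, "Reducer 2")
print(first)
print(second)
```

With the graph cut this way, a conditional map join over the skewed keys could be scheduled after the first task and before the second, and the second task would read both outputs from storage rather than needing an in-graph union. When the join is already a leaf, the second task is simply empty.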
[jira] [Updated] (HIVE-9097) Support runtime skew join for more queries [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9097:
-------------------------
    Status: Patch Available  (was: Open)

Support runtime skew join for more queries [Spark Branch]
---------------------------------------------------------

    Key: HIVE-9097
    URL: https://issues.apache.org/jira/browse/HIVE-9097
    Project: Hive
    Issue Type: Improvement
    Reporter: Rui Li
    Assignee: Rui Li
    Attachments: HIVE-9097.1-spark.patch
[jira] [Updated] (HIVE-9097) Support runtime skew join for more queries [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9097:
-------------------------
    Affects Version/s: spark-branch

Support runtime skew join for more queries [Spark Branch]
---------------------------------------------------------

    Key: HIVE-9097
    URL: https://issues.apache.org/jira/browse/HIVE-9097
    Project: Hive
    Issue Type: Improvement
    Components: Spark
    Affects Versions: spark-branch
    Reporter: Rui Li
    Assignee: Rui Li
    Attachments: HIVE-9097.1-spark.patch
[jira] [Updated] (HIVE-9097) Support runtime skew join for more queries [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9097:
-------------------------
    Component/s: Spark

Support runtime skew join for more queries [Spark Branch]
---------------------------------------------------------

    Key: HIVE-9097
    URL: https://issues.apache.org/jira/browse/HIVE-9097
    Project: Hive
    Issue Type: Improvement
    Components: Spark
    Affects Versions: spark-branch
    Reporter: Rui Li
    Assignee: Rui Li
    Attachments: HIVE-9097.1-spark.patch
[jira] [Updated] (HIVE-9097) Support runtime skew join for more queries [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-9097:
------------------------------
    Resolution: Fixed
    Fix Version/s: spark-branch
    Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks, Rui.

Support runtime skew join for more queries [Spark Branch]
---------------------------------------------------------

    Key: HIVE-9097
    URL: https://issues.apache.org/jira/browse/HIVE-9097
    Project: Hive
    Issue Type: Improvement
    Components: Spark
    Affects Versions: spark-branch
    Reporter: Rui Li
    Assignee: Rui Li
    Fix For: spark-branch
    Attachments: HIVE-9097.1-spark.patch