[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274401#comment-14274401 ] Szehon Ho commented on HIVE-7613:

A very belated Happy New Year! I uploaded this PDF as a wiki page, which is more maintainable: [Hive on Spark: Join Design Master|https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark:+Join+Design+Master]. It is linked as a child of the [DesignDocs|https://cwiki.apache.org/confluence/display/Hive/DesignDocs] page.

Research optimization of auto convert join to map join [Spark branch]
----------------------------------------------------------------------

                Key: HIVE-7613
                URL: https://issues.apache.org/jira/browse/HIVE-7613
            Project: Hive
         Issue Type: Sub-task
         Components: Spark
           Reporter: Chengxiang Li
           Assignee: Suhas Satish
           Priority: Minor
             Labels: TODOC-SPARK
            Fix For: spark-branch
        Attachments: HIve on Spark Map join background.docx, Hive on Spark Join Master Design.pdf, small_table_broadcasting.pdf

ConvertJoinMapJoin is an optimization that replaces a common join (aka shuffle join) with a map join (aka broadcast or fragment replicate join) when possible. We need to research how to make it work with Hive on Spark.
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262043#comment-14262043 ] Szehon Ho commented on HIVE-7613:

That's a good idea; it would be useful. I'll look into that when I get back after New Year's.
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262057#comment-14262057 ] Lefty Leverenz commented on HIVE-7613:

Thanks [~szehon], and Happy New Year! I added a TODOC-SPARK label just to help us keep track of this.
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260943#comment-14260943 ] Lefty Leverenz commented on HIVE-7613:

Should this join design doc be added to the wiki? Or, if not, should the existing Hive on Spark: Getting Started page include a link to it?
* [Hive on Spark: Getting Started|https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started]
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185878#comment-14185878 ] Suhas Satish commented on HIVE-7613:

Submitted a patch for HIVE-8616. This can be used as the baseline patch for subsequent sub-tasks.
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138567#comment-14138567 ] Suhas Satish commented on HIVE-7613:

Hi Xuefu, that's a good idea. I was thinking along the lines of calling SparkContext's addFile method in each of the N-1 Spark jobs in HashTableSinkOperator.java to write the hash tables out as files, and then reading them back in the map-only join job in MapJoinOperator. But that approach doesn't involve RDDs.
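For reference, here is a minimal, hypothetical sketch of the addFile-based alternative described above. Only SparkContext.addFile and SparkFiles.get are real Spark APIs; the class, method names, and the serialization step (a dumpHashTableToFile helper that is not shown) are illustrative assumptions, not the actual Hive code paths in HashTableSinkOperator/MapJoinOperator.

{code:java}
import java.io.File;

import org.apache.spark.SparkFiles;
import org.apache.spark.api.java.JavaSparkContext;

public class AddFileApproachSketch {

  // Driver side: after a hypothetical dumpHashTableToFile() has serialized the
  // small table's hash table to a local file, register that file with the
  // SparkContext so every executor can fetch a copy of it.
  public static void shipHashTable(JavaSparkContext sc, String localHashTablePath) {
    sc.addFile(localHashTablePath);
  }

  // Executor side (e.g. where MapJoinOperator loads its hash table): resolve the
  // shipped file by name and deserialize it before streaming the big table's rows.
  public static File locateHashTable(String fileName) {
    return new File(SparkFiles.get(fileName));
  }
}
{code}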
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139611#comment-14139611 ] Suhas Satish commented on HIVE-7613:

{{ConvertJoinMapJoin}} heavily uses {{OptimizeTezProcContext}}. Although we have an equivalent {{OptimizeSparkProcContext}}, the two are not derived from a common ancestor class. We will need some class-hierarchy redesign/refactoring to make {{ConvertJoinMapJoin}} generic enough to support multiple execution frameworks. For now, I am thinking of proceeding with a cloned {{SparkConvertJoinMapJoin}} class that uses {{OptimizeSparkProcContext}}. We might need to open a JIRA for this refactoring.
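One possible shape of that refactoring, purely as an illustration: extract a small engine-neutral interface that both {{OptimizeTezProcContext}} and {{OptimizeSparkProcContext}} could implement, so {{ConvertJoinMapJoin}} depends only on the shared surface instead of being cloned per execution framework. The interface and method names below are hypothetical and do not exist in Hive; only HiveConf and ParseContext are real Hive classes.

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.parse.ParseContext;

// Hypothetical common ancestor: OptimizeTezProcContext and OptimizeSparkProcContext
// would both implement it, letting ConvertJoinMapJoin stay engine-agnostic.
public interface OptimizeProcContext {
  HiveConf getConf();
  ParseContext getParseContext();
}
{code}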
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138560#comment-14138560 ] Xuefu Zhang commented on HIVE-7613:

Here is what I have in mind:
1. For an N-way join being converted to a map join, we can run N-1 Spark jobs, one for each small input to the join (assuming a transformation is needed; if not, no Spark job is required). Each job produces an RDD, so we end up with N-1 RDDs.
2. Dump the content of the RDDs into the data structure (hash tables) that is needed by MapJoinOperator.
3. Call SparkContext.broadcast() on that data structure. This will broadcast the data structure to all nodes.
4. Then we can launch the map-only join job, which can load the broadcast data structure via the HashTableLoader interface.
For more information about Spark's broadcast variables, please refer to http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables. (A rough sketch of steps 2 and 3 appears below.)
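As a rough illustration of steps 2 and 3 only (not the actual Hive implementation), the Java sketch below collects a small-table RDD into an in-memory hash table and broadcasts it with Spark's broadcast-variable API; the map-only join job of step 4 would then read the broadcast value from inside a HashTableLoader implementation. The key/value types are placeholders for whatever row containers MapJoinOperator actually uses.

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

import scala.Tuple2;

public class SmallTableBroadcastSketch {

  // Step 2: pull the small-table RDD produced by one of the N-1 jobs into an
  // in-memory hash table keyed on the join key.
  // Step 3: broadcast that hash table so every executor gets a read-only copy.
  public static Broadcast<Map<String, List<String>>> broadcastSmallTable(
      JavaSparkContext sc, JavaPairRDD<String, List<String>> smallTableRdd) {

    Map<String, List<String>> hashTable = new HashMap<String, List<String>>();
    for (Tuple2<String, List<String>> row : smallTableRdd.collect()) {
      hashTable.put(row._1(), row._2());
    }
    return sc.broadcast(hashTable);
  }
}
{code}

The appeal of broadcasting is that only the small inputs move across the cluster; each executor then joins its big-table splits against its local copy, so no shuffle of the big table is needed.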
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120697#comment-14120697 ] Suhas Satish commented on HIVE-7613:

As part of this work, we should also enable auto_sortmerge_join_1.q, which currently fails with:
{code:title=auto_sortmerge_join_1.stackTrace|borderStyle=solid}
2014-09-03 16:12:59,607 ERROR [main]: spark.SparkClient (SparkClient.java:execute(166)) - Error executing Spark Plan
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 1, localhost): java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:0,value:val_0,ds:2008-04-08}
        org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:151)
        org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
        org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:28)
        org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:99)
        scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
        scala.collection.Iterator$class.foreach(Iterator.scala:727)
        scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
        org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1177)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1166)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1165)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1165)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1383)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{code}
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108600#comment-14108600 ] Brock Noland commented on HIVE-7613:

As part of this work we should enable auto_sortmerge_join_13.q, which currently fails with:
{noformat}
Done query: auto_sortmerge_join_12.q elapsedTime=8s
Begin query: auto_sortmerge_join_13.q
java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:455)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:836)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:175)
        at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
        at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:111)
        at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:459)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:836)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
        at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:175)
        at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
        at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:111)
        at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:455)
{noformat}
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101475#comment-14101475 ] Brock Noland commented on HIVE-7613:

Note that MapJoin is a stretch goal of M1 and a specific goal of M2.
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085956#comment-14085956 ] Xuefu Zhang commented on HIVE-7613:

Thanks for logging this, [~chengxiang li]. This is a good area to look at. Since it's an optimization, it doesn't belong to either Milestone 1 or 2. Thus, let's give this one a lower priority, unless there are no tasks at a higher priority.