[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2015-01-28 Thread Tim Robertson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295212#comment-14295212
 ] 

Tim Robertson commented on HIVE-7387:
-

This also affects anyone trying to use a custom UDF from the Hive CLI when the UDF 
depends on newer Guava methods.
Suggest reopening this as a valid issue.
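
For anyone hitting this from the CLI, a quick way to confirm which Guava copy the 
JVM actually resolved is to ask where HashFunction was loaded from. A minimal 
diagnostic sketch (the class name GuavaProbe is illustrative, not from this ticket):

{code}
import com.google.common.hash.HashFunction;

public class GuavaProbe {
    public static void main(String[] args) {
        // Prints the jar HashFunction was loaded from, revealing whether
        // Guava 11 (Hadoop) or Guava 14 (Spark assembly) won on the classpath.
        Class<?> c = HashFunction.class;
        java.security.CodeSource src = c.getProtectionDomain().getCodeSource();
        System.out.println(c.getName() + " loaded from "
                + (src == null ? "<bootstrap>" : src.getLocation()));
    }
}
{code}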

 Guava version conflict between hadoop and spark [Spark-Branch]
 --

 Key: HIVE-7387
 URL: https://issues.apache.org/jira/browse/HIVE-7387
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
 Attachments: HIVE-7387-spark.patch


 The Guava conflict occurs during the Hive driver's compile stage. As the exception 
 stack trace below shows, it surfaces while initializing a Spark RDD in SparkClient: 
 the Hive driver has both Guava 11 (from the Hadoop classpath) and the Spark assembly 
 jar, which bundles Guava 14 classes, on its classpath. Spark invokes 
 HashFunction.hashInt, which does not exist in Guava 11, so evidently the Guava 11 
 version of HashFunction is the one loaded into the JVM, leading to a 
 NoSuchMethodError while initializing the Spark RDD.
 {code}
 java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
   at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
   at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
   at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
   at org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
   at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:210)
   at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:169)
   at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:161)
   at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:155)
   at org.apache.spark.storage.MemoryStore.putValues(MemoryStore.scala:75)
   at org.apache.spark.storage.MemoryStore.putValues(MemoryStore.scala:92)
   at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:661)
   at org.apache.spark.storage.BlockManager.put(BlockManager.scala:546)
   at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:812)
   at org.apache.spark.broadcast.HttpBroadcast.<init>(HttpBroadcast.scala:52)
   at org.apache.spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcastFactory.scala:35)
   at org.apache.spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcastFactory.scala:29)
   at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
   at org.apache.spark.SparkContext.broadcast(SparkContext.scala:776)
   at org.apache.spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:112)
   at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:527)
   at org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:307)
   at org.apache.hadoop.hive.ql.exec.spark.SparkClient.createRDD(SparkClient.java:204)
   at org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:167)
   at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:32)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:159)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
 {code}
 NO PRECOMMIT TESTS. This is for spark branch only.
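
The failing call is easy to reproduce in isolation. A minimal sketch (HashIntRepro 
is an illustrative name): it compiles and runs against Guava 12+, but with Guava 11 
first on the classpath it throws the same NoSuchMethodError as above:

{code}
import com.google.common.hash.HashCode;
import com.google.common.hash.Hashing;

public class HashIntRepro {
    public static void main(String[] args) {
        // HashFunction.hashInt(int) does not exist in Guava 11, so running this
        // with Guava 11 resolved first reproduces the NoSuchMethodError.
        HashCode code = Hashing.murmur3_32().hashInt(42);
        System.out.println(code);
    }
}
{code}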





[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-10-29 Thread qiaohaijun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188118#comment-14188118
 ] 

qiaohaijun commented on HIVE-7387:
--

Has this been merged into Spark 1.1.1?



[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-10-29 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189515#comment-14189515
 ] 

Xuefu Zhang commented on HIVE-7387:
---

bq. Has this been merged into Spark 1.1.1?

Yes.



[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-10-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183150#comment-14183150
 ] 

Xuefu Zhang commented on HIVE-7387:
---

Now that SPARK-2848 shades Guava in Spark, this is no longer a problem in Hive.
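
For context, shading rewrites the bundled Guava classes into a private package at 
build time, so Spark's copy can no longer collide with Hadoop's Guava 11 on the 
same classpath. A minimal maven-shade-plugin relocation sketch of the idea (the 
shadedPattern below is a placeholder, not necessarily the package SPARK-2848 chose):

{code}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Rewrite com.google.common.* into a private namespace. -->
            <pattern>com.google.common</pattern>
            <shadedPattern>org.spark.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
{code}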



[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-07-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072781#comment-14072781
 ] 

Xuefu Zhang commented on HIVE-7387:
---

{quote}
This patch provided by Sean Owen may solve the conflict with guava. I tested it 
with Spark v1.0.1.
{quote}

To clarify, the patch isn't for Hive but for Spark, applied on top of 1.0.1 as a 
workaround for this problem: simply apply the patch to your own Spark build.

[~szehon] Could you include this in your getting-started wiki instructions?



[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-07-23 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072855#comment-14072855
 ] 

Szehon Ho commented on HIVE-7387:
-

Sure, will add this once I add the page.



[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-07-16 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064187#comment-14064187
 ] 

Sean Owen commented on HIVE-7387:
-

Hi Xuefu, I was wrong about Spark not using Guava 12+. It does now; I posted an 
update on the Spark JIRA. That makes it somewhat harder to downgrade, although 
not much. I would not characterize it as not being taken seriously. There are 
legitimate questions here, like why Hadoop can't get off Guava 11, which is 
about 2.5 years old now. It would be very helpful to link the Spark JIRA to this 
one, which has the details.

 Guava version conflict between hadoop and spark [Spark-Branch]
 --

 Key: HIVE-7387
 URL: https://issues.apache.org/jira/browse/HIVE-7387
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li

 hadoop-hdfs and hadoop-common depend on guava-11.0.2.jar, while Spark depends on 
 guava-14.0.1.jar. guava-11.0.2 has API conflicts with guava-14.0.1, and since the 
 Hive CLI currently loads both dependencies onto its classpath, queries fail on 
 both the Spark engine and the MR engine.
 NO PRECOMMIT TESTS. This is for spark branch only.





[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-07-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064204#comment-14064204
 ] 

Xuefu Zhang commented on HIVE-7387:
---

Thanks, Sean. Linked.



[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-07-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062065#comment-14062065
 ] 

Xuefu Zhang commented on HIVE-7387:
---

[~lirui] Guava 11 is used in many Hadoop components, so upgrading it across the 
board poses a big challenge. However, Spark isn't using anything from Guava 12+, 
so I proposed in SPARK-2420 that Guava be downgraded to 11 in Spark. 
Unfortunately, the proposal hasn't been taken seriously by the Spark community.



[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2014-07-11 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058438#comment-14058438
 ] 

Rui Li commented on HIVE-7387:
--

It seems that Hive (spark branch) also depends on guava-14. Can we bump the 
version to 14 in Hadoop too?
