Just try dropping in that JAR. Hadoop core ships with an out-of-date Guava JAR to avoid breaking old code downstream, but 2.7.x is designed to work with later versions too (i.e. it has moved off any of the now-removed methods). See https://issues.apache.org/jira/browse/HADOOP-10101 for the specifics.
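For example, something like this (a sketch only: the paths and the Guava version are placeholders; ByteStreams.limit() first appeared in Guava 14.0, so any 14.0+ JAR should carry it):

    spark-submit \
      --conf spark.driver.userClassPathFirst=true \
      --conf spark.executor.userClassPathFirst=true \
      --jars /path/to/guava-14.0.1.jar \
      --class your.main.Class your-app.jar

The userClassPathFirst settings tell Spark to prefer the JARs you supply over the ones Hadoop/HBase put on the classpath. See also the note below the quoted code about where those settings have to be set.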
On 23 Oct 2015, at 12:10, jinhong lu <lujinho...@gmail.com> wrote:

Hi,

I run Spark to write data to HBase, but hit a NoSuchMethodError:

15/10/23 18:45:21 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, dn18-formal.i.nease.net): java.lang.NoSuchMethodError: com.google.common.io.ByteStreams.limit(Ljava/io/InputStream;J)Ljava/io/InputStream;

I found a guava.jar in the hadoop/hbase dir, and its version is 12.0, but com.google.common.io.ByteStreams.limit has only existed since 14.0, so the NoSuchMethodError occurs. I tried running spark-submit with --jars, but got the same error. I also tried adding

    configuration.set("spark.executor.extraClassPath", "/home/ljh")
    configuration.set("spark.driver.userClassPathFirst", "true")

to my code, still the same. How can I solve this? How can I remove the guava.jar in hadoop/hbase from the classpath? Why does it not use the guava.jar in the Spark dir?

Here is my code:

    rdd.foreach({ res =>
      val configuration = HBaseConfiguration.create()
      configuration.set("hbase.zookeeper.property.clientPort", "2181")
      configuration.set("hbase.zookeeper.quorum", "ip.66")
      configuration.set("hbase.master", "ip:60000")
      configuration.set("spark.executor.extraClassPath", "/home/ljh")
      configuration.set("spark.driver.userClassPathFirst", "true")
      val hadmin = new HBaseAdmin(configuration)
      configuration.clear()
      configuration.addResource("/home/hadoop/conf/core-default.xml")
      configuration.addResource("/home/hadoop/conf/core-site.xml")
      configuration.addResource("/home/hadoop/conf/mapred-default.xml")
      configuration.addResource("/home/hadoop/conf/mapred-site.xml")
      configuration.addResource("/home/hadoop/conf/yarn-default.xml")
      configuration.addResource("/home/hadoop/conf/yarn-site.xml")
      configuration.addResource("/home/hadoop/conf/hdfs-default.xml")
      configuration.addResource("/home/hadoop/conf/hdfs-site.xml")
      configuration.addResource("/home/hadoop/conf/hbase-default.xml")
      configuration.addResource("/home/ljhn1829/hbase-site.xml")
      val table = new HTable(configuration, "ljh_test2")
      var put = new Put(Bytes.toBytes(res.toKey()))
      put.add(Bytes.toBytes("basic"), Bytes.toBytes("name"), Bytes.toBytes(res.totalCount + "\t" + res.positiveCount))
      table.put(put)
      table.flushCommits()
    })

and the error message:

15/10/23 19:06:42 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, gdc-dn126-formal.i.nease.net): java.lang.NoSuchMethodError: com.google.common.io.ByteStreams.limit(Ljava/io/InputStream;J)Ljava/io/InputStream;
    at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.nextBatchStream(ExternalAppendOnlyMap.scala:420)
    at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.<init>(ExternalAppendOnlyMap.scala:392)
    at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:207)
    at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:63)
    at org.apache.spark.util.collection.Spillable$class.maybeSpill(Spillable.scala:83)
    at org.apache.spark.util.collection.ExternalAppendOnlyMap.maybeSpill(ExternalAppendOnlyMap.scala:63)
    at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:129)
    at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:60)
    at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:46)
    at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

15/10/23 19:06:42 INFO TaskSetManager: Starting task 0.1 in stage 1.0 (TID 2, gdc-dn166-formal.i.nease.net, PROCESS_LOCAL, 1277 bytes)
15/10/23 19:06:42 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on gdc-dn166-formal.i.nease.net:3838 (size: 3.2 KB, free: 1060.3 MB)
15/10/23 19:06:42 ERROR YarnScheduler: Lost executor 1 on gdc-dn126-formal.i.nease.net: remote Rpc client disassociated
15/10/23 19:06:42 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkexecu...@gdc-dn126-formal.i.nease.net:1656] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/10/23 19:06:42 INFO TaskSetManager: Re-queueing tasks for 1 from TaskSet 1.0
15/10/23 19:06:42 INFO DAGScheduler: Executor lost: 1 (epoch 1)
15/10/23 19:06:42 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster.
15/10/23 19:06:42 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(1, gdc-dn126-formal.i.nease.net, 44635)
15/10/23 19:06:42 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor
15/10/23 19:06:42 INFO ShuffleMapStage: ShuffleMapStage 0 is now unavailable on executor 1 (0/1, false)
15/10/23 19:06:42 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to gdc-dn166-formal.i.nease.net:28595
15/10/23 19:06:42 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 1 is 84 bytes
15/10/23 19:06:42 WARN TaskSetManager: Lost task 0.1 in stage 1.0 (TID 2, gdc-dn166-formal.i.nease.net): FetchFailed(null, shuffleId=1, mapId=-1, reduceId=0, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 1
    at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:389)
    at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:386)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:385)
    at org.apache.spark.MapOutputTracker.getServerStatuses(MapOutputTracker.scala:172)
    at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:42)
    at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
    at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
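One thing worth noting in the quoted code: spark.executor.extraClassPath and spark.driver.userClassPathFirst are Spark launch properties, not HBase settings, so putting them into the HBaseConfiguration inside rdd.foreach() can't work; by the time that code runs on an executor, the JVM classpath is already fixed. A minimal sketch of where they belong instead (assuming Spark 1.x; the app name is a placeholder):

    import org.apache.spark.{SparkConf, SparkContext}

    // Executor-side classpath ordering must be set on the SparkConf
    // before the SparkContext (and hence the executor JVMs) is created.
    val sparkConf = new SparkConf()
      .setAppName("hbase-writer") // placeholder name
      .set("spark.executor.userClassPathFirst", "true")
    // spark.driver.userClassPathFirst affects the driver JVM itself, which is
    // already running by this point, so pass it at submit time instead:
    //   spark-submit --conf spark.driver.userClassPathFirst=true ...
    val sc = new SparkContext(sparkConf)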