Re: spark streaming 1.3 kafka error
The exception comes when the client also has many connections open to another external server, so I think the exception is caused by a client-side issue only; there is no issue on the server side. I want to understand: does the executor (SimpleConsumer) not make a new connection to the Kafka broker at the start of each task? Or is it created only once, and somehow getting closed?

On Sat, Aug 22, 2015 at 9:41 AM, Shushant Arora shushantaror...@gmail.com wrote:
It comes at the start of each task when new data is inserted into Kafka (the amount of data inserted is very small). The Kafka topic has 300 partitions and ~10 MB of data is inserted. Tasks fail, the retries succeed, and after a certain number of failed tasks the job is killed.

On Sat, Aug 22, 2015 at 2:08 AM, Akhil Das ak...@sigmoidanalytics.com wrote:
That looks like you are choking your Kafka machine. Do a top on the Kafka machines and look at the workload; it may be that you are spending too much time on disk I/O etc.

On Aug 21, 2015 7:32 AM, Cody Koeninger c...@koeninger.org wrote:
Sounds like that's happening consistently, not an occasional network problem? Look at the Kafka broker logs. Make sure you've configured the correct Kafka broker hosts / ports (note that the direct stream does not use the ZooKeeper host / port). Make sure that host / port is reachable from your driver and worker nodes, i.e. telnet or netcat to it. It looks like your driver can reach it (since there's partition info in the logs), but that doesn't mean the workers can. Use lsof / netstat to see what's going on with those ports while the job is running, or tcpdump if you need to. If you can't figure out what's going on from a networking point of view, post a minimal reproducible code sample that demonstrates the issue, so it can be tested in a different environment.

On Fri, Aug 21, 2015 at 4:06 AM, Shushant Arora shushantaror...@gmail.com wrote:
Hi,
Getting the below error in Spark Streaming 1.3 while consuming from Kafka using the direct Kafka stream. A few tasks fail in each run. What is the reason for / solution to this error?

15/08/21 08:54:54 ERROR executor.Executor: Exception in task 262.0 in stage 130.0 (TID 16332)
java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
at kafka.utils.Utils$.read(Utils.scala:376)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:81)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:109)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.fetchBatch(KafkaRDD.scala:150)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.getNext(KafkaRDD.scala:162)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:210)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/08/21 08:54:54 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 16348
15/08/21 08:54:54 INFO executor.Executor: Running task 260.1 in stage 130.0 (TID 16348)
15/08/21 08:54:54 INFO kafka.KafkaRDD: Computing topic test_hbrealtimeevents, partition 75
Re: Transformation not happening for reduceByKey or GroupByKey
Hi All, currently using DSE 4.7 and Spark version 1.2.2. Regards, Satish

On Fri, Aug 21, 2015 at 7:30 PM, java8964 java8...@hotmail.com wrote:
What version of Spark are you using, or what comes with DSE 4.7? We just cannot reproduce it in Spark:

yzhang@localhost$ more test.spark
val pairs = sc.makeRDD(Seq((0,1),(0,2),(1,20),(1,30),(2,40)))
pairs.reduceByKey((x,y) => x + y).collect
yzhang@localhost$ ~/spark/bin/spark-shell --master local -i test.spark
Welcome to [Spark shell banner] version 1.3.1
Using Scala version 2.10.4
Spark context available as sc.
SQL context available as sqlContext.
Loading test.spark...
pairs: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[0] at makeRDD at <console>:21
15/08/21 09:58:51 WARN SizeEstimator: Failed to check whether UseCompressedOops is set; assuming yes
res0: Array[(Int, Int)] = Array((0,3), (1,50), (2,40))

Yong

Date: Fri, 21 Aug 2015 19:24:09 +0530 Subject: Re: Transformation not happening for reduceByKey or GroupByKey From: jsatishchan...@gmail.com To: abhis...@tetrationanalytics.com CC: user@spark.apache.org

Hi Abhishek, I have even tried that, but rdd2 is empty. Regards, Satish

On Fri, Aug 21, 2015 at 6:47 PM, Abhishek R. Singh abhis...@tetrationanalytics.com wrote:
You had:
RDD.reduceByKey((x,y) => x+y)
RDD.take(3)
Maybe try:
rdd2 = RDD.reduceByKey((x,y) => x+y)
rdd2.take(3)
-Abhishek-

On Aug 20, 2015, at 3:05 AM, satish chandra j jsatishchan...@gmail.com wrote:
Hi All, I have data in an RDD as mentioned below:
RDD: Array[(Int, Int)] = Array((0,1),(0,2),(1,20),(1,30),(2,40))
I am expecting the output Array((0,3),(1,50),(2,40)), i.e. just a sum over the values for each key.
Code:
RDD.reduceByKey((x,y) => x+y)
RDD.take(3)
Result in console:
RDD: org.apache.spark.rdd.RDD[(Int,Int)] = ShuffledRDD[1] at reduceByKey at <console>:73
res: Array[(Int,Int)] = Array()
Command as mentioned:
dse spark --master local --jars postgresql-9.4-1201.jar -i ScriptFile
Please let me know what is missing in my code, as my resulting array is empty.
Regards, Satish
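[Editor's note: for reference, a minimal spark-shell sketch of the pattern Abhishek suggests, assigning the result of reduceByKey to a new val before calling an action. Nothing DSE-specific is assumed; reduceByKey returns a new RDD and never modifies the original.]

// Build a small pair RDD and sum the values per key.
val pairs = sc.makeRDD(Seq((0, 1), (0, 2), (1, 20), (1, 30), (2, 40)))
// reduceByKey returns a NEW RDD; `pairs` itself is immutable and unchanged.
val sums = pairs.reduceByKey((x, y) => x + y)
sums.collect().foreach(println) // expected: (0,3), (1,50), (2,40)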
Re: Spark streaming multi-tasking during I/O
Hmm, for a single-core VM you will have to run it in local mode (specifying master = local[4]). The flag is available in all versions of Spark, I guess.

On Aug 22, 2015 5:04 AM, Sateesh Kavuri sateesh.kav...@gmail.com wrote:
Thanks Akhil. Does this mean that the executor running in the VM can spawn two concurrent jobs on the same core? If this is the case, this is what we are looking for. Also, which version of Spark is this flag in? Thanks, Sateesh

On Sat, Aug 22, 2015 at 1:44 AM, Akhil Das ak...@sigmoidanalytics.com wrote:
You can look at spark.streaming.concurrentJobs; by default it runs a single job. If you set it to 2, then it can run 2 jobs in parallel. It's an experimental flag, but go ahead and give it a try.

On Aug 21, 2015 3:36 AM, Sateesh Kavuri sateesh.kav...@gmail.com wrote:
Hi, my scenario goes like this: I have an algorithm running in Spark Streaming mode on a 4-core virtual machine. The majority of the time, the algorithm does disk I/O and database I/O. The question is: during the I/O, when the CPU is not considerably loaded, is it possible to run any other task/thread so as to efficiently utilize the CPU? Note that one DStream of the algorithm runs completely on a single CPU. Thank you, Sateesh
sparkStreaming: how to work with partitions, how to create partitions
1. How do I work with partitions in Spark Streaming reading from Kafka?
2. How are partitions created in Spark Streaming reading from Kafka?
When I send messages from a Kafka topic that has three partitions, Spark will receive the messages when I call kafkautils.createStream or createDirectStream with local[4]. Now I want to see whether Spark creates partitions when it receives messages from Kafka via a DStream: how, where, and which method of the Spark API do I have to look at to find out?
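[Editor's note: a rough sketch of what this looks like with the Spark 1.3 Scala API, assuming the spark-streaming-kafka artifact is on the classpath; the broker address and topic name are placeholders. With the direct stream, each batch's RDD gets one Spark partition per Kafka partition, so a 3-partition topic yields 3 tasks per batch regardless of local[4].]

import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(10))
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // placeholder broker
val topics = Set("my-topic") // placeholder topic with 3 Kafka partitions

// Direct stream: one Spark partition per Kafka topic partition.
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, topics)
stream.foreachRDD { rdd =>
  println(s"Spark partitions in this batch: ${rdd.partitions.length}") // 3 here
}
ssc.start()
ssc.awaitTermination()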
Re: Spark streaming multi-tasking during I/O
Hi Rishitesh, we are not using any RDDs to parallelize the processing, and the whole algorithm runs on a single core (and in a single thread). The parallelism is done at the user level. The disk I/O could be started in a separate thread, but then the executor will not be able to take up more jobs, since that is how I believe Spark is designed by default.

On Sat, Aug 22, 2015 at 12:51 AM, Rishitesh Mishra rishi80.mis...@gmail.com wrote:
Hi Sateesh, it is interesting to know how you determined that the DStream runs on a single core. Did you mean receivers? Coming back to your question: could you not start the disk I/O in a separate thread, so that the scheduler can go ahead and assign other tasks?

On 21 Aug 2015 16:06, Sateesh Kavuri sateesh.kav...@gmail.com wrote:
Hi, my scenario goes like this: I have an algorithm running in Spark Streaming mode on a 4-core virtual machine. The majority of the time, the algorithm does disk I/O and database I/O. The question is: during the I/O, when the CPU is not considerably loaded, is it possible to run any other task/thread so as to efficiently utilize the CPU? Note that one DStream of the algorithm runs completely on a single CPU. Thank you, Sateesh
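[Editor's note: a minimal sketch of Rishitesh's suggestion, overlapping blocking I/O on a thread pool inside the task so the task's core is not stuck waiting. `dstream` is assumed to be the DStream the algorithm consumes, and doBlockingIO is a hypothetical stand-in for the disk/database call; the pool size is an assumption.]

import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

// Hypothetical stand-in for the blocking disk/database work per record.
def doBlockingIO(record: String): Unit = ()

dstream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // The pool is created inside the task, on the executor, so nothing
    // non-serializable is captured in the closure.
    val pool = Executors.newFixedThreadPool(8)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    // Kick off the I/O for every record, then block once at the end of the
    // partition, so the I/O calls overlap instead of running one by one.
    val inFlight = records.map(r => Future(doBlockingIO(r))).toList
    inFlight.foreach(f => Await.result(f, 10.minutes))
    pool.shutdown()
  }
}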
Re: spark streaming 1.3 kafka error
On trying the consumer without external connections, or with a low number of external connections, it works fine. So the doubt is how the socket got closed: java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.

On Sat, Aug 22, 2015 at 7:24 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
Can you try some other consumer and see if the issue still exists? [...]
subscribe
subscribe
pickling error with PySpark and Elasticsearch-py analyzer
Reposting my question from SO: http://stackoverflow.com/questions/32161865/elasticsearch-analyze-not-compatible-with-spark-in-python

I'm using the elasticsearch-py client within PySpark using Python 3 and I'm running into a problem using the analyze() function with ES in conjunction with an RDD. In particular, each record in my RDD is a string of text and I'm trying to analyze it to get out the token information, but I'm getting an error when trying to use it within a map function in Spark.

For example, this works perfectly fine:

from elasticsearch import Elasticsearch
es = Elasticsearch()
t = 'the quick brown fox'
es.indices.analyze(text=t)['tokens'][0]
{'end_offset': 3, 'position': 1, 'start_offset': 0, 'token': 'the', 'type': 'ALPHANUM'}

However, when I try this:

trdd = sc.parallelize(['the quick brown fox'])
trdd.map(lambda x: es.indices.analyze(text=x)['tokens'][0]).collect()

I get a really, really long error message related to pickling, ending in:

pickle.PicklingError: Could not pickle object as excessively deep recursion required.

I'm not sure what the error means. Am I doing something wrong? Is there a way to map the ES analyze function onto records of an RDD?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/pickling-error-with-PySpark-and-Elasticsearch-py-analyzer-tp24402.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Spark streaming multi-tasking during I/O
Hi Akhil, think of the scenario as running a piece of code in plain Java with multiple threads. Let's say there are 4 threads spawned by a Java process to handle reading from a database, some processing, and storing to a database. In this process, while one thread is performing database I/O, the CPU can let another thread do the processing, thus using the resources efficiently.

In the case of Spark, while a node's executor is running the same read-from-DB => process-data => store-to-DB flow, during the read-from-DB and store-to-DB phases the CPU is not given to other requests in the queue, since the executor allocates its resources completely to the current ongoing request. Does the flag spark.streaming.concurrentJobs enable this kind of scenario, or is there some other way to achieve what I am looking for?

Thanks, Sateesh

On Sat, Aug 22, 2015 at 7:26 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
Hmm, for a single-core VM you will have to run it in local mode (specifying master = local[4]). The flag is available in all versions of Spark, I guess. [...]
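[Editor's note: the flag Akhil mentions is a plain Spark conf key; a minimal sketch of setting it follows. The app name and batch interval are placeholders.]

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("streaming-io-app") // placeholder name
  // Experimental flag from this thread: lets up to 2 streaming output jobs
  // (batches) run at the same time instead of strictly one after the other.
  .set("spark.streaming.concurrentJobs", "2")
val ssc = new StreamingContext(conf, Seconds(10))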
Re: Using spark streaming to load data from Kafka to HDFS
Last time I checked, Camus doesn't support storing data as parquet, which is a deal breaker for me. Otherwise it works well for my Kafka topics with low data volume. I am currently using Spark Streaming to ingest data, generate semi-realtime stats and publish them to a dashboard, and dump the full dataset into HDFS as parquet at a longer interval. One problem is that storing parquet is sometimes time consuming, and that delays my regular stats-generating tasks. I am thinking of splitting my streaming job into two, one for parquet output and one for stats generation, but obviously this would consume the data from Kafka twice. -Simon

On Wednesday, May 6, 2015, Rendy Bambang Junior rendy.b.jun...@gmail.com wrote:
Because using Spark Streaming looks a lot simpler. What's the difference between Camus and Kafka streaming for this case? Why does Camus excel? Rendy

On Wed, May 6, 2015 at 2:15 PM, Saisai Shao sai.sai.s...@gmail.com wrote:
Also, Kafka has a Hadoop consumer API for doing such things; please refer to http://kafka.apache.org/081/documentation.html#kafkahadoopconsumerapi

2015-05-06 12:22 GMT+08:00 MrAsanjar . afsan...@gmail.com:
Why not try https://github.com/linkedin/camus - Camus is a Kafka-to-HDFS pipeline.

On Tue, May 5, 2015 at 11:13 PM, Rendy Bambang Junior rendy.b.jun...@gmail.com wrote:
Hi all, I am planning to load data from Kafka to HDFS. Is it normal to use Spark Streaming to load data from Kafka to HDFS? What are the concerns with doing this? There is no processing to be done by Spark, only storing data from Kafka to HDFS for storage and for further Spark processing. Rendy
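[Editor's note: to make the "dump to parquet at a longer interval" part concrete, a rough sketch assuming Spark 1.4's DataFrame writer (on 1.3, DataFrame.saveAsParquetFile is the equivalent). `stream` is assumed to be a DStream[(String, String)] from KafkaUtils, `sqlContext` an SQLContext on the driver; the Event schema, window length, and output path are placeholders.]

import org.apache.spark.streaming.Seconds

// Placeholder record type for whatever comes off the Kafka stream.
case class Event(key: String, value: String)

// Write a 10-minute window of data once every 10 minutes, independently
// of the faster stats-generating output on the same stream.
stream.window(Seconds(600), Seconds(600)).foreachRDD { rdd =>
  import sqlContext.implicits._
  val df = rdd.map { case (k, v) => Event(k, v) }.toDF()
  // Append this window to HDFS as parquet (path is a placeholder).
  df.write.mode("append").parquet("hdfs:///data/events")
}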
Re: subscribe
https://www.youtube.com/watch?v=umDr0mPuyQc

On Sat, Aug 22, 2015 at 8:01 AM, Ted Yu yuzhih...@gmail.com wrote:
See http://spark.apache.org/community.html
Cheers [...]
spark 1.4.1 - LZFException
Hi All, we have a Spark standalone cluster running 1.4.1 and we are setting spark.io.compression.codec to lzf. I have a long-running interactive application which behaves normally, but after a few days I get the following exception in multiple jobs. Any ideas on what could be causing this? Yadid

Job aborted due to stage failure: Task 27 in stage 286.0 failed 4 times, most recent failure: Lost task 27.3 in stage 286.0 (TID 516817, xx.yy.zz.ww): com.esotericsoftware.kryo.KryoException: com.ning.compress.lzf.LZFException: Corrupt input data, block did not start with 2 byte signature ('ZV') followed by type byte, 2-byte length)
at com.esotericsoftware.kryo.io.Input.fill(Input.java:142)
at com.esotericsoftware.kryo.io.Input.require(Input.java:155)
at com.esotericsoftware.kryo.io.Input.readInt(Input.java:337)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:109)
at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:182)
at org.apache.spark.serializer.DeserializationStream.readKey(Serializer.scala:169)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:200)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:197)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:127)
at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:60)
at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:46)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.ning.compress.lzf.LZFException: Corrupt input data, block did not start with 2 byte signature ('ZV') followed by type byte, 2-byte length)
at com.ning.compress.lzf.ChunkDecoder._reportCorruptHeader(ChunkDecoder.java:267)
at com.ning.compress.lzf.impl.UnsafeChunkDecoder.decodeChunk(UnsafeChunkDecoder.java:55)
at com.ning.compress.lzf.LZFInputStream.readyBuffer(LZFInputStream.java:363)
at com.ning.compress.lzf.LZFInputStream.read(LZFInputStream.java:193)
at com.esotericsoftware.kryo.io.Input.fill(Input.java:140)
... 37 more
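[Editor's note: for reference, a minimal sketch of the setting in question. In Spark 1.4 the valid values for this key are lz4, lzf and snappy.]

import org.apache.spark.SparkConf

// The codec the cluster above is using for shuffle/broadcast compression.
val conf = new SparkConf().set("spark.io.compression.codec", "lzf")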
how to migrate from spark 0.9 to spark 1.4
Currently I am using Spark 0.9 on my data, and I wrote code in Java for Spark SQL. Now I want to use Spark 1.4: how do I do that, and what changes do I have to make for my tables? I have a .sql file, a pom file, and a .py file. I am using S3 for storage.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/how-to-migrate-from-spark-0-9-to-spark-1-4-tp24403.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Worker Machine running out of disk for Long running Streaming process
Interesting. TD, can you please throw some light on why this is, and point to the relevant code in the Spark repo? It will help in better understanding the things that can affect a long-running streaming job.

On Aug 21, 2015 1:44 PM, Tathagata Das t...@databricks.com wrote:
Could you periodically (say every 10 mins) run System.gc() on the driver? The cleaning up of shuffles is tied to garbage collection.

On Fri, Aug 21, 2015 at 2:59 AM, gaurav sharma sharmagaura...@gmail.com wrote:
Hi All, I have a 24x7 running streaming process, which runs on 2-hour windowed data. The issue I am facing is that my worker machines are running OUT OF DISK space. I checked that the SHUFFLE FILES are not getting cleaned up:
/log/spark-2b875d98-1101-4e61-86b4-67c9e71954cc/executor-5bbb53c1-cee9-4438-87a2-b0f2becfac6f/blockmgr-c905b93b-c817-4124-a774-be1e706768c1//00/shuffle_2739_5_0.data
Ultimately the machines run out of disk space. I read about the *spark.cleaner.ttl* config param which, from what I can understand from the documentation, cleans up all the metadata beyond the time limit. I went through https://issues.apache.org/jira/browse/SPARK-5836 - it says resolved, but there is no code commit. Can anyone please throw some light on the issue?
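[Editor's note: a minimal sketch of TD's suggestion, scheduling a periodic GC on the driver; the 10-minute period is from his message, and the scheduled-executor approach is just one way to do it.]

import java.util.concurrent.{Executors, TimeUnit}

// On the driver: trigger a GC every 10 minutes so Spark's ContextCleaner
// weak references fire and old shuffle files become eligible for removal.
val gcScheduler = Executors.newSingleThreadScheduledExecutor()
gcScheduler.scheduleAtFixedRate(new Runnable {
  override def run(): Unit = System.gc()
}, 10, 10, TimeUnit.MINUTES)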
Re: Spark streaming multi-tasking during I/O
Thanks Akhil. Does this mean that the executor running in the VM can spawn two concurrent jobs on the same core? If this is the case, this is what we are looking for. Also, which version of Spark is this flag in? Thanks, Sateesh

On Sat, Aug 22, 2015 at 1:44 AM, Akhil Das ak...@sigmoidanalytics.com wrote:
You can look at spark.streaming.concurrentJobs; by default it runs a single job. If you set it to 2, then it can run 2 jobs in parallel. It's an experimental flag, but go ahead and give it a try. [...]
Re: spark streaming 1.3 kafka error
On trying the consumer without external connections, or with a low number of external connections, it works fine, so the doubt is how the socket got closed:

15/08/21 08:54:54 ERROR executor.Executor: Exception in task 262.0 in stage 130.0 (TID 16332)
java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
at kafka.utils.Utils$.read(Utils.scala:376)

Is the executor (SimpleConsumer) not making a new connection to the Kafka broker at the start of each task? Or is it created only once, and that is somehow getting closed?

On Sat, Aug 22, 2015 at 7:35 PM, Shushant Arora shushantaror...@gmail.com wrote: [...]
Re: subscribe
See http://spark.apache.org/community.html
Cheers

On Sat, Aug 22, 2015 at 2:51 AM, Lars Hermes li...@hermes-it-consulting.de wrote:
subscribe
Re: How can I save the RDD result as Orcfile with spark1.3?
In Spark 1.4 there was considerable refactoring around the interaction with Hive, such as SPARK-7491. It would not be straightforward to port ORC support to 1.3. FYI

On Fri, Aug 21, 2015 at 10:21 PM, dong.yajun dongt...@gmail.com wrote:
Hi Ted, thanks for your reply. Is there any other way to do this with Spark 1.3, such as writing the ORC file manually in a foreachPartition method?

On Sat, Aug 22, 2015 at 12:19 PM, Ted Yu yuzhih...@gmail.com wrote:
ORC support was added in Spark 1.4. See SPARK-2883

On Fri, Aug 21, 2015 at 7:36 PM, dong.yajun dongt...@gmail.com wrote:
Hi list, is there a way to save an RDD result as an ORC file in Spark 1.3? Due to some reasons we can't upgrade our Spark version to 1.4 now.
-- *Ric Dong*
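[Editor's note: for anyone landing here, a minimal sketch of ORC output once on 1.4; it needs a HiveContext, and the schema and output path are placeholders.]

import org.apache.spark.sql.hive.HiveContext

case class Record(id: Int, name: String) // placeholder schema

val hiveContext = new HiveContext(sc)
import hiveContext.implicits._

val df = sc.parallelize(Seq(Record(1, "a"), Record(2, "b"))).toDF()
// ORC support lives in the Hive module in 1.4 (SPARK-2883).
df.write.format("orc").save("hdfs:///tmp/records_orc") // placeholder path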
Re: spark streaming 1.3 kafka error
Can you try some other consumer and see if the issue still exists?

On Aug 22, 2015 12:47 AM, Shushant Arora shushantaror...@gmail.com wrote: [...]
Re: spark streaming 1.3 kafka error
I think you can also give this consumer a try in your environment: http://spark-packages.org/package/dibbhatt/kafka-spark-consumer. It has been running fine for topics with a large number of Kafka partitions (> 200) like yours, without any issue: no connection issues, as this consumer re-uses Kafka connections, and it can also recover from failures (network loss, the Kafka leader going down, ZK down, etc.). Regards, Dibyendu

On Sat, Aug 22, 2015 at 7:35 PM, Shushant Arora shushantaror...@gmail.com wrote: [...]
Re: spark streaming 1.3 kafka error
To be perfectly clear, the direct Kafka stream will also recover from any failures, because it does the simplest thing possible: fail the task and let Spark retry it. If you're consistently having socket-closed problems on one task after another, there's probably something else going on in your environment. Shushant, none of your responses have indicated whether you've tried any of the system-level troubleshooting suggestions that have been made by various people.

Also, if you have 300 partitions and only 10 MB of data, that is completely unnecessary. You're probably going to have lots of empty partitions, which will have a negative effect on your runtime.

On Sat, Aug 22, 2015 at 11:28 AM, Dibyendu Bhattacharya dibyendu.bhattach...@gmail.com wrote: [...]