Re: Spark streaming cannot receive any message from Kafka
Hi Jerry,

I looked at the KafkaUtils.createStream API and found that spark.default.parallelism is actually specified in SparkConf instead. I do not remember the exact stack trace of the exception, but it was thrown when createStream was called without spark.default.parallelism set. The error message essentially showed an empty string being parsed as an Int when spark.default.parallelism was not specified.

Bill

On Mon, Nov 17, 2014 at 4:45 PM, Shao, Saisai wrote:
> Hi Bill,
>
> Would you mind describing what you found a little more specifically? I'm
> not sure there is a parameter in KafkaUtils.createStream where you can
> specify the Spark parallelism. Also, what is the exception stack?
>
> Thanks
> Jerry
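A minimal sketch of the workaround Bill describes, against the Spark 1.1 Scala API; the application name, ZooKeeper address, consumer group, topic map, and the parallelism value are all placeholders, not taken from the thread:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Per the report above, omitting spark.default.parallelism made
// createStream fail while parsing an empty string as an Int.
val conf = new SparkConf()
  .setAppName("KafkaStreamExample")        // placeholder app name
  .set("spark.default.parallelism", "4")   // example value

val ssc = new StreamingContext(conf, Seconds(2))

// zkQuorum, group id, and topic map are placeholders
val messages = KafkaUtils.createStream(
  ssc, "zkhost:2181", "example-group", Map("mytopic" -> 1))

messages.map(_._2).print()   // the payload is the second tuple element
ssc.start()
ssc.awaitTermination()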
RE: Spark streaming cannot receive any message from Kafka
Hi Bill,

Would you mind describing what you found a little more specifically? I'm not sure there is a parameter in KafkaUtils.createStream where you can specify the Spark parallelism. Also, what is the exception stack?

Thanks
Jerry

From: Bill Jay [mailto:bill.jaypeter...@gmail.com]
Sent: Tuesday, November 18, 2014 2:47 AM
To: Helena Edelson
Cc: Jay Vyas; u...@spark.incubator.apache.org; Tobias Pfeiffer; Shao, Saisai
Subject: Re: Spark streaming cannot receive any message from Kafka

Hi all,

I found the reason for this issue. It seems that in the new version, if I do not specify spark.default.parallelism, KafkaUtils.createStream throws an exception during the Kafka stream creation stage. In previous versions, Spark appears to fall back to a default value.

Thanks!

Bill
Re: Spark streaming cannot receive any message from Kafka
Hi all,

I found the reason for this issue. It seems that in the new version, if I do not specify spark.default.parallelism, KafkaUtils.createStream throws an exception during the Kafka stream creation stage. In previous versions, Spark appears to fall back to a default value.

Thanks!

Bill

On Thu, Nov 13, 2014 at 5:00 AM, Helena Edelson wrote:
> I encounter no issues with streaming from Kafka to Spark in 1.1.0. Do you
> perhaps have a version conflict?
>
> Helena
Re: Spark streaming cannot receive any message from Kafka
I encounter no issues with streaming from Kafka to Spark in 1.1.0. Do you perhaps have a version conflict?

Helena

On Nov 13, 2014 12:55 AM, "Jay Vyas" wrote:
> Yup, it is very important that n > 1 for Spark streaming jobs. If running
> locally, use local[2].
>
> The thing to remember is that your Spark receiver will take a thread to
> itself to produce data, so you need another thread to consume it.
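If a version conflict like Helena suggests is the culprit, a first check is that the streaming-Kafka artifact matches the Spark core version. A hedged sbt sketch, assuming Spark 1.1.0 and a Scala 2.10 build; adjust versions to your own setup:

// build.sbt -- Spark artifact versions must agree with each other
// and with the cluster's Spark version
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming"       % "1.1.0" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.1.0"
)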
Re: Spark streaming cannot receive any message from Kafka
Yup, it is very important that n > 1 for Spark streaming jobs. If running locally, use local[2].

The thing to remember is that your Spark receiver will take a thread to itself to produce data, so you need another thread to consume it.

In a cluster manager like YARN or Mesos, the word "thread" is not used anymore and takes on a different meaning: you need two or more free compute slots, and you should confirm that by checking how many free node managers are running, etc.

> On Nov 12, 2014, at 7:53 PM, "Shao, Saisai" wrote:
>
> Did you configure the Spark master as local? It should be local[n], with
> n > 1, for local mode. Besides, there is a Kafka wordcount example among
> the Spark Streaming examples; you can try that. I have tested with the
> latest master and it works.
>
> Thanks
> Jerry
RE: Spark streaming cannot receive any message from Kafka
Did you configure the Spark master as local? It should be local[n], with n > 1, for local mode. Besides, there is a Kafka wordcount example among the Spark Streaming examples; you can try that. I have tested with the latest master and it works.

Thanks
Jerry

From: Tobias Pfeiffer [mailto:t...@preferred.jp]
Sent: Thursday, November 13, 2014 8:45 AM
To: Bill Jay
Cc: u...@spark.incubator.apache.org
Subject: Re: Spark streaming cannot receive any message from Kafka

Bill,

> However, now that I am using Spark 1.1.0, the Spark streaming job cannot
> receive any messages from Kafka. I have not made any change to the code.

Do you see any suspicious messages in the log output?

Tobias
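A minimal sketch of the local-mode point Jerry and Jay make: the Kafka receiver permanently occupies one thread, so the master must grant at least two. The app name is a placeholder:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// local[1] would starve the job: the receiver takes the only thread
// and nothing is left to process the received batches.
val conf = new SparkConf()
  .setMaster("local[2]")   // at least 2: one receiver + one processor
  .setAppName("LocalKafkaTest")

val ssc = new StreamingContext(conf, Seconds(2))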
Re: Spark streaming cannot receive any message from Kafka
Hi all,

Thanks for the information. I am running Spark Streaming in a YARN cluster and the configuration should be correct. I followed the KafkaWordCount example to write the current code three months ago, and it has been working for several months. The messages are in JSON format. In fact, this code worked a few days ago, but now it is not working. Below is my spark-submit script:

SPARK_BIN=/home/hadoop/spark/bin/
$SPARK_BIN/spark-submit \
  --class com.test \
  --master yarn-cluster \
  --deploy-mode cluster \
  --verbose \
  --driver-memory 20G \
  --executor-memory 20G \
  --executor-cores 6 \
  --num-executors $2 \
  $1 $3 $4 $5

Thanks!

Bill

On Wed, Nov 12, 2014 at 4:53 PM, Shao, Saisai wrote:
> Did you configure the Spark master as local? It should be local[n], with
> n > 1, for local mode. Besides, there is a Kafka wordcount example among
> the Spark Streaming examples; you can try that. I have tested with the
> latest master and it works.
>
> Thanks
> Jerry
Re: Spark streaming cannot receive any message from Kafka
Bill,

> However, now that I am using Spark 1.1.0, the Spark streaming job cannot
> receive any messages from Kafka. I have not made any change to the code.

Do you see any suspicious messages in the log output?

Tobias