Re: Spark streaming stops computing while the receiver keeps running without any errors reported

2014-09-24 Thread Aniket Bhatnagar
Hi all

I was finally able to get this working by setting
the SPARK_EXECUTOR_INSTANCES environment variable to a high number. However,
I am wondering if this is a bug: the app gets submitted but never makes any
progress because it can't acquire the desired number of workers. Shouldn't
the app be rejected if it can't be run on the cluster?
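
For reference, roughly the same effect can be set programmatically; I believe
the environment variable corresponds to the spark.executor.instances conf
property on YARN. A minimal sketch, assuming 2 Kinesis shards and hence 2
receivers (the counts here are illustrative):

import org.apache.spark.SparkConf

// Receivers occupy one core each for their whole lifetime, so request
// enough executors/cores to cover every receiver plus the batch processing.
val conf = new SparkConf()
  .setAppName("KinesisConsumer")
  .set("spark.executor.instances", "4")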

Thanks,
Aniket

On 22 September 2014 18:14, Aniket Bhatnagar aniket.bhatna...@gmail.com
wrote:

 Hi all

 I was finally able to figure out why the streaming appeared stuck. The
 reason was that I was running out of workers in my standalone deployment of
 Spark. There was no feedback in the logs, which is why it took me a while
 to figure this out.
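
 (For the record, the root cause: each receiver permanently occupies one
 core, so a streaming app needs strictly more cores than it has receivers,
 otherwise the batch jobs are never scheduled. A minimal sketch with
 illustrative numbers; in local mode the same rule means using local[n] with
 n greater than the number of receivers:)

 import org.apache.spark.SparkConf

 val numShards = 2  // the Kinesis example creates one receiver per shard
 val conf = new SparkConf()
   .setAppName("KinesisWordCount")
   // numShards cores go to the receivers; keep at least one for processing
   .setMaster(s"local[${numShards + 1}]")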

 However, now I am trying to run the same in yarn-client mode and this is
 again giving the same problem. Is it possible to run out of workers in YARN
 mode as well? If so, how can I figure that out?

 Thanks,
 Aniket

 On 19 September 2014 18:07, Aniket Bhatnagar aniket.bhatna...@gmail.com
 wrote:

 Apologies for the delay in getting back on this. It seems the Kinesis
 example does not run on Spark 1.1.0 even when it is built using the
 kinesis-asl profile, because of a dependency conflict in the HTTP client
 (same issue as
 http://mail-archives.apache.org/mod_mbox/spark-dev/201409.mbox/%3ccajob8btdxks-7-spjj5jmnw0xsnrjwdpcqqtjht1hun6j4z...@mail.gmail.com%3E).
 I had to add a later version of the HTTP client to the kinesis-asl profile
 to make it run. Also, the Kinesis example hardcodes the master as local, so
 it does not honour the MASTER environment variable the way the other
 examples do. Once I had resolved both issues, I was finally able to
 reproduce the problem: the example works fine in local mode but does
 nothing when the receiver runs on remote workers.
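
 (The MASTER tweak was only a couple of lines, roughly as below, mirroring
 what the other streaming examples do; the names are illustrative:)

 import org.apache.spark.SparkConf

 // fall back to local mode when MASTER is not exported
 val master = sys.env.getOrElse("MASTER", "local[2]")
 val sparkConfig = new SparkConf().setAppName("KinesisWordCount").setMaster(master)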

 Spark streaming does not report any blocks received from the receivers
 even though I can see the following lines in the app logs (I modified the
 debug line to print size as well):

 14/09/19 12:30:18 DEBUG ReceiverSupervisorImpl: Pushed block
 input-0-1411129664668 in 15 ms
 14/09/19 12:30:18 DEBUG ReceiverSupervisorImpl: Reported block
 input-0-1411129664668 of size 1
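
 (For reference, the modified line in ReceiverSupervisorImpl was roughly the
 one-liner below; I'm paraphrasing from memory, so treat numRecords as a
 stand-in for whatever count the supervisor has at hand:)

 logDebug(s"Reported block $blockId of size $numRecords")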

 Here are the screenshots of Spark admin: http://imgur.com/a/gWKYm

 I also ran the other examples (custom receiver, etc.) in both local and
 distributed mode, and they seem to be working fine.

 Any ideas?

 Thanks,
 Aniket

 On 12 September 2014 02:49, Tathagata Das tathagata.das1...@gmail.com
 wrote:

 This is very puzzling, given that this works in the local mode.

 Does running the kinesis example work with your spark-submit?


 https://github.com/apache/spark/blob/master/extras/kinesis-asl/src/main/scala/org/apache/spark/examples/streaming/KinesisWordCountASL.scala

 The instructions are present in the streaming guide.
 https://github.com/apache/spark/blob/master/docs/streaming-kinesis-integration.md

 If that does not work on the cluster either, then I would check the
 streaming UI for the number of records being received, and the stages page
 for whether jobs are being executed for every batch. Can you tell us
 whether that is working well?

 Also ccing Chris Fregly, who wrote the Kinesis integration.

 TD




 On Thu, Sep 11, 2014 at 4:51 AM, Aniket Bhatnagar 
 aniket.bhatna...@gmail.com wrote:

 Hi all

 I am trying to run a Kinesis Spark Streaming application on a standalone
 Spark cluster. The job works fine in local mode, but when I submit it
 (using spark-submit), it doesn't do anything. I enabled logging for
 the org.apache.spark.streaming.kinesis package and I regularly see the
 following in the worker logs:

 14/09/11 11:41:25 DEBUG KinesisRecordProcessor: Stored:  Worker
 x.x.x.x:b88a9210-cbb9-4c31-8da7-35fd92faba09 stored 34 records for shardId
 shardId-
 14/09/11 11:41:25 DEBUG KinesisRecordProcessor: Stored:  Worker
 x.x.x.x:b2e9c20f-470a-44fe-a036-630c776919fb stored 31 records for shardId
 shardId-0001

 But the job does not perform any of the operations defined on the DStream.
 To investigate further, I changed kinesis-asl's KinesisUtils to perform the
 following computation on the DStream created
 using ssc.receiverStream(new KinesisReceiver...):

 stream.count().foreachRDD(rdd => rdd.foreach(tuple => logInfo("Emitted " + tuple)))
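
 (If the batches were being processed, this should print one "Emitted <count>"
 line per batch interval; and since the closure runs inside rdd.foreach, those
 entries would show up in the worker logs rather than the driver log.)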

 Even the above line does not result in any corresponding log entries in
 either the driver or the worker logs. The only relevant entries that I
 could find in the driver logs are:
 14/09/11 11:40:58 INFO DAGScheduler: Stage 2 (foreach at
 KinesisUtils.scala:68) finished in 0.398 s
 14/09/11 11:40:58 INFO SparkContext: Job finished: foreach at
 KinesisUtils.scala:68, took 4.926449985 s
 14/09/11 11:40:58 INFO JobScheduler: Finished job streaming job
 1410435653000 ms.0 from job set of time 1410435653000 ms
 14/09/11 11:40:58 INFO JobScheduler: Starting job streaming job
 1410435653000 ms.1 from job set of time 1410435653000 ms
 14/09/11 11:40:58 INFO SparkContext: Starting job: foreach at
 KinesisUtils.scala:68
 14/09/11 11:40:58 INFO DAGScheduler: Registering RDD 13 (union at
 DStream.scala:489)
 14/09/11 11:40:58 INFO DAGScheduler: Got job 3 (foreach at
 KinesisUtils.scala:68) with 2 output partitions (allowLocal=false)
 14/09/11 11:40:58 INFO DAGScheduler: Final stage: Stage 5(foreach at
 KinesisUtils.scala:68)

 After the above logs, nothing shows up corresponding to KinesisUtils. I am
 out of ideas on this one and any help on this would be greatly appreciated.

 Thanks,
 Aniket
