Re: Having problem with Spark streaming with Kinesis

2014-12-19 Thread Ashrafuzzaman
Thanks Aniket, that clears a lot of confusion. 😄 On Dec 14, 2014 7:11 PM, "Aniket Bhatnagar" wrote: > The reason is the following code: > > val numStreams = numShards > val kinesisStreams = (0 until numStreams).map { i => > KinesisUtils.createStream(ssc, streamName, endpointUrl, > kinesi

Re: Having problem with Spark streaming with Kinesis

2014-12-14 Thread Aniket Bhatnagar
The reason is the following code: val numStreams = numShards val kinesisStreams = (0 until numStreams).map { i => KinesisUtils.createStream(ssc, streamName, endpointUrl, kinesisCheckpointInterval, InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2) } In the above co
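
For context, the snippet being discussed follows the KinesisWordCountASL example; a fuller sketch looks like the one below. The stream name, endpoint URL, shard count, and batch interval here are placeholder assumptions, and the union step comes from the example rather than the quoted message.

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Milliseconds, StreamingContext}
    import org.apache.spark.streaming.kinesis.KinesisUtils
    import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream

    // Placeholder values (assumptions, not from the quoted message).
    val numShards = 3
    val streamName = "myKinesisStream"
    val endpointUrl = "https://kinesis.us-east-1.amazonaws.com"

    val batchInterval = Milliseconds(2000)
    val kinesisCheckpointInterval = batchInterval
    val ssc = new StreamingContext(new SparkConf().setAppName("KinesisExample"), batchInterval)

    // Each createStream call sets up one receiver, i.e. one long-running task
    // that permanently occupies an executor core.
    val numStreams = numShards
    val kinesisStreams = (0 until numStreams).map { i =>
      KinesisUtils.createStream(ssc, streamName, endpointUrl,
        kinesisCheckpointInterval, InitialPositionInStream.LATEST,
        StorageLevel.MEMORY_AND_DISK_2)
    }

    // Union the per-shard streams into a single DStream before processing.
    val unionStreams = ssc.union(kinesisStreams)

With numShards receivers each pinned to a core, at least one extra core is needed to actually run the batch jobs, which is where the shards + 1 rule elsewhere in the thread comes from.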

Re: Having problem with Spark streaming with Kinesis

2014-12-13 Thread A.K.M. Ashrafuzzaman
Thanks Aniket, The trick is to have #workers >= #shards + 1, but I don’t know why that is. http://spark.apache.org/docs/latest/streaming-kinesis-integration.html In the figure there [Spark Streaming Kinesis architecture], it seems like one node should be able to take on more than one shard.
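
A rough core-budget illustration of that rule (the shard count is hypothetical):

    // Why #workers >= #shards + 1: every Kinesis receiver is a long-running
    // task that pins one core, and at least one more core must stay free to
    // process the received batches.
    val numShards = 3                  // shards in the stream (assumed)
    val receiverCores = numShards      // one receiver per shard
    val minCores = receiverCores + 1   // minimum cores/executors needed = 4

So a single node can indeed take on more than one shard, as in the figure, but only if it has enough cores for all the receivers plus the processing tasks.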

Re: Having problem with Spark streaming with Kinesis

2014-12-03 Thread A.K.M. Ashrafuzzaman
Guys, on my local machine it consumes a Kinesis stream with 3 shards, but on EC2 it does not consume from the stream. Later we found that the EC2 machine had 2 cores while my local machine had 4 cores. I am using a single machine in Spark standalone mode. And we got a larger machine f

Re: Having problem with Spark streaming with Kinesis

2014-11-26 Thread Aniket Bhatnagar
Did you set the Spark master as local[*]? If so, then the number of executors is equal to the number of cores of the machine. Perhaps your Mac machine has more cores (certainly more than the number of Kinesis shards + 1). Try explicitly setting the master as local[N] where N is the number of Kinesis shards +
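
A minimal sketch of that suggestion (the app name, batch interval, and shard count are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val numShards = 3  // shards in the Kinesis stream (assumed)
    // local[*] grants only as many task slots as the machine has cores; on a
    // 2-core box that cannot fit 3 receivers plus processing, so size it explicitly.
    val conf = new SparkConf()
      .setAppName("KinesisConsumer")
      .setMaster(s"local[${numShards + 1}]")
    val ssc = new StreamingContext(conf, Seconds(2))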

Re: Having problem with Spark streaming with Kinesis

2014-11-26 Thread Ashrafuzzaman
I was trying it on one machine with just sbt run. And it works in my Mac environment with the same configuration. I used the sample code from https://github.com/apache/spark/blob/master/extras/kinesis-asl/src/main/scala/org/apache/spark/examples/streaming/KinesisWordCountASL.scala val kines
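
The processing half of that sample looks roughly like the following (a reconstructed sketch, not a verbatim quote; ssc and the per-shard kinesisStreams collection are the ones shown earlier in the thread):

    // Pair-DStream implicits (needed on Spark 1.x for reduceByKey on DStreams).
    import org.apache.spark.streaming.StreamingContext._

    // Union the shard streams and run a word count on each batch, as in the
    // KinesisWordCountASL example (Kinesis records arrive as Array[Byte]).
    val unionStreams = ssc.union(kinesisStreams)
    val words = unionStreams.flatMap(bytes => new String(bytes).split(" "))
    val wordCounts = words.map(word => (word, 1)).reduceByKey(_ + _)
    wordCounts.print()

    ssc.start()
    ssc.awaitTermination()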

Re: Having problem with Spark streaming with Kinesis

2014-11-26 Thread Aniket Bhatnagar
What's your cluster size? For streaming to work, it needs shards + 1 executors. On Wed, Nov 26, 2014, 5:53 PM A.K.M. Ashrafuzzaman < ashrafuzzaman...@gmail.com> wrote: > Hi guys, > When we are using Kinesis with 1 shard then it works fine. But when we use > more than 1 then it falls into an infini

Re: Having problem with Spark streaming with Kinesis

2014-11-26 Thread Akhil Das
I have it working without any issues (tried with 5 shards), except my Java version was 1.7. Here's the piece of code that I used. System.setProperty("AWS_ACCESS_KEY_ID", this.kConf.getOrElse("access_key", "")) System.setProperty("AWS_SECRET_KEY", this.kConf.getOrElse("secret", "")) val
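
The snippet is cut off above; a self-contained reconstruction of its visible part might look like this, where kConf is a hypothetical stand-in for the this.kConf configuration map in the original code:

    // Stand-in for the configuration map referenced as this.kConf above.
    val kConf: Map[String, String] = Map("access_key" -> "...", "secret" -> "...")

    // Credentials exposed as system properties, as in the quoted snippet.
    System.setProperty("AWS_ACCESS_KEY_ID", kConf.getOrElse("access_key", ""))
    System.setProperty("AWS_SECRET_KEY", kConf.getOrElse("secret", ""))

The rest of the message presumably continues with the same per-shard KinesisUtils.createStream pattern shown earlier in the thread.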

Having problem with Spark streaming with Kinesis

2014-11-26 Thread A.K.M. Ashrafuzzaman
Hi guys, When we are using Kinesis with 1 shard then it works fine. But when we use more than 1 then it falls into an infinite loop and no data is processed by Spark Streaming. In the Kinesis DynamoDB table, I can see that it keeps increasing the leaseCounter, but it does not start processing. I am us