Your understanding is the right one (having re-read the documentation). I'm still wondering how I can verify that 5 partitions have been created. My job reads from a Kafka topic that has 5 partitions and sends the data to ES (Elasticsearch). I can see that when there is one task reading from Kafka there are 5 tasks writing to ES. So I'm supposing that the task reading from Kafka does it in parallel across the 5 partitions, and that's why there are then 5 tasks to write to ES. But I'm only supposing ...
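One way to verify it directly, rather than inferring it from task counts in the UI, would be to print the partition count of each batch's RDD from inside the job. A minimal sketch against the Spark 1.x Python API (the topic name, broker address, and batch interval below are placeholders, not values from this thread):

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="partition-check")
    ssc = StreamingContext(sc, 10)  # 10-second batches (placeholder)

    # Direct approach: no receiver; each batch is one RDD whose
    # partitions map 1:1 to the Kafka topic's partitions
    stream = KafkaUtils.createDirectStream(
        ssc, ["my-topic"], {"metadata.broker.list": "broker1:9092"})

    def show_partitions(rdd):
        # Should print 5 for a 5-partition topic
        print("partitions in this batch: %d" % rdd.getNumPartitions())

    stream.foreachRDD(show_partitions)

    ssc.start()
    ssc.awaitTermination()

If that prints 5 every batch, the read really is spread over 5 partitions, and the 5 ES-writing tasks follow from that.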
> On Feb 16, 2016, at 21:12, ayan guha <guha.a...@gmail.com> wrote:
>
> I have a slightly different understanding.
>
> A direct stream generates 1 RDD per batch; however, the number of
> partitions in that RDD = the number of partitions in the Kafka topic.
>
> On Wed, Feb 17, 2016 at 12:18 PM, Cyril Scetbon <cyril.scet...@free.fr> wrote:
>
> Hi guys,
>
> I'm running some tests with Spark and Kafka using a Python script. I use
> the second method, the one that doesn't need any receiver (the Direct
> Approach). It should adapt the number of RDDs to the number of partitions
> in the topic, and I'm trying to verify that. What's the easiest way to
> verify it? I also tried co-locating Yarn, Spark and Kafka to check whether
> RDDs are created depending on the leaders of the partitions in a topic,
> and they are not. Can you confirm that RDDs are not created depending on
> the location of the partitions, and that co-locating Kafka with Spark is
> either not a must-have or not something Spark takes advantage of?
>
> As the parallelism is simplified (by creating as many RDDs as there are
> partitions), I suppose the biggest part of the tuning is playing with
> Kafka partitions (not talking about network configuration or management
> of Spark resources)?
>
> Thank you
>
> --
> Best Regards,
> Ayan Guha
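PS: to see exactly which Kafka partition and offset range each RDD partition maps to, the direct stream also exposes offsetRanges(). This follows the pattern from the Spark streaming-kafka docs; it assumes the `stream` variable from the sketch above, and note that offsetRanges() must be read before any transformation changes the RDD type:

    offsets = []

    def store_offsets(rdd):
        # Capture the offset ranges while the RDD is still a KafkaRDD
        global offsets
        offsets = rdd.offsetRanges()
        return rdd

    def print_offsets(rdd):
        for o in offsets:
            # One OffsetRange per Kafka partition -> one RDD partition
            print("%s partition=%d offsets=[%d, %d)" %
                  (o.topic, o.partition, o.fromOffset, o.untilOffset))

    stream.transform(store_offsets).foreachRDD(print_offsets)

Seeing 5 OffsetRange entries per batch would confirm the 1:1 mapping without relying on the task counts.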