Your understanding is the right one (I agree after re-reading the documentation). I'm 
still wondering how I can verify that 5 partitions have been created. My job reads 
from a Kafka topic that has 5 partitions and sends the data to E/S. I can see that 
when there is one task reading from Kafka, there are 5 tasks writing to E/S. So I'm 
supposing that the task reading from Kafka does it in parallel using 5 partitions, 
and that's why there are then 5 tasks writing to E/S. But that's still only a 
supposition ...
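
For what it's worth, this is the kind of check I have in mind (a rough, untested 
sketch; the app name, topic, broker list and batch interval are placeholders, and it 
assumes the job is submitted with the spark-streaming-kafka package available):

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="partition-check")   # placeholder app name
ssc = StreamingContext(sc, 10)                 # placeholder 10s batch interval

# Direct Approach (no receiver); topic name and broker list are placeholders
stream = KafkaUtils.createDirectStream(
    ssc, ["my_topic"], {"metadata.broker.list": "broker1:9092"})

def show_partitions(rdd):
    # one RDD per batch; its partition count should match the number of
    # partitions of the Kafka topic (5 in my case)
    print("batch RDD has %d partitions" % rdd.getNumPartitions())

stream.foreachRDD(show_partitions)

ssc.start()
ssc.awaitTermination()

If that prints 5 for every batch, I guess that would confirm the assumption.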

> On Feb 16, 2016, at 21:12, ayan guha <guha.a...@gmail.com> wrote:
> 
> I have a slightly different understanding. 
> 
> Direct stream generates one RDD per batch; however, the number of partitions in 
> that RDD equals the number of partitions in the Kafka topic.
> 
> On Wed, Feb 17, 2016 at 12:18 PM, Cyril Scetbon <cyril.scet...@free.fr> wrote:
> Hi guys,
> 
> I'm running some tests with Spark and Kafka using a Python script. I use the 
> second method, the one that doesn't need any receiver (the Direct Approach). It 
> should adapt the number of RDDs to the number of partitions in the topic, and 
> I'm trying to verify that. What's the easiest way to verify it? I also tried to 
> co-locate YARN, Spark and Kafka to check whether RDDs are created depending on 
> the leaders of the partitions in a topic, and they are not. Can you confirm that 
> RDDs are not created depending on the location of partitions, and that 
> co-locating Kafka with Spark is not a must-have, or at least that Spark does not 
> take advantage of it?
> 
> As the parallelism is simplified (by creating as many RDDs as there are 
> partitions), I suppose that the biggest part of the tuning is playing with 
> Kafka partitions (not talking about network configuration or the management of 
> Spark resources)?
> 
> Thank you
> 
> -- 
> Best Regards,
> Ayan Guha
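
PS: on the tuning point in my original message quoted above, my understanding is 
that the number of Kafka partitions fixes the parallelism of the read, but if the 
writing side ever needs more tasks than that, I suppose the stream can be 
repartitioned before the output stage, at the cost of a shuffle. A rough, untested 
sketch (stream is the direct stream from the snippet above, 20 is just a 
placeholder, and write_partition is a stand-in for the real E/S bulk write):

# widen the parallelism after the Kafka read; repartition() shuffles the
# data, so it only pays off when the downstream work is the bottleneck
wider = stream.repartition(20)  # 20 is a placeholder, not a recommendation

def write_partition(records):
    # stand-in for the real bulk write to E/S
    for record in records:
        pass

wider.foreachRDD(lambda rdd: rdd.foreachPartition(write_partition))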
