Not separately at the level of `flatMap` and `map`.  The number of
partitions in the RDD those operations are working on determines the
potential parallelism.  The number of worker cores available determines how
much of that potential can actually be realized.
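To make that concrete, here is a minimal sketch of the quoted word-count example, annotated with where the parallelism knobs actually live. The host/port, `local[4]`, and the partition count `8` are illustrative assumptions, not values from the original thread:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

// `local[4]` runs the job with 4 local worker threads; on a cluster,
// the total number of executor cores plays the same role.
val ssc = new StreamingContext("local[4]", "WordCount", Seconds(1))

val lines = ssc.socketTextStream("localhost", 9999)

// flatMap and map have no per-operator thread setting: they run with
// one task per partition of the stream they operate on.
val words = lines.flatMap(_.split(" "))

// Shuffle operations like reduceByKey accept an optional numPartitions
// argument, which sets the parallelism of the reduce stage (8 here is
// an arbitrary example).
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _, 8)

wordCounts.print()
ssc.start()
```

When no explicit partition count is given, shuffle operations fall back to the `spark.default.parallelism` configuration setting.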


On Tue, Oct 22, 2013 at 7:24 AM, Ryan Chan <[email protected]> wrote:

> In Storm, you can control the number of threads with setSpout/setBolt,
> and how to do the same with Spark Streaming?
>
> e.g.
>
> val lines = ssc.socketTextStream(args(1), args(2).toInt)
> val words = lines.flatMap(_.split(" "))
> val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
> wordCounts.print()
> ssc.start()
>
>
> Sounds like I cannot tell Spark how many threads to use with
> `flatMap` and how many threads to use with `map` etc, right?
>
>
>
