No, I don't use YARN. This is Spark in standalone mode, as shipped with the DataStax Enterprise distribution of Cassandra.
On Thu, Oct 26, 2017 at 11:22 PM, Jörn Franke <jornfra...@gmail.com> wrote:

> Do you use YARN? Then you need to configure the queues with the right
> scheduler and method.
>
> On 27. Oct 2017, at 08:05, Cassa L <lcas...@gmail.com> wrote:
>
> Hi,
> I have a Spark job with the following use case:
> RDD1 and RDD2 read from Cassandra tables. These two RDDs then go through
> some transformations, and after that I do a count on the transformed data.
>
> The code looks somewhat like this:
>
> RDD1=JavaFunctions.cassandraTable(...)
> RDD2=JavaFunctions.cassandraTable(...)
> RDD3 = RDD1.flatMap(..)
> RDD4 = RDD2.flatMap()
>
> RDD3.count
> RDD4.count
>
> In the Spark UI I see the count() calls running one after another.
> How do I make them run in parallel? I also looked at the discussion below
> from Cloudera, but it does not show how to run driver functions in
> parallel. Do I just add an Executor and run them in threads?
>
> https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Getting-Spark-stages-to-run-in-parallel-inside-an-application/td-p/38515
>
> <Screen Shot 2017-10-26 at 10.54.51 PM.png>
> Attaching UI snapshot here.
>
> Thanks.
> LCassa
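
For reference, here is a minimal sketch of the "run them in threads" idea from the question above. It is not taken from the thread: the class name `ParallelCounts` and the helper `countBoth` are hypothetical, and the two suppliers are plain stand-ins for `RDD3.count()` and `RDD4.count()` so that the snippet runs without a Spark cluster. The assumption behind it is that Spark accepts jobs submitted from several driver threads at once, so each blocking action is handed to its own thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

class ParallelCounts {
    // Hypothetical helper: starts two blocking actions on separate driver-side
    // threads and waits until both have produced a result.
    static long[] countBoth(Supplier<Long> first, Supplier<Long> second) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Both tasks are submitted before either result is awaited,
            // so they can run at the same time.
            CompletableFuture<Long> a = CompletableFuture.supplyAsync(first, pool);
            CompletableFuture<Long> b = CompletableFuture.supplyAsync(second, pool);
            return new long[] { a.join(), b.join() };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the real actions; in the job from the question these
        // would be () -> RDD3.count() and () -> RDD4.count().
        long[] counts = countBoth(() -> 3L, () -> 4L);
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Whether the two jobs actually overlap on the cluster still depends on free cores and on the scheduler mode inside the application (`spark.scheduler.mode`, FIFO by default), so this is a starting point rather than a guarantee. The Java RDD API also offers `countAsync()`, which may avoid managing a thread pool by hand.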