Thanks for your reply, Val.
We tried that. It no longer spins up several servers; all of those nodes
are clients now (I still don't know the difference). But even for a
single-row RDD it still takes several minutes to retrieve the data.
It still spins up hundreds of executors even though I specify 1 executor
and 1 core.
When I dug deeper, it looks like it is creating far too many tasks, more
than 1000, so the existing executor gets backlogged and Spark spins up new
executors to process them.
If this is the expected behaviour, then I'm not sure Ignite SharedRDD is
tested enough to be functional.
For reference, I will also file a bug for this behaviour. I am adding a
code snippet of my consumer below (it's a slightly tweaked version of the
SharedRDD example on GitHub).
val ccfg = new CacheConfiguration[Int, AssetWTag]()
//ccfg.setAffinity(new RendezvousAffinityFunction(false, 1));
// Retrieve sharedRDD back from the Cache.
val transformedValues: IgniteRDD[Int, SomeEntity] =
println(">>> Transforming values stored in Ignite Shared RDD...")
// Filter out pairs which square roots are less than 100 and
// take the first five elements from the transformed IgniteRDD and print
transformedValues.take(5).foreach(println) // THIS RETURNS 1 ROW AFTER
// SPINNING UP 100 EXECUTORS, AROUND 1.5 mins
println(">>> Executing SQL query over Ignite Shared RDD...")
// Execute a SQL query over the Ignite Shared RDD.
val df = transformedValues.sql("select * from SomeEntity")
// Show ten rows from the result set.
df.show(10) // THIS ALWAYS RETURNS EMPTY
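
A note on the task count, as a hedged sketch rather than a confirmed fix: as
far as I understand, IgniteRDD exposes one Spark partition per cache
partition, and RendezvousAffinityFunction defaults to 1024 partitions, which
would match the 1000+ tasks seen above. The snippet below shows one way to
build a client-mode configuration with a smaller partition count. The object
name, the cache name "sharedRDD", the String value type, and the count of 16
are hypothetical placeholders, and the affinity setting is only expected to
take effect when the cache is first created.

```scala
import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction
import org.apache.ignite.configuration.{CacheConfiguration, IgniteConfiguration}

object SharedRddConfigSketch {
  // Hypothetical values, chosen for illustration only.
  val CacheName = "sharedRDD"
  val Partitions = 16

  // Builds a client-mode node configuration whose cache uses fewer
  // partitions than the default of 1024.
  def clientConfig(): IgniteConfiguration = {
    val ccfg = new CacheConfiguration[Int, String](CacheName)
    // false = do not exclude neighbours; second argument = partition count.
    ccfg.setAffinity(new RendezvousAffinityFunction(false, Partitions))

    val cfg = new IgniteConfiguration()
    // Join the cluster as a client instead of starting another server node.
    cfg.setClientMode(true)
    cfg.setCacheConfiguration(ccfg)
    cfg
  }
}
```

The resulting configuration could then be handed to IgniteContext through its
configuration closure; the exact constructor signature differs between Ignite
versions, so it is not shown here.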
> Looks like executors start additional server nodes, while they should be
> in client mode instead when standalone mode is used. I would recommend to
> explicitly provide clientMode=true property in the config provided to
> I also created a ticket for this:
Sent from the Apache Ignite Users mailing list archive at Nabble.com.