Hi all,

We have written an application that uses Spark over HBase, with YARN as the 
resource manager.
Currently, each time the application runs it creates a SparkContext, runs a 
few mappers, and then finishes.
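For reference, here is a stripped-down sketch of what the application does, 
assuming the standard TableInputFormat read path (the table name and class 
names below are placeholders, not our real code):

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    object HBaseScanApp {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("HBaseScanApp"))

        val hbaseConf = HBaseConfiguration.create()
        hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table") // placeholder

        // One Spark partition per HBase region; these are the "mappers".
        val rows = sc.newAPIHadoopRDD(hbaseConf,
          classOf[TableInputFormat],
          classOf[ImmutableBytesWritable],
          classOf[Result])

        println("row count: " + rows.count())
        sc.stop()
      }
    }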
On every run, regardless of how much data it processes, we see a spike in 
CPU usage on the cluster nodes.
With no data at all the spike is around 15%, and with large amounts of data 
it can reach even 30% CPU usage.

We expected to see a change in memory usage, but not in CPU usage.
Is this normal behavior? Is it caused simply by creating the SparkContext?
Or maybe we are doing something wrong?
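To check whether the SparkContext alone is responsible, we were thinking of 
running a do-nothing job like the sketch below: if this alone reproduces the 
~15% spike, the cost would seem to come from executor launch and JVM startup 
rather than from our mappers.

    import org.apache.spark.{SparkConf, SparkContext}

    object ContextOnly {
      def main(args: Array[String]): Unit = {
        // Start a context, hold the executors briefly so the spike is
        // visible in monitoring, then shut down. No jobs are submitted.
        val sc = new SparkContext(new SparkConf().setAppName("ContextOnly"))
        Thread.sleep(30000)
        sc.stop()
      }
    }

Does that sound like a reasonable way to isolate it?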

The main problem arises when we need to run multiple instances of the 
application: in that case the CPU on the nodes spikes constantly to 100%, 
and all of the applications run very slowly.
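So far our only idea is to cap the resources each instance may take, e.g. 
something like the following (hypothetical numbers, to be tuned per cluster). 
On YARN each application would then use roughly 
executor.instances * executor.cores vcores, plus one core for the 
application master:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("HBaseScanApp")
      .set("spark.executor.instances", "4") // executors per app (YARN)
      .set("spark.executor.cores", "2")     // cores per executor
      .set("spark.executor.memory", "2g")   // heap per executor

Is capping per-application resources like this the right direction, or is 
there a better approach?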

Thanks
Dana.


