Hi all,

We have written an application that uses Spark over HBase, with YARN as the resource manager. Each time the application runs, it creates a SparkContext, runs a few mappers, and then finishes. On every run, regardless of how much data it processes, we see a spike in CPU usage on the cluster nodes: around 15% even with no data at all, and up to 30% for large amounts of data. The per-run pattern looks roughly like the sketch below.
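In case it helps, this is a minimal sketch of what each run does, not our real code; the table name "my_table" and the map logic are placeholders:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    object HBaseScanJob {
      def main(args: Array[String]): Unit = {
        // A fresh SparkContext is created on every run of the application.
        val conf = new SparkConf().setAppName("hbase-scan-job")
        val sc = new SparkContext(conf)

        // Read from HBase through the standard TableInputFormat.
        val hbaseConf = HBaseConfiguration.create()
        hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table") // placeholder table

        val rdd = sc.newAPIHadoopRDD(
          hbaseConf,
          classOf[TableInputFormat],
          classOf[ImmutableBytesWritable],
          classOf[Result])

        // A few simple map-side operations, then the application finishes.
        val cellCounts = rdd.map { case (_, result) => result.rawCells().length }
        println(s"rows scanned: ${cellCounts.count()}")

        sc.stop()
      }
    }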
We expected to see a change in memory usage, but not in CPU usage. Is this normal behavior? Is it caused simply by creating the SparkContext, or are we doing something wrong?

The main problem arises when we need to run multiple instances of the application concurrently: the CPU on the nodes spikes constantly to 100%, and all of the runs become very slow.

Thanks,
Dana
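P.S. For reference, we launch each instance on YARN roughly like this; the executor count, core count, memory size, class, and jar names below are illustrative, not our exact values:

    spark-submit \
      --master yarn \
      --num-executors 4 \
      --executor-cores 2 \
      --executor-memory 4g \
      --class com.example.HBaseScanJob \
      our-app.jar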