Thanks for the info!
On 13.07.2015 at 11:22, Arjun Sharma wrote:
I am not measuring RAM or CPU usage. I am just measuring the overall
time the job takes to finish on a large input. For assigning RAM to
the workers, I am using the job parameters
-Dmapreduce.map.memory.mb=9300 -Dmapreduce.map.java.opts="-Xms9G
-Xmx9G" (I am running on YARN).
On Mon, Jul 13, 2015 at 2:05 AM, Sonja Koenig <[email protected]> wrote:
Hi there!
On a related matter:
May I ask how you perform your measurements, especially for
capturing RAM and CPU usage?
I also want to run some performance tests, and I would be thankful
to hear how you approached this ;)
Regards,
Sonja
On 13.07.2015 at 10:56, Arjun Sharma wrote:
Hi,
Many of the discussions on this forum suggest using one worker
per physical machine and increasing the number of threads per
worker, rather than using multiple workers per physical machine
with fewer threads each. This does not seem to be the
case in my experiments.
The cluster I am using has 12 physical machines (used
exclusively for workers), 64 GB of RAM and 12 cores each. I
experimented with two setups:
Setup 1 runs 72 workers (i.e., 6 workers per machine), 72*72
partitions, which is the default, and 8 threads per worker.
Setup 2 tries to simulate Setup 1, but using threads instead
of workers. It has 12 workers (1 worker per machine) and
72*72 partitions (using numUserPartitions). Since the number
of parallel tasks per machine in Setup 1 is 6 workers * 8
threads = 48, the number of compute, input, and output
threads is set to 48.
In both cases 56 GB of RAM is assigned equally to all workers
on the machine (either given to the 1 worker on that machine
or divided among 6 of them).
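The arithmetic behind the two setups can be checked with a small sketch. It uses only the figures quoted above (12 machines, 12 cores each, 56 GB for workers per machine); the function and key names are hypothetical and chosen for illustration only.

```python
# Figures taken from the message above; names are illustrative only.
MACHINES = 12
CORES_PER_MACHINE = 12
RAM_FOR_WORKERS_GB = 56  # RAM per machine shared equally by its workers


def describe(workers_per_machine, threads_per_worker):
    """Summarise worker count, thread load and per-worker RAM for a setup."""
    threads_per_machine = workers_per_machine * threads_per_worker
    return {
        "total_workers": MACHINES * workers_per_machine,
        "threads_per_machine": threads_per_machine,
        "threads_per_core": threads_per_machine / CORES_PER_MACHINE,
        "ram_per_worker_gb": round(RAM_FOR_WORKERS_GB / workers_per_machine, 2),
    }


setup1 = describe(workers_per_machine=6, threads_per_worker=8)
setup2 = describe(workers_per_machine=1, threads_per_worker=48)

print("Setup 1:", setup1)
print("Setup 2:", setup2)
```

Both setups come out at 48 threads per machine on 12 cores, so they differ only in how those threads and the 56 GB are split across worker processes: about 9.33 GB for each of 6 workers versus 56 GB for a single worker.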
In my case, Setup 1 performs significantly better (faster)
than Setup 2, which seems counterintuitive and disagrees
with the other suggestions to use fewer workers and
more threads. Is there anything I am missing here?
Is there any kind of tuning or configuration parameter setting
that can make Setup 2 outperform Setup 1?
Thanks!