Giraph uses threads for compute, the Netty server, the Netty client on workers,
execution pools, input, output, etc. You can see most of these options in
org.apache.giraph.conf.GiraphConstants, for instance:

    /** Netty client threads */
    IntConfOption NETTY_CLIENT_THREADS =
        new IntConfOption("giraph.nettyClientThreads", 4, "Netty client threads");

    /** Netty server threads */
    IntConfOption NETTY_SERVER_THREADS =
        new IntConfOption("giraph.nettyServerThreads", 16, "Netty server threads");

    /** Number of threads for vertex computation */
    IntConfOption NUM_COMPUTE_THREADS =
        new IntConfOption("giraph.numComputeThreads", 1,
            "Number of threads for vertex computation");

    /** Number of threads for input split loading */
    IntConfOption NUM_INPUT_THREADS =
        new IntConfOption("giraph.numInputThreads", 1,
            "Number of threads for input split loading");

The idea is that if you run your job on a cluster of 5 machines, typically 1
machine is the master and 4 of them are "workers", which load the graph and
compute on it. Each worker is a separate machine, and to maximize its
utilization we can use as many threads as it can handle.
However, if you are running in pseudo mode, then all workers run on the same
machine and each one still tries to launch the default number of threads set in
the config. Even though each worker is now a thread (instead of a machine), it
still launches all these other threads regardless. Anyway, you can configure the
threads spawned by workers to reduce the overall number of threads launched on
your one machine.
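
To make the arithmetic concrete, here is a rough sketch in plain Java (no Giraph dependency) that adds up the four pools listed above and prints the overrides as command-line arguments. The class ThreadBudget and its methods are made up for illustration. The "-ca key=value" form assumes the job is launched through GiraphRunner's custom-argument option. The reduced values (1, 2, 1, 1) are only an example, not a recommendation. The estimate counts just these four options; the JVM, Hadoop and Giraph's other pools add more threads on top.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: estimates threads from the four options quoted above. */
final class ThreadBudget {

    /** Default values as listed in GiraphConstants in this message. */
    static Map<String, Integer> defaults() {
        Map<String, Integer> conf = new LinkedHashMap<>();
        conf.put("giraph.nettyClientThreads", 4);
        conf.put("giraph.nettyServerThreads", 16);
        conf.put("giraph.numComputeThreads", 1);
        conf.put("giraph.numInputThreads", 1);
        return conf;
    }

    /** Example of reduced settings for a single shared machine (illustrative values). */
    static Map<String, Integer> reduced() {
        Map<String, Integer> conf = defaults();
        conf.put("giraph.nettyClientThreads", 1);
        conf.put("giraph.nettyServerThreads", 2);
        return conf;
    }

    /** Sum of the configured pool sizes for one worker. */
    static int perWorker(Map<String, Integer> conf) {
        int total = 0;
        for (int threads : conf.values()) {
            total += threads;
        }
        return total;
    }

    /** Lower-bound estimate for all workers sharing one machine. */
    static int onOneMachine(Map<String, Integer> conf, int workers) {
        return perWorker(conf) * workers;
    }

    /** Renders the settings as "-ca key=value" arguments (assumed GiraphRunner form). */
    static String toCustomArgs(Map<String, Integer> conf) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Map.Entry<String, Integer> entry : conf.entrySet()) {
            joiner.add("-ca " + entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        int workers = 10; // the worker count from the question below
        System.out.println("defaults: " + onOneMachine(defaults(), workers) + " threads");
        System.out.println("reduced:  " + onOneMachine(reduced(), workers) + " threads");
        System.out.println(toCustomArgs(reduced()));
    }
}
```

With the defaults this gives 22 threads per worker from these four pools alone, so 10 workers on one machine ask for at least 220; the reduced settings bring that down to 5 per worker.
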
From: [email protected]
To: [email protected]
Subject: Optimal number of Workers
Date: Tue, 15 Apr 2014 13:34:53 +0200
Hello! Can anybody explain how threads are used by a worker in Giraph? For
which purposes? How is the number of threads to use determined by a worker?
I often get the following error: org.apache.hadoop.mapred.Child: Error running
child : java.lang.OutOfMemoryError: unable to create new native thread.
A check on the number of threads per worker shows child processes with 100
threads per worker process (10 workers on a 12-processor machine), which is in
my opinion too large, isn't it? If I reduce the number of workers, the number
of threads decreases. How should we choose the number of workers?
Thanks in advance.
Chadi