Hello!
We have Hadoop/HDFS running with Yarn/Spark on worker nodes for
processing jobs that are run on a schedule. We would like to introduce a
queue for long-running Spark streaming jobs that run indefinitely and do
not exit, without interfering with the scheduled jobs or
Hadoop/HBase/HDFS. We currently limit Yarn to 11 CPUs and want to bump
it up to 14 CPUs to handle this additional queue. Is this a sensible
thing to do on the workers themselves? From a bit of profiling, it seems
the non-Yarn/Spark processes don't require much CPU, but is there a
recommended resource allocation for Hadoop/HBase/HDFS that I can
reference? Each worker has 24 CPUs, 125 GB RAM, and 8 disks.
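
For reference, here is roughly the change I have in mind (just a
sketch; the "streaming" queue name and the 80/20 capacity split are
placeholders, and this assumes we stay on the CapacityScheduler):

  <!-- yarn-site.xml: raise NodeManager vcores from 11 to 14 -->
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>14</value>
  </property>

  <!-- capacity-scheduler.xml: add a queue for the streaming jobs -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,streaming</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>80</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.streaming.capacity</name>
    <value>20</value>
  </property>
  <property>
    <!-- hard cap so streaming jobs can't starve the scheduled jobs -->
    <name>yarn.scheduler.capacity.root.streaming.maximum-capacity</name>
    <value>20</value>
  </property>

The maximum-capacity cap is there so the long-running streaming jobs
cannot spill over and starve the scheduled batch jobs.
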
Thanks!