Hi all,

I have a Hadoop cluster (3.3.4) with six equally sized nodes running HDFS and YARN, plus one node with fewer resources that runs only YARN, which I use for Hive AMs, the LLAP AM, Spark AMs, and Hive file-merge containers. On the HDFS nodes, the YARN queue for LLAP is allocated resources exactly equal to what the LLAP daemons consume. As a result, whenever I need to re-launch LLAP I currently have to stop the NodeManager process on each HDFS node, launch LLAP so the application master is guaranteed to land on the YARN-only machine, and then restart the NodeManagers so the daemons can start spawning on those nodes.

This wasn't a problem while Hive/LLAP was the only YARN workload, but we've now started using Spark at my company, so if LLAP happens to crash I would have to wait for running Spark jobs to finish before I could re-launch it. That would put our ETL processes behind, potentially by unacceptable delays. I could allocate an extra 1 vcore and 1024 MB of memory to the LLAP queue on each machine, but that would mean 5 vcores and 5 GB of RAM sitting reserved and unused at all times.

Is there a way to specify which node the LLAP AM launches on, perhaps through YARN node labels, similar to Spark's "spark.yarn.am.nodeLabelExpression" configuration? Or a way to specify the node through some other mechanism? My Hive version is 3.1.3.
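In case it helps frame the question, here's roughly the node-label setup I was imagining. This is an untested sketch; the label name "llap-am" and the hostname are placeholders, and it assumes the Capacity Scheduler:

```shell
# Untested sketch -- "llap-am" and the hostname below are placeholders;
# property names should be checked against the Hadoop 3.3.4 docs.

# 1. Enable node labels on the ResourceManager (yarn-site.xml):
#      yarn.node-labels.enabled=true
#      yarn.node-labels.fs-store.root-dir=hdfs:///yarn/node-labels

# 2. Create a non-exclusive label and attach it to the YARN-only host:
yarn rmadmin -addToClusterNodeLabels "llap-am(exclusive=false)"
yarn rmadmin -replaceLabelsOnNode "yarn-only-host.example.com=llap-am"

# 3. In capacity-scheduler.xml, give the LLAP queue access to the label:
#      yarn.scheduler.capacity.root.llap.accessible-node-labels=llap-am
#      yarn.scheduler.capacity.root.llap.accessible-node-labels.llap-am.capacity=100

# 4. Verify the label mapping took effect:
yarn cluster --list-node-labels
yarn node -status yarn-only-host.example.com
```

What I can't tell is whether Hive/LLAP exposes a way to apply such a label expression to the AM only, rather than to the whole queue (which would also pull the daemons onto that node) - that's really the crux of my question.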

Thanks,
Aaron
