Hi all,

I have a Hadoop 3.3.4 cluster with 6 equally sized nodes that run HDFS and YARN, plus 1 smaller node that runs only YARN, which I use for Hive AMs, the LLAP AM, Spark AMs, and Hive file-merge containers. On the HDFS nodes, the YARN NodeManager queue for LLAP is allocated exactly the resources the LLAP daemons consume.

The problem is re-launching LLAP. Currently I have to stop the NodeManager process on each HDFS node, launch LLAP so that the application master is guaranteed to land on the YARN-only machine, and then start the NodeManagers again so the daemons can spawn on the HDFS nodes. This wasn't a problem when Hive/LLAP was the only thing using YARN, but my company has now started using Spark, so if LLAP happens to crash, I would need to wait for running Spark jobs to finish before I could re-launch it. That would put our ETL processes behind, potentially with unacceptable delays.

I could allocate 1 extra vcore and 1024 MB of memory for the LLAP queue on each machine, but that would leave 5 vcores and 5 GB of RAM reserved and unused at all times. So I was wondering: is there a way to specify which node the LLAP AM is launched on, perhaps through YARN node labels, similar to Spark's "spark.yarn.am.nodeLabelExpression" configuration? Or even a way to pin it to a specific node through some other mechanism?

My Hive version is 3.1.3.
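For context, here is roughly the YARN-side node-label setup I know how to do today (the label name "llap-am", the hostname "am-host", and the queue path "root.llap" are placeholders for my actual values, and I'm not sure this is the recommended approach):

```shell
# Prerequisite in yarn-site.xml:
#   yarn.node-labels.enabled = true
#   yarn.node-labels.fs-store.root-dir = hdfs:///yarn/node-labels

# Add a non-exclusive label and attach it to the YARN-only node:
yarn rmadmin -addToClusterNodeLabels "llap-am(exclusive=false)"
yarn rmadmin -replaceLabelsOnNode "am-host=llap-am"

# capacity-scheduler.xml changes for the LLAP queue:
#   yarn.scheduler.capacity.root.llap.accessible-node-labels = llap-am
#   yarn.scheduler.capacity.root.llap.accessible-node-labels.llap-am.capacity = 100
#   yarn.scheduler.capacity.root.llap.default-node-label-expression = llap-am
```

But as far as I can tell, default-node-label-expression applies to every container submitted to the queue, not just the AM, which is the opposite of what I need: the AM should go to the labeled node while the daemons spread across the 6 HDFS nodes. Hence the question about an AM-only label expression.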
Thanks, Aaron