Our Hadoop cluster has multiple YARN queues for running Hadoop jobs (such as MapReduce and Spark), each with a different resource capacity.
However, the current implementation of IntermediateHiveTableStep gives users no way to specify the YARN queue, so the "hive -e" command always runs in the *DEFAULT* queue. Unfortunately, the *DEFAULT* queue might not have enough resources configured.

I think it would be great to allow users to specify the queue for KYLIN jobs, and as far as I know it can be accomplished easily:

1. In kylin.properties, specify the MR argument, e.g. "kylin.job.cmd.extra.args=-D mapreduce.job.queuename=your_yarn_queue"
2. Modify KylinConfig to add an option for the YARN queue
3. Modify the createIntermediateHiveTableStep method of AbstractJobBuilder to append "SET mapreduce.job.queuename=your_yarn_queue" to the "hive -e" command

Steps 2 and 3 need only a little coding.

I am not sure whether the above approach is the best way of doing it, so I would like to hear opinions from the KYLIN community.

Thanks,
Hua
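
P.S. Here is a rough sketch of what steps 2 and 3 could look like. It is not the actual KYLIN code: the class name HiveQueueSketch, the property key "kylin.job.yarn.queue", and both method names are made up for illustration. The sketch places the SET statement before the query text, on the assumption that the setting has to be in effect before the query runs.

```java
import java.util.Properties;

public class HiveQueueSketch {
    // Hypothetical property key; not an existing KYLIN setting.
    static final String QUEUE_KEY = "kylin.job.yarn.queue";

    // Step 2 (sketch): read the optional queue name from the loaded properties.
    // Returns null when the key is missing or blank.
    static String getYarnQueue(Properties props) {
        String queue = props.getProperty(QUEUE_KEY);
        if (queue == null || queue.trim().isEmpty()) {
            return null;
        }
        return queue.trim();
    }

    // Step 3 (sketch): build the "hive -e" command, adding the SET statement
    // only when a queue has been configured.
    static String buildHiveCommand(String hql, String queue) {
        StringBuilder cmd = new StringBuilder("hive -e \"");
        if (queue != null) {
            cmd.append("SET mapreduce.job.queuename=").append(queue).append("; ");
        }
        cmd.append(hql).append("\"");
        return cmd.toString();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(QUEUE_KEY, "your_yarn_queue");
        System.out.println(buildHiveCommand("SELECT 1;", getYarnQueue(props)));
    }
}
```

When no queue is configured, the command is left unchanged, so existing deployments would keep running in the *DEFAULT* queue.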
