Hi Hua,
We have the same requirement in eBay's internal deployment. Our solution
is to add the Hadoop property to
$KYLIN_HOME/conf/kylin_job_conf.xml, like this:
<property>
<name>mapreduce.job.queuename</name>
<value>queue-name</value>
<description>Job queue</description>
</property>
The properties in this XML file are applied when running both the hive
command and MR jobs.
From 0.8 on, the Hive-related properties are separated into another file
called "kylin_hive_conf.xml", which is applied only when running the hive
command:
https://github.com/apache/incubator-kylin/blob/0.8/conf/kylin_hive_conf.xml
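As a rough sketch, and assuming kylin_hive_conf.xml uses the same
Hadoop-style <property> layout as kylin_job_conf.xml, the queue setting for
the hive step might look like the following ("queue-name" is a placeholder,
not a real queue):

```xml
<!-- Sketch of an entry in $KYLIN_HOME/conf/kylin_hive_conf.xml (0.8+). -->
<!-- Assumption: the file accepts Hadoop-style properties inside <configuration>. -->
<configuration>
    <property>
        <name>mapreduce.job.queuename</name>
        <!-- Placeholder: replace with a queue defined in your YARN scheduler -->
        <value>queue-name</value>
        <description>YARN queue used by the hive command</description>
    </property>
</configuration>
```

On 0.8 and later, MR jobs would presumably still read the queue from
kylin_job_conf.xml, so the property would likely need to be set in both files.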
Basically, we don't want to add such Hadoop configurations to
kylin.properties; kylin.properties is for Kylin-specific settings.
Let me know if this answers your question.
On 8/12/15, 2:25 PM, "Huang Hua" <[email protected]> wrote:
>Our Hadoop cluster has multiple YARN execution queues for running Hadoop
>jobs (like MR, Spark) at different resource capacities.
>
>
>
>But the current implementation of IntermediateHiveTableStep doesn't have
>an option for users to specify the YARN queue, so it basically runs the
>"hive -e" command in the *DEFAULT* queue. Unfortunately, the *DEFAULT*
>queue might not have enough resources configured.
>
>
>
>I think it would be great to allow users to specify the queue for running
>KYLIN jobs, and as far as I know it can be accomplished easily:
>
>1. In kylin.properties, specify the MR argument, e.g.
>"kylin.job.cmd.extra.args=-D mapreduce.job.queuename=your_yarn_queue"
>
>2. Modify KylinConfig to add an option for the YARN queue
>
>3. Modify the createIntermediateHiveTableStep method of AbstractJobBuilder
>to append "SET mapreduce.job.queuename=your_yarn_queue" to the "hive -e"
>command
>
>Steps 2 and 3 need only a little bit of coding.
>
>
>
>I am not sure whether the above approach is the best way of doing it, so I
>would like to hear opinions from the KYLIN community.
>
>
>
>
>Thanks,
>
>Hua
>