[ 
https://issues.apache.org/jira/browse/KYLIN-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110855#comment-15110855
 ] 

Zhong Yanghong commented on KYLIN-1082:
---------------------------------------

Thanks, everyone. My initial version is finally done. The main change is in 
the file 
engine-mr/src/main/java/org/apache/kylin/engine/mr/common/AbstractHadoopJob.java.
The new code filters the system-level property "kylin.hive.dependency" to 
pick out the Hive dependency jars we are interested in. These jars are then 
added to Hadoop's special property "tmpjars", whose jars are uploaded to each 
datanode that runs the MapReduce work. 
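
The filtering step could look roughly like the sketch below. It is not the 
code from the patch: the class and method names are made up for illustration, 
and it assumes that "kylin.hive.dependency" is a colon-separated classpath 
string and that "tmpjars" takes a comma-separated list. In a real job the 
result would go into the Hadoop job configuration; here it is only printed.

```java
import java.util.ArrayList;
import java.util.List;

public class HiveDependencyFilter {

    // Hypothetical helper: keeps only the entries that point at jar files.
    // Assumes the property value is a colon-separated classpath string.
    public static List<String> filterJars(String hiveDependency) {
        List<String> jars = new ArrayList<>();
        if (hiveDependency == null) {
            return jars;
        }
        for (String entry : hiveDependency.split(":")) {
            String trimmed = entry.trim();
            if (trimmed.endsWith(".jar")) {
                jars.add(trimmed);
            }
        }
        return jars;
    }

    // Hypothetical helper: appends the jars to an existing "tmpjars" value,
    // assuming "tmpjars" is a comma-separated list.
    public static String appendToTmpJars(String existing, List<String> jars) {
        StringBuilder sb = new StringBuilder(existing == null ? "" : existing);
        for (String jar : jars) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(jar);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dependency = System.getProperty("kylin.hive.dependency", "");
        // Prints the value that would be written to "tmpjars".
        System.out.println(appendToTmpJars(null, filterJars(dependency)));
    }
}
```

The sketch passes paths through unchanged; any conversion to the form Hadoop 
expects is left out.
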
How "kylin.hive.dependency" gets its value depends on the platform. 
On development and testing machines that don't have Hive installed and start 
Kylin through "DebugTomcat.java", just add 
System.setProperty("kylin.hive.dependency","XXXX"). On the sandbox, which 
has Hive installed and starts Kylin with kylin.sh, the shell script 
find-hive-dependency.sh invoked by kylin.sh sets the property automatically.
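
For the development case, a minimal sketch of setting the property before 
startup is shown below. The class and helper names are hypothetical and are 
not part of DebugTomcat.java, and "XXXX" is the same placeholder as above, 
not a real path. The helper only sets the value when nothing else (for 
example a -D JVM flag) has already provided one.

```java
public class DebugStartupSketch {

    // Hypothetical helper: sets the property only if it is not set yet and
    // returns the value that is in effect afterwards.
    public static String ensureHiveDependency(String fallback) {
        if (System.getProperty("kylin.hive.dependency") == null) {
            System.setProperty("kylin.hive.dependency", fallback);
        }
        return System.getProperty("kylin.hive.dependency");
    }

    public static void main(String[] args) {
        // "XXXX" is a placeholder, to be replaced with the local Hive jars.
        System.out.println(ensureHiveDependency("XXXX"));
    }
}
```
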
There is also a way to add other helpful jars. In the file 
"kylin.properties", set the property "kylin.job.mr.lib.dir". 
AbstractHadoopJob.java will then collect all jars and files under this 
user-defined directory, including its subdirectories, and add them to "tmpjars".
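
The directory scan could be sketched as follows. This is an illustration 
only, not the code from the patch: the class and method names are 
hypothetical, the directory is taken from the command line instead of 
"kylin.properties", and the comma-separated output format is an assumption 
about what "tmpjars" expects.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MrLibDirScanner {

    // Hypothetical helper: returns the absolute paths of all regular files
    // under dir, including files in subdirectories, in sorted order.
    // Returns an empty list if dir does not exist or is not a directory.
    public static List<String> listFilesRecursively(String dir) throws IOException {
        Path root = Paths.get(dir);
        if (!Files.isDirectory(root)) {
            return Collections.emptyList();
        }
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .map(p -> p.toAbsolutePath().toString())
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the value of "kylin.job.mr.lib.dir".
        String dir = args.length > 0 ? args[0] : ".";
        // Prints the entries that would be appended to "tmpjars".
        System.out.println(String.join(",", listFilesRecursively(dir)));
    }
}
```
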


> Hive dependencies should be add to tmpjars
> ------------------------------------------
>
>                 Key: KYLIN-1082
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1082
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: liyang
>            Assignee: Zhong Yanghong
>              Labels: newbie
>
> Currently Kylin assumes all data nodes have Hive deployed at the exact same 
> FS location. However, a better approach is to treat Hive as a client-side 
> app. Then we need to ship the Hive jars with the MR job every time.
> This makes deploying Kylin a lot easier in clusters that do not have Hive on 
> all data nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
