[jira] [Commented] (KYLIN-1839) improvement set classpath before submitting mr job

2016-10-18 Thread fengYu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587499#comment-15587499
 ] 

fengYu commented on KYLIN-1839:
---

Glad to do it; I will submit it later.

> improvement set classpath before submitting mr job
> --
>
> Key: KYLIN-1839
> URL: https://issues.apache.org/jira/browse/KYLIN-1839
> Project: Kylin
> Issue Type: Improvement
> Components: Job Engine
> Affects Versions: v1.5.2
> Reporter: fengYu
> Assignee: fengYu
> Fix For: v1.6.0
>
> Attachments: 0001-KYLIN-1839-support-kylin-lib-in-HDFS.patch
>
>
> In setClasspath, Kylin always finds the Hive jars from the Hive dependency 
> using a regex. However, the result does not change within one process 
> lifetime, so I cache the locations of tmpjars and tmpfiles.
> In addition, the user lib setting is extended to support HDFS paths. 
> Supporting only the local filesystem causes the jars to be uploaded every 
> time if the DistributedCache does not exist.
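
The HDFS part of the description can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not Kylin's actual Java code; the function names, the example paths, and the namenode address `nn:8020` are all hypothetical. It assumes the user lib setting is a comma-separated list of absolute paths, where a path without a scheme refers to the local filesystem.

```python
from urllib.parse import urlparse


def qualify_lib_paths(lib_setting, default_scheme="file"):
    """Turn a comma-separated lib setting into fully qualified URIs.

    Hypothetical helper: paths that already carry a scheme (e.g. hdfs://)
    are kept as-is; bare absolute paths are treated as local files.
    """
    qualified = []
    for raw in lib_setting.split(","):
        path = raw.strip()
        if not path:
            continue  # ignore empty entries such as a trailing comma
        if urlparse(path).scheme:
            qualified.append(path)
        else:
            qualified.append(f"{default_scheme}://{path}")
    return qualified


def needs_upload(uri):
    """Only local files would have to be shipped to the cluster per job."""
    return urlparse(uri).scheme == "file"


libs = qualify_lib_paths(
    "/opt/kylin/ext/udf.jar, hdfs://nn:8020/kylin/lib/common.jar"
)
tmpjars = ",".join(libs)  # value one might place in the job's tmpjars setting
```

In this sketch only the first entry would need uploading at submit time; the jar already on HDFS can be referenced in place.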



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KYLIN-1839) improvement set classpath before submitting mr job

2016-10-18 Thread Billy(Yiming) Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587486#comment-15587486
 ] 

Billy(Yiming) Liu commented on KYLIN-1839:
--

Thanks [~feng_xiao_yu], could you update the kylin.properties?






[jira] [Commented] (KYLIN-1839) improvement set classpath before submitting mr job

2016-10-18 Thread Billy(Yiming) Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587367#comment-15587367
 ] 

Billy(Yiming) Liu commented on KYLIN-1839:
--

Hi [~feng_xiao_yu], is any doc update required for this change?






[jira] [Commented] (KYLIN-1839) improvement set classpath before submitting mr job

2016-10-18 Thread liyang (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587152#comment-15587152
 ] 

liyang commented on KYLIN-1839:
---

Thanks to FengYu!

The patch is merged, followed by a code review commit.






[jira] [Commented] (KYLIN-1839) improvement set classpath before submitting mr job

2016-09-27 Thread liyang (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527849#comment-15527849
 ] 

liyang commented on KYLIN-1839:
---

We still need to divide the patch into two parts. On one hand, the HDFS 
classpath improvement is good (it could later consider other filesystems such 
as S3, MapR, etc.). On the other hand, the cache part is not always good. 
While the cache works for FengYu's case, it could cause problems for other 
users who set libs at cube level. Also, this code is not on a 
performance-critical path, so caching here won't help overall system 
performance.

I suggest the cache be removed; the classpath improvement patch can then be 
applied.
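
The concern about cube-level libs can be shown with a small sketch. It is in Python for illustration only and does not reflect Kylin's actual code; the function names and jar names are hypothetical. It assumes the classpath is simply the Hive jars joined with whatever libs the current cube configures.

```python
# Hypothetical process-wide cache, filled on first use and never refreshed.
_process_cache = None


def classpath_cached_per_process(hive_jars, cube_libs):
    """Build the classpath once per process and reuse it for every job."""
    global _process_cache
    if _process_cache is None:
        _process_cache = ",".join(hive_jars + cube_libs)
    return _process_cache


def classpath_uncached(hive_jars, cube_libs):
    """Rebuild the classpath for every job; cheap, and always current."""
    return ",".join(hive_jars + cube_libs)


hive = ["hive-exec.jar"]

# The first job (cube A) fills the cache.
first = classpath_cached_per_process(hive, ["cube_a_udf.jar"])
# The second job (cube B) silently receives cube A's libs.
second = classpath_cached_per_process(hive, ["cube_b_udf.jar"])
# Without the cache, cube B gets its own libs.
fresh = classpath_uncached(hive, ["cube_b_udf.jar"])
```

Here `second` still contains `cube_a_udf.jar`, which is the stale result a process-level cache risks once libs differ per cube.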

> improvement set classpath before submitting mr job
> --
>
> Key: KYLIN-1839
> URL: https://issues.apache.org/jira/browse/KYLIN-1839
> Project: Kylin
> Issue Type: Improvement
> Components: Job Engine
> Affects Versions: v1.5.2
> Reporter: fengYu
> Assignee: fengYu
> Attachments: 0001-KYLIN-1839-support-extend-lib-from-HDFS-and-cache-tm.patch
>
>
> In setClasspath, Kylin always finds the Hive jars from the Hive dependency 
> using a regex. However, the result does not change within one process 
> lifetime, so I cache the locations of tmpjars and tmpfiles.
> In addition, the user lib setting is extended to support HDFS paths. 
> Supporting only the local filesystem causes the jars to be uploaded every 
> time if the DistributedCache does not exist.





[jira] [Commented] (KYLIN-1839) improvement set classpath before submitting mr job

2016-09-21 Thread fengYu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509663#comment-15509663
 ] 

fengYu commented on KYLIN-1839:
---

I think the patch is usable. I believe the tmpjars and tmpfiles used for MR 
jobs are process-level, so I cache them all.






[jira] [Commented] (KYLIN-1839) improvement set classpath before submitting mr job

2016-07-05 Thread liyang (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363496#comment-15363496
 ] 

liyang commented on KYLIN-1839:
---

Note that tmpjars and tmpfiles *can change* per cube, depending on the user 
lib setting at cube level.

Support of user libs on HDFS is a good plus.



