[ 
https://issues.apache.org/jira/browse/FLINK-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483689#comment-17483689
 ] 

Biao Geng commented on FLINK-24897:
-----------------------------------

Hi [~wangyang0918], I have completed the 
[PR|https://github.com/apache/flink/pull/18531]. Would you mind reviewing it in 
the next few days? Thanks!

Besides, I find that the method 
`ClusterEntrypointUtils#tryFindUserLibDirectory` may be buggy. The entrypoint 
classes use it to check whether a `usrlib` directory exists in the 
standalone/k8s/YARN cluster. However, the code uses `FLINK_LIB_DIR` to locate 
`usrlib` on the remote side, while on YARN the uploaded `usrlib` is located in 
the YARN cache dir (e.g. appcache/application_xx/container_yy). As a result, if 
`FLINK_LIB_DIR` is set by default in the remote cluster, 
`tryFindUserLibDirectory` will look for `usrlib` in the wrong location.
The bug does not affect correctness today because, in most cases, 
`FLINK_LIB_DIR` is not set in the remote cluster's environment, so the current 
code falls back to the working dir, which is `System.getProperty("user.dir")` 
by default. The working dir is the correct choice in most cases. 
I am not sure whether this bug should be fixed in this PR or in a new one, 
since the fix will affect the standalone/k8s/YARN cluster entrypoints.
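
To make the lookup order described above concrete, here is a minimal Java sketch. It is not the actual Flink code: the class `UsrLibLookupSketch` and the method `findUsrLib` are hypothetical names, the environment is passed in as a map so the behaviour can be tried locally, and it assumes that `usrlib` is expected as a sibling of the directory named by `FLINK_LIB_DIR`.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

public class UsrLibLookupSketch {

    /**
     * Hypothetical stand-in for the lookup discussed in this comment.
     * Assumption: when FLINK_LIB_DIR is set, usrlib is searched next to it;
     * otherwise the working directory is used as the base.
     */
    public static Optional<File> findUsrLib(Map<String, String> env, String workingDir) {
        String libDir = env.get("FLINK_LIB_DIR");
        File base = (libDir != null)
                ? new File(libDir).getParentFile()   // e.g. <flink home>/lib -> <flink home>
                : new File(workingDir);              // fallback: container working dir
        if (base == null) {
            return Optional.empty();                 // FLINK_LIB_DIR had no parent path
        }
        File usrLib = new File(base, "usrlib");
        return usrLib.isDirectory() ? Optional.of(usrLib) : Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        // Simulated YARN container dir that holds the shipped usrlib.
        Path container = Files.createTempDirectory("container_yy");
        Files.createDirectory(container.resolve("usrlib"));

        // Simulated Flink installation that has lib but no usrlib.
        Path flinkHome = Files.createTempDirectory("flink_home");
        Path lib = Files.createDirectory(flinkHome.resolve("lib"));

        // FLINK_LIB_DIR unset: the working dir is used and usrlib is found.
        System.out.println(findUsrLib(Collections.emptyMap(), container.toString()));

        // FLINK_LIB_DIR set: the lookup moves away from the container dir.
        Map<String, String> env = Collections.singletonMap("FLINK_LIB_DIR", lib.toString());
        System.out.println(findUsrLib(env, container.toString()));
    }
}
```

Under these assumptions, the shipped `usrlib` is only found while `FLINK_LIB_DIR` is unset, which matches the behaviour described above.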


> Enable application mode on YARN to use usrlib
> ---------------------------------------------
>
>                 Key: FLINK-24897
>                 URL: https://issues.apache.org/jira/browse/FLINK-24897
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / YARN
>            Reporter: Biao Geng
>            Assignee: Biao Geng
>            Priority: Major
>              Labels: pull-request-available
>
> Hi there, 
> I am trying to use application mode to submit Flink jobs to a YARN cluster, 
> but I find that there is currently no easy way to ship my user-defined 
> jars (e.g. custom connectors or UDF jars shared by several jobs) and have 
> the FlinkUserCodeClassLoader load the classes in these jars. 
> I checked some relevant Jira issues, such as FLINK-21289. In k8s mode, users 
> can put their user-defined jars in the `usrlib` directory, and these jars 
> are loaded by the FlinkUserCodeClassLoader when the job is executed on the 
> JM/TM.
> But on YARN, `usrlib` does not work that way:
> In org.apache.flink.yarn.YarnClusterDescriptor#addShipFiles, if I want to 
> use `yarn.ship-files` to ship `usrlib` from my Flink client (on my local 
> machine) to the remote cluster, I must not set UserJarInclusion to DISABLED 
> because of the checkArgument(). However, if I do not set that option to 
> DISABLED, the user jars to be shipped are added to systemClassPaths. As a 
> result, classes in those user jars are loaded by the AppClassLoader. 
> But if I do not ship these jars, there is no convenient way to use them in 
> my flink run command. Currently, the only option seems to be `-C`, which 
> means I have to upload my jars to some shared store first and then use these 
> remote paths. This is not ideal, as we have already made it possible to 
> ship jars or files directly and we also introduced `usrlib` in application 
> mode on YARN. It would be more user-friendly if we allowed shipping 
> `usrlib` from local to the remote cluster while using the 
> FlinkUserCodeClassLoader to load classes in the jars in `usrlib`.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
