[jira] [Updated] (FLINK-19335) Automatically get udf's resource files from hdfs when running a job that uses hive-udf

Husky Zeng (Jira) Mon, 21 Sep 2020 20:51:41 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Husky Zeng updated FLINK-19335:
-------------------------------
    Description: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Is-there-a-way-to-avoid-submit-hive-udf-s-resources-when-we-submit-a-job-td38204.html

As the mail says, maintaining a copy of the UDF's files in the Flink client is 
a big problem in my production environment: it blocks our automated task 
submission and easily leads to out-of-sync copies. 

So I plan to develop a feature: automatically fetch the UDF's resource files 
from HDFS when running a job that uses a Hive UDF. Do you think this feature 
would benefit the community? Any suggestions?
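
A minimal sketch of what the fetch step could look like, using only the JDK. 
It is an illustration under stated assumptions, not the implementation 
proposed in this issue: the class UdfResourceFetcher and the method fetchAll 
are hypothetical names, and the resource URIs are assumed to have already been 
read from the function's metadata. Plain URL streams stand in for HDFS access 
here; a real version would more likely go through Hadoop's FileSystem API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Flink or Hive. */
final class UdfResourceFetcher {

    private UdfResourceFetcher() {}

    /**
     * Copies every resource URI into targetDir and returns the local copies.
     * Assumption: each URI scheme is readable through java.net.URL (e.g. file:).
     * For hdfs: URIs the copy would instead use Hadoop's FileSystem API.
     */
    static List<Path> fetchAll(List<URI> resourceUris, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        List<Path> localCopies = new ArrayList<>();
        for (URI uri : resourceUris) {
            // Use the last path segment of the URI as the local file name.
            String path = uri.getPath();
            String fileName = path.substring(path.lastIndexOf('/') + 1);
            Path local = targetDir.resolve(fileName);
            try (InputStream in = uri.toURL().openStream()) {
                Files.copy(in, local, StandardCopyOption.REPLACE_EXISTING);
            }
            localCopies.add(local);
        }
        return localCopies;
    }
}
```

The returned local paths could then be added to the job's classpath, so the 
client would no longer need its own pre-maintained copy of the UDF files.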

  was:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Is-there-a-way-to-avoid-submit-hive-udf-s-resources-when-we-submit-a-job-td38204.html

As the mail says, uploading the UDF's files every time is a big problem in my 
production environment: it blocks our automated task submission and leads to 
duplicated data between the Hive metastore and the Flink client (maintaining 
multiple copies of the same data in two systems easily leads to out-of-sync 
copies). I plan to develop a feature: automatically fetch the UDF's resource 
files from HDFS when running a job that uses a Hive UDF. Do you think this 
feature would benefit the community? Any suggestions?

This is my plan:
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/module/hive/HiveModule.java#L80
We already have the paths of those UDF resources in FunctionInfo, and pass the 
paths with the job, when it 


> Automatically get udf's resource files from hdfs when running a job that 
> uses hive-udf
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-19335
>                 URL: https://issues.apache.org/jira/browse/FLINK-19335
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Hive
>         Environment: YARN, per-job mode
>            Reporter: Husky Zeng
>            Priority: Major
>
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Is-there-a-way-to-avoid-submit-hive-udf-s-resources-when-we-submit-a-job-td38204.html
> As the mail says, maintaining a copy of the UDF's files in the Flink client 
> is a big problem in my production environment: it blocks our automated task 
> submission and easily leads to out-of-sync copies. 
> So I plan to develop a feature: automatically fetch the UDF's resource files 
> from HDFS when running a job that uses a Hive UDF. Do you think this feature 
> would benefit the community? Any suggestions?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
