[
https://issues.apache.org/jira/browse/FLINK-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Husky Zeng updated FLINK-19335:
-------------------------------
Description:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Is-there-a-way-to-avoid-submit-hive-udf-s-resources-when-we-submit-a-job-td38204.html
As the mail thread describes, uploading the UDF's resource files on every submission is a serious problem in my
production environment: it blocks our automated job submission and leads
to duplicated data between the Hive metastore and the Flink client (maintaining multiple
copies of the same data in two systems easily causes them to get out of sync). I plan to
develop a feature that automatically fetches the UDF's resource files from HDFS when
running a job that uses a Hive UDF. Do you think this feature would benefit
the community? Any suggestions?
This is my plan:
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/module/hive/HiveModule.java#L80
We already have those UDF resources' paths in FunctionInfo, and can pass the
paths along with the job, when it
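
Below is a minimal sketch of the idea, assuming access to the Hive metastore through its standard IMetaStoreClient API. The HiveUdfResourceResolver class name and the surrounding wiring are hypothetical placeholders, not existing Flink code; it only illustrates how a UDF's resource URIs (typically HDFS paths of the UDF jars) could be read from the metastore instead of being uploaded from the client on every submission.

{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Function;
import org.apache.hadoop.hive.metastore.api.ResourceUri;

// Hypothetical sketch: only IMetaStoreClient, Function and ResourceUri are
// real Hive metastore APIs; this class itself is a placeholder.
public class HiveUdfResourceResolver {

    private final IMetaStoreClient metastoreClient;

    public HiveUdfResourceResolver(IMetaStoreClient metastoreClient) {
        this.metastoreClient = metastoreClient;
    }

    /**
     * Looks up the resource URIs (usually HDFS paths of UDF jars) that the
     * Hive metastore has registered for a function, so the client does not
     * have to re-upload them on every job submission.
     */
    public List<String> resolveUdfResources(String dbName, String functionName) throws Exception {
        Function function = metastoreClient.getFunction(dbName, functionName);
        List<String> uris = new ArrayList<>();
        if (function.getResourceUris() != null) {
            for (ResourceUri resourceUri : function.getResourceUris()) {
                // e.g. hdfs:///udfs/my-udf.jar
                uris.add(resourceUri.getUri());
            }
        }
        return uris;
    }
}
{code}

The resolved URIs could then be attached to the job, for example by adding them to the pipeline.jars configuration or shipping them through YARN's distributed cache in per-job mode, so the client no longer has to maintain its own copy of the UDF jars.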
> Automatically get udf's resource files from hdfs when running a job that
> uses hive-udf
> ---------------------------------------------------------------------------------------
>
> Key: FLINK-19335
> URL: https://issues.apache.org/jira/browse/FLINK-19335
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Hive
> Environment: yarn ,per-job mode
> Reporter: Husky Zeng
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)