Hi fanrui,
The issue has been created; please see the link below. You are welcome to help improve it.
https://github.com/apache/incubator-streampark/issues/1782

At 2022-10-10 12:56:20, "Rui Fan" <[email protected]> wrote:
>Hi tiger:
>
>> The e-mail does not show the picture, reapply it as an attachment.
>
>Thanks for your discussion. Could you create an issue first? And
>add all background, motivation and solutions in that issue.
>
>Best,
>fanrui
>
>On Mon, Oct 10, 2022 at 12:32 PM tiger <[email protected]> wrote:
>
>> The e-mail does not show the picture, so I am resending it as an attachment.
>>
>> At 2022-10-10 12:23:48, "tiger" <[email protected]> wrote:
>>
>> hi huajie,
>>
>> Nice to receive a reply. Here are my thoughts.
>>
>> - *Is the UDF management module just a simple CRUD module?*
>>
>> Personally, I think so: it has CRUD functionality, but the UDF module is
>> designed primarily to be user-friendly.
>> Users may create many UDFs, and over time they may forget some of the
>> details (e.g., function name, class corresponding to the function,
>> storage path, etc.). This module solves that problem. In addition, when
>> creating a job, the user can choose which UDFs to use (see the next
>> point), which removes the upload step and is more convenient.
>>
>> - *How does it work with the user's job?*
>>
>> The current plan is mainly based on the YARN application mode, so the
>> following example of how UDFs are used applies to that mode.
>>
>> 1. When creating a job, select the required UDFs (e.g., from a drop-down box
>> showing the UDFs available to the current user, associated by udfId);
>> 2. When starting a job, the storage paths of the selected UDFs are looked up
>> by udfId (there can be more than one), joined into a single string, and
>> passed as yarn.provided.lib.dirs when the job is submitted, which
>> achieves dynamic loading.
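
The path-joining step in point 2 above can be sketched as follows. This is only a minimal illustration, not StreamPark's actual implementation: the function name, the udf_paths mapping, and the HDFS paths are all hypothetical. It assumes the separator is a semicolon, since the Flink configuration reference describes yarn.provided.lib.dirs as a semicolon-separated list of directories.

```python
def build_provided_lib_dirs(selected_udf_ids, udf_paths, base_dirs=()):
    """Join the storage directories of the selected UDFs into one config value.

    selected_udf_ids: UDF ids chosen when the job was created (hypothetical).
    udf_paths: mapping of udfId -> storage directory (hypothetical lookup).
    base_dirs: directories that should always be included, if any.
    """
    dirs = list(base_dirs)
    for udf_id in selected_udf_ids:
        path = udf_paths[udf_id]  # raises KeyError for an unknown udfId
        if path not in dirs:      # skip duplicates, keep first-seen order
            dirs.append(path)
    # Assumption: semicolon-separated, as in the Flink configuration reference.
    return ";".join(dirs)


# Hypothetical storage paths, for illustration only.
udf_paths = {
    1: "hdfs:///streampark/udf/1",
    2: "hdfs:///streampark/udf/2",
}
value = build_provided_lib_dirs([1, 2], udf_paths, base_dirs=["hdfs:///flink/lib"])
print(f"yarn.provided.lib.dirs: {value}")
```

Keeping the lookup separate from the join means an unknown udfId fails at job start rather than producing a silently incomplete classpath.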
>>
>> *UDF Select Box UI Example:*
>> *Example of SQL using a UDF:*
>> refer:
>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#create-function
>>
>> *yarn.provided.lib.dirs*:
>> refer:
>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#yarn-provided-lib-dirs
>>
>> - *About compatibility with other deployment models*
>>
>> For k8s, standalone and other modes, frankly speaking, I do not have many
>> ideas yet. I plan to study them later, and everyone is very welcome to work
>> together on improving these features.
>>
>> At 2022-10-09 17:32:05, "Huajie Wang" <[email protected]> wrote:
>> >hi tiger:
>> >
>> >Thanks for starting a valuable discussion. If UDF is only a management
>> >module (CRUD), that's easy. The key is the multiple deployment modes (on
>> >yarn|k8s|standalone...): how do these UDFs work together with the user's job?
>> >This is a difficult problem. Do you have any relevant ideas and designs for
>> >this?
>> >
>> >Best,
>> >Huajie Wang
>> >
>> >On Sun, Oct 9, 2022 at 17:18, tiger <[email protected]> wrote:
>> >
>> >> Hello everyone
>> >>
>> >> As previously discussed in the group, an issue has been created over here
>> >> and suggestions are welcome.
>> >>
>> >> Regarding the development of specific features, as I don't have permission
>> >> to create a branch, could @Huajie help to create a new branch based on the
>> >> 1.2.3-release branch? For example, udf-management, to facilitate
>> >> development.
>> >>
>> >> At 2022-10-08 17:18:35, "功夫熊猫" <[email protected]> wrote:
>> >> >Hi all,
>> >> >
>> >> >Background: I've been using StreamPark for a while, and I've had
>> >> >a pretty good experience in terms of ease of use and stability.
>> >> >At present, StreamPark itself supports UDFs, but there seems to be no
>> >> >unified menu for managing them, so I would like to add a new menu for
>> >> >UDF management.
>> >> >
>> >> >Main implementation ideas:
>> >> >Currently, we mainly create a UDF through a RESTful API, then select the
>> >> >UDF when creating the job and associate the UDF ids with it (mainly to
>> >> >get the UDF JAR storage path later), and finally achieve dynamic loading
>> >> >through the yarn.provided.lib.dirs parameter.
>> >> >Note: This feature is currently only implemented for SQL jobs in
>> >> >Yarn Application mode; the JAR is stored on HDFS.
>> >> >
>> >> >Main APIs:
>> >> >Add UDF
>> >> >Query UDF (list)
>> >> >Edit UDF
>> >> >Delete UDF
>> >> >
>> >> >Follow-up plan:
>> >> >Basic functional development at the API level is implemented first,
>> >> >followed by front-end UI-related development.
>> >> >
>> >> >Best wishes
>> >> >tiger
