Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2024-01-05 Thread Márton Balassi
Thanks, Paul. Ferenc and I have been looking into unblocking the Kubernetes path via an updated implementation for FLINK-28915 to ship the jars conveniently there. You can expect an updated PR there next week. Looking forward to your findings in the YARN POC. On Mon, Dec 11, 2023 at 4:01 AM Paul

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-12-10 Thread Paul Lam
Hi Ferenc, Sorry for my late reply. > Is any active work happening on this FLIP? As far as I see there > are blockers that need to be resolved first regarding > artifact distribution. You’re right. There’s a blocker in K8s application mode, but none in YARN application mode. I’m doing a POC

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-11-20 Thread Ferenc Csaky
Hello devs, Is any active work happening on this FLIP? As far as I can see, there are blockers regarding artifact distribution that need to be resolved first. Is this work halted completely, or is some effort going into resolving the blockers first? Our platform would benefit

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-29 Thread Paul Lam
Hi Jing, Thanks for your input! > Would you like to add > one section to describe (better with script/code example) how to use it in > these two scenarios from users' perspective? OK. I’ll update the FLIP with the code snippet after I get the POC branch done. > NIT: the pictures have

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-26 Thread Jing Ge
Hi Paul, Thanks for driving it and thank you all for the informative discussion! The FLIP is in good shape now. As described in the FLIP, SQL Driver will be mainly used to run Flink SQLs in two scenarios: 1. SQL client/gateway in application mode and 2. external system integration. Would you like

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-26 Thread Paul Lam
Hi Shengkai, > * How can we ship the json plan to the JobManager? The Flink K8s module should be responsible for file distribution. We could introduce an option like `kubernetes.storage.dir`. For each flink cluster, there would be a dedicated subdirectory, with the pattern like

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-26 Thread Shengkai Fang
Hi, Paul. Thanks for your update. I have a few questions about the new design: * How can we ship the json plan to the JobManager? The current design only exposes an option for the URL of the json plan. It seems the gateway is responsible for uploading it to an external storage. Can we reuse the

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-19 Thread Paul Lam
Hi Shengkai, Sorry for my late reply. It took me some time to update the FLIP. In the latest FLIP design, SQL Driver is placed in flink-sql-gateway module. PTAL. The FLIP does not cover details about the K8s file distribution, but its general usage would be very much the same as YARN setups.

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-12 Thread Paul Lam
Hi Yang, Thanks a lot for your input! It’s great that FLINK-28915 has covered the file download part. I’ve created a ticket for the file upload part [1]. It's a prerequisite for supporting K8s application mode for SQL Gateway. [1] https://issues.apache.org/jira/browse/FLINK-32315 Best, Paul

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-12 Thread Shengkai Fang
> If it’s the case, I’m good with introducing a new module and making SQL Driver > an internal class that accepts JSON plans only. I have thought about this again and again. I think it's better to move the SqlDriver into the sql-gateway module because the sql client relies on the sql-gateway to submit the

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-11 Thread Yang Wang
Sorry for the late reply. I am in favor of introducing such a built-in resource localization mechanism based on Flink FileSystem. Then FLINK-28915[1] could be the second step which will download the jars and dependencies to the JobManager/TaskManager local directory before working. The first step

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-09 Thread Paul Lam
Hi Mason, I get your point. I'm increasingly feeling the need to introduce a built-in file distribution mechanism for the flink-kubernetes module, just like Spark does with `spark.kubernetes.file.upload.path` [1]. I’m assuming the workflow is as follows: - KubernetesClusterDescriptor uploads all

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-08 Thread Paul Lam
Hi ShengKai, Good point with the ANALYZE TABLE and CALL PROCEDURE statements. > Can we remove the jars if the job is running or gateway exits? Yes, I think it would be okay to remove the resources after the job is submitted. It should be Gateway’s responsibility to remove them. > Can we use

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-08 Thread Paul Lam
Hi Weihua, Thanks a lot for your input! I see the difference here is implementing the file distribution mechanism in the generic CLI or in the SQL Driver. The CLI approach could benefit non-pure-SQL applications (which are not covered by SQL Driver) as well. Not sure if you’re proposing the

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-07 Thread Mason Chen
Hi Paul, Thanks for your response! I agree that utilizing SQL Drivers in Java applications is equally important > as employing them in SQL Gateway. WRT init containers, I think most > users use them just as a workaround. For example, wget a jar from the > maven repo. > > We could implement the

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-07 Thread Shengkai Fang
Hi, Paul. Thanks for your update, which helps me understand the design much better. But I still have some questions about the FLIP. > For SQL Gateway, only DMLs need to be delegated to the SQL server > Driver. I would think about the details and update the FLIP. Do you have some > ideas

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-06 Thread Weihua Hu
Hi, Thanks for updating the FLIP. I have two cents on the distribution of SQLs and resources. 1. Should we support a common file distribution mechanism for k8s application mode? I have seen some issues and requirements on the mailing list. In our production environment, we implement the

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-06 Thread Paul Lam
Hi Mason, Thanks for your input! > +1 for init containers or a more generalized way of obtaining arbitrary > files. File fetching isn't specific to just SQL--it also matters for Java > applications if the user doesn't want to rebuild a Flink image and just > wants to modify the user application

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-05 Thread Mason Chen
Hi Paul, +1 for this feature and supporting SQL file + JSON plans. We get a lot of requests to just be able to submit a SQL file, but the JSON plan optimizations make sense. +1 for init containers or a more generalized way of obtaining arbitrary files. File fetching isn't specific to just

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-05 Thread Paul Lam
Hi Jark, Thanks for your input! Please see my comments inline. > Isn't Table API the same way as DataStream jobs to submit Flink SQL? > DataStream API also doesn't provide a default main class for users, > why do we need to provide such one for SQL? Sorry for the confusion I caused. By

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-02 Thread Jark Wu
Hi Paul, Thanks for your reply. I left my comments inline. > As the FLIP said, it’s good to have a default main class for Flink SQLs, > which allows users to submit Flink SQLs in the same way as DataStream > jobs, or else users need to write their own main class. Isn't Table API the same way as

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-02 Thread Paul Lam
The FLIP is in the early phase and some details are not included, but fortunately, we got lots of valuable ideas from the discussion. Thanks to everyone who joined the discussion! @Weihua @Shammon @Shengkai @Biao @Jark This weekend I’m gonna revisit and update the FLIP, adding more details.

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-02 Thread Paul Lam
Hi Jark, Thanks a lot for your input! > If we decide to submit ExecNodeGraph instead of SQL file, is it still > necessary to support SQL Driver? I think so. Apart from usage in SQL Gateway, SQL Driver could simplify Flink SQL execution with Flink CLI. As the FLIP said, it’s good to have a

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-01 Thread Jark Wu
Hi Paul, Thanks for starting this discussion. I like the proposal! This is a frequently requested feature! I agree with Shengkai that ExecNodeGraph as the submission object is a better idea than SQL file. To be more specific, it should be JsonPlanGraph or CompiledPlan which is the serializable

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-06-01 Thread Paul Lam
Hi Weihua, You’re right. Distributing the SQLs to the TMs is one of the challenging parts of this FLIP. Web submission is not enabled in application mode currently as you said, but it could be changed if we have good reasons. What do you think about introducing a distributed storage for SQL

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-31 Thread Weihua Hu
Thanks Paul for your reply. SQLDriver looks good to me. 2. Do you mean passing the SQL string as a configuration or a program argument? I brought this up because we were unable to pass the SQL file to Flink using Kubernetes mode. For DataStream/Python users, they need to prepare their images for

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-31 Thread Paul Lam
Hi Biao, Thanks for your comments! > 1. Scope: is this FLIP only targeted for non-interactive Flink SQL jobs in > Application mode? More specifically, if we use SQL client/gateway to > execute some interactive SQLs like a SELECT query, can we ask flink to use > Application mode to execute those

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-31 Thread Paul Lam
Hi Shengkai, Thanks a lot for your comments! Please see my comments inline. > 1. The FLIP does not specify the kind of SQL that will be submitted with > the application mode. I believe only a portion of the SQL will be delegated > to the SqlRunner. You’re right. For SQL Gateway, only DMLs need

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Paul Lam
Sorry for the typo. I mean “We already have a PythonDriver doing the same job for PyFlink." Best, Paul Lam > On May 31, 2023, at 11:49, Paul Lam wrote: > > 1. I have a PythonDriver doing the same job for PyFlink [1]

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Biao Geng
Thanks Paul for the proposal! I believe it would be very useful for Flink users. After reading the FLIP, I have some questions: 1. Scope: is this FLIP only targeted for non-interactive Flink SQL jobs in Application mode? More specifically, if we use SQL client/gateway to execute some interactive

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Paul Lam
Hi Shammon, Thanks a lot for your input! I thought SQL Driver could act as a general-purpose default main class for Flink SQL. It could be used in Flink CLI submission, web submission, or SQL Client/Gateway submission. For SQL Client/Gateway submission, we use it implicitly if needed, and for

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Paul Lam
Hi Weihua, Thanks a lot for your input! Please see my comments inline. > - Is SQLRunner the better name? We use this to run a SQL Job. (Not strong, > the SQLDriver is fine for me) I’ve thought about SQL Runner but picked SQL Driver for the following reasons FYI: 1. I have a PythonDriver doing

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Shengkai Fang
Thanks for the proposal. The Application mode is very important to Flink SQL. But I have some questions about the FLIP: 1. The FLIP does not specify the kind of SQL that will be submitted with the application mode. I believe only a portion of the SQL will be delegated to the SqlRunner. 2. Will

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Shammon FY
Thanks Paul for driving this proposal. I found the sql driver has no config related options. If I understand correctly, the sql driver can be used to submit sql jobs in a 'job submission service' such as sql-gateway. In general, in addition to the default config for Flink cluster which includes

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Weihua Hu
Thanks Paul for the proposal. +1 for this. It is valuable in improving ease of use. I have a few questions. - Is SQLRunner the better name? We use this to run a SQL Job. (Not strong, the SQLDriver is fine for me) - Could we run SQL jobs using SQL in strings? Otherwise, we need to prepare a SQL

[DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-29 Thread Paul Lam
Hi team, I’d like to start a discussion about FLIP-316 [1], which introduces a SQL driver as the default main class for Flink SQL jobs. Currently, Flink SQL could be executed out of the box either via SQL Client/Gateway or embedded in a Flink Java/Python program. However, each one has its
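
For readers following the thread, here is a rough sketch of what a "default main class" for SQL jobs could look like at its very simplest. It only illustrates the idea discussed above (one entry point that takes either a SQL file or a JSON plan); it is not the FLIP-316 implementation. The class name `SqlDriverSketch`, the flags `--sql-file` and `--json-plan`, and the naive statement splitter are all assumptions made up for illustration, and the sketch deliberately stops before touching any Flink API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a SQL-job entry point; not the FLIP-316 implementation. */
final class SqlDriverSketch {

    /** The two kinds of input mentioned in the thread (names are illustrative). */
    enum InputKind { SQL_FILE, JSON_PLAN }

    /** Parsed command line: what kind of input was given and where it lives. */
    static final class Input {
        final InputKind kind;
        final String location;

        Input(InputKind kind, String location) {
            this.kind = kind;
            this.location = location;
        }
    }

    /**
     * Parses the hypothetical flags "--sql-file <path>" or "--json-plan <path>".
     * Exactly one of the two must be present.
     */
    static Input parseArgs(String[] args) {
        Input result = null;
        for (int i = 0; i < args.length; i++) {
            InputKind kind;
            if ("--sql-file".equals(args[i])) {
                kind = InputKind.SQL_FILE;
            } else if ("--json-plan".equals(args[i])) {
                kind = InputKind.JSON_PLAN;
            } else {
                throw new IllegalArgumentException("Unknown argument: " + args[i]);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + args[i]);
            }
            if (result != null) {
                throw new IllegalArgumentException("Specify only one of --sql-file / --json-plan");
            }
            result = new Input(kind, args[++i]);
        }
        if (result == null) {
            throw new IllegalArgumentException("Specify one of --sql-file / --json-plan");
        }
        return result;
    }

    /**
     * Naive statement splitter: cuts on ';' and drops blank pieces.
     * It does not handle semicolons inside string literals or comments.
     */
    static List<String> splitStatements(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        Input input = parseArgs(args);
        // A real driver would now hand the SQL or the plan to a table environment.
        System.out.println(input.kind + " -> " + input.location);
    }
}
```

A driver along these lines would be named as the job's main class at submission time, with the file location passed as a program argument; how that file reaches the JobManager is exactly the artifact-distribution question raised in the replies.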