Hi, Paul. I am a bit confused about how the client can retrieve the SQL statement from the cluster. The SQL statement has already been translated into a JobGraph and submitted to the cluster.
I think we will manage more than just the query statement lifecycle. How about `SHOW JOBS`, which would list the Job ID, Job Name, Job Type (DQL/DML) and Status (running or failing)?

Best,
Shengkai

Paul Lam <paullin3...@gmail.com> wrote on Tue, 12 Apr 2022 at 11:28:

> Hi Jark,
>
> Thanks a lot!
>
> I’m thinking of the 2nd approach. With this approach, the query lifecycle
> statements (show/stop/savepoint etc.) are basically equivalent alternatives
> to the Flink CLI from the user's point of view.
>
> BTW, the completed jobs might be missing in `SHOW QUERIES`, because in
> application/per-job cluster modes, the clusters would stop when the job
> terminates.
>
> WDYT?
>
> Best,
> Paul Lam
>
> > On 11 Apr 2022 at 14:17, Jark Wu <imj...@gmail.com> wrote:
> >
> > Hi Paul, I grant the permission to you.
> >
> > Regarding `SHOW QUERIES`, how will you bookkeep and persist the running
> > and completed queries? Or will you retrieve the query information from
> > the cluster every time you receive the command?
> >
> > Best,
> > Jark
> >
> > On Wed, 6 Apr 2022 at 11:23, Paul Lam <paullin3...@gmail.com> wrote:
> >
> >> Hi Timo,
> >>
> >> Thanks for your reply!
> >>
> >>> It would be great to further investigate which other commands are
> >>> required that would usually be executed via CLI commands. I would like
> >>> to avoid a large amount of FLIPs, each adding a special job lifecycle
> >>> command.
> >>
> >> Okay. I listed only the commands about jobs/queries that are required
> >> for savepoints, for simplicity. I will come up with a complete set of
> >> commands for the full lifecycle of jobs.
> >>
> >>> I guess job lifecycle commands don't make much sense in Table API? Or
> >>> are you planning to support those also via TableEnvironment.executeSql
> >>> and integrate them into the SQL parser?
> >>
> >> Yes, I’m thinking of adding job lifecycle management in SQL Client.
> >> SQL Client could execute queries via TableEnvironment.executeSql and
> >> bookkeep the IDs, which is similar to the ResultStore in LocalExecutor.
> >>
> >> BTW, may I ask for the permission on Confluence to create a FLIP?
> >>
> >> Best,
> >> Paul Lam
> >>
> >>> On 4 Apr 2022 at 15:36, Timo Walther <twal...@apache.org> wrote:
> >>>
> >>> Hi Paul,
> >>>
> >>> Thanks for proposing this. I think in general it makes sense to have
> >>> those commands in SQL Client.
> >>>
> >>> However, this will be a big shift, because we start adding job
> >>> lifecycle SQL syntax. It would be great to further investigate which
> >>> other commands are required that would usually be executed via CLI
> >>> commands. I would like to avoid a large amount of FLIPs, each adding
> >>> a special job lifecycle command.
> >>>
> >>> I guess job lifecycle commands don't make much sense in Table API? Or
> >>> are you planning to support those also via TableEnvironment.executeSql
> >>> and integrate them into the SQL parser?
> >>>
> >>> Thanks,
> >>> Timo
> >>>
> >>> On 01.04.22 at 12:28, Paul Lam wrote:
> >>>> Hi Martijn,
> >>>>
> >>>>> For any extension on the SQL syntax, there should be a FLIP. I would
> >>>>> like to understand how this works for both bounded and unbounded
> >>>>> jobs, and how this works with the SQL upgrade story. Could you
> >>>>> create one?
> >>>>
> >>>> Sure. I’m preparing one. Please give me the permission if possible.
> >>>>
> >>>> My Confluence user name is `paulin3280`, and the full name is `Paul Lam`.
> >>>>
> >>>>> I'm also copying in @Timo Walther <twal...@apache.org> and @Jark Wu
> >>>>> <imj...@gmail.com> for their opinion on this.
> >>>>
> >>>> Looking forward to your opinions @Timo @Jark :)
> >>>>
> >>>> Best,
> >>>> Paul Lam
> >>>>
> >>>>> On 1 Apr 2022 at 18:10, Martijn Visser <martijnvis...@apache.org> wrote:
> >>>>>
> >>>>> Hi Paul,
> >>>>>
> >>>>> For any extension on the SQL syntax, there should be a FLIP.
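[Editor's note] Paul's bookkeeping idea above (the SQL Client keeping the submitted statements itself, similar to the ResultStore in LocalExecutor, since the cluster only knows the JobGraph) could be sketched roughly as below. All class, field, and method names here are illustrative assumptions, not actual Flink APIs; a real implementation would refresh the status from the cluster.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical client-side registry backing a SHOW QUERIES / SHOW JOBS listing. */
public class QueryRegistry {

    /** Illustrative status values; real job states would come from the cluster. */
    public enum Status { RUNNING, FAILING, FINISHED }

    public static final class QueryEntry {
        public final String jobId;      // job ID returned on submission
        public final String statement;  // original SQL, kept by the client
        public final String type;       // "DQL" or "DML"
        public Status status;

        QueryEntry(String jobId, String statement, String type, Status status) {
            this.jobId = jobId;
            this.statement = statement;
            this.type = type;
            this.status = status;
        }
    }

    // Insertion-ordered map: job ID -> bookkept entry.
    private final Map<String, QueryEntry> entries = new LinkedHashMap<>();

    /** Called after a successful submission; the client bookkeeps the SQL itself. */
    public String register(String statement, String type) {
        String jobId = UUID.randomUUID().toString(); // stand-in for the real job ID
        entries.put(jobId, new QueryEntry(jobId, statement, type, Status.RUNNING));
        return jobId;
    }

    /** Returns the rows a SHOW QUERIES / SHOW JOBS statement would render. */
    public List<QueryEntry> showQueries() {
        return new ArrayList<>(entries.values());
    }
}
```

This also explains Shengkai's question at the top of the thread: the client never retrieves the SQL from the cluster; it keeps the statement at submission time.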
> >>>>> I would like to understand how this works for both bounded and
> >>>>> unbounded jobs, and how this works with the SQL upgrade story.
> >>>>> Could you create one?
> >>>>>
> >>>>> I'm also copying in @Timo Walther <twal...@apache.org> and @Jark Wu
> >>>>> <imj...@gmail.com> for their opinion on this.
> >>>>>
> >>>>> Best regards,
> >>>>>
> >>>>> Martijn
> >>>>>
> >>>>> On Fri, 1 Apr 2022 at 12:01, Paul Lam <paullin3...@gmail.com> wrote:
> >>>>>
> >>>>>> Hi Martijn,
> >>>>>>
> >>>>>> Thanks a lot for your input.
> >>>>>>
> >>>>>>> Have you already thought about how you would implement this in Flink?
> >>>>>>
> >>>>>> Yes, I roughly thought about the implementation:
> >>>>>>
> >>>>>> 1. Extending Executor to support job listing via ClusterClient.
> >>>>>> 2. Extending Executor to support savepoint trigger/cancel/remove
> >>>>>>    via JobClient.
> >>>>>> 3. Extending the SQL parser to support the new statements via regex
> >>>>>>    (AbstractRegexParseStrategy) or Calcite.
> >>>>>>
> >>>>>> IMHO, the implementation is not very complicated and barely touches
> >>>>>> the architecture of FLIP-91.
> >>>>>> (BTW, FLIP-91 might be a little bit outdated and doesn’t fully
> >>>>>> reflect the current status of the Flink SQL client/gateway.)
> >>>>>>
> >>>>>> WDYT?
> >>>>>>
> >>>>>> Best,
> >>>>>> Paul Lam
> >>>>>>
> >>>>>>> On 1 Apr 2022 at 17:33, Martijn Visser <mart...@ververica.com> wrote:
> >>>>>>>
> >>>>>>> Hi Paul,
> >>>>>>>
> >>>>>>> Thanks for opening the discussion. I agree that there are
> >>>>>>> opportunities in this area to increase user value.
> >>>>>>>
> >>>>>>> I would say that the syntax should be part of a proposal in a FLIP,
> >>>>>>> because the implementation would actually be the complex part, not
> >>>>>>> so much the syntax :) Especially since this also touches on FLIP-91 [1].
> >>>>>>>
> >>>>>>> Have you already thought about how you would implement this in Flink?
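[Editor's note] Step 3 of Paul's implementation sketch above (recognizing the new statements with regexes, in the spirit of Flink's AbstractRegexParseStrategy) might look roughly like this. The class name, method signature, and return shape are assumptions for illustration, not Flink's actual parser interfaces.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical regex-based recognizer for the proposed lifecycle statements. */
public class LifecycleStatementParser {

    private static final Pattern SHOW_QUERIES =
            Pattern.compile("SHOW\\s+QUERIES\\s*;?", Pattern.CASE_INSENSITIVE);
    private static final Pattern TRIGGER_SAVEPOINT =
            Pattern.compile("TRIGGER\\s+SAVEPOINT\\s+([^;\\s]+)\\s*;?", Pattern.CASE_INSENSITIVE);
    private static final Pattern SHOW_SAVEPOINTS =
            Pattern.compile("SHOW\\s+SAVEPOINTS\\s+([^;\\s]+)\\s*;?", Pattern.CASE_INSENSITIVE);
    private static final Pattern REMOVE_SAVEPOINT =
            Pattern.compile("REMOVE\\s+SAVEPOINT\\s+([^;\\s]+)\\s*;?", Pattern.CASE_INSENSITIVE);

    /**
     * Returns {operation, argument} for a matched lifecycle statement
     * (argument is null for SHOW QUERIES), or empty so the statement
     * falls through to the regular SQL parser.
     */
    public static Optional<String[]> parse(String sql) {
        String trimmed = sql.trim();
        if (SHOW_QUERIES.matcher(trimmed).matches()) {
            return Optional.of(new String[] {"SHOW_QUERIES", null});
        }
        Matcher m = TRIGGER_SAVEPOINT.matcher(trimmed);
        if (m.matches()) return Optional.of(new String[] {"TRIGGER_SAVEPOINT", m.group(1)});
        m = SHOW_SAVEPOINTS.matcher(trimmed);
        if (m.matches()) return Optional.of(new String[] {"SHOW_SAVEPOINTS", m.group(1)});
        m = REMOVE_SAVEPOINT.matcher(trimmed);
        if (m.matches()) return Optional.of(new String[] {"REMOVE_SAVEPOINT", m.group(1)});
        return Optional.empty();
    }
}
```

On a match, the client would dispatch to ClusterClient/JobClient rather than hand the statement to Calcite.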
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>>
> >>>>>>> Martijn Visser
> >>>>>>> https://twitter.com/MartijnVisser82
> >>>>>>> https://github.com/MartijnVisser
> >>>>>>>
> >>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway
> >>>>>>>
> >>>>>>> On Fri, 1 Apr 2022 at 11:25, Paul Lam <paullin3...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Hi team,
> >>>>>>>>
> >>>>>>>> Greetings from the Apache Kyuubi (incubating) community. We’re
> >>>>>>>> integrating Flink as a SQL engine and aiming to make it
> >>>>>>>> production-ready.
> >>>>>>>>
> >>>>>>>> However, query/savepoint management is a crucial but missing part
> >>>>>>>> in Flink SQL, thus we are reaching out to discuss the SQL syntax
> >>>>>>>> with the Flink community.
> >>>>>>>>
> >>>>>>>> We propose to introduce the following statements:
> >>>>>>>>
> >>>>>>>> SHOW QUERIES: shows the running queries in the current session,
> >>>>>>>> which mainly returns query (namely Flink job) IDs and SQL statements.
> >>>>>>>> TRIGGER SAVEPOINT <query_id>: triggers a savepoint for the
> >>>>>>>> specified query, which returns the stored path of the savepoint.
> >>>>>>>> SHOW SAVEPOINTS <query_id>: shows the savepoints for the
> >>>>>>>> specified query, which returns the stored paths of the savepoints.
> >>>>>>>> REMOVE SAVEPOINT <savepoint_path>: removes the specified savepoint.
> >>>>>>>>
> >>>>>>>> WRT keywords, `TRIGGER` and `SAVEPOINT` are already reserved
> >>>>>>>> keywords in Flink SQL [1], so the only new keyword is `QUERIES`.
> >>>>>>>>
> >>>>>>>> If we reach a consensus on the syntax, we could either implement
> >>>>>>>> it in Kyuubi and contribute it back to Flink, or directly
> >>>>>>>> implement it in Flink.
> >>>>>>>> Looking forward to your feedback ;)
> >>>>>>>>
> >>>>>>>> [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/overview/#reserved-keywords
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Paul Lam
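[Editor's note] As a rough illustration of the session-side state the three savepoint statements proposed above would operate on, here is a minimal sketch: a per-session map from query ID to its savepoint paths. The actual savepoint would be triggered on the cluster (e.g. via JobClient); here the trigger is faked with a generated path, and all names and the path format are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical session-side store backing TRIGGER/SHOW/REMOVE SAVEPOINT. */
public class SavepointStore {

    // query ID -> savepoint paths, in trigger order.
    private final Map<String, List<String>> savepointsByQuery = new LinkedHashMap<>();
    private int counter = 0;

    /** TRIGGER SAVEPOINT <query_id>: returns the stored path of the savepoint. */
    public String triggerSavepoint(String queryId) {
        // Fake path; a real implementation would get it from the cluster.
        String path = "file:///savepoints/" + queryId + "/sp-" + (++counter);
        savepointsByQuery.computeIfAbsent(queryId, k -> new ArrayList<>()).add(path);
        return path;
    }

    /** SHOW SAVEPOINTS <query_id>: returns the stored paths for the query. */
    public List<String> showSavepoints(String queryId) {
        return savepointsByQuery.getOrDefault(queryId, List.of());
    }

    /** REMOVE SAVEPOINT <savepoint_path>: drops the bookkept entry; returns whether it existed. */
    public boolean removeSavepoint(String savepointPath) {
        return savepointsByQuery.values().stream()
                .anyMatch(paths -> paths.remove(savepointPath));
    }
}
```

Note that removing the bookkept entry says nothing about deleting the savepoint files themselves; that part would go through the cluster as well.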