Hi Jeyhun,

Thank you for your questions, please see my answers below.

> What is its impact on query optimization because resolving
> variables at the parsing stage might affect query optimization.

The approach I mentioned in the FLIP would not affect query
optimization, as it restricts variables to be literals, hence do
not support calculated variables. This means that the substitution
would be a simple string replace for the variables before the
actual parse happens on the already resolved statement.

Although this may not be the way to go, according to Yanfei's
previous comments I started to check on possible solutions for
calculated variables, which will probably change this answer, but
I will report back when I have something regarding this topic.

> What is the scope of variables? I mean when and how they override
> each other and when get out of their scopes?

This is a good question, I did not mention this in the FLIP. My
thinking on this topic is that a VariableStore is tied to a SQL
session, so variables are session scoped.

Having a system-wide scope might make sense. In that case, the
system-wide variable should be shadowed by the same session-wide
variable IMO, as a general rule regarding variable shadowing [1].
Although I did not include system-wide scope in my PoC, but this
would basically mean to maintain a specific system-wide
VariableStore.

> Does the proposal support dynamic assignment of the variables or
> the value of variables should be known at query compile time?

Covered this in my answer to the first Q.

> Can we somehow benefit from/leverage Calcite's parameterization
> feature in this proposal?

I am not super familiar with Calcite capabilities regarding this
topic and the Calcite docs were not really helpful either. But I
might looked over something, so can you elaborate more / point me
towards what you mean?


Best,
Ferenc

[1] https://en.wikipedia.org/wiki/Variable_shadowing


On Monday, April 1st, 2024 at 15:24, Jeyhun Karimov <je.kari...@gmail.com> 
wrote:

> 
> 
> Hi Ferenc,
> 
> Thanks for the proposal. Sounds like a good idea!
> I have a few questions on that:
> 
> - What is its impact on query optimization because resolving variables at
> the parsing stage might affect query optimization.
> 
> - What is the scope of variables? I mean when and how they override each
> other and when get out of their scopes?
> 
> - Does the proposal support dynamic assignment of the variables or the
> value of variables should be known at query compile time?
> 
> - Can we somehow benefit from/leverage Calcite's parameterization feature
> in this proposal?
> 
> Regards,
> Jeyhun
> 
> On Thu, Mar 28, 2024 at 6:21 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi, Jim, Yanfei,
> > 
> > Thanks for your comments! Let me reply in the order of the
> > messages.
> > 
> > > I'd prefer sticking to the SQL standard if possible. Would
> > > it be possible / sensible to allow for each syntax, perhaps
> > > managed by a config setting?
> > 
> > Correct me if I am wrong, but AFAIK variables are not part of
> > the ANSI SQL standard. The '@' prefix is used by some widely
> > used DB mgr, e.g. MySQL.
> > 
> > Regarding having multiple resolution syntax, it would be possible,
> > if we agree it adds value. Personally I do not have a strong
> > opinion on that.
> > 
> > > I'm new to Flink SQL and I'm curious if these variables can be
> > > calculated from statements or expression [1]?
> > 
> > Good point! The proposed solution would lack this functionality.
> > On our platform, we have a working solution of this that was
> > sufficient to solve the main problem we had to carry SQL between
> > environments without change.
> > 
> > At this point, variable values can only be literals, and they are
> > automatically escaped during resolution. Except if they are
> > resolved as a DDL statement property value.
> > 
> > But if the community agrees that it would be useful to have the
> > ability of calculated variables I would happily spend some time
> > on possible solutions that makes sense in Flink.
> > 
> > WDYT?
> > 
> > Best,
> > Ferenc
> > 
> > On Thursday, March 28th, 2024 at 03:58, Yanfei Lei fredia...@gmail.com
> > wrote:
> > 
> > > Hi Ferenc,
> > > 
> > > Thanks for the proposal, using SQLvariables to exclude
> > > environment-specific configuration from code sounds like a good idea.
> > > 
> > > I'm new to Flink SQL and I'm curious if these variables can be
> > > calculated from statements or expression [1]? In FLIP, it seems that
> > > the values are in the form of StringLiteral.
> > > 
> > > [1]
> > > https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-aux-set-variable.html
> > > 
> > > Jim Hughes jhug...@confluent.io.invalid 于2024年3月28日周四 04:54写道:
> > > 
> > > > Hi Ferenc,
> > > > 
> > > > Looks like a good idea.
> > > > 
> > > > I'd prefer sticking to the SQL standard if possible. Would it be
> > > > possible
> > > > / sensible to allow for each syntax, perhaps managed by a config
> > > > setting?
> > > > 
> > > > Cheers,
> > > > 
> > > > Jim
> > > > 
> > > > On Tue, Mar 26, 2024 at 6:59 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> > > > wrote:
> > > > 
> > > > > Hello devs,
> > > > > 
> > > > > I would like to start a discussion about FLIP-XXX: Introduce Flink
> > > > > SQL
> > > > > variables [1].
> > > > > 
> > > > > The main motivation behing this change is to be able to abstract
> > > > > Flink SQL
> > > > > from
> > > > > environment-specific configuration and provide a way to carry jobs
> > > > > between
> > > > > environments (e.g. dev-stage-prod) without the need to make changes
> > > > > in the
> > > > > code.
> > > > > It can also be a way to decouple sensitive information from the job
> > > > > code,
> > > > > or help
> > > > > with redundant literals.
> > > > > 
> > > > > The main decision regarding the proposed solution is to handle the
> > > > > variable resolution
> > > > > as early as possible on the given string statement, so the whole
> > > > > operation
> > > > > is an easy and
> > > > > lightweight string replace. But this approach introduces some
> > > > > limitations
> > > > > as well:
> > > > > 
> > > > > - The executed SQL will always be the unresolved, raw string, so in
> > > > > case
> > > > > of secrets
> > > > > a DESC operation would show them.
> > > > > - Changing the value of a variable can break code that uses that
> > > > > variable.
> > > > > 
> > > > > For more details, please check the FLIP [1]. There is also a stale
> > > > > Jira
> > > > > about this [2].
> > > > > 
> > > > > Looking forward to any comments and opinions!
> > > > > 
> > > > > Thanks,
> > > > > Ferenc
> > > > > 
> > > > > [1]
> > 
> > https://docs.google.com/document/d/1-eUz-PBCdqNggG_irDT0X7fdL61ysuHOaWnrkZHb5Hc/edit?usp=sharing
> > 
> > > > > [2] https://issues.apache.org/jira/browse/FLINK-17377
> > > 
> > > --
> > > Best,
> > > Yanfei

Reply via email to