Thanks, Ron, for initiating this discussion. OFCG sounds like a cool
optimization.

I also agree with the comments above about benchmark results; they are
important for demonstrating the performance improvements.

I think Yun and Jingsong have raised a very interesting topic about
streaming support. It's worth mentioning that FLINK-19621[1] also aimed to
support both batch and streaming, according to its Jira description and
design doc[2], but it was closed with only batch supported.

I agree that these optimizations have higher priority for batch, since batch
has standard benchmarks such as TPC-DS/TPC-H that make the value of a
performance improvement much easier to show. For the streaming part, I think
it would also be great to have if the optimization fits streaming as well,
because Flink is a unified streaming and batch engine. Hence, we'd better
set the goal clearly, such as supporting batch only or both streaming and
batch, and write it down in the FLIP.

[1] https://issues.apache.org/jira/browse/FLINK-19621
[2] https://docs.google.com/document/d/1qKVohV12qn-bM51cBZ8Hcgp31ntwClxjoiNBUOqVHsI/edit#


On Mon, Jun 5, 2023 at 2:15 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> > For the state compatibility section, it seems that the checkpoint
> > compatibility would be broken just like [1] did. Could FLIP-190 [2] still
> > be helpful in this case for SQL version upgrades?
>
> I guess this is only for batch processing. Streaming should be another
> story?
>
> Best,
> Jingsong
>
> On Mon, Jun 5, 2023 at 2:07 PM Yun Tang <myas...@live.com> wrote:
> >
> > Hi Ron,
> >
> > I think this FLIP would help improve performance. Looking forward to its
> > completion in Flink!
> >
> > For the state compatibility section, it seems that the checkpoint
> > compatibility would be broken just like [1] did. Could FLIP-190 [2] still
> > be helpful in this case for SQL version upgrades?
> >
> >
> > [1] https://docs.google.com/document/d/1qKVohV12qn-bM51cBZ8Hcgp31ntwClxjoiNBUOqVHsI/edit#heading=h.fri5rtpte0si
> > [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489
> >
> > Best
> > Yun Tang
> >
> > ________________________________
> > From: Lincoln Lee <lincoln.8...@gmail.com>
> > Sent: Monday, June 5, 2023 10:56
> > To: dev@flink.apache.org <dev@flink.apache.org>
> > Subject: Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL
> >
> > Hi Ron
> >
> > OFCG looks like an exciting optimization, looking forward to its
> > completion in Flink!
> > A small question: have we considered adding benchmarks for the individual
> > operators, so that the gain brought by each one is easy to see?
> > In addition, regarding the implementation plan, the FLIP mentions that 1.18
> > will support Calc, HashJoin and HashAgg. What will be the next step, and
> > which operators do we ultimately expect to cover (all or specific ones)?
> >
> > Best,
> > Lincoln Lee
> >
> >
> > On Mon, Jun 5, 2023 at 9:40 AM liu ron <ron9....@gmail.com> wrote:
> >
> > > Hi, Jark
> > >
> > > > Thanks for your feedback. According to my initial assessment, the work
> > > > effort is relatively large.
> > > >
> > > > I will also add test results for all queries to the FLIP.
> > >
> > > Best,
> > > Ron
> > >
> > > > On Thu, Jun 1, 2023 at 8:45 PM Jark Wu <imj...@gmail.com> wrote:
> > >
> > > > Hi Ron,
> > > >
> > > > > Thanks a lot for the great proposal. The FLIP looks good to me in
> > > > > general. It doesn't look like easy work, but the performance sounds
> > > > > promising, so I think it's worth doing.
> > > > >
> > > > > Besides, if there were a complete test graph covering all TPC-DS
> > > > > queries, the effect of this FLIP would be more intuitive.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > >
> > > > On Wed, 31 May 2023 at 14:27, liu ron <ron9....@gmail.com> wrote:
> > > >
> > > > > > Hi, Jingsong
> > > > >
> > > > > Thanks for your valuable suggestions.
> > > > >
> > > > > Best,
> > > > > Ron
> > > > >
> > > > > > On Tue, May 30, 2023 at 1:22 PM Jingsong Li <jingsongl...@gmail.com> wrote:
> > > > >
> > > > > > Thanks Ron for your information.
> > > > > >
> > > > > > > I suggest adding it to the Motivation section of the FLIP.
> > > > > >
> > > > > > Best,
> > > > > > Jingsong
> > > > > >
> > > > > > > On Tue, May 30, 2023 at 9:57 AM liu ron <ron9....@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi, Jingsong
> > > > > > >
> > > > > > > > Thanks for your review. We have tested it on TPC-DS and got a
> > > > > > > > 12% gain overall when supporting only the Calc, HashJoin and
> > > > > > > > HashAgg operators. In some queries, we even get more than a 30%
> > > > > > > > gain, so it looks like an effective approach.
> > > > > > >
> > > > > > > Best,
> > > > > > > Ron
> > > > > > >
> > > > > > > > On Mon, May 29, 2023 at 2:33 PM Jingsong Li <jingsongl...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Thanks Ron for the proposal.
> > > > > > > >
> > > > > > > > > Do you have some benchmark results for the performance
> > > > > > > > > improvement? I am more concerned about the improvement on
> > > > > > > > > Flink than the data in other papers.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jingsong
> > > > > > > >
> > > > > > > > > On Mon, May 29, 2023 at 2:16 PM liu ron <ron9....@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hi, dev
> > > > > > > > >
> > > > > > > > > > I'd like to start a discussion about FLIP-315: Support Operator
> > > > > > > > > > Fusion Codegen for Flink SQL[1]
> > > > > > > > >
> > > > > > > > > > As main memory grows, query performance is more and more
> > > > > > > > > > determined by the raw CPU costs of query processing itself.
> > > > > > > > > > This is because query processing techniques based on
> > > > > > > > > > interpreted execution show poor performance on modern CPUs
> > > > > > > > > > due to lack of locality and frequent instruction
> > > > > > > > > > mispredictions. Therefore, the industry is also researching
> > > > > > > > > > how to improve engine performance by increasing operator
> > > > > > > > > > execution efficiency. In addition, while optimizing Flink's
> > > > > > > > > > performance for TPC-DS queries, we found that a significant
> > > > > > > > > > amount of CPU time was spent on virtual function calls,
> > > > > > > > > > framework collector calls, and invalid calculations, which
> > > > > > > > > > can be optimized to improve overall engine performance.
> > > > > > > > > > After some investigation, we found that Operator Fusion
> > > > > > > > > > Codegen, proposed by Thomas Neumann in the paper[2], can
> > > > > > > > > > address these problems. I have finished a PoC[3] to verify
> > > > > > > > > > its feasibility and validity.
> > > > > > > > >
> > > > > > > > > Looking forward to your feedback.
> > > > > > > > >
> > > > > > > > > [1]:
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
> > > > > > > > > [2]: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
> > > > > > > > > [3]: https://github.com/lsyldliu/flink/tree/OFCG
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Ron
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>


-- 

Best,
Benchao Li
