Thanks, Liya, for starting this discussion.
Since the changes are very large and have a great impact on the
architecture, I think we should be clearer about the benchmark. We need
to test carefully to ensure that we really benefit from it.

About the benchmark:
1. Can you make the test report clearer? For example, describe the
environment and the data scale.
2. Can the test scenarios be richer? For example, a benchmark of
spill scenarios and a TPC-DS benchmark.
3. Can we have more detailed test conclusions, such as which kinds of
cases become faster? At present, the computation in blink-planner is
not perfect. For example, it could still avoid the overhead of virtual
function calls, and the aggregation algorithm needs to be improved.
Could you analyze this further with your benchmark?

> More compact memory layout
I think BinaryRow and ColumnarRow already have an efficient 
and compact memory layout.
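
To make "compact" concrete, here is a minimal sketch of a fixed-width binary row: a null-bit header followed by one 8-byte slot per field, all in a single buffer. This is only an illustration of the general idea. CompactRowSketch and its methods are hypothetical names, not the actual BinaryRow or ColumnarRow classes, and variable-length fields are left out.

```java
import java.nio.ByteBuffer;

/** Hypothetical fixed-width row: 8-byte null-bit header, then one 8-byte slot per field. */
public class CompactRowSketch {
    private static final int HEADER_BYTES = 8; // room for up to 64 null bits
    private static final int SLOT_BYTES = 8;   // every field occupies one slot

    private final ByteBuffer buf;
    private final int arity;

    public CompactRowSketch(int arity) {
        if (arity < 0 || arity > 64) {
            throw new IllegalArgumentException("arity must be in [0, 64]");
        }
        this.arity = arity;
        this.buf = ByteBuffer.allocate(HEADER_BYTES + arity * SLOT_BYTES);
    }

    /** Byte offset of a field's slot; also serves as the bounds check. */
    private int offset(int field) {
        if (field < 0 || field >= arity) {
            throw new IndexOutOfBoundsException("field " + field);
        }
        return HEADER_BYTES + field * SLOT_BYTES;
    }

    public void setLong(int field, long value) {
        buf.putLong(offset(field), value);
        buf.putLong(0, buf.getLong(0) & ~(1L << field)); // clear the null bit
    }

    public void setDouble(int field, double value) {
        buf.putDouble(offset(field), value);
        buf.putLong(0, buf.getLong(0) & ~(1L << field)); // clear the null bit
    }

    public void setNull(int field) {
        offset(field); // bounds check only
        buf.putLong(0, buf.getLong(0) | (1L << field));  // set the null bit
    }

    public boolean isNull(int field) {
        offset(field); // bounds check only
        return (buf.getLong(0) & (1L << field)) != 0;
    }

    public long getLong(int field) {
        return buf.getLong(offset(field));
    }

    public double getDouble(int field) {
        return buf.getDouble(offset(field));
    }

    public int sizeInBytes() {
        return buf.capacity();
    }
}
```

In this sketch a row with three fields takes 32 bytes in one contiguous buffer, with no per-field objects, which is the kind of layout I mean by compact.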

Just as you mentioned in the doc, Blink's ColumnarRow already has
vector computing features. We can also push down many
calculations into the specific source, which supports computation
near the source more natively. I don't think complete vectorized
computation is that necessary. For the following reasons, it is
difficult for the later computation (after the source) to benefit from
vectorization:
1. The cost of conversion between VectorBatch and Row may be the
performance killer. I think we should run some performance tests
on it. If there are join/shuffle nodes, will there be vector-to-row
and row-to-vector overhead? These two operators are often the
key to job performance.
2. The vectorized versions of operators like sort and aggregation
may need more benchmarks. I have no idea how they perform.
3. Java SIMD can currently improve only a limited number of vector
computations, such as filter and calc, but the bottleneck of batch
jobs is often not there; it is more often in join and shuffle.
Complete Java vector computing looks like a long way off. If we
vectorize through JNI, the cost of JNI cannot be ignored. And a
SIMD algorithm is not necessarily faster, since it brings a lot of
additional data copies.
4. If we move forward with the CodeGenerator (like Spark's
WholeStageCodegen), can we achieve better results without
vector computation? The Java compiler/JVM may optimize the
code into SIMD instructions.
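
To make points 1, 3 and 4 more concrete, here is a minimal sketch of what I mean. Row, Batch, toBatch, toRows and scale are hypothetical names for illustration only, not Flink or Blink classes. The two conversion methods show where the row-to-vector and vector-to-row copies come from, and scale is a calc-style loop over a primitive array, which is the kind of simple loop the JVM may be able to turn into SIMD instructions on its own (whether it does depends on the JVM version and the hardware).

```java
import java.util.ArrayList;
import java.util.List;

public class RowBatchSketch {

    /** Hypothetical row: one heap object per record. */
    public static final class Row {
        public final long id;
        public final double price;

        public Row(long id, double price) {
            this.id = id;
            this.price = price;
        }
    }

    /** Hypothetical batch: one primitive array per column. */
    public static final class Batch {
        public final long[] id;
        public final double[] price;

        public Batch(int size) {
            this.id = new long[size];
            this.price = new double[size];
        }

        public int size() {
            return id.length;
        }
    }

    /** Row-to-vector: copies every field of every row once. */
    public static Batch toBatch(List<Row> rows) {
        Batch batch = new Batch(rows.size());
        for (int i = 0; i < rows.size(); i++) {
            Row row = rows.get(i);
            batch.id[i] = row.id;
            batch.price[i] = row.price;
        }
        return batch;
    }

    /** Vector-to-row: allocates one object per record again. */
    public static List<Row> toRows(Batch batch) {
        List<Row> rows = new ArrayList<>(batch.size());
        for (int i = 0; i < batch.size(); i++) {
            rows.add(new Row(batch.id[i], batch.price[i]));
        }
        return rows;
    }

    /** Calc-style kernel: an element-wise loop over a single column. */
    public static double[] scale(double[] price, double factor) {
        double[] out = new double[price.length];
        for (int i = 0; i < price.length; i++) {
            out[i] = price[i] * factor;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            rows.add(new Row(i, i * 2.0));
        }
        Batch batch = toBatch(rows);            // conversion before the vectorized operator
        double[] scaled = scale(batch.price, 0.5);
        List<Row> back = toRows(batch);         // conversion before a row-based operator
        System.out.println(scaled[3] + " " + back.size());
    }
}
```

If a row-based join or shuffle sits between two vectorized operators, both conversions run for every batch, so a benchmark should measure them together with the kernels rather than the kernels alone.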

Another thing: the vectorized versions of operators may also need
to consider the problem of memory management.

Best, JingsongLee


------------------------------------------------------------------
From:Fan Liya <liya.fa...@gmail.com>
Send Time: July 2, 2019 (Tuesday) 16:31
To:dev <dev@flink.apache.org>; Ji Liu <niki...@aliyun.com>
Subject:Re: [DISCUSS] Vectorization Support in Flink

@Ji Liu, thanks a lot for your feedback.
This work must be performed in a progressive manner, so as not to break
existing code.

Best,
Liya Fan

On Tue, Jul 2, 2019 at 3:57 PM Ji Liu <niki...@aliyun.com.invalid> wrote:

> Hi Liya,
> Thanks for opening this discussion.
> +1 for this. Vectorization makes sense for Flink, especially for batch
> workloads. I think Flink should look into supporting it progressively.
>
> Thanks,
> Ji Liu
>
>
> ------------------------------------------------------------------
> From:Jeff Zhang <zjf...@gmail.com>
> Send Time: July 2, 2019 (Tuesday) 15:50
> To:dev <dev@flink.apache.org>
> Subject:Re: [DISCUSS] Vectorization Support in Flink
>
> Hi Liya,
>
> Displaying images is not supported on the Apache mailing list. You need to
> host the image elsewhere and post a link to it on the mailing list.
>
>
>
> Fan Liya <liya.fa...@gmail.com> wrote on Tue, Jul 2, 2019 at 3:40 PM:
>
> > Performance chart. FYI.
> >
> > Best,
> > Liya Fan
> > [image: image.png]
> >
> > On Tue, Jul 2, 2019 at 3:37 PM Fan Liya <liya.fa...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> We have opened an issue about vectorization in Flink (FLINK-13053
> >> <https://issues.apache.org/jira/browse/FLINK-13053>). Would you please
> >> give your valuable feedback? Thank you in advance.
> >>
> >> Vectorization is a popular technique in SQL engines today. Compared with
> >> the traditional row-based approach, it has some distinct advantages, for
> >> example:
> >>
> >>
> >>
> >> 1)      Better use of CPU resources (e.g. SIMD)
> >>
> >> 2)      More compact memory layout
> >>
> >> 3)      More friendly to compressed data format.
> >>
> >>
> >>
> >> Currently, Flink is based on a row-based SQL engine for both stream and
> >> batch workloads. To enjoy the above benefits, we want to bring
> >> vectorization to Flink. This involves substantial changes to the existing
> >> code base. Therefore, we give a plan to carry out such changes in small,
> >> incremental steps, in order not to affect existing features. We want to
> >> apply it to batch workload first. The details can be found in our proposal.
> >>
> >>
> >>
> >> For the past months, we have developed an initial implementation of the
> >> above ideas. Initial performance evaluations on TPC-H benchmarks show that
> >> substantial performance improvements can be obtained by vectorization (see
> >> the figure below). More details can be found in our proposal.
> >>
> >>
> >>
> >> [image:
> >> https://lh5.googleusercontent.com/hjXkXGImWOjaiB8zF0SKIMoItY6VCBm-BmJWWEXRo0ZPHdwLgKzCmIoNKef1YPCaAA7NXN6RvO-nwBBXBee52KeAtBjyIvh_NcAuChvW3BEtQuZGL5GPddqxL_iMV7HvEVCC6k-m
> >> ]
> >>
> >>
> >>
> >> Special thanks to @Kurt Young’s team for all the kind help.
> >>
> >> Special thanks to @Piotr Nowojski for all the valuable feedback and
> >> helpful suggestions.
> >>
> >>
> >> Best,
> >>
> >> Liya Fan
> >>
> >
>
> --
> Best Regards
>
> Jeff Zhang
>
