Thanks, Liya, for starting this discussion. Because the changes are very large and have a great impact on the architecture, I think we should be clearer about the benchmark. We need to test cautiously to make sure that we really benefit from it.

About the benchmark:
1. Can you make the test report clearer? For example, state the environment and the data scale.
2. Can the test scenarios be richer? For example, benchmarks for spill scenarios and for TPC-DS.
3. Can we have more detailed test conclusions, such as which kinds of cases become faster?

At present, the computation in blink-planner is not perfect. For example, it could avoid the overhead of virtual function calls, and the aggregation algorithm needs to be improved. Can you analyze this further with your benchmark?

> More compact memory layout

I think BinaryRow and ColumnarRow already have an efficient and compact memory layout, just as you mentioned in the doc. Blink's ColumnarRow now has vector computing features. We can also push down many computations into the specific source, which supports computation near the source more natively.

I don't think complete vector computation is that necessary. For the following reasons, it is difficult for the computation after the source to benefit from vectorization:
1. The cost of conversion between VectorBatch and Row may be the performance killer. I think we should run some performance tests on it. If there are join/shuffle nodes, will there be vector-to-row and row-to-vector overhead? These two operators are often the key to job performance.
2. The vectorized versions of operators like sort and aggregation may need more benchmarks. I have no idea how they would perform.
3. Java SIMD can currently improve only a limited set of vector computations, such as filter and calc, but the bottleneck of batch jobs is often not there; it is more often in join and shuffle. Complete Java vector computing looks like a long way off. If we vectorize through JNI, the cost of JNI cannot be ignored. And a SIMD algorithm is not necessarily faster, since it brings many additional data copies.
4. If we move forward with CodeGenerator (like Spark WholeStageCodeGen), can we achieve better results without vector computation?
The JavaCompiler/JVM may optimize the code into SIMD instructions.

Another point: the vectorized versions of the operators may also need to consider memory management.

Best,
JingsongLee

------------------------------------------------------------------
From: Fan Liya <liya.fa...@gmail.com>
Send Time: Tuesday, July 2, 2019, 16:31
To: dev <dev@flink.apache.org>; Ji Liu <niki...@aliyun.com>
Subject: Re: [DISCUSS] Vectorization Support in Flink

@Ji Liu, thanks a lot for your feedback.
This work must be performed in a progressive manner, so as not to break existing code.

Best,
Liya Fan

On Tue, Jul 2, 2019 at 3:57 PM Ji Liu <niki...@aliyun.com.invalid> wrote:

> Hi Liya,
> Thanks for opening this discussion.
> +1 for this. Vectorization makes sense for Flink, especially for batch
> workloads. I think Flink should look into supporting it progressively.
>
> Thanks,
> Ji Liu
>
>
> ------------------------------------------------------------------
> From: Jeff Zhang <zjf...@gmail.com>
> Send Time: Tuesday, July 2, 2019, 15:50
> To: dev <dev@flink.apache.org>
> Subject: Re: [DISCUSS] Vectorization Support in Flink
>
> Hi Liya,
>
> Displaying images is not supported on the Apache mailing list; you need to
> put the image elsewhere and post a link on the mailing list.
>
>
>
> On Tue, Jul 2, 2019 at 3:40 PM, Fan Liya <liya.fa...@gmail.com> wrote:
>
> > Performance chart. FYI.
> >
> > Best,
> > Liya Fan
> > [image: image.png]
> >
> > On Tue, Jul 2, 2019 at 3:37 PM Fan Liya <liya.fa...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> We have opened an issue about vectorization in Flink (FLINK-13053
> >> <https://issues.apache.org/jira/browse/FLINK-13053>). Would you please
> >> give your valuable feedback? Thank you in advance.
> >>
> >> Vectorization is a popular technique in SQL engines today. Compared with
> >> the traditional row-based approach, it has some distinct advantages, for
> >> example:
> >>
> >> 1) Better use of CPU resources (e.g. SIMD)
> >>
> >> 2) More compact memory layout
> >>
> >> 3) More friendly to compressed data formats.
> >>
> >> Currently, Flink is based on a row-based SQL engine for both stream and
> >> batch workloads. To enjoy the above benefits, we want to bring
> >> vectorization to Flink. This involves substantial changes to the existing
> >> code base. Therefore, we give a plan to carry out such changes in small,
> >> incremental steps, in order not to affect existing features. We want to
> >> apply it to batch workload first. The details can be found in our proposal.
> >>
> >> For the past months, we have developed an initial implementation of the
> >> above ideas. Initial performance evaluations on TPC-H benchmarks show that
> >> substantial performance improvements can be obtained by vectorization (see
> >> the figure below). More details can be found in our proposal.
> >>
> >> [image:
> >> https://lh5.googleusercontent.com/hjXkXGImWOjaiB8zF0SKIMoItY6VCBm-BmJWWEXRo0ZPHdwLgKzCmIoNKef1YPCaAA7NXN6RvO-nwBBXBee52KeAtBjyIvh_NcAuChvW3BEtQuZGL5GPddqxL_iMV7HvEVCC6k-m
> >> ]
> >>
> >> Special thanks to @Kurt Young’s team for all the kind help.
> >>
> >> Special thanks to @Piotr Nowojski for all the valuable feedback and
> >> helpful suggestions.
> >>
> >> Best,
> >>
> >> Liya Fan
> >>
> >
>
> --
> Best Regards
>
> Jeff Zhang
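
As an illustration of the conversion concern in point 1 of the reply at the top of this message, here is a minimal Java sketch. It is not Flink code: the class VectorSketch, its nested Row type, and all method names are hypothetical and stand in for a row-based record and a columnar batch. It shows that a columnar filter is a tight loop over a primitive array (the kind of loop a JIT compiler may be able to auto-vectorize), while reaching that layout from rows costs one extra pass and one extra copy per column. Whether the columnar loop wins overall would have to be measured.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are Flink APIs. */
public class VectorSketch {

    /** A row with two int fields, standing in for a row-based record. */
    static final class Row {
        final int key;
        final int value;

        Row(int key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Row-at-a-time filter: count rows whose value exceeds the threshold. */
    static int countRowWise(List<Row> rows, int threshold) {
        int count = 0;
        for (Row r : rows) {
            if (r.value > threshold) {
                count++;
            }
        }
        return count;
    }

    /** Row-to-vector conversion: one extra pass and one extra copy per column. */
    static int[] toValueColumn(List<Row> rows) {
        int[] column = new int[rows.size()];
        for (int i = 0; i < column.length; i++) {
            column[i] = rows.get(i).value;
        }
        return column;
    }

    /** Columnar filter: a branch-free loop over a primitive array. */
    static int countColumnar(int[] values, int threshold) {
        int count = 0;
        for (int i = 0; i < values.length; i++) {
            count += values[i] > threshold ? 1 : 0;
        }
        return count;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(new Row(i, i * 10)); // values 0, 10, ..., 90
        }
        int rowWise = countRowWise(rows, 45);
        int columnar = countColumnar(toValueColumn(rows), 45);
        System.out.println(rowWise + " " + columnar); // prints "5 5"
    }
}
```

Both paths return the same count; the difference is that the columnar path pays for toValueColumn first. A benchmark of the kind requested in the reply would need to time that conversion together with the filter, not the filter alone.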