Hi Jingsong,

Agreed! Let's do that first.

Best,
Fang Yong

On Wed, Jul 24, 2024 at 6:21 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> Thanks Yong and Ming!
>
> We can take action and improve the structure of sort file based on
> performance testing~
>
> Strive to replace Hash File as the default option as soon as possible.
>
> Best,
> Jingsong
>
> On Mon, Jul 22, 2024 at 10:58 AM Yong Fang <zjur...@gmail.com> wrote:
> >
> > Hi devs,
> >
> > LiMing and I would like to initiate a discussion on PIP-25: Introduce a
> > key-value file format for Paimon primary key tables [1]. Currently, when
> > Paimon requires creating lookup tables for lookup joins in streaming
> > processes, it reads data from ORC/Parquet/Avro format files in HDFS/S3,
> > converts records to key-value format data, and writes them to disk. This
> > process consumes a substantial amount of time.
> >
> > JingSong Lee has added a basic sort lookup store in Paimon, and we aim to
> > introduce the new key-value file format into Paimon based on it in order to
> > reduce the cost of creating lookup tables. Users can take advantage of this
> > file format for Paimon primary key tables when using Paimon as a lookup
> > table. In this case, Paimon will create lookup tables based on the sorted
> > store files without rebuilding key-value files.
> >
> > Looking forward to your feedback, thanks.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/PAIMON/PIP-25%3A+Introduce+a+key-value+file+format+for+paimon+primary+key+table
> >
> > Best,
> > Fang Yong