Hi Jingsong,

Agree! Let's do that first.

Best,
Fang Yong

On Wed, Jul 24, 2024 at 6:21 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> Thanks Yong and Ming!
>
> We can go ahead and improve the structure of the sort file based on
> performance testing.
>
> Let's strive to have it replace Hash File as the default option as soon as
> possible.
>
> Best,
> Jingsong
>
> On Mon, Jul 22, 2024 at 10:58 AM Yong Fang <zjur...@gmail.com> wrote:
> >
> > Hi devs,
> >
> > LiMing and I would like to initiate a discussion on PIP-25: Introduce a
> > key-value file format for Paimon primary key tables [1]. Currently, when
> > Paimon requires creating lookup tables for lookup joins in streaming
> > processes, it reads data from ORC/Parquet/Avro format files in HDFS/S3,
> > converts records to key-value format data, and writes them to disk. This
> > process consumes a substantial amount of time.
> >
> > JingSong Lee has added a basic sort lookup store in Paimon, and we aim to
> > introduce the new key-value file format into Paimon based on it in order to
> > reduce the cost of creating lookup tables. Users can take advantage of this
> > file format for Paimon primary key tables when using Paimon as a lookup
> > table. In this case, Paimon will create lookup tables based on the sorted
> > store files without rebuilding key-value files.
> >
> > Looking forward to your feedback, thanks.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/PAIMON/PIP-25%3A+Introduce+a+key-value+file+format+for+paimon+primary+key+table
> >
> > Best,
> > Fang Yong
>
