Thanks Yong and Ming!

We can move forward and improve the structure of the sort file based on
performance testing.

Let's aim to have it replace the hash file as the default option as soon as
possible.

Best,
Jingsong

On Mon, Jul 22, 2024 at 10:58 AM Yong Fang <[email protected]> wrote:
>
> Hi devs,
>
> LiMing and I would like to initiate a discussion on PIP-25: Introduce a
> key-value file format for Paimon primary key tables [1]. Currently, when
> Paimon requires creating lookup tables for lookup joins in streaming
> processes, it reads data from ORC/Parquet/Avro format files in HDFS/S3,
> converts records to key-value format data, and writes them to disk. This
> process consumes a substantial amount of time.
>
> JingSong Lee has added a basic sort lookup store in Paimon, and we aim to
> introduce the new key-value file format into Paimon based on it in order to
> reduce the cost of creating lookup tables. Users can take advantage of this
> file format for Paimon primary key tables when using Paimon as a lookup
> table. In this case, Paimon will create lookup tables based on the sorted
> store files without rebuilding key-value files.
>
> Looking forward to your feedback, thanks.
>
> [1]
> https://cwiki.apache.org/confluence/display/PAIMON/PIP-25%3A+Introduce+a+key-value+file+format+for+paimon+primary+key+table
>
> Best,
> Fang Yong
