Hi Yong,

Thank you for sharing this fascinating PIP proposal!


I'm particularly intrigued by the sort buffer concept for Append tables. I'd 
like to ask: does the current sorting process only support ordering by 
multiple fields in sequence (lexicographic order), or would it be possible to 
introduce more advanced sorting strategies, such as Z-order sorting or other 
space-filling curve algorithms?


The reason I'm curious about this is that in high-volume batch write scenarios, 
having the data already sorted upon completion of the batch write could 
significantly improve query efficiency. Z-order sorting, for instance, could 
provide better data locality for multi-dimensional range queries, which are 
quite common in analytical workloads.
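
To make the idea concrete, here is a rough sketch of what I mean by a Z-order
key. It assumes two non-negative integer columns that fit in 16 bits each; the
names interleave_bits and z_order_sort are hypothetical and are not Paimon
APIs. It only illustrates the bit-interleaving technique, not how the sort
buffer would actually implement it.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) key.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x land on even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y land on odd positions
    return key


def z_order_sort(rows, bits: int = 16):
    """Sort (x, y) rows by their Z-order key instead of field by field."""
    return sorted(rows, key=lambda r: interleave_bits(r[0], r[1], bits))


# A 4x4 grid of (x, y) values, sorted both ways for comparison.
rows = [(x, y) for x in range(4) for y in range(4)]
lexicographic = sorted(rows)   # ordered by x first, then y
z_sorted = z_order_sort(rows)  # ordered along the Z-order curve
```

On this small grid, the first four rows in lexicographic order all share x = 0, 
while the first four rows in Z-order cover the 2x2 block where both x and y 
are in [0, 1]. That is the locality property I have in mind for range filters 
on more than one column.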


Would love to hear your thoughts on whether such sorting enhancements align 
with the current design goals! Looking forward to your insights and the 
continued development of this proposal!


Best regards,

Lei Li

On January 13, 2026, at 17:32, Yong Fang <[email protected]> wrote:

Hi devs,

I'd like to initiate a discussion on PIP-41: Introduce FilePath Global
Index And Optimizations For Lookup In Append Table [1].

We use Paimon as cold storage for sample data of businesses such as search,
recommendation, and advertising. Given the extremely large volume of sample
data, we adopt Append Tables for data storage.

Batch jobs read and process this sample data, and that processing also
requires lookup-by-key capability on historical data. To support hybrid
queries for such ultra-high-dimensional data in Paimon, we introduce the
FilePath Global Index, along with optimizations for reading Parquet metadata
and Parquet file data in Append Tables, aiming to enhance the lookup
capability of Paimon Append Tables. (Some of the major optimization designs
come from our partner teams; cc @lingpeng, @guanziyue, thanks.)

Looking forward to hearing from you, thanks


[1]
https://cwiki.apache.org/confluence/display/PAIMON/PIP-41%3A+Introduce+FilePath+Global+Index+And+Optimizations+For+Lookup+In+Append+Table

Best,
Fang Yong
