Thanks to Jingsong for the proposal.
Global indexes can be useful in AI scenarios such as vector retrieval, and
projects like LanceDB have already adopted this concept. Implementing a global
index in the Paimon data lake could further extend its applicability to other
scenarios.


I hope Paimon’s index-abstraction interfaces will be designed to support 
customization by users. I’m looking forward to this feature.


Best wishes,
Xinyu Liu



At 2025-10-23 11:32:02, "Jingsong Li" <[email protected]> wrote:
>Hi everyone,
>
>I'd like to start a new discussion. [1]
>
>Global Index is a new indexing mechanism provided by Paimon, which is
>designed to optimize the performance of field equality queries,
>range queries, and complex filtering conditions. Compared with
>traditional file indexes, global indexes are managed through unified
>metadata. It solves the problem of index fragmentation in distributed
>scenarios and supports more flexible query modes. The file index is an
>index file for each file, while the global index is a table-level
>index that manages all data in a unified manner.
>
>The Index Manifest manages global indexes. We already have two Index
>Manifest types, 'DELETION_VECTORS' and 'HASH'. We can first introduce
>the 'BITMAP' global index. The global index maintains the mapping
>relationship between the index field and the global row id, so the
>global index feature needs to rely on the row-tracking.enabled
>feature.
>
>[1] 
>https://cwiki.apache.org/confluence/display/PAIMON/PIP-38%3A+Introduce+Global+Index+for+Paimon+Table
>
>Best,
>Jingsong
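
The mapping described in the quoted proposal, from an index field value to the global row ids that hold it, can be illustrated with a small in-memory sketch. This is only an illustration under stated assumptions, not Paimon's implementation or API: the class name BitmapGlobalIndexSketch and its methods are hypothetical, the indexed field is assumed to be an int, and row ids are narrowed to int so that java.util.BitSet can hold them.

```java
import java.util.BitSet;
import java.util.TreeMap;

/**
 * Hypothetical table-level bitmap index: one bitmap of global row ids
 * per distinct field value. Not part of Paimon.
 */
public class BitmapGlobalIndexSketch {

    // Sorted map so that range queries can walk a contiguous key range.
    private final TreeMap<Integer, BitSet> valueToRowIds = new TreeMap<>();

    /** Records that the row with the given global row id holds fieldValue. */
    public void add(int fieldValue, int globalRowId) {
        valueToRowIds.computeIfAbsent(fieldValue, k -> new BitSet()).set(globalRowId);
    }

    /** Equality query: row ids whose field equals fieldValue. */
    public BitSet equalTo(int fieldValue) {
        BitSet hit = valueToRowIds.get(fieldValue);
        // Return a copy so callers cannot mutate the index.
        return hit == null ? new BitSet() : (BitSet) hit.clone();
    }

    /** Range query: row ids whose field lies in [low, high], both inclusive. */
    public BitSet between(int low, int high) {
        BitSet result = new BitSet();
        if (low > high) {
            return result; // empty range; subMap would throw otherwise
        }
        for (BitSet rows : valueToRowIds.subMap(low, true, high, true).values()) {
            result.or(rows);
        }
        return result;
    }
}
```

Because each query returns a bitmap of row ids, complex filters can be combined with BitSet.and and BitSet.or before any data file is read, which is one reason such an index would depend on stable global row ids.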
