[PR] [common] Introduce PrefixFileIndex for prefix query optimization [paimon]

via GitHub Thu, 30 Apr 2026 02:14:02 -0700


xuzifu666 opened a new pull request, #7750:
URL: https://github.com/apache/paimon/pull/7750


   ### Purpose
   In real-world analytics scenarios, prefix queries on high-cardinality string 
columns are very common. For example:
   ```
   WHERE url LIKE '/api/v1/%'
   WHERE order_id LIKE 'ORD2024%'
   ```
   Existing file indexes in Paimon, such as BloomFilter and Bitmap, excel at 
equality lookups but cannot efficiently handle prefix matching. BloomFilter 
only checks exact value existence; Bitmap Index maps each distinct value to a 
bitmap, making it impossible to determine which values share a common prefix 
without scanning all entries.When no suitable index exists, the query engine 
must perform a full file scan — reading the entire data file (often tens of 
MBs) just to discover that no rows match the prefix predicate. This becomes 
prohibitively expensive at scale.
   
   This PR introduces **PrefixFileIndex**, a new pluggable file-level index 
that accelerates prefix queries through a lightweight inverted index structure.
   
   Prefix File Index is an inverted index that maps prefix strings to row 
number bitmaps. Unlike Bitmap Index which indexes exact values, it extracts the 
first N characters from each string value and groups rows by their prefix.
   
   According to benchmark test result：
   ### Test Environment
   - **CPU**: Apple M4
   - **JVM**: Java HotSpot 17.0.12
   - **Data Volume**: 1 million string rows (`{category}_{id}` format, 5 
categories)
   - **Test Module**: `paimon-benchmark/paimon-micro-benchmarks`
   
   ---
   
   ### 1. Index Size Comparison
   
   | Cardinality | PrefixLen=2 | PrefixLen=3 | PrefixLen=4 | BitmapIndex | 
**Raw Data** | Prefix Space Saving |
   
|-------------|-------------|-------------|-------------|-------------|-------------|---------------------|
   | 100         | 649 KB      | 649 KB      | 649 KB      | 2.0 MB      | 
**13.1 MB** | **20x vs data**     |
   | 1000        | 649 KB      | 649 KB      | 649 KB      | 2.7 MB      | 
**14.7 MB** | **23x vs data**     |
   | 10000       | 649 KB      | 649 KB      | 649 KB      | 7.7 MB      | 
**15.7 MB** | **24x vs data**     |
   
   **Key Finding**: Prefix Index size is independent of data cardinality, 
depending only on the number of prefix types. Even at cardinality 10000, the 
index remains at ~649KB.
   
   ---
   
   ### 2. Index Build Time Comparison
   
   | Cardinality | PrefixLen=2 | PrefixLen=3 | PrefixLen=4 | BitmapIndex | 
**Prefix Build Speedup** |
   
|-------------|-------------|-------------|-------------|-------------|--------------------------|
   | 100         | 126 ms      | 114 ms      | 110 ms      | 163 ms      | 
**1.3-1.5x**             |
   | 1000        | 112 ms      | 112 ms      | 110 ms      | 246 ms      | 
**2.2x**                 |
   | 10000       | 111 ms      | 110 ms      | 113 ms      | 633 ms      | 
**5.6x**                 |
   
   ---
   
   ### 3. Query Performance — Hit Scenario (REMAIN)
   
   Querying an existing prefix; no-index scan checks approximately 500K rows on 
average (returns upon first match):
   
   | Cardinality | PrefixIndex | BitmapIndex | **No-Index-Full-Scan** | Index 
vs Full Scan |
   
|-------------|-------------|-------------|------------------------|--------------------|
   | 100         | ~1.3 μs     | ~2.1 μs     | **0.14 μs**            | Full 
scan 9x*      |
   | 1000        | ~1.1 μs     | ~1.9 μs     | **0.79 μs**            | Full 
scan 1.4x*    |
   | 10000       | ~1.1 μs     | ~2.0 μs     | **0.76 μs**            | Full 
scan 1.5x*    |
   
   > *Note: In this scenario, full scan runs in memory and returns upon first 
match, which does not represent real I/O scenarios. In production, data must be 
read from disk, and the index advantage would be magnified hundreds of times.
   
   ---
   
   ### 4. Query Performance — Skip Scenario (Core Value)
   
   Querying a non-existing prefix; no-index scan must check **all 1 million 
rows** to confirm no match:
   
   | Cardinality | PrefixIndex | BitmapIndex | **No-Index-Full-Scan** | 
**Prefix Index Speedup** |
   
|-------------|-------------|-------------|------------------------|--------------------------|
   | 100         | ~1.3 μs     | ~2.1 μs     | **12,558 μs**          | 
**~9.8x**                |
   | 1000        | ~1.1 μs     | ~2.1 μs     | **13,147 μs**          | 
**~11.9x**               |
   | 10000       | ~1.2 μs     | ~1.6 μs     | **13,262 μs**          | 
**~11.3x**               |
   
   ---
   
   ### 5. Production Scenario Inference
   
   The above tests were conducted **in memory**, without accounting for disk 
I/O. In production:
   
   | Scenario          | No Index              | Prefix Index          | 
Inferred Speedup |
   
|-------------------|-----------------------|-----------------------|------------------|
   | Data file size    | ~15 MB (1M rows)      | ~649 KB               | -      
          |
   | Disk read time    | ~50-200 ms            | ~0.1 ms (cache hit)   | 
**500-2000x**    |
   | Skip decision time| Must read all data    | Returns SKIP in 1 μs  | **Tens 
of thousands x** |
   
   ---
   
   ### Conclusion
   
   | Dimension      | Prefix Index Advantage                                    
                                                                   |
   
|----------------|------------------------------------------------------------------------------------------------------------------------------|
   | **Index Size** | Only **1/20** of raw data, **3-12x** smaller than Bitmap 
Index                                                               |
   | **Build Speed**| Up to **5.6x** faster in high-cardinality scenarios       
                                                                   |
   | **Skip Performance** | **10-12x** faster in memory, **hundreds to 
thousands x** in real disk I/O                                              
   
   
   
   ### Tests
   
   PrefixFileIndexTest
   PrefixIndexBenchmark


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[PR] [common] Introduce PrefixFileIndex for prefix query optimization [paimon]

Reply via email to