xuzifu666 opened a new pull request, #7750:
URL: https://github.com/apache/paimon/pull/7750
### Purpose
In real-world analytics scenarios, prefix queries on high-cardinality string
columns are very common. For example:
```
WHERE url LIKE '/api/v1/%'
WHERE order_id LIKE 'ORD2024%'
```
Existing file indexes in Paimon, such as BloomFilter and Bitmap, excel at
equality lookups but cannot efficiently handle prefix matching. BloomFilter
only checks exact value existence; Bitmap Index maps each distinct value to a
bitmap, making it impossible to determine which values share a common prefix
without scanning all entries.When no suitable index exists, the query engine
must perform a full file scan — reading the entire data file (often tens of
MBs) just to discover that no rows match the prefix predicate. This becomes
prohibitively expensive at scale.
This PR introduces **PrefixFileIndex**, a new pluggable file-level index
that accelerates prefix queries through a lightweight inverted index structure.
Prefix File Index is an inverted index that maps prefix strings to row
number bitmaps. Unlike Bitmap Index which indexes exact values, it extracts the
first N characters from each string value and groups rows by their prefix.
According to benchmark test result:
### Test Environment
- **CPU**: Apple M4
- **JVM**: Java HotSpot 17.0.12
- **Data Volume**: 1 million string rows (`{category}_{id}` format, 5
categories)
- **Test Module**: `paimon-benchmark/paimon-micro-benchmarks`
---
### 1. Index Size Comparison
| Cardinality | PrefixLen=2 | PrefixLen=3 | PrefixLen=4 | BitmapIndex |
**Raw Data** | Prefix Space Saving |
|-------------|-------------|-------------|-------------|-------------|-------------|---------------------|
| 100 | 649 KB | 649 KB | 649 KB | 2.0 MB |
**13.1 MB** | **20x vs data** |
| 1000 | 649 KB | 649 KB | 649 KB | 2.7 MB |
**14.7 MB** | **23x vs data** |
| 10000 | 649 KB | 649 KB | 649 KB | 7.7 MB |
**15.7 MB** | **24x vs data** |
**Key Finding**: Prefix Index size is independent of data cardinality,
depending only on the number of prefix types. Even at cardinality 10000, the
index remains at ~649KB.
---
### 2. Index Build Time Comparison
| Cardinality | PrefixLen=2 | PrefixLen=3 | PrefixLen=4 | BitmapIndex |
**Prefix Build Speedup** |
|-------------|-------------|-------------|-------------|-------------|--------------------------|
| 100 | 126 ms | 114 ms | 110 ms | 163 ms |
**1.3-1.5x** |
| 1000 | 112 ms | 112 ms | 110 ms | 246 ms |
**2.2x** |
| 10000 | 111 ms | 110 ms | 113 ms | 633 ms |
**5.6x** |
---
### 3. Query Performance — Hit Scenario (REMAIN)
Querying an existing prefix; no-index scan checks approximately 500K rows on
average (returns upon first match):
| Cardinality | PrefixIndex | BitmapIndex | **No-Index-Full-Scan** | Index
vs Full Scan |
|-------------|-------------|-------------|------------------------|--------------------|
| 100 | ~1.3 μs | ~2.1 μs | **0.14 μs** | Full
scan 9x* |
| 1000 | ~1.1 μs | ~1.9 μs | **0.79 μs** | Full
scan 1.4x* |
| 10000 | ~1.1 μs | ~2.0 μs | **0.76 μs** | Full
scan 1.5x* |
> *Note: In this scenario, full scan runs in memory and returns upon first
match, which does not represent real I/O scenarios. In production, data must be
read from disk, and the index advantage would be magnified hundreds of times.
---
### 4. Query Performance — Skip Scenario (Core Value)
Querying a non-existing prefix; no-index scan must check **all 1 million
rows** to confirm no match:
| Cardinality | PrefixIndex | BitmapIndex | **No-Index-Full-Scan** |
**Prefix Index Speedup** |
|-------------|-------------|-------------|------------------------|--------------------------|
| 100 | ~1.3 μs | ~2.1 μs | **12,558 μs** |
**~9.8x** |
| 1000 | ~1.1 μs | ~2.1 μs | **13,147 μs** |
**~11.9x** |
| 10000 | ~1.2 μs | ~1.6 μs | **13,262 μs** |
**~11.3x** |
---
### 5. Production Scenario Inference
The above tests were conducted **in memory**, without accounting for disk
I/O. In production:
| Scenario | No Index | Prefix Index |
Inferred Speedup |
|-------------------|-----------------------|-----------------------|------------------|
| Data file size | ~15 MB (1M rows) | ~649 KB | -
|
| Disk read time | ~50-200 ms | ~0.1 ms (cache hit) |
**500-2000x** |
| Skip decision time| Must read all data | Returns SKIP in 1 μs | **Tens
of thousands x** |
---
### Conclusion
| Dimension | Prefix Index Advantage
|
|----------------|------------------------------------------------------------------------------------------------------------------------------|
| **Index Size** | Only **1/20** of raw data, **3-12x** smaller than Bitmap
Index |
| **Build Speed**| Up to **5.6x** faster in high-cardinality scenarios
|
| **Skip Performance** | **10-12x** faster in memory, **hundreds to
thousands x** in real disk I/O
### Tests
PrefixFileIndexTest
PrefixIndexBenchmark
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]