CrownChu opened a new pull request, #7777: URL: https://github.com/apache/paimon/pull/7777
## Summary
- Add `paimon-elasticsearch` module implementing GlobalIndex SPI backed by Elasticsearch vector search
- Support archive-based index packaging (tar.gz) for Paimon file system storage
- Include Lucene directory adapter (`ArchiveDirectory`, `ArchiveBackedIndexInput`, `ArchiveFlatVectorReader`) for reading packed index segments
- Add SLF4J bridge for ES internal logging
- Add benchmarks (Sift1M, Lucene vs ES comparison) and integration tests
## Key Components
| Component | Description |
|-----------|-------------|
| `ESVectorGlobalIndexWriter` | Builds DiskBBQ vector index and packs it into an archive |
| `ESVectorGlobalIndexReader` | Reads archived index, supports multi-stage retrieval (coarse + rescore) |
| `ESVectorGlobalIndexer` | SPI entry point implementing `GlobalIndexer` |
| `ESIndexArchiveUtils` | Pack/unpack Lucene index segments to/from tar.gz |
| `ESVectorIndexOptions` | Configuration options (dimension, metric, ef, numCandidates, etc.) |
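
For readers unfamiliar with the "coarse + rescore" pattern mentioned for `ESVectorGlobalIndexReader`, the following is a minimal sketch of the general idea, not the code in this PR. It assumes a simple one-bit-per-dimension sign code as a stand-in for the index's quantized representation, and the names `TwoStageSearchSketch`, `signBits`, `squaredL2`, and `search` are hypothetical. The `numCandidates` parameter borrows the option name from the table above; its exact semantics in the PR are an assumption here.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

/** Illustrative two-stage (coarse + rescore) nearest-neighbor search. Hypothetical, not the PR's code. */
final class TwoStageSearchSketch {

    /** Coarse code: one sign bit per dimension (this sketch supports at most 64 dimensions). */
    static long signBits(float[] v) {
        long bits = 0L;
        for (int i = 0; i < v.length; i++) {
            if (v[i] > 0f) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    /** Exact squared Euclidean distance on the full-precision vectors. */
    static float squaredL2(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the ids of the k nearest vectors among the coarse candidates, nearest first. */
    static int[] search(float[][] vectors, float[] query, int numCandidates, int k) {
        long queryBits = signBits(query);
        long[] codes = Arrays.stream(vectors)
                .mapToLong(TwoStageSearchSketch::signBits)
                .toArray();

        // Stage 1 (coarse): keep the ids whose sign codes have the smallest Hamming distance to the query.
        int[] candidates = IntStream.range(0, vectors.length)
                .boxed()
                .sorted(Comparator.comparingInt((Integer id) -> Long.bitCount(codes[id] ^ queryBits)))
                .limit(numCandidates)
                .mapToInt(Integer::intValue)
                .toArray();

        // Stage 2 (rescore): re-rank only those candidates by exact distance.
        return Arrays.stream(candidates)
                .boxed()
                .sorted(Comparator.comparingDouble((Integer id) -> squaredL2(vectors[id], query)))
                .limit(k)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        float[][] vectors = {
            { 1.0f,  1.0f,  1.0f,  1.0f},
            { 0.5f,  0.4f,  0.6f,  0.5f},
            {-1.0f, -1.0f, -1.0f, -1.0f},
            { 1.0f,  1.0f, -0.2f,  1.0f},
        };
        float[] query = {0.6f, 0.5f, 0.5f, 0.5f};
        // The coarse stage keeps ids 0, 1, 3; rescoring orders them by exact distance.
        System.out.println(Arrays.toString(search(vectors, query, 3, 2))); // prints [1, 0]
    }
}
```

The trade-off this illustrates is that a larger `numCandidates` costs more exact-distance computations but lowers the chance that the coarse stage discards a true nearest neighbor.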
## Test Plan
- [x] Unit tests: `ESVectorGlobalIndexTest` covers write/read/search round-trip
- [x] Benchmarks: `ESVectorBenchmark`, `Sift1MBenchmark`, `LuceneVsESBenchmark`
