airborne12 opened a new pull request, #59827:
URL: https://github.com/apache/doris/pull/59827
## Summary
This PR adds support for the `skip_write_index_on_load` table property for
ANN (Approximate Nearest Neighbor) indexes. When the property is enabled, ANN
index construction is skipped during data loading (INSERT, Stream Load) and
deferred to compaction or BUILD INDEX. Taking index construction off the load
path is intended to improve write throughput.
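
For reviewers, the deferral described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `Segment`, `write_segment`, `compact`, `build_index`, and `build_ann_index` are hypothetical names invented for illustration and are not the Doris C++ API; the "index" is a list of row ids rather than a real ANN structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Toy stand-in for a segment: raw vectors plus an optional ANN index."""
    vectors: List[List[float]]
    ann_index: Optional[List[int]] = None  # None models "no index built yet"


def build_ann_index(vectors: List[List[float]]) -> List[int]:
    # Placeholder for real ANN construction (hypothetical): records row ids only.
    return list(range(len(vectors)))


def write_segment(vectors: List[List[float]], skip_write_index_on_load: bool) -> Segment:
    # Load path: build the index only when the property is disabled.
    seg = Segment(vectors=list(vectors))
    if not skip_write_index_on_load:
        seg.ann_index = build_ann_index(seg.vectors)
    return seg


def compact(segments: List[Segment]) -> Segment:
    # Compaction path: merge the inputs and build a complete index on the output.
    merged = [v for s in segments for v in s.vectors]
    return Segment(vectors=merged, ann_index=build_ann_index(merged))


def build_index(seg: Segment) -> Segment:
    # BUILD INDEX path: fill in the index if the load skipped it.
    if seg.ann_index is None:
        seg.ann_index = build_ann_index(seg.vectors)
    return seg
```

In this model a segment written with `skip_write_index_on_load=True` carries no index until either `compact` or `build_index` runs, which mirrors the behavior the regression test checks.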
### Key Changes
1. **Skip ANN index during write** (`segment_writer.cpp`, `vertical_segment_writer.cpp`)
   - Check `skip_write_index_on_load` property before creating ANN index writer
   - Write empty index file placeholder when skipping
2. **Build ANN index during compaction** (`segment.cpp`, `segment_iterator.cpp`)
   - Load ANN index data from source segments during compaction
   - Build complete ANN index in output segment
3. **Support BUILD INDEX for empty ANN indexes** (`index_builder.cpp`)
   - Handle `INVERTED_INDEX_BYPASS` error for empty index files
   - Allow BUILD INDEX to create ANN index when `skip_write_index_on_load=true`
4. **Accurate index existence detection** (`tablet.cpp`, `cloud_tablet.cpp`)
   - Use `IndexFileReader::index_file_exist()` API to check actual index presence
   - Fast path optimization: check `index_size` for V2 format, `fs->exists()` for V1
5. **Cloud mode support** (`cloud_tablet.cpp`, `engine_cloud_index_change_task.cpp`)
   - Same functionality for cloud deployment mode
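
The fast path in item 4 can be sketched roughly as below. This is a hedged illustration, not the code in `tablet.cpp`: `IndexMeta` and `ann_index_exists` are hypothetical names, and treating `index_size == 0` as "only an empty placeholder was written" is an assumption about how the V2 check is interpreted.

```python
import os
from dataclasses import dataclass
from typing import Callable


@dataclass
class IndexMeta:
    """Hypothetical per-segment index metadata used only for this sketch."""
    storage_format: str   # "V1" or "V2"
    index_size: int = 0   # V2: index bytes recorded in metadata (assumed meaning)
    index_path: str = ""  # V1: path of the separate index file


def ann_index_exists(meta: IndexMeta,
                     exists: Callable[[str], bool] = os.path.exists) -> bool:
    if meta.storage_format == "V2":
        # V2 fast path: decide from metadata alone, no filesystem call.
        return meta.index_size > 0
    if meta.storage_format == "V1":
        # V1: the index lives in its own file, so ask the filesystem.
        return bool(meta.index_path) and bool(exists(meta.index_path))
    raise ValueError(f"unknown storage format: {meta.storage_format}")
```

Passing `exists` as a parameter keeps the sketch testable without touching a real filesystem; the actual change uses `fs->exists()` and `IndexFileReader::index_file_exist()` as listed above.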
### Test Plan
Added a regression test (`test_skip_write_index_on_load.groovy`) with four test cases:
- Case 1: `skip_write_index_on_load=true` - verify that no ANN index exists before compaction
- Case 2: `skip_write_index_on_load=true` - verify that the ANN index exists after compaction
- Case 3: `skip_write_index_on_load=false` - verify that the ANN index exists immediately after load
- Case 4: `skip_write_index_on_load=true` - verify that BUILD INDEX successfully builds the ANN index
### Check List (For Author)
- Test
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test
    - [ ] No need to test
- Behavior changed:
    - [x] Yes. Added support for `skip_write_index_on_load` property for ANN indexes
- Does this need documentation?
    - [x] Yes. Document the `skip_write_index_on_load` property for ANN indexes
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
🤖 Generated with [Claude Code](https://claude.com/claude-code)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]