airborne12 opened a new pull request, #59827:
URL: https://github.com/apache/doris/pull/59827

   ## Summary
   
   This PR adds support for the `skip_write_index_on_load` table property for 
ANN (Approximate Nearest Neighbor) indexes. When the property is enabled, ANN 
index construction is skipped during data loading (INSERT, Stream Load) and 
deferred to compaction or BUILD INDEX. This takes index building off the load 
path and is expected to improve write throughput.
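
To make the deferral concrete, here is a minimal Python model of the behavior described above. It is only an illustration, not the Doris C++ code: `Segment`, `write_segment`, `compact`, and `build_ann_index` are hypothetical names, and the "index" is just a list of row ids standing in for a real ANN structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Toy segment: raw vectors plus an optional ANN index.

    ann_index is None when only an empty placeholder was written.
    """
    vectors: List[List[float]]
    ann_index: Optional[List[int]] = None


def build_ann_index(vectors: List[List[float]]) -> List[int]:
    # Stand-in for real ANN construction (assumption): record the row ids.
    return list(range(len(vectors)))


def write_segment(vectors: List[List[float]],
                  skip_write_index_on_load: bool) -> Segment:
    """Load path: build the index now unless the property defers it."""
    seg = Segment(vectors=list(vectors))
    if not skip_write_index_on_load:
        seg.ann_index = build_ann_index(seg.vectors)
    return seg


def compact(segments: List[Segment]) -> Segment:
    """Compaction: merge rows and build a complete index on the output."""
    merged = [v for s in segments for v in s.vectors]
    return Segment(vectors=merged, ann_index=build_ann_index(merged))


# Two loads with the property enabled leave no index behind;
# compaction then produces one index covering both rows.
a = write_segment([[0.1, 0.2]], skip_write_index_on_load=True)
b = write_segment([[0.3, 0.4]], skip_write_index_on_load=True)
out = compact([a, b])
```

In this model the cost of `build_ann_index` is paid once at compaction time instead of once per load, which is the trade-off the property exposes.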
   
   ### Key Changes
   
   1. **Skip ANN index during write** (`segment_writer.cpp`, 
`vertical_segment_writer.cpp`)
      - Check the `skip_write_index_on_load` property before creating the ANN 
index writer
      - Write an empty index file as a placeholder when skipping
   
   2. **Build ANN index during compaction** (`segment.cpp`, 
`segment_iterator.cpp`)
      - Load ANN index data from source segments during compaction
      - Build complete ANN index in output segment
   
   3. **Support BUILD INDEX for empty ANN indexes** (`index_builder.cpp`)
      - Handle `INVERTED_INDEX_BYPASS` error for empty index files
      - Allow BUILD INDEX to create ANN index when 
`skip_write_index_on_load=true`
   
   4. **Accurate index existence detection** (`tablet.cpp`, `cloud_tablet.cpp`)
      - Use `IndexFileReader::index_file_exist()` API to check actual index 
presence
      - Fast path optimization: check `index_size` for V2 format, 
`fs->exists()` for V1
   
   5. **Cloud mode support** (`cloud_tablet.cpp`, 
`engine_cloud_index_change_task.cpp`)
      - Apply the same skip-on-load and deferred-build behavior in cloud 
deployment mode
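
The fast path in item 4 can be read roughly as follows. This Python sketch is one plausible interpretation rather than the actual implementation: `IndexMeta`, `StorageFormat`, and `fast_path_index_exists` are hypothetical names, and it assumes that a V2 index skipped at load time records an `index_size` of zero while a V1 index is detected by the presence of its file.

```python
import os
from dataclasses import dataclass
from enum import Enum


class StorageFormat(Enum):
    V1 = 1  # assumed: one index file per index, checked on the file system
    V2 = 2  # assumed: index size recorded in metadata


@dataclass
class IndexMeta:
    storage_format: StorageFormat
    index_size: int = 0   # used for V2 only
    index_path: str = ""  # used for V1 only


def fast_path_index_exists(meta: IndexMeta) -> bool:
    """Cheap existence check that avoids opening the index file."""
    if meta.storage_format is StorageFormat.V2:
        # Metadata-only check: a non-zero size means index data was written.
        return meta.index_size > 0
    # V1: fall back to a file-system existence check.
    return os.path.exists(meta.index_path)
```

A caller would presumably use such a check first and rely on `IndexFileReader::index_file_exist()` for the authoritative answer; how the two are combined is not stated in this description.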
   
   ### Test Plan
   
   Added a regression test (`test_skip_write_index_on_load.groovy`) with four 
test cases:
   - Case 1: `skip_write_index_on_load=true` - verify no ANN index before 
compaction
   - Case 2: `skip_write_index_on_load=true` - verify ANN index exists after 
compaction
   - Case 3: `skip_write_index_on_load=false` - verify ANN index exists 
immediately
   - Case 4: BUILD INDEX successfully builds ANN index when 
`skip_write_index_on_load=true`
   
   ### Check List (For Author)
   
   - Test
       - [x] Regression test
       - [ ] Unit Test
       - [ ] Manual test
       - [ ] No need to test
   
   - Behavior changed:
       - [x] Yes. Added support for `skip_write_index_on_load` property for ANN 
indexes
   
   - Does this need documentation?
       - [x] Yes. Document the `skip_write_index_on_load` property for ANN 
indexes
   
   ### Check List (For Reviewer who merges this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

