JunRuiLee commented on PR #7940:
URL: https://github.com/apache/paimon/pull/7940#issuecomment-4527296470

   Thanks for the detailed review! I've split this PR into 3 parts as suggested:
   
   1. **#7943** — Read-only verification logic (`TableRepair.verify()`)
   2. **#7944** — Fix mode + catalog integration (depends on Part 1)
   3. **#7945** — CLI command (depends on Part 2)
   
   Also addressed the other feedback points:
   - **Progress logging**: Added a `logging.info` call every 1000 data files when `check_data_files=True`, and documented the time complexity as O(total_data_files).
   - **Resume-from-failure**: Added per-table error isolation in `repair_database`: individual table failures are logged and skipped, so re-running after a crash continues from where it left off.
   - **Idempotency**: The only fix operation (`_fix_latest_file`) performs a single atomic write, so re-running after an interruption converges to a valid state. Added a docstring explaining the guarantee.
   - **Test for interrupted mid-fix**: Added `test_repair_is_idempotent`, which runs repair twice and verifies that the second run is a no-op.
   - **Return type annotations**: Added consistent `-> RepairReport` / `-> List[RepairReport]` annotations to catalog methods.
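
For reviewers, the progress-logging and per-table isolation points above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the `RepairReport` fields, the `verify_table` helper, and the `repair_database` signature (a dict of table name to data file paths) are all assumptions made for the example.

```python
import logging
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional

logger = logging.getLogger(__name__)

PROGRESS_INTERVAL = 1000  # log once per 1000 data files


@dataclass
class RepairReport:
    # Hypothetical report shape; the real RepairReport may differ.
    table: str
    files_checked: int = 0
    error: Optional[str] = None


def verify_table(table: str, data_files: Iterable[str]) -> RepairReport:
    """Visit each data file once, so the cost is O(total_data_files)."""
    report = RepairReport(table=table)
    for path in data_files:
        if not path:
            # Stand-in for a real check failing on a corrupt entry.
            raise ValueError(f"empty data file path in {table}")
        report.files_checked += 1
        if report.files_checked % PROGRESS_INTERVAL == 0:
            logger.info("%s: checked %d data files", table, report.files_checked)
    return report


def repair_database(tables: Dict[str, Iterable[str]]) -> List[RepairReport]:
    """Process every table; a failure in one table is logged and skipped."""
    reports: List[RepairReport] = []
    for name, files in tables.items():
        try:
            reports.append(verify_table(name, files))
        except Exception as exc:
            logger.exception("repair failed for table %s; skipping", name)
            reports.append(RepairReport(table=name, error=str(exc)))
    return reports
```

Catching the exception inside the loop is what keeps one broken table from aborting the rest of the database run; the failure is still recorded in that table's report.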
   
   Please merge in order: Part 1 → Part 2 → Part 3. Closing this PR in favor of the split.
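
To make the idempotency guarantee mentioned above concrete, here is a hedged sketch of a single-atomic-write fix on a local filesystem. It is not the PR's `_fix_latest_file`: the function name `fix_latest_file`, the `snapshot-<id>` / `LATEST` directory layout, and the boolean return value are assumptions for illustration, and `os.replace` is only atomic on filesystems that support atomic rename.

```python
import os
import tempfile
from pathlib import Path


def fix_latest_file(snapshot_dir: Path) -> bool:
    """Point LATEST at the highest snapshot id; return True if a write happened."""
    ids = []
    for p in snapshot_dir.glob("snapshot-*"):
        suffix = p.name.split("-", 1)[1]
        if suffix.isdigit():
            ids.append(int(suffix))
    if not ids:
        return False  # nothing to point at

    expected = str(max(ids))
    latest = snapshot_dir / "LATEST"
    if latest.exists() and latest.read_text().strip() == expected:
        return False  # already valid, so a second run is a no-op

    # Write to a temp file in the same directory, then rename over the target.
    # Readers see either the old content or the new content, never a partial file.
    fd, tmp = tempfile.mkstemp(dir=snapshot_dir, prefix=".LATEST.")
    with os.fdopen(fd, "w") as f:
        f.write(expected)
    os.replace(tmp, latest)
    return True
```

Because the only mutation is the final rename, an interruption before it leaves the old file untouched and an interruption after it leaves the fixed file in place; either way, running the function again ends in the same state.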

