Re: [PR] Improve InventoryDumper query range after job restarting [shardingsphere]

via GitHub Tue, 11 Nov 2025 19:34:23 -0800


sandynz commented on PR #36878:
URL: https://github.com/apache/shardingsphere/pull/36878#issuecomment-3519740305


   `InventoryDumperContext.firstDump` might cause a bug.
   
   ### Test case
   
   #### Table and configuration:
   - `tableA` with 300 million records and an integer `AUTO_INCREMENT` primary key `id`. With the default `SHARDING_SIZE` of 10 million, there could be 30 shards.
   - Use default `WORKER_THREAD` 20.
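
The shard arithmetic behind this setup can be checked with a small sketch. The class and method names below are hypothetical and only illustrate the numbers in this comment; they are not ShardingSphere code, and the sketch assumes the ids are dense enough that each shard covers roughly `SHARDING_SIZE` records.

```java
public class ShardCountSketch {

    // Hypothetical helper: number of shards needed to cover all records (ceiling division).
    static long shardCount(long totalRecords, long shardingSize) {
        return (totalRecords + shardingSize - 1) / shardingSize;
    }

    // Hypothetical helper: shard tasks that must wait because every worker thread is busy.
    static long queuedShards(long shardCount, int workerThreads) {
        return Math.max(0, shardCount - workerThreads);
    }

    public static void main(String[] args) {
        long shards = shardCount(300_000_000L, 10_000_000L);
        long queued = queuedShards(shards, 20);
        System.out.println("shards=" + shards + ", queued=" + queued);
    }
}
```

With the values from this test case, 30 shards are created and 10 of them wait in the queue, which is the situation the restart has to hit.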
   
   #### Steps to reproduce
   1. Start the migration job. (There will be 20 threads migrating concurrently, and 10 shard tasks are queued.)
   2. After the job has run for a very short time, run `show migration status {jobId};` and make sure `processed_records_count` is greater than 0 and `inventory_finished_percentage` is less than 10%.
   3. Restart the migration job.
   
   #### Root cause analysis
   When the job is stopped while some shard tasks are still queued, `InventoryDumperContext.firstDump` returns `false` after the job is started again, because it is checked at the job item level, not per shard task. When the queued shard tasks then run, the query SQL condition uses `>`, so each such shard task skips one record. `>=` should be used for the first dump of every shard task.
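
A minimal sketch of the difference, assuming each shard task covers an inclusive id range `[lower, upper]`. All names here (`buildCondition`, `readIds`, `firstDumpOfShard`) are hypothetical and are not the actual `InventoryDumper` code; the sketch only illustrates why the first query of a shard needs `>=`.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardFirstDumpSketch {

    // Hypothetical: builds the range condition for one query of a shard task.
    // The first query of a shard must include the lower bound itself.
    static String buildCondition(boolean firstDumpOfShard, long lower, long upper) {
        String operator = firstDumpOfShard ? ">=" : ">";
        return "id " + operator + " " + lower + " AND id <= " + upper;
    }

    // Hypothetical: simulates which ids the condition above would select,
    // assuming every id in [lower, upper] exists in the table.
    static List<Long> readIds(boolean firstDumpOfShard, long lower, long upper) {
        List<Long> result = new ArrayList<>();
        long start = firstDumpOfShard ? lower : lower + 1;
        for (long id = start; id <= upper; id++) {
            result.add(id);
        }
        return result;
    }

    public static void main(String[] args) {
        // A queued shard that has never been dumped, covering ids 11..20.
        // Job-item-level flag is false after restart, so ">" is used and id 11 is lost.
        System.out.println(buildCondition(false, 11, 20) + " -> " + readIds(false, 11, 20).size() + " rows");
        // Per-shard flag: the shard's first dump uses ">=" and reads all 10 rows.
        System.out.println(buildCondition(true, 11, 20) + " -> " + readIds(true, 11, 20).size() + " rows");
    }
}
```

Tracking the first-dump state per shard task, rather than per job item, is one way to keep queued shards from dropping their first record after a restart.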
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
