chaijunjie0101 commented on PR #2148: URL: https://github.com/apache/phoenix/pull/2148#issuecomment-2891312942
> Yes, the fix is needed to avoid parallel processing, which can cause out-of-order upserts. The test case still needs to be improved, though. First of all, the batch upsert is not the correct way, as I mentioned above. Also, to reproduce the scenario more deterministically, we need more rows (unless there is a clever trick to make it 100% with fewer rows).
>
> Without the fix, the 8 salt buckets can create 8-way parallel processing, but since only 4 rows are being inserted, it can achieve a maximum of 4-way parallelism. We can increase the number of salt buckets and the number of rows (at least twice the bucket count) to ensure one row in each salt bucket. You can also reduce the number of unique values for the 2nd column to just 1 to increase collisions and the chance of hitting the bug.
>
> The data can be generated programmatically very easily: if c2 is fixed to a single value, just increment c1 from 1 to n with a small time gap between the rows, managed via EnvironmentEdgeManager, and add a reasonable sleep at the end (so that the wall clock catches up by the time the select query is run).

--------------------------------------------------------------------------------

@sanjeet006py @haridsv Thanks for reviewing. I agree that the UT code needs to change. As you said, we need more data and more parallelism to reproduce the issue. I now keep the same c2 value for T1, have increased the salt bucket count to 16, and write more than 512 records. I ran the UT 3 times with the patch reverted and it failed every time. Please help review again, thank you.
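
For readers following the thread, here is a minimal sketch of the data-generation idea described in the review: c2 fixed to one value, c1 incremented from 1 to n, and a small fixed gap between row timestamps. It is not the test code from the PR. The class and method names are hypothetical, the timestamps are plain numbers rather than values injected through EnvironmentEdgeManager, and `bucketOf` is a stand-in modulo hash, not Phoenix's real salting function.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Phoenix or of the PR under review.
class SaltedUpsertPlan {

    /** One planned upsert plus the time the test clock should report for it. */
    static final class PlannedRow {
        final int c1;
        final String c2;
        final long timestampMs;

        PlannedRow(int c1, String c2, long timestampMs) {
            this.c1 = c1;
            this.c2 = c2;
            this.timestampMs = timestampMs;
        }
    }

    /** Builds n rows: c1 runs from 1 to n, c2 is fixed, timestamps are gapMs apart. */
    static List<PlannedRow> planRows(int n, String fixedC2, long startMs, long gapMs) {
        List<PlannedRow> rows = new ArrayList<>(n);
        for (int i = 1; i <= n; i++) {
            rows.add(new PlannedRow(i, fixedC2, startMs + (i - 1) * gapMs));
        }
        return rows;
    }

    /** Stand-in bucket assignment (assumption): plain modulo on c1, NOT Phoenix's salt hash. */
    static int bucketOf(int c1, int saltBuckets) {
        return Math.floorMod(c1, saltBuckets);
    }

    /** Counts how many distinct buckets the planned rows land in under the stand-in hash. */
    static int distinctBuckets(List<PlannedRow> rows, int saltBuckets) {
        Set<Integer> seen = new HashSet<>();
        for (PlannedRow row : rows) {
            seen.add(bucketOf(row.c1, saltBuckets));
        }
        return seen.size();
    }

    /** Upper bound on parallelism: one worker per bucket, and no more workers than rows. */
    static int maxParallelism(int rowCount, int saltBuckets) {
        return Math.min(rowCount, saltBuckets);
    }

    public static void main(String[] args) {
        // The original test shape from the review: 4 rows, 8 salt buckets.
        System.out.println("4 rows, 8 buckets: bound = " + maxParallelism(4, 8));

        // A larger plan: 16 buckets and twice as many rows as buckets.
        List<PlannedRow> rows = planRows(32, "fixed", 1_000L, 5L);
        System.out.println("32 rows, 16 buckets: bound = " + maxParallelism(rows.size(), 16)
                + ", buckets touched = " + distinctBuckets(rows, 16));
    }
}
```

With a real salt hash the rows will not spread this evenly, which is one reason to write well over twice the bucket count, as the reply above does with 16 buckets and more than 512 records.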