leaves12138 opened a new pull request, #7938:
URL: https://github.com/apache/paimon/pull/7938

   ### Purpose
   
   This PR prevents Data Evolution blob compaction from combining blob files 
that belong to different regular data-file row-id ranges.
   
   ### Root Cause
   
   The compact planner grouped all blob files from a data compaction group 
before planning blob compact tasks. When blob files from multiple regular data 
files were compacted together, the compacted blob file could cover a row-id 
range spanning several data files. Conflict detection groups files by 
overlapping row-id range, so such a blob file pulled those data files into the 
same group. Because blob files are filtered out of the error message, the 
failure surfaced as multiple regular data files with different row-id ranges 
conflicting during COMPACT.
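
To illustrate the failure mode, here is a minimal, simplified model of grouping files by overlapping row-id range. It is not Paimon's conflict-detection code: `RowIdOverlapSketch`, `FileRange`, and `groupByOverlap` are hypothetical names, and the row-id values are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RowIdOverlapSketch {

    /** Hypothetical file descriptor: a name plus an inclusive row-id range. */
    public static final class FileRange {
        public final String name;
        public final long first;
        public final long last;

        public FileRange(String name, long first, long last) {
            this.name = name;
            this.first = first;
            this.last = last;
        }
    }

    /** Sorts files by first row id and merges files with overlapping ranges into one group. */
    public static List<List<String>> groupByOverlap(List<FileRange> files) {
        List<FileRange> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparingLong((FileRange f) -> f.first));

        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentEnd = Long.MIN_VALUE;
        for (FileRange f : sorted) {
            // A file starting past the current group's end opens a new group.
            if (!current.isEmpty() && f.first > currentEnd) {
                groups.add(current);
                current = new ArrayList<>();
            }
            current.add(f.name);
            currentEnd = Math.max(currentEnd, f.last);
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        FileRange data0 = new FileRange("data-0", 0, 99);
        FileRange data1 = new FileRange("data-1", 100, 199);

        // Blob files compacted per data file: ranges stay aligned with the data files.
        System.out.println(groupByOverlap(Arrays.asList(
                data0, data1,
                new FileRange("blob-0", 0, 99),
                new FileRange("blob-1", 100, 199))));

        // One compacted blob file spanning both data files bridges their ranges.
        System.out.println(groupByOverlap(Arrays.asList(
                data0, data1,
                new FileRange("blob-merged", 0, 199))));
    }
}
```

In this model the first call yields two groups, one per data file, while the second yields a single group containing both data files. That matches the symptom described above: two regular data files with different row-id ranges appear to conflict.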
   
   ### Changes
   
   - Plan blob compaction separately for each containing data file.
   - Update planner tests to expect independent blob compact tasks per 
data-file range.
   - Add a regression test ensuring that, when each data file has a single 
blob file, those blob files are not compacted together across data-file ranges.
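
The first change can be pictured roughly as follows. This is a sketch under stated assumptions, not the planner code from this PR: `BlobCompactPlanSketch`, `FileRange`, and `groupBlobsByDataFile` are hypothetical names, and it assumes each blob file's row-id range lies inside exactly one data file's range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BlobCompactPlanSketch {

    /** Hypothetical file descriptor: a name plus an inclusive row-id range. */
    public static final class FileRange {
        public final String name;
        public final long first;
        public final long last;

        public FileRange(String name, long first, long last) {
            this.name = name;
            this.first = first;
            this.last = last;
        }

        /** True if the other file's row-id range lies entirely inside this one. */
        public boolean contains(FileRange other) {
            return first <= other.first && other.last <= last;
        }
    }

    /** Buckets blob files by the data file that contains them; each bucket is planned on its own. */
    public static Map<String, List<String>> groupBlobsByDataFile(
            List<FileRange> dataFiles, List<FileRange> blobFiles) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (FileRange data : dataFiles) {
            List<String> contained = new ArrayList<>();
            for (FileRange blob : blobFiles) {
                if (data.contains(blob)) {
                    contained.add(blob.name);
                }
            }
            groups.put(data.name, contained);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<FileRange> dataFiles = Arrays.asList(
                new FileRange("data-0", 0, 99),
                new FileRange("data-1", 100, 199));
        List<FileRange> blobFiles = Arrays.asList(
                new FileRange("blob-0a", 0, 49),
                new FileRange("blob-0b", 50, 99),
                new FileRange("blob-1a", 100, 199));

        // Each entry is a candidate blob compaction scoped to a single data file.
        System.out.println(groupBlobsByDataFile(dataFiles, blobFiles));
    }
}
```

With this grouping, `blob-0a` and `blob-0b` may be compacted together, but `blob-1a` is never combined with them, so a compacted blob file cannot span two data-file ranges.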
   
   ### Tests
   
   - `JAVA_HOME=/opt/zulu8.68.0.21-ca-jdk8.0.362-macosx_aarch64 mvn -pl 
paimon-core spotless:apply`
   - `JAVA_HOME=/opt/zulu8.68.0.21-ca-jdk8.0.362-macosx_aarch64 mvn -pl 
paimon-core -Dtest=DataEvolutionCompactCoordinatorTest test`
   

