kabo87777 opened a new pull request, #16884:
URL: https://github.com/apache/iotdb/pull/16884

   # Fix Non-Deterministic Behavior in NewReadChunkCompactionPerformerWithAlignedSeriesTest
   
   ## Problem
   
   Thirteen tests in `NewReadChunkCompactionPerformerWithAlignedSeriesTest` 
were failing non-deterministically under NonDex due to order-dependent 
measurement schema comparisons:
   
   - `testCompactionWithAllDeletion`
   - `testCompactionWithDeletion`
   - `testCompactionWithDeletionAndEmptyPage`
   - `testCompactionWithPartialDeletion`
   - `testCompactionWithPartialDeletionAndEmptyPage`
   - `testSimpleCompaction`
   - `testSimpleCompactionByFlushChunk`
   - `testSimpleCompactionWithEmptyChunk`
   - `testSimpleCompactionWithEmptyPage`
   - `testSimpleCompactionWithNullValue`
   - `testSimpleCompactionWithNullValueAndEmptyChunk`
   - `testSimpleCompactionWithNullValueAndEmptyPage`
   - `testSimpleCompactionWithSomeDeviceNotInTargetFile`
   
   ## Way to Reproduce
   
   ```bash
   cd iotdb-core/datanode
   mvn edu.illinois:nondex-maven-plugin:2.1.7:nondex \
     -Dtest=NewReadChunkCompactionPerformerWithAlignedSeriesTest \
     -DnondexRuns=3 -Drat.skip=true
   # Result: Multiple failures due to non-deterministic measurement order
   ```
   
   ## Root Cause
   
   The `CompactionCheckerUtils.getAllPathsOfResources()` method used `HashSet` 
for storing paths and did not sort measurement schemas before constructing 
`AlignedFullPath` objects:
   
   ```java
   Set<IFullPath> paths = new HashSet<>();  // Non-deterministic iteration order
   // ...
   List<IMeasurementSchema> measurementSchemas = new ArrayList<>(schemaMap.values());  // Unsorted
   // ...
   seriesPath = new AlignedFullPath(deviceID, existedMeasurements, measurementSchemas);
   ```
   
   The `schemaMap` is backed by a `ConcurrentHashMap`, whose iteration order is 
unspecified. When NonDex shuffled that order, the same measurements appeared 
in different positions within `AlignedFullPath`, so the position-sensitive data 
comparison assertions failed even though the compaction results were 
semantically correct.
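   
   The failure mode can be reproduced in isolation. The following is a minimal 
sketch, not code from this PR: plain strings stand in for measurement names, and 
`OrderSensitivityDemo` and `sortedCopy` are hypothetical names chosen for 
illustration. It shows that two lists holding the same elements compare as 
unequal when their order differs, and equal once both are sorted.
   
   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Collections;
   import java.util.List;

   public class OrderSensitivityDemo {
     // Hypothetical helper: returns a sorted copy and leaves the input untouched.
     static List<String> sortedCopy(List<String> names) {
       List<String> copy = new ArrayList<>(names);
       Collections.sort(copy);
       return copy;
     }

     public static void main(String[] args) {
       // Same measurements, as two different iteration orders might yield them.
       List<String> runA = Arrays.asList("s1", "s2", "s3");
       List<String> runB = Arrays.asList("s3", "s1", "s2");

       // List.equals is position-sensitive, so the raw lists differ.
       System.out.println(runA.equals(runB)); // false
       // After sorting, both runs produce the same sequence.
       System.out.println(sortedCopy(runA).equals(sortedCopy(runB))); // true
     }
   }
   ```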
   
   ## Solution
   
   Made measurement ordering **deterministic** by sorting schemas and using 
ordered collections.
   
   ### Changes Made
   
   **Before (Order-Dependent):**
   ```java
   Set<IFullPath> paths = new HashSet<>();
   // ...
   List<IMeasurementSchema> measurementSchemas = new ArrayList<>(schemaMap.values());
   ```
   
   **After (Deterministic Order):**
   ```java
   Set<IFullPath> paths = new LinkedHashSet<>();
   // ...
   List<IMeasurementSchema> measurementSchemas =
       schemaMap.values().stream()
           .sorted(Comparator.comparing(IMeasurementSchema::getMeasurementName))
           .collect(Collectors.toList());
   ```
   
   ### Key Improvements
   
   - Changed `HashSet` to `LinkedHashSet` to preserve insertion order for 
deterministic iteration
   - Added sorting of measurement schemas by measurement name using 
`Comparator.comparing()`
   - Ensured `AlignedFullPath` objects always list measurements in the same 
order (lexicographic by measurement name)
   - Added `java.util.Comparator` import
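   
   The sorting step can be tried outside IoTDB. The following is a hedged sketch 
rather than the actual utility code: the nested `Schema` class is a hypothetical 
stand-in for `IMeasurementSchema` that exposes only `getMeasurementName()`, and 
`SortedSchemaDemo` and `sortedNames` are illustrative names. It applies the same 
`Comparator.comparing(...)` pattern to the values of a `ConcurrentHashMap`.
   
   ```java
   import java.util.Comparator;
   import java.util.List;
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.stream.Collectors;

   public class SortedSchemaDemo {
     // Hypothetical stand-in for IMeasurementSchema; only the name matters here.
     static final class Schema {
       private final String name;

       Schema(String name) {
         this.name = name;
       }

       String getMeasurementName() {
         return name;
       }
     }

     // Sorts the map's values by measurement name, then extracts the names.
     static List<String> sortedNames(Map<String, Schema> schemaMap) {
       return schemaMap.values().stream()
           .sorted(Comparator.comparing(Schema::getMeasurementName))
           .map(Schema::getMeasurementName)
           .collect(Collectors.toList());
     }

     public static void main(String[] args) {
       Map<String, Schema> schemaMap = new ConcurrentHashMap<>();
       for (String n : new String[] {"s3", "s1", "s2"}) {
         schemaMap.put(n, new Schema(n));
       }
       // The result no longer depends on the map's iteration order.
       System.out.println(sortedNames(schemaMap)); // [s1, s2, s3]
     }
   }
   ```
   
   Sorting at the point where the list is built keeps the comparison logic 
unchanged; only the input order is normalized.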
   
   ## Verification
   
   ```bash
   mvn edu.illinois:nondex-maven-plugin:2.1.7:nondex \
     -Dtest=NewReadChunkCompactionPerformerWithAlignedSeriesTest \
     -DnondexRuns=3 -Drat.skip=true
   # Result: All 15 tests pass across 3 different random seeds (933178, 974622, 1016066)
   ```
   
   ## Key Changed Classes
   
   - **CompactionCheckerUtils**: Modified the `getAllPathsOfResources()` method 
(~10 lines); the change is confined to this test utility
   
   

