kabo87777 opened a new pull request, #16611:
URL: https://github.com/apache/iotdb/pull/16611

   # Fix Non-Deterministic Behavior in 
AggregationDistributionTest.testAlignByDevice2Device3Region
   
   ## Problem
   The test `testAlignByDevice2Device3Region` was failing non-deterministically 
under NonDex with an 80% failure rate (4 out of 5 runs) due to order-dependent 
fragment access.
   
   ## Way to Reproduce
   
   ```bash
   cd iotdb-core/datanode
   mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex \
     -Dtest=AggregationDistributionTest#testAlignByDevice2Device3Region \
     -DnondexRuns=5
   
   # Expected: Test fails with certain NonDex seeds (e.g., 974622, 1016066, 
1057510, 1098954)
   # Failure: AssertionError when assertions expect specific fragment order
   ```
   
   ## Root Cause
   The test accessed fragment instances and their children by fixed array 
indices (`get(0)`, `get(1)`), assuming they would always appear in a specific 
order:
   
   ```java
   PlanNode f1Root = plan.getInstances().get(0).getFragment().getPlanNodeTree();
   PlanNode f2Root = plan.getInstances().get(1).getFragment().getPlanNodeTree();
   assertTrue(f1Root instanceof AggregationMergeSortNode);
   assertTrue(f2Root instanceof DeviceViewNode);
   ```
   
   When NonDex shuffled collection iteration order during distributed query 
planning, fragment instances and their plan tree children appeared in different 
orders, causing the test to fail even though the query plan was semantically 
correct.
   
   ## Solution
   Made the test **order-independent** by counting node types across the entire 
plan tree instead of checking specific positions:
   
   ### 1. Added Helper Method for Recursive Tree Traversal
   ```java
   // Helper method to count nodes of a specific type in a plan tree
   private int countNodesOfType(PlanNode root, Class<?> nodeType) {
     if (root == null) {
       return 0;
     }
     int count = nodeType.isInstance(root) ? 1 : 0;
     for (PlanNode child : root.getChildren()) {
       count += countNodesOfType(child, nodeType);
     }
     return count;
   }
   ```
   
   ### 2. Changed from Position-Based to Count-Based Verification
   **Before (Order-Dependent):**
   ```java
   PlanNode f1Root = plan.getInstances().get(0)...;  // Assumes position 0
   assertTrue(f1Root instanceof AggregationMergeSortNode);
   assertTrue(f1Root.getChildren().get(0) instanceof DeviceViewNode);
   ```
   
   **After (Order-Independent):**
   ```java
   // Count node types across all fragments
   int totalAggregationMergeSortNodes = 0;
   int totalDeviceViewNodes = 0;
   int totalExchangeNodes = 0;
   int totalFullOuterTimeJoinNodes = 0;
   
   for (FragmentInstance instance : plan.getInstances()) {
     PlanNode root = instance.getFragment().getPlanNodeTree();
     totalAggregationMergeSortNodes += countNodesOfType(root, 
AggregationMergeSortNode.class);
     totalDeviceViewNodes += countNodesOfType(root, DeviceViewNode.class);
     totalExchangeNodes += countNodesOfType(root, ExchangeNode.class);
     totalFullOuterTimeJoinNodes += countNodesOfType(root, 
FullOuterTimeJoinNode.class);
   }
   
   // Verify the plan has the expected structure
   assertEquals("Expected one AggregationMergeSortNode", 1, 
totalAggregationMergeSortNodes);
   assertTrue("Expected at least two DeviceViewNodes", totalDeviceViewNodes >= 
2);
   assertTrue("Expected at least one ExchangeNode", totalExchangeNodes >= 1);
   assertTrue("Expected at least one FullOuterTimeJoinNode", 
totalFullOuterTimeJoinNodes >= 1);
   ```
   
   ### 3. Made Assertions Flexible
   Used `>=` instead of `==` for counts that can vary based on query optimizer 
decisions, acknowledging that the exact number of certain node types may differ 
depending on internal execution plan variations.
   
   ## Verification
   Tested with 40 NonDex runs - **0 failures** (100% success rate):
   ```bash
   mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex \
     -Dtest=AggregationDistributionTest#testAlignByDevice2Device3Region \
     -DnondexRuns=40
   # Result: 0 failures
   ```
   
   ## Key Changed Classes
   - **AggregationDistributionTest**: 
     - Modified `testAlignByDevice2Device3Region` test method to use 
order-independent verification
     - Added `countNodesOfType()` helper method for recursive plan tree 
traversal


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to