wardlican commented on PR #3856:
URL: https://github.com/apache/amoro/pull/3856#issuecomment-3713337642
> > The current implementation is based on the implementation of
SizeBasedFileRewritePlanner; please help review it. @xxubai
>
> Can you test this implementation in your environment? It is really helpful
for validating. Otherwise we should add a unit test to cover this case.
Unit tests have been added for the targetSize() method, covering the
following scenarios:
- **testTargetSizeWithSmallInputFiles** - Merging multiple small files
**Scenario**: 5 files of 10MB each (Total 50MB < 128MB).
**Expected**: Returns inputSize (50MB), merging them into a single
file.
- **testTargetSizeWithSingleSmallFile** - Single small file
**Scenario**: 1 file of 10MB.
**Expected**: Returns targetSize (128MB).
- **testTargetSizeWithExactTargetSize** - Input size equals targetSize
**Scenario**: Input size is exactly 128MB.
**Expected**: Returns ≥targetSize.
- **testTargetSizeWithLargeRemainder** - Large remainder case
**Scenario**: 225MB input, remainder is 97MB (which is > 96MB
minFileSize).
**Expected**: Rounds up to calculate an appropriate split size.
- **testTargetSizeWithSmallRemainderDistributed** - Distributable small
remainder
**Scenario**: 256MB input (exactly 2x targetSize, no remainder).
**Expected**: Rounds down and distributes the size evenly.
- **testTargetSizeWithMultipleSmallFiles** - Multiple small files exceeding
targetSize
**Scenario**: 10 files of 20MB each (Total 200MB > 128MB).
**Expected**: Calculates an appropriate split size.
- **testTargetSizeWithVeryLargeInput** - Very large input
**Scenario**: 500MB input.
**Expected**: Capped at maxFileSize (192MB, calculated as
targetSize×1.5).
- **testTargetSizeWithCustomMinTargetSizeRatio** - Custom ratio
**Scenario**: Custom min-target-size-ratio set to 0.8.
**Expected**: Performs calculations based on the custom ratio.
- **testTargetSizeBoundaryConditions** - Boundary conditions
**Scenario**: Input size is near boundary values.
**Expected**: Correctly handles edge cases and boundary conditions.
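
For reviewers who want to check the numbers in these scenarios, here is a minimal sketch of the threshold and remainder arithmetic they rely on. It is not the code from this PR: the class and method names (`TargetSizeSketch`, `minFileSize`, `maxFileSize`, `outputFileCount`) are hypothetical, the 0.75 default ratio is inferred from the 96MB minFileSize quoted above, and the sketch covers only the round-up/round-down decision, not the capping at maxFileSize.

```java
// Hypothetical illustration only; names and structure are assumptions, not the PR's implementation.
class TargetSizeSketch {
    static final long MB = 1024L * 1024L;

    // minFileSize = targetSize * min-target-size-ratio (128MB * 0.75 = 96MB).
    static long minFileSize(long targetSize, double minTargetSizeRatio) {
        return (long) (targetSize * minTargetSizeRatio);
    }

    // maxFileSize = targetSize * 1.5 (128MB * 1.5 = 192MB).
    static long maxFileSize(long targetSize) {
        return (long) (targetSize * 1.5);
    }

    // Assumed rule: inputs below targetSize go into one file; otherwise round the
    // file count up when the remainder exceeds minFileSize, and round down when it does not.
    static long outputFileCount(long inputSize, long targetSize, long minFileSize) {
        if (inputSize < targetSize) {
            return 1;
        }
        long wholeFiles = inputSize / targetSize;
        long remainder = inputSize % targetSize;
        return remainder > minFileSize ? wholeFiles + 1 : wholeFiles;
    }

    public static void main(String[] args) {
        long target = 128 * MB;
        long min = minFileSize(target, 0.75);

        System.out.println(min / MB);                               // 96
        System.out.println(maxFileSize(target) / MB);               // 192
        System.out.println(outputFileCount(50 * MB, target, min));  // 1 (small files merged)
        System.out.println(outputFileCount(225 * MB, target, min)); // 2 (97MB remainder > 96MB)
        System.out.println(outputFileCount(256 * MB, target, min)); // 2 (no remainder)
    }
}
```

Under this assumed rule, raising the ratio to 0.8 moves the threshold to about 102.4MB, so the 97MB remainder of the 225MB case would no longer trigger a round-up.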