lintingbin opened a new pull request, #4058:
URL: https://github.com/apache/amoro/pull/4058

   ## Summary
   
   This PR forces Avro-format files to be rewritten during table optimization.
   
   Fixes #4057
   
   ## Changes
   
   This PR includes the following changes:
   
   1. **CommonPartitionEvaluator.java**
      - Added `hasAvroFile` flag to track Avro file presence
      - Updated `addFile()` to detect and flag Avro files
      - Modified `fileShouldFullOptimizing()` to always rewrite Avro files
      - Updated `fileShouldRewrite()` to prioritize Avro files for rewriting
      - Enhanced `isNecessary()` to consider Avro files as a trigger for optimization
   
   2. **IcebergPartitionPlan.java**
      - Updated task validation logic so that a task containing a single Avro file is no longer skipped
   
   3. **ContentFiles.java**
      - Added `isAvroFile()` utility method to identify Avro format files
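
For reviewers, the decision logic described in the list above can be sketched roughly as follows. This is a simplified, self-contained illustration, not the actual diff: `AvroRewriteSketch`, `Format`, `FileInfo`, `Evaluator`, and `shouldSkipTask` are hypothetical stand-ins for the Iceberg and Amoro types, and the rules for non-Avro files (a small-file threshold and a two-file minimum) are assumptions chosen only so the Avro override has something to override.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins: the real change works on Iceberg/Amoro types, not these.
class AvroRewriteSketch {

  enum Format { AVRO, PARQUET, ORC }

  /** Minimal file descriptor used only for this illustration. */
  static final class FileInfo {
    final String path;
    final Format format;
    final long sizeBytes;

    FileInfo(String path, Format format, long sizeBytes) {
      this.path = path;
      this.format = format;
      this.sizeBytes = sizeBytes;
    }
  }

  /** Analogue of the isAvroFile() helper: true when the file is stored as Avro. */
  static boolean isAvroFile(FileInfo file) {
    return file.format == Format.AVRO;
  }

  /** Analogue of the per-partition evaluator. */
  static final class Evaluator {
    private final long smallFileThresholdBytes; // assumed rule for non-Avro files
    private boolean hasAvroFile = false;
    private int smallFileCount = 0;

    Evaluator(long smallFileThresholdBytes) {
      this.smallFileThresholdBytes = smallFileThresholdBytes;
    }

    /** Records a file and flags the partition if the file is Avro. */
    void addFile(FileInfo file) {
      if (isAvroFile(file)) {
        hasAvroFile = true;
      }
      if (file.sizeBytes < smallFileThresholdBytes) {
        smallFileCount++;
      }
    }

    /** Avro files are always rewritten; other files only when small. */
    boolean fileShouldRewrite(FileInfo file) {
      return isAvroFile(file) || file.sizeBytes < smallFileThresholdBytes;
    }

    /** Any Avro file triggers optimization, regardless of the small-file count. */
    boolean isNecessary() {
      return hasAvroFile || smallFileCount >= 2;
    }
  }

  /** Analogue of task validation: a single-file task is skipped unless that file is Avro. */
  static boolean shouldSkipTask(List<FileInfo> taskFiles) {
    if (taskFiles.isEmpty()) {
      return true;
    }
    return taskFiles.size() == 1 && !isAvroFile(taskFiles.get(0));
  }

  public static void main(String[] args) {
    long threshold = 128L * 1024 * 1024;
    FileInfo parquet = new FileInfo("data-1.parquet", Format.PARQUET, 2 * threshold);
    FileInfo avro = new FileInfo("data-2.avro", Format.AVRO, 2 * threshold);

    Evaluator evaluator = new Evaluator(threshold);
    evaluator.addFile(parquet);
    System.out.println("necessary with one large Parquet file: " + evaluator.isNecessary());
    evaluator.addFile(avro);
    System.out.println("necessary after adding an Avro file: " + evaluator.isNecessary());
    System.out.println("skip single-Avro task: " + shouldSkipTask(Arrays.asList(avro)));
  }
}
```

The point of the sketch is that the Avro check short-circuits every other condition: a large Avro file that the size rule would leave alone is still selected for rewriting, and a task holding only that file is still planned.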
   
   ## Motivation
   
   Avro is a row-oriented format, whereas Parquet and ORC are columnar. To maintain table performance and consistency, Avro files should always be rewritten to the preferred format during optimization, regardless of other optimization conditions.
   
   ## Testing
   
   - Verified that Avro files are correctly identified
   - Confirmed that optimization is triggered when Avro files are present
   - Tested that Avro files are always included in rewrite operations
   
   ## Checklist
   
   - [x] Code changes are complete
   - [x] Changes maintain backward compatibility
   - [x] Code follows project conventions

