felixhzhu opened a new pull request, #4259:
URL: https://github.com/apache/amoro/pull/4259

   ## What changes were proposed in this pull request?
   
   Fix `expireSnapshots` task failing with `IllegalArgumentException: null` when the
   underlying FileIO is an object store (S3FileIO / OSSFileIO / GCSFileIO / ResolvingFileIO).
   
   ### Root Cause
   
   `RollingFileCleaner.doCleanFiles()` unconditionally calls `TableFileUtil.deleteEmptyDirectory()`,
   which calls `io.asFileSystemIO()`. For object-store FileIO implementations,
   `supportFileSystemOperations()` returns `false`, causing `Preconditions.checkArgument()`
   to throw `IllegalArgumentException: null`.
   
   "Delete empty parent directory" is only meaningful under HDFS/Hadoop FS semantics;
   object stores have no real directory concept.
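
To make the failure mode concrete, here is a minimal sketch of the call chain described above. It does not use Amoro's real classes: `FileIO`, `asFileSystemIO` and `checkArgument` below are simplified stand-ins, and `describeFailure` is a hypothetical helper. It assumes the precondition is checked without a message, which would explain the `null` in the logged text.

```java
// Stand-ins for illustration only; these are not Amoro's real classes.
public class RootCauseSketch {

    /** Hypothetical, simplified view of the FileIO abstraction. */
    public interface FileIO {
        boolean supportFileSystemOperations();
    }

    /** Mimics a precondition check that is called without an error message. */
    static void checkArgument(boolean expression) {
        if (!expression) {
            // No message is supplied, so getMessage() returns null.
            throw new IllegalArgumentException();
        }
    }

    /** Assumed shape of asFileSystemIO(): valid only for file-system-like IO. */
    static FileIO asFileSystemIO(FileIO io) {
        checkArgument(io.supportFileSystemOperations());
        return io;
    }

    /** Returns the text a log line would carry for the failure, or "ok". */
    public static String describeFailure(FileIO io) {
        try {
            asFileSystemIO(io);
            return "ok";
        } catch (IllegalArgumentException e) {
            // String concatenation renders the missing message as "null".
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }
}
```

With an object-store stand-in (`() -> false`) the helper yields `IllegalArgumentException: null`; with a file-system stand-in (`() -> true`) it yields `ok`.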
   
   ### Fix
   
   1. **`TableFileUtil.deleteEmptyDirectory()`**: Add an early return when `!io.supportFileSystemOperations()`.
   2. **`RollingFileCleaner.doCleanFiles()`**:
      - Guard the parent directory cleanup loop with a `supportFileSystemOperations()` check.
      - Catch per-directory exceptions and log them at WARN so that one failure does not interrupt the entire batch.
      - Move `parentDirectories.clear()` and `collectedFiles.clear()` into a `finally` block.
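
The three changes to `doCleanFiles()` can be sketched roughly as follows. This is an illustration under assumptions, not the actual patch: `FileIO` is a simplified stand-in, `cleanParentDirectories` is a hypothetical method name, the directory deletion is passed in as a callback, and warnings are collected in a list instead of being written to a logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Stand-ins for illustration only; these are not Amoro's real classes.
public class CleanupGuardSketch {

    /** Hypothetical, simplified view of the FileIO abstraction. */
    public interface FileIO {
        boolean supportFileSystemOperations();
    }

    /**
     * Cleans empty parent directories only when the IO has file-system
     * semantics, and always resets the per-batch state afterwards.
     * Returns the warnings that a real implementation would log at WARN.
     */
    public static List<String> cleanParentDirectories(
            FileIO io,
            Set<String> parentDirectories,
            Set<String> collectedFiles,
            Consumer<String> deleteEmptyDirectory) {
        List<String> warnings = new ArrayList<>();
        try {
            // Guard: object stores have no directories to clean up.
            if (io.supportFileSystemOperations()) {
                for (String dir : parentDirectories) {
                    try {
                        deleteEmptyDirectory.accept(dir);
                    } catch (RuntimeException e) {
                        // One bad directory must not abort the rest of the batch.
                        warnings.add("Failed to clean " + dir + ": " + e.getMessage());
                    }
                }
            }
        } finally {
            // Reset state even if something unexpected is thrown above.
            parentDirectories.clear();
            collectedFiles.clear();
        }
        return warnings;
    }
}
```

Clearing both collections in `finally` means a failed batch cannot leak stale paths into the next one, which is the point of the third bullet.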
   
   No schema / config / API signature changes. HDFS tables are completely unaffected.
   
   ## How was this patch tested?
   
   - Verified in a production environment with S3FileIO (Tencent Cloud COS)
   - Confirmed HDFS-backed tables still clean empty directories correctly
   - No regression in existing expireSnapshots flows
   
   fix #4237
   