9aman opened a new pull request, #15155:
URL: https://github.com/apache/pinot/pull/15155
## Issue
GcsPinotFS returns directory names with a trailing "/", unlike S3PinotFS
and LocalPinotFS. This breaks segment deletion in the retention
manager, which expects sanitized paths from the listFiles() FS calls.
## Scope of the PR / Quick Resolution
Remove the trailing delimiter, if present, before constructing the file path
used to delete segments.
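
The failure mode can be illustrated with a minimal sketch. This is not the code in this PR: the class name, the helper names, and the `gs://bucket/...` path are hypothetical, and the sketch assumes the table name is derived as the text after the last "/" in the listed path, which is consistent with the empty `tableName` in the logs below.

```java
// Hypothetical illustration only; not the actual SegmentDeletionManager code.
final class TrailingDelimiterSketch {

  // Drops every trailing '/' from a listed path (assumed helper, not a Pinot API).
  static String stripTrailingDelimiter(String path) {
    int end = path.length();
    while (end > 0 && path.charAt(end - 1) == '/') {
      end--;
    }
    return path.substring(0, end);
  }

  // Naive extraction: text after the last '/'. Empty when the path ends with '/'.
  static String naiveLastSegment(String path) {
    return path.substring(path.lastIndexOf('/') + 1);
  }

  // Sanitized extraction: strip the trailing delimiter first, then take the last segment.
  static String lastSegment(String path) {
    return naiveLastSegment(stripTrailingDelimiter(path));
  }

  public static void main(String[] args) {
    // Placeholder path shaped like a GCS directory listing entry.
    String listed = "gs://bucket/Deleted_Segments/events_100m/";
    System.out.println("naive:     '" + naiveLastSegment(listed) + "'");
    System.out.println("sanitized: '" + lastSegment(listed) + "'");
  }
}
```

With the naive extraction the table name comes out empty, so the rebuilt URI points at the parent directory rather than the table directory. The sketch does not special-case a bare scheme root such as `gs://`.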
## Testing
- Unit tests pass and there is no regression for LocalPinotFS.
- Tested against GcsPinotFS.
Logs before removing the trailing delimiter "/":
```
2025/02/28 12:23:20.002 INFO [SegmentDeletionManager] [pool-19-thread-5]
tableNameDir:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/,
tableName:
2025/02/28 12:23:20.003 INFO [SegmentDeletionManager] [pool-19-thread-5]
tableNameURI: gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/
2025/02/28 12:23:20.037 INFO [GcsPinotFS] [pool-19-thread-5] Listed 1 files
from URI: gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/, is
recursive: false
2025/02/28 12:23:20.038 INFO [SegmentDeletionManager] [pool-19-thread-5]
Deleting file:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/
from the deleted directory. File uri being deleted:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments//
2025/02/28 12:23:20.069 WARN [SegmentDeletionManager] [pool-19-thread-5]
Caught exception while deleting file:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/
from deleted directory, file uri being deleted:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments//
```
Logs after removing the trailing delimiter "/":
```
2025/02/28 12:23:20.069 INFO [SegmentDeletionManager] [pool-19-thread-5]
tableNameDir:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m,
tableName: events_100m
2025/02/28 12:23:20.069 INFO [SegmentDeletionManager] [pool-19-thread-5]
tableNameURI:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m
2025/02/28 12:23:20.172 INFO [GcsPinotFS] [pool-19-thread-5] Listed 288
files from URI:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m, is
recursive: false
2025/02/28 12:23:20.172 INFO [SegmentDeletionManager] [pool-19-thread-5]
Deleting file:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/events_100m_1717299073419_1717718305291_FileIngestionTask_1740369136126_0_15__RETENTION_UNTIL__202502281030
from the deleted directory. File uri being deleted:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/events_100m_1717299073419_1717718305291_FileIngestionTask_1740369136126_0_15__RETENTION_UNTIL__202502281030
2025/02/28 12:23:20.207 INFO [GcsPinotFS] [pool-19-thread-5] Deleting uri
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/events_100m_1717299073419_1717718305291_FileIngestionTask_1740369136126_0_15__RETENTION_UNTIL__202502281030
force true
2025/02/28 12:23:20.571 INFO [SegmentDeletionManager] [pool-19-thread-5]
Deleting file:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/events_100m_1717299073419_1717718305291_FileIngestionTask_1740369136126_0_5__RETENTION_UNTIL__202502281030
from the deleted directory. File uri being deleted:
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/events_100m_1717299073419_1717718305291_FileIngestionTask_1740369136126_0_5__RETENTION_UNTIL__202502281030
2025/02/28 12:23:20.596 INFO [GcsPinotFS] [pool-19-thread-5] Deleting uri
gs://sc-sre-test-poc-pinot-fs/sc/managed/pinot/Deleted_Segments/events_100m/events_100m_1717299073419_1717718305291_FileIngestionTask_1740369136126_0_5__RETENTION_UNTIL__202502281030
force true
```
#### Note
Extra logs were added for testing and are not part of this PR.
Some error logs have been added to catch any FileSystem-related issues.
## Further fixes
Change the implementation of listFiles() so that it returns paths in a
consistent form.
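
One possible shape for that follow-up, as a hedged sketch only: normalize the listing output inside the file system layer so callers never see a trailing "/". The class and method names here are hypothetical and the paths are placeholders; the actual GcsPinotFS change may look quite different.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of GcsPinotFS.
final class ListingNormalizerSketch {

  // Returns a copy of the listed paths with a single trailing '/' removed from each entry.
  static String[] normalize(String[] listedPaths) {
    return Arrays.stream(listedPaths)
        .map(p -> p.endsWith("/") ? p.substring(0, p.length() - 1) : p)
        .toArray(String[]::new);
  }

  public static void main(String[] args) {
    // Placeholder listing: one directory-style entry and one already-clean entry.
    String[] listed = {"gs://bucket/dir/table_a/", "gs://bucket/dir/table_b"};
    System.out.println(Arrays.toString(normalize(listed)));
  }
}
```

Doing this at the listing layer would keep the behavior aligned with S3PinotFS and LocalPinotFS, so callers would not need their own sanitization.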