mro68 opened a new pull request, #7059:
URL: https://github.com/apache/opendal/pull/7059
## Summary
This PR implements a custom `GdriveFlatLister` that uses batch OR queries to
list multiple directories in a single API call, significantly improving
recursive listing performance.
**Depends on:** #7058 (needs size/modifiedTime metadata fix first)
## Motivation
When using OpenDAL's gdrive service for recursive listing (e.g., with backup
tools like rustic), the generic `FlatLister` makes at least one API call per
directory. For repositories with hundreds of subdirectories, this results in
hundreds of sequential API calls, making it ~50x slower than rclone.
## Solution
Inspired by rclone's approach, this PR implements batch queries using Google
Drive's OR syntax:
```
('id1' in parents or 'id2' in parents or 'id3' in parents ...)
```
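As a rough illustration of the query shape above, the sketch below builds one OR query per group of parent IDs. It is not the code of `gdrive_list_batch()` from this PR: the helper names `escape_query_value`, `build_batch_query`, and `build_batch_queries` are hypothetical, and the quote escaping is a defensive assumption, since Drive IDs normally contain no quotes.

```rust
/// Maximum number of parent IDs per query, as stated in this PR description.
const BATCH_SIZE: usize = 50;

/// Hypothetical helper: escape backslashes and single quotes so the value
/// can sit inside a single-quoted Drive query literal.
fn escape_query_value(value: &str) -> String {
    value.replace('\\', "\\\\").replace('\'', "\\'")
}

/// Hypothetical helper: build one OR query for a non-empty slice of parent IDs.
/// Callers are assumed never to pass an empty slice.
fn build_batch_query(parent_ids: &[String]) -> String {
    let clauses: Vec<String> = parent_ids
        .iter()
        .map(|id| format!("'{}' in parents", escape_query_value(id)))
        .collect();
    format!("({})", clauses.join(" or "))
}

/// Split an arbitrary list of parent IDs into queries of at most
/// `BATCH_SIZE` IDs each. An empty input yields no queries.
fn build_batch_queries(parent_ids: &[String]) -> Vec<String> {
    parent_ids.chunks(BATCH_SIZE).map(build_batch_query).collect()
}
```

With this sketch, 120 parent IDs would produce three queries (50, 50, and 20 IDs). The real implementation may append further filters to the query.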
### Key Changes
1. **New `gdrive_list_batch()` method** in `core.rs` - Builds OR queries for multiple parent IDs
2. **New `GdriveFlatLister`** in `flat_lister.rs` - Custom recursive lister with:
   - Batch processing of up to 50 parent IDs per query
   - Page size of 1000 (Google Drive API maximum)
   - Efficient BFS traversal collecting directories as they're discovered
3. **Enable `list_with_recursive: true`** capability
4. **Add `parents` field** to `GdriveFile` struct for batch parent resolution
### Performance Results
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Time (2000+ files) | ~55s | ~7.5s | **7x faster** |
| API calls | ~260 | ~12 | **~20x fewer** |
Tested with rustic backup tool against real Google Drive repositories.
## Technical Details
The `GdriveFlatLister` uses a BFS approach:
1. Start with the root directory ID
2. Query up to 50 pending directories at once using a single OR query
3. Process the results: yield files and queue newly discovered directories
4. Repeat until no pending directories remain
This is similar to how rclone implements `ListR` for Google Drive.
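The BFS steps above can be sketched as follows. This is a simplified, synchronous illustration rather than the actual `GdriveFlatLister`: the `Entry` struct, the `list_recursive` function, and the `list_batch` callback are hypothetical stand-ins, and the callback is assumed to follow pagination internally and return every child of the given parent IDs.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical, trimmed-down stand-in for a Drive file record.
#[derive(Debug, Clone)]
struct Entry {
    id: String,
    name: String,
    parents: Vec<String>,
    is_dir: bool,
}

/// Batch size taken from this PR description.
const BATCH_SIZE: usize = 50;

/// Breadth-first recursive listing. Returns the discovered paths (relative
/// to the root, directories with a trailing slash) and the number of batches
/// issued. `list_batch` is an assumed callback, not a real OpenDAL API.
fn list_recursive<F>(root_id: &str, mut list_batch: F) -> (Vec<String>, usize)
where
    F: FnMut(&[String]) -> Vec<Entry>,
{
    // Known directory IDs mapped to their path relative to the root.
    let mut dir_paths: HashMap<String, String> = HashMap::new();
    dir_paths.insert(root_id.to_string(), String::new());

    let mut pending: VecDeque<String> = VecDeque::from([root_id.to_string()]);
    let mut paths = Vec::new();
    let mut batches = 0;

    while !pending.is_empty() {
        // Take up to BATCH_SIZE directories off the front of the queue.
        let take = pending.len().min(BATCH_SIZE);
        let batch: Vec<String> = pending.drain(..take).collect();
        batches += 1;

        for entry in list_batch(&batch) {
            // Resolve the path through the first parent we already know about.
            let Some(parent_path) = entry.parents.iter().find_map(|p| dir_paths.get(p)) else {
                continue;
            };
            let path = format!("{}{}", parent_path, entry.name);
            if entry.is_dir {
                let dir_path = format!("{}/", path);
                dir_paths.insert(entry.id.clone(), dir_path.clone());
                pending.push_back(entry.id);
                paths.push(dir_path);
            } else {
                paths.push(path);
            }
        }
    }
    (paths, batches)
}
```

In this sketch the number of batches grows with the depth of the tree and the number of directories divided by 50, instead of with the number of directories alone, which is the source of the reduced call count reported above.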
## Checklist
- [x] I have read the [CONTRIBUTING](https://github.com/apache/opendal/blob/main/CONTRIBUTING.md) documentation
- [x] I have added tests that prove my fix is effective or that my feature works (behavior tests pass)
- [x] This PR is based on #7058 which must be merged first