fresh-borzoni opened a new pull request, #187: URL: https://github.com/apache/fluss-rust/pull/187
Linked issue: close #140
This PR implements remote log segment downloading for the Rust client with
priority-based scheduling and concurrency limiting, matching the Java client's
behavior.
### Change log
- **Added `RemoteLogDownloader`**: Background coordinator that manages
concurrent downloads of remote log segments from S3/filesystem storage
- **Priority-based scheduling**: Downloads prioritized by segment
timestamp (oldest first), then offset, matching Java's `PriorityBlockingQueue`
ordering
- **Two-layer concurrency control**:
  - Concurrency limit (default: 3): limits simultaneous downloads
  - Prefetch limit (default: 4): limits memory usage from
    downloaded-but-not-consumed segments
- **Exponential backoff retry**: Failed downloads retry with exponential
backoff (100ms to 5s) and jitter
- **RAII resource management**: Temp files and semaphore permits are
released automatically via `Drop`
- **Configuration**: Added `scanner_remote_log_prefetch_num` and
`scanner_remote_log_download_threads` config options matching Java client
defaults
- **Protocol changes**: Extended `FetchLogRequest` to include
`remote_log_fetch_info` for remote segment metadata
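
The oldest-first ordering from the scheduling bullet above can be illustrated with the standard library's `BinaryHeap`. This is only a minimal sketch, not the code in this PR: `SegmentRequest`, its fields, and `download_order` are hypothetical names chosen for illustration, and the real downloader may order requests differently in detail.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a queued remote segment download.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct SegmentRequest {
    // Field order matters: derived `Ord` compares `timestamp` first,
    // then falls back to `start_offset` on ties.
    timestamp: i64,
    start_offset: i64,
}

/// Returns requests in download order: oldest timestamp first,
/// ties broken by the lowest start offset.
fn download_order(requests: Vec<SegmentRequest>) -> Vec<SegmentRequest> {
    // `BinaryHeap` is a max-heap, so each entry is wrapped in `Reverse`
    // to make `pop` return the smallest element first.
    let mut heap: BinaryHeap<Reverse<SegmentRequest>> =
        requests.into_iter().map(Reverse).collect();
    let mut ordered = Vec::with_capacity(heap.len());
    while let Some(Reverse(request)) = heap.pop() {
        ordered.push(request);
    }
    ordered
}
```

Wrapping entries in `Reverse` keeps the comparison logic in the derived `Ord`, so changing the tie-break rule only requires reordering or adding struct fields.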
### API and Format
**Configuration API additions:**
- `Config::scanner_remote_log_prefetch_num` (default: 4)
- `Config::scanner_remote_log_download_threads` (default: 3)
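
As a rough illustration of how the two defaults could interact, the sketch below models them in a stand-alone struct. `RemoteLogScanConfig` and `may_start_download` are hypothetical and are not the crate's actual `Config` type or API. The `usize` field types are an assumption, and so is the rule that a prefetch slot is taken before a download starts and held until the segment is consumed.

```rust
/// Hypothetical stand-in for the two new options; NOT the crate's real `Config`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RemoteLogScanConfig {
    /// Max segments downloaded (or downloading) but not yet consumed.
    scanner_remote_log_prefetch_num: usize,
    /// Max simultaneous downloads.
    scanner_remote_log_download_threads: usize,
}

impl Default for RemoteLogScanConfig {
    fn default() -> Self {
        // Defaults as listed in this PR description.
        Self {
            scanner_remote_log_prefetch_num: 4,
            scanner_remote_log_download_threads: 3,
        }
    }
}

impl RemoteLogScanConfig {
    /// Assumed admission rule: a new download may start only while both
    /// limits have headroom.
    fn may_start_download(&self, active_downloads: usize, held_prefetch_slots: usize) -> bool {
        active_downloads < self.scanner_remote_log_download_threads
            && held_prefetch_slots < self.scanner_remote_log_prefetch_num
    }
}
```

Under these assumptions, three in-flight downloads block a fourth even when a prefetch slot is free, and four unconsumed segments block any new download even when no download is running.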
No breaking changes to existing public APIs.
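
For reviewers who want a concrete picture of the retry delays mentioned in the change log, here is a minimal sketch of exponential backoff between 100ms and 5s with jitter. It is written under assumptions: the doubling factor, the "shrink by up to 25%" jitter rule, and the names `base_backoff` and `jittered_backoff` are illustrative and not taken from the PR. The random value is passed in by the caller so the example needs no external crate.

```rust
use std::time::Duration;

const INITIAL_BACKOFF: Duration = Duration::from_millis(100);
const MAX_BACKOFF: Duration = Duration::from_secs(5);

/// Delay before retry number `attempt` (0-based), without jitter:
/// 100ms, 200ms, 400ms, ... capped at 5s. Doubling is an assumption.
fn base_backoff(attempt: u32) -> Duration {
    // `checked_shl` returns `None` once the shift reaches 32 bits;
    // treat that as "far past the cap".
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    INITIAL_BACKOFF.saturating_mul(factor).min(MAX_BACKOFF)
}

/// Applies jitter by shrinking the delay by up to 25% (assumed rule).
/// `unit_random` should be in [0, 1]; real code would draw it from an RNG.
fn jittered_backoff(attempt: u32, unit_random: f64) -> Duration {
    // Guard against NaN/infinite input, which would make `mul_f64` panic.
    let r = if unit_random.is_finite() {
        unit_random.clamp(0.0, 1.0)
    } else {
        0.0
    };
    base_backoff(attempt).mul_f64(1.0 - 0.25 * r)
}
```

Because the jitter only shrinks the delay, the result never exceeds the 5s cap. A symmetric jitter would need an extra clamp after scaling.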
