stevenpall opened a new pull request, #64758:
URL: https://github.com/apache/doris/pull/64758
## Proposed changes
The cloud recycler builds its S3 client in `S3Accessor::init()`
(`cloud/src/recycler/s3_accessor.cpp`) from an
`Aws::Client::ClientConfiguration` that never sets `requestTimeoutMs`, so
it keeps the SDK default of 3000 ms. The vendored aws-sdk-cpp maps that to
`CURLOPT_LOW_SPEED_TIME=3` / `CURLOPT_LOW_SPEED_LIMIT=1`, so any slow or
large `DeleteObjects` request whose transfer rate stays below 1 byte/s for
3 seconds is aborted with curl error 28 ("Timeout was reached").
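To make the mapping concrete, here is a simplified model of it, not the SDK source. `LowSpeedGuard`, `low_speed_guard_for` and `would_abort` are hypothetical names, and the integer division by 1000 is an assumption inferred from the 3000 ms to 3 s mapping described above.

```cpp
// Hypothetical stand-in for the two libcurl options; not an SDK or libcurl type.
struct LowSpeedGuard {
    long low_speed_limit_bytes_per_sec;  // models CURLOPT_LOW_SPEED_LIMIT
    long low_speed_time_sec;             // models CURLOPT_LOW_SPEED_TIME
};

// Assumed mapping: the limit is fixed at 1 byte/s and the time window is
// requestTimeoutMs truncated to whole seconds.
constexpr LowSpeedGuard low_speed_guard_for(long request_timeout_ms) {
    return LowSpeedGuard{1, request_timeout_ms / 1000};
}

// Models the abort rule: the transfer is considered too slow when its average
// rate stays below the limit for at least the configured number of seconds.
constexpr bool would_abort(const LowSpeedGuard& guard,
                           long avg_bytes_per_sec,
                           long slow_for_sec) {
    return avg_bytes_per_sec < guard.low_speed_limit_bytes_per_sec &&
           slow_for_sec >= guard.low_speed_time_sec;
}
```

Under this model, a request that transfers nothing for 5 seconds is aborted with a 3000 ms setting but survives with 30000 ms.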
This is the same defect that #49315 fixed for the BE
(`be/src/util/s3_util.cpp`, `requestTimeoutMs = 30000`), but the cloud
recycler's client was missed by that change. On object stores with
higher per-request latency (e.g. an OVH cold object-storage vault) the
recycler wedges: every `delete_rowset_data` / `DeleteObjects` aborts at
3s with curlCode 28, the recycler burns its delete budget on timed-out
ops, and the orphan backlog never drains.
This change sets `requestTimeoutMs = 30000` and
`connectTimeoutMs = 5000` on the recycler client, mirroring #49315.
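A minimal sketch of the shape of this change, not the actual diff. `apply_recycler_timeouts` and `ClientConfigurationSketch` are hypothetical names; the stand-in struct only exists so the snippet compiles without the SDK, and its `connectTimeoutMs` initial value is a placeholder rather than the SDK default.

```cpp
// Hypothetical stand-in exposing the two fields discussed above.
// The real type is Aws::Client::ClientConfiguration.
struct ClientConfigurationSketch {
    long requestTimeoutMs = 3000;  // SDK default, per the description above
    long connectTimeoutMs = 0;     // placeholder value, NOT the SDK default
};

// Applies the timeouts proposed in this PR. Written as a template so it works
// with any configuration type that has these two public fields.
template <typename Config>
void apply_recycler_timeouts(Config& conf) {
    conf.requestTimeoutMs = 30000;  // 30 s instead of the 3 s default
    conf.connectTimeoutMs = 5000;   // 5 s to establish the connection
}
```

In `S3Accessor::init()` the equivalent would be two assignments on the configuration object before the S3 client is constructed.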
Symptom in the MS recycler log before the fix:
```
s3_obj_client.cpp: failed to delete objects ... responseCode=-1
error="curlCode: 28, Timeout was reached"
recycler.cpp: failed to delete rowset data, instance_id=...
```
## Further comments
Pure timeout-config change in the recycler S3 client path; no behavior
change for fast object stores. `MaxDeleteBatch` is left unchanged.