dosubot[bot] commented on issue #6995: URL: https://github.com/apache/opendal/issues/6995#issuecomment-3646226316
You've identified a real issue: the current implementation resolves a range with an unbounded end by issuing a stat to fetch the content length before reading. That opens a race window: if the file is overwritten between the stat and the read, the read is bounded by a stale length. The result can be partial or stale reads that callers see as data corruption, especially in environments with frequent overwrites. It also adds an extra request to each such read, increasing latency and cost.

There aren't any existing issues or discussions that directly address this bug or propose a way of handling unbounded ranges that avoids both the race condition and the extra stat request. Related discussions (like [#5426](https://github.com/apache/opendal/issues/5426)) suggest that range handling should ideally happen at read time rather than at reader creation, but they don't mention this specific problem.

Given the current API, there isn't a documented workaround to avoid this race. The best next step would be for maintainers to review how unbounded ranges are handled and consider a design where the full-object read is performed atomically, without a pre-read stat, to ensure data integrity and avoid extra requests. This would align with your expectation that `..` should always return the latest full data without accidental truncation or extra requests.

*To reply, just mention [@dosu](https://go.dosu.dev/dosubot).*

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
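
To make the stat-then-read race described in the comment concrete, here is a minimal sketch in Rust. It does not use OpenDAL: `MockStore`, `stat_len`, `read_bounded`, `read_from`, and `stat_then_read` are hypothetical names invented for illustration, and the overwrite is simulated inline rather than by a real concurrent writer.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for an object store (not OpenDAL's API).
#[derive(Default)]
pub struct MockStore {
    objects: HashMap<String, Vec<u8>>,
}

impl MockStore {
    /// Create or overwrite the object at `path`.
    pub fn put(&mut self, path: &str, data: &[u8]) {
        self.objects.insert(path.to_string(), data.to_vec());
    }

    /// "stat": return the current content length, if the object exists.
    pub fn stat_len(&self, path: &str) -> Option<usize> {
        self.objects.get(path).map(|d| d.len())
    }

    /// Bounded read of `start..end`, clamped to the object's current size.
    pub fn read_bounded(&self, path: &str, start: usize, end: usize) -> Option<Vec<u8>> {
        let data = self.objects.get(path)?;
        let end = end.min(data.len());
        let start = start.min(end);
        Some(data[start..end].to_vec())
    }

    /// Open-ended read of `start..`, resolved against the object at read time.
    pub fn read_from(&self, path: &str, start: usize) -> Option<Vec<u8>> {
        let data = self.objects.get(path)?;
        Some(data[start.min(data.len())..].to_vec())
    }
}

/// Stat first, let an overwrite land in between, then read using the stale length.
pub fn stat_then_read(store: &mut MockStore, path: &str, overwrite: &[u8]) -> Option<Vec<u8>> {
    let len = store.stat_len(path)?; // request 1: length of the old object
    store.put(path, overwrite); // simulated writer racing with the reader
    store.read_bounded(path, 0, len) // request 2: bounded by the stale length
}
```

In this sketch, if the object initially holds `old` and is overwritten with `new-longer` between the two requests, `stat_then_read` returns only the first three bytes of the new object, while a single `read_from(path, 0)` returns it whole. A real backend may behave differently at the edges (for example when the new object is shorter), so treat this only as an illustration of the window between the two requests.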
