[
https://issues.apache.org/jira/browse/HADOOP-18521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18034293#comment-18034293
]
ASF GitHub Bot commented on HADOOP-18521:
-----------------------------------------
github-actions[bot] commented on PR #5133:
URL: https://github.com/apache/hadoop/pull/5133#issuecomment-3470795037
We're closing this stale PR because it has been open for 100 days with no
activity. This isn't a judgement on the merit of the PR in any way. It's just a
way of keeping the PR queue manageable.
If you feel like this was a mistake, or you would like to continue working
on it, please feel free to re-open it and ask for a committer to remove the
stale tag and review again.
Thanks all for your contribution.
> ABFS ReadBufferManager buffer sharing across concurrent HTTP requests
> ---------------------------------------------------------------------
>
> Key: HADOOP-18521
> URL: https://issues.apache.org/jira/browse/HADOOP-18521
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/azure
> Affects Versions: 3.3.2, 3.3.3, 3.3.4
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.3.5
>
> Attachments: HADOOP-18521 ABFS ReadBufferManager buffer sharing
> across concurrent HTTP requests.pdf, validating-csv-record-io.sc
>
>
> {{AbfsInputStream.close()}} can return buffers still in use by active
> prefetch GET requests to the ReadBufferManager free buffer pool.
> A subsequent prefetch by a different stream in the same process may acquire
> the same buffer; the still-in-flight GET can then overwrite that stream's
> prefetched data, which may in turn be returned to the other thread.
> The full analysis is in the document attached to this JIRA.
> The issue is fixed in Hadoop 3.3.5.
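> To see why this risks corruption, here is a minimal sketch (not the ABFS
> code itself, and with hypothetical names) of the hazard pattern: a pooled
> buffer is handed back to the free list while an in-flight read can still
> write into it, so a second consumer may observe the first request's bytes.
> {code:java}
> import java.util.concurrent.*;
>
> public class BufferReuseRace {
>     // stand-in for the ReadBufferManager free buffer pool
>     static final BlockingQueue<byte[]> freePool = new LinkedBlockingQueue<>();
>
>     public static void main(String[] args) throws Exception {
>         byte[] buf = new byte[4];
>         ExecutorService exec = Executors.newSingleThreadExecutor();
>         // stream A's in-flight prefetch GET, which will fill buf later
>         Future<?> inFlight = exec.submit(() -> {
>             try { Thread.sleep(100); } catch (InterruptedException e) { }
>             buf[0] = 'A'; // late write from stream A's network request
>         });
>         // stream A is closed: its buffer goes straight back to the pool,
>         // with no attempt to cancel or wait for the in-flight request
>         freePool.add(buf);
>         // stream B acquires the same buffer and fills it with its own data
>         byte[] reused = freePool.take();
>         reused[0] = 'B';
>         inFlight.get(); // stream A's write lands after stream B's
>         System.out.println((char) reused[0]); // prints 'A': B's data corrupted
>         exec.shutdown();
>     }
> }
> {code}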
> h2. Emergency fix through site configuration
> On releases without the fix (3.3.2-3.3.4), the bug can be avoided by
> disabling all prefetching:
> {code}
> fs.azure.readaheadqueue.depth = 0
> {code}
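> As an illustration, a minimal sketch of applying the same setting
> programmatically, assuming a standard Hadoop {{Configuration}} and a
> hypothetical container/account URI:
> {code:java}
> import java.net.URI;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
>
> public class DisableAbfsReadahead {
>     public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         // 0 disables all ABFS read-ahead, avoiding the shared-buffer bug
>         conf.setInt("fs.azure.readaheadqueue.depth", 0);
>         // hypothetical container/account; substitute your own
>         FileSystem fs = FileSystem.get(
>             URI.create("abfs://container@account.dfs.core.windows.net/"), conf);
>         System.out.println("readahead queue depth = "
>             + fs.getConf().getInt("fs.azure.readaheadqueue.depth", -1));
>     }
> }
> {code}
> In cluster deployments the same key can be set once in {{core-site.xml}}.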
> h2. Automated probes for risk of exposure
> The [cloudstore|https://github.com/steveloughran/cloudstore] diagnostics JAR
> has a command
> [safeprefetch|https://github.com/steveloughran/cloudstore/blob/trunk/src/main/site/safeprefetch.md]
> which probes an ABFS client for vulnerability to this bug. It does this
> through {{PathCapabilities.hasPathCapability()}} probes, and can be invoked on
> the command line to validate the version/configuration.
> Consult [the
> source|https://github.com/steveloughran/cloudstore/blob/trunk/src/main/java/org/apache/hadoop/fs/store/abfs/SafePrefetch.java#L96]
> to see how to do this programmatically.
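> As a rough sketch of such a probe (the capability string below is an
> assumption drawn from the fix; confirm it against the linked source):
> {code:java}
> import java.io.IOException;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public final class SafePrefetchProbe {
>     // assumed capability name; verify against SafePrefetch.java
>     static final String CAPABILITY = "fs.azure.capability.readahead.safe";
>
>     /** True if this (ABFS) filesystem declares the readahead fix. */
>     public static boolean isReadaheadSafe(FileSystem fs) throws IOException {
>         return fs.hasPathCapability(new Path("/"), CAPABILITY);
>     }
> }
> {code}
> If the probe returns false, treat the client as vulnerable unless prefetching
> has been disabled as above.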
> Note also that the tool's
> [mkcsv|https://github.com/steveloughran/cloudstore/blob/trunk/src/main/site/mkcsv.md]
> command can be used to generate the multi-GB CSV files needed to trigger the
> condition and so verify that the issue exists.
> h2. Microsoft Announcement
> {code}
> From: Sneha Vijayarajan
> Subject: RE: Alert ! ABFS Driver - Possible data corruption on read path
> Hi,
> One of the contributions made to ABFS Driver has a potential to cause data
> corruption on read
> path.
> Please check if the below change is part of any of your releases:
> HADOOP-17156. Purging the buffers associated with input streams during
> close() by mukund-thakur
> · Pull Request #3285 · apache/hadoop (github.com)
> RCA: Scenario that can lead to data corruption:
> Driver allocates a bunch of prefetch buffers at init and are shared by
> different instances of
> InputStreams created within that process. These prefetch buffers could be in
> 3 stages –
> * In ReadAheadQueue : request for prefetch logged
> * In ProgressList : Work has begun to talk to backend store to get the
> requested data
> * In CompletedList: Prefetch data is now available for consumption.
> When multiple InputStreams have prefetch buffers across these states and
> close is triggered on
> any InputStream/s, the commit above will remove buffers allotted to
> respective stream from all
> the 3 lists and also declare that the buffers are available for new
> prefetches to happen, but
> no action to cancel/prevent buffer from being updated with ongoing network
> request is done.
> Data corruption can happen if one such freed up buffer from InProgressList is
> allotted to a new
> prefetch request and then the buffer got filled up with the previous stream’s
> network request.
> Mitigation: If this change is present in any release, kindly help communicate
> to your customers
> to immediately set below config to 0 in their clusters. This will disable
> prefetches which can
> have an impact on perf but will prevent the possibility of data corruption.
> fs.azure.readaheadqueue.depth: Sets the readahead queue depth in
> AbfsInputStream. In case the
> set value is negative the read ahead queue depth will be set as
> Runtime.getRuntime().availableProcessors(). By default the value will be 2.
> To disable
> readaheads, set this value to 0. If your workload is doing only random reads
> (non-sequential)
> or you are seeing throttling, you may try setting this value to 0.
> Next steps: We are getting help to post the notifications for this in Apache
> groups. Work on
> HotFix is also ongoing. Will update this thread once the change is checked in.
> Please reach out for any queries or clarifications.
> Thanks,
> Sneha Vijayarajan
> {code}
>