[
https://issues.apache.org/jira/browse/HADOOP-17301?focusedWorklogId=498000&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498000
]
ASF GitHub Bot logged work on HADOOP-17301:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 09/Oct/20 13:26
Start Date: 09/Oct/20 13:26
Worklog Time Spent: 10m
Work Description: steveloughran commented on a change in pull request
#2369:
URL: https://github.com/apache/hadoop/pull/2369#discussion_r501695425
##########
File path:
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ReadBufferManager.java
##########
@@ -242,13 +243,29 @@ private synchronized boolean tryEvict() {
}
// next, try any old nodes that have not been consumed
+ // Failed read buffers (with buffer index=-1) that are older than
+ // thresholdAge should be cleaned up, but at the same time should not
+ // report successful eviction.
+ // Queue logic expects that a buffer is freed up for read ahead when
+ // eviction is successful, whereas a failed ReadBuffer would have released
+ // its buffer when its status was set to READ_FAILED.
long earliestBirthday = Long.MAX_VALUE;
+ ArrayList<ReadBuffer> oldFailedBuffers = new ArrayList<>();
for (ReadBuffer buf : completedReadList) {
- if (buf.getTimeStamp() < earliestBirthday) {
+ if ((buf.getBufferindex() != -1)
+ && (buf.getTimeStamp() < earliestBirthday)) {
nodeToEvict = buf;
earliestBirthday = buf.getTimeStamp();
+ } else if ((buf.getBufferindex() == -1)
+ && (currentTimeMillis() - buf.getTimeStamp()) >
thresholdAgeMilliseconds) {
Review comment:
pull `currentTimeMillis()` outside the for loop as its an OS call with
potential cost, and things probably work best if the same value is used through
the loop and the code at L269
##########
File path:
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestAbfsInputStream.java
##########
@@ -264,12 +297,24 @@ public void testSuccessfulReadAhead() throws Exception {
any(String.class));
AbfsInputStream inputStream = getAbfsInputStream(client,
"testSuccessfulReadAhead.txt");
+ int beforeReadCompletedListSize =
ReadBufferManager.getBufferManager().getCompletedReadListSize();
// First read request that triggers readAheads.
inputStream.read(new byte[ONE_KB]);
// Only the 3 readAhead threads should have triggered client.read
verifyReadCallCount(client, 3);
+ int newAdditionsToCompletedRead =
+ ReadBufferManager.getBufferManager().getCompletedReadListSize()
+ - beforeReadCompletedListSize;
+ // read buffer might be dumped if the ReadBufferManager getblock preceded
+ // the action of buffer being picked for reading from readaheadqueue, so
that
+ // inputstream can proceed with read and not be blocked on readahead thread
+ // availability. So the count of buffers in completedReadQueue for the
stream
+ // can be same or lesser than the requests triggered to queue readahead.
+ assertTrue(
Review comment:
use Assertions.assertThat with an explicit `isLessThanOrEqualTo(3)`
assertion.
##########
File path:
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestAbfsInputStream.java
##########
@@ -182,7 +183,39 @@ public void testFailedReadAhead() throws Exception {
checkEvictedStatus(inputStream, 0, false);
}
+ @Test
+ public void testFailedReadAheadEviction() throws Exception {
+ AbfsClient client = getMockAbfsClient();
+ AbfsRestOperation successOp = getMockRestOp();
+
ReadBufferManager.setThresholdAgeMilliseconds(INCREASED_READ_BUFFER_AGE_THRESHOLD);
+ // Stub :
+ // Read request leads to 3 readahead calls: Fail all 3
readahead-client.read()
+ // Actual read request fails with the failure in readahead thread
+ doThrow(new TimeoutException("Internal Server error"))
+ .when(client)
+ .read(any(String.class), any(Long.class), any(byte[].class),
+ any(Integer.class), any(Integer.class), any(String.class),
+ any(String.class));
+
+ AbfsInputStream inputStream = getAbfsInputStream(client,
"testFailedReadAheadEviction.txt");
+
+ // Add a failed buffer to completed queue and set to no free buffers to
read ahead.
+ ReadBuffer buff = new ReadBuffer();
+ buff.setStatus(
+
org.apache.hadoop.fs.azurebfs.contracts.services.ReadBufferStatus.READ_FAILED);
Review comment:
import the field rather than a full reference
##########
File path:
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ReadBufferManager.java
##########
@@ -464,4 +480,10 @@ int getCompletedReadListSize() {
void callTryEvict() {
tryEvict();
}
+
+ @VisibleForTesting
Review comment:
add (minimal) javadoc
##########
File path:
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestAbfsInputStream.java
##########
@@ -49,6 +49,7 @@
private static final int TWO_KB = 2 * 1024;
private static final int THREE_KB = 3 * 1024;
private static final int REDUCED_READ_BUFFER_AGE_THRESHOLD = 3000; // 3 sec
+ private static final int INCREASED_READ_BUFFER_AGE_THRESHOLD = 30000; // 30
sec
Review comment:
use `30_000` style integer
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 498000)
Time Spent: 1h 10m (was: 1h)
> ABFS: Fix bug introduced in HADOOP-16852 which reports read-ahead error back
> ----------------------------------------------------------------------------
>
> Key: HADOOP-17301
> URL: https://issues.apache.org/jira/browse/HADOOP-17301
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Sneha Vijayarajan
> Assignee: Sneha Vijayarajan
> Priority: Critical
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> When reads done by readahead buffers failed, the exceptions where dropped and
> the failure was not getting reported to the calling app.
> Jira HADOOP-16852: Report read-ahead error back
> tried to handle the scenario by reporting the error back to calling app. But
> the commit has introduced a bug which can lead to ReadBuffer being injected
> into read completed queue twice.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]