[GitHub] [hadoop] anoopsjohn commented on a change in pull request #2646: HADOOP-17038 Support disabling buffered reads in ABFS positional reads.

GitBox Fri, 05 Feb 2021 02:14:34 -0800


anoopsjohn commented on a change in pull request #2646:
URL: https://github.com/apache/hadoop/pull/2646#discussion_r570700786




##########
File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java
##########
@@ -135,6 +145,41 @@ public String getPath() {
     return path;
   }
 
+  @Override
+  public int read(long position, byte[] buffer, int offset, int length)
+      throws IOException {
+    // When bufferedPreadDisabled = true, this API do not use any shared buffer,
+    // cursor position etc. So this is implemented as NOT synchronized. HBase
+    // kind of random reads on a shared file input stream will greatly get
+    // benefited by such implementation.
+    // Strict close check at the begin of the API only not for the entire flow.
+    synchronized (this) {
+      if (closed) {
+        throw new IOException(FSExceptionMessages.STREAM_IS_CLOSED);
+      }
+    }
+    LOG.debug("pread requested offset = {} len = {} bufferedPreadDisabled = {}",
+        offset, length, bufferedPreadDisabled);
+    if (!bufferedPreadDisabled) {
+      return super.read(position, buffer, offset, length);
+    }
+    validatePositionedReadArgs(position, buffer, offset, length);
+    if (length == 0) {
+      return 0;
+    }
+    if (streamStatistics != null) {
+      streamStatistics.readOperationStarted();
+    }
+    int bytesRead = readRemote(position, buffer, offset, length);

Review comment:
       This will have to be done as a follow-up issue then. Does that sound OK? I will raise one and link it to this issue.
   

##########
File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
##########
@@ -634,12 +638,15 @@ public AbfsInputStream openFileForRead(final Path path, final FileSystem.Statist
       // Add statistics for InputStream
       return new AbfsInputStream(client, statistics,
               relativePath, contentLength,
-              populateAbfsInputStreamContext(),
+              populateAbfsInputStreamContext(options),
               eTag);
     }
   }
 
-  private AbfsInputStreamContext populateAbfsInputStreamContext() {
+  private AbfsInputStreamContext populateAbfsInputStreamContext(
+      Optional<Configuration> options) {
+    boolean bufferedPreadDisabled = options.isPresent()

Review comment:
       It's a Configuration wrapped in an Optional, and we need to fetch the boolean config out of it.
   So if we use orElse(), we end up with:
   Configuration conf = options.orElse(null);
   boolean bufferedPreadDisabled = conf != null && conf.getBoolean(FS_AZURE_BUFFERED_PREAD_DISABLE, false);
   Otherwise:
   boolean bufferedPreadDisabled = options.orElseGet(() -> new Configuration(false)).getBoolean("key", false);
   I'm not sure why we should end up creating a new Configuration object.
   I feel it's better to keep the current isPresent() check.
   WDYT?
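
   For comparison, here is a minimal, self-contained sketch of the options discussed above. It is not the PR's code: `Conf` is a hypothetical stand-in for Hadoop's `Configuration`, and `KEY` is a placeholder rather than the real value of `FS_AZURE_BUFFERED_PREAD_DISABLE`. The third variant, `Optional.map(...).orElse(false)`, is not proposed in this thread; it is included only because it avoids both the null check and the throwaway object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class OptionalConfigSketch {

  /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
  static final class Conf {
    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) {
      props.put(key, value);
    }

    boolean getBoolean(String key, boolean defaultValue) {
      String v = props.get(key);
      return v == null ? defaultValue : Boolean.parseBoolean(v);
    }
  }

  /** Placeholder key, assumed for illustration only. */
  static final String KEY = "example.buffered.pread.disable";

  /** Current approach in the diff: explicit isPresent() check. */
  static boolean viaIsPresent(Optional<Conf> options) {
    return options.isPresent() && options.get().getBoolean(KEY, false);
  }

  /** orElse(null) approach: needs a null check on the unwrapped value. */
  static boolean viaOrElseNull(Optional<Conf> options) {
    Conf conf = options.orElse(null);
    return conf != null && conf.getBoolean(KEY, false);
  }

  /** map/orElse approach: no null check and no extra Conf object. */
  static boolean viaMap(Optional<Conf> options) {
    return options.map(c -> c.getBoolean(KEY, false)).orElse(false);
  }

  public static void main(String[] args) {
    Conf enabled = new Conf();
    enabled.set(KEY, "true");

    System.out.println(viaIsPresent(Optional.of(enabled)));
    System.out.println(viaOrElseNull(Optional.of(enabled)));
    System.out.println(viaMap(Optional.of(enabled)));
    System.out.println(viaMap(Optional.empty()));
  }
}
```

   All three variants return the same result for a present and for an empty Optional; they differ only in readability and in whether a null or a temporary object appears along the way.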





