[
https://issues.apache.org/jira/browse/HADOOP-19139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837661#comment-17837661
]
ASF GitHub Bot commented on HADOOP-19139:
-----------------------------------------
anmolanmol1234 commented on code in PR #6699:
URL: https://github.com/apache/hadoop/pull/6699#discussion_r1566892915
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsInputStreamStatistics.java:
##########
@@ -227,15 +227,23 @@ public void testReadStatistics() throws IOException {
* readOps - Since each time read operation is performed OPERATIONS
* times, total number of read operations would be equal to OPERATIONS.
*
- * remoteReadOps - Only a single remote read operation is done. Hence,
+ * remoteReadOps -
+ * In case of Head Optimization for InputStream, the first read operation
+ * would read only the asked range and would not be able to read the entire file
+ * ras it has no information on the contentLength of the file. The second
Review Comment:
typo: as
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsNetworkStatistics.java:
##########
@@ -231,7 +237,17 @@ public void testAbfsHttpResponseStatistics() throws IOException {
// 1 read request = 1 connection and 1 get response
expectedConnectionsMade++;
expectedGetResponses++;
- expectedBytesReceived += bytesWrittenToFile;
+ if (!getConfiguration().getHeadOptimizationForInputStream()) {
+ expectedBytesReceived += bytesWrittenToFile;
+ } else {
+ /*
+ * With head optimization enabled, the abfsInputStream is not aware
+ * of the contentLength and hence, it would only read data for which the range
+ * is provided. With the first remote call done, the inputStream will get
+ * aware of the contentLength and would be able to use it for further reads.
+ */
+ expectedBytesReceived += 1;
Review Comment:
why +1?
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/contract/ITestAbfsFileSystemContractOpen.java:
##########
@@ -49,4 +61,55 @@ protected Configuration createConfiguration() {
protected AbstractFSContract createContract(final Configuration conf) {
return new AbfsFileSystemContract(conf, isSecure);
}
+
+ @Override
+ public FileSystem getFileSystem() {
Review Comment:
This code is repeated in multiple places, can it be made centralized?
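
One possible shape for that centralization, as a sketch only (AbfsTestFileSystemHelper and prepareFileSystem are invented names, and the body of the repeated getFileSystem() override is not visible in this snippet, so it is elided here):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;

// Hypothetical shared helper: the logic currently duplicated across the
// getFileSystem() overrides would live here once, and each test override
// would reduce to a one-line delegation.
public final class AbfsTestFileSystemHelper {

  private AbfsTestFileSystemHelper() {
  }

  public static FileSystem prepareFileSystem(FileSystem fs)
      throws IOException {
    // ... shared logic from the repeated overrides goes here ...
    return fs;
  }
}

Each test's override would then just return AbfsTestFileSystemHelper.prepareFileSystem(super.getFileSystem()).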
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemAuthorization.java:
##########
@@ -327,7 +328,15 @@ private void executeOp(Path reqPath, AzureBlobFileSystem fs,
fs.open(reqPath);
break;
case Open:
- fs.open(reqPath);
+ InputStream is = fs.open(reqPath);
+ if (getConfiguration().getHeadOptimizationForInputStream()) {
+ try {
+ is.read();
+ } catch (IOException ex) {
+ is.close();
Review Comment:
close should be in a finally block
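
For reference, a minimal sketch of that suggestion (illustrative only, not the PR's code), reusing the identifiers from the snippet above:

// Sketch: close in finally so the stream is released whether or not
// read() throws; any IOException from read() still propagates to the test.
InputStream is = fs.open(reqPath);
try {
  if (getConfiguration().getHeadOptimizationForInputStream()) {
    is.read();
  }
} finally {
  is.close();
}

A try-with-resources block would achieve the same effect with less code.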
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java:
##########
@@ -192,7 +192,11 @@ public void testSkipBounds() throws Exception {
Path testPath = path(TEST_FILE_PREFIX + "_testSkipBounds");
long testFileLength = assumeHugeFileExists(testPath);
- try (FSDataInputStream inputStream = this.getFileSystem().open(testPath)) {
+ try (FSDataInputStream inputStream = this.getFileSystem()
Review Comment:
need for this change?
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsInputStreamReadFooter.java:
##########
@@ -234,15 +244,41 @@ private void seekReadAndTest(final AzureBlobFileSystem fs,
long expectedBCursor;
long expectedFCursor;
if (optimizationOn) {
- if (actualContentLength <= footerReadBufferSize) {
- expectedLimit = actualContentLength;
- expectedBCursor = seekPos + actualLength;
+ if (getConfiguration().getHeadOptimizationForInputStream()) {
Review Comment:
Too many variable changes, can we add comments please?
> [ABFS]: No GetPathStatus call for opening AbfsInputStream
> ---------------------------------------------------------
>
> Key: HADOOP-19139
> URL: https://issues.apache.org/jira/browse/HADOOP-19139
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Reporter: Pranav Saxena
> Assignee: Pranav Saxena
> Priority: Major
> Labels: pull-request-available
>
> The Read API gives the contentLength and eTag of the path. This information
> would be used in future calls on that inputStream. Prior knowledge of the eTag
> is not of much importance.
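
As a rough illustration of the idea above (a hypothetical sketch with invented names, not the actual AbfsInputStream code): the stream is opened without a GetPathStatus/HEAD call, the first remote read fetches only the requested range, and the contentLength and eTag returned by that read are cached for every later call.

import java.io.IOException;

// Hypothetical sketch (invented names, not AbfsInputStream): the stream is
// created without knowing the file length or eTag and learns both from the
// first ranged read, so no GetPathStatus call is needed at open time.
class LazyOpenInputStream {

  /** Minimal stand-in for the remote store client; invented for this sketch. */
  interface RemoteClient {
    ReadResult read(String path, long position, int length, String eTag)
        throws IOException;
  }

  /** Minimal stand-in for one read response. */
  interface ReadResult {
    int bytesRead();
    long contentLength(); // e.g. parsed from the Content-Range header
    String eTag();
  }

  private final RemoteClient client;
  private final String path;
  private long contentLength = -1; // unknown until the first remote read
  private String eTag;             // pinned after the first remote read

  LazyOpenInputStream(RemoteClient client, String path) {
    this.client = client;
    this.path = path;
  }

  int readRemote(long position, int length) throws IOException {
    // The first call can only fetch the requested range, because the total
    // length is still unknown; later calls may use contentLength freely,
    // e.g. to clamp the range or to plan footer reads.
    ReadResult result = client.read(path, position, length, eTag);
    if (contentLength < 0) {
      contentLength = result.contentLength();
      eTag = result.eTag();
    }
    return result.bytesRead();
  }
}

This mirrors the test expectations quoted above: the first remote read only returns the bytes that were asked for, and subsequent reads can rely on the now-known contentLength.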