[GitHub] [hadoop] ishaniahuja commented on a change in pull request #2072: HADOOP-17058. ABFS: Support for AppendBlob in Hadoop ABFS Driver

2020-06-25 Thread GitBox


ishaniahuja commented on a change in pull request #2072:
URL: https://github.com/apache/hadoop/pull/2072#discussion_r445745291



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
##
@@ -144,6 +145,10 @@
   private final IdentityTransformerInterface identityTransformer;
   private final AbfsPerfTracker abfsPerfTracker;
 
+  /**
+   * The set of directories where we should store files as append blobs.
+   */
+  private Set appendBlobDirSet;

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ishaniahuja commented on a change in pull request #2072: HADOOP-17058. ABFS: Support for AppendBlob in Hadoop ABFS Driver

2020-06-23 Thread GitBox


ishaniahuja commented on a change in pull request #2072:
URL: https://github.com/apache/hadoop/pull/2072#discussion_r47047



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/ConfigurationKeys.java
##
@@ -59,6 +59,9 @@
   public static final String FS_AZURE_ENABLE_AUTOTHROTTLING = 
"fs.azure.enable.autothrottling";
   public static final String FS_AZURE_ALWAYS_USE_HTTPS = 
"fs.azure.always.use.https";
   public static final String FS_AZURE_ATOMIC_RENAME_KEY = 
"fs.azure.atomic.rename.key";
+  /** Provides a config to provide comma separated path prefixes on which 
Appendblob based files are created
+   *  Default is empty. **/
+  public static final String FS_AZURE_APPEND_BLOB_KEY = 
"fs.azure.appendblob.key";

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ishaniahuja commented on a change in pull request #2072: HADOOP-17058. ABFS: Support for AppendBlob in Hadoop ABFS Driver

2020-06-23 Thread GitBox


ishaniahuja commented on a change in pull request #2072:
URL: https://github.com/apache/hadoop/pull/2072#discussion_r47198



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java
##
@@ -323,6 +328,35 @@ private synchronized void writeCurrentBufferToService() 
throws IOException {
 final long offset = position;
 position += bytesLength;
 
+if (this.isAppendBlob) {

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ishaniahuja commented on a change in pull request #2072: HADOOP-17058. ABFS: Support for AppendBlob in Hadoop ABFS Driver

2020-06-23 Thread GitBox


ishaniahuja commented on a change in pull request #2072:
URL: https://github.com/apache/hadoop/pull/2072#discussion_r46862



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
##
@@ -1314,7 +1352,30 @@ private String 
convertXmsPropertiesToCommaSeparatedString(final Hashtable dirSet) {
+
+for (String dir : dirSet) {
+  if (dir.isEmpty() || key.startsWith(dir)) {
+return true;
+  }
+
+  try {
+URI uri = new URI(dir);
+if (null == uri.getAuthority()) {

Review comment:
   code is removed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ishaniahuja commented on a change in pull request #2072: HADOOP-17058. ABFS: Support for AppendBlob in Hadoop ABFS Driver

2020-06-15 Thread GitBox


ishaniahuja commented on a change in pull request #2072:
URL: https://github.com/apache/hadoop/pull/2072#discussion_r440095900



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
##
@@ -1314,7 +1352,30 @@ private String 
convertXmsPropertiesToCommaSeparatedString(final Hashtable dirSet) {
+
+for (String dir : dirSet) {
+  if (dir.isEmpty() || key.startsWith(dir)) {
+return true;
+  }
+
+  try {
+URI uri = new URI(dir);
+if (null == uri.getAuthority()) {
+  if (key.startsWith(dir + "/")){
+return true;
+  }
+}
+  } catch (URISyntaxException e) {
+LOG.info("URI syntax error creating URI for {}", dir);

Review comment:
   but what if only one comma separated value is incorrect, we would be 
throwing an exception adn crashing the app/jvm?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ishaniahuja commented on a change in pull request #2072: HADOOP-17058. ABFS Support for AppendBlob in Hadoop ABFS Driver

2020-06-15 Thread GitBox


ishaniahuja commented on a change in pull request #2072:
URL: https://github.com/apache/hadoop/pull/2072#discussion_r440081448



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java
##
@@ -389,6 +423,12 @@ private synchronized void 
flushWrittenBytesToServiceAsync() throws IOException {
 
   private synchronized void flushWrittenBytesToServiceInternal(final long 
offset,
   final boolean retainUncommitedData, final boolean isClose) throws 
IOException {
+
+// flush is not called for appendblob as is not needed
+if (this.isAppendBlob) {
+  return;

Review comment:
   this can lead to frequent log lines, every time a flush(), hflush(), 
hsync() is called. will that be ok?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ishaniahuja commented on a change in pull request #2072: HADOOP-17058. ABFS Support for AppendBlob in Hadoop ABFS Driver

2020-06-15 Thread GitBox


ishaniahuja commented on a change in pull request #2072:
URL: https://github.com/apache/hadoop/pull/2072#discussion_r440002446



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/ConfigurationKeys.java
##
@@ -59,6 +59,9 @@
   public static final String FS_AZURE_ENABLE_AUTOTHROTTLING = 
"fs.azure.enable.autothrottling";
   public static final String FS_AZURE_ALWAYS_USE_HTTPS = 
"fs.azure.always.use.https";
   public static final String FS_AZURE_ATOMIC_RENAME_KEY = 
"fs.azure.atomic.rename.key";
+  /** Provides a config to provide comma separated path prefixes on which 
Appendblob based files are created
+   *  Default is empty. **/
+  public static final String FS_AZURE_APPEND_BLOB_KEY = 
"fs.azure.appendblob.key";

Review comment:
   I have made is similar to FS_AZURE_ATOMIC_RENAME_KEY which also provide 
a set of directories. Further please note for appendblob, this is actually a 
prefix for the path (and not necessarily the directories). This is done so that 
the test suite (which all runs on a container with a randon guid can run) can 
run on appendblob based files. Let me know ur comments/thoughts here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ishaniahuja commented on a change in pull request #2072: HADOOP-17058. ABFS Support for AppendBlob in Hadoop ABFS Driver

2020-06-15 Thread GitBox


ishaniahuja commented on a change in pull request #2072:
URL: https://github.com/apache/hadoop/pull/2072#discussion_r440001201



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
##
@@ -1314,7 +1352,30 @@ private String 
convertXmsPropertiesToCommaSeparatedString(final Hashtable dirSet) {
+
+for (String dir : dirSet) {
+  if (dir.isEmpty() || key.startsWith(dir)) {
+return true;
+  }
+
+  try {
+URI uri = new URI(dir);
+if (null == uri.getAuthority()) {
+  if (key.startsWith(dir + "/")){
+return true;
+  }
+}
+  } catch (URISyntaxException e) {
+LOG.info("URI syntax error creating URI for {}", dir);

Review comment:
   this is used for every file being created, returning true or false. 
Raising an exception can be a problem.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org