[GitHub] [hadoop] brahmareddybattula commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS

2020-09-10 Thread GitBox


brahmareddybattula commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486548053



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java
##
@@ -77,6 +77,9 @@
   /** Returns true if the volume is NOT backed by persistent storage. */
   boolean isTransientStorage();

Review comment:
   So, NVDIMM is persistent storage and also RAM.








[GitHub] [hadoop] brahmareddybattula commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS

2020-09-10 Thread GitBox


brahmareddybattula commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486535639



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java
##
@@ -34,28 +34,35 @@
 @InterfaceStability.Unstable
 public enum StorageType {
   // sorted by the speed of the storage types, from fast to slow
-  RAM_DISK(true),
-  SSD(false),
-  DISK(false),
-  ARCHIVE(false),
-  PROVIDED(false);
+  RAM_DISK(true, true),
+  NVDIMM(false, true),
+  SSD(false, false),
+  DISK(false, false),
+  ARCHIVE(false, false),
+  PROVIDED(false, false);
 
   private final boolean isTransient;
+  private final boolean isRAM;
 
   public static final StorageType DEFAULT = DISK;
 
   public static final StorageType[] EMPTY_ARRAY = {};
 
   private static final StorageType[] VALUES = values();
 
-  StorageType(boolean isTransient) {
+  StorageType(boolean isTransient, boolean isRAM) {
 this.isTransient = isTransient;
+this.isRAM = isRAM;
   }
 
   public boolean isTransient() {
 return isTransient;
   }
 
+  public boolean isRAM() {
+return isRAM;
+  }

Review comment:
   OK, by design then, if you don't want it to be moved.
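   For illustration only (not part of the patch): a minimal sketch of how a
caller might read the two flags as defined in the diff above, assuming the
patched enum. With these values, NVDIMM reports isRAM() == true but
isTransient() == false, i.e. RAM-speed yet persistent; the demo class name is
hypothetical.
   ```java
   import org.apache.hadoop.fs.StorageType;

   // Hypothetical demo class: print whether each storage type is
   // memory-backed and whether its data survives a restart.
   public class StorageTypeFlagsDemo {
     public static void main(String[] args) {
       for (StorageType t : StorageType.values()) {
         System.out.println(t + " transient=" + t.isTransient()
             + " ram=" + t.isRAM());
       }
     }
   }
   ```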








[GitHub] [hadoop] goiri commented on a change in pull request #2296: HDFS-15568. namenode start failed to start when dfs.namenode.max.snapshot.limit set.

2020-09-10 Thread GitBox


goiri commented on a change in pull request #2296:
URL: https://github.com/apache/hadoop/pull/2296#discussion_r486579348



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
##
@@ -508,6 +508,10 @@ FSNamesystem getFSNamesystem() {
 return namesystem;
   }
 
+  public boolean isImageLoaded() {

Review comment:
   Add javadoc

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
##
@@ -368,6 +368,13 @@ void assertFirstSnapshot(INodeDirectory dir,
 }
   }
 
+  boolean captureOpenFiles() {

Review comment:
   javadoc

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
##
@@ -368,6 +368,13 @@ void assertFirstSnapshot(INodeDirectory dir,
 }
   }
 
+  boolean captureOpenFiles() {
+return captureOpenFiles;
+  }
+
+  int getMaxSnapshotLimit() {

Review comment:
   VisibleForTesting
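   For example, a sketch of the suggestion using the Guava annotation already
used across Hadoop; the holder class below is a stand-in for SnapshotManager,
not the real class:
   ```java
   import com.google.common.annotations.VisibleForTesting;

   class SnapshotLimitHolder {
     private final int maxSnapshotLimit = 2;

     /** Exposed only so tests can assert the configured limit. */
     @VisibleForTesting
     int getMaxSnapshotLimit() {
       return maxSnapshotLimit;
     }
   }
   ```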

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotManager.java
##
@@ -133,4 +137,68 @@ public void testValidateSnapshotIDWidth() throws Exception 
{
 getMaxSnapshotID() < Snapshot.CURRENT_STATE_ID);
   }
 
+  @Test
+  public void SnapshotLimitOnRestart() throws Exception {
+final Configuration conf = new Configuration();
+final Path snapshottableDir
+= new Path("/" + getClass().getSimpleName());
+int numSnapshots = 5;
+conf.setInt(DFSConfigKeys.
+DFS_NAMENODE_SNAPSHOT_MAX_LIMIT, numSnapshots);
+conf.setInt(DFSConfigKeys.DFS_NAMENODE_SNAPSHOT_FILESYSTEM_LIMIT,
+numSnapshots * 2);
+MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).
+numDataNodes(0).build();
+cluster.waitActive();
+DistributedFileSystem hdfs = cluster.getFileSystem();
+hdfs.mkdirs(snapshottableDir);
+hdfs.allowSnapshot(snapshottableDir);
+int i = 0;
+for (; i < numSnapshots; i++) {
+  hdfs.createSnapshot(snapshottableDir, "s" + i);
+}
+try {
+  hdfs.createSnapshot(snapshottableDir, "s" + i);
+  Assert.fail("Expected SnapshotException not thrown");
+} catch (SnapshotException se) {
+  Assert.assertTrue(
+  StringUtils.toLowerCase(se.getMessage()).contains(
+  "max snapshot limit"));
+}
+
+// now change max snapshot directory limit to 2 and restart namenode
+cluster.getNameNode().getConf().setInt(DFSConfigKeys.
+DFS_NAMENODE_SNAPSHOT_MAX_LIMIT, 2);
+cluster.restartNameNodes();
+
+// make sure edits of all previous 5 create snapshots are replayed
+Assert.assertEquals(numSnapshots, cluster.getNamesystem().
+getSnapshotManager().getNumSnapshots());
+
+// make sure namenode has the new snapshot limit configured as 2
+Assert.assertEquals(2,
+cluster.getNamesystem().getSnapshotManager().getMaxSnapshotLimit());
+
+// Any new snapshot creation should still fail
+try {
+  hdfs.createSnapshot(snapshottableDir, "s" + i);
+  Assert.fail("Expected SnapshotException not thrown");
+} catch (SnapshotException se) {
+  Assert.assertTrue(

Review comment:
   LambdaTestUtils
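   For example, a sketch of the suggestion, assuming
org.apache.hadoop.test.LambdaTestUtils: the try/catch/fail block above could
collapse into a single intercept call inside the same test method.
   ```java
   import org.apache.hadoop.test.LambdaTestUtils;

   // intercept() fails the test if no exception is thrown, and asserts that
   // the exception text contains the given substring.
   LambdaTestUtils.intercept(SnapshotException.class, "max snapshot limit",
       () -> hdfs.createSnapshot(snapshottableDir, "s" + i));
   ```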

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotManager.java
##
@@ -133,4 +137,68 @@ public void testValidateSnapshotIDWidth() throws Exception 
{
 getMaxSnapshotID() < Snapshot.CURRENT_STATE_ID);
   }
 
+  @Test
+  public void SnapshotLimitOnRestart() throws Exception {
+final Configuration conf = new Configuration();
+final Path snapshottableDir
+= new Path("/" + getClass().getSimpleName());
+int numSnapshots = 5;
+conf.setInt(DFSConfigKeys.
+DFS_NAMENODE_SNAPSHOT_MAX_LIMIT, numSnapshots);
+conf.setInt(DFSConfigKeys.DFS_NAMENODE_SNAPSHOT_FILESYSTEM_LIMIT,
+numSnapshots * 2);
+MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).
+numDataNodes(0).build();
+cluster.waitActive();
+DistributedFileSystem hdfs = cluster.getFileSystem();
+hdfs.mkdirs(snapshottableDir);
+hdfs.allowSnapshot(snapshottableDir);
+int i = 0;
+for (; i < numSnapshots; i++) {
+  hdfs.createSnapshot(snapshottableDir, "s" + i);
+}
+try {
+  hdfs.createSnapshot(snapshottableDir, "s" + i);
+  Assert.fail("Expected SnapshotException not thrown");
+} catch (SnapshotException se) {
+  Assert.assertTrue(
+  StringUtils.toLowerCase(se.getMessage()).contains(
+  "max snapshot limit"));
+}
+
+// now change max snapshot directory limit to 2 and restart namenode
+cluster.getNameNode().getConf().setInt(DFSConfigKeys.
+DFS_NAMENODE_SNAPSHOT_MAX_LIMIT, 2);
+cluster.restartNameNodes();
+
+// make sure 

[jira] [Work logged] (HADOOP-17244) HADOOP-17244. S3A directory delete tombstones dir markers prematurely.

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17244?focusedWorklogId=481541=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481541
 ]

ASF GitHub Bot logged work on HADOOP-17244:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 16:04
Start Date: 10/Sep/20 16:04
Worklog Time Spent: 10m 
  Work Description: steveloughran merged pull request #2280:
URL: https://github.com/apache/hadoop/pull/2280


   





Issue Time Tracking
---

Worklog Id: (was: 481541)
Time Spent: 1h 20m  (was: 1h 10m)

> HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
> --
>
> Key: HADOOP-17244
> URL: https://issues.apache.org/jira/browse/HADOOP-17244
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Test failure: 
> {{ITestS3AFileContextMainOperations#testRenameDirectoryAsNonExistentDirectory}}
> This is repeatable on -Dauth runs (we haven't been running them, have we?)
> Either its from the recent dir marker changes (initial hypothesis) or its 
> been lurking a while and not been picked up.






[GitHub] [hadoop] steveloughran merged pull request #2280: HADOOP-17244. S3A directory delete tombstones dir markers prematurely.

2020-09-10 Thread GitBox


steveloughran merged pull request #2280:
URL: https://github.com/apache/hadoop/pull/2280


   






[GitHub] [hadoop] brahmareddybattula commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS

2020-09-10 Thread GitBox


brahmareddybattula commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486549682



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java
##
@@ -34,28 +34,35 @@
 @InterfaceStability.Unstable
 public enum StorageType {
   // sorted by the speed of the storage types, from fast to slow
-  RAM_DISK(true),
-  SSD(false),
-  DISK(false),
-  ARCHIVE(false),
-  PROVIDED(false);
+  RAM_DISK(true, true),
+  NVDIMM(false, true),
+  SSD(false, false),
+  DISK(false, false),
+  ARCHIVE(false, false),
+  PROVIDED(false, false);
 
   private final boolean isTransient;
+  private final boolean isRAM;
 
   public static final StorageType DEFAULT = DISK;
 
   public static final StorageType[] EMPTY_ARRAY = {};
 
   private static final StorageType[] VALUES = values();
 
-  StorageType(boolean isTransient) {
+  StorageType(boolean isTransient, boolean isRAM) {
 this.isTransient = isTransient;
+this.isRAM = isRAM;
   }
 
   public boolean isTransient() {
 return isTransient;
   }
 
+  public boolean isRAM() {
+return isRAM;
+  }

Review comment:
   My final query then: why can't we have NVDIMM treated like SSD, since it is
also movable and persistent?








[GitHub] [hadoop] szetszwo commented on a change in pull request #2296: HDFS-15568. namenode start failed to start when dfs.namenode.max.snapshot.limit set.

2020-09-10 Thread GitBox


szetszwo commented on a change in pull request #2296:
URL: https://github.com/apache/hadoop/pull/2296#discussion_r486594478



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
##
@@ -448,22 +455,31 @@ public String createSnapshot(final LeaseManager 
leaseManager,
   "snapshot IDs and ID rollover is not supported.");
 }
 int n = numSnapshots.get();
-if (n >= maxSnapshotFSLimit) {
-  // We have reached the maximum snapshot limit
-  throw new SnapshotException(
-  "Failed to create snapshot: there are already " + (n + 1)
-  + " snapshot(s) and the max snapshot limit is "
-  + maxSnapshotFSLimit);
-}
-
-srcRoot.addSnapshot(snapshotCounter, snapshotName, leaseManager,
-this.captureOpenFiles, maxSnapshotLimit, mtime);
+checkSnapshotLimit(maxSnapshotFSLimit, n);
+srcRoot.addSnapshot(this, snapshotName, leaseManager, mtime);
   
 //create success, update id
 snapshotCounter++;
 numSnapshots.getAndIncrement();
 return Snapshot.getSnapshotPath(snapshotRoot, snapshotName);
   }
+
+  void checkSnapshotLimit(int limit, int numSnapshots)

Review comment:
   I suggest adding the limit type to the error message, as below.
   
   ```
   diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
   index 266c0a71241..7a47ab4000d 100644
   --- 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
   +++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
   @@ -190,8 +190,7 @@ public Snapshot addSnapshot(INodeDirectory snapshotRoot,
  + n + " snapshot(s) and the snapshot quota is "
  + snapshotQuota);
}
   -snapshotManager.checkSnapshotLimit(snapshotManager.
   -getMaxSnapshotLimit(), n);
   +snapshotManager.checkPerDirectorySnapshotLimit(n);
final Snapshot s = new Snapshot(id, name, snapshotRoot);
final byte[] nameBytes = s.getRoot().getLocalNameBytes();
final int i = searchSnapshot(nameBytes);
   diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
   index 0a2e18c3dc3..7c482074486 100644
   --- 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
   +++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
   @@ -455,7 +455,7 @@ public String createSnapshot(final LeaseManager 
leaseManager,
  "snapshot IDs and ID rollover is not supported.");
}
int n = numSnapshots.get();
   -checkSnapshotLimit(maxSnapshotFSLimit, n);
   +checkFileSystemSnapshotLimit(n);
srcRoot.addSnapshot(this, snapshotName, leaseManager, mtime);
  
//create success, update id
   @@ -464,12 +464,19 @@ public String createSnapshot(final LeaseManager 
leaseManager,
return Snapshot.getSnapshotPath(snapshotRoot, snapshotName);
  }

   -  void checkSnapshotLimit(int limit, int numSnapshots)
   -  throws SnapshotException {
   +  void checkFileSystemSnapshotLimit(int n) throws SnapshotException {
   +checkSnapshotLimit(maxSnapshotFSLimit, n, "file system");
   +  }
   +
   +  void checkPerDirectorySnapshotLimit(int n) throws SnapshotException {
   +checkSnapshotLimit(maxSnapshotLimit, n, "per directory");
   +  }
   +
   +  private void checkSnapshotLimit(int limit, int numSnapshots,
   +  String type) throws SnapshotException {
if (numSnapshots >= limit) {
   -  String msg = "there are already " + (numSnapshots + 1)
   -  + " snapshot(s) and the max snapshot limit is "
   -  + limit;
   +  String msg = "There are already " + (numSnapshots + 1)
   +  + " snapshot(s) and the " + type + " snapshot limit is " + limit;
  if (fsdir.isImageLoaded()) {
// We have reached the maximum snapshot limit
throw new SnapshotException(
   
   ```






[GitHub] [hadoop] viirya opened a new pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


viirya opened a new pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297


   See https://issues.apache.org/jira/browse/HADOOP-17125 for details.
   
   Discussed this offline with @dbtsai and submitted it based on #2201.
   
   






[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481670=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481670
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 20:01
Start Date: 10/Sep/20 20:01
Worklog Time Spent: 10m 
  Work Description: viirya opened a new pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297


   See https://issues.apache.org/jira/browse/HADOOP-17125 for details.
   
   Discussed this offline with @dbtsai and submitted it based on #2201.
   
   





Issue Time Tracking
---

Worklog Id: (was: 481670)
Time Spent: 1.5h  (was: 1h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].
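For illustration, a minimal sketch of the snappy-java API that the proposed
codec would delegate to, assuming only the org.xerial.snappy:snappy-java jar on
the classpath (no libsnappy install or java.library.path setup):

{code:java}
import java.nio.charset.StandardCharsets;
import org.xerial.snappy.Snappy;

public class SnappyJavaDemo {
  public static void main(String[] args) throws Exception {
    byte[] input = "hello snappy".getBytes(StandardCharsets.UTF_8);
    // The bundled native library is loaded automatically from the jar; per
    // the description above, snappy-java can fall back to pure Java if no
    // matching native binary is found.
    byte[] compressed = Snappy.compress(input);
    byte[] restored = Snappy.uncompress(compressed);
    System.out.println(new String(restored, StandardCharsets.UTF_8));
  }
}
{code}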






[GitHub] [hadoop] brahmareddybattula commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS

2020-09-10 Thread GitBox


brahmareddybattula commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486545111



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java
##
@@ -77,6 +77,9 @@
   /** Returns true if the volume is NOT backed by persistent storage. */
   boolean isTransientStorage();

Review comment:
   Ok. Got it.








[jira] [Resolved] (HADOOP-17244) HADOOP-17244. S3A directory delete tombstones dir markers prematurely.

2020-09-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-17244.
-
Fix Version/s: 3.3.1
   Resolution: Fixed

> HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
> --
>
> Key: HADOOP-17244
> URL: https://issues.apache.org/jira/browse/HADOOP-17244
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.3.1
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Test failure: 
> {{ITestS3AFileContextMainOperations#testRenameDirectoryAsNonExistentDirectory}}
> This is repeatable on -Dauth runs (we haven't been running them, have we?)
> Either its from the recent dir marker changes (initial hypothesis) or its 
> been lurking a while and not been picked up.






[jira] [Commented] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10

2020-09-10 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193707#comment-17193707
 ] 

Mingliang Liu commented on HADOOP-17254:


+1

Thanks!

> Upgrade hbase to 1.4.13 on branch-2.10
> --
>
> Key: HADOOP-17254
> URL: https://issues.apache.org/jira/browse/HADOOP-17254
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> hbase.version must be updated to address CVE-2018-8025 on branch-2.10.






[GitHub] [hadoop] steveloughran commented on pull request #2280: HADOOP-17244. S3A directory delete tombstones dir markers prematurely.

2020-09-10 Thread GitBox


steveloughran commented on pull request #2280:
URL: https://github.com/apache/hadoop/pull/2280#issuecomment-690402713


   Repeatedly tested against London with the options that show the error, with
the -Dkeep option, *and unguarded*. Some transient failures related to local
network issues are addressed in HADOOP-17181.






[jira] [Work logged] (HADOOP-17244) HADOOP-17244. S3A directory delete tombstones dir markers prematurely.

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17244?focusedWorklogId=481539=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481539
 ]

ASF GitHub Bot logged work on HADOOP-17244:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 16:01
Start Date: 10/Sep/20 16:01
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2280:
URL: https://github.com/apache/hadoop/pull/2280#issuecomment-690402713


   Repeatedly tested against London with the options that show the error, with
the -Dkeep option, *and unguarded*. Some transient failures related to local
network issues are addressed in HADOOP-17181.





Issue Time Tracking
---

Worklog Id: (was: 481539)
Time Spent: 1h 10m  (was: 1h)

> HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
> --
>
> Key: HADOOP-17244
> URL: https://issues.apache.org/jira/browse/HADOOP-17244
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Test failure: 
> {{ITestS3AFileContextMainOperations#testRenameDirectoryAsNonExistentDirectory}}
> This is repeatable on -Dauth runs (we haven't been running them, have we?)
> Either its from the recent dir marker changes (initial hypothesis) or its 
> been lurking a while and not been picked up.






[GitHub] [hadoop] viirya commented on a change in pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486686483



##
File path: 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
##
@@ -1,166 +0,0 @@
-/*

Review comment:
   Hmm, because we removed the native methods in the Java files, I think we no
longer generate the .h files needed for compilation:
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
   
   ```
   [WARNING] /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-2297/src/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.c:32:10: fatal error: org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h: No such file or directory
   [WARNING]  #include "org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h"
   [WARNING]   ^~~
   [WARNING] compilation terminated.
   ```








[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481781=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481781
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 23:23
Start Date: 10/Sep/20 23:23
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486686483



##
File path: 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
##
@@ -1,166 +0,0 @@
-/*

Review comment:
   Hmm, because we removed the native methods in the Java files, I think we no
longer generate the .h files needed for compilation:
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
   
   ```
   [WARNING] /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-2297/src/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.c:32:10: fatal error: org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h: No such file or directory
   [WARNING]  #include "org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h"
   [WARNING]   ^~~
   [WARNING] compilation terminated.
   ```







Issue Time Tracking
---

Worklog Id: (was: 481781)
Time Spent: 2.5h  (was: 2h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].






[GitHub] [hadoop] fengnanli commented on pull request #2266: HDFS-15554. RBF: force router check file existence in destinations before adding/updating mount points

2020-09-10 Thread GitBox


fengnanli commented on pull request #2266:
URL: https://github.com/apache/hadoop/pull/2266#issuecomment-690809794


   The TestRouterRpcMultiDestination test passed locally. @goiri Can you help 
commit it? Thanks a lot!






[GitHub] [hadoop] dbtsai closed pull request #2201: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


dbtsai closed pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201


   






[GitHub] [hadoop] smengcl commented on a change in pull request #2258: HDFS-15539. When disallowing snapshot on a dir, throw exception if its trash root is not empty

2020-09-10 Thread GitBox


smengcl commented on a change in pull request #2258:
URL: https://github.com/apache/hadoop/pull/2258#discussion_r486694935



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
##
@@ -2442,4 +2442,38 @@ public void testGetTrashRootOnEZInSnapshottableDir()
   }
 }
   }
+
+  @Test
+  public void testDisallowSnapshotShouldThrowWhenTrashRootExists()
+  throws IOException {
+Configuration conf = getTestConfiguration();
+MiniDFSCluster cluster =
+new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+try {
+  DistributedFileSystem dfs = cluster.getFileSystem();
+  Path testDir = new Path("/disallowss/test1/");
+  Path file0path = new Path(testDir, "file-0");
+  dfs.create(file0path);
+  dfs.allowSnapshot(testDir);
+  // Create trash root manually
+  Path testDirTrashRoot = new Path(testDir, FileSystem.TRASH_PREFIX);
+  dfs.mkdirs(testDirTrashRoot);
+  // Try disallowing snapshot, should throw
+  try {
+dfs.disallowSnapshot(testDir);
+fail("Should have thrown IOException when trash root exists inside "

Review comment:
   Thanks! I have updated accordingly.








[GitHub] [hadoop] umamaheswararao opened a new pull request #2298: HDFS-15532: listFiles on root/InternalDir will fail if fallback root has file.

2020-09-10 Thread GitBox


umamaheswararao opened a new pull request #2298:
URL: https://github.com/apache/hadoop/pull/2298


   https://issues.apache.org/jira/browse/HDFS-15532






[GitHub] [hadoop] hadoop-yetus commented on pull request #2266: HDFS-15554. RBF: force router check file existence in destinations before adding/updating mount points

2020-09-10 Thread GitBox


hadoop-yetus commented on pull request #2266:
URL: https://github.com/apache/hadoop/pull/2266#issuecomment-690766821


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  33m 24s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
4 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  29m 21s |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   0m 36s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 27s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 26s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 13s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 10s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 31s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   0m 32s |  
hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 0 new + 30 
unchanged - 2 fixed = 30 total (was 32)  |
   | +1 :green_heart: |  compile  |   0m 28s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 28s |  
hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 0 new 
+ 30 unchanged - 2 fixed = 30 total (was 32)  |
   | +1 :green_heart: |  checkstyle  |   0m 17s |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 30s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  2s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  13m 56s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 14s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |   8m 39s |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 114m 33s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2266/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2266 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 490356438ed4 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9960c01a25c |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | unit | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2266/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2266/6/testReport/ |
   | Max. process+thread count | 2936 (vs. 

[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481873=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481873
 ]

ASF GitHub Bot logged work on HADOOP-17222:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 03:25
Start Date: 11/Sep/20 03:25
Worklog Time Spent: 10m 
  Work Description: liuml07 commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690850909


   I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and ` 
TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux 
machine with the patch. Will confirm they are not related.
   
   If no objections, I'll commit later this week. Thanks,





Issue Time Tracking
---

Worklog Id: (was: 481873)
Time Spent: 2h  (was: 1h 50m)

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Note:Not only the hdfs client can get the current benefit, all callers of 
> NetUtils.createSocketAddr will get the benefit. Just use hdfs client as an 
> example.
>  
> Hdfs client selects best DN for hdfs Block. method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
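For illustration, a minimal sketch of the caching idea described above (the
class and method names here are hypothetical, not the actual patch): key the
cache by host:port and reuse the resolved InetSocketAddress instead of
re-running URI parsing on every lookup.

{code:java}
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

public class SocketAddressCache {
  // host:port -> resolved address; a ConcurrentHashMap lookup is far cheaper
  // than URI.create() plus address construction on every block read.
  private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
      new ConcurrentHashMap<>();

  public static InetSocketAddress get(String host, int port) {
    return CACHE.computeIfAbsent(host + ":" + port,
        k -> new InetSocketAddress(host, port));
  }
}
{code}

A real cache would also need some expiry or invalidation so that entries can
follow DNS changes.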






[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481778=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481778
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 23:19
Start Date: 10/Sep/20 23:19
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486685241



##
File path: 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
##
@@ -1,166 +0,0 @@
-/*

Review comment:
   Per https://github.com/apache/hadoop/pull/2201#issuecomment-681687572, is
that native code used in `hadoop-mapreduce-client-nativetask`? If so, we
probably need to keep it for now.







Issue Time Tracking
---

Worklog Id: (was: 481778)
Time Spent: 2h 10m  (was: 2h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].






[jira] [Updated] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10

2020-09-10 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HADOOP-17254:
--
Fix Version/s: 2.10.1
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks, [~liuml07]. I merged this.

> Upgrade hbase to 1.4.13 on branch-2.10
> --
>
> Key: HADOOP-17254
> URL: https://issues.apache.org/jira/browse/HADOOP-17254
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.10.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> hbase.version must be updated to address CVE-2018-8025 on branch-2.10.






[jira] [Work logged] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17254?focusedWorklogId=481797=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481797
 ]

ASF GitHub Bot logged work on HADOOP-17254:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 23:54
Start Date: 10/Sep/20 23:54
Worklog Time Spent: 10m 
  Work Description: iwasakims commented on pull request #2290:
URL: https://github.com/apache/hadoop/pull/2290#issuecomment-690791721


   Thanks, @liuml07. I merged this.





Issue Time Tracking
---

Worklog Id: (was: 481797)
Time Spent: 50m  (was: 40m)

> Upgrade hbase to 1.4.13 on branch-2.10
> --
>
> Key: HADOOP-17254
> URL: https://issues.apache.org/jira/browse/HADOOP-17254
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> hbase.version must be updated to address CVE-2018-8025 on branch-2.10.






[GitHub] [hadoop] iwasakims merged pull request #2290: HADOOP-17254. Upgrade hbase to 1.4.13 on branch-2.10.

2020-09-10 Thread GitBox


iwasakims merged pull request #2290:
URL: https://github.com/apache/hadoop/pull/2290


   






[GitHub] [hadoop] liuml07 commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache

2020-09-10 Thread GitBox


liuml07 commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690850909


   I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and ` 
TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux 
machine with the patch. Will confirm they are not related.
   
   If no objections, I'll commit later this week. Thanks,






[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481876=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481876
 ]

ASF GitHub Bot logged work on HADOOP-17222:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 03:44
Start Date: 11/Sep/20 03:44
Worklog Time Spent: 10m 
  Work Description: 1996fanrui commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690855787


   > I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and ` 
TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux 
machine with the patch. Will confirm they are not related.
   > 
   > If no objections, I'll commit later this week. Thanks,
   @liuml07, thanks for testing.
   I ran these two unit tests and they still pass. In the previous Hadoop QA
runs, TestMultipleNNPortQOP did not fail; it only failed in the last run. But
in that last run I only pushed an empty commit, so my code did not change.





Issue Time Tracking
---

Worklog Id: (was: 481876)
Time Spent: 2h 10m  (was: 2h)

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Note:Not only the hdfs client can get the current benefit, all callers of 
> NetUtils.createSocketAddr will get the benefit. Just use hdfs client as an 
> example.
>  
> Hdfs client selects best DN for hdfs Block. method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[GitHub] [hadoop] 1996fanrui commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache

2020-09-10 Thread GitBox


1996fanrui commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690855787


   > I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and ` 
TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux 
machine with the patch. Will confirm they are not related.
   > 
   > If no objections, I'll commit later this week. Thanks,
   @liuml07 , thanks for running the tests.
   I ran these two unit tests and they still pass. In the previous Hadoop QA 
bot runs, TestMultipleNNPortQOP did not fail; it only failed in the last run. 
But that last run was triggered by an empty commit; my code did not change.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] dbtsai commented on pull request #2201: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


dbtsai commented on pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201#issuecomment-690780546


   Closing this PR in favor of https://github.com/apache/hadoop/pull/2297



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481776
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 23:16
Start Date: 10/Sep/20 23:16
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201#issuecomment-690780546


   Closing this PR in favor of https://github.com/apache/hadoop/pull/2297



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481776)
Time Spent: 2h  (was: 1h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in the system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments, which adds huge 
> complexity from a deployment point of view. In some environments, it requires 
> compiling the natives from sources, which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work on a different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results in higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in the jar 
> file, and it can automatically load the native binaries into the JVM from the 
> jar without any setup. If a native implementation cannot be found for a 
> platform, it can fall back to a pure-Java implementation of snappy based on 
> [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].
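
As an illustration of how little setup the dependency needs (a sketch of plain 
snappy-java usage, not the Hadoop SnappyCodec wiring itself):

{code:java}
import java.nio.charset.StandardCharsets;

import org.xerial.snappy.Snappy;

public class SnappyJavaRoundTrip {
  public static void main(String[] args) throws Exception {
    byte[] input = "hello snappy-java".getBytes(StandardCharsets.UTF_8);
    // The native library bundled inside the snappy-java jar is loaded
    // automatically; no libhadoop/libsnappy or java.library.path setup.
    byte[] compressed = Snappy.compress(input);
    byte[] restored = Snappy.uncompress(compressed);
    System.out.println(new String(restored, StandardCharsets.UTF_8));
  }
}
{code}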



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481775=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481775
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 23:16
Start Date: 10/Sep/20 23:16
Worklog Time Spent: 10m 
  Work Description: dbtsai closed pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481775)
Time Spent: 1h 50m  (was: 1h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in the system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments, which adds huge 
> complexity from a deployment point of view. In some environments, it requires 
> compiling the natives from sources, which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work on a different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results in higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in the jar 
> file, and it can automatically load the native binaries into the JVM from the 
> jar without any setup. If a native implementation cannot be found for a 
> platform, it can fall back to a pure-Java implementation of snappy based on 
> [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hadoop-yetus commented on pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690812236


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 36s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 46s |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 28s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m  8s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 41s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  1s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 48s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 15s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 37s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  the patch passed  |
   | -1 :x: |  compile  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  cc  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  golang  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  javac  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  compile  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  cc  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  golang  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  javac  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   2m 21s |  root: The patch generated 1 new 
+ 151 unchanged - 5 fixed = 152 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  3s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  13m 48s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 19s |  hadoop-project has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 18s |  hadoop-project in the patch 
passed.  |
   | -1 :x: |  unit  |   0m 40s |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 125m 50s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient xml findbugs checkstyle cc golang |
   | uname | Linux 6a174de3a925 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9960c01a25c |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481835
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 01:07
Start Date: 11/Sep/20 01:07
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690812236


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 36s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 46s |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 28s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m  8s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 41s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  1s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 48s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 15s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 37s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  the patch passed  |
   | -1 :x: |  compile  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  cc  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  golang  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  javac  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  compile  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  cc  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  golang  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  javac  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   2m 21s |  root: The patch generated 1 new 
+ 151 unchanged - 5 fixed = 152 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  3s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  13m 48s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 19s |  hadoop-project has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 18s |  hadoop-project in the patch 
passed.  |
   | -1 :x: |  unit  |   0m 40s |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 125m 50s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient xml findbugs checkstyle cc golang |
   | uname | Linux 6a174de3a925 

[jira] [Commented] (HADOOP-17257) pid file delete when service stop (secure datanode ) show cat no directory

2020-09-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193954#comment-17193954
 ] 

zhuqi commented on HADOOP-17257:


cc [~boky01] 

Yeah, but I am not a contributor to the hadoop-common module, so I opened a 
new issue. I also fixed the case where the datanode is not running, in which 
the original code shows the cat error as well.

Thanks.

> pid file delete when service stop (secure datanode ) show cat no directory
> --
>
> Key: HADOOP-17257
> URL: https://issues.apache.org/jira/browse/HADOOP-17257
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts, security
>Affects Versions: 3.4.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HADOOP-17257-0.0.1.patch
>
>
> When stopping a running secure datanode,
> cat reports a "no directory" error.
>  
> When stopping a secure datanode that is not running,
> cat also reports a "no pid directory" error.
>  
> Both cases are unreasonable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481753=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481753
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 22:29
Start Date: 10/Sep/20 22:29
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690765839


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 41s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 34s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  33m  4s |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 59s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  21m 36s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 14s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 10s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 22s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  9s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 16s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 38s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  6s |  the patch passed  |
   | -1 :x: |  compile  |   1m  6s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  javac  |   1m  6s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  compile  |   1m  0s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  javac  |   1m  0s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   2m 20s |  root: The patch generated 1 new 
+ 151 unchanged - 5 fixed = 152 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  3s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m 20s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 19s |  hadoop-project has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 18s |  hadoop-project in the patch 
passed.  |
   | -1 :x: |  unit  |   0m 42s |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 146m 43s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient xml findbugs checkstyle |
   | uname | Linux 205df60c0f1e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9960c01a25c |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | compile | 

[GitHub] [hadoop] hadoop-yetus commented on pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690765839


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 41s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 34s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  33m  4s |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 59s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  21m 36s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 14s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 10s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 22s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  9s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 16s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 38s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  6s |  the patch passed  |
   | -1 :x: |  compile  |   1m  6s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  javac  |   1m  6s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  compile  |   1m  0s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  javac  |   1m  0s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   2m 20s |  root: The patch generated 1 new 
+ 151 unchanged - 5 fixed = 152 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  3s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m 20s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 19s |  hadoop-project has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 18s |  hadoop-project in the patch 
passed.  |
   | -1 :x: |  unit  |   0m 42s |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 146m 43s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient xml findbugs checkstyle |
   | uname | Linux 205df60c0f1e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9960c01a25c |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | compile | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/patch-compile-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt
 |
   | javac | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/patch-compile-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt
 |
   | compile | 

[GitHub] [hadoop] iwasakims commented on pull request #2290: HADOOP-17254. Upgrade hbase to 1.4.13 on branch-2.10.

2020-09-10 Thread GitBox


iwasakims commented on pull request #2290:
URL: https://github.com/apache/hadoop/pull/2290#issuecomment-690791721


   Thanks, @liuml07. I merged this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17254?focusedWorklogId=481796=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481796
 ]

ASF GitHub Bot logged work on HADOOP-17254:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 23:54
Start Date: 10/Sep/20 23:54
Worklog Time Spent: 10m 
  Work Description: iwasakims merged pull request #2290:
URL: https://github.com/apache/hadoop/pull/2290


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481796)
Time Spent: 40m  (was: 0.5h)

> Upgrade hbase to 1.4.13 on branch-2.10
> --
>
> Key: HADOOP-17254
> URL: https://issues.apache.org/jira/browse/HADOOP-17254
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> hbase.version must be updated to address CVE-2018-8025 on branch-2.10.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] YaYun-Wang commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS

2020-09-10 Thread GitBox


YaYun-Wang commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486716810



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java
##
@@ -77,6 +77,9 @@
   /** Returns true if the volume is NOT backed by persistent storage. */
   boolean isTransientStorage();

Review comment:
   > So, NVDIMM is peristent storage and RAM.
   
   yes, that’s right.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] YaYun-Wang commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS

2020-09-10 Thread GitBox


YaYun-Wang commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486716471



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java
##
@@ -34,28 +34,35 @@
 @InterfaceStability.Unstable
 public enum StorageType {
   // sorted by the speed of the storage types, from fast to slow
-  RAM_DISK(true),
-  SSD(false),
-  DISK(false),
-  ARCHIVE(false),
-  PROVIDED(false);
+  RAM_DISK(true, true),
+  NVDIMM(false, true),
+  SSD(false, false),
+  DISK(false, false),
+  ARCHIVE(false, false),
+  PROVIDED(false, false);
 
   private final boolean isTransient;
+  private final boolean isRAM;
 
   public static final StorageType DEFAULT = DISK;
 
   public static final StorageType[] EMPTY_ARRAY = {};
 
   private static final StorageType[] VALUES = values();
 
-  StorageType(boolean isTransient) {
+  StorageType(boolean isTransient, boolean isRAM) {
 this.isTransient = isTransient;
+this.isRAM = isRAM;
   }
 
   public boolean isTransient() {
 return isTransient;
   }
 
+  public boolean isRAM() {
+return isRAM;
+  }

Review comment:
   > My final query then, why can't have one NVDIMM like one SSD as this 
also movable and peristent..?
   
   
   Since NVDIMM is faster, by design it does not use `FsDatasetCache()`, 
which SSD needs.
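
For illustration only, a hypothetical helper (not part of this patch) showing 
how the two flags separate the cases:

{code:java}
import org.apache.hadoop.fs.StorageType;

// Hypothetical helper: the two flags separate RAM_DISK (transient, RAM)
// from NVDIMM (persistent, RAM) and from SSD/DISK/ARCHIVE (persistent,
// not RAM), per the enum values in the diff above.
public final class StorageTypeChecks {
  private StorageTypeChecks() {
  }

  static boolean losesDataOnRestart(StorageType type) {
    // Only RAM_DISK is transient.
    return type.isTransient();
  }

  static boolean isMemorySpeed(StorageType type) {
    // True for both RAM_DISK and NVDIMM.
    return type.isRAM();
  }
}
{code}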





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] liuml07 commented on pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS

2020-09-10 Thread GitBox


liuml07 commented on pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#issuecomment-690839476


   Will check again later this week.
   
   Ideally we can get a clean QA. Could you check the test failures and make 
sure they are not related? Thanks,



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hadoop-yetus commented on pull request #2258: HDFS-15539. When disallowing snapshot on a dir, throw exception if its trash root is not empty

2020-09-10 Thread GitBox


hadoop-yetus commented on pull request #2258:
URL: https://github.com/apache/hadoop/pull/2258#issuecomment-690845948


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   1m  7s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
1 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 28s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 49s |  trunk passed  |
   | +1 :green_heart: |  compile  |   4m 37s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   4m 15s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   1m  2s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m  0s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 56s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 28s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 38s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 56s |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 10s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   4m 10s |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 46s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   3m 46s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 59s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  shadedclient  |  15m 50s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 47s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   7m  3s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 18s |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  |  77m 54s |  hadoop-hdfs in the patch passed.  |
   | +0 :ok: |  asflicense  |   0m 38s |  ASF License check generated no 
output?  |
   |  |   | 194m 25s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestReencryption |
   |   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
   |   | hadoop.hdfs.server.namenode.TestNameNodeReconfigure |
   |   | hadoop.hdfs.server.namenode.TestNamenodeStorageDirectives |
   |   | hadoop.hdfs.server.namenode.TestNameEditsConfigs |
   |   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
   |   | hadoop.hdfs.server.namenode.TestFileLimit |
   |   | 
hadoop.hdfs.server.namenode.sps.TestStoragePolicySatisfierWithStripedFile |
   |   | hadoop.hdfs.TestAppendDifferentChecksum |
   |   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
   |   | hadoop.hdfs.server.namenode.TestCheckpoint |
   |   | hadoop.hdfs.server.namenode.TestMetadataVersionOutput |
   |   | 
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
 |
   |   | hadoop.hdfs.server.namenode.TestFSImageWithXAttr |
   |   | 
hadoop.hdfs.server.namenode.TestQuotaWithStripedBlocksWithRandomECPolicy |
   |   | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier |
   |   | hadoop.hdfs.TestDFSStripedInputStream |
   |   | hadoop.hdfs.server.namenode.TestStoragePolicySatisfierWithHA |
   |   | hadoop.hdfs.server.datanode.TestDataNodeLifeline |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
   |   | hadoop.hdfs.server.datanode.TestBatchIbr |
   |   | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
   |   | 

[GitHub] [hadoop] dbtsai commented on a change in pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486685241



##
File path: 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
##
@@ -1,166 +0,0 @@
-/*

Review comment:
   Per https://github.com/apache/hadoop/pull/2201#issuecomment-681687572, is 
this native code used in `hadoop-mapreduce-client-nativetask`? If so, we 
probably need to keep it for now.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481779=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481779
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 23:20
Start Date: 10/Sep/20 23:20
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690781897


   Thanks @viirya for taking over my https://github.com/apache/hadoop/pull/2201 
and continuing the work on it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481779)
Time Spent: 2h 20m  (was: 2h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in the system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments, which adds huge 
> complexity from a deployment point of view. In some environments, it requires 
> compiling the natives from sources, which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work on a different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results in higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in the jar 
> file, and it can automatically load the native binaries into the JVM from the 
> jar without any setup. If a native implementation cannot be found for a 
> platform, it can fall back to a pure-Java implementation of snappy based on 
> [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] dbtsai commented on pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690781897


   Thanks @viirya for taking over my https://github.com/apache/hadoop/pull/2201 
and continuing the work on it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481786=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481786
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 23:30
Start Date: 10/Sep/20 23:30
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486688702



##
File path: 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
##
@@ -1,166 +0,0 @@
-/*

Review comment:
   Btw, I don't see them being used in `hadoop-mapreduce-client-nativetask`, 
unless I missed something. Let's wait for the build and tests.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481786)
Time Spent: 2h 40m  (was: 2.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in the system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments, which adds huge 
> complexity from a deployment point of view. In some environments, it requires 
> compiling the natives from sources, which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work on a different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results in higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in the jar 
> file, and it can automatically load the native binaries into the JVM from the 
> jar without any setup. If a native implementation cannot be found for a 
> platform, it can fall back to a pure-Java implementation of snappy based on 
> [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] viirya commented on a change in pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec

2020-09-10 Thread GitBox


viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486688702



##
File path: 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
##
@@ -1,166 +0,0 @@
-/*

Review comment:
   Btw, I don't see them being used in `hadoop-mapreduce-client-nativetask`, 
unless I missed something. Let's wait for the build and tests.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] huangtianhua commented on pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS

2020-09-10 Thread GitBox


huangtianhua commented on pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#issuecomment-690829884


   @liuml07 and @brahmareddybattula So if the code is all OK, would you 
please approve? Thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] smengcl commented on pull request #2258: HDFS-15539. When disallowing snapshot on a dir, throw exception if its trash root is not empty

2020-09-10 Thread GitBox


smengcl commented on pull request #2258:
URL: https://github.com/apache/hadoop/pull/2258#issuecomment-690847462


   > Thanks @smengcl for working on this,. The test failures like 
TestDistributedFileSystem#testGetTrashRoots look related. Can you plz verify?
   
   I'm checking. Probably need to add a line or two to clean up the old tests.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hemanthboyina commented on pull request #2267: HDFS-15555. RBF: Refresh cacheNS when SocketException occurs.

2020-09-10 Thread GitBox


hemanthboyina commented on pull request #2267:
URL: https://github.com/apache/hadoop/pull/2267#issuecomment-690878265


   Any update here @aajisaka? HDFS-15543 is modifying the same part of the code.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=481883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481883
 ]

ASF GitHub Bot logged work on HADOOP-15891:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 04:17
Start Date: 11/Sep/20 04:17
Worklog Time Spent: 10m 
  Work Description: umamaheswararao commented on pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690863729


   +1 on the latest patch. Thanks again for your great work @JohnZZGithub 
   
   Test failures are unrelated.
   I think we are 3 lines over the limit, hence the warning. It doesn't look 
worth a minor refactor just for that unless we fully refactor that method. 
Please file a separate issue to refactor it into cleaner, smaller methods. I 
will proceed to commit the current patch. 
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481883)
Time Spent: 8h 40m  (was: 8.5h)

> Provide Regex Based Mount Point In Inode Tree
> -
>
> Key: HADOOP-15891
> URL: https://issues.apache.org/jira/browse/HADOOP-15891
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: viewfs
>Reporter: zhenzhao wang
>Assignee: zhenzhao wang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, 
> HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, 
> HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, 
> HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, 
> HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ 
> Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount 
> Table-v1.pdf
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> This jira is created to support regex based mount points in the Inode Tree. We 
> noticed that mount points only support a fixed target path. However, we might 
> have use cases where the target needs to refer to some fields from the source. 
> e.g. We might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, 
> i.e. we want to refer to the `cluster` and `user` fields in the source to 
> construct the target. It's impossible to achieve this with the current link 
> type. Though we could set up a one-to-one mapping, the mount table would become 
> bloated if we have thousands of users. Besides, a regex mapping would give us 
> more flexibility. So we are going to build a regex based mount point whose 
> target can refer to groups from the source regex mapping. 
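
A minimal sketch of the rewriting idea (illustrative only; the pattern and 
template syntax here are assumptions, not the actual ViewFs implementation):

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMountSketch {
  public static void main(String[] args) {
    // Source paths like /cluster1/user1 are matched with named groups...
    Pattern src = Pattern.compile("^/(?<cluster>\\w+)/(?<user>\\w+)$");
    // ...and the captured fields are substituted into the target template.
    String targetTemplate = "/${cluster}-dc1/user-nn-${user}";

    Matcher m = src.matcher("/cluster1/user1");
    if (m.matches()) {
      String target = targetTemplate
          .replace("${cluster}", m.group("cluster"))
          .replace("${user}", m.group("user"));
      System.out.println(target);  // prints /cluster1-dc1/user-nn-user1
    }
  }
}
{code}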



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17222) Create socket address leveraging URI cache

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481898=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481898
 ]

ASF GitHub Bot logged work on HADOOP-17222:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 05:34
Start Date: 11/Sep/20 05:34
Worklog Time Spent: 10m 
  Work Description: liuml07 commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690885275


   Failing tests are not related and all pass locally on my laptop, except 
`TestNameNodeRetryCacheMetrics`, which is known to be flaky; see 
[HDFS-15458](https://issues.apache.org/jira/browse/HDFS-15458)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481898)
Time Spent: 2h 40m  (was: 2.5h)

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Note: not only the hdfs client gets this benefit; all callers of 
> NetUtils.createSocketAddr benefit. The hdfs client is just used as an 
> example.
>  
> Hdfs client selects best DN for hdfs Block. method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
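
A minimal sketch of the caching idea described above, assuming a plain ConcurrentHashMap keyed by "host:port"; the class and method names here are hypothetical and are not the actual HADOOP-17222 patch:

{code:java}
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of caching InetSocketAddress instances by "host:port". */
public final class CachedSocketAddrFactory {
  // Cache key is "host:port"; value is the created InetSocketAddress.
  private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
      new ConcurrentHashMap<>();

  public static InetSocketAddress get(String host, int port) {
    // computeIfAbsent avoids re-creating (and re-resolving) the address for
    // hot host:port pairs, which is the expensive path seen in the flame
    // graphs referenced above.
    return CACHE.computeIfAbsent(host + ":" + port,
        k -> new InetSocketAddress(host, port));
  }

  private CachedSocketAddrFactory() {
  }
}
{code}

One trade-off of this approach: a cached InetSocketAddress keeps its resolved IP, so entries can go stale if DNS mappings change, and a real implementation would need an eviction or refresh policy.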



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17222) Create socket address leveraging URI cache

2020-09-10 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17222:
---
Summary:  Create socket address leveraging URI cache  (was: Create socket 
address combined with cache to speed up hdfs client choose DataNode)

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of 
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an 
> example.
>  
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] liuml07 commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache

2020-09-10 Thread GitBox


liuml07 commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690885275


   Failing tests are not related and all pass locally on my laptop except 
`TestNameNodeRetryCacheMetrics`, which is a known flaky test; see 
[HDFS-15458](https://issues.apache.org/jira/browse/HDFS-15458)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] umamaheswararao commented on pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread GitBox


umamaheswararao commented on pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690863729


   +1 on the latest patch. Thanks again for your great work @JohnZZGithub 
   
   Test failures are unrelated.
   I think we are 3 lines over the limit for a warning. It doesn't look worth a 
minor refactor just for that unless we do a full refactoring of that method. 
Please file a separate issue to refactor it into smaller, cleaner methods. I 
will proceed to commit the current patch. 
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HADOOP-15891:
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~wzzdreamer] for the nice work here. I have committed this patch to 
trunk !!

> Provide Regex Based Mount Point In Inode Tree
> -
>
> Key: HADOOP-15891
> URL: https://issues.apache.org/jira/browse/HADOOP-15891
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: viewfs
>Reporter: zhenzhao wang
>Assignee: zhenzhao wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, 
> HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, 
> HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, 
> HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, 
> HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ 
> Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount 
> Table-v1.pdf
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> This jira is created to support regex based mount points in the Inode Tree. We 
> noticed that mount points only support a fixed target path. However, we might 
> have use cases where the target needs to refer to some fields from the source, 
> e.g. we might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, 
> where we refer to the `cluster` and `user` fields in the source to construct the 
> target. It's impossible to achieve this with the current link type. Though we 
> could set up a one-to-one mapping, the mount table would become bloated if we 
> have thousands of users. Besides, a regex mapping gives us more flexibility. So 
> we are going to build a regex based mount point whose target can refer to 
> groups from the source regex mapping. 
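
A minimal sketch of the requested behaviour, assuming java.util.regex with named capture groups; the class below is a hypothetical illustration, not the ViewFs implementation:

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of a regex mount rule with capture groups. */
public class RegexMountRuleDemo {
  public static void main(String[] args) {
    // Source pattern with named groups; the target template refers to them.
    Pattern src = Pattern.compile("^/(?<cluster>\\w+)/(?<user>\\w+)");
    String targetTemplate = "/${cluster}-dc1/user-nn-${user}";

    Matcher m = src.matcher("/cluster1/user1");
    if (m.find()) {
      String target = targetTemplate
          .replace("${cluster}", m.group("cluster"))
          .replace("${user}", m.group("user"));
      // Prints /cluster1-dc1/user-nn-user1, matching the example above.
      System.out.println(target);
    }
  }
}
{code}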



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=481887=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481887
 ]

ASF GitHub Bot logged work on HADOOP-15891:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 04:28
Start Date: 11/Sep/20 04:28
Worklog Time Spent: 10m 
  Work Description: JohnZZGithub commented on pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690866308


   @umamaheswararao Thanks a lot!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481887)
Time Spent: 9h  (was: 8h 50m)

> Provide Regex Based Mount Point In Inode Tree
> -
>
> Key: HADOOP-15891
> URL: https://issues.apache.org/jira/browse/HADOOP-15891
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: viewfs
>Reporter: zhenzhao wang
>Assignee: zhenzhao wang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, 
> HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, 
> HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, 
> HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, 
> HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ 
> Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount 
> Table-v1.pdf
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> This jira is created to support regex based mount points in the Inode Tree. We 
> noticed that mount points only support a fixed target path. However, we might 
> have use cases where the target needs to refer to some fields from the source, 
> e.g. we might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, 
> where we refer to the `cluster` and `user` fields in the source to construct the 
> target. It's impossible to achieve this with the current link type. Though we 
> could set up a one-to-one mapping, the mount table would become bloated if we 
> have thousands of users. Besides, a regex mapping gives us more flexibility. So 
> we are going to build a regex based mount point whose target can refer to 
> groups from the source regex mapping. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] JohnZZGithub commented on pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread GitBox


JohnZZGithub commented on pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690866308


   @umamaheswararao Thanks a lot!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481897=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481897
 ]

ASF GitHub Bot logged work on HADOOP-17222:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 05:31
Start Date: 11/Sep/20 05:31
Worklog Time Spent: 10m 
  Work Description: liuml07 edited a comment on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690850909


   I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and ` 
TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux 
machine with the patch. Will confirm they are not related.
   
   If no objections, I'll commit later today. Thanks,



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481897)
Time Spent: 2.5h  (was: 2h 20m)

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of 
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an 
> example.
>  
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481896=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481896
 ]

ASF GitHub Bot logged work on HADOOP-17222:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 05:31
Start Date: 11/Sep/20 05:31
Worklog Time Spent: 10m 
  Work Description: liuml07 merged pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481896)
Time Spent: 2h 20m  (was: 2h 10m)

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of 
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an 
> example.
>  
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17222) Create socket address leveraging URI cache

2020-09-10 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17222:
---
Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of 
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an 
> example.
>  
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] liuml07 edited a comment on pull request #2241: HADOOP-17222. Create socket address combined with URI cache

2020-09-10 Thread GitBox


liuml07 edited a comment on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690850909


   I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and ` 
TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux 
machine with the patch. Will confirm they are not related.
   
   If no objections, I'll commit later today. Thanks,



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=481885=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481885
 ]

ASF GitHub Bot logged work on HADOOP-15891:
---

Author: ASF GitHub Bot
Created on: 11/Sep/20 04:20
Start Date: 11/Sep/20 04:20
Worklog Time Spent: 10m 
  Work Description: umamaheswararao merged pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481885)
Time Spent: 8h 50m  (was: 8h 40m)

> Provide Regex Based Mount Point In Inode Tree
> -
>
> Key: HADOOP-15891
> URL: https://issues.apache.org/jira/browse/HADOOP-15891
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: viewfs
>Reporter: zhenzhao wang
>Assignee: zhenzhao wang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, 
> HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, 
> HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, 
> HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, 
> HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ 
> Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount 
> Table-v1.pdf
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> This jira is created to support regex based mount points in the Inode Tree. We 
> noticed that mount points only support a fixed target path. However, we might 
> have use cases where the target needs to refer to some fields from the source, 
> e.g. we might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, 
> where we refer to the `cluster` and `user` fields in the source to construct the 
> target. It's impossible to achieve this with the current link type. Though we 
> could set up a one-to-one mapping, the mount table would become bloated if we 
> have thousands of users. Besides, a regex mapping gives us more flexibility. So 
> we are going to build a regex based mount point whose target can refer to 
> groups from the source regex mapping. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] umamaheswararao merged pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread GitBox


umamaheswararao merged pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] aajisaka commented on pull request #2267: HDFS-15555. RBF: Refresh cacheNS when SocketException occurs.

2020-09-10 Thread GitBox


aajisaka commented on pull request #2267:
URL: https://github.com/apache/hadoop/pull/2267#issuecomment-690880934


   Hi @hemanthboyina,
   I don't have any updates here.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] LeonGao91 opened a new pull request #2299: HDFS-15456. TestExternalStoragePolicySatisfier fails intermittently

2020-09-10 Thread GitBox


LeonGao91 opened a new pull request #2299:
URL: https://github.com/apache/hadoop/pull/2299


   A one-liner fix for this UT: ignore DataNode load so that block placement can 
successfully pick the fallback storage type.
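
Presumably the change boils down to disabling the considerLoad check in the test configuration; a hedged sketch follows (the property name is an assumption based on the description, not taken from the actual patch):

```java
import org.apache.hadoop.conf.Configuration;

/** Hypothetical sketch of the test setup change described above. */
public class ConsiderLoadOffExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Assumed property: stop block placement from skipping "busy" DataNodes
    // so that the fallback storage type can still be picked in a small
    // test cluster.
    conf.setBoolean("dfs.namenode.redundancy.considerLoad", false);
    System.out.println(
        conf.getBoolean("dfs.namenode.redundancy.considerLoad", true));
  }
}
```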



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread liuxiaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxiaolong updated HADOOP-17256:
-
Attachment: image-2020-09-10-17-45-16-998.png

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, 
> image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png
>
>
> We use distcp with the -update option to copy a dir from hdfs to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> the unchanged files.
>  
> Test Case:
> 1. Run the following cmd twice; the modification time of the S3 files changes on 
> every run
> hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-17256.
-
Resolution: Duplicate

Caused by HADOOP-8143, which has now been rolled back everywhere it went in. It 
can also cause 404 errors, so it was a critical rollback. Closing as a duplicate 
of HADOOP-8143.

All future releases of Hadoop branch 3 will contain this fix.

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, 
> image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, 
> image-2020-09-10-17-47-01-653.png, image-2020-09-10-17-52-32-290.png
>
>
> We use distcp with the -update option to copy a dir from hdfs to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> the unchanged files.
>   
>  Test Case:
> Run the following cmd twice; the modification time of the S3 files is updated 
> every time.
>  hadoop distcp -update /test/ s3a://${s3_buckect}/test/
>  
> Check code in CopyMapper.java and S3AFileSystem.java 
> (1) For the first time, distcp job will create files in S3, but blockSize is 
> unused!
> !image-2020-09-10-17-45-16-998.png|width=542,height=485!
>  
> (2) For the second time, the distcp job will compare fileSize and blockSize 
> between hdfs file and S3 file
> !image-2020-09-10-17-47-01-653.png|width=524,height=248!
>  
> (3) blockSize is unused; when we get the blockSize of an S3 file, it returns a 
> default value.
> In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is 
> 32 * 1024 * 1024.
> !image-2020-09-10-17-33-50-505.png|width=451,height=762!
>  
> !image-2020-09-10-17-52-32-290.png|width=527,height=87!
>   
> The HDFS blockSize is not meaningful for an object store like S3, so I think 
> there's no need to compare blockSize when running distcp with the -update option.
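
A minimal sketch of the comparison problem described above (the helper below is hypothetical, not the actual CopyMapper/DistCp code): once blockSize is part of the skip check, an HDFS file and its S3A copy almost always look different, because the object store reports a synthetic default block size.

{code:java}
import org.apache.hadoop.fs.FileStatus;

/** Hypothetical skip check illustrating the -update comparison problem. */
public final class UpdateSkipCheck {
  /**
   * Returns true if the target can be skipped. Comparing blockSize is the
   * problematic part: an object store such as S3A reports a default block
   * size (e.g. fs.s3a.block.size = 32 MB), so this check fails even when
   * source and target contents are identical.
   */
  public static boolean canSkip(FileStatus source, FileStatus target,
      boolean compareBlockSize) {
    if (source.getLen() != target.getLen()) {
      return false;
    }
    if (compareBlockSize
        && source.getBlockSize() != target.getBlockSize()) {
      return false; // Almost always differs for HDFS -> S3A with defaults.
    }
    return true;
  }

  private UpdateSkipCheck() {
  }
}
{code}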



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-11452) Make FileSystem.rename(path, path, options) public, specified, tested

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-11452?focusedWorklogId=481375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481375
 ]

ASF GitHub Bot logged work on HADOOP-11452:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 11:16
Start Date: 10/Sep/20 11:16
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #743:
URL: https://github.com/apache/hadoop/pull/743#issuecomment-685019888


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   1m 11s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
10 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 20s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 46s |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m 47s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m 39s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m  1s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m  5s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 24s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   3m  6s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   4m 37s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 45s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   9m 54s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 39s |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 14s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javac  |  20m 14s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 1 new + 2050 unchanged - 
5 fixed = 2051 total (was 2055)  |
   | +1 :green_heart: |  compile  |  17m 40s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  javac  |  17m 40s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1944 unchanged - 
5 fixed = 1945 total (was 1949)  |
   | -0 :warning: |  checkstyle  |   3m  3s |  root: The patch generated 103 
new + 484 unchanged - 30 fixed = 587 total (was 514)  |
   | +1 :green_heart: |  mvnsite  |   5m  0s |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s |  The patch has 8 line(s) that end in 
whitespace. Use git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  shadedclient  |  15m 43s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   3m  7s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javadoc  |   1m 30s |  
hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 3 new 
+ 1 unchanged - 0 fixed = 4 total (was 1)  |
   | -1 :x: |  findbugs  |   2m 22s |  hadoop-common-project/hadoop-common 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |   9m 48s |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   2m  7s |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 126m 42s |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  unit  |   0m 38s |  hadoop-openstack in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   1m 58s |  hadoop-aws in the patch passed.  |
   | -1 :x: |  asflicense  |   0m 59s |  The patch generated 3 ASF License 
warnings.  |
   |  |   | 345m  9s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | FindBugs | module:hadoop-common-project/hadoop-common |
   |  |  Should org.apache.hadoop.fs.impl.RenameHelper$RenameValidationResult 
be a _static_ inner class?  At RenameHelper.java:inner class?  At 
RenameHelper.java:[line 320] |
   | 

[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #743: HADOOP-11452 make rename/3 public

2020-09-10 Thread GitBox


hadoop-yetus removed a comment on pull request #743:
URL: https://github.com/apache/hadoop/pull/743#issuecomment-685019888


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   1m 11s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
10 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 20s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 46s |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m 47s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m 39s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m  1s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m  5s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 24s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   3m  6s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   4m 37s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 45s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   9m 54s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 39s |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 14s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javac  |  20m 14s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 1 new + 2050 unchanged - 
5 fixed = 2051 total (was 2055)  |
   | +1 :green_heart: |  compile  |  17m 40s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  javac  |  17m 40s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1944 unchanged - 
5 fixed = 1945 total (was 1949)  |
   | -0 :warning: |  checkstyle  |   3m  3s |  root: The patch generated 103 
new + 484 unchanged - 30 fixed = 587 total (was 514)  |
   | +1 :green_heart: |  mvnsite  |   5m  0s |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s |  The patch has 8 line(s) that end in 
whitespace. Use git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  shadedclient  |  15m 43s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   3m  7s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javadoc  |   1m 30s |  
hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 3 new 
+ 1 unchanged - 0 fixed = 4 total (was 1)  |
   | -1 :x: |  findbugs  |   2m 22s |  hadoop-common-project/hadoop-common 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |   9m 48s |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   2m  7s |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 126m 42s |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  unit  |   0m 38s |  hadoop-openstack in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   1m 58s |  hadoop-aws in the patch passed.  |
   | -1 :x: |  asflicense  |   0m 59s |  The patch generated 3 ASF License 
warnings.  |
   |  |   | 345m  9s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | FindBugs | module:hadoop-common-project/hadoop-common |
   |  |  Should org.apache.hadoop.fs.impl.RenameHelper$RenameValidationResult 
be a _static_ inner class?  At RenameHelper.java:inner class?  At 
RenameHelper.java:[line 320] |
   | Failed junit tests | 
hadoop.fs.contract.rawlocal.TestRawlocalContractRenameEx |
   |   | hadoop.fs.TestFSMainOperationsLocalFileSystem |
   |   | hadoop.fs.viewfs.TestViewFsWithAuthorityLocalFs |
   |   | hadoop.fs.TestSymlinkLocalFSFileContext |
   |   | hadoop.fs.TestChecksumFs |
   |   | hadoop.fs.viewfs.TestFSMainOperationsLocalFileSystem |
   |   | hadoop.fs.viewfs.TestFcMainOperationsLocalFs |
   |   | 

[jira] [Work logged] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17253?focusedWorklogId=481376=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481376
 ]

ASF GitHub Bot logged work on HADOOP-17253:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 11:17
Start Date: 10/Sep/20 11:17
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2289:
URL: https://github.com/apache/hadoop/pull/2289#issuecomment-690175258


   LGTM
   
   +1



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481376)
Time Spent: 0.5h  (was: 20m)

> Upgrade zookeeper to 3.4.14 on branch-2.10
> --
>
> Key: HADOOP-17253
> URL: https://issues.apache.org/jira/browse/HADOOP-17253
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Since versions of zookeeper and curator have different history between 
> branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on 
> branch-2.10.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] steveloughran commented on pull request #2289: HADOOP-17253. Upgrade zookeeper to 3.4.14 on branch-2.10.

2020-09-10 Thread GitBox


steveloughran commented on pull request #2289:
URL: https://github.com/apache/hadoop/pull/2289#issuecomment-690175258


   LGTM
   
   +1



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17195) Intermittent OutOfMemory error while performing hdfs CopyFromLocal to abfs

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17195?focusedWorklogId=481381=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481381
 ]

ASF GitHub Bot logged work on HADOOP-17195:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 11:22
Start Date: 10/Sep/20 11:22
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2294:
URL: https://github.com/apache/hadoop/pull/2294#issuecomment-690183135


   Looking at this a bit more
   
   * it's the use of buffers which causes the OOM, not the thread pooling, so 
neither this nor its predecessor patch will directly fix that
   * we need to support a bytebuffer pool with a max capacity and/or disk buffering



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481381)
Time Spent: 0.5h  (was: 20m)

> Intermittent OutOfMemory error while performing hdfs CopyFromLocal to abfs 
> ---
>
> Key: HADOOP-17195
> URL: https://issues.apache.org/jira/browse/HADOOP-17195
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.3.0
>Reporter: Mehakmeet Singh
>Assignee: Bilahari T H
>Priority: Major
>  Labels: abfsactive, pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> OutOfMemory error due to new ThreadPools being created each time an 
> AbfsOutputStream is created. Since the thread pools aren't limited, a lot of 
> data is loaded into buffers, which causes the OutOfMemory error.
> Possible fixes:
> - Limit the thread count while performing hdfs copyFromLocal (using the 
> -t property).
> - Reduce OUTPUT_BUFFER_SIZE significantly, which would limit the amount of 
> buffer data loaded in threads.
> - Don't create new ThreadPools each time an AbfsOutputStream is created, and 
> limit the number of ThreadPools each AbfsOutputStream can create.
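
A minimal sketch of the last idea above, assuming a single shared, bounded executor instead of one pool per stream; names and pool sizes are hypothetical and are not the actual AbfsOutputStream code:

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical shared, bounded upload executor for all output streams. */
public final class SharedUploadExecutor {
  // One pool shared by every stream instead of a new pool per stream.
  // The bounded queue plus CallerRunsPolicy applies back-pressure, so
  // buffered data cannot grow without limit and exhaust the heap.
  private static final ExecutorService POOL = new ThreadPoolExecutor(
      4, 16, 60L, TimeUnit.SECONDS,
      new ArrayBlockingQueue<Runnable>(64),
      new ThreadPoolExecutor.CallerRunsPolicy());

  public static ExecutorService get() {
    return POOL;
  }

  private SharedUploadExecutor() {
  }
}
{code}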



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17191) ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17191?focusedWorklogId=481382=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481382
 ]

ASF GitHub Bot logged work on HADOOP-17191:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 11:23
Start Date: 10/Sep/20 11:23
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2278:
URL: https://github.com/apache/hadoop/pull/2278#issuecomment-687718737







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481382)
Time Spent: 3h 40m  (was: 3.5h)

> ABFS: Run the integration tests with various combinations of configurations 
> and publish a consolidated results
> --
>
> Key: HADOOP-17191
> URL: https://issues.apache.org/jira/browse/HADOOP-17191
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: 3.3.0
>Reporter: Bilahari T H
>Assignee: Bilahari T H
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> ADLS Gen 2 supports accounts with and without hierarchical namespace support. 
> ABFS driver supports various authorization mechanisms like OAuth, SharedKey, 
> Shared Access Signature. The integration tests need to be executed against 
> accounts with and without hierarchical namespace support using various 
> authorization mechanisms.
> Currently the developer has to manually run the tests with different 
> combinations of configurations ex: HNS account with SharedKey and OAuth, 
> NonHNS account with SharedKey, etc.
> The expectation is to automate these runs with different combinations. This 
> will help the developer to run the integration tests with different variants 
> of configurations automatically. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #2278: HADOOP-17191. ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results

2020-09-10 Thread GitBox


hadoop-yetus removed a comment on pull request #2278:
URL: https://github.com/apache/hadoop/pull/2278#issuecomment-687718737







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] steveloughran commented on pull request #2278: HADOOP-17191. ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results

2020-09-10 Thread GitBox


steveloughran commented on pull request #2278:
URL: https://github.com/apache/hadoop/pull/2278#issuecomment-690188077


   Happy with all the changes. Still unsure about the full implications. Can I 
still run "mvn verify" at the command line and individual tests in the IDE?
   
   + @mehakmeet 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17191) ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17191?focusedWorklogId=481386=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481386
 ]

ASF GitHub Bot logged work on HADOOP-17191:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 11:28
Start Date: 10/Sep/20 11:28
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2278:
URL: https://github.com/apache/hadoop/pull/2278#issuecomment-690188077


   Happy with all the changes. Still unsure about the full implications. Can I 
still run "mvn verify" at the command line and individual tests in the IDE?
   
   + @mehakmeet 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481386)
Time Spent: 4h  (was: 3h 50m)

> ABFS: Run the integration tests with various combinations of configurations 
> and publish a consolidated results
> --
>
> Key: HADOOP-17191
> URL: https://issues.apache.org/jira/browse/HADOOP-17191
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: 3.3.0
>Reporter: Bilahari T H
>Assignee: Bilahari T H
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> ADLS Gen 2 supports accounts with and without hierarchical namespace support. 
> ABFS driver supports various authorization mechanisms like OAuth, SharedKey, 
> Shared Access Signature. The integration tests need to be executed against 
> accounts with and without hierarchical namespace support using various 
> authorization mechanisms.
> Currently the developer has to manually run the tests with different 
> combinations of configurations ex: HNS account with SharedKey and OAuth, 
> NonHNS account with SharedKey, etc.
> The expectation is to automate these runs with different combinations. This 
> will help the developer to run the integration tests with different variants 
> of configurations automatically. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17253?focusedWorklogId=481394=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481394
 ]

ASF GitHub Bot logged work on HADOOP-17253:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 11:37
Start Date: 10/Sep/20 11:37
Worklog Time Spent: 10m 
  Work Description: iwasakims commented on pull request #2289:
URL: https://github.com/apache/hadoop/pull/2289#issuecomment-690200705


   Thanks, @steveloughran. I merged this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 481394)
Time Spent: 50m  (was: 40m)

> Upgrade zookeeper to 3.4.14 on branch-2.10
> --
>
> Key: HADOOP-17253
> URL: https://issues.apache.org/jira/browse/HADOOP-17253
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Since versions of zookeeper and curator have different history between 
> branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on 
> branch-2.10.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] iwasakims commented on pull request #2289: HADOOP-17253. Upgrade zookeeper to 3.4.14 on branch-2.10.

2020-09-10 Thread GitBox


iwasakims commented on pull request #2289:
URL: https://github.com/apache/hadoop/pull/2289#issuecomment-690200705


   Thanks, @steveloughran. I merged this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread liuxiaolong (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193566#comment-17193566
 ] 

liuxiaolong commented on HADOOP-17256:
--

Thanks. I tried rolling back HADOOP-8143; it's OK now.

!image-2020-09-10-19-48-38-574.png|width=453,height=316!

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, 
> image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, 
> image-2020-09-10-17-47-01-653.png, image-2020-09-10-17-52-32-290.png, 
> image-2020-09-10-19-48-38-574.png
>
>
> We use distcp with the -update option to copy a dir from HDFS to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> identical files.
>   
>  Test Case:
> Run the following cmd twice; the modification time of the S3 files is updated 
> every time.
>  hadoop distcp -update /test/ s3a://${s3_bucket}/test/
>  
> Checking the code in CopyMapper.java and S3AFileSystem.java:
> (1) On the first run, the distcp job creates the files in S3, but blockSize is 
> unused!
> !image-2020-09-10-17-45-16-998.png|width=542,height=485!
>  
> (2) On the second run, the distcp job compares fileSize and blockSize 
> between the HDFS file and the S3 file.
> !image-2020-09-10-17-47-01-653.png|width=524,height=248!
>  
> (3) Because blockSize is unused, getting the blockSize of an S3 file returns a 
> default value.
> In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is 
> 32 * 1024 * 1024.
> !image-2020-09-10-17-33-50-505.png|width=451,height=762!
>  
> !image-2020-09-10-17-52-32-290.png|width=527,height=87!
>   
> The HDFS blockSize is not meaningful in an object store like S3, so I think 
> there is no need to compare blockSize when running distcp with the -update option.
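
To make the comparison concrete, here is a minimal sketch of the length/blockSize check that the -update skip logic boils down to. It is an illustration only, not the actual CopyMapper code: the class, the canSkip helper, and the file paths are hypothetical; only the FileStatus accessors and the 32 MB fs.s3a.block.size default come from the report above.

{code}
// Hedged sketch, not the actual DistCp CopyMapper code: shows why a
// blockSize comparison between an HDFS source and an S3A target fails.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UpdateSkipCheckSketch {

  // Returns true when the target may be skipped under -update semantics.
  static boolean canSkip(FileStatus source, FileStatus target) {
    boolean sameLength = source.getLen() == target.getLen();
    // The HDFS source reports its real block size (e.g. 128 MB), while the
    // S3A target reports the client-side fs.s3a.block.size value
    // (32 * 1024 * 1024 by default), so this comparison almost always fails
    // and the file is copied again even though its contents are unchanged.
    boolean sameBlockSize = source.getBlockSize() == target.getBlockSize();
    return sameLength && sameBlockSize;
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path src = new Path("hdfs://cluster1:8020/test/part-00000"); // hypothetical paths
    Path dst = new Path("s3a://my-bucket/test/part-00000");
    FileSystem srcFs = src.getFileSystem(conf);
    FileSystem dstFs = dst.getFileSystem(conf);
    FileStatus s = srcFs.getFileStatus(src);
    FileStatus t = dstFs.getFileStatus(dst);
    System.out.println("skip under -update? " + canSkip(s, t));
  }
}
{code}

Under this reading, skipping the blockSize comparison (or applying it only when both filesystems report a real block size) would let -update skip unchanged files on object stores as expected.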



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread liuxiaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxiaolong updated HADOOP-17256:
-
Attachment: image-2020-09-10-17-33-50-505.png

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, 
> image-2020-09-10-17-33-50-505.png
>
>
> We use distcp with the -update option to copy a dir from HDFS to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> identical files.
>  
> Test Case:
> 1. Run the distcp cmd twice; the modification time of the S3 files is updated each time.
> hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread liuxiaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxiaolong updated HADOOP-17256:
-
Attachment: image-2020-09-10-17-52-32-290.png

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, 
> image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, 
> image-2020-09-10-17-47-01-653.png, image-2020-09-10-17-52-32-290.png
>
>
> We use distcp with the -update option to copy a dir from HDFS to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> identical files.
>  
> Test Case:
> 1. Run the distcp cmd twice; the modification time of the S3 files is updated each time.
> hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hadoop-yetus commented on pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread GitBox


hadoop-yetus commented on pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690147488


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  28m 24s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  No case conflicting files 
found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
6 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 22s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 14s |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m 45s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m 51s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 52s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 51s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 53s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 59s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 13s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 31s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  5s |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 36s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |  19m 36s |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 31s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 31s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 52s |  root: The patch generated 1 new 
+ 182 unchanged - 1 fixed = 183 total (was 183)  |
   | +1 :green_heart: |  mvnsite  |   2m 51s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 16s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   3m  6s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   5m 53s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 48s |  hadoop-common in the patch passed. 
 |
   | -1 :x: |  unit  | 106m 51s |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  5s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 322m 34s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.TestCrcCorruption |
   |   | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
   |   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
   |   | hadoop.hdfs.TestViewDistributedFileSystem |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
   |   | hadoop.hdfs.server.datanode.TestBPOfferService |
   |   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
   |   | hadoop.hdfs.TestMaintenanceState |
   |   | hadoop.hdfs.TestDFSStripedInputStream |
   |   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.TestErasureCodingPolicyWithSnapshotWithRandomECPolicy |
   |   | hadoop.hdfs.TestBlocksScheduledCounter |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2185/22/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2185 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle markdownlint |
   | uname | Linux 16f6f77a6cb6 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | 

[GitHub] [hadoop] bshashikant commented on pull request #2295: HDFS-15563. Incorrect getTrashRoot return value when a non-snapshottable dir prefix matches the path of a snapshottable dir

2020-09-10 Thread GitBox


bshashikant commented on pull request #2295:
URL: https://github.com/apache/hadoop/pull/2295#issuecomment-690161198


   Thanks @smengcl for the contribution.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HADOOP-9331) Hadoop crypto codec framework and crypto codec implementations

2020-09-10 Thread nikhil panchal (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nikhil panchal updated HADOOP-9331:
---
Comment: was deleted

(was: Hello,

I have ORC data stored in HDFS. I have one use case, encrypt one of the column 
present in ORC data. Can anyone suggest standard steps i need to follow or what 
hadoop component i can use.)

> Hadoop crypto codec framework and crypto codec implementations
> --
>
> Key: HADOOP-9331
> URL: https://issues.apache.org/jira/browse/HADOOP-9331
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0-alpha1
>Reporter: Haifeng Chen
>Priority: Major
> Attachments: Hadoop Crypto Design.pdf
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
>  For use cases that deal with sensitive data, we often need to encrypt data 
> to be stored safely at rest. Hadoop common provides a codec framework for 
> compression algorithms. We start here. However because encryption algorithms 
> require some additional configuration and methods for key management, we 
> introduce a crypto codec framework that builds on the compression codec 
> framework. It cleanly distinguishes crypto algorithms from compression 
> algorithms, but shares common interfaces between them where possible, and 
> also carries extended interfaces where necessary to satisfy those needs. We 
> also introduce a generic Key type, and supporting utility methods and 
> classes, as a necessary abstraction for dealing with both Java crypto keys 
> and PGP keys.
> The task for this feature breaks into two parts:
> 1. The crypto codec framework, based on the compression codec framework, which can 
> be shared by all crypto codec implementations.
> 2. The codec implementations such as AES and others.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10

2020-09-10 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HADOOP-17253:
--
Fix Version/s: 2.10.1
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Upgrade zookeeper to 3.4.14 on branch-2.10
> --
>
> Key: HADOOP-17253
> URL: https://issues.apache.org/jira/browse/HADOOP-17253
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.10.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Since versions of zookeeper and curator have different history between 
> branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on 
> branch-2.10.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hadoop-yetus commented on pull request #2296: HDFS-15568. namenode start failed to start when dfs.namenode.max.snapshot.limit set.

2020-09-10 Thread GitBox


hadoop-yetus commented on pull request #2296:
URL: https://github.com/apache/hadoop/pull/2296#issuecomment-690215954


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  40m 35s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
1 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m  7s |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   1m  8s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 47s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 15s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 51s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 19s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 13s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 11s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  8s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   1m 11s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   1m  1s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 40s |  hadoop-hdfs-project/hadoop-hdfs: 
The patch generated 3 new + 89 unchanged - 0 fixed = 92 total (was 89)  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  shadedclient  |  15m 27s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 16s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 27s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |  80m 10s |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  asflicense  |   0m 35s |  The patch generated 4 ASF License 
warnings.  |
   |  |   | 209m 18s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
   |   | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
   |   | hadoop.fs.viewfs.TestViewFileSystemHdfs |
   |   | hadoop.fs.viewfs.TestViewFileSystemLinkMergeSlash |
   |   | hadoop.hdfs.TestSnapshotCommands |
   |   | hadoop.fs.viewfs.TestViewFsAtHdfsRoot |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2296 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 94a92cfd8c29 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e5fe3262702 |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | checkstyle | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/1/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
   | unit | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/1/testReport/ |
   | asflicense | 

[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread liuxiaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxiaolong updated HADOOP-17256:
-
Attachment: image-2020-09-10-19-48-38-574.png

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, 
> image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, 
> image-2020-09-10-17-47-01-653.png, image-2020-09-10-17-52-32-290.png, 
> image-2020-09-10-19-48-38-574.png
>
>
> We use distcp with the -update option to copy a dir from HDFS to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> identical files.
>   
>  Test Case:
> Run the following cmd twice; the modification time of the S3 files is updated 
> every time.
>  hadoop distcp -update /test/ s3a://${s3_bucket}/test/
>  
> Checking the code in CopyMapper.java and S3AFileSystem.java:
> (1) On the first run, the distcp job creates the files in S3, but blockSize is 
> unused!
> !image-2020-09-10-17-45-16-998.png|width=542,height=485!
>  
> (2) On the second run, the distcp job compares fileSize and blockSize 
> between the HDFS file and the S3 file.
> !image-2020-09-10-17-47-01-653.png|width=524,height=248!
>  
> (3) Because blockSize is unused, getting the blockSize of an S3 file returns a 
> default value.
> In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is 
> 32 * 1024 * 1024.
> !image-2020-09-10-17-33-50-505.png|width=451,height=762!
>  
> !image-2020-09-10-17-52-32-290.png|width=527,height=87!
>   
> The HDFS blockSize is not meaningful in an object store like S3, so I think 
> there is no need to compare blockSize when running distcp with the -update option.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16918) Dependency update for Hadoop 2.10

2020-09-10 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193563#comment-17193563
 ] 

Masatake Iwasaki commented on HADOOP-16918:
---

Thanks, [~symat]. Hadoop 3.3.0 and above already moved to ZooKeeper 3.5 in 
HADOOP-16579. Since we are preparing 2.10.1, which is a patch release of the 
oldest live branch, compatibility is the main concern. Based on the patch for 
HADOOP-16579, upgrading to 3.5 brings code changes and new transitive 
dependencies. I'm considering 3.4.14 as a safe candidate here.

> Dependency update for Hadoop 2.10
> -
>
> Key: HADOOP-16918
> URL: https://issues.apache.org/jira/browse/HADOOP-16918
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Priority: Major
>  Labels: release-blocker
> Attachments: dependency-check-report.html, 
> dependency-check-report.html
>
>
> A number of dependencies can be updated.
> nimbus-jose-jwt
> jetty
> netty
> zookeeper
> hbase-common
> jackson-databind
> and many more. They should be updated in the 2.10.1 release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13238) pid handling is failing on secure datanode

2020-09-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193575#comment-17193575
 ] 

zhuqi commented on HADOOP-13238:


cc [~boky01]  [~aw]

May I take this? I am currently using Hadoop 3.2.1 to build our 
production clusters and have hit the same problem.

I have now updated the latest patch to fix another problem: when the service is not 
running but we call stop, cat reports an error because there is no 
pidfile.

Thanks.

> pid handling is failing on secure datanode
> --
>
> Key: HADOOP-13238
> URL: https://issues.apache.org/jira/browse/HADOOP-13238
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts, security
>Reporter: Allen Wittenauer
>Assignee: Andras Bokor
>Priority: Major
> Attachments: HADOOP-13238.01.patch, HADOOP-13238.02.patch
>
>
> {code}
> hdfs --daemon stop datanode
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or 
> directory
> WARNING: pid has changed for datanode, skip deleting pid file
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or 
> directory
> WARNING: daemon pid has changed for datanode, skip deleting daemon pid file
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread liuxiaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxiaolong updated HADOOP-17256:
-
Attachment: image-2020-09-10-17-25-46-354.png

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png
>
>
> We use distcp with the -update option to copy a dir from HDFS to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> identical files.
>  
> Test Case:
> 1. Run the distcp cmd twice; the modification time of the S3 files is updated each time.
> hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16918) Dependency update for Hadoop 2.10

2020-09-10 Thread Mate Szalay-Beko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193492#comment-17193492
 ] 

Mate Szalay-Beko commented on HADOOP-16918:
---

Hello guys! I'm missing the context here a bit... still FYI:
 * ZooKeeper 3.4 is EOL now; I think no further security updates / CVE fixes 
will be provided for the 3.4 line. The last 3.4 ZooKeeper is the 3.4.14 
release, from April 2019.
 * The 3.5 / 3.6 ZooKeeper lines are still active; we provide relatively 
frequent releases with bugfixes and security fixes.

> Dependency update for Hadoop 2.10
> -
>
> Key: HADOOP-16918
> URL: https://issues.apache.org/jira/browse/HADOOP-16918
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Priority: Major
>  Labels: release-blocker
> Attachments: dependency-check-report.html, 
> dependency-check-report.html
>
>
> A number of dependencies can be updated.
> nimbus-jose-jwt
> jetty
> netty
> zookeeper
> hbase-common
> jackson-databind
> and many more. They should be updated in the 2.10.1 release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread liuxiaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxiaolong updated HADOOP-17256:
-
Attachment: image-2020-09-10-17-47-01-653.png

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, 
> image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, 
> image-2020-09-10-17-47-01-653.png
>
>
> We use distcp with the -update option to copy a dir from HDFS to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> identical files.
>  
> Test Case:
> 1. Run the distcp cmd twice; the modification time of the S3 files is updated each time.
> hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3

2020-09-10 Thread liuxiaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxiaolong updated HADOOP-17256:
-
Description: 
We use distcp with the -update option to copy a dir from HDFS to S3. When we run 
the distcp job once more, it overwrites the S3 dir directly rather than skipping 
identical files.
  
 Test Case:
Run the following cmd twice; the modification time of the S3 files is updated 
every time.
 hadoop distcp -update /test/ s3a://${s3_bucket}/test/

 

Checking the code in CopyMapper.java and S3AFileSystem.java:

(1) On the first run, the distcp job creates the files in S3, but blockSize is 
unused!

!image-2020-09-10-17-45-16-998.png|width=542,height=485!

 

(2) On the second run, the distcp job compares fileSize and blockSize 
between the HDFS file and the S3 file.

!image-2020-09-10-17-47-01-653.png|width=524,height=248!

 

(3) Because blockSize is unused, getting the blockSize of an S3 file returns a 
default value.

In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is 
32 * 1024 * 1024.

!image-2020-09-10-17-33-50-505.png|width=451,height=762!

 

!image-2020-09-10-17-52-32-290.png|width=527,height=87!
  

The HDFS blockSize is not meaningful in an object store like S3, so I think 
there is no need to compare blockSize when running distcp with the -update option.

  was:
We use distcp with -update option to copy a dir from hdfs to S3. When we run 
distcp job once more, it will overwrite S3 dir directly, rather than skip the 
same files.
 
Test Case:
1. Run twice distcp cmd,  the modify time of S3 files will be modified
hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/
 


> DistCp -update option will be invalid when distcp files from hdfs to S3
> ---
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, 
> image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, 
> image-2020-09-10-17-47-01-653.png, image-2020-09-10-17-52-32-290.png
>
>
> We use distcp with the -update option to copy a dir from HDFS to S3. When we run 
> the distcp job once more, it overwrites the S3 dir directly rather than skipping 
> identical files.
>   
>  Test Case:
> Run the following cmd twice; the modification time of the S3 files is updated 
> every time.
>  hadoop distcp -update /test/ s3a://${s3_bucket}/test/
>  
> Checking the code in CopyMapper.java and S3AFileSystem.java:
> (1) On the first run, the distcp job creates the files in S3, but blockSize is 
> unused!
> !image-2020-09-10-17-45-16-998.png|width=542,height=485!
>  
> (2) On the second run, the distcp job compares fileSize and blockSize 
> between the HDFS file and the S3 file.
> !image-2020-09-10-17-47-01-653.png|width=524,height=248!
>  
> (3) Because blockSize is unused, getting the blockSize of an S3 file returns a 
> default value.
> In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is 
> 32 * 1024 * 1024.
> !image-2020-09-10-17-33-50-505.png|width=451,height=762!
>  
> !image-2020-09-10-17-52-32-290.png|width=527,height=87!
>   
> The HDFS blockSize is not meaningful in an object store like S3, so I think 
> there is no need to compare blockSize when running distcp with the -update option.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=481364=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481364
 ]

ASF GitHub Bot logged work on HADOOP-15891:
---

Author: ASF GitHub Bot
Created on: 10/Sep/20 10:43
Start Date: 10/Sep/20 10:43
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690147488


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  28m 24s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  No case conflicting files 
found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
6 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 22s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 14s |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m 45s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m 51s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 52s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 51s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 53s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 59s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 13s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 31s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  5s |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 36s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |  19m 36s |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 31s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 31s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 52s |  root: The patch generated 1 new 
+ 182 unchanged - 1 fixed = 183 total (was 183)  |
   | +1 :green_heart: |  mvnsite  |   2m 51s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 16s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   3m  6s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   5m 53s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 48s |  hadoop-common in the patch passed. 
 |
   | -1 :x: |  unit  | 106m 51s |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  5s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 322m 34s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.TestCrcCorruption |
   |   | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
   |   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
   |   | hadoop.hdfs.TestViewDistributedFileSystem |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
   |   | hadoop.hdfs.server.datanode.TestBPOfferService |
   |   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
   |   | hadoop.hdfs.TestMaintenanceState |
   |   | hadoop.hdfs.TestDFSStripedInputStream |
   |   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.TestErasureCodingPolicyWithSnapshotWithRandomECPolicy |
   |   | hadoop.hdfs.TestBlocksScheduledCounter |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 

[jira] [Commented] (HADOOP-9331) Hadoop crypto codec framework and crypto codec implementations

2020-09-10 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193549#comment-17193549
 ] 

Steve Loughran commented on HADOOP-9331:


bq. I have ORC data stored in HDFS. I have one use case, encrypt one of the 
column present in ORC data. Can anyone suggest standard steps i need to follow 
or what hadoop component i can use.

Something to take up with the ORC team. JIRAs aren't the place for queries like 
that. Thanks.

> Hadoop crypto codec framework and crypto codec implementations
> --
>
> Key: HADOOP-9331
> URL: https://issues.apache.org/jira/browse/HADOOP-9331
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0-alpha1
>Reporter: Haifeng Chen
>Priority: Major
> Attachments: Hadoop Crypto Design.pdf
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
>  For use cases that deal with sensitive data, we often need to encrypt data 
> to be stored safely at rest. Hadoop common provides a codec framework for 
> compression algorithms. We start here. However because encryption algorithms 
> require some additional configuration and methods for key management, we 
> introduce a crypto codec framework that builds on the compression codec 
> framework. It cleanly distinguishes crypto algorithms from compression 
> algorithms, but shares common interfaces between them where possible, and 
> also carries extended interfaces where necessary to satisfy those needs. We 
> also introduce a generic Key type, and supporting utility methods and 
> classes, as a necessary abstraction for dealing with both Java crypto keys 
> and PGP keys.
> The task for this feature breaks into two parts:
> 1. The crypto codec framework, based on the compression codec framework, which can 
> be shared by all crypto codec implementations.
> 2. The codec implementations such as AES and others.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16872) Performance improvement when distcp files in large dir with -direct option

2020-09-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-16872:

Affects Version/s: 3.3.0
   3.2.1

> Performance improvement when distcp files in large dir with -direct option
> --
>
> Key: HADOOP-16872
> URL: https://issues.apache.org/jira/browse/HADOOP-16872
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 3.3.0, 3.2.1
>Reporter: liuxiaolong
>Priority: Major
> Attachments: HADOOP-16872.001.patch, optimise after.png, optimise 
> before.png
>
>
> We use distcp with the -direct option to copy a file between two large 
> directories. We found it cost a few minutes. If we launch too many distcp 
> jobs at the same time, NameNode performance degrades seriously.
> hadoop distcp -direct -skipcrccheck -update -prbugaxt -i -numListstatusThreads 1 
> hdfs://cluster1:8020/source/100.log  hdfs://cluster2:8020/target/100.jpg
> || ||Dir path||Count||
> ||Source dir||  hdfs://cluster1:8020/source/ ||100k+ files||
> ||Target dir||hdfs://cluster2:8020/target/ ||100k+  files||
>  
> Checking the code in CopyCommitter.java, we find that the function
> deleteAttemptTempFiles() contains the call targetFS.globStatus(new 
> Path(targetWorkPath, ".distcp.tmp." + jobId.replaceAll("job","attempt") + 
> "*")); 
> This wastes a lot of time when running distcp between two large dirs. When we use 
> distcp with the -direct option, it writes directly to the target file without 
> generating a '.distcp.tmp' temp file. So I think this code needs a check 
> before calling deleteAttemptTempFiles: if distcp runs with the -direct 
> option, do nothing and return directly.  
>  
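
As an illustration of the suggested guard, here is a minimal sketch; the class name, constructor, and directWrite field are made up for this example, and only the globStatus pattern is taken from the report above.

{code}
// Hedged sketch of the proposed check, not the actual CopyCommitter code:
// skip the expensive globStatus() scan when -direct was used, because direct
// writes never create ".distcp.tmp.*" files in the target directory.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class AttemptTempFileCleanupSketch {
  private final boolean directWrite;  // whether -direct was passed (assumption)

  AttemptTempFileCleanupSketch(boolean directWrite) {
    this.directWrite = directWrite;
  }

  void deleteAttemptTempFiles(FileSystem targetFS, Path targetWorkPath,
                              String jobId) throws IOException {
    if (directWrite) {
      // -direct writes go straight to the final file, so there is nothing to
      // clean up and no reason to list a directory with 100k+ entries.
      return;
    }
    FileStatus[] tempFiles = targetFS.globStatus(new Path(targetWorkPath,
        ".distcp.tmp." + jobId.replaceAll("job", "attempt") + "*"));
    if (tempFiles != null) {
      for (FileStatus stat : tempFiles) {
        targetFS.delete(stat.getPath(), false);
      }
    }
  }
}
{code}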



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16872) Performance improvement when distcp files in large dir with -direct option

2020-09-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-16872:

Component/s: tools/distcp

> Performance improvement when distcp files in large dir with -direct option
> --
>
> Key: HADOOP-16872
> URL: https://issues.apache.org/jira/browse/HADOOP-16872
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Reporter: liuxiaolong
>Priority: Major
> Attachments: HADOOP-16872.001.patch, optimise after.png, optimise 
> before.png
>
>
> We use distcp with the -direct option to copy a file between two large 
> directories. We found it cost a few minutes. If we launch too many distcp 
> jobs at the same time, NameNode performance degrades seriously.
> hadoop distcp -direct -skipcrccheck -update -prbugaxt -i -numListstatusThreads 1 
> hdfs://cluster1:8020/source/100.log  hdfs://cluster2:8020/target/100.jpg
> || ||Dir path||Count||
> ||Source dir||  hdfs://cluster1:8020/source/ ||100k+ files||
> ||Target dir||hdfs://cluster2:8020/target/ ||100k+  files||
>  
> Checking the code in CopyCommitter.java, we find that the function
> deleteAttemptTempFiles() contains the call targetFS.globStatus(new 
> Path(targetWorkPath, ".distcp.tmp." + jobId.replaceAll("job","attempt") + 
> "*")); 
> This wastes a lot of time when running distcp between two large dirs. When we use 
> distcp with the -direct option, it writes directly to the target file without 
> generating a '.distcp.tmp' temp file. So I think this code needs a check 
> before calling deleteAttemptTempFiles: if distcp runs with the -direct 
> option, do nothing and return directly.  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16872) Performance improvement when distcp files in large dir with -direct option

2020-09-10 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193548#comment-17193548
 ] 

Steve Loughran commented on HADOOP-16872:
-

Missed this. Could you submit it as a GitHub PR, as that is where we review 
patches? Thanks.

> Performance improvement when distcp files in large dir with -direct option
> --
>
> Key: HADOOP-16872
> URL: https://issues.apache.org/jira/browse/HADOOP-16872
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: liuxiaolong
>Priority: Major
> Attachments: HADOOP-16872.001.patch, optimise after.png, optimise 
> before.png
>
>
> We use distcp with the -direct option to copy a file between two large 
> directories. We found it cost a few minutes. If we launch too many distcp 
> jobs at the same time, NameNode performance degrades seriously.
> hadoop distcp -direct -skipcrccheck -update -prbugaxt -i -numListstatusThreads 1 
> hdfs://cluster1:8020/source/100.log  hdfs://cluster2:8020/target/100.jpg
> || ||Dir path||Count||
> ||Source dir||  hdfs://cluster1:8020/source/ ||100k+ files||
> ||Target dir||hdfs://cluster2:8020/target/ ||100k+  files||
>  
> Checking the code in CopyCommitter.java, we find that the function
> deleteAttemptTempFiles() contains the call targetFS.globStatus(new 
> Path(targetWorkPath, ".distcp.tmp." + jobId.replaceAll("job","attempt") + 
> "*")); 
> This wastes a lot of time when running distcp between two large dirs. When we use 
> distcp with the -direct option, it writes directly to the target file without 
> generating a '.distcp.tmp' temp file. So I think this code needs a check 
> before calling deleteAttemptTempFiles: if distcp runs with the -direct 
> option, do nothing and return directly.  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


