[GitHub] [hadoop] brahmareddybattula commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
brahmareddybattula commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486548053

## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java
## @@ -77,6 +77,9 @@
   /** Returns true if the volume is NOT backed by persistent storage. */
   boolean isTransientStorage();

Review comment:
   So, NVDIMM is persistent storage and RAM.
[GitHub] [hadoop] brahmareddybattula commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
brahmareddybattula commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486535639

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java
## @@ -34,28 +34,35 @@
 @InterfaceStability.Unstable
 public enum StorageType {
   // sorted by the speed of the storage types, from fast to slow
-  RAM_DISK(true),
-  SSD(false),
-  DISK(false),
-  ARCHIVE(false),
-  PROVIDED(false);
+  RAM_DISK(true, true),
+  NVDIMM(false, true),
+  SSD(false, false),
+  DISK(false, false),
+  ARCHIVE(false, false),
+  PROVIDED(false, false);

   private final boolean isTransient;
+  private final boolean isRAM;

   public static final StorageType DEFAULT = DISK;
   public static final StorageType[] EMPTY_ARRAY = {};
   private static final StorageType[] VALUES = values();

-  StorageType(boolean isTransient) {
+  StorageType(boolean isTransient, boolean isRAM) {
     this.isTransient = isTransient;
+    this.isRAM = isRAM;
   }

   public boolean isTransient() {
     return isTransient;
   }

+  public boolean isRAM() {
+    return isRAM;
+  }

Review comment:
   Ok. By design, if you don't want to move.
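For readers following the thread, a minimal sketch of how the two flags in the diff above separate the cases. This assumes the patched StorageType with the new isRAM() accessor shown in the hunk; it is illustrative, not part of the PR:

```java
import org.apache.hadoop.fs.StorageType;

public class StorageTypeFlags {
  public static void main(String[] args) {
    for (StorageType t : StorageType.values()) {
      // Per the diff: RAM_DISK is transient and RAM-backed, NVDIMM is
      // persistent yet RAM-backed, and the remaining types are neither
      // transient nor RAM-backed.
      System.out.println(t + ": transient=" + t.isTransient()
          + ", ram=" + t.isRAM());
    }
  }
}
```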
[GitHub] [hadoop] goiri commented on a change in pull request #2296: HDFS-15568. namenode start failed to start when dfs.namenode.max.snapshot.limit set.
goiri commented on a change in pull request #2296:
URL: https://github.com/apache/hadoop/pull/2296#discussion_r486579348

## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
## @@ -508,6 +508,10 @@ FSNamesystem getFSNamesystem() {
     return namesystem;
   }

+  public boolean isImageLoaded() {

Review comment:
   Add javadoc

## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
## @@ -368,6 +368,13 @@ void assertFirstSnapshot(INodeDirectory dir,
     }
   }

+  boolean captureOpenFiles() {

Review comment:
   javadoc

## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
## @@ -368,6 +368,13 @@ void assertFirstSnapshot(INodeDirectory dir,
     }
   }

+  boolean captureOpenFiles() {
+    return captureOpenFiles;
+  }
+
+  int getMaxSnapshotLimit() {

Review comment:
   VisibleForTesting

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotManager.java
## @@ -133,4 +137,68 @@ public void testValidateSnapshotIDWidth() throws Exception {
         getMaxSnapshotID() < Snapshot.CURRENT_STATE_ID);
   }

+  @Test
+  public void SnapshotLimitOnRestart() throws Exception {
+    final Configuration conf = new Configuration();
+    final Path snapshottableDir
+        = new Path("/" + getClass().getSimpleName());
+    int numSnapshots = 5;
+    conf.setInt(DFSConfigKeys.
+        DFS_NAMENODE_SNAPSHOT_MAX_LIMIT, numSnapshots);
+    conf.setInt(DFSConfigKeys.DFS_NAMENODE_SNAPSHOT_FILESYSTEM_LIMIT,
+        numSnapshots * 2);
+    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).
+        numDataNodes(0).build();
+    cluster.waitActive();
+    DistributedFileSystem hdfs = cluster.getFileSystem();
+    hdfs.mkdirs(snapshottableDir);
+    hdfs.allowSnapshot(snapshottableDir);
+    int i = 0;
+    for (; i < numSnapshots; i++) {
+      hdfs.createSnapshot(snapshottableDir, "s" + i);
+    }
+    try {
+      hdfs.createSnapshot(snapshottableDir, "s" + i);
+      Assert.fail("Expected SnapshotException not thrown");
+    } catch (SnapshotException se) {
+      Assert.assertTrue(
+          StringUtils.toLowerCase(se.getMessage()).contains(
+              "max snapshot limit"));
+    }
+
+    // now change max snapshot directory limit to 2 and restart namenode
+    cluster.getNameNode().getConf().setInt(DFSConfigKeys.
+        DFS_NAMENODE_SNAPSHOT_MAX_LIMIT, 2);
+    cluster.restartNameNodes();
+
+    // make sure edits of all previous 5 create snapshots are replayed
+    Assert.assertEquals(numSnapshots, cluster.getNamesystem().
+        getSnapshotManager().getNumSnapshots());
+
+    // make sure namenode has the new snapshot limit configured as 2
+    Assert.assertEquals(2,
+        cluster.getNamesystem().getSnapshotManager().getMaxSnapshotLimit());
+
+    // Any new snapshot creation should still fail
+    try {
+      hdfs.createSnapshot(snapshottableDir, "s" + i);
+      Assert.fail("Expected SnapshotException not thrown");
+    } catch (SnapshotException se) {
+      Assert.assertTrue(

Review comment:
   LambdaTestUtils
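A hedged sketch of the reviewer's LambdaTestUtils suggestion: each try/fail/catch block in the quoted test could collapse to a single intercept call. Names are taken from the quoted hunk; the final local copy of `i` is an assumption needed because the loop mutates `i`, so a lambda cannot capture it directly:

```java
import org.apache.hadoop.hdfs.protocol.SnapshotException;
import org.apache.hadoop.test.LambdaTestUtils;

// Replaces: try { createSnapshot(); fail(...); } catch (SnapshotException se) { assertTrue(...); }
// i is mutated by the loop above, so capture an effectively-final copy.
final int next = i;
LambdaTestUtils.intercept(SnapshotException.class, "max snapshot limit",
    () -> hdfs.createSnapshot(snapshottableDir, "s" + next));
```

intercept both asserts that the exception is thrown and that its message contains the given substring, so the StringUtils.toLowerCase/contains dance disappears as well.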
[jira] [Work logged] (HADOOP-17244) HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
[ https://issues.apache.org/jira/browse/HADOOP-17244?focusedWorklogId=481541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481541 ]

ASF GitHub Bot logged work on HADOOP-17244:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Sep/20 16:04
            Start Date: 10/Sep/20 16:04
    Worklog Time Spent: 10m
      Work Description: steveloughran merged pull request #2280:
URL: https://github.com/apache/hadoop/pull/2280

Issue Time Tracking
-------------------
    Worklog Id: (was: 481541)
    Time Spent: 1h 20m (was: 1h 10m)

> HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
> ----------------------------------------------------------------------
>                 Key: HADOOP-17244
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17244
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Blocker
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Test failure:
> {{ITestS3AFileContextMainOperations#testRenameDirectoryAsNonExistentDirectory}}
> This is repeatable on -Dauth runs (we haven't been running them, have we?)
> Either it's from the recent dir marker changes (initial hypothesis) or it's
> been lurking a while and not been picked up.
[GitHub] [hadoop] steveloughran merged pull request #2280: HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
steveloughran merged pull request #2280:
URL: https://github.com/apache/hadoop/pull/2280
[GitHub] [hadoop] brahmareddybattula commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
brahmareddybattula commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486549682

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java
## @@ -34,28 +34,35 @@

Review comment:
   My final query then: why can't we have NVDIMM treated like SSD, as it is also movable and persistent..?
[GitHub] [hadoop] szetszwo commented on a change in pull request #2296: HDFS-15568. namenode start failed to start when dfs.namenode.max.snapshot.limit set.
szetszwo commented on a change in pull request #2296:
URL: https://github.com/apache/hadoop/pull/2296#discussion_r486594478

## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
## @@ -448,22 +455,31 @@ public String createSnapshot(final LeaseManager leaseManager,
           "snapshot IDs and ID rollover is not supported.");
     }
     int n = numSnapshots.get();
-    if (n >= maxSnapshotFSLimit) {
-      // We have reached the maximum snapshot limit
-      throw new SnapshotException(
-          "Failed to create snapshot: there are already " + (n + 1)
-              + " snapshot(s) and the max snapshot limit is "
-              + maxSnapshotFSLimit);
-    }
-
-    srcRoot.addSnapshot(snapshotCounter, snapshotName, leaseManager,
-        this.captureOpenFiles, maxSnapshotLimit, mtime);
+    checkSnapshotLimit(maxSnapshotFSLimit, n);
+    srcRoot.addSnapshot(this, snapshotName, leaseManager, mtime);
     //create success, update id
     snapshotCounter++;
     numSnapshots.getAndIncrement();
     return Snapshot.getSnapshotPath(snapshotRoot, snapshotName);
   }
+
+  void checkSnapshotLimit(int limit, int numSnapshots)

Review comment:
   I suggest adding the limit type to the error message, as below.
```
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
index 266c0a71241..7a47ab4000d 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
@@ -190,8 +190,7 @@ public Snapshot addSnapshot(INodeDirectory snapshotRoot,
           + n + " snapshot(s) and the snapshot quota is " + snapshotQuota);
     }
-    snapshotManager.checkSnapshotLimit(snapshotManager.
-        getMaxSnapshotLimit(), n);
+    snapshotManager.checkPerDirectorySnapshotLimit(n);
     final Snapshot s = new Snapshot(id, name, snapshotRoot);
     final byte[] nameBytes = s.getRoot().getLocalNameBytes();
     final int i = searchSnapshot(nameBytes);
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
index 0a2e18c3dc3..7c482074486 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
@@ -455,7 +455,7 @@ public String createSnapshot(final LeaseManager leaseManager,
           "snapshot IDs and ID rollover is not supported.");
     }
     int n = numSnapshots.get();
-    checkSnapshotLimit(maxSnapshotFSLimit, n);
+    checkFileSystemSnapshotLimit(n);
     srcRoot.addSnapshot(this, snapshotName, leaseManager, mtime);
     //create success, update id
@@ -464,12 +464,19 @@ public String createSnapshot(final LeaseManager leaseManager,
     return Snapshot.getSnapshotPath(snapshotRoot, snapshotName);
   }
 
-  void checkSnapshotLimit(int limit, int numSnapshots)
-      throws SnapshotException {
+  void checkFileSystemSnapshotLimit(int n) throws SnapshotException {
+    checkSnapshotLimit(maxSnapshotFSLimit, n, "file system");
+  }
+
+  void checkPerDirectorySnapshotLimit(int n) throws SnapshotException {
+    checkSnapshotLimit(maxSnapshotLimit, n, "per directory");
+  }
+
+  private void checkSnapshotLimit(int limit, int numSnapshots,
+      String type) throws SnapshotException {
     if (numSnapshots >= limit) {
-      String msg = "there are already " + (numSnapshots + 1)
-          + " snapshot(s) and the max snapshot limit is "
-          + limit;
+      String msg = "There are already " + (numSnapshots + 1)
+          + " snapshot(s) and the " + type + " snapshot limit is " + limit;
       if (fsdir.isImageLoaded()) {
         // We have reached the maximum snapshot limit
         throw new SnapshotException(
```
[GitHub] [hadoop] viirya opened a new pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
viirya opened a new pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297

See https://issues.apache.org/jira/browse/HADOOP-17125 for details. Discussed offline with @dbtsai and submitted this based on #2201.
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481670&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481670 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Sep/20 20:01
            Start Date: 10/Sep/20 20:01
    Worklog Time Spent: 10m
      Work Description: viirya opened a new pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297

   See https://issues.apache.org/jira/browse/HADOOP-17125 for details. Discussed offline with @dbtsai and submitted this based on #2201.

Issue Time Tracking
-------------------
    Worklog Id: (was: 481670)
    Time Spent: 1.5h (was: 1h 20m)

> Using snappy-java in SnappyCodec
> --------------------------------
>                 Key: HADOOP-17125
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17125
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: common
>    Affects Versions: 3.3.0
>            Reporter: DB Tsai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the Snappy codec, which has several
> disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed in the
> system *LD_LIBRARY_PATH*, and they have to be installed separately on each
> node of the clusters, container images, or local test environments, which
> adds huge complexity from a deployment point of view. In some environments,
> it requires compiling the natives from sources, which is non-trivial. Also,
> this approach is platform dependent; the binary may not work on a different
> platform, so it requires recompilation.
> * It requires extra configuration of *java.library.path* to load the
> natives, and it results in higher application deployment and maintenance
> costs for users.
> Projects such as *Spark* and *Parquet* use
> [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based
> implementation. It contains native binaries for Linux, Mac, and IBM in the
> jar file, and it can automatically load the native binaries into the JVM
> from the jar without any setup. If a native implementation can not be found
> for a platform, it can fall back to a pure-Java implementation of snappy
> based on
> [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].
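For context, a minimal sketch of the snappy-java API the proposal delegates to. Snappy.compress and Snappy.uncompress are the real org.xerial.snappy entry points; the demo class name is hypothetical and this is illustrative rather than part of the patch:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.xerial.snappy.Snappy;

public class SnappyJavaDemo {
  public static void main(String[] args) throws IOException {
    byte[] input = "Hello, snappy-java!".getBytes(StandardCharsets.UTF_8);
    // No libhadoop/libsnappy on LD_LIBRARY_PATH is needed: snappy-java
    // ships native binaries inside its jar and loads them automatically.
    byte[] compressed = Snappy.compress(input);
    byte[] restored = Snappy.uncompress(compressed);
    System.out.println(new String(restored, StandardCharsets.UTF_8));
  }
}
```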
[GitHub] [hadoop] brahmareddybattula commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
brahmareddybattula commented on a change in pull request #2189:
URL: https://github.com/apache/hadoop/pull/2189#discussion_r486545111

## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java
## @@ -77,6 +77,9 @@
   /** Returns true if the volume is NOT backed by persistent storage. */
   boolean isTransientStorage();

Review comment:
   Ok. Got it.
[jira] [Resolved] (HADOOP-17244) HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
[ https://issues.apache.org/jira/browse/HADOOP-17244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-17244.
-------------------------------------
    Fix Version/s: 3.3.1
       Resolution: Fixed
[jira] [Commented] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193707#comment-17193707 ]

Mingliang Liu commented on HADOOP-17254:
----------------------------------------
+1 Thanks!

> Upgrade hbase to 1.4.13 on branch-2.10
> --------------------------------------
>                 Key: HADOOP-17254
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17254
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> hbase.version must be updated to address CVE-2018-8025 on branch-2.10.
[GitHub] [hadoop] steveloughran commented on pull request #2280: HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
steveloughran commented on pull request #2280:
URL: https://github.com/apache/hadoop/pull/2280#issuecomment-690402713

repeatedly tested against london with options showing the error and the -Dkeep option *and unguarded*. Some transient failures related to local network issues, addressed in HADOOP-17181.
[jira] [Work logged] (HADOOP-17244) HADOOP-17244. S3A directory delete tombstones dir markers prematurely.
[ https://issues.apache.org/jira/browse/HADOOP-17244?focusedWorklogId=481539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481539 ]

ASF GitHub Bot logged work on HADOOP-17244:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Sep/20 16:01
            Start Date: 10/Sep/20 16:01
    Worklog Time Spent: 10m
      Work Description: steveloughran commented on pull request #2280:
URL: https://github.com/apache/hadoop/pull/2280#issuecomment-690402713

   repeatedly tested against london with options showing the error and the -Dkeep option *and unguarded*. Some transient failures related to local network issues, addressed in HADOOP-17181.

Issue Time Tracking
-------------------
    Worklog Id: (was: 481539)
    Time Spent: 1h 10m (was: 1h)
[GitHub] [hadoop] viirya commented on a change in pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486686483

## File path: hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
## @@ -1,166 +0,0 @@
-/*

Review comment:
   Hmm, because we removed the native methods in the Java files, I think we no longer generate the .h files needed for compilation:
   https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
```
[WARNING] /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-2297/src/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.c:32:10: fatal error: org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h: No such file or directory
[WARNING]  #include "org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h"
[WARNING]           ^~~
[WARNING] compilation terminated.
```
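Background on why the header vanishes: javac emits a JNI header such as org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h only while a class still declares a native method; once the declaration is removed, the header is no longer generated and any .c file that still #includes it fails to compile. A hedged, illustrative stub (not the actual Hadoop class) showing the kind of declaration whose removal triggers this:

```java
// Illustrative only: compiling this with `javac -h <dir>` produces the
// org_..._SnappyDecompressor.h header that the .c file includes. Delete
// the native declaration and the header disappears from the build.
package org.apache.hadoop.io.compress.snappy;

public class SnappyDecompressor {
  // Hypothetical native declaration of the kind the PR removes.
  private native int decompressBytesDirect();
}
```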
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481781&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481781 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Sep/20 23:23
            Start Date: 10/Sep/20 23:23
    Worklog Time Spent: 10m
      Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486686483

Issue Time Tracking
-------------------
    Worklog Id: (was: 481781)
    Time Spent: 2.5h (was: 2h 20m)
[GitHub] [hadoop] fengnanli commented on pull request #2266: HDFS-15554. RBF: force router check file existence in destinations before adding/updating mount points
fengnanli commented on pull request #2266:
URL: https://github.com/apache/hadoop/pull/2266#issuecomment-690809794

The TestRouterRpcMultiDestination test passed locally. @goiri Can you help commit it? Thanks a lot!
[GitHub] [hadoop] dbtsai closed pull request #2201: HADOOP-17125. Using snappy-java in SnappyCodec
dbtsai closed pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201
[GitHub] [hadoop] smengcl commented on a change in pull request #2258: HDFS-15539. When disallowing snapshot on a dir, throw exception if its trash root is not empty
smengcl commented on a change in pull request #2258:
URL: https://github.com/apache/hadoop/pull/2258#discussion_r486694935

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
## @@ -2442,4 +2442,38 @@ public void testGetTrashRootOnEZInSnapshottableDir()
       }
     }
   }

+  @Test
+  public void testDisallowSnapshotShouldThrowWhenTrashRootExists()
+      throws IOException {
+    Configuration conf = getTestConfiguration();
+    MiniDFSCluster cluster =
+        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+    try {
+      DistributedFileSystem dfs = cluster.getFileSystem();
+      Path testDir = new Path("/disallowss/test1/");
+      Path file0path = new Path(testDir, "file-0");
+      dfs.create(file0path);
+      dfs.allowSnapshot(testDir);
+      // Create trash root manually
+      Path testDirTrashRoot = new Path(testDir, FileSystem.TRASH_PREFIX);
+      dfs.mkdirs(testDirTrashRoot);
+      // Try disallowing snapshot, should throw
+      try {
+        dfs.disallowSnapshot(testDir);
+        fail("Should have thrown IOException when trash root exists inside "

Review comment:
   Thanks! I have updated accordingly.
[GitHub] [hadoop] umamaheswararao opened a new pull request #2298: HDFS-15532: listFiles on root/InternalDir will fail if fallback root has file.
umamaheswararao opened a new pull request #2298:
URL: https://github.com/apache/hadoop/pull/2298

https://issues.apache.org/jira/browse/HDFS-15532
[GitHub] [hadoop] hadoop-yetus commented on pull request #2266: HDFS-15554. RBF: force router check file existence in destinations before adding/updating mount points
hadoop-yetus commented on pull request #2266:
URL: https://github.com/apache/hadoop/pull/2266#issuecomment-690766821

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 33m 24s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 1s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 29m 21s | trunk passed |
| +1 :green_heart: | compile | 0m 40s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 0m 36s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 0m 27s | trunk passed |
| +1 :green_heart: | mvnsite | 0m 40s | trunk passed |
| +1 :green_heart: | shadedclient | 15m 26s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 40s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 0m 54s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 1m 13s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 1m 10s | trunk passed |
||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 31s | the patch passed |
| +1 :green_heart: | compile | 0m 32s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 0m 32s | hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 0 new + 30 unchanged - 2 fixed = 30 total (was 32) |
| +1 :green_heart: | compile | 0m 28s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 0m 28s | hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 0 new + 30 unchanged - 2 fixed = 30 total (was 32) |
| +1 :green_heart: | checkstyle | 0m 17s | the patch passed |
| +1 :green_heart: | mvnsite | 0m 30s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 13m 56s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 33s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 0m 50s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 1m 14s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 8m 39s | hadoop-hdfs-rbf in the patch passed. |
| +1 :green_heart: | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | | 114m 33s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2266/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2266 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
| uname | Linux 490356438ed4 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 9960c01a25c |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2266/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2266/6/testReport/ |
| Max. process+thread count | 2936 (vs.
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481873&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481873 ]

ASF GitHub Bot logged work on HADOOP-17222:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 11/Sep/20 03:25
            Start Date: 11/Sep/20 03:25
    Worklog Time Spent: 10m
      Work Description: liuml07 commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690850909

   I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and `TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux machine with the patch. Will confirm they are not related.

   If no objections, I'll commit later this week. Thanks,

Issue Time Tracking
-------------------
    Worklog Id: (was: 481873)
    Time Spent: 2h (was: 1h 50m)

> Create socket address combined with cache to speed up hdfs client choose
> DataNode
> -------------------------------------------------------------------------
>                 Key: HADOOP-17222
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17222
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, hdfs-client
>        Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g
> hadoop hdfs version: 2.7.4
> disk: SSD
> OS: CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1)
> @Warmup(iterations = 300)
> @Measurement(iterations = 300)
>            Reporter: fanrui
>            Assignee: fanrui
>            Priority: Major
>              Labels: pull-request-available
>        Attachments: After Optimization remark.png, After optimization.svg,
> Before Optimization remark.png, Before optimization.svg
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an
> example.
>
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair ->
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based
> on the host and port. There are some heavier operations in the
> NetUtils.createSocketAddr method, for example URI.create(target), so
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling
> HDFS. HBase is a high-frequency client of HDFS, because HBase read
> operations often access a small DataBlock (about 64k) instead of the entire
> HFile. Under high-frequency access, the NetUtils.createSocketAddr method is
> time-consuming.
> h3. Test Environment:
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g
> hadoop hdfs version: 2.7.4
> disk: SSD
> OS: CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1)
> @Warmup(iterations = 300)
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port.
> Here we can add a cache of InetSocketAddress. The key of the cache is host
> and port, and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the cache,
> and the ConcurrentHashMap.get() method gets data from the cache. The CPU
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86%
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph links:
> [Before Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
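The caching idea in the description is easy to illustrate. A hedged, self-contained approximation (the actual HADOOP-17222 patch caches inside NetUtils; the class and method names here are hypothetical):

```java
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

public final class SocketAddressCache {
  // host:port -> resolved address. InetSocketAddress is immutable, so one
  // instance can safely be shared across callers.
  private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
      new ConcurrentHashMap<>();

  private SocketAddressCache() {
  }

  public static InetSocketAddress get(String host, int port) {
    // Skips re-running the URI.create/parsing path on every lookup, which
    // is the hot spot the flame graphs above identify.
    return CACHE.computeIfAbsent(host + ":" + port,
        k -> new InetSocketAddress(host, port));
  }
}
```

One design caveat: cached entries can go stale if DNS mappings change, so a production version needs an invalidation or expiry story in addition to the plain map shown here.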
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481778 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Sep/20 23:19
            Start Date: 10/Sep/20 23:19
    Worklog Time Spent: 10m
      Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r486685241

## File path: hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
## @@ -1,166 +0,0 @@
-/*

Review comment:
   Per https://github.com/apache/hadoop/pull/2201#issuecomment-681687572
   Is that native code used in `hadoop-mapreduce-client-nativetask`? If so, we probably need to keep it for now.

Issue Time Tracking
-------------------
    Worklog Id: (was: 481778)
    Time Spent: 2h 10m (was: 2h)
[jira] [Updated] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HADOOP-17254:
--------------------------------------
    Fix Version/s: 2.10.1
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved (was: Patch Available)

Thanks, [~liuml07]. I merged this.
[jira] [Work logged] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17254?focusedWorklogId=481797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481797 ]

ASF GitHub Bot logged work on HADOOP-17254:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Sep/20 23:54
            Start Date: 10/Sep/20 23:54
    Worklog Time Spent: 10m
      Work Description: iwasakims commented on pull request #2290:
URL: https://github.com/apache/hadoop/pull/2290#issuecomment-690791721

   Thanks, @liuml07. I merged this.

Issue Time Tracking
-------------------
    Worklog Id: (was: 481797)
    Time Spent: 50m (was: 40m)
[GitHub] [hadoop] iwasakims merged pull request #2290: HADOOP-17254. Upgrade hbase to 1.4.13 on branch-2.10.
iwasakims merged pull request #2290:
URL: https://github.com/apache/hadoop/pull/2290
[GitHub] [hadoop] liuml07 commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
liuml07 commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690850909

I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and `TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux machine with the patch. Will confirm they are not related.

If no objections, I'll commit later this week. Thanks,
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481876&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481876 ]

ASF GitHub Bot logged work on HADOOP-17222:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 11/Sep/20 03:44
            Start Date: 11/Sep/20 03:44
    Worklog Time Spent: 10m
      Work Description: 1996fanrui commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690855787

   > I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and `TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux machine with the patch. Will confirm they are not related.
   >
   > If no objections, I'll commit later this week. Thanks,

   @liuml07, thanks for your test. I ran these two unit tests and they are still successful. In the previous CI runs, TestMultipleNNPortQOP did not fail; it only failed the last time. But that last run only added an empty commit; my code did not change.

Issue Time Tracking
-------------------
    Worklog Id: (was: 481876)
    Time Spent: 2h 10m (was: 2h)
[GitHub] [hadoop] 1996fanrui commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
1996fanrui commented on pull request #2241:
URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690855787

> I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and `TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux machine with the patch. Will confirm they are not related.
>
> If no objections, I'll commit later this week. Thanks,

@liuml07, thanks for your test. I ran these two unit tests and they are still successful. In the previous CI runs, TestMultipleNNPortQOP did not fail; it only failed the last time. But that last run only added an empty commit; my code did not change.
[GitHub] [hadoop] dbtsai commented on pull request #2201: HADOOP-17125. Using snappy-java in SnappyCodec
dbtsai commented on pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201#issuecomment-690780546

Closing this PR in favor of https://github.com/apache/hadoop/pull/2297
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481776 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Sep/20 23:16
            Start Date: 10/Sep/20 23:16
    Worklog Time Spent: 10m
      Work Description: dbtsai commented on pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201#issuecomment-690780546

   Closing this PR in favor of https://github.com/apache/hadoop/pull/2297

Issue Time Tracking
-------------------
    Worklog Id: (was: 481776)
    Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481775&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481775 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 10/Sep/20 23:16 Start Date: 10/Sep/20 23:16 Worklog Time Spent: 10m Work Description: dbtsai closed pull request #2201: URL: https://github.com/apache/hadoop/pull/2201 Issue Time Tracking --- Worklog Id: (was: 481775) Time Spent: 1h 50m (was: 1h 40m)
[GitHub] [hadoop] hadoop-yetus commented on pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
hadoop-yetus commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690812236

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 0m 29s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 36s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 26m 46s | trunk passed |
| +1 :green_heart: | compile | 19m 28s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 8s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 41s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 1s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 48s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 10s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 15s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 37s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 29s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 4s | the patch passed |
| -1 :x: | compile | 1m 4s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | cc | 1m 4s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | golang | 1m 4s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | javac | 1m 4s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | compile | 0m 56s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | cc | 0m 56s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | golang | 0m 56s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | javac | 0m 56s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -0 :warning: | checkstyle | 2m 21s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) |
| +1 :green_heart: | mvnsite | 1m 24s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 13m 48s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 41s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 36s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 19s | hadoop-project has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 18s | hadoop-project in the patch passed. |
| -1 :x: | unit | 0m 40s | hadoop-common in the patch failed. |
| +1 :green_heart: | asflicense | 0m 34s | The patch does not generate ASF License warnings. |
| | | 125m 50s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle cc golang |
| uname | Linux 6a174de3a925 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 9960c01a25c |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481835&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481835 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 11/Sep/20 01:07 Start Date: 11/Sep/20 01:07 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690812236 :broken_heart: **-1 overall**
[jira] [Commented] (HADOOP-17257) pid file delete when service stop (secure datanode ) show cat no directory
[ https://issues.apache.org/jira/browse/HADOOP-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193954#comment-17193954 ] zhuqi commented on HADOOP-17257: cc [~boky01] Yeah, but I am not a contributor of the hadoop-common module, so I opened a new issue. I also fixed the case where the datanode is not running, in which the original source code shows the cat error as well. Thanks. > pid file delete when service stop (secure datanode ) show cat no directory > -- > > Key: HADOOP-17257 > URL: https://issues.apache.org/jira/browse/HADOOP-17257 > Project: Hadoop Common > Issue Type: Bug > Components: scripts, security > Affects Versions: 3.4.0 > Reporter: zhuqi > Priority: Major > Attachments: HADOOP-17257-0.0.1.patch > > > When stopping a running secure datanode, it shows a cat: no directory error. > > When stopping a secure datanode that is not running, it also shows a cat: no pid directory error. > > Both are unreasonable.
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481753&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481753 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 10/Sep/20 22:29 Start Date: 10/Sep/20 22:29 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690765839 :broken_heart: **-1 overall**
[GitHub] [hadoop] hadoop-yetus commented on pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
hadoop-yetus commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690765839

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 0m 41s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 34s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 33m 4s | trunk passed |
| +1 :green_heart: | compile | 25m 59s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 21m 36s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 14s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 10s | trunk passed |
| +1 :green_heart: | shadedclient | 23m 22s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 9s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 16s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 38s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 29s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 6s | the patch passed |
| -1 :x: | compile | 1m 6s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | javac | 1m 6s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | compile | 1m 0s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | javac | 1m 0s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -0 :warning: | checkstyle | 2m 20s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) |
| +1 :green_heart: | mvnsite | 1m 24s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 20s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 41s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 35s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 19s | hadoop-project has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 18s | hadoop-project in the patch passed. |
| -1 :x: | unit | 0m 42s | hadoop-common in the patch failed. |
| +1 :green_heart: | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 146m 43s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle |
| uname | Linux 205df60c0f1e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 9960c01a25c |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| compile | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/patch-compile-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt |
| javac | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/patch-compile-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt |
| compile |
[GitHub] [hadoop] iwasakims commented on pull request #2290: HADOOP-17254. Upgrade hbase to 1.4.13 on branch-2.10.
iwasakims commented on pull request #2290: URL: https://github.com/apache/hadoop/pull/2290#issuecomment-690791721 Thanks, @liuml07. I merged this.
[jira] [Work logged] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17254?focusedWorklogId=481796&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481796 ] ASF GitHub Bot logged work on HADOOP-17254: --- Author: ASF GitHub Bot Created on: 10/Sep/20 23:54 Start Date: 10/Sep/20 23:54 Worklog Time Spent: 10m Work Description: iwasakims merged pull request #2290: URL: https://github.com/apache/hadoop/pull/2290 Issue Time Tracking --- Worklog Id: (was: 481796) Time Spent: 40m (was: 0.5h) > Upgrade hbase to 1.4.13 on branch-2.10 > -- > > Key: HADOOP-17254 > URL: https://issues.apache.org/jira/browse/HADOOP-17254 > Project: Hadoop Common > Issue Type: Improvement > Reporter: Masatake Iwasaki > Assignee: Masatake Iwasaki > Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > hbase.version must be updated to address CVE-2018-8025 on branch-2.10.
[GitHub] [hadoop] YaYun-Wang commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
YaYun-Wang commented on a change in pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#discussion_r486716810 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java ## @@ -77,6 +77,9 @@ /** Returns true if the volume is NOT backed by persistent storage. */ boolean isTransientStorage(); Review comment: > So, NVDIMM is persistent storage and RAM. Yes, that's right.
[GitHub] [hadoop] YaYun-Wang commented on a change in pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
YaYun-Wang commented on a change in pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#discussion_r486716471 ## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java ## @@ -34,28 +34,35 @@ @InterfaceStability.Unstable public enum StorageType { // sorted by the speed of the storage types, from fast to slow - RAM_DISK(true), - SSD(false), - DISK(false), - ARCHIVE(false), - PROVIDED(false); + RAM_DISK(true, true), + NVDIMM(false, true), + SSD(false, false), + DISK(false, false), + ARCHIVE(false, false), + PROVIDED(false, false); private final boolean isTransient; + private final boolean isRAM; public static final StorageType DEFAULT = DISK; public static final StorageType[] EMPTY_ARRAY = {}; private static final StorageType[] VALUES = values(); - StorageType(boolean isTransient) { + StorageType(boolean isTransient, boolean isRAM) { this.isTransient = isTransient; +this.isRAM = isRAM; } public boolean isTransient() { return isTransient; } + public boolean isRAM() { +return isRAM; + } Review comment: > My final query then, why can't we have one NVDIMM like one SSD, as this is also movable and persistent..? Since NVDIMM is faster, NVDIMM does not use `FsDatasetCache()`, which SSD needs in this design.
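For illustration, a small sketch of how the two flags in the enum above separate the media types once this patch is applied; the demo class name is hypothetical, but the flag values follow directly from the diff:

```java
import org.apache.hadoop.fs.StorageType;

public class StorageTypeFlagsDemo {
  public static void main(String[] args) {
    // NVDIMM is RAM but persistent: isRAM() true, isTransient() false.
    System.out.println(StorageType.NVDIMM.isRAM());         // true
    System.out.println(StorageType.NVDIMM.isTransient());   // false

    // RAM_DISK is RAM and volatile: both flags are true.
    System.out.println(StorageType.RAM_DISK.isRAM());       // true
    System.out.println(StorageType.RAM_DISK.isTransient()); // true

    // SSD is neither RAM nor transient.
    System.out.println(StorageType.SSD.isRAM());            // false
    System.out.println(StorageType.SSD.isTransient());      // false
  }
}
```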
[GitHub] [hadoop] liuml07 commented on pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
liuml07 commented on pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#issuecomment-690839476 Will check again later this week. Ideally we can get a clean QA run. Could you check the test failures and make sure they are not related? Thanks.
[GitHub] [hadoop] hadoop-yetus commented on pull request #2258: HDFS-15539. When disallowing snapshot on a dir, throw exception if its trash root is not empty
hadoop-yetus commented on pull request #2258: URL: https://github.com/apache/hadoop/pull/2258#issuecomment-690845948

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 1m 7s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 28s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 28m 49s | trunk passed |
| +1 :green_heart: | compile | 4m 37s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 4m 15s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 1m 2s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 14s | trunk passed |
| +1 :green_heart: | shadedclient | 19m 0s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 56s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 28s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 5m 38s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 24s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 56s | the patch passed |
| +1 :green_heart: | compile | 4m 10s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 4m 10s | the patch passed |
| +1 :green_heart: | compile | 3m 46s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 3m 46s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 53s | the patch passed |
| +1 :green_heart: | mvnsite | 1m 59s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 15m 50s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 20s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 47s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 7m 3s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 18s | hadoop-hdfs-client in the patch passed. |
| -1 :x: | unit | 77m 54s | hadoop-hdfs in the patch failed. |
| +0 :ok: | asflicense | 0m 38s | ASF License check generated no output? |
| | | 194m 25s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.TestReencryption |
| | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
| | hadoop.hdfs.server.namenode.TestNameNodeReconfigure |
| | hadoop.hdfs.server.namenode.TestNamenodeStorageDirectives |
| | hadoop.hdfs.server.namenode.TestNameEditsConfigs |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.server.namenode.TestFileLimit |
| | hadoop.hdfs.server.namenode.sps.TestStoragePolicySatisfierWithStripedFile |
| | hadoop.hdfs.TestAppendDifferentChecksum |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.server.namenode.TestCheckpoint |
| | hadoop.hdfs.server.namenode.TestMetadataVersionOutput |
| | hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness |
| | hadoop.hdfs.server.namenode.TestFSImageWithXAttr |
| | hadoop.hdfs.server.namenode.TestQuotaWithStripedBlocksWithRandomECPolicy |
| | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier |
| | hadoop.hdfs.TestDFSStripedInputStream |
| | hadoop.hdfs.server.namenode.TestStoragePolicySatisfierWithHA |
| | hadoop.hdfs.server.datanode.TestDataNodeLifeline |
| | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
| | hadoop.hdfs.server.datanode.TestBatchIbr |
| | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
[GitHub] [hadoop] dbtsai commented on a change in pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
dbtsai commented on a change in pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#discussion_r486685241 ## File path: hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c ## @@ -1,166 +0,0 @@ -/* Review comment: Per https://github.com/apache/hadoop/pull/2201#issuecomment-681687572, is that native code used in `hadoop-mapreduce-client-nativetask`? If so, we probably need to keep it for now.
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481779&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481779 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 10/Sep/20 23:20 Start Date: 10/Sep/20 23:20 Worklog Time Spent: 10m Work Description: dbtsai commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690781897 Thanks @viirya for taking over my https://github.com/apache/hadoop/pull/2201 and continuing the work on it. Issue Time Tracking --- Worklog Id: (was: 481779) Time Spent: 2h 20m (was: 2h 10m)
[GitHub] [hadoop] dbtsai commented on pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
dbtsai commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690781897 Thanks @viirya for taking over my https://github.com/apache/hadoop/pull/2201 and continuing the work on it.
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=481786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481786 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 10/Sep/20 23:30 Start Date: 10/Sep/20 23:30 Worklog Time Spent: 10m Work Description: viirya commented on a change in pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#discussion_r486688702 ## File path: hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c ## @@ -1,166 +0,0 @@ -/* Review comment: Btw, I don't see them used in `hadoop-mapreduce-client-nativetask`, unless I missed something. Let's wait for the build and tests. Issue Time Tracking --- Worklog Id: (was: 481786) Time Spent: 2h 40m (was: 2.5h)
[GitHub] [hadoop] viirya commented on a change in pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
viirya commented on a change in pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#discussion_r486688702 ## File path: hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c ## @@ -1,166 +0,0 @@ -/* Review comment: Btw, I don't see them used in `hadoop-mapreduce-client-nativetask`, unless I missed something. Let's wait for the build and tests.
[GitHub] [hadoop] huangtianhua commented on pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
huangtianhua commented on pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#issuecomment-690829884 @liuml07 and @brahmareddybattula So if all of the code is OK, would you please approve? Thanks.
[GitHub] [hadoop] smengcl commented on pull request #2258: HDFS-15539. When disallowing snapshot on a dir, throw exception if its trash root is not empty
smengcl commented on pull request #2258: URL: https://github.com/apache/hadoop/pull/2258#issuecomment-690847462 > Thanks @smengcl for working on this. The test failures like TestDistributedFileSystem#testGetTrashRoots look related. Can you plz verify? I'm checking. I probably need to add a line or two to clean up the old tests.
[GitHub] [hadoop] hemanthboyina commented on pull request #2267: HDFS-15555. RBF: Refresh cacheNS when SocketException occurs.
hemanthboyina commented on pull request #2267: URL: https://github.com/apache/hadoop/pull/2267#issuecomment-690878265 Any update here, @aajisaka? HDFS-15543 modified the same part of the code.
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=481883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481883 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 11/Sep/20 04:17 Start Date: 11/Sep/20 04:17 Worklog Time Spent: 10m Work Description: umamaheswararao commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690863729 +1 on the latest patch. Thanks again for your great work @JohnZZGithub. Test failures are unrelated. I think we crossed the line-length limit by 3 lines, which triggers a warning; it's not worth a minor refactor for the sake of that unless we do a full refactoring of that method. Please file a separate issue to refactor it into smaller, cleaner methods. I will proceed to commit the current patch. Issue Time Tracking --- Worklog Id: (was: 481883) Time Spent: 8h 40m (was: 8.5h) > Provide Regex Based Mount Point In Inode Tree > - > > Key: HADOOP-15891 > URL: https://issues.apache.org/jira/browse/HADOOP-15891 > Project: Hadoop Common > Issue Type: New Feature > Components: viewfs > Reporter: zhenzhao wang > Assignee: zhenzhao wang > Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount Table-v1.pdf > > Time Spent: 8h 40m > Remaining Estimate: 0h > > This jira is created to support regex based mount points in the Inode Tree. We noticed that mount points only support fixed target paths. However, we might have use cases where the target needs to refer to some fields from the source. E.g., we might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, i.e., we want to refer to the `cluster` and `user` fields in the source to construct the target. It's impossible to achieve this with the current link type. Though we could set up one-to-one mappings, the mount table would become bloated if we have thousands of users. Besides, a regex mapping gives us more flexibility. So we are going to build a regex based mount point whose target can refer to groups from the source regex mapping.
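To make the mapping in the description concrete, here is a minimal sketch of capture-group substitution in plain Java; the pattern and target template are illustrative only, not the actual viewfs mount-table configuration syntax:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMountPointSketch {
  public static void main(String[] args) {
    // Source path pattern with named capture groups for cluster and user.
    Pattern src = Pattern.compile("^/(?<cluster>\\w+)/(?<user>\\w+)$");

    Matcher m = src.matcher("/cluster1/user1");
    if (m.matches()) {
      // The target template refers to the captured groups, so one rule
      // covers thousands of users instead of one-to-one mount entries.
      String target = "/" + m.group("cluster") + "-dc1/user-nn-" + m.group("user");
      System.out.println(target); // prints /cluster1-dc1/user-nn-user1
    }
  }
}
```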
[jira] [Work logged] (HADOOP-17222) Create socket address leveraging URI cache
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481898&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481898 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 11/Sep/20 05:34 Start Date: 11/Sep/20 05:34 Worklog Time Spent: 10m Work Description: liuml07 commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690885275 Failing tests are not related and all pass locally on my laptop, except `TestNameNodeRetryCacheMetrics`, which is known to be flaky; see [HDFS-15458](https://issues.apache.org/jira/browse/HDFS-15458). Issue Time Tracking --- Worklog Id: (was: 481898) Time Spent: 2h 40m (was: 2.5h) > Create socket address leveraging URI cache > --- > > Key: HADOOP-17222 > URL: https://issues.apache.org/jira/browse/HADOOP-17222 > Project: Hadoop Common > Issue Type: Improvement > Components: common, hdfs-client > Environment: HBase version: 2.1.0 > JVM: -Xmx2g -Xms2g > hadoop hdfs version: 2.7.4 > disk:SSD > OS:CentOS Linux release 7.4.1708 (Core) > JMH Benchmark: @Fork(value = 1) > @Warmup(iterations = 300) > @Measurement(iterations = 300) > Reporter: fanrui > Assignee: fanrui > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: After Optimization remark.png, After optimization.svg, Before Optimization remark.png, Before optimization.svg > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Note: not only the HDFS client gets this benefit; all callers of NetUtils.createSocketAddr will benefit. The HDFS client is just used as an example. > > The HDFS client selects the best DN for an HDFS block. Method call stack: > DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> NetUtils.createSocketAddr > NetUtils.createSocketAddr creates the corresponding InetSocketAddress based on the host and port. There are some heavier operations in the NetUtils.createSocketAddr method, for example URI.create(target), so NetUtils.createSocketAddr takes more time to execute. > The following is my performance report. The report is based on HBase calling HDFS. HBase is a high-frequency access client for HDFS, because HBase read operations often access a small DataBlock (about 64k) instead of the entire HFile. In the case of high-frequency access, the NetUtils.createSocketAddr method is time-consuming. > h3. Test Environment: > {code:java} > HBase version: 2.1.0 > JVM: -Xmx2g -Xms2g > hadoop hdfs version: 2.7.4 > disk:SSD > OS:CentOS Linux release 7.4.1708 (Core) > JMH Benchmark: @Fork(value = 1) > @Warmup(iterations = 300) > @Measurement(iterations = 300) > {code} > h4. Before Optimization FlameGraph: > In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts for 4.86% of the entire CPU, and the creation of URIs accounts for a larger proportion. > !Before Optimization remark.png! > h3. Optimization ideas: > NetUtils.createSocketAddr creates an InetSocketAddress based on host and port. Here we can add a cache for InetSocketAddress: the key of the cache is host and port, and the value is the InetSocketAddress. > h4. After Optimization FlameGraph: > In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the cache, and the ConcurrentHashMap.get() method gets data from the cache. The CPU usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% to 0.54%. > !After Optimization remark.png! > h3. Original FlameGraph links: > [Before Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing] > [After Optimization FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
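A minimal sketch of the caching pattern the description outlines, assuming a ConcurrentHashMap keyed by the host:port pair; the class and method names here are hypothetical, and the actual patch applies the cache inside NetUtils.createSocketAddr rather than in a separate helper:

```java
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

public final class SocketAddressCache {
  // host:port -> InetSocketAddress, shared by all callers.
  private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
      new ConcurrentHashMap<>();

  public static InetSocketAddress get(String host, int port) {
    // computeIfAbsent pays the construction cost (URI parsing etc. in the
    // real code path) only on the first lookup for a given host:port pair.
    // createUnresolved keeps this sketch free of DNS side effects.
    return CACHE.computeIfAbsent(host + ":" + port,
        key -> InetSocketAddress.createUnresolved(host, port));
  }

  private SocketAddressCache() {
  }
}
```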
[jira] [Updated] (HADOOP-17222) Create socket address leveraging URI cache
[ https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HADOOP-17222: --- Summary: Create socket address leveraging URI cache (was: Create socket address combined with cache to speed up hdfs client choose DataNode)
[GitHub] [hadoop] liuml07 commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
liuml07 commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690885275 Failing tests are not related and all pass locally on my laptop, except `TestNameNodeRetryCacheMetrics`, which is known to be flaky; see [HDFS-15458](https://issues.apache.org/jira/browse/HDFS-15458).
[GitHub] [hadoop] umamaheswararao commented on pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
umamaheswararao commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690863729 +1 on the latest patch. Thanks again for your great work @JohnZZGithub. Test failures are unrelated. I think we crossed the line-length limit by 3 lines, which triggers a warning; it's not worth a minor refactor for the sake of that unless we do a full refactoring of that method. Please file a separate issue to refactor it into smaller, cleaner methods. I will proceed to commit the current patch.
[jira] [Updated] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HADOOP-15891: --- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~wzzdreamer] for the nice work here. I have committed this patch to trunk!!
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=481887&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481887 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 11/Sep/20 04:28 Start Date: 11/Sep/20 04:28 Worklog Time Spent: 10m Work Description: JohnZZGithub commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690866308 @umamaheswararao Thanks a lot! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481887) Time Spent: 9h (was: 8h 50m) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] JohnZZGithub commented on pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
JohnZZGithub commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690866308 @umamaheswararao Thanks a lot! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481897&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481897 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 11/Sep/20 05:31 Start Date: 11/Sep/20 05:31 Worklog Time Spent: 10m Work Description: liuml07 edited a comment on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690850909 I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and `TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux machine with the patch. Will confirm they are not related. If there are no objections, I'll commit later today. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481897) Time Spent: 2.5h (was: 2h 20m)

> Create socket address combined with cache to speed up hdfs client choose DataNode
> ----------------------------------------------------------------------------------
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
> Issue Type: Improvement
> Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1)
> @Warmup(iterations = 300)
> @Measurement(iterations = 300)
> Reporter: fanrui
> Assignee: fanrui
> Priority: Major
> Labels: pull-request-available
> Attachments: After Optimization remark.png, After optimization.svg, Before Optimization remark.png, Before optimization.svg
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> Note: not only the hdfs client gets this benefit; all callers of NetUtils.createSocketAddr will benefit. The hdfs client is just used as an example.
>
> The hdfs client selects the best DN for an hdfs block. Method call stack: DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based on the host and port. There are some heavy operations in the NetUtils.createSocketAddr method, for example URI.create(target), so NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling hdfs. HBase is a high-frequency hdfs client, because HBase read operations often access a small DataBlock (about 64k) instead of the entire HFile. In the case of high-frequency access, the NetUtils.createSocketAddr method is time-consuming.
> h3. Test Environment:
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1)
> @Warmup(iterations = 300)
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts for 4.86% of the entire CPU, and URI creation accounts for a large share of that.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress based on host and port. Here we can add a cache for InetSocketAddress: the key of the cache is host and port, and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the cache, and the ConcurrentHashMap.get() method gets data from the cache. The CPU usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
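The caching idea described above is easy to sketch. Below is a minimal illustration using ConcurrentHashMap; the class and method names are hypothetical, and the actual change to NetUtils may differ in detail.

{code:java}
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative host:port -> InetSocketAddress cache; names are hypothetical. */
public final class SocketAddrCacheSketch {
  private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
      new ConcurrentHashMap<>();

  public static InetSocketAddress createSocketAddr(String host, int port) {
    // computeIfAbsent runs the expensive construction (URI parsing in the
    // real code path) only once per host:port, then serves cache hits.
    return CACHE.computeIfAbsent(host + ":" + port,
        k -> InetSocketAddress.createUnresolved(host, port));
  }

  public static void main(String[] args) {
    InetSocketAddress a = createSocketAddr("dn1.example.com", 9866);
    InetSocketAddress b = createSocketAddr("dn1.example.com", 9866);
    System.out.println(a + ", cache hit: " + (a == b)); // cache hit: true
  }
}
{code}

A real cache also needs an invalidation story (DNS changes, decommissioned DataNodes); the sketch only shows the hot-path saving that the flame graphs quantify.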
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=481896&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481896 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 11/Sep/20 05:31 Start Date: 11/Sep/20 05:31 Worklog Time Spent: 10m Work Description: liuml07 merged pull request #2241: URL: https://github.com/apache/hadoop/pull/2241 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481896) Time Spent: 2h 20m (was: 2h 10m) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17222) Create socket address leveraging URI cache
[ https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HADOOP-17222: --- Fix Version/s: 3.4.0 Resolution: Fixed Status: Resolved (was: Patch Available) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] liuml07 edited a comment on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
liuml07 edited a comment on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-690850909 I see `TestNameNodeRetryCacheMetrics.testRetryCacheMetrics` and `TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream` fail on my Linux machine with the patch. Will confirm they are not related. If there are no objections, I'll commit later today. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=481885&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481885 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 11/Sep/20 04:20 Start Date: 11/Sep/20 04:20 Worklog Time Spent: 10m Work Description: umamaheswararao merged pull request #2185: URL: https://github.com/apache/hadoop/pull/2185 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481885) Time Spent: 8h 50m (was: 8h 40m) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] umamaheswararao merged pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
umamaheswararao merged pull request #2185: URL: https://github.com/apache/hadoop/pull/2185 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] aajisaka commented on pull request #2267: HDFS-15555. RBF: Refresh cacheNS when SocketException occurs.
aajisaka commented on pull request #2267: URL: https://github.com/apache/hadoop/pull/2267#issuecomment-690880934 Hi @hemanthboyina, I don't have any updates here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] LeonGao91 opened a new pull request #2299: HDFS-15456. TestExternalStoragePolicySatisfier fails intermittently
LeonGao91 opened a new pull request #2299: URL: https://github.com/apache/hadoop/pull/2299 One-line fix for this UT: ignore datanode load so that block placement can successfully pick the fallback storage type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
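For context, "ignore datanode load" in a MiniDFSCluster test is normally a one-line configuration change along the following lines. This is a sketch under the assumption that the patch flips the standard considerLoad setting; the actual one-liner may differ.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

/** Sketch: build a test conf that ignores datanode load during placement. */
public final class ConsiderLoadSketch {
  public static Configuration testConf() {
    Configuration conf = new Configuration();
    // With load-based exclusion off, the block placement policy can pick
    // the fallback storage type even when the few test DNs look "busy".
    conf.setBoolean(
        DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY, false);
    return conf;
  }
}
{code}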
[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxiaolong updated HADOOP-17256: - Attachment: image-2020-09-10-17-45-16-998.png

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ------------------------------------------------------------------------
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
> Issue Type: Bug
> Components: tools/distcp
> Reporter: liuxiaolong
> Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png
>
> We use distcp with the -update option to copy a dir from hdfs to S3. When we run the distcp job once more, it overwrites the S3 dir directly rather than skipping the unchanged files.
>
> Test Case:
> 1. Run the distcp cmd twice; the modification time of the S3 files is changed each time.
> hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-17256. - Resolution: Duplicate

Caused by HADOOP-8143, which has now been rolled back everywhere it went in. It can also cause 404 errors, so it was a critical rollback. Closing as a duplicate of HADOOP-8143. All future releases of Hadoop branch 3 will contain this fix.

> DistCp -update option will be invalid when distcp files from hdfs to S3
> ------------------------------------------------------------------------
>
> Key: HADOOP-17256
> URL: https://issues.apache.org/jira/browse/HADOOP-17256
> Project: Hadoop Common
> Issue Type: Bug
> Components: tools/distcp
> Reporter: liuxiaolong
> Priority: Major
> Attachments: image-2020-09-10-17-25-46-354.png, image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, image-2020-09-10-17-47-01-653.png, image-2020-09-10-17-52-32-290.png
>
> We use distcp with the -update option to copy a dir from hdfs to S3. When we run the distcp job once more, it overwrites the S3 dir directly rather than skipping the unchanged files.
>
> Test Case:
> Run the following cmd twice; the modification time of the S3 files is changed every time.
> hadoop distcp -update /test/ s3a://${s3_buckect}/test/
>
> Check the code in CopyMapper.java and S3AFileSystem.java:
> (1) The first time, the distcp job creates the files in S3, but blockSize is unused!
> !image-2020-09-10-17-45-16-998.png|width=542,height=485!
> (2) The second time, the distcp job compares fileSize and blockSize between the hdfs file and the S3 file.
> !image-2020-09-10-17-47-01-653.png|width=524,height=248!
> (3) blockSize is unused; when getting the blockSize of an S3 file, it returns a default value.
> In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is 32 * 1024 * 1024.
> !image-2020-09-10-17-33-50-505.png|width=451,height=762!
> !image-2020-09-10-17-52-32-290.png|width=527,height=87!
> The blockSize of HDFS is meaningless in an object store like S3. So I think there's no need to compare blockSize when running distcp with the -update option.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
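A minimal sketch of the skip decision being debated (illustrative only; not CopyMapper's actual code): when the target is an object store whose reported block size is a fixed client-side default such as fs.s3a.block.size, the -update check should not let that fabricated value force a re-copy.

{code:java}
import org.apache.hadoop.fs.FileStatus;

/** Illustrative -update "can we skip this file?" check; not DistCp's code. */
public final class UpdateSkipSketch {
  static boolean canSkip(FileStatus src, FileStatus dst,
                         boolean targetIsObjectStore) {
    if (src.getLen() != dst.getLen()) {
      return false;                 // sizes differ: must copy
    }
    if (targetIsObjectStore) {
      return true;                  // block size is a fixed default: ignore it
    }
    // On filesystems with real block sizes, a mismatch still forces a copy.
    return src.getBlockSize() == dst.getBlockSize();
  }
}
{code}

The sketch only captures why the naive blockSize comparison misfires against S3; the actual resolution, as the comment above notes, was rolling back the change that triggered it.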
[jira] [Work logged] (HADOOP-11452) Make FileSystem.rename(path, path, options) public, specified, tested
[ https://issues.apache.org/jira/browse/HADOOP-11452?focusedWorklogId=481375&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481375 ] ASF GitHub Bot logged work on HADOOP-11452: --- Author: ASF GitHub Bot Created on: 10/Sep/20 11:16 Start Date: 10/Sep/20 11:16 Worklog Time Spent: 10m Work Description: hadoop-yetus removed a comment on pull request #743: URL: https://github.com/apache/hadoop/pull/743#issuecomment-685019888 :broken_heart: **-1 overall**
[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #743: HADOOP-11452 make rename/3 public
hadoop-yetus removed a comment on pull request #743: URL: https://github.com/apache/hadoop/pull/743#issuecomment-685019888 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 1m 11s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 1s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 10 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 20s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 28m 46s | trunk passed |
| +1 :green_heart: | compile | 20m 47s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 39s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 1s | trunk passed |
| +1 :green_heart: | mvnsite | 5m 5s | trunk passed |
| +1 :green_heart: | shadedclient | 24m 24s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 3m 6s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 4m 37s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 45s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 9m 54s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 22s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 3m 39s | the patch passed |
| +1 :green_heart: | compile | 20m 14s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | javac | 20m 14s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 1 new + 2050 unchanged - 5 fixed = 2051 total (was 2055) |
| +1 :green_heart: | compile | 17m 40s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | javac | 17m 40s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1944 unchanged - 5 fixed = 1945 total (was 1949) |
| -0 :warning: | checkstyle | 3m 3s | root: The patch generated 103 new + 484 unchanged - 30 fixed = 587 total (was 514) |
| +1 :green_heart: | mvnsite | 5m 0s | the patch passed |
| -1 :x: | whitespace | 0m 0s | The patch has 8 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 :green_heart: | shadedclient | 15m 43s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 3m 7s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | javadoc | 1m 30s | hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1) |
| -1 :x: | findbugs | 2m 22s | hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
||| _ Other Tests _ |
| -1 :x: | unit | 9m 48s | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 7s | hadoop-hdfs-client in the patch passed. |
| -1 :x: | unit | 126m 42s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | unit | 0m 38s | hadoop-openstack in the patch passed. |
| +1 :green_heart: | unit | 1m 58s | hadoop-aws in the patch passed. |
| -1 :x: | asflicense | 0m 59s | The patch generated 3 ASF License warnings. |
| | | 345m 9s | |

| Reason | Tests |
|---:|:--|
| FindBugs | module:hadoop-common-project/hadoop-common |
| | Should org.apache.hadoop.fs.impl.RenameHelper$RenameValidationResult be a _static_ inner class? At RenameHelper.java:inner class? At RenameHelper.java:[line 320] |
| Failed junit tests | hadoop.fs.contract.rawlocal.TestRawlocalContractRenameEx |
| | hadoop.fs.TestFSMainOperationsLocalFileSystem |
| | hadoop.fs.viewfs.TestViewFsWithAuthorityLocalFs |
| | hadoop.fs.TestSymlinkLocalFSFileContext |
| | hadoop.fs.TestChecksumFs |
| | hadoop.fs.viewfs.TestFSMainOperationsLocalFileSystem |
| | hadoop.fs.viewfs.TestFcMainOperationsLocalFs |
[jira] [Work logged] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17253?focusedWorklogId=481376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481376 ] ASF GitHub Bot logged work on HADOOP-17253: --- Author: ASF GitHub Bot Created on: 10/Sep/20 11:17 Start Date: 10/Sep/20 11:17 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #2289: URL: https://github.com/apache/hadoop/pull/2289#issuecomment-690175258 LGTM +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481376) Time Spent: 0.5h (was: 20m)

> Upgrade zookeeper to 3.4.14 on branch-2.10
> ------------------------------------------
>
> Key: HADOOP-17253
> URL: https://issues.apache.org/jira/browse/HADOOP-17253
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Masatake Iwasaki
> Assignee: Masatake Iwasaki
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Since versions of zookeeper and curator have different history between branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on branch-2.10.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on pull request #2289: HADOOP-17253. Upgrade zookeeper to 3.4.14 on branch-2.10.
steveloughran commented on pull request #2289: URL: https://github.com/apache/hadoop/pull/2289#issuecomment-690175258 LGTM +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17195) Intermittent OutOfMemory error while performing hdfs CopyFromLocal to abfs
[ https://issues.apache.org/jira/browse/HADOOP-17195?focusedWorklogId=481381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481381 ] ASF GitHub Bot logged work on HADOOP-17195: --- Author: ASF GitHub Bot Created on: 10/Sep/20 11:22 Start Date: 10/Sep/20 11:22 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #2294: URL: https://github.com/apache/hadoop/pull/2294#issuecomment-690183135 Looking at this a bit more:
* it's the use of buffers that causes the OOM, not the thread pooling, so neither this nor its predecessor patch will directly fix that
* we need to support a bytebuffer pool with max capacity and/or disk buffering

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481381) Time Spent: 0.5h (was: 20m)

> Intermittent OutOfMemory error while performing hdfs CopyFromLocal to abfs
> ---------------------------------------------------------------------------
>
> Key: HADOOP-17195
> URL: https://issues.apache.org/jira/browse/HADOOP-17195
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Mehakmeet Singh
> Assignee: Bilahari T H
> Priority: Major
> Labels: abfsactive, pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> OutOfMemory error due to new ThreadPools being created each time an AbfsOutputStream is created. Since the thread pools aren't limited, a lot of data is loaded into buffers, and that causes the OutOfMemory error.
> Possible fixes:
> - Limit the thread count while performing hdfs copyFromLocal (using the -t property).
> - Reduce OUTPUT_BUFFER_SIZE significantly, which would limit the amount of data buffered in threads.
> - Don't create a new ThreadPool each time an AbfsOutputStream is created, and limit the number of ThreadPools each AbfsOutputStream can create.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
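The fix direction in the last bullet of the description can be sketched as follows: share one bounded executor across output streams instead of creating an unbounded pool per stream, so writers get back-pressure instead of unbounded buffering. This is an illustration of the approach only, not the actual ABFS patch; all names are hypothetical.

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Sketch: one shared, bounded upload pool instead of a pool per stream. */
public final class SharedUploadPoolSketch {
  private static final ThreadPoolExecutor POOL = new ThreadPoolExecutor(
      4, 16,                                      // bounded worker count
      60, TimeUnit.SECONDS,
      new ArrayBlockingQueue<>(32),               // bounded queue of pending blocks
      new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure instead of OOM

  public static void submitUpload(Runnable uploadBlock) {
    // Memory in flight is capped at roughly (threads + queue size) * buffer size.
    POOL.execute(uploadBlock);
  }
}
{code}

CallerRunsPolicy makes a writer upload its own block when the queue is full, which is one simple way to bound buffered data; a capped bytebuffer pool or disk spill, as the comment above suggests, addresses the buffer side directly.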
[jira] [Work logged] (HADOOP-17191) ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results
[ https://issues.apache.org/jira/browse/HADOOP-17191?focusedWorklogId=481382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481382 ] ASF GitHub Bot logged work on HADOOP-17191: --- Author: ASF GitHub Bot Created on: 10/Sep/20 11:23 Start Date: 10/Sep/20 11:23 Worklog Time Spent: 10m Work Description: hadoop-yetus removed a comment on pull request #2278: URL: https://github.com/apache/hadoop/pull/2278#issuecomment-687718737 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481382) Time Spent: 3h 40m (was: 3.5h)

> ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results
> ----------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-17191
> URL: https://issues.apache.org/jira/browse/HADOOP-17191
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure, test
> Affects Versions: 3.3.0
> Reporter: Bilahari T H
> Assignee: Bilahari T H
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> ADLS Gen 2 supports accounts with and without hierarchical namespace support. The ABFS driver supports various authorization mechanisms such as OAuth, SharedKey, and Shared Access Signature. The integration tests need to be executed against accounts with and without hierarchical namespace support, using the various authorization mechanisms.
> Currently the developer has to manually run the tests with different combinations of configurations, e.g. an HNS account with SharedKey and OAuth, a non-HNS account with SharedKey, etc.
> The expectation is to automate these runs with the different combinations. This will help the developer run the integration tests with the different configuration variants automatically.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #2278: HADOOP-17191. ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results
hadoop-yetus removed a comment on pull request #2278: URL: https://github.com/apache/hadoop/pull/2278#issuecomment-687718737 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on pull request #2278: HADOOP-17191. ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results
steveloughran commented on pull request #2278: URL: https://github.com/apache/hadoop/pull/2278#issuecomment-690188077 Happy with all the changes. Still unsure about the full implications. Can I still run "mvn verify" at the command line and individual tests in the IDE? + @mehakmeet This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17191) ABFS: Run the integration tests with various combinations of configurations and publish a consolidated results
[ https://issues.apache.org/jira/browse/HADOOP-17191?focusedWorklogId=481386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481386 ] ASF GitHub Bot logged work on HADOOP-17191: --- Author: ASF GitHub Bot Created on: 10/Sep/20 11:28 Start Date: 10/Sep/20 11:28 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #2278: URL: https://github.com/apache/hadoop/pull/2278#issuecomment-690188077 Happy with all the changes. Still unsure about the full implications. Can I still run "mvn verify" at the command line and individual tests in the IDE? + @mehakmeet This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481386) Time Spent: 4h (was: 3h 50m) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17253?focusedWorklogId=481394&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481394 ] ASF GitHub Bot logged work on HADOOP-17253: --- Author: ASF GitHub Bot Created on: 10/Sep/20 11:37 Start Date: 10/Sep/20 11:37 Worklog Time Spent: 10m Work Description: iwasakims commented on pull request #2289: URL: https://github.com/apache/hadoop/pull/2289#issuecomment-690200705 Thanks, @steveloughran. I merged this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 481394) Time Spent: 50m (was: 40m) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] iwasakims commented on pull request #2289: HADOOP-17253. Upgrade zookeeper to 3.4.14 on branch-2.10.
iwasakims commented on pull request #2289: URL: https://github.com/apache/hadoop/pull/2289#issuecomment-690200705 Thanks, @steveloughran. I merged this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193566#comment-17193566 ] liuxiaolong commented on HADOOP-17256: -- Thanks. I tried rolling back HADOOP-8143, and it's OK now. !image-2020-09-10-19-48-38-574.png|width=453,height=316! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxiaolong updated HADOOP-17256: - Attachment: image-2020-09-10-17-33-50-505.png -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxiaolong updated HADOOP-17256: - Attachment: image-2020-09-10-17-52-32-290.png -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
hadoop-yetus commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690147488 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 28m 24s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 22s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 26m 14s | trunk passed |
| +1 :green_heart: | compile | 20m 45s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 51s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 52s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 51s | trunk passed |
| +1 :green_heart: | shadedclient | 21m 53s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 35s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 59s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 3m 13s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 5m 31s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 5s | the patch passed |
| +1 :green_heart: | compile | 19m 36s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 19m 36s | the patch passed |
| +1 :green_heart: | compile | 18m 31s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 18m 31s | the patch passed |
| -0 :warning: | checkstyle | 2m 52s | root: The patch generated 1 new + 182 unchanged - 1 fixed = 183 total (was 183) |
| +1 :green_heart: | mvnsite | 2m 51s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 14m 16s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 29s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 6s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 5m 53s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 9m 48s | hadoop-common in the patch passed. |
| -1 :x: | unit | 106m 51s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 5s | The patch does not generate ASF License warnings. |
| | | | 322m 34s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.TestCrcCorruption |
| | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
| | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
| | hadoop.hdfs.TestViewDistributedFileSystem |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
| | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.TestMaintenanceState |
| | hadoop.hdfs.TestDFSStripedInputStream |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.hdfs.TestErasureCodingPolicyWithSnapshotWithRandomECPolicy |
| | hadoop.hdfs.TestBlocksScheduledCounter |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2185/22/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2185 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint |
| uname | Linux 16f6f77a6cb6 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[GitHub] [hadoop] bshashikant commented on pull request #2295: HDFS-15563. Incorrect getTrashRoot return value when a non-snapshottable dir prefix matches the path of a snapshottable dir
bshashikant commented on pull request #2295: URL: https://github.com/apache/hadoop/pull/2295#issuecomment-690161198 Thanks @smengcl for the contribution. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
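The bug class named in the HDFS-15563 title is worth illustrating: a plain String prefix check makes a non-snapshottable /snapdir2 look like it lives inside a snapshottable /snapdir. Below is a sketch of the component-aware check such logic needs; it is illustrative only, with hypothetical names, not the exact patch.

{code:java}
/** Sketch of a path-component-aware prefix check; not the actual patch. */
public final class PathPrefixSketch {
  /** True only if path equals dir or is a real descendant of dir (dir != "/"). */
  static boolean isInside(String path, String dir) {
    return path.equals(dir) || path.startsWith(dir + "/");
  }

  public static void main(String[] args) {
    System.out.println(isInside("/snapdir2/file", "/snapdir")); // false
    System.out.println(isInside("/snapdir/file", "/snapdir"));  // true
  }
}
{code}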
[jira] [Issue Comment Deleted] (HADOOP-9331) Hadoop crypto codec framework and crypto codec implementations
[ https://issues.apache.org/jira/browse/HADOOP-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nikhil panchal updated HADOOP-9331: --- Comment: was deleted (was: Hello, I have ORC data stored in HDFS. I have a use case to encrypt one of the columns present in the ORC data. Can anyone suggest the standard steps I need to follow, or which Hadoop component I can use?)

> Hadoop crypto codec framework and crypto codec implementations
> ----------------------------------------------------------------
>
> Key: HADOOP-9331
> URL: https://issues.apache.org/jira/browse/HADOOP-9331
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security
> Affects Versions: 3.0.0-alpha1
> Reporter: Haifeng Chen
> Priority: Major
> Attachments: Hadoop Crypto Design.pdf
>
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> For use cases that deal with sensitive data, we often need to encrypt data to be stored safely at rest. Hadoop common provides a codec framework for compression algorithms. We start here. However, because encryption algorithms require some additional configuration and methods for key management, we introduce a crypto codec framework that builds on the compression codec framework. It cleanly distinguishes crypto algorithms from compression algorithms, but shares common interfaces between them where possible, and also carries extended interfaces where necessary to satisfy those needs. We also introduce a generic Key type, and supporting utility methods and classes, as a necessary abstraction for dealing with both Java crypto keys and PGP keys.
> The task for this feature breaks into two parts:
> 1. The crypto codec framework based on the compression codec framework, which can be shared by all crypto codec implementations.
> 2. The codec implementations such as AES and others.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
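As a concrete flavor of the "codec implementations such as AES" the proposal mentions, here is a self-contained round trip using the standard javax.crypto API. It is only an illustration of AES encryption in Java, not the proposed framework's own interfaces or key abstraction.

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Sketch: AES/CTR encrypt/decrypt round trip with the JDK crypto API. */
public final class AesCtrSketch {
  public static void main(String[] args) throws Exception {
    KeyGenerator kg = KeyGenerator.getInstance("AES");
    kg.init(128);
    SecretKey key = kg.generateKey();

    byte[] iv = new byte[16];               // CTR needs a unique IV per key
    new SecureRandom().nextBytes(iv);

    Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
    enc.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
    byte[] ct = enc.doFinal("sensitive data".getBytes(StandardCharsets.UTF_8));

    Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
    dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
    System.out.println(new String(dec.doFinal(ct), StandardCharsets.UTF_8));
  }
}
{code}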
[jira] [Updated] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HADOOP-17253: -- Fix Version/s: 2.10.1 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > Upgrade zookeeper to 3.4.14 on branch-2.10 > -- > > Key: HADOOP-17253 > URL: https://issues.apache.org/jira/browse/HADOOP-17253 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Fix For: 2.10.1 > > Time Spent: 50m > Remaining Estimate: 0h > > Since versions of zookeeper and curator have different history between > branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on > branch-2.10. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on pull request #2296: HDFS-15568. namenode start failed to start when dfs.namenode.max.snapshot.limit set.
hadoop-yetus commented on pull request #2296: URL: https://github.com/apache/hadoop/pull/2296#issuecomment-690215954 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 40m 35s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 32m 7s | trunk passed |
| +1 :green_heart: | compile | 1m 16s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 1m 8s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 0m 47s | trunk passed |
| +1 :green_heart: | mvnsite | 1m 15s | trunk passed |
| +1 :green_heart: | shadedclient | 17m 51s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 50s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 19s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 3m 13s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 11s | trunk passed |
||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 8s | the patch passed |
| +1 :green_heart: | compile | 1m 11s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 1m 11s | the patch passed |
| +1 :green_heart: | compile | 1m 1s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 1m 1s | the patch passed |
| -0 :warning: | checkstyle | 0m 40s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 89 unchanged - 0 fixed = 92 total (was 89) |
| +1 :green_heart: | mvnsite | 1m 10s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 15m 27s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 46s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 16s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 3m 27s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 80m 10s | hadoop-hdfs in the patch passed. |
| -1 :x: | asflicense | 0m 35s | The patch generated 4 ASF License warnings. |
| | | | 209m 18s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
| | hadoop.fs.viewfs.TestViewFileSystemHdfs |
| | hadoop.fs.viewfs.TestViewFileSystemLinkMergeSlash |
| | hadoop.hdfs.TestSnapshotCommands |
| | hadoop.fs.viewfs.TestViewFsAtHdfsRoot |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2296 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 94a92cfd8c29 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / e5fe3262702 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/1/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/1/testReport/ |
| asflicense |
[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxiaolong updated HADOOP-17256: - Attachment: image-2020-09-10-19-48-38-574.png > DistCp -update option will be invalid when distcp files from hdfs to S3 > --- > > Key: HADOOP-17256 > URL: https://issues.apache.org/jira/browse/HADOOP-17256 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Reporter: liuxiaolong >Priority: Major > Attachments: image-2020-09-10-17-25-46-354.png, > image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, > image-2020-09-10-17-47-01-653.png, image-2020-09-10-17-52-32-290.png, > image-2020-09-10-19-48-38-574.png > > > We use distcp with the -update option to copy a dir from HDFS to S3. When we run > the distcp job once more, it overwrites the S3 dir directly rather than skipping the > unchanged files. > > Test Case: > Run the following cmd twice; the modify time of the S3 files is changed > every time. > hadoop distcp -update /test/ s3a://${s3_bucket}/test/ > > Check the code in CopyMapper.java and S3AFileSystem.java > (1) The first time, the distcp job creates the files in S3, but blockSize is > unused! > !image-2020-09-10-17-45-16-998.png|width=542,height=485! > > (2) The second time, the distcp job compares fileSize and blockSize > between the HDFS file and the S3 file. > !image-2020-09-10-17-47-01-653.png|width=524,height=248! > > (3) blockSize is unused; when getting the blockSize of an S3 file, it returns a default > value. > In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is > 32 * 1024 * 1024. > !image-2020-09-10-17-33-50-505.png|width=451,height=762! > > !image-2020-09-10-17-52-32-290.png|width=527,height=87! > > The HDFS blockSize seems meaningless in an object store like S3, so I think > there's no need to compare blockSize when running distcp with the -update option. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16918) Dependency update for Hadoop 2.10
[ https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193563#comment-17193563 ] Masatake Iwasaki commented on HADOOP-16918: --- Thanks, [~symat]. Hadoop 3.3.0 and above already moved to ZooKeeper 3.5 in HADOOP-16579. Since we are preparing 2.10.1, which is a patch release of the oldest live branch, compatibility is the concern. Based on the patch of HADOOP-16579, upgrading to 3.5 brings code changes and new transitive dependencies. I'm considering 3.4.14 as a safe candidate here. > Dependency update for Hadoop 2.10 > - > > Key: HADOOP-16918 > URL: https://issues.apache.org/jira/browse/HADOOP-16918 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Priority: Major > Labels: release-blocker > Attachments: dependency-check-report.html, > dependency-check-report.html > > > A number of dependencies can be updated. > nimbus-jose-jwt > jetty > netty > zookeeper > hbase-common > jackson-databind > and many more. They should be updated in the 2.10.1 release. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
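On the dependency side, the change under discussion amounts to a version pin. A sketch of what that looks like as a Maven dependency, illustrative only; the actual Hadoop poms manage this through a version property rather than a literal version:
{code:xml}
<!-- Illustrative pin of the last 3.4-line ZooKeeper release for branch-2.10. -->
<dependency>
  <groupId>org.apache.zookeeper</groupId>
  <artifactId>zookeeper</artifactId>
  <version>3.4.14</version>
</dependency>
{code}
Staying on the 3.4 line avoids the code and transitive-dependency changes a 3.5 upgrade would bring, at the cost of remaining on an EOL release line.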
[jira] [Commented] (HADOOP-13238) pid handling is failing on secure datanode
[ https://issues.apache.org/jira/browse/HADOOP-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193575#comment-17193575 ] zhuqi commented on HADOOP-13238: cc [~boky01] [~aw] May I take this one? I have recently been using Hadoop 3.2.1 to build our production clusters and hit this problem as well. I have now updated the latest patch; it also fixes another problem: when the service is not running but we call stop, cat fails because there is no pidfile. Thanks. > pid handling is failing on secure datanode > -- > > Key: HADOOP-13238 > URL: https://issues.apache.org/jira/browse/HADOOP-13238 > Project: Hadoop Common > Issue Type: Bug > Components: scripts, security >Reporter: Allen Wittenauer >Assignee: Andras Bokor >Priority: Major > Attachments: HADOOP-13238.01.patch, HADOOP-13238.02.patch > > > {code} > hdfs --daemon stop datanode > cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or > directory > WARNING: pid has changed for datanode, skip deleting pid file > cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or > directory > WARNING: daemon pid has changed for datanode, skip deleting daemon pid file > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
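The pidfile fix described above boils down to guarding the cat. A minimal sketch in shell, assuming a ${pidfile} variable like the ones the daemon scripts use; this is not the exact hadoop-functions.sh code:
{code:bash}
# Only read the pidfile if it actually exists, so that calling "stop" on a
# daemon that is not running does not emit "cat: ... No such file" errors.
if [[ -f "${pidfile}" ]]; then
  pid=$(cat "${pidfile}")
else
  pid=""   # nothing to stop; skip the pid checks quietly
fi
{code}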
[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxiaolong updated HADOOP-17256: - Attachment: image-2020-09-10-17-25-46-354.png > DistCp -update option will be invalid when distcp files from hdfs to S3 > --- > > Key: HADOOP-17256 > URL: https://issues.apache.org/jira/browse/HADOOP-17256 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Reporter: liuxiaolong >Priority: Major > Attachments: image-2020-09-10-17-25-46-354.png > > > We use distcp with the -update option to copy a dir from HDFS to S3. When we run > the distcp job once more, it overwrites the S3 dir directly rather than skipping the > unchanged files. > > Test Case: > 1. Run the distcp cmd twice; the modify time of the S3 files is modified: > hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/ > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16918) Dependency update for Hadoop 2.10
[ https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193492#comment-17193492 ] Mate Szalay-Beko commented on HADOOP-16918: --- Hello guys! I'm missing the context here a bit... still, FYI: * ZooKeeper 3.4 is EOL now; I think no more security updates / CVE fixes will be provided for the 3.4 line. The last 3.4 ZooKeeper release is 3.4.14, released in April 2019. * The 3.5 / 3.6 ZooKeeper lines are still active; we provide relatively frequent releases with bugfixes and security fixes. > Dependency update for Hadoop 2.10 > - > > Key: HADOOP-16918 > URL: https://issues.apache.org/jira/browse/HADOOP-16918 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Priority: Major > Labels: release-blocker > Attachments: dependency-check-report.html, > dependency-check-report.html > > > A number of dependencies can be updated. > nimbus-jose-jwt > jetty > netty > zookeeper > hbase-common > jackson-databind > and many more. They should be updated in the 2.10.1 release. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxiaolong updated HADOOP-17256: - Attachment: image-2020-09-10-17-47-01-653.png > DistCp -update option will be invalid when distcp files from hdfs to S3 > --- > > Key: HADOOP-17256 > URL: https://issues.apache.org/jira/browse/HADOOP-17256 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Reporter: liuxiaolong >Priority: Major > Attachments: image-2020-09-10-17-25-46-354.png, > image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, > image-2020-09-10-17-47-01-653.png > > > We use distcp with the -update option to copy a dir from HDFS to S3. When we run > the distcp job once more, it overwrites the S3 dir directly rather than skipping the > unchanged files. > > Test Case: > 1. Run the distcp cmd twice; the modify time of the S3 files is modified: > hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/ > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17256) DistCp -update option will be invalid when distcp files from hdfs to S3
[ https://issues.apache.org/jira/browse/HADOOP-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxiaolong updated HADOOP-17256: - Description: We use distcp with the -update option to copy a dir from HDFS to S3. When we run the distcp job once more, it overwrites the S3 dir directly rather than skipping the unchanged files. Test Case: Run the following cmd twice; the modify time of the S3 files is changed every time. hadoop distcp -update /test/ s3a://${s3_bucket}/test/ Check the code in CopyMapper.java and S3AFileSystem.java: (1) The first time, the distcp job creates the files in S3, but blockSize is unused! !image-2020-09-10-17-45-16-998.png|width=542,height=485! (2) The second time, the distcp job compares fileSize and blockSize between the HDFS file and the S3 file. !image-2020-09-10-17-47-01-653.png|width=524,height=248! (3) blockSize is unused; when getting the blockSize of an S3 file, it returns a default value. In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is 32 * 1024 * 1024. !image-2020-09-10-17-33-50-505.png|width=451,height=762! !image-2020-09-10-17-52-32-290.png|width=527,height=87! The HDFS blockSize seems meaningless in an object store like S3, so I think there's no need to compare blockSize when running distcp with the -update option. was: We use distcp with the -update option to copy a dir from HDFS to S3. When we run the distcp job once more, it overwrites the S3 dir directly rather than skipping the unchanged files. Test Case: 1. Run the distcp cmd twice; the modify time of the S3 files is modified: hadoop distcp -update /testA/ s3a://tiered-storage-bigdata-1251625956/testA/ > DistCp -update option will be invalid when distcp files from hdfs to S3 > --- > > Key: HADOOP-17256 > URL: https://issues.apache.org/jira/browse/HADOOP-17256 > Project: Hadoop Common > Issue Type: Bug > Components: tools/distcp >Reporter: liuxiaolong >Priority: Major > Attachments: image-2020-09-10-17-25-46-354.png, > image-2020-09-10-17-33-50-505.png, image-2020-09-10-17-45-16-998.png, > image-2020-09-10-17-47-01-653.png, image-2020-09-10-17-52-32-290.png, > image-2020-09-10-19-48-38-574.png > > > We use distcp with the -update option to copy a dir from HDFS to S3. When we run > the distcp job once more, it overwrites the S3 dir directly rather than skipping the > unchanged files. > > Test Case: > Run the following cmd twice; the modify time of the S3 files is changed > every time. > hadoop distcp -update /test/ s3a://${s3_bucket}/test/ > > Check the code in CopyMapper.java and S3AFileSystem.java > (1) The first time, the distcp job creates the files in S3, but blockSize is > unused! > (2) The second time, the distcp job compares fileSize and blockSize > between the HDFS file and the S3 file. > (3) blockSize is unused; when getting the blockSize of an S3 file, it returns a default > value. > In S3AFileSystem.java, we find that the default value of fs.s3a.block.size is > 32 * 1024 * 1024. > > The HDFS blockSize seems meaningless in an object store like S3, so I think > there's no need to compare blockSize when running distcp with the -update option. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
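The failure mode the description walks through can be summarized in a small sketch. The method below is an illustration of the skip decision, not the exact CopyMapper code:
{code:java}
import org.apache.hadoop.fs.FileStatus;

// Illustrative only: a simplified version of the skip decision the reporter
// is describing. Because S3AFileSystem reports the client-side
// fs.s3a.block.size default (32 MB) rather than a real block size, the
// blockSize comparison almost never matches the HDFS source, so -update
// recopies files even when their lengths are identical.
class SkipCheckSketch {
  static boolean canSkip(FileStatus source, FileStatus target) {
    boolean sameLength = source.getLen() == target.getLen();
    boolean sameBlockSize = source.getBlockSize() == target.getBlockSize();
    return sameLength && sameBlockSize;
  }
}
{code}
Under this reading, the reporter's proposal is simply to drop the sameBlockSize term when the target filesystem is an object store.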
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=481364=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481364 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 10/Sep/20 10:43 Start Date: 10/Sep/20 10:43 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-690147488 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 28m 24s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 22s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 26m 14s | trunk passed |
| +1 :green_heart: | compile | 20m 45s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 51s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 52s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 51s | trunk passed |
| +1 :green_heart: | shadedclient | 21m 53s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 35s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 59s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 3m 13s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 5m 31s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 5s | the patch passed |
| +1 :green_heart: | compile | 19m 36s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 19m 36s | the patch passed |
| +1 :green_heart: | compile | 18m 31s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 18m 31s | the patch passed |
| -0 :warning: | checkstyle | 2m 52s | root: The patch generated 1 new + 182 unchanged - 1 fixed = 183 total (was 183) |
| +1 :green_heart: | mvnsite | 2m 51s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 14m 16s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 29s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 6s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 5m 53s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 9m 48s | hadoop-common in the patch passed. |
| -1 :x: | unit | 106m 51s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 5s | The patch does not generate ASF License warnings. |
| | | | 322m 34s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.TestCrcCorruption |
| | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
| | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
| | hadoop.hdfs.TestViewDistributedFileSystem |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
| | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.TestMaintenanceState |
| | hadoop.hdfs.TestDFSStripedInputStream |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.hdfs.TestErasureCodingPolicyWithSnapshotWithRandomECPolicy |
| | hadoop.hdfs.TestBlocksScheduledCounter |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2185/22/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2185 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint |
| uname | Linux 16f6f77a6cb6 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HADOOP-9331) Hadoop crypto codec framework and crypto codec implementations
[ https://issues.apache.org/jira/browse/HADOOP-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193549#comment-17193549 ] Steve Loughran commented on HADOOP-9331: bq. I have ORC data stored in HDFS. I have one use case, encrypt one of the column present in ORC data. Can anyone suggest standard steps i need to follow or what hadoop component i can use. Something to take up with the ORC team. JIRAs aren't the place for queries like that. Thanks > Hadoop crypto codec framework and crypto codec implementations > -- > > Key: HADOOP-9331 > URL: https://issues.apache.org/jira/browse/HADOOP-9331 > Project: Hadoop Common > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0-alpha1 >Reporter: Haifeng Chen >Priority: Major > Attachments: Hadoop Crypto Design.pdf > > Original Estimate: 504h > Remaining Estimate: 504h > > For use cases that deal with sensitive data, we often need to encrypt data > to be stored safely at rest. Hadoop common provides a codec framework for > compression algorithms. We start here. However, because encryption algorithms > require some additional configuration and methods for key management, we > introduce a crypto codec framework that builds on the compression codec > framework. It cleanly distinguishes crypto algorithms from compression > algorithms, but shares common interfaces between them where possible, and > also carries extended interfaces where necessary to satisfy those needs. We > also introduce a generic Key type, and supporting utility methods and > classes, as a necessary abstraction for dealing with both Java crypto keys > and PGP keys. > The task for this feature breaks into two parts: > 1. The crypto codec framework, which is based on the compression codec framework and can be > shared by all crypto codec implementations. > 2. The codec implementations such as AES and others. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-16872) Performance improvement when distcp files in large dir with -direct option
[ https://issues.apache.org/jira/browse/HADOOP-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-16872: Affects Version/s: 3.3.0 3.2.1 > Performance improvement when distcp files in large dir with -direct option > -- > > Key: HADOOP-16872 > URL: https://issues.apache.org/jira/browse/HADOOP-16872 > Project: Hadoop Common > Issue Type: Improvement > Components: tools/distcp >Affects Versions: 3.3.0, 3.2.1 >Reporter: liuxiaolong >Priority: Major > Attachments: HADOOP-16872.001.patch, optimise after.png, optimise > before.png > > > We use distcp with the -direct option to copy a file between two large > directories. We found it cost a few minutes. If we launch too many distcp > jobs at the same time, NameNode performance degrades seriously. > hadoop distcp -direct -skipcrccheck -update -prbugaxt -i -numListstatusThreads 1 > hdfs://cluster1:8020/source/100.log hdfs://cluster2:8020/target/100.jpg > || ||Dir path||Count|| > ||Source dir|| hdfs://cluster1:8020/source/ ||100k+ files|| > ||Target dir||hdfs://cluster2:8020/target/ ||100k+ files|| > > Checking the code in CopyCommitter.java, we find that the function > deleteAttemptTempFiles() calls targetFS.globStatus(new > Path(targetWorkPath, ".distcp.tmp." + jobId.replaceAll("job","attempt") + > "*")); > This wastes a lot of time when running distcp between two large dirs. When we use > distcp with the -direct option, it writes directly to the target file without > generating a '.distcp.tmp' temp file. So I think this code needs a > check before calling deleteAttemptTempFiles: if distcp runs with the -direct > option, do nothing and return directly. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-16872) Performance improvement when distcp files in large dir with -direct option
[ https://issues.apache.org/jira/browse/HADOOP-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-16872: Component/s: tools/distcp > Performance improvement when distcp files in large dir with -direct option > -- > > Key: HADOOP-16872 > URL: https://issues.apache.org/jira/browse/HADOOP-16872 > Project: Hadoop Common > Issue Type: Improvement > Components: tools/distcp >Reporter: liuxiaolong >Priority: Major > Attachments: HADOOP-16872.001.patch, optimise after.png, optimise > before.png > > > We use distcp with the -direct option to copy a file between two large > directories. We found it cost a few minutes. If we launch too many distcp > jobs at the same time, NameNode performance degrades seriously. > hadoop distcp -direct -skipcrccheck -update -prbugaxt -i -numListstatusThreads 1 > hdfs://cluster1:8020/source/100.log hdfs://cluster2:8020/target/100.jpg > || ||Dir path||Count|| > ||Source dir|| hdfs://cluster1:8020/source/ ||100k+ files|| > ||Target dir||hdfs://cluster2:8020/target/ ||100k+ files|| > > Checking the code in CopyCommitter.java, we find that the function > deleteAttemptTempFiles() calls targetFS.globStatus(new > Path(targetWorkPath, ".distcp.tmp." + jobId.replaceAll("job","attempt") + > "*")); > This wastes a lot of time when running distcp between two large dirs. When we use > distcp with the -direct option, it writes directly to the target file without > generating a '.distcp.tmp' temp file. So I think this code needs a > check before calling deleteAttemptTempFiles: if distcp runs with the -direct > option, do nothing and return directly. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16872) Performance improvement when distcp files in large dir with -direct option
[ https://issues.apache.org/jira/browse/HADOOP-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193548#comment-17193548 ] Steve Loughran commented on HADOOP-16872: - Missed this. Could you submit it as a GitHub PR, as that is where we review patches? Thanks. > Performance improvement when distcp files in large dir with -direct option > -- > > Key: HADOOP-16872 > URL: https://issues.apache.org/jira/browse/HADOOP-16872 > Project: Hadoop Common > Issue Type: Improvement >Reporter: liuxiaolong >Priority: Major > Attachments: HADOOP-16872.001.patch, optimise after.png, optimise > before.png > > > We use distcp with the -direct option to copy a file between two large > directories. We found it cost a few minutes. If we launch too many distcp > jobs at the same time, NameNode performance degrades seriously. > hadoop distcp -direct -skipcrccheck -update -prbugaxt -i -numListstatusThreads 1 > hdfs://cluster1:8020/source/100.log hdfs://cluster2:8020/target/100.jpg > || ||Dir path||Count|| > ||Source dir|| hdfs://cluster1:8020/source/ ||100k+ files|| > ||Target dir||hdfs://cluster2:8020/target/ ||100k+ files|| > > Checking the code in CopyCommitter.java, we find that the function > deleteAttemptTempFiles() calls targetFS.globStatus(new > Path(targetWorkPath, ".distcp.tmp." + jobId.replaceAll("job","attempt") + > "*")); > This wastes a lot of time when running distcp between two large dirs. When we use > distcp with the -direct option, it writes directly to the target file without > generating a '.distcp.tmp' temp file. So I think this code needs a > check before calling deleteAttemptTempFiles: if distcp runs with the -direct > option, do nothing and return directly. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
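The proposed change is a cheap early exit. A sketch under the assumption of a simplified signature; the real CopyCommitter method takes different parameters, so treat this as an illustration of the idea rather than the patch itself:
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class AttemptTempFileCleaner {
  // With -direct, no ".distcp.tmp.*" files are ever written, so the
  // expensive globStatus() scan over a 100k+ entry directory can be skipped.
  void deleteAttemptTempFiles(FileSystem targetFS, Path targetWorkPath,
      String jobId, boolean directWrite) throws IOException {
    if (directWrite) {
      return;  // nothing to clean up
    }
    FileStatus[] tempFiles = targetFS.globStatus(new Path(targetWorkPath,
        ".distcp.tmp." + jobId.replaceAll("job", "attempt") + "*"));
    if (tempFiles != null) {
      for (FileStatus tempFile : tempFiles) {
        targetFS.delete(tempFile.getPath(), false);
      }
    }
  }
}
{code}
The win comes from avoiding the glob over a directory with 100k+ entries, which is the NameNode-heavy operation the reporter measured.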