[jira] [Commented] (HDFS-10434) Fix intermittent test failure of TestDataNodeErasureCodingMetrics
[ https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299509#comment-15299509 ]

Kai Zheng commented on HDFS-10434:
----------------------------------

Thanks Rakesh for tracking this. The fix LGTM and +1. One minor point: the *counter* variable could be renamed to avoid confusion with metric counters.
{code}
+int counter = 20;
{code}
I will commit this shortly, addressing that minor point.

> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -----------------------------------------------------------------
>
> Key: HDFS-10434
> URL: https://issues.apache.org/jira/browse/HDFS-10434
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Rakesh R
> Assignee: Rakesh R
> Attachments: HDFS-10434-00.patch
>
>
> This jira is to fix the test case failure.
> Reference : [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
>   at org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
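The patch itself is not quoted in the message above, so the following is only a sketch of the general pattern the review comment points at: a test that polls an asynchronously updated metric with a bounded retry budget instead of asserting on it immediately. The class `MetricPoller`, the method `waitForValue`, the variable name `retriesLeft` and the sleep interval are all illustrative assumptions and are not taken from the committed change; only the budget of 20 echoes the quoted line.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of Hadoop: polls a metric until it reaches
// the expected value or the retry budget is used up.
class MetricPoller {
  static boolean waitForValue(LongSupplier metric, long expected,
      int maxRetries, long sleepMillis) {
    // Named retriesLeft rather than "counter" so that it cannot be
    // mistaken for a metrics counter.
    int retriesLeft = maxRetries;
    while (retriesLeft > 0) {
      if (metric.getAsLong() == expected) {
        return true;
      }
      retriesLeft--;
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    // One last check after the final sleep.
    return metric.getAsLong() == expected;
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicLong ecReconstructionTasks = new AtomicLong(0);
    // Simulates a background task that updates the metric a little later.
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      ecReconstructionTasks.incrementAndGet();
    });
    worker.start();
    boolean reached = waitForValue(ecReconstructionTasks::get, 1, 20, 25);
    worker.join();
    System.out.println("metric reached expected value: " + reached);
  }
}
```

A test written this way fails only if the metric never reaches the expected value within the whole budget, which is one common way to remove timing-dependent failures.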
[jira] [Commented] (HDFS-8057) Move BlockReader implementation to the client implementation package
[ https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299471#comment-15299471 ]

Hadoop QA commented on HDFS-8057:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 16m 7s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 27 new or modified test files. |
| 0 | mvndep | 0m 16s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 33s | branch-2 passed |
| +1 | compile | 1m 14s | branch-2 passed with JDK v1.8.0_91 |
| +1 | compile | 1m 20s | branch-2 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 0m 42s | branch-2 passed |
| +1 | mvnsite | 1m 28s | branch-2 passed |
| +1 | mvneclipse | 0m 27s | branch-2 passed |
| -1 | findbugs | 1m 48s | hadoop-hdfs-project/hadoop-hdfs-client in branch-2 has 7 extant Findbugs warnings. |
| -1 | findbugs | 2m 5s | hadoop-hdfs-project/hadoop-hdfs in branch-2 has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 31s | branch-2 passed with JDK v1.8.0_91 |
| +1 | javadoc | 2m 20s | branch-2 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 15s | the patch passed |
| +1 | compile | 1m 9s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 1m 9s | the patch passed |
| +1 | compile | 1m 16s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 1m 16s | the patch passed |
| -1 | checkstyle | 0m 40s | hadoop-hdfs-project: patch generated 107 new + 697 unchanged - 110 fixed = 804 total (was 807) |
| +1 | mvnsite | 1m 27s | the patch passed |
| +1 | mvneclipse | 0m 22s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 49 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 4m 4s | the patch passed |
| +1 | javadoc | 1m 28s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 2m 15s | the patch passed with JDK v1.7.0_101 |
| +1 | unit | 0m 56s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_91. |
| -1 | unit | 66m 43s | hadoop-hdfs in the patch failed with JDK v1.8.0_91. |
| +1 | unit | 1m 3s | hadoop-hdfs-client in the patch passed
[jira] [Commented] (HDFS-10433) Make retry also works well for Async DFS
[ https://issues.apache.org/jira/browse/HDFS-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299468#comment-15299468 ]

Hadoop QA commented on HDFS-10433:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 45s | trunk passed |
| +1 | compile | 7m 13s | trunk passed |
| +1 | checkstyle | 1m 37s | trunk passed |
| +1 | mvnsite | 2m 42s | trunk passed |
| +1 | mvneclipse | 0m 41s | trunk passed |
| +1 | findbugs | 4m 50s | trunk passed |
| +1 | javadoc | 2m 32s | trunk passed |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 16s | the patch passed |
| +1 | compile | 7m 26s | the patch passed |
| -1 | javac | 9m 54s | root generated 1 new + 695 unchanged - 1 fixed = 696 total (was 696) |
| +1 | javac | 7m 26s | the patch passed |
| -1 | checkstyle | 1m 34s | root: patch generated 42 new + 211 unchanged - 5 fixed = 253 total (was 216) |
| +1 | mvnsite | 2m 50s | the patch passed |
| +1 | mvneclipse | 0m 43s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | findbugs | 1m 44s | hadoop-common-project/hadoop-common generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | javadoc | 2m 40s | the patch passed |
| +1 | unit | 9m 9s | hadoop-common in the patch passed. |
| +1 | unit | 0m 57s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 92m 6s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 35s | Patch does not generate ASF License warnings. |
| | | 153m 59s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-common-project/hadoop-common |
| | Inconsistent synchronization of org.apache.hadoop.io.retry.RetryInvocationHandler$CallReturn.returnValue; locked 66% of time Unsynchronized access at RetryInvocationHandler.java:[line 219] |
| | Inconsistent synchronization of org.apache.hadoop.io.retry.RetryInvocationHandler$CallReturn.state; locked 83% of time Unsynchronized access at RetryInvocationHandler.java:[line 219] |
| | Inconsistent synchronization of org.apache.hadoop.io.retry.RetryInvocationHandler$CallReturn.thrown; locked 80% of time Unsynchronized access at
[jira] [Created] (HDFS-10458) getFileEncryptionInfo should return quickly for non-encrypted cluster
Zhe Zhang created HDFS-10458:
-----------------------------

Summary: getFileEncryptionInfo should return quickly for non-encrypted cluster
Key: HDFS-10458
URL: https://issues.apache.org/jira/browse/HDFS-10458
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Zhe Zhang
Assignee: Zhe Zhang

{{FSDirectory#getFileEncryptionInfo}} always acquires {{readLock}} and checks whether the path belongs to an EZ. On a busy system with potentially many listing operations, this could cause lock contention.

I think we should add a call, {{EncryptionZoneManager#hasEncryptionZone()}}, that returns whether the system has any EZ. If there is no EZ at all, {{getFileEncryptionInfo}} should return null without taking {{readLock}}.

If {{hasEncryptionZone}} is only used in the above scenario, it may not need a {{readLock}} itself: if the system doesn't have any EZ when {{getFileEncryptionInfo}} is called on a path, the path cannot be encrypted.
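The proposed fast path can be sketched roughly as follows. This is a simplified stand-in rather than the real `FSDirectory` or `EncryptionZoneManager`: the class `EncryptionZoneFastPath`, its zone map, the `readLockAcquisitions` counter and the use of a key name as the return value are assumptions made purely for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified, hypothetical model of the idea; the real Hadoop classes
// are considerably more involved.
class EncryptionZoneFastPath {
  // Zone root path -> key name. A concurrent map lets hasEncryptionZone()
  // be called safely without holding the directory lock.
  private final Map<String, String> zones = new ConcurrentSkipListMap<>();
  private final ReentrantReadWriteLock dirLock = new ReentrantReadWriteLock();
  // Instrumentation for the demo only: counts how often the lock is taken.
  final AtomicInteger readLockAcquisitions = new AtomicInteger();

  void addZone(String zoneRoot, String keyName) {
    dirLock.writeLock().lock();
    try {
      zones.put(zoneRoot, keyName);
    } finally {
      dirLock.writeLock().unlock();
    }
  }

  boolean hasEncryptionZone() {
    return !zones.isEmpty();
  }

  // Returns the key name of the enclosing zone, or null if the path is
  // not inside any zone.
  String getFileEncryptionInfo(String path) {
    if (!hasEncryptionZone()) {
      return null; // fast path: no zones at all, so no lock is needed
    }
    dirLock.readLock().lock();
    readLockAcquisitions.incrementAndGet();
    try {
      for (Map.Entry<String, String> zone : zones.entrySet()) {
        String root = zone.getKey();
        if (root.equals("/") || path.equals(root)
            || path.startsWith(root + "/")) {
          return zone.getValue();
        }
      }
      return null;
    } finally {
      dirLock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    EncryptionZoneFastPath dir = new EncryptionZoneFastPath();
    System.out.println("no zones: " + dir.getFileEncryptionInfo("/data/f"));
    dir.addZone("/secure", "key1");
    System.out.println("in zone: " + dir.getFileEncryptionInfo("/secure/f"));
    System.out.println("lock acquisitions: " + dir.readLockAcquisitions.get());
  }
}
```

On a cluster without any encryption zone, every lookup in this model returns before touching the lock, which is the contention saving the report is after.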
[jira] [Updated] (HDFS-10457) DataNode should not auto-format block pool directory if VERSION is missing
[ https://issues.apache.org/jira/browse/HDFS-10457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-10457:
-----------------------------------
    Attachment: HDFS-10457.001.patch

v01: a one-liner fix.

> DataNode should not auto-format block pool directory if VERSION is missing
> --------------------------------------------------------------------------
>
> Key: HDFS-10457
> URL: https://issues.apache.org/jira/browse/HDFS-10457
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10457.001.patch
>
>
> HDFS-10360 prevents the DN from auto-formatting a volume directory if its current/VERSION is missing. However, if the current/VERSION in a block pool directory is missing instead, the DN still auto-formats that directory.
> Filing this jira to fix the bug.
[jira] [Updated] (HDFS-10457) DataNode should not auto-format block pool directory if VERSION is missing
[ https://issues.apache.org/jira/browse/HDFS-10457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-10457:
-----------------------------------
    Attachment: (was: HDFS-10457.001.patch)
[jira] [Updated] (HDFS-10457) DataNode should not auto-format block pool directory if VERSION is missing
[ https://issues.apache.org/jira/browse/HDFS-10457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-10457:
-----------------------------------
    Attachment: HDFS-10457.001.patch
[jira] [Created] (HDFS-10457) DataNode should not auto-format block pool directory if VERSION is missing
Wei-Chiu Chuang created HDFS-10457:
-----------------------------------

Summary: DataNode should not auto-format block pool directory if VERSION is missing
Key: HDFS-10457
URL: https://issues.apache.org/jira/browse/HDFS-10457
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang

HDFS-10360 prevents the DN from auto-formatting a volume directory if its current/VERSION is missing. However, if the current/VERSION in a block pool directory is missing instead, the DN still auto-formats that directory.
Filing this jira to fix the bug.
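The one-line patch is not included in these messages, so the following only illustrates the kind of check the report asks for, under stated assumptions: a directory that is absent or empty may be formatted, a directory with current/VERSION is loaded, and a non-empty directory without current/VERSION is refused rather than formatted. The class `BlockPoolDirCheck`, the `Action` enum and this exact decision table are hypothetical and do not reproduce the DataNode storage code.

```java
import java.io.File;

// Hypothetical sketch of the decision, not the actual DataNode logic.
class BlockPoolDirCheck {
  enum Action { FORMAT, LOAD, REFUSE }

  static Action analyze(File bpDir) {
    File version = new File(new File(bpDir, "current"), "VERSION");
    if (version.isFile()) {
      return Action.LOAD; // a normal, already formatted directory
    }
    // list() returns null when bpDir is not a readable directory.
    String[] children = bpDir.list();
    boolean missing = !bpDir.exists();
    boolean empty = children != null && children.length == 0;
    if (missing || empty) {
      return Action.FORMAT; // nothing to lose, safe to format
    }
    // Contents exist but VERSION is gone: formatting could discard
    // block data, so refuse and leave the decision to the operator.
    return Action.REFUSE;
  }

  public static void main(String[] args) {
    File bpDir = new File(System.getProperty("java.io.tmpdir"),
        "bp-demo-" + System.nanoTime());
    System.out.println("missing directory: " + analyze(bpDir));
    new File(bpDir, "current").mkdirs();
    System.out.println("current/ without VERSION: " + analyze(bpDir));
  }
}
```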
[jira] [Commented] (HDFS-10440) Improve DataNode web UI
[ https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299423#comment-15299423 ]

Weiwei Yang commented on HDFS-10440:
------------------------------------

[~vinayrpet] Sure, thanks. I just attached a new mockup [^dn_web_ui_mockup.jpg]. Please take a look and let me know whether it looks OK.

> Improve DataNode web UI
> -----------------------
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.5.0, 2.6.0, 2.7.0
> Reporter: Weiwei Yang
> Attachments: dn_UI_logs.jpg, dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't show much information beyond the node name and port. I propose adding more information, similar to the namenode UI, including:
> * Static info (version, block pool and cluster ID)
> * Running state (active, decommissioning, decommissioned or lost, etc.)
> * Summary (blocks, capacity, storage, etc.)
> * Utilities (logs)
[jira] [Updated] (HDFS-10440) Improve DataNode web UI
[ https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated HDFS-10440:
-------------------------------
    Attachment: (was: datanode_UI_mockup.jpg)
[jira] [Updated] (HDFS-10440) Improve DataNode web UI
[ https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated HDFS-10440:
-------------------------------
    Attachment: dn_web_ui_mockup.jpg
[jira] [Commented] (HDFS-10456) Print out the file path when the writes on snapshots are denied
[ https://issues.apache.org/jira/browse/HDFS-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299365#comment-15299365 ]

Hadoop QA commented on HDFS-10456:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 6m 40s | trunk passed |
| +1 | compile | 0m 42s | trunk passed |
| +1 | checkstyle | 0m 26s | trunk passed |
| +1 | mvnsite | 0m 50s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 1m 38s | trunk passed |
| +1 | javadoc | 1m 3s | trunk passed |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 41s | the patch passed |
| +1 | javac | 0m 41s | the patch passed |
| +1 | checkstyle | 0m 23s | the patch passed |
| +1 | mvnsite | 0m 47s | the patch passed |
| +1 | mvneclipse | 0m 9s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 44s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
| +1 | unit | 59m 37s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
| | | 78m 25s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12806041/HDFS-10456.000.patch |
| JIRA Issue | HDFS-10456 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 3427e1cbe321 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 09b866f |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15553/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15553/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |

This message was automatically generated.

> Print out the file path when the writes on snapshots are denied
> ---------------------------------------------------------------
>
> Key: HDFS-10456
> URL: https://issues.apache.org/jira/browse/HDFS-10456
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.2
> Reporter: Tianyin Xu
> Attachments: HDFS-10456.000.patch
>
>
> In {{getINodesInPath4Write}}, the writes on snapshots are denied by throwing
>
[jira] [Commented] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299336#comment-15299336 ]

Kai Zheng commented on HDFS-7240:
---------------------------------

Looking forward, things should get better given the prototype implementation, the upcoming updated design doc and, just as important, the fact that this active discussion keeps the work on the right track. IMHO, it may help if you could meet and discuss this together, as the HDFS erasure coding effort did, since this is another significant architecture change to the project. I hope the overall direction and the design doc can be settled soon, and I will also try to catch up.

> Object store in HDFS
> --------------------
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jitendra Nath Pandey
> Assignee: Jitendra Nath Pandey
> Attachments: Ozone-architecture-v1.pdf, ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS.
> As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer, i.e. datanodes.
> In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata.
> I will soon update with a detailed design document.
[jira] [Updated] (HDFS-10433) Make retry also works well for Async DFS
[ https://issues.apache.org/jira/browse/HDFS-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo Nicholas Sze updated HDFS-10433:
---------------------------------------
    Status: Patch Available (was: Open)

> Make retry also works well for Async DFS
> ----------------------------------------
>
> Key: HDFS-10433
> URL: https://issues.apache.org/jira/browse/HDFS-10433
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Xiaobing Zhou
> Assignee: Tsz Wo Nicholas Sze
> Attachments: h10433_20160524.patch
>
>
> In the current Async DFS implementation, file system calls are invoked and return a Future immediately to clients. Clients call Future#get to retrieve the final results. Future#get internally invokes a chain of callbacks residing in ClientNamenodeProtocolTranslatorPB, ProtobufRpcEngine and ipc.Client. The callback path bypasses the original retry layer/logic designed for synchronous DFS. This jira proposes refactoring to make retry also work for Async DFS.
[jira] [Updated] (HDFS-10433) Make retry also works well for Async DFS
[ https://issues.apache.org/jira/browse/HDFS-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo Nicholas Sze updated HDFS-10433:
---------------------------------------
    Attachment: h10433_20160524.patch

h10433_20160524.patch: adds an async call queue to RetryInvocationHandler to support retry and failover.
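To make the problem concrete, here is a minimal sketch of retrying a call whose result arrives through a future, so that the retry decision is made inside the completion callback rather than around a blocking call. It uses the JDK's `CompletableFuture` and is not the patch's approach: the patch is described as adding an async call queue to RetryInvocationHandler, whereas the class `AsyncRetry`, the method `callWithRetry` and the fixed attempt limit below are illustrative assumptions with no failover or back-off.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch, unrelated to Hadoop's actual retry classes.
class AsyncRetry {
  // Invokes the asynchronous call and re-issues it from the completion
  // callback until it succeeds or maxAttempts is reached.
  static <T> CompletableFuture<T> callWithRetry(
      Supplier<CompletableFuture<T>> call, int maxAttempts) {
    CompletableFuture<T> result = new CompletableFuture<>();
    attempt(call, maxAttempts, result);
    return result;
  }

  private static <T> void attempt(Supplier<CompletableFuture<T>> call,
      int attemptsLeft, CompletableFuture<T> result) {
    call.get().whenComplete((value, error) -> {
      if (error == null) {
        result.complete(value);
      } else if (attemptsLeft > 1) {
        attempt(call, attemptsLeft - 1, result); // retry in the callback
      } else {
        result.completeExceptionally(error); // attempts exhausted
      }
    });
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Simulated remote call that fails twice and then succeeds.
    Supplier<CompletableFuture<String>> flaky = () -> {
      CompletableFuture<String> f = new CompletableFuture<>();
      if (calls.incrementAndGet() < 3) {
        f.completeExceptionally(new RuntimeException("transient failure"));
      } else {
        f.complete("ok");
      }
      return f;
    };
    String value = callWithRetry(flaky, 5).join();
    System.out.println(value + " after " + calls.get() + " attempts");
  }
}
```

The caller still receives a single future, which mirrors the constraint in the issue description that clients only ever see a Future and call Future#get on it.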
[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Xiaoqiao updated HDFS-10453:
---
Target Version/s: 2.7.1
Status: Patch Available (was: Open)

> ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.1
> Reporter: He Xiaoqiao
> Fix For: 2.7.1
> Attachments: HDFS-10453-branch-2.001.patch
>
> The ReplicationMonitor thread can get stuck for a long time and, with low probability, cause data loss. Consider the typical scenario:
> (1) create and close a file with the default replication factor (3);
> (2) increase the replication factor of the file (to 10);
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging to that file for replication.
> When the ReplicationMonitor hang is reproduced, the NameNode prints log messages such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 7 to reach 10 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 7 to reach 10 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 7 but only 0 storage types can be selected (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 7 to reach 10 (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets the blocks to replicate and releases the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and then clears the references in blocksmap, neededReplications, etc. The block's NumBytes is set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to chooseTargets for the same blocks, and no node is selected after traversing the whole cluster, because no node satisfies the goodness criteria (its remaining space would have to reach the required size, Long.MAX_VALUE).
> During stage (3) the ReplicationMonitor is stuck for a long time, especially in a large cluster. invalidateBlocks and neededReplications continue to grow and are not consumed. In the worst case, data is lost.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK blocks and removing them from neededReplications.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
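The mitigation proposed at the end of the description (skip target selection for blocks marked NO_ACK and drop them from the replication queue) can be sketched as below. This is only an illustration under stated assumptions, not the attached patch: `NoAckSkipSketch`, its nested `Block` class, and `selectSchedulable` are hypothetical names, and the queue is modeled as a plain list rather than the NameNode's real data structures. The only detail taken from the report is that NO_ACK equals Long.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class NoAckSkipSketch {
    // Sentinel value as given in the issue description for BlockCommand.NO_ACK.
    public static final long NO_ACK = Long.MAX_VALUE;

    // Hypothetical stand-in for a block; not the HDFS Block class.
    public static final class Block {
        public final long id;
        public long numBytes;

        public Block(long id, long numBytes) {
            this.id = id;
            this.numBytes = numBytes;
        }
    }

    /**
     * Removes blocks whose size was overwritten with NO_ACK (that is, blocks
     * deleted concurrently) from the queue and returns the blocks that are
     * still worth passing to target selection.
     */
    public static List<Block> selectSchedulable(List<Block> neededReplications) {
        List<Block> schedulable = new ArrayList<>();
        Iterator<Block> it = neededReplications.iterator();
        while (it.hasNext()) {
            Block b = it.next();
            if (b.numBytes == NO_ACK) {
                // No node can ever offer Long.MAX_VALUE bytes, so searching
                // the cluster for targets would only waste time.
                it.remove();
                continue;
            }
            schedulable.add(b);
        }
        return schedulable;
    }
}
```

The point of the design is that the check is a constant-time comparison per block, whereas an unsuccessful target search has to visit every node in the cluster.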
[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Xiaoqiao updated HDFS-10453:
---
Status: Open (was: Patch Available)

> ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.1
> Reporter: He Xiaoqiao
> Fix For: 2.7.1
> Attachments: HDFS-10453-branch-2.001.patch
[jira] [Commented] (HDFS-10449) TestRollingFileSystemSinkWithHdfs#testFailedClose() fails on branch-2
[ https://issues.apache.org/jira/browse/HDFS-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299322#comment-15299322 ] Takanobu Asanuma commented on HDFS-10449: - Thanks for the comment. Actually, I got the contributor role and I was able to assign to myself. I'd like to work on this task. > TestRollingFileSystemSinkWithHdfs#testFailedClose() fails on branch-2 > - > > Key: HDFS-10449 > URL: https://issues.apache.org/jira/browse/HDFS-10449 > Project: Hadoop HDFS > Issue Type: Bug > Components: test > Environment: jenkins >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma > > {noformat} > Running org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.263 sec <<< > FAILURE! - in > org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs > testFailedClose(org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs) > Time elapsed: 8.729 sec <<< FAILURE! > java.lang.AssertionError: No exception was generated while stopping sink even > though HDFS was unavailable > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs.testFailedClose(TestRollingFileSystemSinkWithHdfs.java:187) > {noformat} > This passes fine on trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-10449) TestRollingFileSystemSinkWithHdfs#testFailedClose() fails on branch-2
[ https://issues.apache.org/jira/browse/HDFS-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma reassigned HDFS-10449: --- Assignee: Takanobu Asanuma > TestRollingFileSystemSinkWithHdfs#testFailedClose() fails on branch-2 > - > > Key: HDFS-10449 > URL: https://issues.apache.org/jira/browse/HDFS-10449 > Project: Hadoop HDFS > Issue Type: Bug > Components: test > Environment: jenkins >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma > > {noformat} > Running org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.263 sec <<< > FAILURE! - in > org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs > testFailedClose(org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs) > Time elapsed: 8.729 sec <<< FAILURE! > java.lang.AssertionError: No exception was generated while stopping sink even > though HDFS was unavailable > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs.testFailedClose(TestRollingFileSystemSinkWithHdfs.java:187) > {noformat} > This passes fine on trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299319#comment-15299319 ] Arpit Agarwal commented on HDFS-7240: - bq. I actually would be delighted to commit my time and energy to Ozone development bq. I would love to collaborate with everyone on this project. Andrew, what has been your technical contribution over the last year to help move the project forward? Did you give any thought to how the architecture spec could be converted to a technically feasible design and did you at any time post your ideas on the Jira or approach the developers who were prototyping in the feature branch? > Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey > Attachments: Ozone-architecture-v1.pdf, ozone_user_v0.pdf > > > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10448) CacheManager#checkLimit always assumes a replication factor of 1
[ https://issues.apache.org/jira/browse/HDFS-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299316#comment-15299316 ]

Yiqun Lin commented on HDFS-10448:
--

Hi [~cmccabe], I am a little confused by what you said.
{quote}
I think it should change computeNeeded to take replication into account
{quote}
I think I have already taken replication into account in {{computeNeeded}}:
{code}
return new CacheDirectiveStats.Builder()
    .setBytesNeeded(requestedBytes * replication)
    .setFilesCached(requestedFiles)
    .build();
{code}
Because of this change, I had to update the code that calls {{computeNeeded}}, from
{code}
pool.getBytesNeeded() + (stats.getBytesNeeded() * replication) > pool.getLimit()
{code}
to
{code}
pool.getBytesNeeded() + stats.getBytesNeeded() > pool.getLimit()
{code}
I think this is needed. Correct me if I am wrong, thanks.

> CacheManager#checkLimit always assumes a replication factor of 1
> -
>
> Key: HDFS-10448
> URL: https://issues.apache.org/jira/browse/HDFS-10448
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching
> Affects Versions: 2.7.1
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Attachments: HDFS-10448.001.patch
>
> The logic in {{CacheManager#checkLimit}} is not correct. The method does three things.
> First, it computes the bytes needed for the specified path.
> {code}
> CacheDirectiveStats stats = computeNeeded(path, replication);
> {code}
> But the parameter {{replication}} is not used there, so bytesNeeded is the value for a single replica only.
> {code}
> return new CacheDirectiveStats.Builder()
>     .setBytesNeeded(requestedBytes)
>     .setFilesCached(requestedFiles)
>     .build();
> {code}
> Second, the result is multiplied by the replication factor before it is compared with the limit, because {{computeNeeded}} did not use replication.
> {code}
> pool.getBytesNeeded() + (stats.getBytesNeeded() * replication) > pool.getLimit()
> {code}
> Third, if the size exceeds the limit, an exception carrying a warning message is thrown. The message divides by replication, although {{stats.getBytesNeeded()}} is already the value for a single replica.
> {code}
> throw new InvalidRequestException("Caching path " + path + " of size "
>     + stats.getBytesNeeded() / replication + " bytes at replication "
>     + replication + " would exceed pool " + pool.getPoolName()
>     + "'s remaining capacity of "
>     + (pool.getLimit() - pool.getBytesNeeded()) + " bytes.");
> {code}
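The arithmetic discussed in this comment can be sketched in isolation. The class below is a hypothetical, self-contained illustration (it is not CacheManager, and `CacheLimitSketch`, `bytesNeeded`, `exceedsLimit` and `perReplicaSize` are invented names): replication is applied exactly once when the needed bytes are computed, the limit check then uses that total as-is, and dividing by the replication factor recovers the per-replica size for the error message.

```java
public class CacheLimitSketch {
    /** Total bytes needed across all replicas; replication is applied here, once. */
    public static long bytesNeeded(long requestedBytes, short replication) {
        return requestedBytes * replication;
    }

    /**
     * Limit check on the already-replicated total. Multiplying by the
     * replication factor again at this point would count it twice.
     */
    public static boolean exceedsLimit(long poolBytesNeeded, long statsBytesNeeded,
                                       long poolLimit) {
        return poolBytesNeeded + statsBytesNeeded > poolLimit;
    }

    /** Per-replica size for a message, derived from the replicated total. */
    public static long perReplicaSize(long statsBytesNeeded, short replication) {
        return statsBytesNeeded / replication;
    }
}
```

For example, a 100-byte path at replication 3 needs 300 bytes in total; a pool that already needs 500 bytes and has a limit of 700 would reject it, while the per-replica size reported back is still 100.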
[jira] [Updated] (HDFS-8057) Move BlockReader implementation to the client implementation package
[ https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-8057: --- Attachment: HDFS-8057.branch-2.003.patch Oh, I'm sorry to you, too. I updated the patch. > Move BlockReader implementation to the client implementation package > > > Key: HDFS-8057 > URL: https://issues.apache.org/jira/browse/HDFS-8057 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Takanobu Asanuma > Attachments: HDFS-8057.1.patch, HDFS-8057.2.patch, HDFS-8057.3.patch, > HDFS-8057.4.patch, HDFS-8057.branch-2.001.patch, > HDFS-8057.branch-2.002.patch, HDFS-8057.branch-2.003.patch, > HDFS-8057.branch-2.5.patch > > > BlockReaderLocal, RemoteBlockReader, etc should be moved to > org.apache.hadoop.hdfs.client.impl. We may as well rename RemoteBlockReader > to BlockReaderRemote. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299313#comment-15299313 ]

Konstantin Shvachko commented on HDFS-10301:

Hey Colin, let's decide on the way to move forward. I do not see a point in making this change in two steps.
* Your changes will essentially be completely removed by Vinitha's patch.
* I do not see her patch introducing incompatible changes. So it can and should be backported through to branch 2.6.

A thorough review is needed and will be quite helpful. I think the [004 patch|https://issues.apache.org/jira/secure/attachment/12805798/HDFS-10301.004.patch] covers
* the upgrade case, that is, it works consistently for block reports from both old (pre-patch) and new (patched) DataNodes
* the case when the entire block report is sent in a single RPC and
* the case when block reports are split into multiple RPCs
* the leases

So apart from the failed test I do not see any issues. It would be good if you could take a fresh look and see if any corner cases were missed.

> BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
>
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.1
> Reporter: Konstantin Shvachko
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
> When the NameNode is busy, a DataNode can time out sending a block report. It then sends the block report again. The NameNode, while processing these two reports at the same time, can interleave the processing of storages from different reports. This screws up the blockReportId field, which makes the NameNode think that some storages are zombies. Replicas from zombie storages are immediately removed, causing missing blocks.
[jira] [Commented] (HDFS-10341) Add a metric to expose the timeout number of pending replication blocks
[ https://issues.apache.org/jira/browse/HDFS-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299305#comment-15299305 ] Hadoop QA commented on HDFS-10341: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 54s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 42s {color} | {color:red} root: patch generated 1 new + 334 unchanged - 0 fixed = 335 total (was 334) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 41s {color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 72m 26s {color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 127m 56s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12806014/HDFS-10341.02.patch | | JIRA Issue | HDFS-10341 | | Optional Tests | asflicense mvnsite compile javac javadoc mvninstall unit findbugs checkstyle | | uname | Linux 2900f0a9aa38 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / edd716e | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15551/artifact/patchprocess/diff-checkstyle-root.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15551/testReport/ | | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15551/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > Add a metric to expose the timeout
[jira] [Updated] (HDFS-10456) Print out the file path when the writes on snapshots are denied
[ https://issues.apache.org/jira/browse/HDFS-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tianyin Xu updated HDFS-10456:
--
Status: Patch Available (was: Open)

> Print out the file path when the writes on snapshots are denied
> ---
>
> Key: HDFS-10456
> URL: https://issues.apache.org/jira/browse/HDFS-10456
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.2
> Reporter: Tianyin Xu
> Attachments: HDFS-10456.000.patch
>
> In {{getINodesInPath4Write}}, the writes on snapshots are denied by throwing {{SnapshotAccessControlException}}.
> Unlike other file operations that print out the file path upon failure, {{getINodesInPath4Write}} does not.
> The attached patch appends the directory path on the logged exception.
> {code:title=ViewFs.java|borderStyle=solid}
> if (inodesInPath.isSnapshot()) {
>   throw new SnapshotAccessControlException(
> -     "Modification on a read-only snapshot is disallowed");
> +     "Modification on a read-only snapshot is disallowed: "
> +     + inodesInPath.getPath());
> }
> {code}
[jira] [Updated] (HDFS-10456) Print out the file path when the writes on snapshots are denied
[ https://issues.apache.org/jira/browse/HDFS-10456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tianyin Xu updated HDFS-10456:
--
Attachment: HDFS-10456.000.patch

Patch against trunk

> Print out the file path when the writes on snapshots are denied
> ---
>
> Key: HDFS-10456
> URL: https://issues.apache.org/jira/browse/HDFS-10456
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.2
> Reporter: Tianyin Xu
> Attachments: HDFS-10456.000.patch
>
> In {{getINodesInPath4Write}}, the writes on snapshots are denied by throwing {{SnapshotAccessControlException}}.
> Unlike other file operations that print out the file path upon failure, {{getINodesInPath4Write}} does not.
> The attached patch appends the directory path on the logged exception.
> {code:title=ViewFs.java|borderStyle=solid}
> if (inodesInPath.isSnapshot()) {
>   throw new SnapshotAccessControlException(
> -     "Modification on a read-only snapshot is disallowed");
> +     "Modification on a read-only snapshot is disallowed: "
> +     + inodesInPath.getPath());
> }
> {code}
[jira] [Created] (HDFS-10456) Print out the file path when the writes on snapshots are denied
Tianyin Xu created HDFS-10456:
-

Summary: Print out the file path when the writes on snapshots are denied
Key: HDFS-10456
URL: https://issues.apache.org/jira/browse/HDFS-10456
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.7.2
Reporter: Tianyin Xu

In {{getINodesInPath4Write}}, the writes on snapshots are denied by throwing {{SnapshotAccessControlException}}.
Unlike other file operations that print out the file path upon failure, {{getINodesInPath4Write}} does not.
The attached patch appends the directory path on the logged exception.
{code:title=ViewFs.java|borderStyle=solid}
if (inodesInPath.isSnapshot()) {
  throw new SnapshotAccessControlException(
-     "Modification on a read-only snapshot is disallowed");
+     "Modification on a read-only snapshot is disallowed: "
+     + inodesInPath.getPath());
}
{code}
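The effect of the change described above can be tried out with a small stand-alone sketch. Everything here is a hypothetical stand-in rather than HDFS code: the nested exception class only mimics the name of the real one, `checkWritable` is an invented helper, and the sample path is made up. The message text is the one shown in the diff.

```java
public class SnapshotWriteCheckSketch {
    // Hypothetical stand-in for the HDFS exception of the same name.
    public static class SnapshotAccessControlException extends Exception {
        public SnapshotAccessControlException(String message) {
            super(message);
        }
    }

    /**
     * Rejects a write when the resolved path lies in a snapshot, and includes
     * the offending path in the message so it shows up in logs.
     */
    public static void checkWritable(boolean isSnapshot, String path)
            throws SnapshotAccessControlException {
        if (isSnapshot) {
            throw new SnapshotAccessControlException(
                "Modification on a read-only snapshot is disallowed: " + path);
        }
    }

    public static void main(String[] args) {
        try {
            checkWritable(true, "/data/.snapshot/s1/file");
        } catch (SnapshotAccessControlException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the path appended, a denied write can be traced to a specific file without enabling further logging.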
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299255#comment-15299255 ]

Vinitha Reddy Gankidi commented on HDFS-10301:
--

Thanks for your review [~cmccabe]. By legacy reports do you mean block reports from DNs before the concept of leases was introduced for block reports?
{code}
public synchronized boolean checkLease(DatanodeDescriptor dn,
    long monotonicNowMs, long id) {
  if (id == 0) {
    LOG.debug("Datanode {} is using BR lease id 0x0 to bypass " +
        "rate-limiting.", dn.getDatanodeUuid());
    return true;
  }
  NodeData node = nodes.get(dn.getDatanodeUuid());
  if (node == null) {
    LOG.info("BR lease 0x{} is not valid for unknown datanode {}",
        Long.toHexString(id), dn.getDatanodeUuid());
    return false;
  }
  if (node.leaseId == 0) {
    LOG.warn("BR lease 0x{} is not valid for DN {}, because the DN " +
        "is not in the pending set.",
        Long.toHexString(id), dn.getDatanodeUuid());
    return false;
  }
{code}
Isn't {{id}} equal to 0 for legacy block reports and when block reports are manually triggered? My understanding is that {{node.leaseId}} is set to zero only when the lease is removed. In my patch, the lease is removed by looking at the current RPC index in the block report context.
{code}
if (context != null) {
  if (context.getTotalRpcs() == context.getCurRpc() + 1) {
    long leaseId = this.getBlockReportLeaseManager().removeLease(node);
    BlockManagerFaultInjector.getInstance().removeBlockReportLease(node, leaseId);
  }
{code}
When storage reports are processed out of order, we may set {{node.leaseId=0}} before all DN storage reports are processed. Therefore, we log a message and continue to process the storage report even if {{node.leaseId=0}}. Please let me know if you see any issue with this approach.

During upgrades, we do not remove zombie storages. Once the upgrade is finalized, we go ahead and remove the zombie storages.
{code}
if (nn.getFSImage().isUpgradeFinalized() && noStaleStorages) {
  Set storageIDsInBlockReport = new HashSet<>();
  if (context.getTotalRpcs() == 1) {
    for (StorageBlockReport report : reports) {
      storageIDsInBlockReport.add(report.getStorage().getStorageID());
    }
    bm.removeZombieStorages(nodeReg, context, storageIDsInBlockReport);
  }
}
{code}
Can you please elaborate on what you meant by "In general, your solution doesn't fix the problem during upgrade"? What problems do you foresee?

I am currently investigating why the test {{TestAddOverReplicatedStripedBlocks}} failed.

> BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
>
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.1
> Reporter: Konstantin Shvachko
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
> When the NameNode is busy, a DataNode can time out sending a block report. It then sends the block report again. The NameNode, while processing these two reports at the same time, can interleave the processing of storages from different reports. This screws up the blockReportId field, which makes the NameNode think that some storages are zombies. Replicas from zombie storages are immediately removed, causing missing blocks.
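The ordering concern raised in this comment (the lease is removed when the RPC carrying the last index is seen, which may happen before the other RPCs are processed) can be illustrated with a small simulation. This is a hedged sketch, not the patch: `BlockReportLeaseSketch` and both of its methods are hypothetical, and only the condition `totalRpcs == curRpc + 1` is taken from the snippet quoted above.

```java
public class BlockReportLeaseSketch {
    /** Same condition as in the quoted snippet: is this the RPC with the last index? */
    public static boolean isLastRpc(int totalRpcs, int curRpc) {
        return totalRpcs == curRpc + 1;
    }

    /**
     * Walks the RPC indices in the order they are processed and returns how
     * many RPCs had been processed at the moment the lease would be removed,
     * or -1 if the RPC with the last index never shows up.
     */
    public static int rpcsProcessedWhenLeaseRemoved(int totalRpcs, int[] processingOrder) {
        int processed = 0;
        for (int curRpc : processingOrder) {
            processed++;
            if (isLastRpc(totalRpcs, curRpc)) {
                return processed;
            }
        }
        return -1;
    }
}
```

With three RPCs processed in order (0, 1, 2) the lease goes away after all three; processed as (2, 0, 1) it goes away after the first one, which is why the remaining storage reports still have to be accepted when {{node.leaseId}} is already zero.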
[jira] [Commented] (HDFS-10415) TestDistributedFileSystem#testDFSCloseOrdering() fails on branch-2
[ https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299248#comment-15299248 ] Andrew Wang commented on HDFS-10415: I think that's just an omission, though it's been 3 years. I'm okay with reconciling with trunk if it fixes the problem. > TestDistributedFileSystem#testDFSCloseOrdering() fails on branch-2 > -- > > Key: HDFS-10415 > URL: https://issues.apache.org/jira/browse/HDFS-10415 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.9.0 > Environment: jenkins >Reporter: Sangjin Lee >Assignee: Mingliang Liu > Attachments: HDFS-10415-branch-2.000.patch, > HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch > > > {noformat} > Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem > testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem) Time > elapsed: 0.045 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790) > at > org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417) > at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084) > at > org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187) > at > org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217) > {noformat} > This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other > combinations. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10455) Logging the username when deny the setOwner operation
[ https://issues.apache.org/jira/browse/HDFS-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299242#comment-15299242 ]
Hadoop QA commented on HDFS-10455:
--
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 2s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: patch generated 0 new + 12 unchanged - 1 fixed = 12 total (was 13) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 46s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 41s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestAsyncDFSRename |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12806012/HDFS-10455.000.patch |
| JIRA Issue | HDFS-10455 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 5c83e5fefcb1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / edd716e |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15550/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HDFS-Build/15550/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15550/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15550/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated. > Logging the username when deny the setOwner operation >
[jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
[ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299220#comment-15299220 ] Andrew Wang commented on HDFS-9924: --- Nicholas, I proposed two solutions above, neither of which you have commented on. Have you looked into Deferred and CompletableFuture? This is also why I asked for a review of other async APIs in the design doc. > [umbrella] Asynchronous HDFS Access > --- > > Key: HDFS-9924 > URL: https://issues.apache.org/jira/browse/HDFS-9924 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Reporter: Tsz Wo Nicholas Sze >Assignee: Xiaobing Zhou > Attachments: AsyncHdfs20160510.pdf > > > This is an umbrella JIRA for supporting Asynchronous HDFS Access. > Currently, all the API methods are blocking calls -- the caller is blocked > until the method returns. It is very slow if a client makes a large number > of independent calls in a single thread since each call has to wait until the > previous call is finished. It is inefficient if a client needs to create a > large number of threads to invoke the calls. > We propose adding a new API to support asynchronous calls, i.e. the caller is > not blocked. The methods in the new API immediately return a Java Future > object. The return value can be obtained by the usual Future.get() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299197#comment-15299197 ] Andrew Wang commented on HDFS-7240: --- [~arpitagarwal] the impedance mismatch here is illustrated in your most recent comment: bq. We will post a design doc soon and there is ample opportunity/need for community contributions as the implementation is still in an early stage. This is in line with how features are developed in Apache. The Apache community is supposed to be involved in the design too, not just the implementation. I thought we were doing this, since we had a nice design discussion when the architecture doc was released, and when we last spoke in late February this year, the design seemed unchanged from the design doc. Since then, it's clear that a lot of work has been done internally at Hortonworks, without community involvement. I consider changing how metadata is stored to be a very significant design change, as well as the addition of a new master service. If the design is still flexible and under discussion, great. What it feels like though is a completed design being dropped on us. It's hard for external contributors to interpret these design changes without the related context and discussions. If the design is viewed as completed and just needs implementation, it's also hard for us to make meaningful design changes. Again, I would love to collaborate with everyone on this project. HDFS scale is a topic at the forefront of my mind, and we would all benefit from working together on a single solution. But that requires opening it up so non-Hortonworkers can be deeply involved in the requirements and design, not just implementation. 
> Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey > Attachments: Ozone-architecture-v1.pdf, ozone_user_v0.pdf > > > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
[ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299174#comment-15299174 ] Tsz Wo Nicholas Sze commented on HDFS-9924: --- > ... . It's not fair to push the burden of supporting multiple APIs onto our > downstreams, ... We are not going to support multiple APIs. Once we have decided on the async API, the unstable API can be removed. That is the meaning of "unstable". The downstreams are intelligent people. They can decide whether they want to use the unstable API. It is even more unfair if we delay providing any async API to the downstreams. No? [~andrew.wang], is it your intention to slow down the async hdfs development? I hope not. > [umbrella] Asynchronous HDFS Access > --- > > Key: HDFS-9924 > URL: https://issues.apache.org/jira/browse/HDFS-9924 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Reporter: Tsz Wo Nicholas Sze >Assignee: Xiaobing Zhou > Attachments: AsyncHdfs20160510.pdf > > > This is an umbrella JIRA for supporting Asynchronous HDFS Access. > Currently, all the API methods are blocking calls -- the caller is blocked > until the method returns. It is very slow if a client makes a large number > of independent calls in a single thread since each call has to wait until the > previous call is finished. It is inefficient if a client needs to create a > large number of threads to invoke the calls. > We propose adding a new API to support asynchronous calls, i.e. the caller is > not blocked. The methods in the new API immediately return a Java Future > object. The return value can be obtained by the usual Future.get() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299165#comment-15299165 ] Karthik Kambatla commented on HDFS-9782: Oh, and thanks [~andrew.wang] and [~rkanter] for your reviews. > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Fix For: 2.9.0 > > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HDFS-9782: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved (was: Patch Available) Thanks for the contribution, [~templedf]. Just committed this to trunk and branch-2. > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Fix For: 2.9.0 > > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
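The two requirements in the issue description (a configurable interval, plus some play so that hosts do not all flush at once) can be sketched as below. This is a minimal illustration under stated assumptions, not the committed RollingFileSystemSink code: the class `RollSchedule`, its method names, and the millisecond parameters are all hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper; names and parameters are assumptions, not Hadoop APIs. */
final class RollSchedule {
  private final long intervalMillis;   // configurable roll interval
  private final long maxOffsetMillis;  // upper bound of the random per-host delay

  RollSchedule(long intervalMillis, long maxOffsetMillis) {
    if (intervalMillis <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }
    if (maxOffsetMillis < 0 || maxOffsetMillis >= intervalMillis) {
      throw new IllegalArgumentException("offset must be in [0, interval)");
    }
    this.intervalMillis = intervalMillis;
    this.maxOffsetMillis = maxOffsetMillis;
  }

  /** First interval boundary strictly after nowMillis. */
  long nextBoundary(long nowMillis) {
    return (Math.floorDiv(nowMillis, intervalMillis) + 1) * intervalMillis;
  }

  /** Next boundary plus a random delay in [0, maxOffsetMillis]. */
  long nextRollTime(long nowMillis) {
    long offset = maxOffsetMillis == 0
        ? 0
        : ThreadLocalRandom.current().nextLong(maxOffsetMillis + 1);
    return nextBoundary(nowMillis) + offset;
  }
}
```

With an interval of one hour and a zero offset this reduces to the "top of every hour" behaviour described above; a non-zero offset spreads the flushes of different hosts across the offset window.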
[jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
[ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299161#comment-15299161 ] Andrew Wang commented on HDFS-9924: --- I'm still not convinced enough to change my -1 on Future in 2.8. Even if what's currently committed is marked Unstable, I don't want to rush ahead with an API we know is insufficient for async-style programming. Earlier in this JIRA's comments, others were asking about ListenableFuture for the same reasons. It's not fair to push the burden of supporting multiple APIs onto our downstreams, when we have a few possible solutions close at-hand: * Use Deferred, which HBase and Kudu adopted due to the lack of CompletableFuture in JDK7. ListenableFuture might be good too. * Target this for 3.0 and use CompletableFuture. We're actively working on 3.0, and the first 3.0.0 alpha is likely coming out around the same time as 2.8.0. > [umbrella] Asynchronous HDFS Access > --- > > Key: HDFS-9924 > URL: https://issues.apache.org/jira/browse/HDFS-9924 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Reporter: Tsz Wo Nicholas Sze >Assignee: Xiaobing Zhou > Attachments: AsyncHdfs20160510.pdf > > > This is an umbrella JIRA for supporting Asynchronous HDFS Access. > Currently, all the API methods are blocking calls -- the caller is blocked > until the method returns. It is very slow if a client makes a large number > of independent calls in a single thread since each call has to wait until the > previous call is finished. It is inefficient if a client needs to create a > large number of threads to invoke the calls. > We propose adding a new API to support asynchronous calls, i.e. the caller is > not blocked. The methods in the new API immediately return a Java Future > object. The return value can be obtained by the usual Future.get() method. 
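To make the point of contention concrete, here is a small sketch of the difference between a plain `java.util.concurrent.Future` and a Java 8 `CompletableFuture`. The `fakeRename` method is a hypothetical stand-in for a remote call and is not part of any HDFS API; the sketch says nothing about how the async HDFS client itself is implemented.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class AsyncStyles {
  /** Hypothetical stand-in for a remote call; not an HDFS API. */
  static String fakeRename(String src, String dst) {
    return src + " -> " + dst;
  }

  /** Plain Future: the only way to observe the result is to block in get(). */
  static String withFuture(ExecutorService pool) {
    Callable<String> task = () -> fakeRename("/a", "/b");
    Future<String> f = pool.submit(task);
    try {
      return f.get(); // caller thread waits here
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    }
  }

  /** CompletableFuture: follow-up work is chained and runs when the result arrives. */
  static CompletableFuture<Integer> withCompletableFuture(ExecutorService pool) {
    return CompletableFuture
        .supplyAsync(() -> fakeRename("/a", "/b"), pool)
        .thenApply(String::length);
  }
}
```

The second style is what the comment above refers to as async-style programming: the caller registers what should happen next instead of parking a thread in `get()`. `CompletableFuture` requires JDK 8, which is why it is tied to the 3.0 option in the list above.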
[jira] [Commented] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299147#comment-15299147 ] Karthik Kambatla commented on HDFS-9782: Okay, will check this into branch-2 then. > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10341) Add a metric to expose the timeout number of pending replication blocks
[ https://issues.apache.org/jira/browse/HDFS-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-10341: - Attachment: HDFS-10341.03.patch 03: Expose the total number of timed out pending replication blocks without using an additional AtomicInteger. > Add a metric to expose the timeout number of pending replication blocks > --- > > Key: HDFS-10341 > URL: https://issues.apache.org/jira/browse/HDFS-10341 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA > Attachments: HDFS-10341.01.patch, HDFS-10341.02.patch, > HDFS-10341.03.patch > > > Per HDFS-6682, recording the timeout number of pending replication blocks is > useful to get the cluster health. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10341) Add a metric to expose the timeout number of pending replication blocks
[ https://issues.apache.org/jira/browse/HDFS-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-10341: - Attachment: HDFS-10341.02.patch Thanks Arpit for the comment. Updated the patch. I wanted to expose the *total* number of timeouts, so I created an additional AtomicInteger in the previous patch. In the 02 patch, the metric shows the *current* number of timed out pending replication blocks because {{timedOutItems}} is cleared in {{getTimedOutBlocks()}}. I suspect that the current number doesn't suit us because it is cleared very frequently (the recheck interval is 3 sec by default, which is probably smaller than the interval of the metrics sink). What do you think? > Add a metric to expose the timeout number of pending replication blocks > --- > > Key: HDFS-10341 > URL: https://issues.apache.org/jira/browse/HDFS-10341 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA > Attachments: HDFS-10341.01.patch, HDFS-10341.02.patch > > > Per HDFS-6682, recording the timeout number of pending replication blocks is > useful to get the cluster health. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
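The difference between the *current* and the *total* count can be illustrated with a small sketch. It assumes a list named `timedOutItems` that is drained by `getTimedOutBlocks()`, as described in the comment above; everything else (the class `TimedOutTracker`, the `String` block ids, the `totalTimedOut` field) is hypothetical and is not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the PendingReplicationBlocks implementation. */
final class TimedOutTracker {
  private final List<String> timedOutItems = new ArrayList<>();
  private long totalTimedOut = 0; // guarded by the timedOutItems lock

  /** Records one timed-out block and bumps the cumulative count. */
  void recordTimeout(String blockId) {
    synchronized (timedOutItems) {
      timedOutItems.add(blockId);
      totalTimedOut++;
    }
  }

  /** Drains the pending list, which resets the current count to zero. */
  List<String> getTimedOutBlocks() {
    synchronized (timedOutItems) {
      List<String> drained = new ArrayList<>(timedOutItems);
      timedOutItems.clear();
      return drained;
    }
  }

  /** Current count: drops to zero after every drain. */
  int getCurrentTimedOut() {
    synchronized (timedOutItems) {
      return timedOutItems.size();
    }
  }

  /** Total count: survives drains, so a slow metrics sink still sees it. */
  long getTotalTimedOut() {
    synchronized (timedOutItems) {
      return totalTimedOut;
    }
  }
}
```

Keeping the cumulative counter under the same lock as the list is one way to avoid a separate atomic field; whether that matches the 03 patch is not something this sketch can confirm.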
[jira] [Created] (HDFS-10455) Logging the username when deny the setOwner operation
Tianyin Xu created HDFS-10455: - Summary: Logging the username when deny the setOwner operation Key: HDFS-10455 URL: https://issues.apache.org/jira/browse/HDFS-10455 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.2 Reporter: Tianyin Xu Priority: Minor Attachments: HDFS-10455.000.patch The attached patch appends the user name in the logging when the setOwner operation is denied due to insufficient permissions on this user (based on his/her name). The same practice is used in {{FSPermissionChecker}} such as {{checkOwner()}} and {{checkSuperuserPrivilege()}}. {code:title=FSDirAttrOp.java|borderStyle=solid} if (!pc.isSuperUser()) { if (username != null && !pc.getUser().equals(username)) { - throw new AccessControlException("Non-super user cannot change owner"); + throw new AccessControlException("User " + pc.getUser() + + " is not a super user (non-super user cannot change owner)."); } if (group != null && !pc.containsGroup(group)) { - throw new AccessControlException("User does not belong to " + group); + throw new AccessControlException("User " + pc.getUser() + + " does not belong to " + group); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
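For readers who want to see the effect of the patch in isolation, the following standalone sketch mirrors the two checks and the new messages. It is not the FSDirAttrOp code: `OwnerCheck`, `DeniedException`, and the explicit `caller`, `isSuperUser`, and `callerGroups` parameters are hypothetical replacements for the permission checker object used in the patch.

```java
import java.util.Set;

/** Hypothetical standalone sketch of the checks shown in the patch above. */
final class OwnerCheck {
  /** Stand-in for AccessControlException; unchecked to keep the sketch short. */
  static final class DeniedException extends RuntimeException {
    DeniedException(String message) {
      super(message);
    }
  }

  static void checkSetOwner(String caller, boolean isSuperUser,
      Set<String> callerGroups, String username, String group) {
    if (isSuperUser) {
      return; // super users may change owner and group freely
    }
    if (username != null && !caller.equals(username)) {
      throw new DeniedException("User " + caller
          + " is not a super user (non-super user cannot change owner).");
    }
    if (group != null && !callerGroups.contains(group)) {
      throw new DeniedException("User " + caller
          + " does not belong to " + group);
    }
  }
}
```

With the caller's name in the message, a denied request can be traced to a specific user from the log line alone, which is the stated purpose of the change.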
[jira] [Updated] (HDFS-10455) Logging the username when deny the setOwner operation
[ https://issues.apache.org/jira/browse/HDFS-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianyin Xu updated HDFS-10455: -- Attachment: HDFS-10455.000.patch Patch against trunk > Logging the username when deny the setOwner operation > - > > Key: HDFS-10455 > URL: https://issues.apache.org/jira/browse/HDFS-10455 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.2 >Reporter: Tianyin Xu >Priority: Minor > Attachments: HDFS-10455.000.patch > > > The attached patch appends the user name in the logging when the setOwner > operation is denied due to insufficient permissions on this user (based on > his/her name). > The same practice is used in {{FSPermissionChecker}} such as {{checkOwner()}} > and {{checkSuperuserPrivilege()}}. > {code:title=FSDirAttrOp.java|borderStyle=solid} >if (!pc.isSuperUser()) { > if (username != null && !pc.getUser().equals(username)) { > - throw new AccessControlException("Non-super user cannot change > owner"); > + throw new AccessControlException("User " + pc.getUser() > + + " is not a super user (non-super user cannot change > owner)."); > } > if (group != null && !pc.containsGroup(group)) { > - throw new AccessControlException("User does not belong to " + > group); > + throw new AccessControlException("User " + pc.getUser() > + + " does not belong to " + group); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10455) Logging the username when deny the setOwner operation
[ https://issues.apache.org/jira/browse/HDFS-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianyin Xu updated HDFS-10455: -- Status: Patch Available (was: Open) > Logging the username when deny the setOwner operation > - > > Key: HDFS-10455 > URL: https://issues.apache.org/jira/browse/HDFS-10455 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.2 >Reporter: Tianyin Xu >Priority: Minor > Attachments: HDFS-10455.000.patch > > > The attached patch appends the user name in the logging when the setOwner > operation is denied due to insufficient permissions on this user (based on > his/her name). > The same practice is used in {{FSPermissionChecker}} such as {{checkOwner()}} > and {{checkSuperuserPrivilege()}}. > {code:title=FSDirAttrOp.java|borderStyle=solid} >if (!pc.isSuperUser()) { > if (username != null && !pc.getUser().equals(username)) { > - throw new AccessControlException("Non-super user cannot change > owner"); > + throw new AccessControlException("User " + pc.getUser() > + + " is not a super user (non-super user cannot change > owner)."); > } > if (group != null && !pc.containsGroup(group)) { > - throw new AccessControlException("User does not belong to " + > group); > + throw new AccessControlException("User " + pc.getUser() > + + " does not belong to " + group); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10341) Add a metric to expose the timeout number of pending replication blocks
[ https://issues.apache.org/jira/browse/HDFS-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299076#comment-15299076 ] Arpit Agarwal commented on HDFS-10341: -- Hi [~ajisakaa], can we just return timedOutItems.size() by querying it within the timedOutItems object lock instead of adding a new Atomic int? I checked every place that lock is taken and it is fairly lightweight, so I see no problem acquiring it when querying metrics. Thanks. > Add a metric to expose the timeout number of pending replication blocks > --- > > Key: HDFS-10341 > URL: https://issues.apache.org/jira/browse/HDFS-10341 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA > Attachments: HDFS-10341.01.patch > > > Per HDFS-6682, recording the timeout number of pending replication blocks is > useful to get the cluster health. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10390) Implement asynchronous setAcl/getAclStatus for DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299035#comment-15299035 ] Hudson commented on HDFS-10390: --- SUCCESS: Integrated in Hadoop-trunk-Commit #9850 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9850/]) HDFS-10390. Implement asynchronous setAcl/getAclStatus for (szetszwo: rev 02d4e478a398c24a5e5e8ea2b0822a5b9d4a97ae) * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/AsyncDistributedFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestAsyncDFSRename.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestAsyncDFS.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java > Implement asynchronous setAcl/getAclStatus for DistributedFileSystem > > > Key: HDFS-10390 > URL: https://issues.apache.org/jira/browse/HDFS-10390 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 2.8.0 > > Attachments: HDFS-10390-HDFS-9924.000.patch, > HDFS-10390-HDFS-9924.001.patch, HDFS-10390-HDFS-9924.002.patch, > HDFS-10390-HDFS-9924.003.patch, HDFS-10390-HDFS-9924.004.patch, > HDFS-10390-HDFS-9924.005.patch, HDFS-10390-HDFS-9924.006.patch, > HDFS-10390-HDFS-9924.007.patch, HDFS-10390-HDFS-9924.008.patch, > HDFS-10390-HDFS-9924.009.patch, HDFS-10390-HDFS-9924.010.patch, > HDFS-10390-HDFS-9924.011.patch > > > This is proposed to implement asynchronous setAcl/getAclStatus. 
[jira] [Updated] (HDFS-10390) Implement asynchronous setAcl/getAclStatus for DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-10390: --- Resolution: Fixed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, Xiaobing! > Implement asynchronous setAcl/getAclStatus for DistributedFileSystem > > > Key: HDFS-10390 > URL: https://issues.apache.org/jira/browse/HDFS-10390 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 2.8.0 > > Attachments: HDFS-10390-HDFS-9924.000.patch, > HDFS-10390-HDFS-9924.001.patch, HDFS-10390-HDFS-9924.002.patch, > HDFS-10390-HDFS-9924.003.patch, HDFS-10390-HDFS-9924.004.patch, > HDFS-10390-HDFS-9924.005.patch, HDFS-10390-HDFS-9924.006.patch, > HDFS-10390-HDFS-9924.007.patch, HDFS-10390-HDFS-9924.008.patch, > HDFS-10390-HDFS-9924.009.patch, HDFS-10390-HDFS-9924.010.patch, > HDFS-10390-HDFS-9924.011.patch > > > This is proposed to implement asynchronous setAcl/getAclStatus. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10390) Implement asynchronous setAcl/getAclStatus for DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298998#comment-15298998 ] Tsz Wo Nicholas Sze commented on HDFS-10390: Jenkins does not seem to be picking up v011. Since the difference between v010 and v011 is only in TestAsyncDFS, I will test it manually and commit the patch if no problems are found. > Implement asynchronous setAcl/getAclStatus for DistributedFileSystem > > > Key: HDFS-10390 > URL: https://issues.apache.org/jira/browse/HDFS-10390 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10390-HDFS-9924.000.patch, > HDFS-10390-HDFS-9924.001.patch, HDFS-10390-HDFS-9924.002.patch, > HDFS-10390-HDFS-9924.003.patch, HDFS-10390-HDFS-9924.004.patch, > HDFS-10390-HDFS-9924.005.patch, HDFS-10390-HDFS-9924.006.patch, > HDFS-10390-HDFS-9924.007.patch, HDFS-10390-HDFS-9924.008.patch, > HDFS-10390-HDFS-9924.009.patch, HDFS-10390-HDFS-9924.010.patch, > HDFS-10390-HDFS-9924.011.patch > > > This is proposed to implement asynchronous setAcl/getAclStatus. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10390) Implement asynchronous setAcl/getAclStatus for DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-10390: --- Hadoop Flags: Reviewed +1 patch looks good. Thanks for all the hard work! > Implement asynchronous setAcl/getAclStatus for DistributedFileSystem > > > Key: HDFS-10390 > URL: https://issues.apache.org/jira/browse/HDFS-10390 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10390-HDFS-9924.000.patch, > HDFS-10390-HDFS-9924.001.patch, HDFS-10390-HDFS-9924.002.patch, > HDFS-10390-HDFS-9924.003.patch, HDFS-10390-HDFS-9924.004.patch, > HDFS-10390-HDFS-9924.005.patch, HDFS-10390-HDFS-9924.006.patch, > HDFS-10390-HDFS-9924.007.patch, HDFS-10390-HDFS-9924.008.patch, > HDFS-10390-HDFS-9924.009.patch, HDFS-10390-HDFS-9924.010.patch, > HDFS-10390-HDFS-9924.011.patch > > > This is proposed to implement asynchronous setAcl/getAclStatus. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9365) Balancer does not work with the HDFS-6376 HA setup
[ https://issues.apache.org/jira/browse/HDFS-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-9365: -- Resolution: Fixed Fix Version/s: 2.7.3 Status: Resolved (was: Patch Available) Thanks Rakesh, Haohui and Jing for the review comments. I have committed this. > Balancer does not work with the HDFS-6376 HA setup > - > > Key: HDFS-9365 > URL: https://issues.apache.org/jira/browse/HDFS-9365 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.7.3 > > Attachments: h9365_20151119.patch, h9365_20151120.patch, > h9365_20160523.patch > > > HDFS-6376 added support for DistCp between two HA clusters. After the > change, Balancer will use all the NN from both the local and the remote > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9547) DiskBalancer : Add user documentation
[ https://issues.apache.org/jira/browse/HDFS-9547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298877#comment-15298877 ]
Hadoop QA commented on HDFS-9547:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 36s {color} | {color:green} HDFS-1312 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s {color} | {color:green} HDFS-1312 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 24 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 10 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 9m 0s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805976/HDFS-9547-HDFS-1312.002.patch |
| JIRA Issue | HDFS-9547 |
| Optional Tests | asflicense mvnsite |
| uname | Linux b3bd7bd3cbbf 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-1312 / fa600a6 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/15549/artifact/patchprocess/whitespace-eol.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/15549/artifact/patchprocess/whitespace-tabs.txt |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15549/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated. > DiskBalancer : Add user documentation > - > > Key: HDFS-9547 > URL: https://issues.apache.org/jira/browse/HDFS-9547 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9547-HDFS-1312.001.patch, > HDFS-9547-HDFS-1312.002.patch > > > Write diskbalancer.md since this is a new tool and explain the usage with > examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9547) DiskBalancer : Add user documentation
[ https://issues.apache.org/jira/browse/HDFS-9547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-9547: --- Attachment: HDFS-9547-HDFS-1312.002.patch Addressed all code review comments > DiskBalancer : Add user documentation > - > > Key: HDFS-9547 > URL: https://issues.apache.org/jira/browse/HDFS-9547 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9547-HDFS-1312.001.patch, > HDFS-9547-HDFS-1312.002.patch > > > Write diskbalancer.md since this is a new tool and explain the usage with > examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298811#comment-15298811 ] Hudson commented on HDFS-6376: -- SUCCESS: Integrated in Hadoop-trunk-Commit #9849 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9849/]) HDFS-9365. Balaner does not work with the HDFS-6376 HA setup. (szetszwo: rev 15ed080e3610b7526eff12391de780948a75fa7b) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestStorageMover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithMultipleNameNodes.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java > Distcp data between two HA clusters requires another configuration > -- > > Key: HDFS-6376 > URL: https://issues.apache.org/jira/browse/HDFS-6376 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, federation, hdfs-client >Affects Versions: 2.2.0, 2.3.0, 2.4.0 > Environment: Hadoop 2.3.0 >Reporter: Dave Marion >Assignee: Dave Marion > Fix For: 2.6.0 > > Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, > HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, > HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, 
HDFS-6376-branch-2.4.patch, > HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch, > HDFS-6376.009.patch, HDFS-6376.010.patch, HDFS-6376.011.patch > > > User has to create a third set of configuration files for distcp when > transferring data between two HA clusters. > Consider the scenario in [1]. You cannot put all of the required properties > in core-site.xml and hdfs-site.xml for the client to resolve the location of > both active namenodes. If you do, then the datanodes from cluster A may join > cluster B. I can not find a configuration option that tells the datanodes to > federate blocks for only one of the clusters in the configuration. > [1] > http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10370) Allow DataNode to be started with numactl
[ https://issues.apache.org/jira/browse/HDFS-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Marion updated HDFS-10370: --- Attachment: HDFS-10370-branch-2.004.patch HDFS-10370.004.patch Updated patches for master and branch-2. Note that I have run the branch-2 patch locally on a Hadoop 2.6.0 distribution. I have not tested 3.0, so I'm not sure if the code is in the correct location. One way to test is to enable numactl, set the args to bind the DN process to a cpu, start the DN and verify with `taskset -pc ` > Allow DataNode to be started with numactl > - > > Key: HDFS-10370 > URL: https://issues.apache.org/jira/browse/HDFS-10370 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Dave Marion >Assignee: Dave Marion > Attachments: HDFS-10370-1.patch, HDFS-10370-2.patch, > HDFS-10370-3.patch, HDFS-10370-branch-2.004.patch, HDFS-10370.004.patch > > > Allow numactl constraints to be applied to the datanode process. The > implementation I have in mind involves two environment variables (enable and > parameters) in the datanode startup process. Basically, if enabled and > numactl exists on the system, then start the java process using it. Provide a > default set of parameters, and allow the user to override the default. Wiring > this up for the non-jsvc use case seems straightforward. Not sure how this > can be supported using jsvc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9365) Balaner does not work with the HDFS-6376 HA setup
[ https://issues.apache.org/jira/browse/HDFS-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298812#comment-15298812 ] Hudson commented on HDFS-9365: -- SUCCESS: Integrated in Hadoop-trunk-Commit #9849 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9849/]) HDFS-9365. Balaner does not work with the HDFS-6376 HA setup. (szetszwo: rev 15ed080e3610b7526eff12391de780948a75fa7b) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestStorageMover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithMultipleNameNodes.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java > Balaner does not work with the HDFS-6376 HA setup > - > > Key: HDFS-9365 > URL: https://issues.apache.org/jira/browse/HDFS-9365 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h9365_20151119.patch, h9365_20151120.patch, > h9365_20160523.patch > > > HDFS-6376 added support for DistCp between two HA clusters. After the > change, Balaner will use all the NN from both the local and the remote > clusters. 
[jira] [Commented] (HDFS-8057) Move BlockReader implementation to the client implementation package
[ https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298804#comment-15298804 ] Tsz Wo Nicholas Sze commented on HDFS-8057: --- There are also three "RemoteBlockReader"s in BlockReaderRemote2. Sorry that I did not mention it last time. > Move BlockReader implementation to the client implementation package > > > Key: HDFS-8057 > URL: https://issues.apache.org/jira/browse/HDFS-8057 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Takanobu Asanuma > Attachments: HDFS-8057.1.patch, HDFS-8057.2.patch, HDFS-8057.3.patch, > HDFS-8057.4.patch, HDFS-8057.branch-2.001.patch, > HDFS-8057.branch-2.002.patch, HDFS-8057.branch-2.5.patch > > > BlockReaderLocal, RemoteBlockReader, etc. should be moved to > org.apache.hadoop.hdfs.client.impl. We may as well rename RemoteBlockReader > to BlockReaderRemote. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298742#comment-15298742 ]

Hadoop QA commented on HDFS-10301:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 6m 47s | trunk passed |
| +1 | compile | 0m 44s | trunk passed |
| +1 | checkstyle | 0m 28s | trunk passed |
| +1 | mvnsite | 0m 49s | trunk passed |
| +1 | mvneclipse | 0m 11s | trunk passed |
| +1 | findbugs | 1m 37s | trunk passed |
| +1 | javadoc | 1m 6s | trunk passed |
| +1 | mvninstall | 0m 46s | the patch passed |
| +1 | compile | 0m 40s | the patch passed |
| +1 | javac | 0m 40s | the patch passed |
| -1 | checkstyle | 0m 26s | hadoop-hdfs-project/hadoop-hdfs: patch generated 6 new + 293 unchanged - 0 fixed = 299 total (was 293) |
| +1 | mvnsite | 0m 47s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 43s | the patch passed |
| +1 | javadoc | 1m 2s | the patch passed |
| -1 | unit | 76m 2s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
| | | 95m 7s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
| | hadoop.hdfs.TestSafeMode |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805931/HDFS-10301.005.patch |
| JIRA Issue | HDFS-10301 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 26dc17f5173c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 57c31a3 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15548/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15548/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HDFS-Build/15548/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15548/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15548/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |

This message was automatically generated.

>
[jira] [Commented] (HDFS-10415) TestDistributedFileSystem#testDFSCloseOrdering() fails on branch-2
[ https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298714#comment-15298714 ] Mingliang Liu commented on HDFS-10415: -- If no objections about the 1st approach, does the v1 patch look good? Thanks! > TestDistributedFileSystem#testDFSCloseOrdering() fails on branch-2 > -- > > Key: HDFS-10415 > URL: https://issues.apache.org/jira/browse/HDFS-10415 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.9.0 > Environment: jenkins >Reporter: Sangjin Lee >Assignee: Mingliang Liu > Attachments: HDFS-10415-branch-2.000.patch, > HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch > > > {noformat} > Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem > testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem) Time > elapsed: 0.045 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790) > at > org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417) > at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084) > at > org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187) > at > org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217) > {noformat} > This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other > combinations. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10220) A large number of expired leases can make namenode unresponsive and cause failover
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298678#comment-15298678 ] Ravi Prakash commented on HDFS-10220: - I think this probably happens rarely, so having dynamic duty cycles may be overkill IMHO. If you can make MAX_LOCK_HOLD_TO_RELEASE_LEASE_MS and the lease check interval config parameters, I'll +1 and commit. We've vacillated on this long enough. > A large number of expired leases can make namenode unresponsive and cause > failover > -- > > Key: HDFS-10220 > URL: https://issues.apache.org/jira/browse/HDFS-10220 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Nicolas Fraison >Assignee: Nicolas Fraison >Priority: Minor > Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, > HADOOP-10220.003.patch, HADOOP-10220.004.patch, HADOOP-10220.005.patch, > HADOOP-10220.006.patch, threaddump_zkfc.txt > > > I have faced a namenode failover due to an unresponsive namenode detected by the > zkfc, with lots of WARN messages (5 million) like this one: > _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All > existing blocks are COMPLETE, lease removed, file closed._ > In the thread dump taken by the zkfc there are lots of threads blocked on a > lock. > Looking at the code, there is a lock taken by the LeaseManager.Monitor when > some leases must be released. Due to the very large number of leases to be > released, the namenode took too long to release them, blocking all > other tasks and making the zkfc think that the namenode was not > available/stuck. > The idea of this patch is to limit the number of leases released each time we > check for leases, so the lock won't be held for too long. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
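The throttling idea in the HDFS-10220 description above (release only as many leases per check as fit into a bounded lock hold time) can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the class `BoundedLeaseReleaser`, its methods, and the injectable clock are hypothetical names, and the constructor argument merely plays the role of the MAX_LOCK_HOLD_TO_RELEASE_LEASE_MS value mentioned in the comment.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;

/** Hypothetical sketch; this is not the actual LeaseManager code. */
final class BoundedLeaseReleaser {
  // Stands in for the lock that other namenode operations also need.
  private final ReentrantLock lock = new ReentrantLock();
  private final Deque<String> expiredLeases = new ArrayDeque<>();
  private final long maxLockHoldMs;   // plays the role of MAX_LOCK_HOLD_TO_RELEASE_LEASE_MS
  private final LongSupplier clockMs; // injectable clock so the loop can be tested

  BoundedLeaseReleaser(long maxLockHoldMs, LongSupplier clockMs) {
    this.maxLockHoldMs = maxLockHoldMs;
    this.clockMs = clockMs;
  }

  void addExpired(String leaseHolder) {
    lock.lock();
    try {
      expiredLeases.add(leaseHolder);
    } finally {
      lock.unlock();
    }
  }

  /** Releases leases until none remain or the time budget is spent; returns the count released. */
  int checkLeases() {
    lock.lock();
    try {
      long start = clockMs.getAsLong();
      int released = 0;
      while (!expiredLeases.isEmpty()) {
        expiredLeases.poll(); // a real implementation would close the file here
        released++;
        if (clockMs.getAsLong() - start >= maxLockHoldMs) {
          break; // give up the lock; the remaining leases wait for the next check
        }
      }
      return released;
    } finally {
      lock.unlock();
    }
  }

  int pending() {
    lock.lock();
    try {
      return expiredLeases.size();
    } finally {
      lock.unlock();
    }
  }
}
```

In this sketch the budget is checked after each release, so every check makes progress on at least one lease even with a very small budget. Whether the real patch bounds by time, by count, or both is not something this sketch claims to know.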
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298675#comment-15298675 ] Colin Patrick McCabe commented on HDFS-10301: - Oh, sorry! I didn't realize we had added a new rule about attaching patches. > BlockReport retransmissions may lead to storages falsely being declared > zombie if storage report processing happens out of order > > > Key: HDFS-10301 > URL: https://issues.apache.org/jira/browse/HDFS-10301 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.1 >Reporter: Konstantin Shvachko >Assignee: Colin Patrick McCabe >Priority: Critical > Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, > HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, > HDFS-10301.sample.patch, zombieStorageLogs.rtf > > > When the NameNode is busy, a DataNode can time out sending a block report. Then it > sends the block report again. The NameNode, while processing these two reports > at the same time, can interleave processing storages from different reports. > This screws up the blockReportId field, which makes the NameNode think that some > storages are zombies. Replicas from zombie storages are immediately removed, > causing missing blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10370) Allow DataNode to be started with numactl
[ https://issues.apache.org/jira/browse/HDFS-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298674#comment-15298674 ] Dave Marion commented on HDFS-10370: Not yet, we are just starting to test this. I don't expect that we would have any different performance #'s for the DN, I would expect them to be in Accumulo. > Allow DataNode to be started with numactl > - > > Key: HDFS-10370 > URL: https://issues.apache.org/jira/browse/HDFS-10370 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Dave Marion >Assignee: Dave Marion > Attachments: HDFS-10370-1.patch, HDFS-10370-2.patch, > HDFS-10370-3.patch > > > Allow numactl constraints to be applied to the datanode process. The > implementation I have in mind involves two environment variables (enable and > parameters) in the datanode startup process. Basically, if enabled and > numactl exists on the system, then start the java process using it. Provide a > default set of parameters, and allow the user to override the default. Wiring > this up for the non-jsvc use case seems straightforward. Not sure how this > can be supported using jsvc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298651#comment-15298651 ] Daniel Templeton commented on HDFS-9782: There's already a JIRA for it: HDFS-10449 > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
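The two requirements in the HDFS-9782 description above (a configurable roll interval, plus some play so that hosts do not all flush at once) could be sketched like this. It is only an illustration under stated assumptions: `RollScheduleSketch`, `nextBoundary`, and `nextRollTime` are hypothetical names, timestamps are assumed to be non-negative epoch milliseconds, and nothing here is taken from the actual RollingFileSystemSink code.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch of interval-aligned rolling with a random per-host offset. */
final class RollScheduleSketch {
  private RollScheduleSketch() {}

  /** First multiple of intervalMs that lies strictly after nowMs (nowMs assumed >= 0). */
  static long nextBoundary(long nowMs, long intervalMs) {
    if (intervalMs <= 0) {
      throw new IllegalArgumentException("intervalMs must be positive");
    }
    return (nowMs / intervalMs + 1) * intervalMs;
  }

  /** Next boundary plus a random offset in [0, maxOffsetMs], so hosts spread out their rolls. */
  static long nextRollTime(long nowMs, long intervalMs, long maxOffsetMs) {
    long offset = maxOffsetMs > 0
        ? ThreadLocalRandom.current().nextLong(maxOffsetMs + 1)
        : 0L;
    return nextBoundary(nowMs, intervalMs) + offset;
  }
}
```

With an interval of one hour and a zero offset, this reduces to the "top of every hour" behaviour the description calls the current default.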
[jira] [Commented] (HDFS-10449) TestRollingFileSystemSinkWithHdfs#testFailedClose() fails on branch-2
[ https://issues.apache.org/jira/browse/HDFS-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298656#comment-15298656 ] Daniel Templeton commented on HDFS-10449: - Go for it. We'll work out the JIRA ownership later. > TestRollingFileSystemSinkWithHdfs#testFailedClose() fails on branch-2 > - > > Key: HDFS-10449 > URL: https://issues.apache.org/jira/browse/HDFS-10449 > Project: Hadoop HDFS > Issue Type: Bug > Components: test > Environment: jenkins >Reporter: Takanobu Asanuma > > {noformat} > Running org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.263 sec <<< > FAILURE! - in > org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs > testFailedClose(org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs) > Time elapsed: 8.729 sec <<< FAILURE! > java.lang.AssertionError: No exception was generated while stopping sink even > though HDFS was unavailable > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs.testFailedClose(TestRollingFileSystemSinkWithHdfs.java:187) > {noformat} > This passes fine on trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10370) Allow DataNode to be started with numactl
[ https://issues.apache.org/jira/browse/HDFS-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298631#comment-15298631 ] John Zhuge commented on HDFS-10370: --- Thanks for the clarifications. I understand the current focus is on "interleave memory allocations", do you have any performance test results? > Allow DataNode to be started with numactl > - > > Key: HDFS-10370 > URL: https://issues.apache.org/jira/browse/HDFS-10370 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Dave Marion >Assignee: Dave Marion > Attachments: HDFS-10370-1.patch, HDFS-10370-2.patch, > HDFS-10370-3.patch > > > Allow numactl constraints to be applied to the datanode process. The > implementation I have in mind involves two environment variables (enable and > parameters) in the datanode startup process. Basically, if enabled and > numactl exists on the system, then start the java process using it. Provide a > default set of parameters, and allow the user to override the default. Wiring > this up for the non-jsvc use case seems straightforward. Not sure how this > can be supported using jsvc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-10301: Attachment: HDFS-10301.005.patch Rebasing patch 003 on trunk. > BlockReport retransmissions may lead to storages falsely being declared > zombie if storage report processing happens out of order > > > Key: HDFS-10301 > URL: https://issues.apache.org/jira/browse/HDFS-10301 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.1 >Reporter: Konstantin Shvachko >Assignee: Colin Patrick McCabe >Priority: Critical > Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, > HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, > HDFS-10301.sample.patch, zombieStorageLogs.rtf > > > When the NameNode is busy, a DataNode can time out sending a block report. Then it > sends the block report again. The NameNode, while processing these two reports > at the same time, can interleave processing storages from different reports. > This screws up the blockReportId field, which makes the NameNode think that some > storages are zombies. Replicas from zombie storages are immediately removed, > causing missing blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298616#comment-15298616 ] Hudson commented on HDFS-9782: -- SUCCESS: Integrated in Hadoop-trunk-Commit #9847 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9847/]) HDFS-9782. RollingFileSystemSink should have configurable roll interval. (kasha: rev 57c31a3fef625f1ec609d7e8873d4941f7ed5cbc) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/metrics2/sink/TestRollingFileSystemSinkWithSecureHdfs.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/metrics2/sink/TestRollingFileSystemSinkWithHdfs.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/sink/TestRollingFileSystemSinkWithLocal.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/sink/TestRollingFileSystemSink.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/sink/RollingFileSystemSinkTestBase.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/sink/RollingFileSystemSink.java > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. 
[jira] [Commented] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298602#comment-15298602 ] Karthik Kambatla commented on HDFS-9782: Committed to trunk, but TestRollingFileSystemSinkWithHdfs#testFailedClose fails on branch-2. [~templedf] - can you look into the test failure? > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298604#comment-15298604 ]

Zhe Zhang commented on HDFS-10301:
----------------------------------

[~cmccabe] Just a quick note that it's a new JIRA rule that you have to be either the assignee or a committer to attach a patch.

> BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
> ----------------------------------
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.1
> Reporter: Konstantin Shvachko
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
> When the NameNode is busy, a DataNode can time out sending a block report. It then sends the block report again. While processing these two reports at the same time, the NameNode can interleave the processing of storages from different reports.
> This screws up the blockReportId field, which makes the NameNode think that some storages are zombies. Replicas from zombie storages are immediately removed, causing missing blocks.
[jira] [Commented] (HDFS-10430) Refactor FileSystem#checkAccessPermissions for better reuse from tests
[ https://issues.apache.org/jira/browse/HDFS-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298596#comment-15298596 ]

Chris Nauroth commented on HDFS-10430:
--------------------------------------

It's not immediately clear to me why another project's tests would need direct access to this method instead of using the public {{FileSystem#access}} method. Maybe seeing the proposed patch or pointing out examples would help clarify.

The reason for the existence of the package-private {{FileSystem#checkAccessPermissions}} method is to provide code sharing between {{FileSystem}} and {{AbstractFileSystem}} for a default implementation of {{access}} in the base classes. However, that default implementation is not necessarily complete or correct for all file systems. For HDFS, {{DistributedFileSystem}} overrides {{access}} to use an RPC to the NameNode. The implementation of that RPC at the NameNode is different from the base class implementation, because it considers not only permissions but also HDFS ACLs.

If {{checkAccessPermissions}} is made public, then there is a risk that applications would call it directly from main code, unaware that they could be bypassing ACL logic when connected to HDFS.

> Refactor FileSystem#checkAccessPermissions for better reuse from tests
> ----------------------------------
>
> Key: HDFS-10430
> URL: https://issues.apache.org/jira/browse/HDFS-10430
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Xiaobing Zhou
> Assignee: Xiaobing Zhou
>
> FileSystem#checkAccessPermissions could be used in a bunch of tests from different projects, but it's in hadoop-common, which is not visible in some cases.
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298565#comment-15298565 ]

Colin Patrick McCabe commented on HDFS-10301:
---------------------------------------------

Hi [~redvine],

Thanks for your interest in this. I wish I could get more people interested in this JIRA; it has been hard to raise interest, unfortunately.

Just to clarify, you don't need to assign a JIRA to yourself in order to post a patch or suggest a solution. In general, when someone is actively working on a patch, you should ask before reassigning their JIRAs to yourself.

A whole separate RPC just for reporting the storages which are present seems excessive. It will add additional load to the namenode.

{code}
     if (node.leaseId == 0) {
-      LOG.warn("BR lease 0x{} is not valid for DN {}, because the DN " +
-               "is not in the pending set.",
-               Long.toHexString(id), dn.getDatanodeUuid());
-      return false;
+      LOG.debug("DN {} is not in the pending set because BR with lease 0x{} was processed out of order",
+          dn.getDatanodeUuid(), Long.toHexString(id));
+      return true;
{code}

The leaseId being 0 doesn't mean that the block report was processed out of order. If you manually trigger a block report with the {{hdfs dfsadmin \-triggerBlockReport}} command, it will also have lease ID 0. Legacy block reports will also have lease ID 0.

In general, your solution doesn't fix the problem during upgrade and is a much bigger patch, which is why I think HDFS-10301.003.patch should be committed and the RPC changes should be done in a follow-on JIRA. I do not see us backporting RPC changes to all the stable branches.

> BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
> ----------------------------------
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.1
> Reporter: Konstantin Shvachko
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, HDFS-10301.004.patch, HDFS-10301.01.patch, HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
> When the NameNode is busy, a DataNode can time out sending a block report. It then sends the block report again. While processing these two reports at the same time, the NameNode can interleave the processing of storages from different reports.
> This screws up the blockReportId field, which makes the NameNode think that some storages are zombies. Replicas from zombie storages are immediately removed, causing missing blocks.
[jira] [Assigned] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe reassigned HDFS-10301:
-------------------------------------------

Assignee: Colin Patrick McCabe (was: Vinitha Reddy Gankidi)

> BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
> ----------------------------------
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.1
> Reporter: Konstantin Shvachko
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, HDFS-10301.004.patch, HDFS-10301.01.patch, HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
> When the NameNode is busy, a DataNode can time out sending a block report. It then sends the block report again. While processing these two reports at the same time, the NameNode can interleave the processing of storages from different reports.
> This screws up the blockReportId field, which makes the NameNode think that some storages are zombies. Replicas from zombie storages are immediately removed, causing missing blocks.
[jira] [Commented] (HDFS-10448) CacheManager#checkLimit always assumes a replication factor of 1
[ https://issues.apache.org/jira/browse/HDFS-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298543#comment-15298543 ]

Colin Patrick McCabe commented on HDFS-10448:
---------------------------------------------

I think it should change {{computeNeeded}} to take replication into account, rather than modifying the code that calls {{computeNeeded}}.

> CacheManager#checkLimit always assumes a replication factor of 1
> ----------------------------------
>
> Key: HDFS-10448
> URL: https://issues.apache.org/jira/browse/HDFS-10448
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching
> Affects Versions: 2.7.1
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Attachments: HDFS-10448.001.patch
>
> The logic in {{CacheManager#checkLimit}} is not correct. The method performs these three steps:
> First, it computes the bytes needed for the specific path.
> {code}
> CacheDirectiveStats stats = computeNeeded(path, replication);
> {code}
> But the param {{replication}} is not used here, and the bytesNeeded is just the value for one replica.
> {code}
> return new CacheDirectiveStats.Builder()
>     .setBytesNeeded(requestedBytes)
>     .setFilesCached(requestedFiles)
>     .build();
> {code}
> Second, the result is then multiplied by the replication to compare against the limit, because the method {{computeNeeded}} did not use the replication.
> {code}
> pool.getBytesNeeded() + (stats.getBytesNeeded() * replication) > pool.getLimit()
> {code}
> Third, if the size is more than the limit, it throws an exception with an explanatory message. The message divides by replication, even though {{stats.getBytesNeeded()}} is already the value for a single replica.
> {code}
> throw new InvalidRequestException("Caching path " + path + " of size "
>     + stats.getBytesNeeded() / replication + " bytes at replication "
>     + replication + " would exceed pool " + pool.getPoolName()
>     + "'s remaining capacity of "
>     + (pool.getLimit() - pool.getBytesNeeded()) + " bytes.");
> {code}
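Editorial sketch for HDFS-10448: the accounting problem described above, and the direction suggested in the comment (have {{computeNeeded}} itself account for replication), can be illustrated with a small standalone model. This is not the Hadoop code or the attached patch. The class `CacheLimitSketch`, its method signatures, and the simplified message are assumptions made for illustration; only the arithmetic mirrors the report.

```java
public class CacheLimitSketch {

    /**
     * Hypothetical stand-in for computeNeeded: returns the bytes needed
     * across ALL replicas, so callers no longer multiply by replication.
     */
    static long computeNeeded(long fileBytes, short replication) {
        return fileBytes * replication;
    }

    /**
     * Hypothetical stand-in for checkLimit. Returns null when the directive
     * fits in the pool, otherwise a message describing the overflow.
     */
    static String checkLimit(long poolBytesNeeded, long poolLimit,
                             long fileBytes, short replication) {
        long needed = computeNeeded(fileBytes, replication);
        if (poolBytesNeeded + needed > poolLimit) {
            // Dividing by replication is now meaningful: "needed" is the
            // all-replica total, so the quotient is the single-replica size.
            return "Caching path of size " + (needed / replication)
                + " bytes at replication " + replication
                + " would exceed the pool's remaining capacity of "
                + (poolLimit - poolBytesNeeded) + " bytes.";
        }
        return null;
    }

    public static void main(String[] args) {
        // 100 bytes at replication 3 needs 300 bytes of pool capacity.
        System.out.println(checkLimit(0, 300, 100, (short) 3));
        System.out.println(checkLimit(0, 250, 100, (short) 3));
    }
}
```

With the multiplication moved into one place, the limit comparison and the error message work from the same quantity, which is the inconsistency the report points out.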
[jira] [Commented] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298504#comment-15298504 ] Karthik Kambatla commented on HDFS-9782: Thanks for the updates, [~templedf]. +1, checking this in. > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10370) Allow DataNode to be started with numactl
[ https://issues.apache.org/jira/browse/HDFS-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298453#comment-15298453 ]

Dave Marion commented on HDFS-10370:
------------------------------------

bq. Could you please update the patch?

I'll try to get to it in the next day or so.

bq. Could you elaborate a bit more the use cases?

I want to be able to balance memory and CPU allocation for multiple processes on a single server. To do so I need those processes to have the ability to be managed. Specifically, I want to run multiple Accumulo tablet servers on a single host where a DN resides. One example is to interleave memory allocations for the DN across the NUMA nodes, then start 1 Accumulo tablet server per NUMA node.

bq. If we are moving into the territory of numa awareness, shall we consider a solution more generic than just Datanode?

I'm not looking for that at the moment.

bq. Do we plan to support membind or cpubind? How to assign daemons to different numa nodes?

The patch provides a default behavior (interleave memory), but that can be overridden such that the user can change the numactl options.

bq. How to deal with imbalance in usage?

This is an advanced feature. I would assume that the person enabling this knows what numactl is, how to use it, and what the side effects could be.

bq. How to support this feature across platforms?

I'm not sure if/how other platforms support something like this.

> Allow DataNode to be started with numactl
> ----------------------------------
>
> Key: HDFS-10370
> URL: https://issues.apache.org/jira/browse/HDFS-10370
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Dave Marion
> Assignee: Dave Marion
> Attachments: HDFS-10370-1.patch, HDFS-10370-2.patch, HDFS-10370-3.patch
>
> Allow numactl constraints to be applied to the datanode process. The implementation I have in mind involves two environment variables (enable and parameters) in the datanode startup process. Basically, if enabled and numactl exists on the system, then start the java process using it. Provide a default set of parameters, and allow the user to override the default. Wiring this up for the non-jsvc use case seems straightforward. Not sure how this can be supported using jsvc.
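Editorial sketch for HDFS-10370: the launch logic described above (an enable switch, an overridable parameter string, and a fallback when numactl is absent) is shown here as a small Java model. The real change would live in the DataNode startup scripts, which this does not reproduce. The environment variable names `EXAMPLE_DN_USE_NUMACTL` and `EXAMPLE_DN_NUMACTL_ARGS` are hypothetical, and `--interleave=all` is assumed as the default only because the comment says the default behavior is to interleave memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NumactlLaunchSketch {
    // Hypothetical variable names; the names used by the patch are not given above.
    static final String ENABLE_VAR = "EXAMPLE_DN_USE_NUMACTL";
    static final String PARAMS_VAR = "EXAMPLE_DN_NUMACTL_ARGS";
    // Assumed default: interleave memory across all NUMA nodes.
    static final String DEFAULT_PARAMS = "--interleave=all";

    /**
     * Prefixes the java command with numactl only when the feature is
     * enabled AND numactl is present; otherwise returns it unchanged.
     */
    static List<String> buildCommand(Map<String, String> env,
                                     boolean numactlPresent,
                                     List<String> javaCommand) {
        List<String> cmd = new ArrayList<>();
        boolean enabled = "true".equalsIgnoreCase(env.get(ENABLE_VAR));
        if (enabled && numactlPresent) {
            String params = env.get(PARAMS_VAR);
            if (params == null || params.trim().isEmpty()) {
                params = DEFAULT_PARAMS; // user did not override the default
            }
            cmd.add("numactl");
            cmd.addAll(Arrays.asList(params.trim().split("\\s+")));
        }
        cmd.addAll(javaCommand);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> java = Arrays.asList(
            "java", "org.apache.hadoop.hdfs.server.datanode.DataNode");
        Map<String, String> env = new HashMap<>();
        env.put(ENABLE_VAR, "true");
        System.out.println(buildCommand(env, true, java));
        env.put(PARAMS_VAR, "--cpunodebind=0 --membind=0");
        System.out.println(buildCommand(env, true, java));
    }
}
```

Falling back silently when numactl is missing matches the "if enabled and numactl exists" wording; whether a warning should be logged in that case is left open here.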
[jira] [Commented] (HDFS-8678) Bring back the feature to view chunks of files in the HDFS file browser
[ https://issues.apache.org/jira/browse/HDFS-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298396#comment-15298396 ]

Hadoop QA commented on HDFS-8678:
---------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 0m 44s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805912/HDFS-8678.02.patch |
| JIRA Issue | HDFS-8678 |
| Optional Tests | asflicense |
| uname | Linux 6428874d5e1a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b4078bd |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15547/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |

This message was automatically generated.

> Bring back the feature to view chunks of files in the HDFS file browser
> ----------------------------------
>
> Key: HDFS-8678
> URL: https://issues.apache.org/jira/browse/HDFS-8678
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
> Attachments: HDFS-8678.01.patch, HDFS-8678.02.patch
>
> The legacy file browser displayed small chunks of a file in the browser itself. This was useful to users because they can quickly verify that their input or output is in the format they expect. We should bring back this functionality.
[jira] [Updated] (HDFS-8678) Bring back the feature to view chunks of files in the HDFS file browser
[ https://issues.apache.org/jira/browse/HDFS-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-8678: --- Attachment: HDFS-8678.02.patch Here's a new patch which simplifies the alignment of the links > Bring back the feature to view chunks of files in the HDFS file browser > --- > > Key: HDFS-8678 > URL: https://issues.apache.org/jira/browse/HDFS-8678 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-8678.01.patch, HDFS-8678.02.patch > > > The legacy file browser displayed small chunks of a file in the browser > itself. This was useful to users because they can quickly verify that their > input or output is in the format they expect. We should bring back this > functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298384#comment-15298384 ] Hadoop QA commented on HDFS-9782: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 19s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 19s {color} | {color:red} root: patch generated 1 new + 12 unchanged - 6 fixed = 13 total (was 18) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 18s {color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 23s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 118m 21s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.net.TestDNS | | | hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger | | | hadoop.hdfs.TestCrcCorruption | | Timed out junit tests | org.apache.hadoop.http.TestHttpServerLifecycle | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805875/HDFS-9782.009.patch | | JIRA Issue | HDFS-9782 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 8c3674ce809f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b4078bd | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15545/artifact/patchprocess/diff-checkstyle-root.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15545/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt | | unit |
[jira] [Commented] (HDFS-10434) Fix intermittent test failure of TestDataNodeErasureCodingMetrics
[ https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298369#comment-15298369 ]

Rakesh R commented on HDFS-10434:
---------------------------------

Thanks [~libo-intel]. [~drankye] please take a look at this when you get a chance, thanks!

> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> ----------------------------------
>
> Key: HDFS-10434
> URL: https://issues.apache.org/jira/browse/HDFS-10434
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Rakesh R
> Assignee: Rakesh R
> Attachments: HDFS-10434-00.patch
>
> This jira is to fix the test case failure.
> Reference : [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
>   at org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}
[jira] [Commented] (HDFS-9877) HDFS Namenode UI: Fix browsing directories that need to be encoded
[ https://issues.apache.org/jira/browse/HDFS-9877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298364#comment-15298364 ]

Hadoop QA commented on HDFS-9877:
---------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 0m 37s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790615/HDFS-9877.01.patch |
| JIRA Issue | HDFS-9877 |
| Optional Tests | asflicense |
| uname | Linux 6de6786474cb 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b4078bd |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15546/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |

This message was automatically generated.

> HDFS Namenode UI: Fix browsing directories that need to be encoded
> ----------------------------------
>
> Key: HDFS-9877
> URL: https://issues.apache.org/jira/browse/HDFS-9877
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
> Attachments: HDFS-9877.01.patch
>
> When we browse a directory whose encoded path is not the same as unencoded path, *2* HTTP requests are sent (instead of 1) and the 2nd request fails. e.g. when I browse {{/tmp/new directory}} (note the space) it gets encoded to {{/tmp/new%20directory}} and then [this|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js#L40] check fails:
> {code}
> if(current_directory != dir) {
>   browse_directory(dir);
> }
> {code}
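Editorial sketch for HDFS-9877: the failing check above compares two path strings that may differ only in percent-encoding. The standalone Java example below shows why a plain string comparison misfires and one way to compare on a common (decoded) form. It does not reproduce explorer.js or the attached patch; the class and method names are hypothetical, and which of the two variables holds the encoded form in the real UI is not stated in the report.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EncodedPathSketch {

    /** Percent-encodes characters that are illegal in a URI path (e.g. a space). */
    static String encodePath(String path) {
        try {
            // The multi-argument URI constructors quote illegal characters.
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Decodes percent-escapes in an already-encoded path. */
    static String decodePath(String rawPath) {
        return URI.create(rawPath).getPath();
    }

    /** Compares an unencoded path with an encoded one on their decoded forms. */
    static boolean sameDirectory(String unencoded, String encoded) {
        return unencoded.equals(decodePath(encoded));
    }

    public static void main(String[] args) {
        String current = "/tmp/new directory";
        String dir = encodePath(current);
        System.out.println(dir);
        // Naive comparison: the strings differ although they name the same directory.
        System.out.println(current.equals(dir));
        // Comparing decoded forms treats them as the same directory.
        System.out.println(sameDirectory(current, dir));
    }
}
```

For paths with nothing to encode, such as {{/tmp}}, both comparisons agree, which is consistent with the report that the extra request only shows up for directories whose encoded and unencoded paths differ.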
[jira] [Updated] (HDFS-9877) HDFS Namenode UI: Fix browsing directories that need to be encoded
[ https://issues.apache.org/jira/browse/HDFS-9877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Prakash updated HDFS-9877:
-------------------------------

Target Version/s: (was: 2.8.0)
Status: Patch Available (was: Open)

> HDFS Namenode UI: Fix browsing directories that need to be encoded
> ----------------------------------
>
> Key: HDFS-9877
> URL: https://issues.apache.org/jira/browse/HDFS-9877
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
> Attachments: HDFS-9877.01.patch
>
> When we browse a directory whose encoded path is not the same as unencoded path, *2* HTTP requests are sent (instead of 1) and the 2nd request fails. e.g. when I browse {{/tmp/new directory}} (note the space) it gets encoded to {{/tmp/new%20directory}} and then [this|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js#L40] check fails:
> {code}
> if(current_directory != dir) {
>   browse_directory(dir);
> }
> {code}
[jira] [Commented] (HDFS-10217) show "blockScheduled count" in datanodes table.
[ https://issues.apache.org/jira/browse/HDFS-10217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298335#comment-15298335 ]

Ravi Prakash commented on HDFS-10217:
-------------------------------------

I'm fine with exposing the information as a tooltip.

> show "blockScheduled count" in datanodes table.
> ----------------------------------
>
> Key: HDFS-10217
> URL: https://issues.apache.org/jira/browse/HDFS-10217
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Brahma Reddy Battula
> Assignee: Brahma Reddy Battula
> Attachments: HDFS-10217-002.patch, HDFS-10217.patch
>
> It will be more useful for debugging purposes if the user can see how many blocks got scheduled for a DN.
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298307#comment-15298307 ]

Kai Zheng commented on HDFS-9833:
---------------------------------

Thanks Rakesh for the update on this. I will do a careful review tomorrow. Splitting up the tasks sounds good to me; please go ahead.

> Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
> ----------------------------------
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: Rakesh R
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch
>
> As discussed in HDFS-8430 and HDFS-9694, to compute the striped file checksum even when some of the striped blocks are missing, we need to consider recomputing the block checksum on the fly for the missing/corrupt blocks. To recompute the block checksum, the block data needs to be reconstructed by erasure decoding, and most of the code needed for the block reconstruction could be borrowed from HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be written out to target datanodes, but in this case the remote writing isn't necessary, as the reconstructed block data is only used to recompute the checksum.
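Editorial sketch for HDFS-9833: the idea in the description (rebuild the missing block in memory, checksum it, and discard it instead of writing it to a target datanode) can be shown with a toy model. This is a deliberately simplified stand-in: it uses a single XOR parity block instead of the erasure codecs HDFS supports, and `java.util.zip.CRC32` instead of the HDFS block checksum format. All class and method names are hypothetical and none of this is taken from the attached patches.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class OnTheFlyChecksumSketch {

    /** Toy encoder: one parity block that is the XOR of equally sized data blocks. */
    static byte[] xorParity(byte[][] dataBlocks) {
        byte[] parity = new byte[dataBlocks[0].length];
        for (byte[] block : dataBlocks) {
            for (int j = 0; j < parity.length; j++) {
                parity[j] ^= block[j];
            }
        }
        return parity;
    }

    /** Toy decoder: rebuilds the block at missingIndex from the survivors and the parity. */
    static byte[] reconstruct(byte[][] dataBlocks, int missingIndex, byte[] parity) {
        byte[] rebuilt = parity.clone();
        for (int i = 0; i < dataBlocks.length; i++) {
            if (i == missingIndex) {
                continue; // this block is unavailable, so it is never read
            }
            for (int j = 0; j < rebuilt.length; j++) {
                rebuilt[j] ^= dataBlocks[i][j];
            }
        }
        return rebuilt;
    }

    /** Checksums a block held in memory; nothing is written anywhere. */
    static long checksum(byte[] block) {
        CRC32 crc = new CRC32();
        crc.update(block, 0, block.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[][] data = {
            "abcd".getBytes(StandardCharsets.US_ASCII),
            "efgh".getBytes(StandardCharsets.US_ASCII),
            "ijkl".getBytes(StandardCharsets.US_ASCII),
        };
        byte[] parity = xorParity(data);
        long expected = checksum(data[1]);

        data[1] = null; // simulate a missing block
        byte[] rebuilt = reconstruct(data, 1, parity);
        // The rebuilt bytes are used only for the checksum, then dropped.
        System.out.println(checksum(rebuilt) == expected);
    }
}
```

The point of the sketch is the data flow rather than the codec: reconstruction feeds the checksum directly, so the remote write step of the EC worker has no counterpart here.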
[jira] [Commented] (HDFS-10430) Refactor FileSystem#checkAccessPermissions for better reuse from tests
[ https://issues.apache.org/jira/browse/HDFS-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298287#comment-15298287 ] Andras Bokor commented on HDFS-10430: - cc: [~cnauroth]. He added this method. He may be able to share some more thoughts about this. > Refactor FileSystem#checkAccessPermissions for better reuse from tests > -- > > Key: HDFS-10430 > URL: https://issues.apache.org/jira/browse/HDFS-10430 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > > FileSystem#checkAccessPermissions could be used in a bunch of tests from > different projects, but it's in hadoop-common, which is not visible in some > cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10384) MiniDFSCluster doesnt support multiple HTTPS server instances
[ https://issues.apache.org/jira/browse/HDFS-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298280#comment-15298280 ] Hadoop QA commented on HDFS-10384: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s {color} | 
{color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 225 unchanged - 0 fixed = 226 total (was 225) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 38s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 98m 47s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeTransferSocketSize | | | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations | | Timed out junit tests | org.apache.hadoop.hdfs.TestMiniDFSCluster | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12803214/HDFS-10384-01.patch | | JIRA Issue | HDFS-10384 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 6fe5dd505994 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b4078bd | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15544/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15544/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | unit test logs | https://builds.apache.org/job/PreCommit-HDFS-Build/15544/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15544/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15544/console | | Powered by
[jira] [Commented] (HDFS-10385) LocalFileSystem rename() function should return false when destination file exists
[ https://issues.apache.org/jira/browse/HDFS-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298267#comment-15298267 ] Andras Bokor commented on HDFS-10385: - Who can/should resolve this? [~cnauroth], I have a similar one: HADOOP-9819 also changes rename behavior, but it looks like a real bug. Could you please check? I am unsure whether we should make that change or not. > LocalFileSystem rename() function should return false when destination file > exists > -- > > Key: HDFS-10385 > URL: https://issues.apache.org/jira/browse/HDFS-10385 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 2.6.0 >Reporter: Aihua Xu >Assignee: Xiaobing Zhou > > Currently rename() of LocalFileSystem returns true and renames successfully > when the destination file exists. That seems to differ from the behavior of > DFSFileSystem. > If they can have the same behavior, then we can use one call to do the rename > rather than checking whether the destination exists and then making the rename() call. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
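The behavior requested in HDFS-10385 (a rename that returns false instead of overwriting an existing destination) can be illustrated with plain java.nio. This is only a minimal sketch, not Hadoop code and not the patch under discussion: the class name RenameSketch and the method renameNoOverwrite are hypothetical, and it assumes the platform's file system reports an existing target via FileAlreadyExistsException, which the JDK documents as an optional specific exception.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of Hadoop): a rename that refuses to overwrite. */
final class RenameSketch {
  private RenameSketch() {}

  /**
   * Moves src to dst and returns true, or returns false and leaves both
   * paths untouched when dst already exists.
   */
  static boolean renameNoOverwrite(Path src, Path dst) throws IOException {
    try {
      // Without StandardCopyOption.REPLACE_EXISTING, Files.move fails
      // when the target exists instead of silently replacing it.
      Files.move(src, dst);
      return true;
    } catch (FileAlreadyExistsException e) {
      return false;
    }
  }
}
```

With a single call like this, a caller would not need a separate existence check before renaming, which is the convenience the issue description asks for.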
[jira] [Updated] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-9782: --- Attachment: (was: HDFS-9782.009.patch) > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-9782: --- Attachment: HDFS-9782.009.patch > RollingFileSystemSink should have configurable roll interval > > > Key: HDFS-9782 > URL: https://issues.apache.org/jira/browse/HDFS-9782 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, > HDFS-9782.003.patch, HDFS-9782.004.patch, HDFS-9782.005.patch, > HDFS-9782.006.patch, HDFS-9782.007.patch, HDFS-9782.008.patch, > HDFS-9782.009.patch > > > Right now it defaults to rolling at the top of every hour. Instead that > interval should be configurable. The interval should also allow for some > play so that all hosts don't try to flush their files simultaneously. > I'm filing this in HDFS because I suspect it will involve touching the HDFS > tests. If it turns out not to, I'll move it into common instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
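The idea in the HDFS-9782 description (a configurable roll interval plus some "play" so that hosts do not all flush at the same instant) can be sketched as a small time computation. This is an assumption-laden illustration, not the RollingFileSystemSink implementation: the class RollIntervalSketch, its method names, and the parameter maxOffsetMillis are all hypothetical, and timestamps are assumed to be non-negative epoch milliseconds.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch (not Hadoop code): next roll time with a random offset. */
final class RollIntervalSketch {
  private RollIntervalSketch() {}

  /** Returns the first interval boundary strictly after nowMillis. */
  static long nextBoundary(long nowMillis, long intervalMillis) {
    if (intervalMillis <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }
    return (nowMillis / intervalMillis + 1) * intervalMillis;
  }

  /**
   * Returns the next boundary plus a random offset in [0, maxOffsetMillis],
   * so that different hosts roll at slightly different times.
   */
  static long nextRollTime(long nowMillis, long intervalMillis, long maxOffsetMillis) {
    long offset = maxOffsetMillis > 0
        ? ThreadLocalRandom.current().nextLong(maxOffsetMillis + 1)
        : 0L;
    return nextBoundary(nowMillis, intervalMillis) + offset;
  }
}
```

With intervalMillis set to one hour and maxOffsetMillis set to zero, this reduces to the current behavior described in the issue, rolling at the top of every hour.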
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298145#comment-15298145 ] Rakesh R commented on HDFS-9833: Hi, [~drankye], [~umamaheswararao] would be great to see feedback on the latest patch. I will create separate jira tasks if you agree on [tasks split up|https://issues.apache.org/jira/browse/HDFS-9833?focusedCommentId=15295644=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15295644] mentioned earlier in this jira and will try to push this basic patch in. Thanks! Note: I feel the checkstyle warning can be ignored, if needed will rename the args in next patch preparation time. Also, the test case failure is unrelated to my patch. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. 
[jira] [Commented] (HDFS-10217) show "blockScheduled count" in datanodes table.
[ https://issues.apache.org/jira/browse/HDFS-10217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298139#comment-15298139 ] Vinayakumar B commented on HDFS-10217: -- Tooltip patch looks good. +1 [~raviprak], can you also take a look? I am planning to commit this tomorrow, unless Ravi has something to be addressed. > show "blockScheduled count" in datanodes table. > --- > > Key: HDFS-10217 > URL: https://issues.apache.org/jira/browse/HDFS-10217 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: HDFS-10217-002.patch, HDFS-10217.patch > > > It will be more useful for debugging purposes, where the user can see how many blocks > got scheduled for the DN -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10454) libhdfspp: Move NameNodeOp to a separate file
Anatoli Shein created HDFS-10454: Summary: libhdfspp: Move NameNodeOp to a separate file Key: HDFS-10454 URL: https://issues.apache.org/jira/browse/HDFS-10454 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Anatoli Shein Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298102#comment-15298102 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 28s {color} | {color:red} hadoop-hdfs-project: patch generated 1 new + 104 unchanged - 0 fixed = 105 total (was 104) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 37s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 95m 55s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805847/HDFS-9833-04.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux af060e86d437 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b4078bd | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15543/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15543/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | unit test
[jira] [Commented] (HDFS-10256) Use GenericTestUtils.getTestDir method in tests for temporary directories
[ https://issues.apache.org/jira/browse/HDFS-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298090#comment-15298090 ] Vinayakumar B commented on HDFS-10256: -- Pushed the changes below to the latest PR, along with a rebase against the latest trunk code. 1. {{MiniDFSCluster#shutdown()}} deletes the entire basedir using {{FileUtil.fullyDelete()}} if {{deleteDfsDir}} is true. Otherwise, it registers all files recursively for deleteOnExit in reverse order. 2. {{MiniDFSCluster#getBaseDirectory()}} will now return a unique directory named after the JUnit test method. > Use GenericTestUtils.getTestDir method in tests for temporary directories > - > > Key: HDFS-10256 > URL: https://issues.apache.org/jira/browse/HDFS-10256 > Project: Hadoop HDFS > Issue Type: Improvement > Components: build, test >Reporter: Vinayakumar B >Assignee: Vinayakumar B > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
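The first point of the comment above (delete the base directory outright, or otherwise register its contents for deletion at JVM exit) can be sketched with java.io.File alone. This is a rough illustration under stated assumptions, not the MiniDFSCluster code: the class TestDirCleanupSketch and its methods are hypothetical stand-ins, and the local fullyDelete is a simplified substitute for Hadoop's FileUtil.fullyDelete. It relies on the documented JDK behavior that deleteOnExit entries are processed in reverse order of registration, so parents are registered before their children.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not Hadoop code) of the two cleanup strategies. */
final class TestDirCleanupSketch {
  private TestDirCleanupSketch() {}

  /** Recursively deletes f; returns true if f no longer exists afterwards. */
  static boolean fullyDelete(File f) {
    File[] children = f.listFiles(); // null when f is not a directory
    if (children != null) {
      for (File child : children) {
        fullyDelete(child);
      }
    }
    return f.delete() || !f.exists();
  }

  /**
   * Lists root and everything below it, parents before children. The JVM
   * deletes deleteOnExit entries in reverse registration order, so this
   * order lets children be removed before their directories.
   */
  static List<File> deleteOnExitOrder(File root) {
    List<File> order = new ArrayList<>();
    collect(root, order);
    return order;
  }

  private static void collect(File f, List<File> order) {
    order.add(f);
    File[] children = f.listFiles();
    if (children != null) {
      for (File child : children) {
        collect(child, order);
      }
    }
  }

  /** Registers the whole tree under root for deletion at JVM exit. */
  static void registerForDeleteOnExit(File root) {
    for (File f : deleteOnExitOrder(root)) {
      f.deleteOnExit();
    }
  }
}
```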
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298047#comment-15298047 ] Rakesh R commented on HDFS-9833: I'm attaching new patch. Here I corrected the reconstruction of missing blocks and its checksum calculation logic. Also, added one more unit test case. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
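The approach described for HDFS-9833 (rebuild a missing block in memory, checksum it, and skip the remote write) can be shown in miniature. The sketch below deliberately swaps in simpler techniques than HDFS uses: a single XOR parity block stands in for the erasure codec, and java.util.zip.CRC32 stands in for the HDFS block checksum. The class ReconstructChecksumSketch and its methods are hypothetical and do not appear in the patch.

```java
import java.util.zip.CRC32;

/** Hypothetical sketch (not Hadoop code): reconstruct a block, then checksum it. */
final class ReconstructChecksumSketch {
  private ReconstructChecksumSketch() {}

  /** XORs equal-length blocks together, byte by byte. */
  static byte[] xor(byte[][] blocks) {
    byte[] out = new byte[blocks[0].length];
    for (byte[] block : blocks) {
      if (block.length != out.length) {
        throw new IllegalArgumentException("blocks must have equal length");
      }
      for (int i = 0; i < out.length; i++) {
        out[i] ^= block[i];
      }
    }
    return out;
  }

  /** CRC32 of a block, used here as a stand-in block checksum. */
  static long crc(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    return crc.getValue();
  }

  /**
   * With single XOR parity, the missing block equals the XOR of all
   * surviving data blocks and the parity block. The rebuilt bytes are
   * only checksummed here and are never written anywhere.
   */
  static long checksumOfMissingBlock(byte[][] survivingBlocksAndParity) {
    return crc(xor(survivingBlocksAndParity));
  }
}
```

The point of the design is visible even in this toy version: the reconstructed bytes exist only long enough to feed the checksum, which matches the description's remark that remote writing is unnecessary in this case.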
[jira] [Updated] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-9833: --- Attachment: HDFS-9833-04.patch > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10367) TestDFSShell.testMoveWithTargetPortEmpty fails with Address bind exception.
[ https://issues.apache.org/jira/browse/HDFS-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HDFS-10367: Attachment: HDFS-10367-004.patch Uploaded the patch to fix the checkstyle comment. > TestDFSShell.testMoveWithTargetPortEmpty fails with Address bind exception. > --- > > Key: HDFS-10367 > URL: https://issues.apache.org/jira/browse/HDFS-10367 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: HDFS-10367-002.patch, HDFS-10367-003.patch, > HDFS-10367-004.patch, HDFS-10367.patch > > > {noformat} > Problem binding to [localhost:9820] java.net.BindException: Address already > in use; For more details see: http://wiki.apache.org/hadoop/BindException > Stack Trace: > java.net.BindException: Problem binding to [localhost:9820] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at org.apache.hadoop.ipc.Server.bind(Server.java:530) > at org.apache.hadoop.ipc.Server$Listener.(Server.java:793) > at org.apache.hadoop.ipc.Server.(Server.java:2592) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:958) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:563) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:538) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:426) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:783) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:924) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:903) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1620) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247) > at > org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:482) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441) > at > org.apache.hadoop.hdfs.TestDFSShell.testMoveWithTargetPortEmpty(TestDFSShell.java:567) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
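The trace above shows a test failing because a fixed port (9820) was already taken. One common way for tests to avoid such collisions is to bind to port 0 and let the operating system pick a free port. The sketch below shows that general technique with plain java.net; it is not taken from the attached patches, which may well solve the problem differently, and the class EphemeralPortSketch is hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

/** Hypothetical sketch (not Hadoop code): let the OS choose a free port. */
final class EphemeralPortSketch {
  private EphemeralPortSketch() {}

  /**
   * Binds a listening socket on the loopback address. Port 0 asks the OS
   * for any currently free port; 50 is the listen backlog.
   */
  static ServerSocket bindEphemeral() throws IOException {
    return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
  }
}
```

A caller would read the chosen port from getLocalPort() on the returned socket and pass it on to whatever needs to connect.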
[jira] [Commented] (HDFS-10367) TestDFSShell.testMoveWithTargetPortEmpty fails with Address bind exception.
[ https://issues.apache.org/jira/browse/HDFS-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297920#comment-15297920 ] Hadoop QA commented on HDFS-10367: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 30s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 31s {color} | {color:red} root: patch generated 1 new + 183 unchanged - 0 fixed = 184 total (was 183) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 20s {color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 12s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 119m 13s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestAsyncDFSRename | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805821/HDFS-10367-003.patch | | JIRA Issue | HDFS-10367 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 531dd6eadb8e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b4078bd | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15537/artifact/patchprocess/diff-checkstyle-root.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/15537/artifact/patchprocess/whitespace-eol.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15537/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | unit test logs |
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297903#comment-15297903 ]
Hadoop QA commented on HDFS-9833:
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s {color} | {color:red} hadoop-hdfs-project: patch generated 8 new + 104 unchanged - 0 fixed = 112 total (was 104) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 3s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 58s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 58s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 16s {color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | Unread field:BlockChecksumHelper.java:[line 346] |
| | Should org.apache.hadoop.hdfs.server.datanode.BlockChecksumHelper$BlockGroupNonStripedChecksumComputer$ECBlockInfo be a _static_ inner class? At BlockChecksumHelper.java:inner class? At BlockChecksumHelper.java:[lines 353-363] |
| Failed junit tests | hadoop.hdfs.TestAsyncDFSRename |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805825/HDFS-9833-03.patch |
| JIRA Issue | HDFS-9833 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc |
| uname | Linux 6c5e4653856e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297887#comment-15297887 ]
Hadoop QA commented on HDFS-9833:
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s {color} | {color:red} hadoop-hdfs-project: patch generated 10 new + 103 unchanged - 0 fixed = 113 total (was 103) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 1s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 8s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m 30s {color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | Unread field:BlockChecksumHelper.java:[line 347] |
| | Should org.apache.hadoop.hdfs.server.datanode.BlockChecksumHelper$BlockGroupNonStripedChecksumComputer$ECBlockInfo be a _static_ inner class? At BlockChecksumHelper.java:inner class? At BlockChecksumHelper.java:[lines 354-364] |
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
| | hadoop.hdfs.TestAsyncDFSRename |
| | hadoop.hdfs.TestMissingBlocksAlert |
| | hadoop.hdfs.TestDFSUpgradeFromImage |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805822/HDFS-9833-03.patch |
| JIRA Issue | HDFS-9833 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc |
| uname | Linux 41f122c5b459 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC
[jira] [Commented] (HDFS-8057) Move BlockReader implementation to the client implementation package
[ https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297867#comment-15297867 ]
Takanobu Asanuma commented on HDFS-8057:
The failed test, {{TestEditLog.testBatchedSyncWithClosedLogs}}, passes on my laptop. The other failures do not seem to be related to this patch. Please let me know if there is any problem. Thanks.
> Move BlockReader implementation to the client implementation package
>
> Key: HDFS-8057
> URL: https://issues.apache.org/jira/browse/HDFS-8057
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Takanobu Asanuma
> Attachments: HDFS-8057.1.patch, HDFS-8057.2.patch, HDFS-8057.3.patch, HDFS-8057.4.patch, HDFS-8057.branch-2.001.patch, HDFS-8057.branch-2.002.patch, HDFS-8057.branch-2.5.patch
>
> BlockReaderLocal, RemoteBlockReader, etc. should be moved to org.apache.hadoop.hdfs.client.impl. We may as well rename RemoteBlockReader to BlockReaderRemote.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
He Xiaoqiao updated HDFS-10453:
Status: Patch Available (was: Open)
> ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.1
> Reporter: He Xiaoqiao
> Fix For: 2.7.1
> Attachments: HDFS-10453-branch-2.001.patch
>
> The ReplicationMonitor thread can get stuck for a long time and, with low probability, cause data loss. Consider the typical scenario:
> (1) create and close a file with the default replication (3);
> (2) increase the replication of the file (to 10);
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging to that file for replication.
> When the ReplicationMonitor hang is reproduced, the NameNode prints logs such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 7 to reach 10 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 7 to reach 10 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 7 but only 0 storage types can be selected (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 7 to reach 10 (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets the blocks to replicate and releases the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and then clears the references in blocksmap, neededReplications, etc. The block's NumBytes is set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to chooseTargets for the same blocks, and no node is selected even after traversing the whole cluster, because no node satisfies the goodness criteria (its remaining space would have to reach the required size, Long.MAX_VALUE).
> During stage (3), ReplicationMonitor is stuck for a long time, especially in a large cluster. invalidateBlocks and neededReplications keep growing without being consumed; in the worst case, data is lost.
> This can mostly be avoided by skipping chooseTarget for a BlockCommand.NO_ACK block and removing it from neededReplications.
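The race described in HDFS-10453 can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions, not the HDFS code or the attached patch: `ReplicationRaceSketch`, `Block`, `chooseTargets`, and `shouldSkip` are hypothetical stand-ins, and the "goodness" check is reduced to a plain remaining-space comparison. It shows why a block whose length was overwritten with `NO_ACK` (`Long.MAX_VALUE`) causes a full scan that selects nothing, and how a guard could skip such a block before the scan.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the race; these are not the real HDFS classes. */
class ReplicationRaceSketch {
  /** Mirrors the NO_ACK marker from the issue description (Long.MAX_VALUE). */
  static final long NO_ACK = Long.MAX_VALUE;

  /** Stand-in for a block; only its length matters for this sketch. */
  static final class Block {
    private volatile long numBytes;
    Block(long numBytes) { this.numBytes = numBytes; }
    long getNumBytes() { return numBytes; }
    void setNumBytes(long numBytes) { this.numBytes = numBytes; }
  }

  /**
   * Stand-in for target selection: visits nodes in order and picks those
   * whose remaining space can hold the block, up to the wanted count.
   */
  static List<Integer> chooseTargets(Block block, long[] remainingPerNode, int wanted) {
    List<Integer> chosen = new ArrayList<>();
    for (int i = 0; i < remainingPerNode.length && chosen.size() < wanted; i++) {
      if (remainingPerNode[i] >= block.getNumBytes()) {
        chosen.add(i);
      }
    }
    return chosen;
  }

  /** Guard in the spirit of the proposed fix: skip blocks marked as deleted. */
  static boolean shouldSkip(Block block) {
    return block.getNumBytes() == NO_ACK;
  }

  public static void main(String[] args) {
    long[] remaining = {1L << 40, 1L << 41, 1L << 42};
    Block block = new Block(128L * 1024 * 1024);

    // Before the delete: targets are found after visiting only two nodes.
    System.out.println(chooseTargets(block, remaining, 2)); // [0, 1]

    // A concurrent delete overwrites the length with NO_ACK.
    block.setNumBytes(NO_ACK);

    // Without the guard, every node is visited and none qualifies.
    System.out.println(chooseTargets(block, remaining, 2)); // []

    // With the guard, the scan is skipped entirely.
    System.out.println(shouldSkip(block)); // true
  }
}
```

In this model the wasted work is a single loop over three entries; per the issue description, the real cost comes from traversing the whole cluster for each such block.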
[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-10453: --- Status: Open (was: Patch Available)
[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.
[ https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-10453: --- Attachment: (was: HDFS-10453.patch)