[jira] [Commented] (HDFS-15176) Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[ https://issues.apache.org/jira/browse/HDFS-15176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038845#comment-17038845 ] Hadoop QA commented on HDFS-15176: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 46s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 23m 50s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 0s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 19s{color} | {color:orange} root: The patch generated 12 new + 568 unchanged - 0 fixed = 580 total (was 568) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 8s{color} | {color:red} hadoop-common in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}112m 42s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}257m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.TestFixKerberosTicketOrder | | | hadoop.ipc.TestRPC | | | hadoop.security.TestRaceWhenRelogin | | | hadoop.hdfs.TestDeadNodeDetection | | | hadoop.tools.TestHdfsConfigFields | | | hadoop.hdfs.TestRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15176 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993731/HDFS-15176.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1b1487595d04 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (HDFS-15177) Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too much time.
[ https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuqi updated HDFS-15177: - Summary: Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too much time. (was: Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too many time.) > Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too > much time. > -- > > Key: HDFS-15177 > URL: https://issues.apache.org/jira/browse/HDFS-15177 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > > In our cluster, the datanode receives delete commands with too many blocks > when many blockpools share the same datanode and the datanode has about 30 > storage dirs; this causes the FsDatasetImpl lock to be held for too long. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15177) Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too many time.
[ https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuqi updated HDFS-15177: - Description: In our cluster, the datanode receives delete commands with too many blocks when many blockpools share the same datanode and the datanode has about 30 storage dirs; this causes the FsDatasetImpl lock to be held for too long. was: In our cluster, the datanode receive the delete command with too many blocks deletion when we have many blockpools sharing the same datanode, it will cause the FsDatasetImpl lock too much time. > Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too > many time. > -- > > Key: HDFS-15177 > URL: https://issues.apache.org/jira/browse/HDFS-15177 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > > In our cluster, the datanode receives delete commands with too many blocks > when many blockpools share the same datanode and the datanode has about 30 > storage dirs; this causes the FsDatasetImpl lock to be held for too long. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
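The split proposed above can be sketched as batch-wise invalidation: take and release the dataset lock once per small batch instead of once around the whole delete command, so other readers and writers can interleave. This is a hypothetical, self-contained sketch only; `datasetLock`, `BATCH_SIZE`, and the method names are illustrative stand-ins, not the actual FsDatasetImpl members.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of splitting a large block-deletion command into batches so the
// dataset lock is held only briefly at a time (names are hypothetical).
public class BatchedInvalidate {
    static final int BATCH_SIZE = 1000;
    static final ReentrantLock datasetLock = new ReentrantLock();

    // Returns how many times the lock was taken, i.e. the number of batches.
    static int invalidateInBatches(List<Long> blockIds) {
        int lockAcquisitions = 0;
        for (int start = 0; start < blockIds.size(); start += BATCH_SIZE) {
            int end = Math.min(start + BATCH_SIZE, blockIds.size());
            datasetLock.lock();          // hold the lock only for one batch
            lockAcquisitions++;
            try {
                for (Long id : blockIds.subList(start, end)) {
                    // delete replica metadata / block files for this id here
                }
            } finally {
                datasetLock.unlock();    // let other operations interleave
            }
        }
        return lockAcquisitions;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 2500; i++) ids.add(i);
        // 2500 ids at 1000 per batch -> the lock is taken 3 times
        System.out.println(invalidateInBatches(ids));
    }
}
```

The trade-off is that the dataset can change between batches, so each batch must re-validate the replicas it is about to delete.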
[jira] [Created] (HDFS-15178) Federation: Add missing FederationClientInterceptor APIs
D M Murali Krishna Reddy created HDFS-15178: --- Summary: Federation: Add missing FederationClientInterceptor APIs Key: HDFS-15178 URL: https://issues.apache.org/jira/browse/HDFS-15178 Project: Hadoop HDFS Issue Type: Improvement Components: federation Reporter: D M Murali Krishna Reddy In FederationClientInterceptor, many APIs are not implemented. * getClusterNodes * getQueueInfo * getQueueUserAcls * moveApplicationAcrossQueues * getNewReservation * submitReservation * listReservations * updateReservation * deleteReservation * getNodeToLabels * getLabelsToNodes * getClusterNodeLabels * getApplicationAttemptReport * getApplicationAttempts * getContainerReport * getContainers * getDelegationToken * renewDelegationToken * cancelDelegationToken * failApplicationAttempt * updateApplicationPriority * signalToContainer * updateApplicationTimeouts * getResourceProfiles * getResourceProfile * getResourceTypeInfo * getAttributesToNodes * getClusterNodeAttributes * getNodesToAttributes -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15177) Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too many time.
[ https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuqi updated HDFS-15177: - Description: In our cluster, the datanode receives delete commands with too many blocks when many blockpools share the same datanode; this causes the FsDatasetImpl lock to be held for too long. was: In our cluster , the datanode receive the delete command with too many blocks deletion when we have many blockpools sharing the same datanode, it will cause the FsDatasetImpl lock too much time. > Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too > many time. > -- > > Key: HDFS-15177 > URL: https://issues.apache.org/jira/browse/HDFS-15177 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > > In our cluster, the datanode receives delete commands with too many blocks > when many blockpools share the same datanode; this causes the FsDatasetImpl > lock to be held for too long. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15177) Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too many time.
zhuqi created HDFS-15177: Summary: Split datanode invalid block deletion, to avoid the FsDatasetImpl lock too many time. Key: HDFS-15177 URL: https://issues.apache.org/jira/browse/HDFS-15177 Project: Hadoop HDFS Issue Type: Improvement Reporter: zhuqi Assignee: zhuqi In our cluster, the datanode receives delete commands with too many blocks when many blockpools share the same datanode; this causes the FsDatasetImpl lock to be held for too long. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15120) Refresh BlockPlacementPolicy at runtime.
[ https://issues.apache.org/jira/browse/HDFS-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038793#comment-17038793 ] Hadoop QA commented on HDFS-15120: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 42s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 348 unchanged - 0 fixed = 351 total (was 348) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}120m 7s{color} | {color:green} hadoop-hdfs in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}187m 32s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15120 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993730/HDFS-15120.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 8e44025ebd0c 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a562942 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28794/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28794/testReport/ | | Max. process+thread count | 3048 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28794/console | | Powered by | Apache Yetus 0.8.0
[jira] [Updated] (HDFS-15176) Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[ https://issues.apache.org/jira/browse/HDFS-15176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-15176: --- Attachment: HDFS-15176.001.patch Status: Patch Available (was: Open) > Enable GcTimePercentage Metric in NameNode's JvmMetrics. > > > Key: HDFS-15176 > URL: https://issues.apache.org/jira/browse/HDFS-15176 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-15176.001.patch > > > The GcTimePercentage(computed by GcTimeMonitor) could be used as a dimension > to analyze the NameNode GC. We should add a switch config to enable the > GcTimePercentage metric in HDFS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15176) Enable GcTimePercentage Metric in NameNode's JvmMetrics.
Jinglun created HDFS-15176: -- Summary: Enable GcTimePercentage Metric in NameNode's JvmMetrics. Key: HDFS-15176 URL: https://issues.apache.org/jira/browse/HDFS-15176 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jinglun Assignee: Jinglun The GcTimePercentage(computed by GcTimeMonitor) could be used as a dimension to analyze the NameNode GC. We should add a switch config to enable the GcTimePercentage metric in HDFS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
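For illustration, the quantity a GcTimePercentage metric reports can be approximated from the standard JDK management beans. This hedged sketch computes GC time as a share of total JVM uptime; Hadoop's GcTimeMonitor tracks the percentage over a sliding observation window instead, so this one-shot version is a simplification, not the actual implementation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch: percentage of JVM uptime spent in GC, computed from
// the standard JDK MXBeans (simplified relative to Hadoop's GcTimeMonitor,
// which uses a sliding window rather than total uptime).
public class GcTimePercentage {
    static int gcTimePercentage() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 if undefined for a collector
            if (t > 0) gcMillis += t;
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        // Clamp to [0, 100] so rounding artifacts cannot escape the range.
        return (int) Math.min(100, 100 * gcMillis / Math.max(uptimeMillis, 1));
    }

    public static void main(String[] args) {
        System.out.println(gcTimePercentage() + "% of uptime spent in GC");
    }
}
```

A high sustained value from such a metric is a useful trigger for NameNode GC analysis, which is the dimension the issue above proposes to expose behind a config switch.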
[jira] [Updated] (HDFS-15120) Refresh BlockPlacementPolicy at runtime.
[ https://issues.apache.org/jira/browse/HDFS-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-15120: --- Attachment: HDFS-15120.003.patch > Refresh BlockPlacementPolicy at runtime. > > > Key: HDFS-15120 > URL: https://issues.apache.org/jira/browse/HDFS-15120 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-15120.001.patch, HDFS-15120.002.patch, > HDFS-15120.003.patch > > > Now if we want to switch BlockPlacementPolicies we need to restart the > NameNode. It would be convenient if we can switch it at runtime. For example > we can switch between AvailableSpaceBlockPlacementPolicy and > BlockPlacementPolicyDefault as needed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
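Switching a placement policy at runtime, as proposed above, amounts to publishing a new policy object to concurrent reader threads safely. A minimal sketch, assuming an AtomicReference-based holder; `Policy`, `PolicyHolder`, and the method names are hypothetical stand-ins, not the actual BlockPlacementPolicy API.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: keep the active policy behind an AtomicReference so a
// reconfiguration thread can swap it while hot-path threads keep reading it
// without locking. "Policy" stands in for BlockPlacementPolicy.
public class PolicyHolder {
    interface Policy { String chooseTarget(String src); }

    private final AtomicReference<Policy> active = new AtomicReference<>();

    PolicyHolder(Policy initial) { active.set(initial); }

    // Reconfiguration path, e.g. invoked from an admin refresh command.
    void refresh(Policy replacement) { active.set(replacement); }

    // Hot path: always sees the latest policy; never blocks on refresh.
    String chooseTarget(String src) { return active.get().chooseTarget(src); }

    public static void main(String[] args) {
        PolicyHolder h = new PolicyHolder(s -> "default:" + s);
        System.out.println(h.chooseTarget("/a"));   // uses the initial policy
        h.refresh(s -> "availableSpace:" + s);      // swap at runtime
        System.out.println(h.chooseTarget("/a"));   // uses the new policy
    }
}
```

In-flight operations that already fetched the old policy finish under it, which is usually acceptable for placement decisions since both policies produce valid placements.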
[jira] [Commented] (HDFS-15172) Remove unnecessary deadNodeDetectInterval in DeadNodeDetector#checkDeadNodes()
[ https://issues.apache.org/jira/browse/HDFS-15172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038721#comment-17038721 ] Lisheng Sun commented on HDFS-15172: Thanks [~elgoiri] for the review. This jira solves the problem of the excessive check frequency. HDFS-15149 should not include this part of the modification; it is supposed to solve the problem that DeadNodeDetector suppresses all interrupts and never checks for a termination flag. I think these two problems are better divided into two jiras. > Remove unnecessary deadNodeDetectInterval in > DeadNodeDetector#checkDeadNodes() > --- > > Key: HDFS-15172 > URL: https://issues.apache.org/jira/browse/HDFS-15172 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15172-001.patch, HDFS-15172-002.patch > > > Every call to checkDeadNodes() will change the state to IDLE, forcing the > DeadNodeDetector to sleep for IDLE_SLEEP_MS. So we don't need > deadNodeDetectInterval between every checkDeadNodes(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038673#comment-17038673 ] Ahmed Hussein commented on HDFS-15149: -- The poll period and waiting time (5000 and 10) in {{waitFoDeadNode}} are very large. I assume you had to use large numbers to match the delays of the detector threads. I have a question about {{clearAndGetDetectedDeadNodes()}}: as far as I understand, calling the method in a loop means that a "deadnode" can be removed from the {{deadNodes}} map. In other words, the count may never reach 3, because the map does not account for the removed nodes. Please feel free to correct my understanding of the code if I am wrong. I did not find the implementation of the {{DeadNodeDetector}} easy to understand. It is very challenging to avoid timeouts when there are multiple threads running in parallel: {{DeadNodeDetector}}, {{Probe}}, and {{ProbeSchedulers}}. IMHO, {{DeadNodeDetector.java}} needs to introduce stronger mechanisms to coordinate between the threads. Instead of just racing with each other, tasks can use condition variables to communicate, such as synchronized queues or object monitors. Another benefit of using condition variables is that the runtime of the tests will improve, because there won't be a need to wait for a full cycle. The {{DefaultSpeculator.java}} has a synchronized queue just for the purpose of testing: "{{DefaultSpeculator.scanControl}}". 
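The coordination suggested above can be sketched with a synchronized queue: the detector thread posts an event the moment it marks a node dead, and the test blocks on the queue with a timeout instead of polling a shared map. This is an illustrative sketch only; `reportDead`, `awaitDeadNodes`, and the event type are hypothetical, not the actual DeadNodeDetector API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch: a BlockingQueue used as a condition-variable-like channel between
// a detector thread and a test, avoiding busy-wait polling (names are
// hypothetical stand-ins for the DeadNodeDetector internals).
public class DetectorSignal {
    static final BlockingQueue<String> deadNodeEvents = new LinkedBlockingQueue<>();

    // Called by the detector thread when it classifies a node as dead.
    static void reportDead(String datanodeId) {
        deadNodeEvents.offer(datanodeId);
    }

    // Called by the test: waits for n dead-node events without polling.
    static int awaitDeadNodes(int n, long timeoutMs) throws InterruptedException {
        int seen = 0;
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (seen < n) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) break;                       // overall timeout
            if (deadNodeEvents.poll(remaining, TimeUnit.NANOSECONDS) != null) {
                seen++;                                      // one event consumed
            }
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        Thread detector = new Thread(() -> {
            reportDead("dn1");
            reportDead("dn2");
            reportDead("dn3");
        });
        detector.start();
        // The waiter wakes up as soon as the third event arrives, not after
        // a fixed poll interval, which also shortens test runtime.
        System.out.println(awaitDeadNodes(3, 5000));
        detector.join();
    }
}
```

Because the waiter consumes events rather than re-reading shared state, a node later removed from the dead-node map cannot make the observed count go backwards, which addresses the counting concern raised above.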
> TestDeadNodeDetection test cases time-out > - > > Key: HDFS-15149 > URL: https://issues.apache.org/jira/browse/HDFS-15149 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Ahmed Hussein >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15149-001.patch > > > TestDeadNodeDetection JUnit test cases time out with the following stack > traces: > * 1- testDeadNodeDetectionInBackground* > {code:bash} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 264.757 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDeadNodeDetection > [ERROR] > testDeadNodeDetectionInBackground(org.apache.hadoop.hdfs.TestDeadNodeDetection) > Time elapsed: 125.806 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. Thread diagnostics: > Timestamp: 2020-01-24 08:31:07,023 > "client DomainSocketWatcher" daemon prio=5 tid=117 runnable > java.lang.Thread.State: RUNNABLE > at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native > Method) > at > org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52) > at > org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:503) > at java.lang.Thread.run(Thread.java:748) > "Session-HouseKeeper-48c3205a" prio=5 tid=350 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > 
at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "java.util.concurrent.ThreadPoolExecutor$Worker@3ae54156[State = -1, empty > queue]" daemon prio=5 tid=752 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "CacheReplicationMonitor(1960356187)" prio=5 tid=386 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at >
[jira] [Commented] (HDFS-15167) Block Report Interval shouldn't be reset apart from first Block Report
[ https://issues.apache.org/jira/browse/HDFS-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038665#comment-17038665 ] Hadoop QA commented on HDFS-15167: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 0s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 9m 22s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 27s{color} | {color:red} hadoop-hdfs in trunk failed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 35s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 20m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}119m 40s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}200m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDeadNodeDetection | | | hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness | | | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | | hadoop.hdfs.server.blockmanagement.TestBlockInfoStriped | | | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.server.blockmanagement.TestPendingReconstruction | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15167 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993713/HDFS-15167-08.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux dd042c1ae88c 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 439d935 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/28793/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.txt | | mvninstall |
[jira] [Commented] (HDFS-12459) Fix revert: Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
[ https://issues.apache.org/jira/browse/HDFS-12459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038658#comment-17038658 ] Kihwal Lee commented on HDFS-12459: --- We discovered HDFS-11156 was still in branch-2.10. Unreverted, it caused problems in testing. I reverted it from branch-2.10 and cherry-picked this Jira to branch-3.1 and branch-2.10. > Fix revert: Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API > > > Key: HDFS-12459 > URL: https://issues.apache.org/jira/browse/HDFS-12459 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.2.0, 3.1.4, 2.10.1 > > Attachments: HDFS-12459.001.patch, HDFS-12459.002.patch, > HDFS-12459.003.patch, HDFS-12459.004.patch, HDFS-12459.005.patch, > HDFS-12459.006.patch, HDFS-12459.006.patch, HDFS-12459.007.patch, > HDFS-12459.008.patch > > > HDFS-11156 was reverted because the implementation was non-optimal. Based on > the suggestion from [~shahrs87], we should avoid creating a dfs client to get > block locations because that creates an extra RPC call. Instead we should use > {{NamenodeProtocols#getBlockLocations}} and then convert {{LocatedBlocks}} to > {{BlockLocation[]}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12459) Fix revert: Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
[ https://issues.apache.org/jira/browse/HDFS-12459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-12459: -- Fix Version/s: 2.10.1 3.1.4 > Fix revert: Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API > > > Key: HDFS-12459 > URL: https://issues.apache.org/jira/browse/HDFS-12459 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.2.0, 3.1.4, 2.10.1 > > Attachments: HDFS-12459.001.patch, HDFS-12459.002.patch, > HDFS-12459.003.patch, HDFS-12459.004.patch, HDFS-12459.005.patch, > HDFS-12459.006.patch, HDFS-12459.006.patch, HDFS-12459.007.patch, > HDFS-12459.008.patch > > > HDFS-11156 was reverted because the implementation was non-optimal. Based on > the suggestion from [~shahrs87], we should avoid creating a dfs client to get > block locations because that creates an extra RPC call. Instead we should use > {{NamenodeProtocols#getBlockLocations}} and then convert {{LocatedBlocks}} to > {{BlockLocation[]}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
[ https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038653#comment-17038653 ] Kihwal Lee commented on HDFS-11156: --- Reverted only from 2.10 for now. I also cherry-picked HDFS-12459. > Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API > > > Key: HDFS-11156 > URL: https://issues.apache.org/jira/browse/HDFS-11156 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 2.7.3 >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.0.0-alpha2 > > Attachments: BlockLocationProperties_JSON_Schema.jpg, > BlockLocations_JSON_Schema.jpg, FileStatuses_JSON_Schema.jpg, > HDFS-11156-branch-2.01.patch, HDFS-11156.01.patch, HDFS-11156.02.patch, > HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch, > HDFS-11156.06.patch, HDFS-11156.07.patch, HDFS-11156.08.patch, > HDFS-11156.09.patch, HDFS-11156.10.patch, HDFS-11156.11.patch, > HDFS-11156.12.patch, HDFS-11156.13.patch, HDFS-11156.14.patch, > HDFS-11156.15.patch, HDFS-11156.16.patch, Output_JSON_format_v10.jpg, > SampleResponse_JSON.jpg > > > The following WebHDFS REST API > {code} > http://:/webhdfs/v1/?op=GET_BLOCK_LOCATIONS=0=1 > {code} > will get a response like > {code} > { > "LocatedBlocks" : { > "fileLength" : 1073741824, > "isLastBlockComplete" : true, > "isUnderConstruction" : false, > "lastLocatedBlock" : { ... }, > "locatedBlocks" : [ {...} ] > } > } > {code} > This represents *o.a.h.h.p.LocatedBlocks*. However, according to the > *FileSystem* API, > {code} > public BlockLocation[] getFileBlockLocations(Path p, long start, long len) > {code} > clients would expect an array of BlockLocation. This mismatch should be > fixed. Marked as Incompatible change as this will change the output of the > GET_BLOCK_LOCATIONS API. 
[jira] [Commented] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
[ https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038590#comment-17038590 ] Kihwal Lee commented on HDFS-11156: --- Contrary to the Fix Version (3.0.0-alpha2) of this jira, the change was also committed to branch-2.10 (formerly branch-2) and branch-2.9. They were not reverted there, and that is causing problems in our 2.10 testing. I will revert them and see if HDFS-12459 can be applied.
[jira] [Commented] (HDFS-15167) Block Report Interval shouldn't be reset apart from first Block Report
[ https://issues.apache.org/jira/browse/HDFS-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038573#comment-17038573 ] Ayush Saxena commented on HDFS-15167: - Handled in v8 > Block Report Interval shouldn't be reset apart from first Block Report > -- > > Key: HDFS-15167 > URL: https://issues.apache.org/jira/browse/HDFS-15167 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15167-01.patch, HDFS-15167-02.patch, > HDFS-15167-03.patch, HDFS-15167-04.patch, HDFS-15167-05.patch, > HDFS-15167-06.patch, HDFS-15167-07.patch, HDFS-15167-08.patch > > > Presently the BlockReport interval is reset even when the BR is manually > triggered or the BR is triggered for a diskError, which isn't required. As per the > code comment, the reset is intended for the first BR only: > {code:java} > // If we have sent the first set of block reports, then wait a random > // time before we start the periodic block reports. > if (resetBlockReportTime) { > {code} >
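The rule the patch enforces can be sketched in plain Java. This is an illustrative model, not the real BPServiceActor scheduler; the class and field names are invented for the sketch. Only the first full block report gets a randomized start time (to spread reports across the cluster); every later report, including ones triggered manually or by a disk error, keeps the fixed cadence.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of the HDFS-15167 scheduling rule.  Not the real
// DataNode scheduler; names are made up.
public class BlockReportScheduler {
    private final long intervalMs;
    private boolean resetBlockReportTime = true; // true only before the first FBR
    private long nextBlockReportMs;

    BlockReportScheduler(long intervalMs, long nowMs) {
        this.intervalMs = intervalMs;
        this.nextBlockReportMs = nowMs;
    }

    // Called after each full block report is sent.
    void scheduleNext(long nowMs) {
        if (resetBlockReportTime) {
            // First report only: spread reports out with a random delay
            // within one interval.
            nextBlockReportMs = nowMs + ThreadLocalRandom.current().nextLong(intervalMs);
            resetBlockReportTime = false;
        } else {
            // Periodic reports: keep the fixed cadence.  A manually triggered
            // or disk-error report must not re-randomize the schedule.
            nextBlockReportMs = nowMs + intervalMs;
        }
    }

    long nextBlockReportMs() { return nextBlockReportMs; }
}
```

Calling scheduleNext() again for a triggered report leaves the cadence deterministic, which is the behavior the patch preserves.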
[jira] [Updated] (HDFS-15167) Block Report Interval shouldn't be reset apart from first Block Report
[ https://issues.apache.org/jira/browse/HDFS-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-15167: Attachment: HDFS-15167-08.patch
[jira] [Commented] (HDFS-15172) Remove unnecessary deadNodeDetectInterval in DeadNodeDetector#checkDeadNodes()
[ https://issues.apache.org/jira/browse/HDFS-15172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038568#comment-17038568 ] Íñigo Goiri commented on HDFS-15172: It looks like you are doing part of this in HDFS-15149 too. What do you want to do with this? > Remove unnecessary deadNodeDetectInterval in > DeadNodeDetector#checkDeadNodes() > --- > > Key: HDFS-15172 > URL: https://issues.apache.org/jira/browse/HDFS-15172 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15172-001.patch, HDFS-15172-002.patch > > > Every call to checkDeadNodes() will change the state to IDLE forcing the > DeadNodeDetector to sleep for IDLE_SLEEP_MS. So we don't need > deadNodeDetectInterval between every checkDeadNodes(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
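The redundancy being removed can be shown with a tiny state machine. This is an illustrative model of the DeadNodeDetector loop described above, not the real class: every pass through checkDeadNodes() flips the state back to IDLE, and the IDLE branch already sleeps for IDLE_SLEEP_MS, so an additional deadNodeDetectInterval wait inside the check adds nothing.

```java
// Illustrative model of the DeadNodeDetector loop (HDFS-15172); names and
// values are stand-ins, not the real Hadoop implementation.
public class DeadNodeDetectorSketch {
    enum State { IDLE, CHECK_DEAD }
    static final long IDLE_SLEEP_MS = 10_000L;

    private State state = State.CHECK_DEAD;
    private long sleptMs;

    // One iteration of the detector loop.
    void runOnce() {
        switch (state) {
            case CHECK_DEAD:
                checkDeadNodes();
                break;
            case IDLE:
                sleptMs += IDLE_SLEEP_MS; // stands in for Thread.sleep(IDLE_SLEEP_MS)
                state = State.CHECK_DEAD;
                break;
        }
    }

    void checkDeadNodes() {
        // ... probe suspect nodes ...
        // Every check ends by going back to IDLE, so the loop already waits
        // IDLE_SLEEP_MS between consecutive checks.
        state = State.IDLE;
    }

    long sleptMs() { return sleptMs; }
    State state() { return state; }
}
```

Because the IDLE sleep sits between every pair of checks, a separate interval wait inside checkDeadNodes() only lengthens the cycle without changing behavior.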
[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038565#comment-17038565 ] Íñigo Goiri commented on HDFS-15149: The part about manually enabling/disabling the thread is not the cleanest, but I don't think there is a better way. I like the rest of the solution. However, it looks like testDeadNodeDetectionInBackground is not handling this well. > TestDeadNodeDetection test cases time-out > - > > Key: HDFS-15149 > URL: https://issues.apache.org/jira/browse/HDFS-15149 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Ahmed Hussein >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15149-001.patch > > > The TestDeadNodeDetection JUnit tests time out with the following stack > traces: > * 1- testDeadNodeDetectionInBackground* > {code:bash} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 264.757 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDeadNodeDetection > [ERROR] > testDeadNodeDetectionInBackground(org.apache.hadoop.hdfs.TestDeadNodeDetection) > Time elapsed: 125.806 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. 
Thread diagnostics: > Timestamp: 2020-01-24 08:31:07,023 > "client DomainSocketWatcher" daemon prio=5 tid=117 runnable > java.lang.Thread.State: RUNNABLE > at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native > Method) > at > org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52) > at > org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:503) > at java.lang.Thread.run(Thread.java:748) > "Session-HouseKeeper-48c3205a" prio=5 tid=350 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "java.util.concurrent.ThreadPoolExecutor$Worker@3ae54156[State = -1, empty > queue]" daemon prio=5 tid=752 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "CacheReplicationMonitor(1960356187)" prio=5 tid=386 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163) > at > org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor.run(CacheReplicationMonitor.java:181) > "Timer for 'NameNode' metrics system" daemon prio=5 tid=339 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Object.wait(Native Method) > at java.util.TimerThread.mainLoop(Timer.java:552) > at java.util.TimerThread.run(Timer.java:505) > "org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber@6b760460" > daemon prio=5 tid=385 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:4420) > at java.lang.Thread.run(Thread.java:748) > "qtp164757726-349" daemon prio=5 tid=349 runnable > java.lang.Thread.State:
[jira] [Commented] (HDFS-15167) Block Report Interval shouldn't be reset apart from first Block Report
[ https://issues.apache.org/jira/browse/HDFS-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038562#comment-17038562 ] Íñigo Goiri commented on HDFS-15167: This looks good, just a minor comment. In the javadoc, where we define delay, let's add the unit (milliseconds, right?) and mention that 0 or smaller sends it right away.
[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038552#comment-17038552 ] Hadoop QA commented on HDFS-15149: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 52s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 20m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 10s{color} | {color:green} hadoop-hdfs-client in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}116m 15s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}212m 37s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDeadNodeDetection | | | hadoop.hdfs.TestReconstructStripedFile | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15149 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993694/HDFS-15149-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux eddba98369de 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 439d935 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | |
[jira] [Commented] (HDFS-15167) Block Report Interval shouldn't be reset apart from first Block Report
[ https://issues.apache.org/jira/browse/HDFS-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038510#comment-17038510 ] Hadoop QA commented on HDFS-15167: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 51s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}116m 21s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}182m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDeadNodeDetection | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15167 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993683/HDFS-15167-07.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 55146fa4a128 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 439d935 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28791/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28791/testReport/ | | Max. process+thread count | 2982 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output |
[jira] [Commented] (HDFS-15104) If block is not reported by any Datanode, the flag corrupt of BlockLocation should be marked as true.
[ https://issues.apache.org/jira/browse/HDFS-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038421#comment-17038421 ] Yang Yun commented on HDFS-15104: - For the block, it's missing. But for the file, it's corrupt, as fsck shows. I opened this because of the different behavior between 3.3 and 2.6; the 2.6 code is as below, and it treats a missing block as corrupt. Did we intentionally modify this behavior? final boolean isCorrupt = numCorruptNodes == numNodes; > If block is not reported by any Datanode, the flag corrupt of BlockLocation > should be marked as true. > -- > > Key: HDFS-15104 > URL: https://issues.apache.org/jira/browse/HDFS-15104 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Yang Yun >Assignee: Yang Yun >Priority: Major > Attachments: HDFS-15104.patch > > > The flag corrupt of BlockLocation returned from getFileBlockLocations() is > not marked true even when the block is not reported by any Datanode (the hosts > array is empty).
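The branch-2.6 predicate quoted above can be modeled directly. The method names here are illustrative, not the real FSNamesystem code. With zero reported replicas both counters are 0, so `numCorruptNodes == numNodes` is vacuously true and an unreported block came back flagged corrupt, which is the 2.6-vs-3.3 difference under discussion; a stricter reading separates "missing" from "corrupt".

```java
// Illustrative model of the corrupt-vs-missing classification discussed in
// HDFS-15104.  Not the real Hadoop methods; only the edge-case behavior of
// the quoted 2.6 predicate is demonstrated.
public class BlockFlagSketch {
    // Mirrors the quoted line: final boolean isCorrupt = numCorruptNodes == numNodes;
    static boolean isCorrupt26(int numCorruptNodes, int numNodes) {
        return numCorruptNodes == numNodes;
    }

    // A stricter reading: "no replica reported at all" is missing, not corrupt.
    static boolean isMissing(int numNodes) {
        return numNodes == 0;
    }

    // Corrupt only when there are known replicas and every one is corrupt.
    static boolean isCorruptStrict(int numCorruptNodes, int numNodes) {
        return numNodes > 0 && numCorruptNodes == numNodes;
    }
}
```

The two readings agree everywhere except the zero-replica edge case, which is exactly where the reported behavior change appears.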
[jira] [Updated] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-15149: --- Attachment: HDFS-15149-001.patch Status: Patch Available (was: Open)
[jira] [Commented] (HDFS-15104) If block is not reported by any Datanode, the flag corrupt of BlockLocation should be marked as true.
[ https://issues.apache.org/jira/browse/HDFS-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038373#comment-17038373 ] Ayush Saxena commented on HDFS-15104: - How can you say that the block is corrupt if no datanode has reported it by that time? Even if you send it back as corrupt here, JMX won't show it as corrupt and fsck won't show it as corrupt. I don't think it is good to conclude that the block is corrupt without actually knowing it. This seems to be a missing-block scenario, not a corrupt block.
[jira] [Commented] (HDFS-15167) Block Report Interval shouldn't be reset apart from first Block Report
[ https://issues.apache.org/jira/browse/HDFS-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038360#comment-17038360 ] Ayush Saxena commented on HDFS-15167: - Thanx [~elgoiri] for the review. Added comments as suggested.
[jira] [Updated] (HDFS-15167) Block Report Interval shouldn't be reset apart from first Block Report
[ https://issues.apache.org/jira/browse/HDFS-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-15167: Attachment: HDFS-15167-07.patch > Block Report Interval shouldn't be reset apart from first Block Report > -- > > Key: HDFS-15167 > URL: https://issues.apache.org/jira/browse/HDFS-15167 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15167-01.patch, HDFS-15167-02.patch, > HDFS-15167-03.patch, HDFS-15167-04.patch, HDFS-15167-05.patch, > HDFS-15167-06.patch, HDFS-15167-07.patch > > > Presently the block report interval is reset even when the BR is manually > triggered or triggered for a disk error, which isn't required. As the code > comment also indicates, the reset is intended for the first BR only: > {code:java} > // If we have sent the first set of block reports, then wait a random > // time before we start the periodic block reports. > if (resetBlockReportTime) { > {code}
[jira] [Updated] (HDFS-15175) Multiple CloseOp shared block instance causes the standby namenode to crash when rolling editlog
[ https://issues.apache.org/jira/browse/HDFS-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yicong Cai updated HDFS-15175: -- Description: {panel:title=Crash exception} 2020-02-16 09:24:46,426 [507844305] - ERROR [Edit log tailer:FSEditLogLoader@245] - Encountered exception on operation CloseOp [length=0, inodeId=0, path=..., replication=3, mtime=1581816138774, atime=1581814760398, blockSize=536870912, blocks=[blk_5568434562_4495417845], permissions=da_music:hdfs:rw-r-, aclEntries=null, clientName=, clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, txid=32625024993] java.io.IOException: File is not under construction: .. at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:442) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:237) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:146) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:891) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:872) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1873) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:479) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361) {panel} {panel:title=Editlog} OP_REASSIGN_LEASE 
32625021150 DFSClient_NONMAPREDUCE_-969060727_197760 .. DFSClient_NONMAPREDUCE_1000868229_201260 .. OP_CLOSE 32625023743 0 0 .. 3 1581816135883 1581814760398 536870912 false 5568434562 185818644 4495417845 da_music hdfs 416 .. OP_TRUNCATE 32625024049 .. DFSClient_NONMAPREDUCE_1000868229_201260 .. 185818644 1581816136336 5568434562 185818648 4495417845 .. OP_CLOSE 32625024993 0 0 .. 3 1581816138774 1581814760398 536870912 false 5568434562 185818644 4495417845 da_music hdfs 416 {panel} The block size in the first CloseOp should be 185818648; after truncate is applied, the block size becomes 185818644. The CloseOp/TruncateOp/CloseOp sequence is synchronized to the JournalNode in the same batch, and the block referenced by both CloseOps is the same instance, which causes the first CloseOp to carry the wrong block size. When the SNN rolls the editlog, TruncateOp does not put the file into the UnderConstruction state, so when the second CloseOp is applied the file is not under construction and the SNN crashes.
[jira] [Created] (HDFS-15175) Multiple CloseOp shared block instance causes the standby namenode to crash when rolling editlog
Yicong Cai created HDFS-15175: - Summary: Multiple CloseOp shared block instance causes the standby namenode to crash when rolling editlog Key: HDFS-15175 URL: https://issues.apache.org/jira/browse/HDFS-15175 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.9.2 Reporter: Yicong Cai Assignee: Yicong Cai {panel:title=Crash exception} 2020-02-16 09:24:46,426 [507844305] - ERROR [Edit log tailer:FSEditLogLoader@245] - Encountered exception on operation CloseOp [length=0, inodeId=0, path=..., replication=3, mtime=1581816138774, atime=1581814760398, blockSize=536870912, blocks=[blk_5568434562_4495417845], permissions=da_music:hdfs:rw-r-, aclEntries=null, clientName=, clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, txid=32625024993] java.io.IOException: File is not under construction: .. at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:442) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:237) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:146) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:891) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:872) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1873) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:479) at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361) {panel}
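The shared-instance problem described in HDFS-15175 can be reproduced in miniature. The classes below are illustrative stand-ins, not Hadoop's actual FSEditLogOp or Block classes: when two CloseOps in the same batch hold a reference to the same mutable Block, a truncate that mutates the block's size in place retroactively changes what the first CloseOp serializes:

```java
public class SharedBlockSketch {

    // Stand-in for a mutable block record.
    static class Block {
        long numBytes;
        Block(long numBytes) { this.numBytes = numBytes; }
    }

    // Stand-in for an edit-log close op; note the block reference is
    // aliased, not defensively copied at op-creation time.
    static class CloseOp {
        final Block block;
        CloseOp(Block b) { this.block = b; }
        String serialize() { return "OP_CLOSE size=" + block.numBytes; }
    }

    public static void main(String[] args) {
        Block shared = new Block(185818648L);
        CloseOp first = new CloseOp(shared);   // created before truncate

        shared.numBytes = 185818644L;          // truncate mutates in place
        CloseOp second = new CloseOp(shared);

        // Both ops now serialize the post-truncate size; the size the first
        // CloseOp should have carried (185818648) is lost.
        System.out.println(first.serialize()); // OP_CLOSE size=185818644
        System.out.println(second.serialize());
    }
}
```

Copying the block's fields at op-creation time (rather than aliasing the live instance) would make the first CloseOp immune to later mutation, which is the essence of the fix this issue calls for.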
[jira] [Commented] (HDFS-15173) RBF: Delete repeated configuration 'dfs.federation.router.metrics.enable'
[ https://issues.apache.org/jira/browse/HDFS-15173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038177#comment-17038177 ] panlijie commented on HDFS-15173: - Thanks for your review, [~aajisaka]! > RBF: Delete repeated configuration 'dfs.federation.router.metrics.enable' > - > > Key: HDFS-15173 > URL: https://issues.apache.org/jira/browse/HDFS-15173 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: documentation, rbf >Affects Versions: 3.1.1, 3.2.1 >Reporter: panlijie >Assignee: panlijie >Priority: Minor > Fix For: 3.3.0, 3.1.4, 3.2.2 > > > In the HDFS RBF default config hdfs-rbf-default.xml, the configuration key > 'dfs.federation.router.metrics.enable' appears twice.
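For reference, the duplicated entry in hdfs-rbf-default.xml has the shape below, and the fix is simply to keep a single copy of the <property> element; the value and description shown are assumptions about the default, not quoted from the file:

```xml
<!-- hdfs-rbf-default.xml contained two identical entries for this key;
     after HDFS-15173 only one remains. Value/description are assumed. -->
<property>
  <name>dfs.federation.router.metrics.enable</name>
  <value>true</value>
  <description>If the metrics in the router are enabled.</description>
</property>
```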