[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=795540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795540
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 07:18
Start Date: 27/Jul/22 07:18
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r930703886


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -201,6 +201,7 @@ private void check() {
 iterkey).iterator();
 final List toRemove = new ArrayList<>();
 final List unhealthyDns = new ArrayList<>();
+boolean inValidState = false;

Review Comment:
   Thanks @jojochuang - I have made the required changes. 





Issue Time Tracking
---

Worklog Id: (was: 795540)
Time Spent: 3h 10m  (was: 3h)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> DatanodeAdminManager$Monitor reports a node as invalid continuously
> {code}
> 2022-07-21 06:54:38,562 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager (DatanodeAdminMonitor-0): DatanodeAdminMonitor caught exception when processing node 1.2.3.4:9866.
> java.lang.IllegalStateException: Node 1.2.3.4:9866 is in an invalid state! Invalid state: In Service 0 blocks are on this dn.
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:601)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:504)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> {code}
> A node goes into an invalid state when stopDecommission sets the node to 
> In Service but fails to remove it from the pendingNodes queue (HDFS-16675). 
> This is corrected only when the user triggers startDecommission. Until then, 
> there is no need to keep the invalid-state node in the queue, since 
> startDecommission will add it back anyway.
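
The behaviour described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code in PR #4626: the class `PendingNodeMonitorSketch`, its `Node` and `AdminState` types, and the `warnings` counter are hypothetical stand-ins for the real monitor, its datanode descriptors, and its WARN log line. The idea shown is that a node found in an invalid state is reported once and then dropped from the tracked set instead of being re-checked on every pass.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Simplified, hypothetical stand-in for the admin monitor; not Hadoop code. */
final class PendingNodeMonitorSketch {

  enum AdminState { IN_SERVICE, DECOMMISSION_INPROGRESS, ENTERING_MAINTENANCE }

  static final class Node {
    final String name;
    final AdminState state;

    Node(String name, AdminState state) {
      this.name = name;
      this.state = state;
    }
  }

  private final List<Node> tracked = new ArrayList<>();
  int warnings = 0; // stands in for the WARN log line in the issue

  void track(Node node) {
    tracked.add(node);
  }

  int trackedCount() {
    return tracked.size();
  }

  /** One monitor pass over every tracked node. */
  void check() {
    Iterator<Node> it = tracked.iterator();
    while (it.hasNext()) {
      Node node = it.next();
      boolean isValid = true;
      try {
        if (node.state == AdminState.IN_SERVICE) {
          // A tracked node should never be In Service; remember that before failing.
          isValid = false;
          throw new IllegalStateException(
              "Node " + node.name + " is in an invalid state! Invalid state: " + node.state);
        }
        // A real monitor would check replication progress for the node here.
      } catch (IllegalStateException e) {
        warnings++;
      }
      if (!isValid) {
        // Stop tracking the node; a later startDecommission would queue it again.
        it.remove();
      }
    }
  }

  public static void main(String[] args) {
    PendingNodeMonitorSketch monitor = new PendingNodeMonitorSketch();
    monitor.track(new Node("1.2.3.4:9866", AdminState.IN_SERVICE));
    monitor.track(new Node("5.6.7.8:9866", AdminState.DECOMMISSION_INPROGRESS));
    monitor.check();
    monitor.check();
    System.out.println("warnings=" + monitor.warnings + ", tracked=" + monitor.trackedCount());
  }
}
```

Without the removal step, the same node would raise the same exception on every pass, which matches the continuous warning in the log above.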



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=795364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795364
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 26/Jul/22 17:44
Start Date: 26/Jul/22 17:44
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#issuecomment-1195790209

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  43m 27s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 45s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 35s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 52s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 18s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 29s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 25s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 337m  3s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 457m  4s |  |  |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4626/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4626 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux c71a67c7dd9d 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / a3c4dc90d5108fb8eb57f5efef312e295c8128e2 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4626/2/testReport/ |
   | Max. process+thread count | 2201 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=795123&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795123
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 26/Jul/22 04:58
Start Date: 26/Jul/22 04:58
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r929524721


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -201,6 +201,7 @@ private void check() {
 iterkey).iterator();
 final List toRemove = new ArrayList<>();
 final List unhealthyDns = new ArrayList<>();
+boolean inValidState = false;

Review Comment:
   IMO the variable name is confusing at first glance. Is it "valid" or 
"invalid" when it's true? How about renaming it to "isValid"?
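
The two readings of the flag can be put side by side in a small Java sketch. Everything in it is hypothetical (the class `FlagNamingSketch` and its string-based `isValid` predicate are not Hadoop code); it only shows why a positively named boolean leaves less room for misreading than `inValidState`, which can be parsed as either "in valid state" or "invalid state".

```java
/** Hypothetical illustration of the naming concern; not Hadoop code. */
final class FlagNamingSketch {

  /** Assumed rule for this sketch: a node being monitored must not be "In Service". */
  static boolean isValid(String adminState) {
    return !"In Service".equals(adminState);
  }

  public static void main(String[] args) {
    // Ambiguous name: true could be read as "in a valid state" or as "invalid state".
    boolean inValidState = !isValid("In Service");

    // Positive name: true can only mean that the state is acceptable.
    boolean isValid = isValid("In Service");

    System.out.println("inValidState=" + inValidState + ", isValid=" + isValid);
  }
}
```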





Issue Time Tracking
---

Worklog Id: (was: 795123)
Time Spent: 2h 50m  (was: 2h 40m)




[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=795120&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795120
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 26/Jul/22 03:54
Start Date: 26/Jul/22 03:54
Worklog Time Spent: 10m 
  Work Description: PrabhuJoseph commented on PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#issuecomment-1194967117

   Thanks @ashutoshcipher for the patch. Can you include a test case?




Issue Time Tracking
---

Worklog Id: (was: 795120)
Time Spent: 2h 40m  (was: 2.5h)




[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=795003&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795003
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 17:56
Start Date: 25/Jul/22 17:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#issuecomment-1194417858

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m 12s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 43s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 40s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 46s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 39s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m  6s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 35s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 243m 29s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 353m 28s |  |  |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4626/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4626 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 31b1acdfdf9e 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 28c374cf535405254f4183528864014ea5776fc8 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4626/1/testReport/ |
   | Max. process+thread count | 3405 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794931&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794931
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 14:43
Start Date: 25/Jul/22 14:43
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#issuecomment-1194143412

   Thanks @slfan1989 for your review and approval :)




Issue Time Tracking
---

Worklog Id: (was: 794931)
Time Spent: 2h 20m  (was: 2h 10m)




[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794929
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 14:40
Start Date: 25/Jul/22 14:40
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928961283


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -201,6 +201,7 @@ private void check() {
 iterkey).iterator();
 final List toRemove = new ArrayList<>();
 final List unhealthyDns = new ArrayList<>();
+boolean inValidState = false;
 
 while (it.hasNext() && !exceededNumBlocksPerCheck() && namesystem

Review Comment:
   This change wasn't made by me. To fix any such existing `Single Line` 
issues, a new JIRA can be created to clean them up or modify them at the 
module level. 





Issue Time Tracking
---

Worklog Id: (was: 794929)
Time Spent: 2h 10m  (was: 2h)




[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794928
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 14:39
Start Date: 25/Jul/22 14:39
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928963626


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -265,6 +266,7 @@ private void check() {
   // to track maintenance expiration.
   dnAdmin.setInMaintenance(dn);
 } else {
+  inValidState  = true;
   Preconditions.checkState(false,

Review Comment:
   >"Node %s is in an invalid state! " + "Invalid state: %s %s blocks are on 
this dn."
   
   I am not sure there is really an indentation issue here, and I did not make 
this change. To correct existing indentation, a separate JIRA can be 
created to track and fix it at the module level if required.





Issue Time Tracking
---

Worklog Id: (was: 794928)
Time Spent: 2h  (was: 1h 50m)




[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794926&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794926
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 14:37
Start Date: 25/Jul/22 14:37
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928961283


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -201,6 +201,7 @@ private void check() {
 iterkey).iterator();
 final List toRemove = new ArrayList<>();
 final List unhealthyDns = new ArrayList<>();
+boolean inValidState = false;
 
 while (it.hasNext() && !exceededNumBlocksPerCheck() && namesystem

Review Comment:
   This change wasn't made by me. To fix any such issues, a new JIRA can be 
created to clean them up or modify them at the module level. 





Issue Time Tracking
---

Worklog Id: (was: 794926)
Time Spent: 1h 50m  (was: 1h 40m)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> DatanodeAdminManager$Monitor reports a node as invalid continuously
> {code}
> 2022-07-21 06:54:38,562 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager (DatanodeAdminMonitor-0): DatanodeAdminMonitor caught exception when processing node 1.2.3.4:9866.
> java.lang.IllegalStateException: Node 1.2.3.4:9866 is in an invalid state! Invalid state: In Service 0 blocks are on this dn.
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:601)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:504)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> {code}
> A node goes into an invalid state when stopDecommission sets the node to 
> In Service but fails to remove it from the pendingNodes queue (HDFS-16675). 
> This is corrected only when the user triggers startDecommission. Until then, 
> we need not keep the invalid-state node in the queue, since startDecommission 
> will add it back anyway.
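
The failure mode described in the issue, and the direction the diff hunks in this thread point to, can be sketched roughly as follows. This is a simplified illustration and not the Hadoop code: the class `InvalidStateSketch`, the `AdminState` enum, and the string node names are hypothetical. It assumes the `inValidState` flag is used to drop the node from the monitor's tracked set once the exception has been logged, and it scopes the flag per node for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for the monitor's check() loop. */
class InvalidStateSketch {
  enum AdminState { IN_SERVICE, DECOMMISSION_IN_PROGRESS, ENTERING_MAINTENANCE }

  /** Nodes the monitor is tracking, keyed by a made-up node name. */
  final Map<String, AdminState> outOfServiceNodes = new LinkedHashMap<>();

  /** Runs one pass over the tracked nodes and returns the nodes it dropped. */
  List<String> check() {
    List<String> toRemove = new ArrayList<>();
    for (Map.Entry<String, AdminState> entry : outOfServiceNodes.entrySet()) {
      String dn = entry.getKey();
      boolean inValidState = false;
      try {
        if (entry.getValue() == AdminState.IN_SERVICE) {
          // A tracked node should never be In Service; remember why we failed.
          inValidState = true;
          throw new IllegalStateException("Node " + dn + " is in an invalid state!");
        }
        // The real monitor would check replication progress for the node here.
      } catch (IllegalStateException e) {
        System.err.println("caught exception when processing node " + dn + ": " + e.getMessage());
        if (inValidState) {
          // Stop tracking the node so the warning is not repeated on every pass.
          toRemove.add(dn);
        }
      }
    }
    for (String dn : toRemove) {
      outOfServiceNodes.remove(dn);
    }
    return toRemove;
  }

  public static void main(String[] args) {
    InvalidStateSketch monitor = new InvalidStateSketch();
    monitor.outOfServiceNodes.put("dn1", AdminState.IN_SERVICE);
    monitor.outOfServiceNodes.put("dn2", AdminState.DECOMMISSION_IN_PROGRESS);
    System.out.println("dropped on first pass: " + monitor.check());
    System.out.println("dropped on second pass: " + monitor.check());
  }
}
```

Without the removal step, the In Service node would stay in the tracked set and trigger the same warning on every monitor pass, which matches the "continuously" in the issue title.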



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794919=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794919
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 14:27
Start Date: 25/Jul/22 14:27
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928949967


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -201,6 +201,7 @@ private void check() {
 iterkey).iterator();
 final List toRemove = new ArrayList<>();
 final List unhealthyDns = new ArrayList<>();
+boolean inValidState = false;
 
 while (it.hasNext() && !exceededNumBlocksPerCheck() && namesystem

Review Comment:
   `while (it.hasNext() && !exceededNumBlocksPerCheck() && 
namesystem.isRunning())` 





Issue Time Tracking
---

Worklog Id: (was: 794919)
Time Spent: 1h 40m  (was: 1.5h)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794917
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 14:26
Start Date: 25/Jul/22 14:26
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928949038


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -265,6 +266,7 @@ private void check() {
   // to track maintenance expiration.
   dnAdmin.setInMaintenance(dn);
 } else {
+  inValidState  = true;
   Preconditions.checkState(false,

Review Comment:
   `"Node %s is in an invalid state! "
 + "Invalid state: %s %s blocks are on this dn.",`





Issue Time Tracking
---

Worklog Id: (was: 794917)
Time Spent: 1.5h  (was: 1h 20m)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794903=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794903
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 13:43
Start Date: 25/Jul/22 13:43
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928899669


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -288,7 +290,11 @@ private void check() {
 // an invalid state.
 LOG.warn("DatanodeAdminMonitor caught exception when processing node "
 + "{}.", dn, e);

Review Comment:
   I haven't made this change. If correcting Log Single Line issues is 
required, I think a separate JIRA (to check and fix them across modules) can 
be created. 





Issue Time Tracking
---

Worklog Id: (was: 794903)
Time Spent: 1h 20m  (was: 1h 10m)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794901=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794901
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 13:42
Start Date: 25/Jul/22 13:42
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928900049


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -201,6 +201,7 @@ private void check() {
 iterkey).iterator();
 final List toRemove = new ArrayList<>();
 final List unhealthyDns = new ArrayList<>();
+boolean inValidState = false;
 
 while (it.hasNext() && !exceededNumBlocksPerCheck() && namesystem

Review Comment:
   What do you mean by Single Line?
   





Issue Time Tracking
---

Worklog Id: (was: 794901)
Time Spent: 1h 10m  (was: 1h)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794900=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794900
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 13:42
Start Date: 25/Jul/22 13:42
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928899669


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -288,7 +290,11 @@ private void check() {
 // an invalid state.
 LOG.warn("DatanodeAdminMonitor caught exception when processing node "
 + "{}.", dn, e);

Review Comment:
   I haven't made this change. If correcting Log Single Line issues is 
required, I think a separate JIRA can be created. 





Issue Time Tracking
---

Worklog Id: (was: 794900)
Time Spent: 1h  (was: 50m)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794899=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794899
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 13:41
Start Date: 25/Jul/22 13:41
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928898679


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -265,6 +266,7 @@ private void check() {
   // to track maintenance expiration.
   dnAdmin.setInMaintenance(dn);
 } else {
+  inValidState  = true;
   Preconditions.checkState(false,

Review Comment:
   What's the indentation issue here?





Issue Time Tracking
---

Worklog Id: (was: 794899)
Time Spent: 50m  (was: 40m)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794871=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794871
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 12:35
Start Date: 25/Jul/22 12:35
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928833032


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -201,6 +201,7 @@ private void check() {
 iterkey).iterator();
 final List toRemove = new ArrayList<>();
 final List unhealthyDns = new ArrayList<>();
+boolean inValidState = false;
 
 while (it.hasNext() && !exceededNumBlocksPerCheck() && namesystem

Review Comment:
   Single Line?





Issue Time Tracking
---

Worklog Id: (was: 794871)
Time Spent: 40m  (was: 0.5h)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794870=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794870
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 12:35
Start Date: 25/Jul/22 12:35
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928832831


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -265,6 +266,7 @@ private void check() {
   // to track maintenance expiration.
   dnAdmin.setInMaintenance(dn);
 } else {
+  inValidState  = true;
   Preconditions.checkState(false,

Review Comment:
   indentation





Issue Time Tracking
---

Worklog Id: (was: 794870)
Time Spent: 0.5h  (was: 20m)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794869=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794869
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 12:34
Start Date: 25/Jul/22 12:34
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on code in PR #4626:
URL: https://github.com/apache/hadoop/pull/4626#discussion_r928832301


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminDefaultMonitor.java:
##
@@ -288,7 +290,11 @@ private void check() {
 // an invalid state.
 LOG.warn("DatanodeAdminMonitor caught exception when processing node "
 + "{}.", dn, e);

Review Comment:
   Log Single Line?





Issue Time Tracking
---

Worklog Id: (was: 794869)
Time Spent: 20m  (was: 10m)

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
> Key: HDFS-16676
> URL: https://issues.apache.org/jira/browse/HDFS-16676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.2.1
>Reporter: Prabhu Joseph
>Assignee: groot
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h



[jira] [Work logged] (HDFS-16676) DatanodeAdminManager$Monitor reports a node as invalid continuously

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16676?focusedWorklogId=794862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794862
 ]

ASF GitHub Bot logged work on HDFS-16676:
-

Author: ASF GitHub Bot
Created on: 25/Jul/22 12:01
Start Date: 25/Jul/22 12:01
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher opened a new pull request, #4626:
URL: https://github.com/apache/hadoop/pull/4626

   ### Description of PR
   
   DatanodeAdminManager$Monitor reports a node as invalid continuously
   
   JIRA - HDFS-16676
   
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id 
     (e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
     endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
     licensed in a way that is compatible for inclusion under 
     [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
     `NOTICE-binary` files?
   
   




Issue Time Tracking
---

Worklog Id: (was: 794862)
Remaining Estimate: 0h
Time Spent: 10m

> DatanodeAdminManager$Monitor reports a node as invalid continuously
> ---
>
>                 Key: HDFS-16676
>                 URL: https://issues.apache.org/jira/browse/HDFS-16676
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.2.1
>            Reporter: Prabhu Joseph
>            Assignee: groot
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> DatanodeAdminManager$Monitor reports a node as invalid continuously
> {code}
> 2022-07-21 06:54:38,562 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager (DatanodeAdminMonitor-0): DatanodeAdminMonitor caught exception when processing node 1.2.3.4:9866.
> java.lang.IllegalStateException: Node 1.2.3.4:9866 is in an invalid state! Invalid state: In Service 0 blocks are on this dn.
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:601)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:504)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> {code}
> A node enters this invalid state when stopDecommission sets it to In Service 
> but fails to remove it from the pendingNodes queue (HDFS-16675). The state is 
> corrected only when the user triggers startDecommission. Until then, there is 
> no need to keep the invalid node in the queue, because startDecommission will 
> add it back.


