[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=727563&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-727563 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Feb/22 19:01
            Start Date: 15/Feb/22 19:01
    Worklog Time Spent: 10m
      Work Description: tasanuma merged pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 727563)
    Time Spent: 5h 10m (was: 5h)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode reconfigurable to
> facilitate cluster operation and maintenance.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=727436&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-727436 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Feb/22 18:50
            Start Date: 15/Feb/22 18:50
    Worklog Time Spent: 10m
      Work Description: tomscut commented on pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1039826206

   Thanks @tasanuma and @ayushtkn for the review and confirming this.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 727436)
    Time Spent: 5h (was: 4h 50m)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 5h
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode reconfigurable to
> facilitate cluster operation and maintenance.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=727302&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-727302 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Feb/22 18:38
            Start Date: 15/Feb/22 18:38
    Worklog Time Spent: 10m
      Work Description: tasanuma commented on pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1039851634

   Merged it. Thanks for your contribution, @tomscut, and thanks for your review, @ayushtkn!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 727302)
    Time Spent: 4h 50m (was: 4h 40m)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode reconfigurable to
> facilitate cluster operation and maintenance.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=726835&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726835 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Feb/22 04:41
            Start Date: 15/Feb/22 04:41
    Worklog Time Spent: 10m
      Work Description: tasanuma merged pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 726835)
    Time Spent: 4.5h (was: 4h 20m)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode reconfigurable to
> facilitate cluster operation and maintenance.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=726836&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726836 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Feb/22 04:41
            Start Date: 15/Feb/22 04:41
    Worklog Time Spent: 10m
      Work Description: tasanuma commented on pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1039851634

   Merged it. Thanks for your contribution, @tomscut, and thanks for your review, @ayushtkn!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 726836)
    Time Spent: 4h 40m (was: 4.5h)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode reconfigurable to
> facilitate cluster operation and maintenance.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=726824&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726824 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Feb/22 03:52
            Start Date: 15/Feb/22 03:52
    Worklog Time Spent: 10m
      Work Description: tomscut commented on pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1039826206

   Thanks @tasanuma and @ayushtkn for the review and confirming this.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 726824)
    Time Spent: 4h 20m (was: 4h 10m)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode reconfigurable to
> facilitate cluster operation and maintenance.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=726224&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726224 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Feb/22 10:37
            Start Date: 14/Feb/22 10:37
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1038924634

   :confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 17m 16s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 34m 47s | | trunk passed |
| +1 :green_heart: | compile | 1m 30s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 1m 19s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 0m 59s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 26s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 1s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 31s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 23s | | trunk passed |
| +1 :green_heart: | shadedclient | 25m 38s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 20s | | the patch passed |
| +1 :green_heart: | compile | 1m 23s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 1m 23s | | the patch passed |
| +1 :green_heart: | compile | 1m 13s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 1m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 51s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/8/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 209 unchanged - 1 fixed = 210 total (was 210) |
| +1 :green_heart: | mvnsite | 1m 22s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 53s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 23s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 26s | | the patch passed |
| +1 :green_heart: | shadedclient | 25m 24s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 345m 5s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. |
| | | 468m 53s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/8/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3827 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux eb298760596d 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 1b9155e5cb920beab40e352a0b00dbd8b5178af9 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/8/testReport/ |
| Max. process+thread count | 2343 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=726083&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726083 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Feb/22 02:55
            Start Date: 14/Feb/22 02:55
    Worklog Time Spent: 10m
      Work Description: tomscut commented on a change in pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#discussion_r805472758

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodePeerMetrics.java
##########
@@ -57,26 +60,26 @@
    * for outlier detection. If the number of samples is below this then
    * outlier detection is skipped.
    */
-  private final long minOutlierDetectionSamples;
+  private volatile long minOutlierDetectionSamples;
   /**
    * Threshold in milliseconds below which a DataNode is definitely not slow.
    */
-  private final long lowThresholdMs;
+  private volatile long lowThresholdMs;
   /**
    * Minimum number of nodes to run outlier detection.
    */
-  private final long minOutlierDetectionNodes;
+  private volatile long minOutlierDetectionNodes;

   public DataNodePeerMetrics(final String name, Configuration conf) {
     this.name = name;
     minOutlierDetectionSamples = conf.getLong(
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY,
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_DEFAULT);
     lowThresholdMs =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
+        conf.getLong(DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
             DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_DEFAULT);
     minOutlierDetectionNodes =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
+        conf.getLong(DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
             DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_DEFAULT);
     this.slowNodeDetector =
         new OutlierDetector(minOutlierDetectionNodes, lowThresholdMs);

 Review comment:
       Thanks @tasanuma for your comment, and it makes sense to me. I updated the code, please have a look when you are free.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 726083)
    Time Spent: 4h (was: 3h 50m)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode reconfigurable to
> facilitate cluster operation and maintenance.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
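The hunk quoted in this message turns three `final` fields into `volatile` ones so they can be changed while the DataNode is running. A minimal sketch of that idea follows, assuming a single reconfiguration thread writes the value and metrics threads read it. `ReconfigurableLong` and its `reconfigure` method are hypothetical names for illustration and are not part of Hadoop; treating a null value as "revert to the default" is also an assumption of this sketch.

```java
// Hypothetical illustration only; not a Hadoop class.
class ReconfigurableLong {
  private final long defaultValue;
  // volatile: a write by the reconfiguration thread becomes visible to
  // reader threads without further synchronization.
  private volatile long value;

  ReconfigurableLong(long defaultValue) {
    this.defaultValue = defaultValue;
    this.value = defaultValue;
  }

  long get() {
    return value;
  }

  /**
   * Applies a new setting given as a string. In this sketch a null value
   * means "revert to the default" (an assumption). Negative values are
   * rejected and leave the current value untouched.
   */
  long reconfigure(String newVal) {
    long parsed = (newVal == null) ? defaultValue : Long.parseLong(newVal.trim());
    if (parsed < 0) {
      throw new IllegalArgumentException("value must be non-negative: " + parsed);
    }
    value = parsed;
    return parsed;
  }
}
```

A single `volatile long` is enough here because each parameter is read and written independently. If two parameters had to change together atomically, a lock or an immutable holder object swapped in one write would be the safer choice.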
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=725929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-725929 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Feb/22 02:28
            Start Date: 14/Feb/22 02:28
    Worklog Time Spent: 10m
      Work Description: tasanuma commented on a change in pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#discussion_r805442838

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodePeerMetrics.java
##########
@@ -57,26 +60,26 @@
    * for outlier detection. If the number of samples is below this then
    * outlier detection is skipped.
    */
-  private final long minOutlierDetectionSamples;
+  private volatile long minOutlierDetectionSamples;
   /**
    * Threshold in milliseconds below which a DataNode is definitely not slow.
    */
-  private final long lowThresholdMs;
+  private volatile long lowThresholdMs;
   /**
    * Minimum number of nodes to run outlier detection.
    */
-  private final long minOutlierDetectionNodes;
+  private volatile long minOutlierDetectionNodes;

   public DataNodePeerMetrics(final String name, Configuration conf) {
     this.name = name;
     minOutlierDetectionSamples = conf.getLong(
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY,
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_DEFAULT);
     lowThresholdMs =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
+        conf.getLong(DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
             DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_DEFAULT);
     minOutlierDetectionNodes =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
+        conf.getLong(DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
             DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_DEFAULT);

 Review comment:
       Why don't you import the static values of `..._DEFAULT` as well?

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodePeerMetrics.java
##########
@@ -57,26 +60,26 @@
    * for outlier detection. If the number of samples is below this then
    * outlier detection is skipped.
    */
-  private final long minOutlierDetectionSamples;
+  private volatile long minOutlierDetectionSamples;
   /**
    * Threshold in milliseconds below which a DataNode is definitely not slow.
    */
-  private final long lowThresholdMs;
+  private volatile long lowThresholdMs;
   /**
    * Minimum number of nodes to run outlier detection.
    */
-  private final long minOutlierDetectionNodes;
+  private volatile long minOutlierDetectionNodes;

   public DataNodePeerMetrics(final String name, Configuration conf) {
     this.name = name;
     minOutlierDetectionSamples = conf.getLong(
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY,
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_DEFAULT);
     lowThresholdMs =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
+        conf.getLong(DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
             DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_DEFAULT);
     minOutlierDetectionNodes =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
+        conf.getLong(DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
             DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_DEFAULT);
     this.slowNodeDetector =
         new OutlierDetector(minOutlierDetectionNodes, lowThresholdMs);

 Review comment:
       `this.slowNodeDetector` has to update `minOutlierDetectionNodes` and `lowThresholdMs` after reconfiguring them, right?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 725929)
    Time Spent: 3h 50m (was: 3h 40m)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode
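The second review comment in the message above points at a general pitfall: the detector is constructed from copies of two `long` values, so changing the fields of the owning object later does not reach it. The sketch below shows the pitfall and one possible remedy (forwarding each update through a setter). `SketchDetector` and `SketchPeerMetrics` are hypothetical stand-ins, the slow-node check is deliberately simplified, and nothing here is claimed to match how `OutlierDetector` or the merged patch actually work.

```java
// Hypothetical stand-ins for illustration; not the Hadoop classes.
class SketchDetector {
  private volatile long minNodes;
  private volatile long lowThresholdMs;

  SketchDetector(long minNodes, long lowThresholdMs) {
    this.minNodes = minNodes;
    this.lowThresholdMs = lowThresholdMs;
  }

  void setLowThresholdMs(long lowThresholdMs) {
    this.lowThresholdMs = lowThresholdMs;
  }

  // Simplified check: enough reporting nodes and latency above the threshold.
  boolean isSlow(int reportingNodes, long latencyMs) {
    return reportingNodes >= minNodes && latencyMs > lowThresholdMs;
  }
}

class SketchPeerMetrics {
  private volatile long lowThresholdMs;
  private final SketchDetector detector;

  SketchPeerMetrics(long minNodes, long lowThresholdMs) {
    this.lowThresholdMs = lowThresholdMs;
    // The detector receives copies of the primitive values, not references.
    this.detector = new SketchDetector(minNodes, lowThresholdMs);
  }

  // Pitfall: only the local field changes; the detector keeps the old value.
  void setLowThresholdMsLocalOnly(long newValue) {
    this.lowThresholdMs = newValue;
  }

  // Remedy in this sketch: update the field and forward to the detector.
  void setLowThresholdMs(long newValue) {
    this.lowThresholdMs = newValue;
    detector.setLowThresholdMs(newValue);
  }

  long getLowThresholdMs() {
    return lowThresholdMs;
  }

  boolean isSlow(int reportingNodes, long latencyMs) {
    return detector.isSlow(reportingNodes, latencyMs);
  }
}
```

With a threshold of 5 ms, a latency of 8 ms is reported as slow. After `setLowThresholdMsLocalOnly(20)` the field reads 20 but the detector still answers with the old threshold; only `setLowThresholdMs(20)` changes the outcome.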
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=725870&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-725870 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Feb/22 02:03
            Start Date: 14/Feb/22 02:03
    Worklog Time Spent: 10m
      Work Description: tasanuma commented on a change in pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#discussion_r805442838

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodePeerMetrics.java
##########
@@ -57,26 +60,26 @@
    * for outlier detection. If the number of samples is below this then
    * outlier detection is skipped.
    */
-  private final long minOutlierDetectionSamples;
+  private volatile long minOutlierDetectionSamples;
   /**
    * Threshold in milliseconds below which a DataNode is definitely not slow.
    */
-  private final long lowThresholdMs;
+  private volatile long lowThresholdMs;
   /**
    * Minimum number of nodes to run outlier detection.
    */
-  private final long minOutlierDetectionNodes;
+  private volatile long minOutlierDetectionNodes;

   public DataNodePeerMetrics(final String name, Configuration conf) {
     this.name = name;
     minOutlierDetectionSamples = conf.getLong(
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY,
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_DEFAULT);
     lowThresholdMs =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
+        conf.getLong(DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
             DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_DEFAULT);
     minOutlierDetectionNodes =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
+        conf.getLong(DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
             DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_DEFAULT);

 Review comment:
       Why don't you import the static values of `..._DEFAULT` as well?

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodePeerMetrics.java
##########
@@ -57,26 +60,26 @@
    * for outlier detection. If the number of samples is below this then
    * outlier detection is skipped.
    */
-  private final long minOutlierDetectionSamples;
+  private volatile long minOutlierDetectionSamples;
   /**
    * Threshold in milliseconds below which a DataNode is definitely not slow.
    */
-  private final long lowThresholdMs;
+  private volatile long lowThresholdMs;
   /**
    * Minimum number of nodes to run outlier detection.
    */
-  private final long minOutlierDetectionNodes;
+  private volatile long minOutlierDetectionNodes;

   public DataNodePeerMetrics(final String name, Configuration conf) {
     this.name = name;
     minOutlierDetectionSamples = conf.getLong(
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY,
         DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_DEFAULT);
     lowThresholdMs =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
+        conf.getLong(DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY,
             DFSConfigKeys.DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_DEFAULT);
     minOutlierDetectionNodes =
-        conf.getLong(DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
+        conf.getLong(DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
             DFSConfigKeys.DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_DEFAULT);
     this.slowNodeDetector =
         new OutlierDetector(minOutlierDetectionNodes, lowThresholdMs);

 Review comment:
       `this.slowNodeDetector` has to update `minOutlierDetectionNodes` and `lowThresholdMs` after reconfiguring them, right?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 725870)
    Time Spent: 3h 40m (was: 3.5h)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=719815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-719815 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Feb/22 01:13
            Start Date: 03/Feb/22 01:13
    Worklog Time Spent: 10m
      Work Description: tomscut commented on pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1028510711

   Hi @ayushtkn @tasanuma, could you please help review this PR? Thanks!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 719815)
    Time Spent: 3.5h (was: 3h 20m)

> Reconfig slow peer parameters for datanode
> ------------------------------------------
>
>                 Key: HDFS-16396
>                 URL: https://issues.apache.org/jira/browse/HDFS-16396
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> In large clusters, rolling restart datanodes takes a long time. We can make
> slow peers parameters and slow disks parameters in datanode reconfigurable to
> facilitate cluster operation and maintenance.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=718376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-718376 ]

ASF GitHub Bot logged work on HDFS-16396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Feb/22 02:13
            Start Date: 01/Feb/22 02:13
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #3827:
URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1026413196

   :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 45s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 34m 20s | | trunk passed |
| +1 :green_heart: | compile | 1m 32s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 1m 23s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 1m 6s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 48s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 5s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 34s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 19s | | trunk passed |
| +1 :green_heart: | shadedclient | 23m 54s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 18s | | the patch passed |
| +1 :green_heart: | compile | 1m 20s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 1m 20s | | the patch passed |
| +1 :green_heart: | compile | 1m 14s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 1m 14s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 51s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/7/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 210 unchanged - 1 fixed = 211 total (was 211) |
| +1 :green_heart: | mvnsite | 1m 20s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 51s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 23s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 18s | | the patch passed |
| +1 :green_heart: | shadedclient | 23m 37s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 226m 28s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/7/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 48s | | The patch does not generate ASF License warnings. |
| | | 330m 49s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/7/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3827 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 9d5e61c47290 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 5332ff35509109879ad95d5529e555cc9e35f588 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=716298=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-716298 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Jan/22 09:32 Start Date: 27/Jan/22 09:32 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1023014162 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 51s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 35m 12s | | trunk passed | | +1 :green_heart: | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 22s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 0m 59s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 28s | | trunk passed | | +1 :green_heart: | javadoc | 1m 1s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 28s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 20s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 26s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 19s | | the patch passed | | +1 :green_heart: | compile | 1m 23s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 23s | | the patch passed | | +1 :green_heart: | compile | 1m 14s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 1m 14s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 53s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/6/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 209 unchanged - 1 fixed = 210 total (was 210) | | +1 :green_heart: | mvnsite | 1m 18s | | the patch passed | | +1 :green_heart: | javadoc | 0m 53s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 24s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 26s | | the patch passed | | +1 :green_heart: | shadedclient | 25m 58s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 332m 40s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. 
| | | | 440m 45s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/6/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3827 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 4ee23d48da65 4.15.0-162-generic #170-Ubuntu SMP Mon Oct 18 11:38:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 5332ff35509109879ad95d5529e555cc9e35f588 | | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/6/testReport/ | | Max. process+thread count | 2000 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=716295=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-716295 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Jan/22 09:24 Start Date: 27/Jan/22 09:24 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1023007212 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 17m 47s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 35m 20s | | trunk passed | | +1 :green_heart: | compile | 1m 27s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 17s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 1m 0s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 25s | | trunk passed | | +1 :green_heart: | javadoc | 1m 2s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 29s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 20s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 29s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 16s | | the patch passed | | +1 :green_heart: | compile | 1m 22s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 22s | | the patch passed | | +1 :green_heart: | compile | 1m 18s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 1m 18s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 53s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/4/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 209 unchanged - 1 fixed = 212 total (was 210) | | +1 :green_heart: | mvnsite | 1m 21s | | the patch passed | | +1 :green_heart: | javadoc | 0m 53s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 24s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 27s | | the patch passed | | +1 :green_heart: | shadedclient | 25m 35s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 342m 35s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. 
| | | | 467m 49s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3827 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 7a000dcff8ab 4.15.0-162-generic #170-Ubuntu SMP Mon Oct 18 11:38:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / edbee4bd469fcd741376783236d828b70a3f5051 | | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/4/testReport/ | | Max. process+thread count | 2013 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=716225=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-716225 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Jan/22 07:15 Start Date: 27/Jan/22 07:15 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1022913689 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 45s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 14s | | trunk passed | | +1 :green_heart: | compile | 1m 26s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 18s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 1m 3s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 29s | | trunk passed | | +1 :green_heart: | javadoc | 1m 3s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 30s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 15s | | trunk passed | | +1 :green_heart: | shadedclient | 22m 39s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 16s | | the patch passed | | +1 :green_heart: | compile | 1m 17s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 17s | | the patch passed | | +1 :green_heart: | compile | 1m 14s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 1m 14s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 52s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/5/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 209 unchanged - 1 fixed = 212 total (was 210) | | +1 :green_heart: | mvnsite | 1m 19s | | the patch passed | | +1 :green_heart: | javadoc | 0m 51s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 21s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 17s | | the patch passed | | +1 :green_heart: | shadedclient | 22m 7s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 226m 59s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. 
| | | | 325m 46s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/5/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3827 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux c427e27a3950 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / edbee4bd469fcd741376783236d828b70a3f5051 | | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/5/testReport/ | | Max. process+thread count | 3442 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=716154=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-716154 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Jan/22 02:08 Start Date: 27/Jan/22 02:08 Worklog Time Spent: 10m Work Description: tomscut commented on pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1022777015 Hi @tamaashu , please also take a look at this. Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 716154) Time Spent: 2h 40m (was: 2.5h) > Reconfig slow peer parameters for datanode > -- > > Key: HDFS-16396 > URL: https://issues.apache.org/jira/browse/HDFS-16396 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > > In large clusters, rolling restart datanodes takes a long time. We can make > slow peers parameters and slow disks parameters in datanode reconfigurable to > facilitate cluster operation and maintenance. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=716153=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-716153 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Jan/22 02:07 Start Date: 27/Jan/22 02:07 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r793197883 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -642,13 +656,76 @@ public String reconfigurePropertyImpl(String property, String newVal) } break; } +case DFS_DATANODE_PEER_STATS_ENABLED_KEY: +case DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY: +case DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY: +case DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY: + return reconfSlowPeerParameters(property, newVal); default: break; } throw new ReconfigurationException( property, newVal, getConf().get(property)); } + private String reconfSlowPeerParameters(String property, String newVal) + throws ReconfigurationException { +String result; +try { + LOG.info("Reconfiguring {} to {}", property, newVal); + if (property.equals(DFS_DATANODE_PEER_STATS_ENABLED_KEY)) { +checkNotNull(dnConf, "DNConf has not been initialized."); +if (newVal != null && !newVal.equalsIgnoreCase("true") +&& !newVal.equalsIgnoreCase("false")) { + throw new IllegalArgumentException("Not a valid Boolean value for " + property + + " in reconfSlowPeerParameters"); +} +boolean enable = (newVal == null ? 
DFS_DATANODE_PEER_STATS_ENABLED_DEFAULT : +Boolean.parseBoolean(newVal)); +result = Boolean.toString(enable); +dnConf.setPeerStatsEnabled(enable); +if (enable) { + if (peerMetrics == null) { +peerMetrics = DataNodePeerMetrics.create(getDisplayName(), getConf()); + } +} else { + peerMetrics = null; Review comment: > `peerMetrics` isn't synchronised and it is being fetched by BpServiceActor & BlockReceiver. I suspect that in a race condition this may lead to an NPE there. Thanks @ayushtkn for your comment. To avoid an NPE, I no longer set `peerMetrics` to null when the feature is disabled; the existing `peerMetrics` instance is kept. Do you think this solution is feasible? Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 716153) Time Spent: 2.5h (was: 2h 20m) > Reconfig slow peer parameters for datanode > -- > > Key: HDFS-16396 > URL: https://issues.apache.org/jira/browse/HDFS-16396 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > In large clusters, rolling restart datanodes takes a long time. We can make > slow peers parameters and slow disks parameters in datanode reconfigurable to > facilitate cluster operation and maintenance. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
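For readers following the thread, here is a minimal sketch of the approach described in the reply above: the metrics object is created on first enable and is never reset to null, so readers only need to consult the flag. `SampleCounter` and `RetainedPeerStats` are hypothetical stand-ins invented for this illustration; they are not the actual `DataNodePeerMetrics` or `DataNode` code from the pull request.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the real metrics class; it only counts samples.
final class SampleCounter {
  private final AtomicLong samples = new AtomicLong();

  void add() {
    samples.incrementAndGet();
  }

  long count() {
    return samples.get();
  }
}

// Hypothetical holder for the two fields discussed in the review.
final class RetainedPeerStats {
  private volatile boolean peerStatsEnabled;
  private volatile SampleCounter peerMetrics;

  // Reconfiguration path: flip the flag, create the metrics lazily,
  // and deliberately leave the reference in place on disable.
  synchronized void setEnabled(boolean enable) {
    peerStatsEnabled = enable;
    if (enable && peerMetrics == null) {
      peerMetrics = new SampleCounter();
    }
  }

  // Reader path: once created, the reference never goes back to null,
  // so a concurrent disable cannot cause a NullPointerException here.
  boolean record() {
    SampleCounter metrics = peerMetrics;
    if (!peerStatsEnabled || metrics == null) {
      return false;
    }
    metrics.add();
    return true;
  }

  SampleCounter metrics() {
    return peerMetrics;
  }
}
```

The trade-off in this sketch is that a disabled node keeps the (idle) metrics object alive until restart, in exchange for readers never observing a reference that changes from non-null to null.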
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=716146=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-716146 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Jan/22 02:03 Start Date: 27/Jan/22 02:03 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r777201820 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -642,13 +656,76 @@ public String reconfigurePropertyImpl(String property, String newVal) } break; } +case DFS_DATANODE_PEER_STATS_ENABLED_KEY: +case DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY: +case DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY: +case DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY: + return reconfSlowPeerParameters(property, newVal); default: break; } throw new ReconfigurationException( property, newVal, getConf().get(property)); } + private String reconfSlowPeerParameters(String property, String newVal) + throws ReconfigurationException { +String result; +try { + LOG.info("Reconfiguring {} to {}", property, newVal); + if (property.equals(DFS_DATANODE_PEER_STATS_ENABLED_KEY)) { +checkNotNull(dnConf, "DNConf has not been initialized."); +if (newVal != null && !newVal.equalsIgnoreCase("true") +&& !newVal.equalsIgnoreCase("false")) { + throw new IllegalArgumentException("Not a valid Boolean value for " + property + + " in reconfSlowPeerParameters"); +} +boolean enable = (newVal == null ? DFS_DATANODE_PEER_STATS_ENABLED_DEFAULT : +Boolean.parseBoolean(newVal)); +result = Boolean.toString(enable); +dnConf.setPeerStatsEnabled(enable); +if (enable) { + if (peerMetrics == null) { +peerMetrics = DataNodePeerMetrics.create(getDisplayName(), getConf()); + } +} else { + peerMetrics = null; Review comment: Thanks @ayushtkn for your comments. 
To avoid NPE, I set `peerStatsEnabled`(volatile) first, and then I add judgment `dnConf.peerStatsEnabled && peerMetrics != null` before using `peerMetrics`. Do you think this is ok? ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -642,13 +656,76 @@ public String reconfigurePropertyImpl(String property, String newVal) } break; } +case DFS_DATANODE_PEER_STATS_ENABLED_KEY: +case DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY: +case DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY: +case DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY: + return reconfSlowPeerParameters(property, newVal); default: break; } throw new ReconfigurationException( property, newVal, getConf().get(property)); } + private String reconfSlowPeerParameters(String property, String newVal) + throws ReconfigurationException { +String result; +try { + LOG.info("Reconfiguring {} to {}", property, newVal); + if (property.equals(DFS_DATANODE_PEER_STATS_ENABLED_KEY)) { +checkNotNull(dnConf, "DNConf has not been initialized."); +if (newVal != null && !newVal.equalsIgnoreCase("true") +&& !newVal.equalsIgnoreCase("false")) { + throw new IllegalArgumentException("Not a valid Boolean value for " + property + + " in reconfSlowPeerParameters"); +} +boolean enable = (newVal == null ? DFS_DATANODE_PEER_STATS_ENABLED_DEFAULT : +Boolean.parseBoolean(newVal)); +result = Boolean.toString(enable); +dnConf.setPeerStatsEnabled(enable); +if (enable) { + if (peerMetrics == null) { +peerMetrics = DataNodePeerMetrics.create(getDisplayName(), getConf()); + } +} else { + peerMetrics = null; Review comment: > Thanks @ayushtkn for your comments. > > To avoid NPE, I set `peerStatsEnabled`(volatile) first, and then I add judgment `dnConf.peerStatsEnabled && peerMetrics != null` before using `peerMetrics`. Do you think this is ok? Hi @ayushtkn , what do you think of this solution? Thanks. -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 716146) Time Spent: 2h 20m (was: 2h
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=716139=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-716139 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Jan/22 01:48 Start Date: 27/Jan/22 01:48 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r793191324 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeReconfiguration.java ## @@ -365,4 +372,79 @@ public void testBlockReportIntervalReconfiguration() .getConf().get(DFS_BLOCKREPORT_INTERVAL_MSEC_KEY)); } } + + @Test + public void testSlowPeerParameters() + throws ReconfigurationException { +String[] slowPeersParameters = { +DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY, +DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY, +DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY}; + +for (int i = 0; i < NUM_DATA_NODE; i++) { + DataNode dn = cluster.getDataNodes().get(i); + + // Try invalid values. + try { +dn.reconfigureProperty(DFS_DATANODE_PEER_STATS_ENABLED_KEY, "text"); + } catch (ReconfigurationException expected) { +assertEquals("Could not change property dfs.datanode.peer.stats.enabled from 'true' to " + +"'text'", expected.getMessage()); + } Review comment: Hi @ayushtkn, I have updated it, PTAL. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 716139) Time Spent: 2h 10m (was: 2h) > Reconfig slow peer parameters for datanode > -- > > Key: HDFS-16396 > URL: https://issues.apache.org/jira/browse/HDFS-16396 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > In large clusters, rolling restart datanodes takes a long time. We can make > slow peers parameters and slow disks parameters in datanode reconfigurable to > facilitate cluster operation and maintenance. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
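A note on the quoted test hunk: as shown, the `try`/`catch` around the invalid value would also pass if no exception were thrown at all. The sketch below illustrates one way to make the "must throw" expectation explicit. It is self-contained and does not use JUnit or Hadoop; `BooleanSetting`, its assumed default of `false`, and the `expectThrows` helper are hypothetical names written for this illustration, not code from the pull request.

```java
// Hypothetical setting that validates values the same way the quoted diff does:
// null resets to a default, anything other than "true"/"false" is rejected.
final class BooleanSetting {
  private static final boolean DEFAULT = false; // assumed default, for illustration only
  private boolean value = DEFAULT;

  void reconfigure(String newVal) {
    if (newVal != null
        && !newVal.equalsIgnoreCase("true")
        && !newVal.equalsIgnoreCase("false")) {
      throw new IllegalArgumentException("Not a valid Boolean value: " + newVal);
    }
    value = (newVal == null) ? DEFAULT : Boolean.parseBoolean(newVal);
  }

  boolean get() {
    return value;
  }
}

final class InvalidValueCheck {
  private InvalidValueCheck() {
  }

  // Runs the action and returns the expected exception; fails loudly if the
  // action completes normally or throws a different type.
  static <T extends Throwable> T expectThrows(Class<T> type, Runnable action) {
    try {
      action.run();
    } catch (Throwable t) {
      if (type.isInstance(t)) {
        return type.cast(t);
      }
      throw new AssertionError("Unexpected exception type: " + t, t);
    }
    throw new AssertionError(
        "Expected " + type.getSimpleName() + " but nothing was thrown");
  }
}
```

In a JUnit-based test the same effect is usually achieved with an existing helper such as `Assert.assertThrows` (JUnit 4.13 and later) rather than a hand-written one.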
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=704378=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-704378 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 06/Jan/22 02:22 Start Date: 06/Jan/22 02:22 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r779258490 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -642,13 +656,76 @@ public String reconfigurePropertyImpl(String property, String newVal) } break; } +case DFS_DATANODE_PEER_STATS_ENABLED_KEY: +case DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY: +case DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY: +case DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY: + return reconfSlowPeerParameters(property, newVal); default: break; } throw new ReconfigurationException( property, newVal, getConf().get(property)); } + private String reconfSlowPeerParameters(String property, String newVal) + throws ReconfigurationException { +String result; +try { + LOG.info("Reconfiguring {} to {}", property, newVal); + if (property.equals(DFS_DATANODE_PEER_STATS_ENABLED_KEY)) { +checkNotNull(dnConf, "DNConf has not been initialized."); +if (newVal != null && !newVal.equalsIgnoreCase("true") +&& !newVal.equalsIgnoreCase("false")) { + throw new IllegalArgumentException("Not a valid Boolean value for " + property + + " in reconfSlowPeerParameters"); +} +boolean enable = (newVal == null ? DFS_DATANODE_PEER_STATS_ENABLED_DEFAULT : +Boolean.parseBoolean(newVal)); +result = Boolean.toString(enable); +dnConf.setPeerStatsEnabled(enable); +if (enable) { + if (peerMetrics == null) { +peerMetrics = DataNodePeerMetrics.create(getDisplayName(), getConf()); + } +} else { + peerMetrics = null; Review comment: > Thanks @ayushtkn for your comments. 
> > To avoid an NPE, I set `peerStatsEnabled` (volatile) first, and then check `dnConf.peerStatsEnabled && peerMetrics != null` before using `peerMetrics`. Do you think this is OK? Hi @ayushtkn, what do you think of this solution? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 704378) Time Spent: 2h (was: 1h 50m) > Reconfig slow peer parameters for datanode > -- > > Key: HDFS-16396 > URL: https://issues.apache.org/jira/browse/HDFS-16396 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > In large clusters, rolling restart datanodes takes a long time. We can make > slow peers parameters and slow disks parameters in datanode reconfigurable to > facilitate cluster operation and maintenance. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=703168=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-703168 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 04/Jan/22 02:51 Start Date: 04/Jan/22 02:51 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r777201820 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -642,13 +656,76 @@ public String reconfigurePropertyImpl(String property, String newVal) } break; } +case DFS_DATANODE_PEER_STATS_ENABLED_KEY: +case DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY: +case DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY: +case DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY: + return reconfSlowPeerParameters(property, newVal); default: break; } throw new ReconfigurationException( property, newVal, getConf().get(property)); } + private String reconfSlowPeerParameters(String property, String newVal) + throws ReconfigurationException { +String result; +try { + LOG.info("Reconfiguring {} to {}", property, newVal); + if (property.equals(DFS_DATANODE_PEER_STATS_ENABLED_KEY)) { +checkNotNull(dnConf, "DNConf has not been initialized."); +if (newVal != null && !newVal.equalsIgnoreCase("true") +&& !newVal.equalsIgnoreCase("false")) { + throw new IllegalArgumentException("Not a valid Boolean value for " + property + + " in reconfSlowPeerParameters"); +} +boolean enable = (newVal == null ? DFS_DATANODE_PEER_STATS_ENABLED_DEFAULT : +Boolean.parseBoolean(newVal)); +result = Boolean.toString(enable); +dnConf.setPeerStatsEnabled(enable); +if (enable) { + if (peerMetrics == null) { +peerMetrics = DataNodePeerMetrics.create(getDisplayName(), getConf()); + } +} else { + peerMetrics = null; Review comment: Thanks @ayushtkn for your comments. 
To avoid an NPE, I set `peerStatsEnabled` (volatile) first, and then check `dnConf.peerStatsEnabled && peerMetrics != null` before using `peerMetrics`. Do you think this is OK? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 703168) Time Spent: 1h 50m (was: 1h 40m) > Reconfig slow peer parameters for datanode > -- > > Key: HDFS-16396 > URL: https://issues.apache.org/jira/browse/HDFS-16396 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > In large clusters, rolling restart datanodes takes a long time. We can make > slow peers parameters and slow disks parameters in datanode reconfigurable to > facilitate cluster operation and maintenance. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
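The guard proposed in the comment above can be sketched roughly as follows. One detail matters if the reference is still set to null on disable: a reader that checks the field and then dereferences the field again can still race with the writer, so this sketch copies the reference into a local variable once and uses only that copy. `PeerMetricsStub` and `GuardedPeerStats` are hypothetical names for this illustration and are not the classes touched by the pull request.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the real metrics class; it only counts samples.
final class PeerMetricsStub {
  private final AtomicLong samples = new AtomicLong();

  void addSample() {
    samples.incrementAndGet();
  }

  long sampleCount() {
    return samples.get();
  }
}

// Hypothetical holder mirroring the flag and the reference from the review.
final class GuardedPeerStats {
  private volatile boolean peerStatsEnabled;
  private volatile PeerMetricsStub peerMetrics;

  // Reconfiguration path: set the flag first, then create or drop the metrics.
  synchronized void setEnabled(boolean enable) {
    peerStatsEnabled = enable;
    if (enable) {
      if (peerMetrics == null) {
        peerMetrics = new PeerMetricsStub();
      }
    } else {
      peerMetrics = null;
    }
  }

  // Reader path: take one snapshot of the reference. Even if another thread
  // nulls the field right after the check, the local copy stays valid.
  boolean record() {
    PeerMetricsStub metrics = peerMetrics;
    if (peerStatsEnabled && metrics != null) {
      metrics.addSample();
      return true;
    }
    return false;
  }

  PeerMetricsStub current() {
    return peerMetrics;
  }
}
```

With this shape, a sample recorded during a concurrent disable may land in an object that is about to be discarded, which is harmless for metrics but worth knowing when reading the counters.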
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=702802=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-702802 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 02/Jan/22 12:05 Start Date: 02/Jan/22 12:05 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r777201910 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -865,7 +942,7 @@ private void refreshVolumes(String newVolumes) throws IOException { .newFixedThreadPool(changedVolumes.newLocations.size()); List> exceptions = Lists.newArrayList(); - Preconditions.checkNotNull(data, "Storage not yet initialized"); + checkNotNull(data, "Storage not yet initialized"); Review comment: Thanks. I will fix this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 702802) Time Spent: 1.5h (was: 1h 20m) > Reconfig slow peer parameters for datanode > -- > > Key: HDFS-16396 > URL: https://issues.apache.org/jira/browse/HDFS-16396 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > In large clusters, rolling restart datanodes takes a long time. We can make > slow peers parameters and slow disks parameters in datanode reconfigurable to > facilitate cluster operation and maintenance. 
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=702803=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-702803 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 02/Jan/22 12:05 Start Date: 02/Jan/22 12:05 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r777201933 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeReconfiguration.java ## @@ -365,4 +372,79 @@ public void testBlockReportIntervalReconfiguration() .getConf().get(DFS_BLOCKREPORT_INTERVAL_MSEC_KEY)); } } + + @Test + public void testSlowPeerParameters() + throws ReconfigurationException { +String[] slowPeersParameters = { +DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY, +DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY, +DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY}; + +for (int i = 0; i < NUM_DATA_NODE; i++) { + DataNode dn = cluster.getDataNodes().get(i); + + // Try invalid values. + try { +dn.reconfigureProperty(DFS_DATANODE_PEER_STATS_ENABLED_KEY, "text"); + } catch (ReconfigurationException expected) { +assertEquals("Could not change property dfs.datanode.peer.stats.enabled from 'true' to " + +"'text'", expected.getMessage()); + } Review comment: Good idea. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 702803) Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=702801=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-702801 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 02/Jan/22 12:04 Start Date: 02/Jan/22 12:04 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r777201820 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -642,13 +656,76 @@ public String reconfigurePropertyImpl(String property, String newVal) } break; } +case DFS_DATANODE_PEER_STATS_ENABLED_KEY: +case DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY: +case DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY: +case DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY: + return reconfSlowPeerParameters(property, newVal); default: break; } throw new ReconfigurationException( property, newVal, getConf().get(property)); } + private String reconfSlowPeerParameters(String property, String newVal) + throws ReconfigurationException { +String result; +try { + LOG.info("Reconfiguring {} to {}", property, newVal); + if (property.equals(DFS_DATANODE_PEER_STATS_ENABLED_KEY)) { +checkNotNull(dnConf, "DNConf has not been initialized."); +if (newVal != null && !newVal.equalsIgnoreCase("true") +&& !newVal.equalsIgnoreCase("false")) { + throw new IllegalArgumentException("Not a valid Boolean value for " + property + + " in reconfSlowPeerParameters"); +} +boolean enable = (newVal == null ? DFS_DATANODE_PEER_STATS_ENABLED_DEFAULT : +Boolean.parseBoolean(newVal)); +result = Boolean.toString(enable); +dnConf.setPeerStatsEnabled(enable); +if (enable) { + if (peerMetrics == null) { +peerMetrics = DataNodePeerMetrics.create(getDisplayName(), getConf()); + } +} else { + peerMetrics = null; Review comment: Thanks @ayushtkn for your comments. 
To avoid an NPE, I set `peerStatsEnabled` first, and then check `dnConf.peerStatsEnabled && peerMetrics != null` before using `peerMetrics`. Do you think this is OK? Issue Time Tracking --- Worklog Id: (was: 702801) Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=702800=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-702800 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 02/Jan/22 11:42 Start Date: 02/Jan/22 11:42 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r777199652 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -303,7 +312,12 @@ Arrays.asList( DFS_DATANODE_DATA_DIR_KEY, DFS_DATANODE_BALANCE_MAX_NUM_CONCURRENT_MOVES_KEY, - DFS_BLOCKREPORT_INTERVAL_MSEC_KEY)); + DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, + DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, Review comment: Sorry. I will fix it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 702800) Time Spent: 1h 10m (was: 1h) > Reconfig slow peer parameters for datanode > -- > > Key: HDFS-16396 > URL: https://issues.apache.org/jira/browse/HDFS-16396 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > In large clusters, rolling restart datanodes takes a long time. We can make > slow peers parameters and slow disks parameters in datanode reconfigurable to > facilitate cluster operation and maintenance. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=702749=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-702749 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 01/Jan/22 18:26 Start Date: 01/Jan/22 18:26 Worklog Time Spent: 10m Work Description: ayushtkn commented on a change in pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#discussion_r777127124 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -642,13 +656,76 @@ public String reconfigurePropertyImpl(String property, String newVal) } break; } +case DFS_DATANODE_PEER_STATS_ENABLED_KEY: +case DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY: +case DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY: +case DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY: + return reconfSlowPeerParameters(property, newVal); default: break; } throw new ReconfigurationException( property, newVal, getConf().get(property)); } + private String reconfSlowPeerParameters(String property, String newVal) + throws ReconfigurationException { +String result; +try { + LOG.info("Reconfiguring {} to {}", property, newVal); + if (property.equals(DFS_DATANODE_PEER_STATS_ENABLED_KEY)) { +checkNotNull(dnConf, "DNConf has not been initialized."); +if (newVal != null && !newVal.equalsIgnoreCase("true") +&& !newVal.equalsIgnoreCase("false")) { + throw new IllegalArgumentException("Not a valid Boolean value for " + property + + " in reconfSlowPeerParameters"); +} +boolean enable = (newVal == null ? 
DFS_DATANODE_PEER_STATS_ENABLED_DEFAULT : +Boolean.parseBoolean(newVal)); +result = Boolean.toString(enable); +dnConf.setPeerStatsEnabled(enable); +if (enable) { + if (peerMetrics == null) { +peerMetrics = DataNodePeerMetrics.create(getDisplayName(), getConf()); + } +} else { + peerMetrics = null; Review comment: `peerMetrics` isn't synchronised, and it is read by `BPServiceActor` and `BlockReceiver`. I suspect that under a race condition this may lead to an NPE there. ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -303,7 +312,12 @@ Arrays.asList( DFS_DATANODE_DATA_DIR_KEY, DFS_DATANODE_BALANCE_MAX_NUM_CONCURRENT_MOVES_KEY, - DFS_BLOCKREPORT_INTERVAL_MSEC_KEY)); + DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, + DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, Review comment: You added it twice. ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ## @@ -865,7 +942,7 @@ private void refreshVolumes(String newVolumes) throws IOException { .newFixedThreadPool(changedVolumes.newLocations.size()); List> exceptions = Lists.newArrayList(); - Preconditions.checkNotNull(data, "Storage not yet initialized"); + checkNotNull(data, "Storage not yet initialized"); Review comment: Let the imports stay as they are; no need to touch them. ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeReconfiguration.java ## @@ -365,4 +372,79 @@ public void testBlockReportIntervalReconfiguration() .getConf().get(DFS_BLOCKREPORT_INTERVAL_MSEC_KEY)); } } + + @Test + public void testSlowPeerParameters() + throws ReconfigurationException { +String[] slowPeersParameters = { +DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY, +DFS_DATANODE_SLOWPEER_LOW_THRESHOLD_MS_KEY, +DFS_DATANODE_PEER_METRICS_MIN_OUTLIER_DETECTION_SAMPLES_KEY}; + +for (int i = 0; i < NUM_DATA_NODE; i++) { + DataNode dn = cluster.getDataNodes().get(i); + + // Try invalid values.
+ try { +dn.reconfigureProperty(DFS_DATANODE_PEER_STATS_ENABLED_KEY, "text"); + } catch (ReconfigurationException expected) { +assertEquals("Could not change property dfs.datanode.peer.stats.enabled from 'true' to " + +"'text'", expected.getMessage()); + } Review comment: Use `LambdaTestUtils` instead for such cases -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id:
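The reason for the `LambdaTestUtils` suggestion is that a bare try/catch with no failure after the call passes silently when no exception is thrown at all. Hadoop's test utilities provide `LambdaTestUtils.intercept` for this; the sketch below uses a simplified local `intercept` written only for illustration, so that it runs without Hadoop on the classpath. `InterceptSketch`, `reconfigure`, and `DEFAULT_ENABLED` are hypothetical names, not the real DataNode API.

```java
import java.util.concurrent.Callable;

class InterceptSketch {
  /** Hypothetical default, used when the new value is null. */
  static final boolean DEFAULT_ENABLED = true;

  /**
   * Simplified stand-in for an intercept-style helper: runs the call and
   * fails unless it throws an exception of the expected type whose message
   * contains the expected text.
   */
  static <T, E extends Throwable> E intercept(
      Class<E> clazz, String contained, Callable<T> eval) {
    try {
      eval.call();
    } catch (Throwable t) {
      if (!clazz.isInstance(t)) {
        throw new AssertionError("Expected " + clazz.getName() + " but got " + t, t);
      }
      String message = String.valueOf(t.getMessage());
      if (contained != null && !message.contains(contained)) {
        throw new AssertionError(
            "Message '" + message + "' does not contain '" + contained + "'", t);
      }
      return clazz.cast(t);
    }
    // Reaching this point means nothing was thrown, which is the case the
    // bare try/catch in the reviewed test would miss.
    throw new AssertionError(
        "Expected " + clazz.getName() + " but the call completed normally");
  }

  /** Hypothetical reconfiguration that accepts only "true" or "false". */
  static String reconfigure(String newVal) {
    if (newVal != null && !newVal.equalsIgnoreCase("true")
        && !newVal.equalsIgnoreCase("false")) {
      throw new IllegalArgumentException("Not a valid Boolean value: " + newVal);
    }
    boolean enable = newVal == null ? DEFAULT_ENABLED : Boolean.parseBoolean(newVal);
    return Boolean.toString(enable);
  }
}
```

A test would then call `intercept(IllegalArgumentException.class, "Not a valid Boolean", () -> reconfigure("text"))` and could make further assertions on the returned exception.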
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=701274=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-701274 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Dec/21 11:35 Start Date: 27/Dec/21 11:35 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1001523369 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 49s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 35m 22s | | trunk passed | | +1 :green_heart: | compile | 1m 28s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 20s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 1m 0s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 27s | | trunk passed | | +1 :green_heart: | javadoc | 1m 2s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 34s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 19s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 36s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 18s | | the patch passed | | +1 :green_heart: | compile | 1m 23s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 23s | | the patch passed | | +1 :green_heart: | compile | 1m 14s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 1m 14s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 53s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/3/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 210 unchanged - 1 fixed = 211 total (was 211) | | +1 :green_heart: | mvnsite | 1m 18s | | the patch passed | | +1 :green_heart: | javadoc | 0m 53s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 24s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 26s | | the patch passed | | +1 :green_heart: | shadedclient | 25m 36s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 349m 39s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. 
| | | | 457m 47s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3827 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux c758bcecb8f3 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 06253c414a9787b21c9af282bda378ea79561181 | | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/3/testReport/ | | Max. process+thread count | 2092 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=701219=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-701219 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 27/Dec/21 02:18 Start Date: 27/Dec/21 02:18 Worklog Time Spent: 10m Work Description: tomscut commented on pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1001294355 Hi @jojochuang @ayushtkn @sodonnel @Hexiaoqiao @ferhui, could you please take a look at this? Thanks. Issue Time Tracking --- Worklog Id: (was: 701219) Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=701210=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-701210 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 26/Dec/21 21:21 Start Date: 26/Dec/21 21:21 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1001243371 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 50s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 36m 58s | | trunk passed | | +1 :green_heart: | compile | 1m 40s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 29s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 1m 3s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 40s | | trunk passed | | +1 :green_heart: | javadoc | 1m 10s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 41s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 49s | | trunk passed | | +1 :green_heart: | shadedclient | 27m 1s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 26s | | the patch passed | | +1 :green_heart: | compile | 1m 34s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 34s | | the patch passed | | +1 :green_heart: | compile | 1m 21s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 1m 21s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 55s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 111 unchanged - 1 fixed = 112 total (was 112) | | +1 :green_heart: | mvnsite | 1m 30s | | the patch passed | | +1 :green_heart: | javadoc | 1m 1s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 33s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 3m 49s | | the patch passed | | +1 :green_heart: | shadedclient | 26m 26s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 348m 51s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 41s | | The patch does not generate ASF License warnings. 
| | | | 463m 2s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3827 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux aead755190fa 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / d900689f2da92d8d8b43eee3a540ce63dbece62b | | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/2/testReport/ | | Max. process+thread count | 2491 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=701186=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-701186 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 26/Dec/21 12:23 Start Date: 26/Dec/21 12:23 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3827: URL: https://github.com/apache/hadoop/pull/3827#issuecomment-1001169447

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 55s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 37m 12s | | trunk passed |
| +1 :green_heart: | compile | 1m 39s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 1m 27s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 1m 4s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 38s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 12s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 38s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 48s | | trunk passed |
| +1 :green_heart: | shadedclient | 26m 27s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 30s | | the patch passed |
| +1 :green_heart: | compile | 1m 40s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 1m 40s | | the patch passed |
| +1 :green_heart: | compile | 1m 26s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 1m 26s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 58s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 111 unchanged - 1 fixed = 112 total (was 112) |
| +1 :green_heart: | mvnsite | 1m 35s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 1s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 34s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| -1 :x: | spotbugs | 3m 58s | [/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/1/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs.html) | hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 26m 54s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 354m 59s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. |
| | | 470m 4s | | |

| Reason | Tests |
|-------:|:------|
| SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | Exception is caught when Exception is not thrown in org.apache.hadoop.hdfs.server.datanode.DataNode.reconfSlowPeerParameters(String, String) At DataNode.java:[line 724] |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3827/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3827 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux e7b610ded91d 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC
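The SpotBugs finding in the report above flags a `catch (Exception e)` around code that throws no checked exception. One common way to address this kind of warning is to catch only the unchecked exceptions the block can actually raise; whether the pull request resolved it this way is not stated here. In the sketch below, `NarrowCatchSketch`, `ReconfigException`, and `reconfThreshold` are hypothetical names used for illustration only.

```java
class NarrowCatchSketch {
  /** Hypothetical stand-in for a checked reconfiguration exception. */
  static final class ReconfigException extends Exception {
    ReconfigException(String property, String newVal, Throwable cause) {
      super("Could not change property " + property + " to '" + newVal + "'", cause);
    }
  }

  /** Parses and validates a non-negative threshold given as a string. */
  static String reconfThreshold(String property, String newVal)
      throws ReconfigException {
    try {
      long threshold = Long.parseLong(newVal);
      if (threshold < 0) {
        throw new IllegalArgumentException(property + " must be non-negative");
      }
      return Long.toString(threshold);
    } catch (IllegalArgumentException e) {
      // NumberFormatException is a subclass of IllegalArgumentException, so
      // this single clause covers both parsing and validation failures
      // without the blanket catch of Exception that the finding describes.
      throw new ReconfigException(property, newVal, e);
    }
  }
}
```

Narrowing the catch also keeps unrelated runtime failures, such as a `NullPointerException` from a real bug, from being reported as a bad configuration value.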
[jira] [Work logged] (HDFS-16396) Reconfig slow peer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-16396?focusedWorklogId=701160=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-701160 ] ASF GitHub Bot logged work on HDFS-16396: - Author: ASF GitHub Bot Created on: 26/Dec/21 04:31 Start Date: 26/Dec/21 04:31 Worklog Time Spent: 10m Work Description: tomscut opened a new pull request #3827: URL: https://github.com/apache/hadoop/pull/3827 JIRA: [HDFS-16396](https://issues.apache.org/jira/browse/HDFS-16396). In large clusters, a rolling restart of datanodes takes a long time. We can make the slow-peer and slow-disk parameters of the datanode reconfigurable to facilitate cluster operation and maintenance. Issue Time Tracking --- Worklog Id: (was: 701160) Remaining Estimate: 0h Time Spent: 10m