[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771965=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771965 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 18/May/22 15:29
Start Date: 18/May/22 15:29
Worklog Time Spent: 10m
Work Description: saintstack commented on PR #4326:
URL: https://github.com/apache/hadoop/pull/4326#issuecomment-1130165900

Merged. Thank you for the cleanup @Hexiaoqiao

Issue Time Tracking
-------------------

Worklog Id: (was: 771965)
Time Spent: 8h 20m (was: 8h 10m)

> Data locality is lost when DataNode pod restarts in kubernetes
> --------------------------------------------------------------
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
> Time Spent: 8h 20m
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.

--
This message was sent by Atlassian Jira (v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
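Editorial illustration of the failure mode described in the issue text above. This is a minimal sketch under stated assumptions, not the Hadoop source and not the actual patch: `LocalityLossSketch`, `Registry`, `Node`, `register`, and `hasLocalNode` are hypothetical names, the UUID and IP values are made up, and the two plain `HashMap`s only stand in for the structures the report mentions (a registry that is refreshed on re-registration, and a `host2DatanodeMap`-like lookup keyed by IP). It shows how a lookup by the new IP can miss when the host-keyed map is left pointing at the old IP, and how refreshing that map restores the hit.

```java
import java.util.HashMap;
import java.util.Map;

public class LocalityLossSketch {

    /** Hypothetical stand-in for a DataNode descriptor. */
    static final class Node {
        final String uuid;
        String ip;

        Node(String uuid, String ip) {
            this.uuid = uuid;
            this.ip = ip;
        }
    }

    /** Hypothetical registry; it only mimics the two-map situation in the report. */
    static final class Registry {
        private final Map<String, Node> byUuid = new HashMap<>();
        private final Map<String, Node> byHost = new HashMap<>(); // plays the role of host2DatanodeMap
        private final boolean refreshHostMap;

        Registry(boolean refreshHostMap) {
            this.refreshHostMap = refreshHostMap;
        }

        /** Registers a node, or re-registers a known node that may have a new IP. */
        void register(String uuid, String ip) {
            Node existing = byUuid.get(uuid);
            if (existing == null) {
                Node node = new Node(uuid, ip);
                byUuid.put(uuid, node);
                byHost.put(ip, node);
                return;
            }
            String oldIp = existing.ip;
            existing.ip = ip; // the descriptor itself always sees the new address
            if (refreshHostMap && !oldIp.equals(ip)) {
                // Assumed remedy for this sketch: re-key the host map on an IP change.
                byHost.remove(oldIp);
                byHost.put(ip, existing);
            }
        }

        /** True if a client at clientIp would find a DataNode on its own host. */
        boolean hasLocalNode(String clientIp) {
            return byHost.containsKey(clientIp);
        }
    }

    /** Simulates a pod restart that changes the IP, then looks up by the new IP. */
    public static boolean localNodeFoundAfterRestart(boolean refreshHostMap) {
        Registry registry = new Registry(refreshHostMap);
        registry.register("dn-1", "10.0.0.5"); // first registration
        registry.register("dn-1", "10.0.0.9"); // same node, new pod IP
        return registry.hasLocalNode("10.0.0.9");
    }

    public static void main(String[] args) {
        System.out.println("host map left stale: " + localNodeFoundAfterRestart(false)); // false
        System.out.println("host map refreshed:  " + localNodeFoundAfterRestart(true));  // true
    }
}
```

In the stale case the co-located client's lookup by `10.0.0.9` finds nothing, which is the shape of the locality loss the reporter describes; whether the merged change takes exactly this remove-and-re-add form is not stated in this thread.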
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771964=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771964 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 18/May/22 15:28
Start Date: 18/May/22 15:28
Worklog Time Spent: 10m
Work Description: saintstack merged PR #4326:
URL: https://github.com/apache/hadoop/pull/4326

Issue Time Tracking
-------------------

Worklog Id: (was: 771964)
Time Spent: 8h 10m (was: 8h)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771894=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771894 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 18/May/22 14:13
Start Date: 18/May/22 14:13
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4326:
URL: https://github.com/apache/hadoop/pull/4326#issuecomment-1130072419

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 40s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | shadedclient | 29m 54s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | shadedclient | 19m 20s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 53s | | The patch does not generate ASF License warnings. |
| | | 52m 21s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4326/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4326 |
| Optional Tests | dupname asflicense codespell |
| uname | Linux 0b6857aa64db 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d394689a5b727f2b52272776dd950805e5427f14 |
| Max. process+thread count | 548 (vs. ulimit of 5500) |
| modules | C: . U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4326/1/console |
| versions | git=2.25.1 maven=3.6.3 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

Worklog Id: (was: 771894)
Time Spent: 8h (was: 7h 50m)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771872=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771872 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 18/May/22 13:21
Start Date: 18/May/22 13:21
Worklog Time Spent: 10m
Work Description: Hexiaoqiao commented on PR #4326:
URL: https://github.com/apache/hadoop/pull/4326#issuecomment-1130011537

cc @saintstack and @tomscut

Issue Time Tracking
-------------------

Worklog Id: (was: 771872)
Time Spent: 7h 50m (was: 7h 40m)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771871=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771871 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 18/May/22 13:19
Start Date: 18/May/22 13:19
Worklog Time Spent: 10m
Work Description: Hexiaoqiao opened a new pull request, #4326:
URL: https://github.com/apache/hadoop/pull/4326

### Description of PR
Remove the accidentally introduced file '.BUILDING.txt.swp'.

### How was this patch tested?
No tests added, since there are no source code changes.

### For code changes:
- [Y] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [N] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [N] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [N] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?

Issue Time Tracking
-------------------

Worklog Id: (was: 771871)
Time Spent: 7h 40m (was: 7.5h)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771869=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771869 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 18/May/22 13:11
Start Date: 18/May/22 13:11
Worklog Time Spent: 10m
Work Description: Hexiaoqiao commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-112819

I don't get what changes here. Maybe I missed something? Thanks.

Issue Time Tracking
-------------------

Worklog Id: (was: 771869)
Time Spent: 7.5h (was: 7h 20m)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771865=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771865 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 18/May/22 13:02
Start Date: 18/May/22 13:02
Worklog Time Spent: 10m
Work Description: Hexiaoqiao commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1129990002

Sorry for the late response. I just found that this PR included an unrelated file, '.BUILDING.txt.swp', under the root path of the project. If there are no other concerns, I would like to remove it shortly.

Issue Time Tracking
-------------------

Worklog Id: (was: 771865)
Time Spent: 7h 20m (was: 7h 10m)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771494=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771494 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 17/May/22 17:59
Start Date: 17/May/22 17:59
Worklog Time Spent: 10m
Work Description: huaxiangsun commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1129159135

Thanks a lot, @saintstack!

Issue Time Tracking
-------------------

Worklog Id: (was: 771494)
Time Spent: 7h 10m (was: 7h)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770656=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770656 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 16/May/22 04:32
Start Date: 16/May/22 04:32
Worklog Time Spent: 10m
Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1127210267

I ran the two tests below in loops locally:

TestBPOfferService.testMissBlocksWhenReregister
TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks

The first failed once out of ten cycles, both with the patch in place and without it (jibes w/ what we see here in test runs, where sometimes it fails but not always).

TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks shows up consistently, but when I run it locally in multiple cycles, it passes whether the patch is applied or not. I see that in the last full branch-3.3 run, back on May 5th (https://ci-hadoop.apache.org/job/hadoop-qbt-branch-3.3-java8-linux-x86_64/54/), it failed for the same reason (the May 12th run was incomplete). This test is about block replication, whereas the PR here is about a minor adjustment in NN node accounting. Unrelated, I'd say.

Pushing the backport. Will push in the morning.

Issue Time Tracking
-------------------

Worklog Id: (was: 770656)
Time Spent: 6h 50m (was: 6h 40m)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770657=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770657 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 16/May/22 04:32
Start Date: 16/May/22 04:32
Worklog Time Spent: 10m
Work Description: saintstack merged PR #4246:
URL: https://github.com/apache/hadoop/pull/4246

Issue Time Tracking
-------------------

Worklog Id: (was: 770657)
Time Spent: 7h (was: 6h 50m)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770639 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 16/May/22 02:31
Start Date: 16/May/22 02:31
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1127151370

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 40s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 35m 58s | | branch-3.3 passed |
| +1 :green_heart: | compile | 1m 34s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 1m 14s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 1m 39s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 52s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 35s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 27m 7s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 20s | | the patch passed |
| +1 :green_heart: | compile | 1m 14s | | the patch passed |
| +1 :green_heart: | javac | 1m 14s | | the patch passed |
| -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 :green_heart: | checkstyle | 0m 48s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 20s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 33s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 17s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 19s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 190m 25s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 14s | | The patch does not generate ASF License warnings. |
| | | 298m 53s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 8f80b78e9fdd 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / a50952249eecd4f1d1f0479f1ca1643c3dbec925 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/testReport/ |
| Max. process+thread count | 3224 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

Worklog Id: (was: 770639)
Time Spent: 6h 40m (was: 6.5h)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770474=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770474 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 14/May/22 05:53
Start Date: 14/May/22 05:53
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1126648700

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 38s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 37m 26s | | branch-3.3 passed |
| +1 :green_heart: | compile | 1m 31s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 1m 13s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 1m 40s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 57s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 38s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 26m 54s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 23s | | the patch passed |
| +1 :green_heart: | compile | 1m 17s | | the patch passed |
| +1 :green_heart: | javac | 1m 17s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 49s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 24s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 30s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 22s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 23s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 190m 22s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/13/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 14s | | The patch does not generate ASF License warnings. |
| | | 300m 22s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
| | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/13/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 60cd37572baf 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 21686a25ac0fc811c894b03354592b811e84b1eb |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/13/testReport/ |
| Max. process+thread count | 3589 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/13/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

Worklog Id: (was: 770474)
Time Spent: 6.5h (was: 6h 20m)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770425=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770425 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 13/May/22 23:14
Start Date: 13/May/22 23:14
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1126569964

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 43s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 35m 26s | | branch-3.3 passed |
| +1 :green_heart: | compile | 1m 31s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 1m 13s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 1m 41s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 53s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 36s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 27m 19s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 21s | | the patch passed |
| +1 :green_heart: | compile | 1m 16s | | the patch passed |
| +1 :green_heart: | javac | 1m 16s | | the patch passed |
| -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 :green_heart: | checkstyle | 0m 46s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 21s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 28s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 18s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 21s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 190m 47s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 15s | | The patch does not generate ASF License warnings. |
| | | 299m 1s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
| | hadoop.hdfs.server.namenode.ha.TestHAAppend |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux bdf2fce93bdd 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 007c9e844ffed2e12691b6774fd738688e7c1c06 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/testReport/ |
| Max. process+thread count | (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

Worklog Id: (was: 770425)
Time Spent: 6h 20m (was: 6h 10m)
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770312=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770312 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 13/May/22 18:12
Start Date: 13/May/22 18:12
Worklog Time Spent: 10m
Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1126312782

Two failures:

TestBPOfferService.testMissBlocksWhenReregister
TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks

They come up often enough. Let me try again. Meantime running locally.

Issue Time Tracking
-------------------

Worklog Id: (was: 770312)
Time Spent: 6h 10m (was: 6h)

> Data locality is lost when DataNode pod restarts in kubernetes
> --------------------------------------------------------------
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
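The failure mode in the issue description (the pod gets a new ip, some maps are refreshed, host2DatanodeMap is not, so a lookup by the new ip misses) can be illustrated with a toy model. This is only a sketch under stated assumptions: `StaleHostMapSketch`, `ipByNodeId`, `nodeIdByIp`, `register`, and `findLocal` are hypothetical names invented for illustration, not the real DatanodeManager fields or methods, and the `refreshHostMap` branch is one plausible way to keep the maps consistent, not necessarily what PR #4246 does.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical names, not Hadoop's DatanodeManager API.
class StaleHostMapSketch {
    // Stand-in for the maps that ARE refreshed on re-registration.
    final Map<String, String> ipByNodeId = new HashMap<>();
    // Stand-in for host2DatanodeMap: ip -> node id.
    final Map<String, String> nodeIdByIp = new HashMap<>();

    // Register or re-register a node. With refreshHostMap=false the ip-keyed
    // map keeps the old ip, which mimics the behavior described in the issue.
    void register(String nodeId, String ip, boolean refreshHostMap) {
        String oldIp = ipByNodeId.put(nodeId, ip);
        if (oldIp == null) {
            nodeIdByIp.put(ip, nodeId); // first registration populates both maps
            return;
        }
        if (refreshHostMap && !oldIp.equals(ip)) {
            nodeIdByIp.remove(oldIp);   // drop the stale key
            nodeIdByIp.put(ip, nodeId); // index the node under its new ip
        }
    }

    // A client co-located with the node asks for "its" DataNode by ip.
    String findLocal(String clientIp) {
        return nodeIdByIp.get(clientIp);
    }

    public static void main(String[] args) {
        StaleHostMapSketch stale = new StaleHostMapSketch();
        stale.register("dn-1", "10.0.0.5", false);
        stale.register("dn-1", "10.0.0.9", false); // pod restart, new ip
        System.out.println("stale lookup:     " + stale.findLocal("10.0.0.9"));

        StaleHostMapSketch refreshed = new StaleHostMapSketch();
        refreshed.register("dn-1", "10.0.0.5", true);
        refreshed.register("dn-1", "10.0.0.9", true);
        System.out.println("refreshed lookup: " + refreshed.findLocal("10.0.0.9"));
    }
}
```

In this model the stale variant returns `null` for the new ip, which corresponds to the client failing to find a local DataNode; the refreshed variant returns `dn-1`.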
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=769499=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769499 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 12/May/22 09:26
Start Date: 12/May/22 09:26
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1124741659

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 38s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 36m 42s | | branch-3.3 passed |
| +1 :green_heart: | compile | 1m 33s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 1m 13s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 1m 40s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 58s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 39s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 27m 1s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 20s | | the patch passed |
| +1 :green_heart: | compile | 1m 16s | | the patch passed |
| +1 :green_heart: | javac | 1m 16s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 50s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 23s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 26s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 19s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 13s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 193m 17s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/11/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 15s | | The patch does not generate ASF License warnings. |
| | | 302m 10s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/11/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 14ee2742708c 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 53773ea019ca5ed793d36035c7adbfe589f5926c |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/11/testReport/ |
| Max. process+thread count | 2983 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/11/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

Worklog Id: (was: 769499)
Time Spent: 6h (was: 5h 50m)

> Data locality is lost when DataNode pod restarts in kubernetes
> --------------------------------------------------------------
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=769414=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769414 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 12/May/22 04:20
Start Date: 12/May/22 04:20
Worklog Time Spent: 10m
Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1124514541

Removing a space had us run more tests and 4 tests failed instead of 44 on previous run.

Below is the change in last run.

```
From eb904f3adaa55d44aa6494ad116344317e9ec882 Mon Sep 17 00:00:00 2001
From: stack
Date: Wed, 11 May 2022 15:42:18 -0700
Subject: [PATCH] Remove a space at end of line inside a comment
```

Issue Time Tracking
-------------------

Worklog Id: (was: 769414)
Time Spent: 5h 50m (was: 5h 40m)

> Data locality is lost when DataNode pod restarts in kubernetes
> --------------------------------------------------------------
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 5h 50m
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=769407=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769407 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 12/May/22 03:15
Start Date: 12/May/22 03:15
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1124486913

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 45s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| -1 :x: | mvninstall | 6m 15s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/branch-mvninstall-root.txt) | root in branch-3.3 failed. |
| -1 :x: | compile | 0m 30s | [/branch-compile-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in branch-3.3 failed. |
| -0 :warning: | checkstyle | 0m 27s | [/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | The patch fails to run checkstyle in hadoop-hdfs |
| +1 :green_heart: | mvnsite | 3m 14s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 55s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 4m 0s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 29m 31s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 21s | | the patch passed |
| +1 :green_heart: | compile | 1m 14s | | the patch passed |
| -1 :x: | javac | 1m 14s | [/results-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/results-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project_hadoop-hdfs generated 567 new + 0 unchanged - 0 fixed = 567 total (was 0) |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 55s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 52 new + 0 unchanged - 0 fixed = 52 total (was 0) |
| +1 :green_heart: | mvnsite | 1m 22s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 27s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 15s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 0s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 189m 51s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 14s | | The patch does not generate ASF License warnings. |
| | | 270m 37s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestHAAppend |
| | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 691ecf7100cf 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=769254=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769254 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 11/May/22 19:22
Start Date: 11/May/22 19:22
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1124201434

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 44s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 36m 27s | | branch-3.3 passed |
| +1 :green_heart: | compile | 1m 33s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 1m 13s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 1m 45s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 52s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 41s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 27m 12s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 22s | | the patch passed |
| +1 :green_heart: | compile | 1m 17s | | the patch passed |
| +1 :green_heart: | javac | 1m 17s | | the patch passed |
| -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 :green_heart: | checkstyle | 0m 45s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 17s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 29s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 59s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 9s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 34m 55s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +0 :ok: | asflicense | 0m 41s | | ASF License check generated no output? |
| | | 150m 13s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.tools.TestECAdmin |
| | hadoop.hdfs.tools.TestViewFileSystemOverloadSchemeWithDFSAdmin |
| | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerWithStripedBlocks |
| | hadoop.cli.TestHDFSCLI |
| | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
| | hadoop.hdfs.TestBlockStoragePolicy |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 80c88472574d 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 53fdbf60a26f18341150743f860b0713ca2d632a |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/testReport/ |
| Max. process+thread count | 1567 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

Worklog Id: (was: 769254)
Time Spent: 5.5h (was: 5h 20m)

> Data locality is lost when DataNode
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768818=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768818 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 11/May/22 02:19
Start Date: 11/May/22 02:19
Worklog Time Spent: 10m
Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1123106711

A noop patch has these failures:

```
org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.testUpgradeFromRel1BBWImage
org.apache.hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor.testDecommissionWithMissingBlock
org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade.testWithLayoutChangeAndFinalize
```

A push with the backport has these failures:

```
org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks
org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade.testWithLayoutChangeAndRollback
```

One overlap: TestDataNodeRollingUpgrade. Other seems unrelated.

Let me try a repush.

Issue Time Tracking
-------------------

Worklog Id: (was: 768818)
Time Spent: 5h 20m (was: 5h 10m)

> Data locality is lost when DataNode pod restarts in kubernetes
> --------------------------------------------------------------
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 5h 20m
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
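The comparison in the comment above (noop-patch failures versus backport failures, with one overlapping test class) can be reproduced mechanically. The following is a minimal sketch, assuming the failure lists are available as plain strings; `FailureOverlap` and its helper methods are hypothetical names introduced here, not part of Hadoop or Yetus. It compares at class level, which is how the comment reads: the two runs fail different methods of TestDataNodeRollingUpgrade, so a method-level comparison would report no overlap.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for comparing two runs' failing tests by class name.
class FailureOverlap {
    // Failing test methods copied from the comment above.
    static final List<String> NOOP_FAILURES = Arrays.asList(
        "org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.testUpgradeFromRel1BBWImage",
        "org.apache.hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor.testDecommissionWithMissingBlock",
        "org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade.testWithLayoutChangeAndFinalize");

    static final List<String> BACKPORT_FAILURES = Arrays.asList(
        "org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks",
        "org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade.testWithLayoutChangeAndRollback");

    // Reduce "pkg.Class.method" to the simple class name "Class".
    static String simpleClassName(String qualifiedMethod) {
        String[] parts = qualifiedMethod.split("\\.");
        return parts[parts.length - 2];
    }

    // Collect the distinct failing classes of one run, sorted for stable output.
    static Set<String> classes(List<String> failures) {
        Set<String> out = new TreeSet<>();
        for (String failure : failures) {
            out.add(simpleClassName(failure));
        }
        return out;
    }

    // Classes that fail in both runs.
    static Set<String> overlap(List<String> a, List<String> b) {
        Set<String> common = classes(a);
        common.retainAll(classes(b));
        return common;
    }

    public static void main(String[] args) {
        // prints [TestDataNodeRollingUpgrade]
        System.out.println(overlap(NOOP_FAILURES, BACKPORT_FAILURES));
    }
}
```

A class that fails in both the noop run and the patched run is a likely flaky test rather than a regression from the patch, which is the reasoning behind retrying the push.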
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768809=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768809 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 11/May/22 01:47
Start Date: 11/May/22 01:47
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1123090143

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 40s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 36m 23s | | branch-3.3 passed |
| +1 :green_heart: | compile | 1m 32s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 1m 12s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 1m 37s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 56s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 32s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 27m 3s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 22s | | the patch passed |
| +1 :green_heart: | compile | 1m 12s | | the patch passed |
| +1 :green_heart: | javac | 1m 12s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 49s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 22s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 32s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 15s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 15s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 190m 33s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 13s | | The patch does not generate ASF License warnings. |
| | | 299m 15s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
| | hadoop.hdfs.TestDFSUpgradeFromImage |
| | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/8/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 90a26f48a12b 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 2966e5350569cbcc88f66382d19537201deff0e7 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/8/testReport/ |
| Max. process+thread count | 3708 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/8/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

Worklog Id: (was: 768809)
Time Spent: 5h 10m (was: 5h)

> Data locality is lost when DataNode pod restarts in kubernetes
> --------------------------------------------------------------
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter:
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768692=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768692 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 10/May/22 20:18
Start Date: 10/May/22 20:18
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1122821124

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 43s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 36m 30s | | branch-3.3 passed |
| +1 :green_heart: | compile | 1m 32s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 1m 13s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 1m 40s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 51s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 36s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 27m 14s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 22s | | the patch passed |
| +1 :green_heart: | compile | 1m 15s | | the patch passed |
| +1 :green_heart: | javac | 1m 15s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 48s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 21s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 31s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 16s | | the patch passed |
| +1 :green_heart: | shadedclient | 25m 57s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 190m 14s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/7/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 12s | | The patch does not generate ASF License warnings. |
| | | 299m 0s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/7/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 3b44c5c05aa1 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 306eb361f6831f1d7722049133e89a2184b47cd1 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/7/testReport/ |
| Max. process+thread count | 3175 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/7/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

Worklog Id: (was: 768692)
Time Spent: 5h (was: 4h 50m)

> Data locality is lost when DataNode pod restarts in kubernetes
> --------------------------------------------------------------
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768543=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768543 ]

ASF GitHub Bot logged work on HDFS-16540:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 10/May/22 15:02
Start Date: 10/May/22 15:02
Worklog Time Spent: 10m
Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1122514860

Node seems to be having issues...

java.lang.OutOfMemoryError: unable to create new native thread

Let me try a new push.

Issue Time Tracking
-------------------

Worklog Id: (was: 768543)
Time Spent: 4h 50m (was: 4h 40m)

> Data locality is lost when DataNode pod restarts in kubernetes
> --------------------------------------------------------------
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4h 50m
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768385=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768385 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 10/May/22 09:45 Start Date: 10/May/22 09:45 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4246: URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1122168129

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 7m 6s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 36m 34s | | branch-3.3 passed |
| +1 :green_heart: | compile | 1m 32s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 1m 12s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 1m 40s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 58s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 32s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 26m 34s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 19s | | the patch passed |
| +1 :green_heart: | compile | 1m 13s | | the patch passed |
| +1 :green_heart: | javac | 1m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 49s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 21s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 34s | | the patch passed |
| +1 :green_heart: | spotbugs | 3m 17s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 12s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 189m 1s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 9s | | The patch does not generate ASF License warnings. |
| | | 303m 48s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.server.datanode.TestRefreshNamenodes |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.hdfs.server.datanode.TestDataNodeReconfiguration |
| | hadoop.hdfs.server.datanode.TestTransferRbw |
| | hadoop.hdfs.server.datanode.TestBatchIbr |
| | hadoop.hdfs.server.datanode.TestReadOnlySharedStorage |
| | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |
| | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux fabca6d9bc43 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / e9049bdf18516093b053cd83bd2374262cb2519c |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/6/testReport/ |
| Max. process+thread count | 3227 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output |
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=767411=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767411 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 06/May/22 21:03 Start Date: 06/May/22 21:03 Worklog Time Spent: 10m Work Description: saintstack commented on PR #4246: URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1120006228 Looking at these test failures: * on a full run Issue Time Tracking --- Worklog Id: (was: 767411) Time Spent: 4.5h (was: 4h 20m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 4.5h > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=766514=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-766514 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 05/May/22 08:45 Start Date: 05/May/22 08:45 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4246: URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1118312024

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 38s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +0 :ok: | mvndep | 15m 30s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 23m 51s | | branch-3.3 passed |
| +1 :green_heart: | compile | 18m 4s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 3m 21s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 20m 22s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 8m 34s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 35m 24s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 55m 48s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 30s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 22m 54s | | the patch passed |
| +1 :green_heart: | compile | 17m 42s | | the patch passed |
| +1 :green_heart: | javac | 17m 42s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 3m 8s | | the patch passed |
| +1 :green_heart: | mvnsite | 19m 55s | | the patch passed |
| +1 :green_heart: | javadoc | 7m 50s | | the patch passed |
| +1 :green_heart: | spotbugs | 35m 55s | | the patch passed |
| +1 :green_heart: | shadedclient | 56m 14s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 308m 42s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/5/artifact/out/patch-unit-root.txt) | root in the patch failed. |
| +0 :ok: | asflicense | 1m 11s | | ASF License check generated no output? |
| | | 596m 16s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestAMRMTokens |
| | hadoop.yarn.server.resourcemanager.TestRMHAForAsyncScheduler |
| | hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter |
| | hadoop.yarn.server.resourcemanager.TestResourceTrackerService |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
| | hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL |
| | hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1 |
| | hadoop.yarn.server.timeline.webapp.TestTimelineWebServices |
| | hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer |
| | hadoop.crypto.key.kms.server.TestKMS |
| | hadoop.crypto.key.kms.server.TestKMSWithZK |
| | hadoop.hdfs.server.federation.router.TestRouterRpc |
| | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
| | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
| | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/5/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
| Optional Tests | dupname asflicense codespell compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle |
| uname | Linux 69ddfeecc933 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 0f8ad00d4c5a1d15bc4bbe1be7b8a93163053362 |
| Default Java | Private
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=766140=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-766140 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 04/May/22 16:29 Start Date: 04/May/22 16:29 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4246: URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1117559675

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 7m 2s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +0 :ok: | mvndep | 15m 28s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 23m 59s | | branch-3.3 passed |
| +1 :green_heart: | compile | 17m 59s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 3m 19s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 20m 20s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 8m 0s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 35m 23s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 55m 50s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 22m 36s | | the patch passed |
| +1 :green_heart: | compile | 17m 28s | | the patch passed |
| +1 :green_heart: | javac | 17m 28s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 3m 11s | | the patch passed |
| +1 :green_heart: | mvnsite | 20m 9s | | the patch passed |
| +1 :green_heart: | javadoc | 7m 57s | | the patch passed |
| +1 :green_heart: | spotbugs | 35m 51s | | the patch passed |
| +1 :green_heart: | shadedclient | 55m 52s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 662m 48s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/4/artifact/out/patch-unit-root.txt) | root in the patch passed. |
| +1 :green_heart: | asflicense | 2m 40s | | The patch does not generate ASF License warnings. |
| | | 956m 39s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.yarn.server.router.webapp.TestRouterWebServicesREST |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication |
| | hadoop.yarn.server.resourcemanager.TestRMHA |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
| | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.TestClientRMService |
| | hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL |
| | hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1 |
| | hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer |
| | hadoop.yarn.client.TestGetGroups |
| | hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl |
| | hadoop.crypto.key.kms.server.TestKMS |
| | hadoop.crypto.key.kms.server.TestKMSWithZK |
| | hadoop.hdfs.server.federation.router.TestRouterRpc |
| | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
| | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
| | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.mapred.TestLocalDistributedCacheManager |
| | hadoop.yarn.sls.appmaster.TestAMSimulator |
| | hadoop.yarn.sls.TestSLSRunner |
| | hadoop.yarn.sls.TestSLSDagAMSimulator |
| | hadoop.yarn.sls.TestReservationSystemInvariants |
| | hadoop.yarn.sls.TestSLSStreamAMSynth |
| | hadoop.yarn.sls.TestSLSGenericSynth |
| | hadoop.yarn.sls.nodemanager.TestNMSimulator |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=764660=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-764660 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 30/Apr/22 12:26 Start Date: 30/Apr/22 12:26 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4246: URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1113980465

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 39s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +0 :ok: | mvndep | 14m 31s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 23m 49s | | branch-3.3 passed |
| +1 :green_heart: | compile | 18m 8s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 3m 20s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 20m 21s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 8m 8s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 35m 15s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 55m 37s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 36s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 22m 39s | | the patch passed |
| +1 :green_heart: | compile | 17m 33s | | the patch passed |
| +1 :green_heart: | javac | 17m 33s | | the patch passed |
| +1 :green_heart: | blanks | 0m 1s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 3m 12s | | the patch passed |
| +1 :green_heart: | mvnsite | 20m 0s | | the patch passed |
| +1 :green_heart: | javadoc | 8m 23s | | the patch passed |
| +1 :green_heart: | spotbugs | 35m 44s | | the patch passed |
| +1 :green_heart: | shadedclient | 55m 48s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 660m 38s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/2/artifact/out/patch-unit-root.txt) | root in the patch passed. |
| +1 :green_heart: | asflicense | 2m 42s | | The patch does not generate ASF License warnings. |
| | | 947m 46s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.yarn.server.router.webapp.TestRouterWebServicesREST |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication |
| | hadoop.yarn.server.resourcemanager.TestRMHA |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
| | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.TestClientRMService |
| | hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL |
| | hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1 |
| | hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer |
| | hadoop.yarn.client.TestGetGroups |
| | hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl |
| | hadoop.crypto.key.kms.server.TestKMS |
| | hadoop.crypto.key.kms.server.TestKMSWithZK |
| | hadoop.hdfs.server.federation.router.TestRouterRpc |
| | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
| | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.mapred.TestLocalDistributedCacheManager |
| | hadoop.yarn.sls.appmaster.TestAMSimulator |
| | hadoop.yarn.sls.TestSLSRunner |
| | hadoop.yarn.sls.TestSLSDagAMSimulator |
| | hadoop.yarn.sls.TestReservationSystemInvariants |
| | hadoop.yarn.sls.TestSLSStreamAMSynth |
| | hadoop.yarn.sls.TestSLSGenericSynth |
| | hadoop.yarn.sls.nodemanager.TestNMSimulator |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base:
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=764073=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-764073 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 29/Apr/22 08:26 Start Date: 29/Apr/22 08:26 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4246: URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1113018015

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 6m 55s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +0 :ok: | mvndep | 14m 29s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 23m 53s | | branch-3.3 passed |
| +1 :green_heart: | compile | 18m 9s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 3m 23s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 20m 24s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 8m 2s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 35m 29s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 59m 16s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 22m 45s | | the patch passed |
| +1 :green_heart: | compile | 17m 35s | | the patch passed |
| +1 :green_heart: | javac | 17m 35s | | the patch passed |
| +1 :green_heart: | blanks | 0m 1s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 3m 6s | | the patch passed |
| +1 :green_heart: | mvnsite | 19m 49s | | the patch passed |
| +1 :green_heart: | javadoc | 7m 57s | | the patch passed |
| +1 :green_heart: | spotbugs | 35m 39s | | the patch passed |
| +1 :green_heart: | shadedclient | 55m 59s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 579m 22s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/1/artifact/out/patch-unit-root.txt) | root in the patch failed. |
| +0 :ok: | asflicense | 1m 41s | | ASF License check generated no output? |
| | | 874m 55s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.yarn.server.router.webapp.TestRouterWebServicesREST |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication |
| | hadoop.yarn.server.resourcemanager.TestRMHA |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
| | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.TestClientRMService |
| | hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL |
| | hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1 |
| | hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer |
| | hadoop.yarn.client.TestGetGroups |
| | hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl |
| | hadoop.crypto.key.kms.server.TestKMS |
| | hadoop.crypto.key.kms.server.TestKMSWithZK |
| | hadoop.hdfs.server.federation.router.TestRouterRpc |
| | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
| | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
| | hadoop.hdfs.server.namenode.TestFileTruncate |
| | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
| | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
| | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.mapreduce.v2.TestMRJobs |
| | hadoop.mapreduce.v2.TestMRJobsWithProfiler |
| | hadoop.mapreduce.v2.TestRMNMInfo |
| | hadoop.mapreduce.v2.TestMRAppWithCombiner |
| | hadoop.mapreduce.v2.TestSpeculativeExecutionWithMRApp |
| |
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=763760=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763760 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 28/Apr/22 17:50 Start Date: 28/Apr/22 17:50 Worklog Time Spent: 10m Work Description: saintstack opened a new pull request, #4246: URL: https://github.com/apache/hadoop/pull/4246 ### Description of PR Cherry-pick of 9ed8d60511dccf96108239c5c96e108a7d4bc975 ### How was this patch tested? ### For code changes: - [ ] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')? - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files? Issue Time Tracking --- Worklog Id: (was: 763760) Time Spent: 3h 40m (was: 3.5h) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 3h 40m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly.
When hdfs client > with the new ip tries to find a local DataNode, it fails. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=763731=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763731 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 28/Apr/22 17:24 Start Date: 28/Apr/22 17:24 Worklog Time Spent: 10m Work Description: saintstack merged PR #4170: URL: https://github.com/apache/hadoop/pull/4170 Issue Time Tracking --- Worklog Id: (was: 763731) Time Spent: 3.5h (was: 3h 20m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 3.5h > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=762942=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762942 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 27/Apr/22 14:56 Start Date: 27/Apr/22 14:56 Worklog Time Spent: 10m Work Description: saintstack commented on PR #4170: URL: https://github.com/apache/hadoop/pull/4170#issuecomment-106375 I'll merge later today (unless someone else beats me to it). Issue Time Tracking --- Worklog Id: (was: 762942) Time Spent: 3h 20m (was: 3h 10m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=762435=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762435 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 26/Apr/22 17:14 Start Date: 26/Apr/22 17:14 Worklog Time Spent: 10m Work Description: huaxiangsun commented on PR #4170: URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1110051006 Any more comments? Thanks. Issue Time Tracking --- Worklog Id: (was: 762435) Time Spent: 3h 10m (was: 3h) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=762042=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762042 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 25/Apr/22 21:13 Start Date: 25/Apr/22 21:13 Worklog Time Spent: 10m Work Description: huaxiangsun commented on PR #4170: URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1109046711 I ran the failed test [.TestReplaceDatanodeFailureReplication.testWithOnlyLastDatanodeIsAlive] locally multiple times, and it passed. Issue Time Tracking --- Worklog Id: (was: 762042) Time Spent: 3h (was: 2h 50m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=761326=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761326 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 23/Apr/22 18:44 Start Date: 23/Apr/22 18:44 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4170: URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1107606815

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 56s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 16m 1s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 28m 2s | | trunk passed |
| +1 :green_heart: | compile | 24m 51s | | trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 21m 32s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 29s | | trunk passed |
| +1 :green_heart: | mvnsite | 19m 53s | | trunk passed |
| -1 :x: | javadoc | 1m 37s | [/branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) | root in trunk failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04. |
| +1 :green_heart: | javadoc | 8m 28s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 38m 52s | | trunk passed |
| +1 :green_heart: | shadedclient | 57m 54s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 26m 0s | | the patch passed |
| +1 :green_heart: | compile | 24m 27s | | the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 24m 27s | | the patch passed |
| +1 :green_heart: | compile | 21m 38s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 21m 38s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 4m 20s | | the patch passed |
| +1 :green_heart: | mvnsite | 19m 30s | | the patch passed |
| -1 :x: | javadoc | 1m 26s | [/patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) | root in the patch failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04. |
| +1 :green_heart: | javadoc | 8m 58s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 39m 23s | | the patch passed |
| +1 :green_heart: | shadedclient | 57m 59s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 1054m 18s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/patch-unit-root.txt) | root in the patch passed. |
| +1 :green_heart: | asflicense | 2m 16s | | The patch does not generate ASF License warnings. |
| | | 1417m 34s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.TestReplaceDatanodeFailureReplication |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4170 |
| Optional Tests | dupname asflicense codespell compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle |
| uname | Linux 95b69934c683 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=761028=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761028 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 22/Apr/22 19:04 Start Date: 22/Apr/22 19:04 Worklog Time Spent: 10m Work Description: huaxiangsun commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r856498429 ## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java: ## @@ -138,6 +138,35 @@ public void testNumVersionsCorrectAfterReregister() mapToCheck.get("version1").intValue(), 1); } + /** + * This test checks that if a node is re-registered with a different ip, its + * host2DatanodeMap is correctly updated with the new ip. + */ + @Test Review Comment: Yeah, the NetworkTopology map is another such map. It is quite complicated to add a unit test for it, so I am going to leave it to the HDFS experts to decide whether a unit test is needed. Issue Time Tracking --- Worklog Id: (was: 761028) Time Spent: 2h 40m (was: 2.5h) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=761024=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761024 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 22/Apr/22 18:56 Start Date: 22/Apr/22 18:56 Worklog Time Spent: 10m Work Description: huaxiangsun commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r856493693 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ## @@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg) nodes with its data cleared (or user can just remove the StorageID value in "VERSION" file under the data directory of the datanode, but this is might not work if VERSION file format has changed - */ + */ + // Check if nodeS's host information is same as nodeReg's, if not, + // it needs to update host2DatanodeMap accordringly. + updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr()); + NameNode.stateChangeLog.info("BLOCK* registerDatanode: " + nodeS + " is replaced by " + nodeReg + " with the same storageID " - + nodeReg.getDatanodeUuid()); + + nodeReg.getDatanodeUuid() + ", updateHost2DatanodeMap: " + updateHost2DatanodeMap); Review Comment: I am going to upload the patch which does not log updateHost2DatanodeMap. Issue Time Tracking --- Worklog Id: (was: 761024) Time Spent: 2.5h (was: 2h 20m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. 
When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=761022=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761022 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 22/Apr/22 18:55 Start Date: 22/Apr/22 18:55 Worklog Time Spent: 10m Work Description: huaxiangsun commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r856493317 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ## @@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg) nodes with its data cleared (or user can just remove the StorageID value in "VERSION" file under the data directory of the datanode, but this is might not work if VERSION file format has changed - */ + */ + // Check if nodeS's host information is same as nodeReg's, if not, + // it needs to update host2DatanodeMap accordringly. + updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr()); Review Comment: Yeah, since nodeS is not null, nodeS.getXferAddr() cannot be null. Issue Time Tracking --- Worklog Id: (was: 761022) Time Spent: 2h 20m (was: 2h 10m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. 
When hdfs client > with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=760247=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-760247 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 21/Apr/22 17:33 Start Date: 21/Apr/22 17:33 Worklog Time Spent: 10m Work Description: huaxiangsun commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r855427229 ## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java: ## @@ -138,6 +138,35 @@ public void testNumVersionsCorrectAfterReregister() mapToCheck.get("version1").intValue(), 1); } + /** + * This test checks that if a node is re-registered with a different ip, its + * host2DatanodeMap is correctly updated with the new ip. + */ + @Test Review Comment: When I checked the code, I found that there is another map, the network topology map; its update code is what I followed for this host2DatanodeMap change. Let me check. I would rather avoid adding more test cases for code that is already there. Issue Time Tracking --- Worklog Id: (was: 760247) Time Spent: 2h 10m (was: 2h) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=760243=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-760243 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 21/Apr/22 17:28 Start Date: 21/Apr/22 17:28 Worklog Time Spent: 10m Work Description: huaxiangsun commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r855423782 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ## @@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg) nodes with its data cleared (or user can just remove the StorageID value in "VERSION" file under the data directory of the datanode, but this is might not work if VERSION file format has changed - */ + */ + // Check if nodeS's host information is same as nodeReg's, if not, + // it needs to update host2DatanodeMap accordringly. + updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr()); + NameNode.stateChangeLog.info("BLOCK* registerDatanode: " + nodeS + " is replaced by " + nodeReg + " with the same storageID " - + nodeReg.getDatanodeUuid()); + + nodeReg.getDatanodeUuid() + ", updateHost2DatanodeMap: " + updateHost2DatanodeMap); Review Comment: Yeah, agree. Let me undo this change. Issue Time Tracking --- Worklog Id: (was: 760243) Time Spent: 2h (was: 1h 50m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. 
After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=760242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-760242 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 21/Apr/22 17:27 Start Date: 21/Apr/22 17:27 Worklog Time Spent: 10m Work Description: huaxiangsun commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r855422776 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ## @@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg) nodes with its data cleared (or user can just remove the StorageID value in "VERSION" file under the data directory of the datanode, but this is might not work if VERSION file format has changed - */ + */ + // Check if nodeS's host information is same as nodeReg's, if not, + // it needs to update host2DatanodeMap accordringly. + updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr()); Review Comment: Thanks Nick for the comment. I think nodeS.getXferAddr() cannot be null, let me investigate more and get back. Issue Time Tracking --- Worklog Id: (was: 760242) Time Spent: 1h 50m (was: 1h 40m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. 
host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=760139=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-760139 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 21/Apr/22 15:45 Start Date: 21/Apr/22 15:45 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4170: URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1105400155

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 58s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 15m 45s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 28m 6s | | trunk passed |
| +1 :green_heart: | compile | 24m 43s | | trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 20m 54s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 3m 55s | | trunk passed |
| +1 :green_heart: | mvnsite | 25m 47s | | trunk passed |
| -1 :x: | javadoc | 1m 30s | [/branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) | root in trunk failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04. |
| +1 :green_heart: | javadoc | 8m 21s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 39m 0s | | trunk passed |
| +1 :green_heart: | shadedclient | 57m 59s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 25m 40s | | the patch passed |
| +1 :green_heart: | compile | 24m 26s | | the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 24m 26s | | the patch passed |
| +1 :green_heart: | compile | 21m 29s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 21m 29s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 4m 15s | | the patch passed |
| +1 :green_heart: | mvnsite | 20m 28s | | the patch passed |
| -1 :x: | javadoc | 1m 30s | [/patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) | root in the patch failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04. |
| +1 :green_heart: | javadoc | 8m 27s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 39m 20s | | the patch passed |
| +1 :green_heart: | shadedclient | 58m 8s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 1064m 10s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/patch-unit-root.txt) | root in the patch passed. |
| +1 :green_heart: | asflicense | 2m 12s | | The patch does not generate ASF License warnings. |
| | | 1431m 51s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |
| | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4170 |
| Optional Tests | dupname asflicense codespell compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle |
| uname | Linux c73e326f6bae 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=759978=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-759978 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 21/Apr/22 12:26 Start Date: 21/Apr/22 12:26 Worklog Time Spent: 10m Work Description: ndimiduk commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r855124598 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ## @@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg) nodes with its data cleared (or user can just remove the StorageID value in "VERSION" file under the data directory of the datanode, but this is might not work if VERSION file format has changed - */ + */ + // Check if nodeS's host information is same as nodeReg's, if not, + // it needs to update host2DatanodeMap accordringly. + updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr()); + NameNode.stateChangeLog.info("BLOCK* registerDatanode: " + nodeS + " is replaced by " + nodeReg + " with the same storageID " - + nodeReg.getDatanodeUuid()); + + nodeReg.getDatanodeUuid() + ", updateHost2DatanodeMap: " + updateHost2DatanodeMap); Review Comment: Is this extra information needed at the `INFO` level log? I understand that having the value printed is helpful during development, but I don't think it's meaningful to an operator. Also, if you're here to change a log message, maybe also change it to use the format string version instead of string concatenation? 
## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ## @@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg) nodes with its data cleared (or user can just remove the StorageID value in "VERSION" file under the data directory of the datanode, but this is might not work if VERSION file format has changed - */ + */ + // Check if nodeS's host information is same as nodeReg's, if not, + // it needs to update host2DatanodeMap accordringly. + updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr()); Review Comment: Can `nodeS.getXferAddr()` ever be null? Use of `Objects.equals()` would be safer. ## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java: ## @@ -138,6 +138,35 @@ public void testNumVersionsCorrectAfterReregister() mapToCheck.get("version1").intValue(), 1); } + /** + * This test checks that if a node is re-registered with a different ip, its + * host2DatanodeMap is correctly updated with the new ip. + */ + @Test Review Comment: It looks like there's a couple caches like this one. Is there equivalent test coverage for all the registered DN state that is updated on registration? Can you add coverage for any of the other states that are missing? Issue Time Tracking --- Worklog Id: (was: 759978) Time Spent: 1.5h (was: 1h 20m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. 
After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails.
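The review question above about whether `nodeS.getXferAddr()` can be null comes down to the difference between `a.equals(b)` and `Objects.equals(a, b)`. The following is only a sketch of that difference: the class `XferAddrCompare`, the helper `addressChanged`, and the address strings are hypothetical and are not taken from the patch.

```java
import java.util.Objects;

// Hypothetical illustration, not code from DatanodeManager.
final class XferAddrCompare {

    // Returns true when the stored transfer address differs from the newly
    // registered one. Objects.equals treats two nulls as equal and never
    // throws, whereas stored.equals(registered) would throw a
    // NullPointerException if stored were null.
    static boolean addressChanged(String storedXferAddr, String registeredXferAddr) {
        return !Objects.equals(storedXferAddr, registeredXferAddr);
    }

    public static void main(String[] args) {
        // Illustrative "ip:port" strings only.
        System.out.println(addressChanged("10.0.0.5:9866", "10.0.0.9:9866")); // true
        System.out.println(addressChanged("10.0.0.5:9866", "10.0.0.5:9866")); // false
        System.out.println(addressChanged(null, "10.0.0.9:9866"));            // true
    }
}
```

Whether the extra safety is needed depends on whether the stored address can ever be null, which is the point the author investigates in the replies.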
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=758019=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-758019 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 18/Apr/22 17:29 Start Date: 18/Apr/22 17:29 Worklog Time Spent: 10m Work Description: huaxiangsun commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r852277677 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ## @@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg) nodes with its data cleared (or user can just remove the StorageID value in "VERSION" file under the data directory of the datanode, but this is might not work if VERSION file format has changed - */ + */ + // Check if nodeS's host information is same as nodeReg's, if not, + // it needs to update host2DatanodeMap accordringly. + updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getIpAddr()) || + (nodeS.getXferPort() != nodeReg.getXferPort()); + Review Comment: Updated. Issue Time Tracking --- Worklog Id: (was: 758019) Time Spent: 1h 20m (was: 1h 10m) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. host2DatanodeMap is not updated accordingly. 
When hdfs client > with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757979=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757979 ] ASF GitHub Bot logged work on HDFS-16540: - Author: ASF GitHub Bot Created on: 18/Apr/22 16:45 Start Date: 18/Apr/22 16:45 Worklog Time Spent: 10m Work Description: huaxiangsun commented on code in PR #4170: URL: https://github.com/apache/hadoop/pull/4170#discussion_r852249004 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ## @@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg) nodes with its data cleared (or user can just remove the StorageID value in "VERSION" file under the data directory of the datanode, but this is might not work if VERSION file format has changed - */ + */ + // Check if nodeS's host information is same as nodeReg's, if not, + // it needs to update host2DatanodeMap accordringly. + updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getIpAddr()) || + (nodeS.getXferPort() != nodeReg.getXferPort()); + Review Comment: Makes sense as getXferAddr() is "getIpAddr():getXferPort()". Issue Time Tracking --- Worklog Id: (was: 757979) Time Spent: 1h 10m (was: 1h) > Data locality is lost when DataNode pod restarts in kubernetes > --- > > Key: HDFS-16540 > URL: https://issues.apache.org/jira/browse/HDFS-16540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.2 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod > restarts, we found that data locality is lost after we do a major compaction > of hbase regions. After some debugging, we found that upon pod restarts, its > ip changes. In DatanodeManager, maps like networktopology are updated with > the new info. 
host2DatanodeMap is not updated accordingly. When hdfs client > with the new ip tries to find a local DataNode, it fails.
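The remark above that `getXferAddr()` is "getIpAddr():getXferPort()" is why the two-part check can be collapsed into one string comparison. A minimal sketch of that equivalence follows; `Node` is a hypothetical stand-in for the real datanode identity class, and only the "ip:port" composition is taken from the comment.

```java
// Hypothetical illustration, not code from the patch.
final class XferAddrEquivalence {

    // Stand-in for a datanode identity with an IP and a transfer port.
    static final class Node {
        final String ipAddr;
        final int xferPort;

        Node(String ipAddr, int xferPort) {
            this.ipAddr = ipAddr;
            this.xferPort = xferPort;
        }

        // Assumed composition, as described in the review comment.
        String getXferAddr() {
            return ipAddr + ":" + xferPort;
        }
    }

    // Original form of the check: compare IP and port separately.
    static boolean changedSeparate(Node a, Node b) {
        return !a.ipAddr.equals(b.ipAddr) || a.xferPort != b.xferPort;
    }

    // Simplified form: compare the combined "ip:port" string.
    static boolean changedCombined(Node a, Node b) {
        return !a.getXferAddr().equals(b.getXferAddr());
    }

    public static void main(String[] args) {
        Node before = new Node("10.0.0.5", 9866);
        Node newIp = new Node("10.0.0.9", 9866);
        Node newPort = new Node("10.0.0.5", 9867);
        System.out.println(changedSeparate(before, newIp) == changedCombined(before, newIp));
        System.out.println(changedSeparate(before, newPort) == changedCombined(before, newPort));
        System.out.println(changedSeparate(before, before) == changedCombined(before, before));
    }
}
```

Each line printed by `main` reports whether the two forms agree for that pair of nodes.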
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757552=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757552 ]

ASF GitHub Bot logged work on HDFS-16540:

Author: ASF GitHub Bot
Created on: 16/Apr/22 04:10
Start Date: 16/Apr/22 04:10
Worklog Time Spent: 10m
Work Description: tomscut commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r851582741

## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ##

@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
   nodes with its data cleared (or user can just remove the StorageID
   value in "VERSION" file under the data directory of the datanode,
   but this is might not work if VERSION file format has changed
- */
+ */
+ // Check if nodeS's host information is same as nodeReg's, if not,
+ // it needs to update host2DatanodeMap accordringly.
+ updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getIpAddr()) ||
+     (nodeS.getXferPort() != nodeReg.getXferPort());
+

Review Comment:
Can we update
`updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getIpAddr()) || (nodeS.getXferPort() != nodeReg.getXferPort());`
to
`updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr());`

Issue Time Tracking
---

Worklog Id: (was: 757552)
Time Spent: 1h (was: 50m)

> Data locality is lost when DataNode pod restarts in kubernetes
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.

--
This message was sent by Atlassian Jira (v8.20.1#820001)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
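The review suggestion in this message replaces a two-part check (IP address, then transfer port) with a single comparison of the combined transfer address. A minimal sketch of why the two forms should agree, assuming the combined address is the IP and port joined by a colon. `Node` is a hypothetical stand-in written for this illustration, not Hadoop's `DatanodeID`, and the addresses and ports are example values:

```java
// Hypothetical sketch: Node is a stand-in for illustration, not Hadoop's DatanodeID.
class XferAddrCompare {
    static final class Node {
        final String ipAddr;
        final int xferPort;

        Node(String ipAddr, int xferPort) {
            this.ipAddr = ipAddr;
            this.xferPort = xferPort;
        }

        // Assumption: the transfer address has the form "ip:port".
        String getXferAddr() {
            return ipAddr + ":" + xferPort;
        }
    }

    // Two-part form, as in the patch under review.
    static boolean changedTwoPart(Node stored, Node reg) {
        return !stored.ipAddr.equals(reg.ipAddr) || stored.xferPort != reg.xferPort;
    }

    // Single-comparison form, as proposed in the review comment.
    static boolean changedCombined(Node stored, Node reg) {
        return !stored.getXferAddr().equals(reg.getXferAddr());
    }

    public static void main(String[] args) {
        Node stored = new Node("10.0.0.5", 9866);
        Node[] registrations = {
            new Node("10.0.0.5", 9866), // same address
            new Node("10.0.0.9", 9866), // new IP, e.g. after a pod restart
            new Node("10.0.0.5", 9867)  // new port
        };
        for (Node reg : registrations) {
            System.out.println(reg.getXferAddr()
                + " twoPart=" + changedTwoPart(stored, reg)
                + " combined=" + changedCombined(stored, reg));
        }
    }
}
```

Because the port is rendered as digits only, the last colon in the combined string always separates IP from port, so under this assumption the two checks give the same answer for every pair of nodes.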
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757526=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757526 ]

ASF GitHub Bot logged work on HDFS-16540:

Author: ASF GitHub Bot
Created on: 15/Apr/22 23:15
Start Date: 15/Apr/22 23:15
Worklog Time Spent: 10m
Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r851552322

## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ##

@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
   nodes with its data cleared (or user can just remove the StorageID
   value in "VERSION" file under the data directory of the datanode,
   but this is might not work if VERSION file format has changed
- */
+ */
+ // Check if nodeS's host information is same as nodeReg's, if not,
+ // it needs to update host2DatanodeMap accordringly.
+ updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getAddress()) ||

Review Comment:
Updated and verified.

Issue Time Tracking
---

Worklog Id: (was: 757526)
Time Spent: 50m (was: 40m)

> Data locality is lost when DataNode pod restarts in kubernetes
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757478 ]

ASF GitHub Bot logged work on HDFS-16540:

Author: ASF GitHub Bot
Created on: 15/Apr/22 18:46
Start Date: 15/Apr/22 18:46
Worklog Time Spent: 10m
Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r851432390

## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ##

@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
   nodes with its data cleared (or user can just remove the StorageID
   value in "VERSION" file under the data directory of the datanode,
   but this is might not work if VERSION file format has changed
- */
+ */
+ // Check if nodeS's host information is same as nodeReg's, if not,
+ // it needs to update host2DatanodeMap accordringly.
+ updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getAddress()) ||

Review Comment:
Good catch! Let me update it to getIpAddr(), will update shortly.

Issue Time Tracking
---

Worklog Id: (was: 757478)
Time Spent: 40m (was: 0.5h)

> Data locality is lost when DataNode pod restarts in kubernetes
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly.
> When hdfs client with the new ip tries to find a local DataNode, it fails.
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757262 ]

ASF GitHub Bot logged work on HDFS-16540:

Author: ASF GitHub Bot
Created on: 15/Apr/22 01:26
Start Date: 15/Apr/22 01:26
Worklog Time Spent: 10m
Work Description: tomscut commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r850967827

## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java: ##

@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
   nodes with its data cleared (or user can just remove the StorageID
   value in "VERSION" file under the data directory of the datanode,
   but this is might not work if VERSION file format has changed
- */
+ */
+ // Check if nodeS's host information is same as nodeReg's, if not,
+ // it needs to update host2DatanodeMap accordringly.
+ updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getAddress()) ||

Review Comment:
`nodeReg.getAddress()` contains port, but `nodeS.getIpAddr()` doesn't, so `updateHost2DatanodeMap` is always `true`, right?

Issue Time Tracking
---

Worklog Id: (was: 757262)
Time Spent: 0.5h (was: 20m)

> Data locality is lost when DataNode pod restarts in kubernetes
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions. After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info.
> host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.
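The reviewer's point in this message is that an IP-only string can never equal an "ip:port" string, so the negated comparison is true on every registration. A minimal sketch of that mismatch, assuming `getAddress()` returns the IP and port joined by a colon as the comment describes. `NodeId` is a hypothetical stand-in for illustration, not a Hadoop class:

```java
// Hypothetical sketch: NodeId is a stand-in for illustration, not a Hadoop class.
class AddressFormatMismatch {
    static final class NodeId {
        private final String ipAddr;
        private final int xferPort;

        NodeId(String ipAddr, int xferPort) {
            this.ipAddr = ipAddr;
            this.xferPort = xferPort;
        }

        // IP only, no port.
        String getIpAddr() {
            return ipAddr;
        }

        // Assumption taken from the review comment: the address includes the port.
        String getAddress() {
            return ipAddr + ":" + xferPort;
        }
    }

    // Mixed formats: compares "ip" against "ip:port", which never match.
    static boolean needsUpdateMixed(NodeId stored, NodeId reg) {
        return !stored.getIpAddr().equals(reg.getAddress());
    }

    // Like-for-like: compares "ip" against "ip".
    static boolean needsUpdateSameFormat(NodeId stored, NodeId reg) {
        return !stored.getIpAddr().equals(reg.getIpAddr());
    }

    public static void main(String[] args) {
        NodeId stored = new NodeId("10.0.0.5", 9866);
        NodeId unchanged = new NodeId("10.0.0.5", 9866);
        System.out.println("mixed formats:  " + needsUpdateMixed(stored, unchanged));
        System.out.println("same format:    " + needsUpdateSameFormat(stored, unchanged));
    }
}
```

With the mixed comparison the flag would be set even when nothing changed, so the map would be rewritten on every re-registration rather than only after an address change.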
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=756827=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-756827 ]

ASF GitHub Bot logged work on HDFS-16540:

Author: ASF GitHub Bot
Created on: 14/Apr/22 03:46
Start Date: 14/Apr/22 03:46
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1098680455

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:---:|---:|---:|:---:|:---:|
| +0 :ok: | reexec | 17m 3s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 41m 35s | | trunk passed |
| +1 :green_heart: | compile | 1m 31s | | trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 1m 20s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 1m 5s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 31s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 7s | | trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 31s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 40s | | trunk passed |
| +1 :green_heart: | shadedclient | 26m 14s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 18s | | the patch passed |
| +1 :green_heart: | compile | 1m 25s | | the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 1m 25s | | the patch passed |
| +1 :green_heart: | compile | 1m 15s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 1m 15s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 53s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 20s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 54s | | the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 26s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 32s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 6s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 341m 40s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 41s | | The patch does not generate ASF License warnings. |
| | | 474m 5s | | |

| Subsystem | Report/Notes |
|---:|:---|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4170 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 98cf76256475 4.15.0-162-generic #170-Ubuntu SMP Mon Oct 18 11:38:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 1316ff0eada1e29dec8ca56ab266c9bcbe60051c |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/testReport/ |
| Max. process+thread count | 2175 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This
[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes
[ https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=756643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-756643 ]

ASF GitHub Bot logged work on HDFS-16540:

Author: ASF GitHub Bot
Created on: 13/Apr/22 19:51
Start Date: 13/Apr/22 19:51
Worklog Time Spent: 10m
Work Description: huaxiangsun opened a new pull request, #4170:
URL: https://github.com/apache/hadoop/pull/4170

…etes

### Description of PR
When Dn with the same uuid is registered with a different ip, host2DatanodeMap needs to be updated accordingly.

### How was this patch tested?
Tested 3.3.2 with the patch on a eks cluster, restarted the pod hosting DataNode and HBase region server. After that, doing a major compaction of Hbase region, made sure that locality is kept.
There is also a new unittest case added.

### For code changes:

- [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?

Issue Time Tracking
---

Worklog Id: (was: 756643)
Remaining Estimate: 0h
Time Spent: 10m

> Data locality is lost when DataNode pod restarts in kubernetes
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod
> restarts, we found that data locality is lost after we do a major compaction
> of hbase regions.
> After some debugging, we found that upon pod restarts, its
> ip changes. In DatanodeManager, maps like networktopology are updated with
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client
> with the new ip tries to find a local DataNode, it fails.
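The PR description in this message says that when a DataNode with the same UUID registers from a different IP, the host lookup map has to follow the change. A minimal sketch of that idea follows. It is not the Hadoop implementation: `HostIndexSketch`, `Node`, `register`, and `findByIp` are hypothetical names, and the index is simplified to one node per IP, which is an assumption made only for this illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of keeping an IP-keyed index in step with re-registration.
// Not the Hadoop DatanodeManager; simplified to one node per IP.
class HostIndexSketch {
    static final class Node {
        final String uuid;
        String ipAddr;

        Node(String uuid, String ipAddr) {
            this.uuid = uuid;
            this.ipAddr = ipAddr;
        }
    }

    private final Map<String, Node> byUuid = new HashMap<>();
    private final Map<String, Node> byIp = new HashMap<>();

    void register(String uuid, String ipAddr) {
        Node stored = byUuid.get(uuid);
        if (stored == null) {
            // First registration: index the node under both keys.
            stored = new Node(uuid, ipAddr);
            byUuid.put(uuid, stored);
            byIp.put(ipAddr, stored);
            return;
        }
        if (!stored.ipAddr.equals(ipAddr)) {
            // Same UUID, new IP: drop the stale entry and re-index under the new IP.
            byIp.remove(stored.ipAddr);
            stored.ipAddr = ipAddr;
            byIp.put(ipAddr, stored);
        }
    }

    // Lookup a client would rely on to find a DataNode on its own host.
    Node findByIp(String ipAddr) {
        return byIp.get(ipAddr);
    }

    public static void main(String[] args) {
        HostIndexSketch index = new HostIndexSketch();
        index.register("uuid-1", "10.0.0.5");
        index.register("uuid-1", "10.0.0.9"); // pod restarted with a new IP
        System.out.println("old IP found: " + (index.findByIp("10.0.0.5") != null));
        System.out.println("new IP found: " + (index.findByIp("10.0.0.9") != null));
    }
}
```

If the `byIp` re-indexing step were skipped, a lookup by the new IP would miss, which matches the symptom described in the issue: a client on the restarted pod cannot find its local DataNode.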