[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771965
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/May/22 15:29
Start Date: 18/May/22 15:29
Worklog Time Spent: 10m 
  Work Description: saintstack commented on PR #4326:
URL: https://github.com/apache/hadoop/pull/4326#issuecomment-1130165900

   Merged. Thank you for the cleanup @Hexiaoqiao 




Issue Time Tracking
---

Worklog Id: (was: 771965)
Time Spent: 8h 20m  (was: 8h 10m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.2
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
> Time Spent: 8h 20m
> Remaining Estimate: 0h
>
> We run an HBase RegionServer and an HDFS DataNode in the same pod. When the
> pod restarts, data locality is lost after a major compaction of the HBase
> regions. Debugging showed that the pod's IP changes on restart. In
> DatanodeManager, maps such as networktopology are updated with the new
> address, but host2DatanodeMap is not. When an HDFS client with the new IP
> then tries to find a local DataNode, the lookup fails.
>  
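
The failure mode described above can be illustrated with a small, self-contained sketch. This is not the Hadoop code: the `Registry` and `Node` classes, the `register` method, and the `refreshIpIndex` flag are all hypothetical stand-ins, and the IP addresses are made up. The sketch only assumes what the report states, namely that one index is keyed by IP (playing the role of host2DatanodeMap) and that it is left untouched when a known node re-registers from a new address.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleHostMapSketch {

    /** Hypothetical stand-in for a registered DataNode: stable id, changeable IP. */
    static final class Node {
        final String uuid;
        String ip;

        Node(String uuid, String ip) {
            this.uuid = uuid;
            this.ip = ip;
        }
    }

    /** Hypothetical registry with two indexes over the same nodes. */
    static final class Registry {
        private final Map<String, Node> byUuid = new HashMap<>();
        // Plays the role of host2DatanodeMap in the report: lookup by IP.
        private final Map<String, Node> byIp = new HashMap<>();

        /**
         * Registers a node. When a known node returns with a new IP, the
         * IP index is refreshed only if refreshIpIndex is true; passing
         * false reproduces the stale-index behaviour from the report.
         */
        void register(String uuid, String ip, boolean refreshIpIndex) {
            Node existing = byUuid.get(uuid);
            if (existing == null) {
                Node node = new Node(uuid, ip);
                byUuid.put(uuid, node);
                byIp.put(ip, node);
                return;
            }
            if (!existing.ip.equals(ip)) {
                String oldIp = existing.ip;
                existing.ip = ip; // the node record itself is always updated
                if (refreshIpIndex) {
                    byIp.remove(oldIp);
                    byIp.put(ip, existing);
                }
            }
        }

        /** A client co-located with the node asks: is there a node at my IP? */
        boolean hasLocalNode(String clientIp) {
            return byIp.containsKey(clientIp);
        }
    }

    /** Simulates a pod restart that changes the IP, then a local lookup. */
    static boolean localAfterRestart(boolean refreshIpIndex) {
        Registry registry = new Registry();
        registry.register("dn-1", "10.0.0.5", refreshIpIndex); // first start
        registry.register("dn-1", "10.0.0.9", refreshIpIndex); // restart, new IP
        return registry.hasLocalNode("10.0.0.9");
    }

    public static void main(String[] args) {
        System.out.println("stale index finds local node:     " + localAfterRestart(false));
        System.out.println("refreshed index finds local node: " + localAfterRestart(true));
    }
}
```

With the stale index the client at the new IP finds no local node, which matches the lost locality in the report; refreshing the IP-keyed index on re-registration restores the lookup in this toy model.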



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771964
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/May/22 15:28
Start Date: 18/May/22 15:28
Worklog Time Spent: 10m 
  Work Description: saintstack merged PR #4326:
URL: https://github.com/apache/hadoop/pull/4326




Issue Time Tracking
---

Worklog Id: (was: 771964)
Time Spent: 8h 10m  (was: 8h)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771894&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771894
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/May/22 14:13
Start Date: 18/May/22 14:13
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4326:
URL: https://github.com/apache/hadoop/pull/4326#issuecomment-1130072419

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  shadedclient  |  29m 54s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  shadedclient  |  19m 20s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 53s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  52m 21s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4326/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4326 |
   | Optional Tests | dupname asflicense codespell |
   | uname | Linux 0b6857aa64db 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d394689a5b727f2b52272776dd950805e5427f14 |
   | Max. process+thread count | 548 (vs. ulimit of 5500) |
   | modules | C: . U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4326/1/console |
   | versions | git=2.25.1 maven=3.6.3 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 771894)
Time Spent: 8h  (was: 7h 50m)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771872&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771872
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/May/22 13:21
Start Date: 18/May/22 13:21
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4326:
URL: https://github.com/apache/hadoop/pull/4326#issuecomment-1130011537

   cc @saintstack and @tomscut 




Issue Time Tracking
---

Worklog Id: (was: 771872)
Time Spent: 7h 50m  (was: 7h 40m)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771871&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771871
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/May/22 13:19
Start Date: 18/May/22 13:19
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao opened a new pull request, #4326:
URL: https://github.com/apache/hadoop/pull/4326

   ### Description of PR
   Remove the accidentally introduced file '.BUILDING.txt.swp'.
   
   ### How was this patch tested?
   No tests were added because there are no source code changes.
   
   ### For code changes:
   - [Y] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [N] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [N] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [N] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?




Issue Time Tracking
---

Worklog Id: (was: 771871)
Time Spent: 7h 40m  (was: 7.5h)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771869&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771869
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/May/22 13:11
Start Date: 18/May/22 13:11
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-112819

   I don't get what changes here. Maybe I missed something? Thanks.




Issue Time Tracking
---

Worklog Id: (was: 771869)
Time Spent: 7.5h  (was: 7h 20m)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771865
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/May/22 13:02
Start Date: 18/May/22 13:02
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1129990002

   Sorry for the late response. I just found that this PR included an unrelated file, '.BUILDING.txt.swp', under the root path of the project. If there are no other concerns, I would like to remove it shortly.




Issue Time Tracking
---

Worklog Id: (was: 771865)
Time Spent: 7h 20m  (was: 7h 10m)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=771494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771494
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 17/May/22 17:59
Start Date: 17/May/22 17:59
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1129159135

   Thanks a lot, @saintstack!




Issue Time Tracking
---

Worklog Id: (was: 771494)
Time Spent: 7h 10m  (was: 7h)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770656&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770656
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 16/May/22 04:32
Start Date: 16/May/22 04:32
Worklog Time Spent: 10m 
  Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1127210267

   I ran the two tests below in loops locally.
   
   TestBPOfferService.testMissBlocksWhenReregister
   TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks
   
   The first failed once out of ten cycles, both with the patch in place and without it (this jibes with what we see in the test runs here, where it sometimes fails but not always).
   
   TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks shows up consistently as a failure in the runs here, but when I run it locally in multiple cycles it passes whether or not the patch is applied. In the last full branch-3.3 run, back on May 5th (https://ci-hadoop.apache.org/job/hadoop-qbt-branch-3.3-java8-linux-x86_64/54/), it failed for the same reason (the May 12th run was incomplete). This test is about block replication, whereas this PR makes a minor adjustment to NN node accounting. Unrelated, I'd say. Pushing the backport.
   
   Will push in the morning.




Issue Time Tracking
---

Worklog Id: (was: 770656)
Time Spent: 6h 50m  (was: 6h 40m)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770657
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 16/May/22 04:32
Start Date: 16/May/22 04:32
Worklog Time Spent: 10m 
  Work Description: saintstack merged PR #4246:
URL: https://github.com/apache/hadoop/pull/4246




Issue Time Tracking
---

Worklog Id: (was: 770657)
Time Spent: 7h  (was: 6h 50m)




[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770639&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770639
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 16/May/22 02:31
Start Date: 16/May/22 02:31
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1127151370

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 58s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 34s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 39s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 52s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 35s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  27m  7s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/artifact/out/blanks-eol.txt) |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   0m 48s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 19s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 190m 25s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 298m 53s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8f80b78e9fdd 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / a50952249eecd4f1d1f0479f1ca1643c3dbec925 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/testReport/ |
   | Max. process+thread count | 3224 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/14/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 770639)
Time Spent: 6h 40m  (was: 6.5h)


[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770474
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 14/May/22 05:53
Start Date: 14/May/22 05:53
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1126648700

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 26s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 57s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 38s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  26m 54s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 23s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 190m 22s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/13/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 300m 22s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
   |   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
   |   | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/13/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 60cd37572baf 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 21686a25ac0fc811c894b03354592b811e84b1eb |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/13/testReport/ |
   | Max. process+thread count | 3589 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/13/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 770474)
Time Spent: 6.5h  (was: 6h 20m)


[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770425
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 13/May/22 23:14
Start Date: 13/May/22 23:14
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1126569964

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 26s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 41s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 53s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 36s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  27m 19s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/artifact/out/blanks-eol.txt) |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   0m 46s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 21s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 190m 47s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 15s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 299m  1s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
   |   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux bdf2fce93bdd 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 007c9e844ffed2e12691b6774fd738688e7c1c06 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/testReport/ |
   | Max. process+thread count |  (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/12/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 770425)
Time Spent: 6h 20m  (was: 6h 10m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=770312&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770312
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 13/May/22 18:12
Start Date: 13/May/22 18:12
Worklog Time Spent: 10m 
  Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1126312782

   Two failures:
   
   TestBPOfferService.testMissBlocksWhenReregister
   TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks
   
   They come up often enough. Let me try again. Meanwhile, I am running them locally.
   
   




Issue Time Tracking
---

Worklog Id: (was: 770312)
Time Spent: 6h 10m  (was: 6h)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  
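
The failure mode described in the quoted issue can be illustrated with a toy model. The sketch below is not the DatanodeManager code and does not show how the merged patch fixes the bug; the class, its methods, the `refresh_host_map` switch, and the IP addresses are all hypothetical. It only assumes what the description states: on re-registration with a new IP, one lookup structure is updated while the host-to-DataNode map is not, so a client on the new IP cannot find its local DataNode.

```python
class ToyDatanodeManager:
    """Toy model of NameNode-side bookkeeping; not the Hadoop implementation."""

    def __init__(self, refresh_host_map):
        # Hypothetical switch: whether the host map follows IP changes.
        self.refresh_host_map = refresh_host_map
        self.topology = {}          # datanode id -> current IP
        self.host_to_datanode = {}  # IP -> datanode id

    def register(self, datanode_id, ip):
        old_ip = self.topology.get(datanode_id)
        self.topology[datanode_id] = ip  # this view is always updated
        if old_ip is None:
            # First registration: both views agree.
            self.host_to_datanode[ip] = datanode_id
        elif old_ip != ip and self.refresh_host_map:
            # Re-registration with a new IP: move the host map entry too.
            self.host_to_datanode.pop(old_ip, None)
            self.host_to_datanode[ip] = datanode_id

    def local_datanode(self, client_ip):
        """Return the datanode id co-located with the client, or None."""
        return self.host_to_datanode.get(client_ip)


stale = ToyDatanodeManager(refresh_host_map=False)
stale.register("dn-1", "10.0.0.5")
stale.register("dn-1", "10.0.0.9")  # pod restarted with a new IP

fresh = ToyDatanodeManager(refresh_host_map=True)
fresh.register("dn-1", "10.0.0.5")
fresh.register("dn-1", "10.0.0.9")

print(stale.local_datanode("10.0.0.9"))  # None: locality lookup misses
print(fresh.local_datanode("10.0.0.9"))  # dn-1
```

In the stale case the topology view already holds the new IP, which matches the report that only the host map lags behind.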






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=769499&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769499
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 12/May/22 09:26
Start Date: 12/May/22 09:26
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1124741659

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  36m 42s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 33s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 58s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 39s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  27m  1s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 13s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 193m 17s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/11/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 15s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 302m 10s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestBPOfferService |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/11/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 14ee2742708c 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 53773ea019ca5ed793d36035c7adbfe589f5926c |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/11/testReport/ |
   | Max. process+thread count | 2983 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/11/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 769499)
Time Spent: 6h  (was: 5h 50m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=769414&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769414
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 12/May/22 04:20
Start Date: 12/May/22 04:20
Worklog Time Spent: 10m 
  Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1124514541

   Removing a space had us run more tests, and 4 tests failed instead of the 44 on 
the previous run.
   
   Below is the change in the last run.
   
   ```
   From eb904f3adaa55d44aa6494ad116344317e9ec882 Mon Sep 17 00:00:00 2001
   From: stack 
   Date: Wed, 11 May 2022 15:42:18 -0700
   Subject: [PATCH] Remove a space at end of line inside a comment 
   ```

Issue Time Tracking
---

Worklog Id: (was: 769414)
Time Spent: 5h 50m  (was: 5h 40m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=769407=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769407
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 12/May/22 03:15
Start Date: 12/May/22 03:15
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1124486913

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | -1 :x: |  mvninstall  |   6m 15s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/branch-mvninstall-root.txt)
 |  root in branch-3.3 failed.  |
   | -1 :x: |  compile  |   0m 30s | 
[/branch-compile-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in branch-3.3 failed.  |
   | -0 :warning: |  checkstyle  |   0m 27s | 
[/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  The patch fails to run checkstyle in hadoop-hdfs  |
   | +1 :green_heart: |  mvnsite  |   3m 14s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 55s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   4m  0s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  29m 31s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed  |
   | -1 :x: |  javac  |   1m 14s | 
[/results-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/results-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project_hadoop-hdfs generated 567 new + 0 unchanged - 0 fixed = 
567 total (was 0)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 55s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 52 new + 0 unchanged - 
0 fixed = 52 total (was 0)  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 15s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m  0s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 189m 51s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 270m 37s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestHAAppend |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/10/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 691ecf7100cf 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 
23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=769254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769254
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 11/May/22 19:22
Start Date: 11/May/22 19:22
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1124201434

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 44s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  36m 27s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 33s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 45s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 52s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 41s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  27m 12s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   0m 45s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 59s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  33m  9s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  34m 55s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +0 :ok: |  asflicense  |   0m 41s |  |  ASF License check generated no 
output?  |
   |  |   | 150m 13s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.tools.TestECAdmin |
   |   | hadoop.hdfs.tools.TestViewFileSystemOverloadSchemeWithDFSAdmin |
   |   | 
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerWithStripedBlocks |
   |   | hadoop.cli.TestHDFSCLI |
   |   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
   |   | hadoop.hdfs.TestBlockStoragePolicy |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 80c88472574d 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 53fdbf60a26f18341150743f860b0713ca2d632a |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/testReport/ |
   | Max. process+thread count | 1567 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/9/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 769254)
Time Spent: 5.5h  (was: 5h 20m)

> Data locality is lost when DataNode 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768818
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 11/May/22 02:19
Start Date: 11/May/22 02:19
Worklog Time Spent: 10m 
  Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1123106711

   A noop patch has these failures:
   
   ```
   
org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.testUpgradeFromRel1BBWImage
   
org.apache.hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor.testDecommissionWithMissingBlock
   
org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade.testWithLayoutChangeAndFinalize
   ```
   
   A push with the backport has these failures:
   
   ```
   
org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks
   
org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade.testWithLayoutChangeAndRollback
   ```
   
   One overlap: TestDataNodeRollingUpgrade. The other failure seems unrelated. Let me try a repush.
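
The comparison above can be reproduced mechanically. This is a minimal sketch, assuming the two failure lists are copied by hand from the comment; the helper name `class_of` is hypothetical. Note that the overlap is at the class level only: the failing methods differ between the two runs.

```python
def class_of(qualified_test):
    """Drop the trailing method name: 'pkg.Class.method' -> 'pkg.Class'."""
    return qualified_test.rsplit(".", 1)[0]


# Failures from the no-op patch run, as listed in the comment.
noop_failures = [
    "org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.testUpgradeFromRel1BBWImage",
    "org.apache.hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor"
    ".testDecommissionWithMissingBlock",
    "org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade"
    ".testWithLayoutChangeAndFinalize",
]

# Failures from the push that carries the backport.
backport_failures = [
    "org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks"
    ".testSetRepIncWithUnderReplicatedBlocks",
    "org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade"
    ".testWithLayoutChangeAndRollback",
]

noop_classes = {class_of(t) for t in noop_failures}
backport_classes = {class_of(t) for t in backport_failures}

overlap = noop_classes & backport_classes
only_in_backport = backport_classes - noop_classes

print(sorted(overlap))
print(sorted(only_in_backport))
```

A class that fails both with and without the change is a candidate flaky test; a class that fails only with the backport still needs a rerun or a local run before it can be dismissed.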




Issue Time Tracking
---

Worklog Id: (was: 768818)
Time Spent: 5h 20m  (was: 5h 10m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768809&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768809
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 11/May/22 01:47
Start Date: 11/May/22 01:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1123090143

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  36m 23s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 12s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 37s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 56s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 32s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  27m  3s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 15s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 15s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 190m 33s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 13s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 299m 15s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor 
|
   |   | hadoop.hdfs.TestDFSUpgradeFromImage |
   |   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/8/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 90a26f48a12b 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 
23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 2966e5350569cbcc88f66382d19537201deff0e7 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/8/testReport/ |
   | Max. process+thread count | 3708 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/8/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 768809)
Time Spent: 5h 10m  (was: 5h)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768692&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768692
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 10/May/22 20:18
Start Date: 10/May/22 20:18
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1122821124

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  36m 30s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 51s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 36s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  27m 14s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 48s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 57s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 190m 14s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/7/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 12s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 299m  0s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/7/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 3b44c5c05aa1 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 306eb361f6831f1d7722049133e89a2184b47cd1 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/7/testReport/ |
   | Max. process+thread count | 3175 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/7/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 768692)
Time Spent: 5h  (was: 4h 50m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768543=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768543
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 10/May/22 15:02
Start Date: 10/May/22 15:02
Worklog Time Spent: 10m 
  Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1122514860

   Node seems to be having issues...
   
   java.lang.OutOfMemoryError: unable to create new native thread
   
   Let me try a new push.




Issue Time Tracking
---

Worklog Id: (was: 768543)
Time Spent: 4h 50m  (was: 4h 40m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> We run an HBase RegionServer and an HDFS DataNode in the same pod. When the pod 
> restarts, data locality is lost after we run a major compaction 
> of the HBase regions. After some debugging, we found that the pod's IP 
> changes on restart. In DatanodeManager, maps like networktopology are updated 
> with the new info, but host2DatanodeMap is not updated accordingly. When an HDFS 
> client with the new IP tries to find a local DataNode, it fails. 
>  
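
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the actual DatanodeManager code or the HDFS-16540 patch: the class `HostMapSketch`, its `Node` record, and the methods `registerStale`, `registerFixed`, and `hasLocalNode` are all hypothetical names chosen for illustration. The sketch only assumes what the description states, namely that one index is refreshed on re-registration while an IP-keyed index is left pointing at the old address.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Hadoop's DatanodeManager.
public class HostMapSketch {
    /** Hypothetical stand-in for a registered DataNode. */
    static final class Node {
        final String uuid;
        String ip;
        Node(String uuid, String ip) { this.uuid = uuid; this.ip = ip; }
    }

    private final Map<String, Node> byUuid = new HashMap<>(); // stable identity
    private final Map<String, Node> byIp = new HashMap<>();   // used for locality lookups

    /** Buggy re-registration: updates the node record but leaves the IP index stale. */
    void registerStale(String uuid, String ip) {
        Node n = byUuid.get(uuid);
        if (n == null) {
            n = new Node(uuid, ip);
            byUuid.put(uuid, n);
            byIp.put(ip, n);
        } else {
            n.ip = ip; // byIp is still keyed by the old address
        }
    }

    /** Corrected re-registration: re-keys the IP index when the address changes. */
    void registerFixed(String uuid, String ip) {
        Node n = byUuid.get(uuid);
        if (n == null) {
            n = new Node(uuid, ip);
            byUuid.put(uuid, n);
        } else {
            byIp.remove(n.ip); // drop the entry for the old address
            n.ip = ip;
        }
        byIp.put(ip, n);
    }

    /** Returns true if a client at clientIp has a co-located node in the IP index. */
    boolean hasLocalNode(String clientIp) {
        return byIp.containsKey(clientIp);
    }

    public static void main(String[] args) {
        HostMapSketch stale = new HostMapSketch();
        stale.registerStale("dn-1", "10.0.0.5");
        stale.registerStale("dn-1", "10.0.0.9"); // pod restarted with a new IP
        System.out.println("stale index finds local node: " + stale.hasLocalNode("10.0.0.9"));

        HostMapSketch fixed = new HostMapSketch();
        fixed.registerFixed("dn-1", "10.0.0.5");
        fixed.registerFixed("dn-1", "10.0.0.9");
        System.out.println("fixed index finds local node: " + fixed.hasLocalNode("10.0.0.9"));
    }
}
```

In the stale variant the lookup by the new IP misses, which mirrors the lost locality reported in the issue; re-keying the index on re-registration restores the lookup.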






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=768385=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768385
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 10/May/22 09:45
Start Date: 10/May/22 09:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1122168129

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   7m  6s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  36m 34s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 12s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 58s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 32s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  26m 34s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 12s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 189m  1s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  9s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 303m 48s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation |
   |   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
   |   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
   |   | hadoop.hdfs.server.datanode.TestRefreshNamenodes |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.server.datanode.TestDataNodeReconfiguration |
   |   | hadoop.hdfs.server.datanode.TestTransferRbw |
   |   | hadoop.hdfs.server.datanode.TestBatchIbr |
   |   | hadoop.hdfs.server.datanode.TestReadOnlySharedStorage |
   |   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
   |   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |
   |   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux fabca6d9bc43 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 
23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / e9049bdf18516093b053cd83bd2374262cb2519c |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/6/testReport/ |
   | Max. process+thread count | 3227 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=767411=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767411
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 06/May/22 21:03
Start Date: 06/May/22 21:03
Worklog Time Spent: 10m 
  Work Description: saintstack commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1120006228

   Looking at these test failures:
   
* on a full run 

Issue Time Tracking
---

Worklog Id: (was: 767411)
Time Spent: 4.5h  (was: 4h 20m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> We run an HBase RegionServer and an HDFS DataNode in the same pod. When the pod 
> restarts, data locality is lost after we run a major compaction 
> of the HBase regions. After some debugging, we found that the pod's IP 
> changes on restart. In DatanodeManager, maps like networktopology are updated 
> with the new info, but host2DatanodeMap is not updated accordingly. When an HDFS 
> client with the new IP tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=766514=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-766514
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 05/May/22 08:45
Start Date: 05/May/22 08:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1118312024

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 30s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 51s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |  18m  4s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   3m 21s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |  20m 22s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   8m 34s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |  35m 24s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  55m 48s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  22m 54s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 42s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  17m 42s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  19m 55s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   7m 50s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |  35m 55s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  56m 14s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 308m 42s | 
[/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/5/artifact/out/patch-unit-root.txt)
 |  root in the patch failed.  |
   | +0 :ok: |  asflicense  |   1m 11s |  |  ASF License check generated no 
output?  |
   |  |   | 596m 16s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | 
hadoop.yarn.server.resourcemanager.security.TestAMRMTokens |
   |   | hadoop.yarn.server.resourcemanager.TestRMHAForAsyncScheduler |
   |   | hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter |
   |   | hadoop.yarn.server.resourcemanager.TestResourceTrackerService |
   |   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
   |   | hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL |
   |   | 
hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1 |
   |   | hadoop.yarn.server.timeline.webapp.TestTimelineWebServices |
   |   | 
hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer |
   |   | hadoop.crypto.key.kms.server.TestKMS |
   |   | hadoop.crypto.key.kms.server.TestKMSWithZK |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpc |
   |   | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   |   | hadoop.hdfs.server.datanode.TestBPOfferService |
   |   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4246 |
   | Optional Tests | dupname asflicense codespell compile javac javadoc 
mvninstall mvnsite unit shadedclient spotbugs checkstyle |
   | uname | Linux 69ddfeecc933 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 0f8ad00d4c5a1d15bc4bbe1be7b8a93163053362 |
   | Default Java | Private 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-05-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=766140=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-766140
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 04/May/22 16:29
Start Date: 04/May/22 16:29
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1117559675

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   7m  2s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 28s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 59s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |  17m 59s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   3m 19s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |  20m 20s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   8m  0s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |  35m 23s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  55m 50s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  22m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  17m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 11s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  20m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   7m 57s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |  35m 51s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  55m 52s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 662m 48s | 
[/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/4/artifact/out/patch-unit-root.txt)
 |  root in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   2m 40s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 956m 39s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | 
hadoop.yarn.server.router.webapp.TestRouterWebServicesREST |
   |   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication
 |
   |   | hadoop.yarn.server.resourcemanager.TestRMHA |
   |   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
   |   | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
   |   | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
   |   | hadoop.yarn.server.resourcemanager.TestClientRMService |
   |   | hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL |
   |   | 
hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1 |
   |   | 
hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer |
   |   | hadoop.yarn.client.TestGetGroups |
   |   | 
hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl |
   |   | hadoop.crypto.key.kms.server.TestKMS |
   |   | hadoop.crypto.key.kms.server.TestKMSWithZK |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpc |
   |   | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   |   | hadoop.hdfs.server.datanode.TestBPOfferService |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.mapred.TestLocalDistributedCacheManager |
   |   | hadoop.yarn.sls.appmaster.TestAMSimulator |
   |   | hadoop.yarn.sls.TestSLSRunner |
   |   | hadoop.yarn.sls.TestSLSDagAMSimulator |
   |   | hadoop.yarn.sls.TestReservationSystemInvariants |
   |   | hadoop.yarn.sls.TestSLSStreamAMSynth |
   |   | hadoop.yarn.sls.TestSLSGenericSynth |
   |   | hadoop.yarn.sls.nodemanager.TestNMSimulator |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=764660=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-764660
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 30/Apr/22 12:26
Start Date: 30/Apr/22 12:26
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1113980465

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 39s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 31s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 49s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |  18m  8s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   3m 20s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |  20m 21s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   8m  8s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |  35m 15s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  55m 37s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 36s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  22m 39s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 33s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  17m 33s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 12s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  20m  0s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   8m 23s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |  35m 44s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  55m 48s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 660m 38s | 
[/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/2/artifact/out/patch-unit-root.txt)
 |  root in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   2m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 947m 46s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | 
hadoop.yarn.server.router.webapp.TestRouterWebServicesREST |
   |   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication
 |
   |   | hadoop.yarn.server.resourcemanager.TestRMHA |
   |   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
   |   | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
   |   | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
   |   | hadoop.yarn.server.resourcemanager.TestClientRMService |
   |   | hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL |
   |   | 
hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1 |
   |   | 
hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer |
   |   | hadoop.yarn.client.TestGetGroups |
   |   | 
hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl |
   |   | hadoop.crypto.key.kms.server.TestKMS |
   |   | hadoop.crypto.key.kms.server.TestKMSWithZK |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpc |
   |   | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.mapred.TestLocalDistributedCacheManager |
   |   | hadoop.yarn.sls.appmaster.TestAMSimulator |
   |   | hadoop.yarn.sls.TestSLSRunner |
   |   | hadoop.yarn.sls.TestSLSDagAMSimulator |
   |   | hadoop.yarn.sls.TestReservationSystemInvariants |
   |   | hadoop.yarn.sls.TestSLSStreamAMSynth |
   |   | hadoop.yarn.sls.TestSLSGenericSynth |
   |   | hadoop.yarn.sls.nodemanager.TestNMSimulator |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=764073=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-764073
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 29/Apr/22 08:26
Start Date: 29/Apr/22 08:26
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4246:
URL: https://github.com/apache/hadoop/pull/4246#issuecomment-1113018015

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   6m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 29s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 53s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |  18m  9s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   3m 23s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |  20m 24s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   8m  2s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |  35m 29s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  59m 16s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  22m 45s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  17m 35s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  19m 49s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   7m 57s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |  35m 39s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  55m 59s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 579m 22s | 
[/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4246/1/artifact/out/patch-unit-root.txt)
 |  root in the patch failed.  |
   | +0 :ok: |  asflicense  |   1m 41s |  |  ASF License check generated no 
output?  |
   |  |   | 874m 55s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | 
hadoop.yarn.server.router.webapp.TestRouterWebServicesREST |
   |   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication
 |
   |   | hadoop.yarn.server.resourcemanager.TestRMHA |
   |   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
   |   | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
   |   | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
   |   | hadoop.yarn.server.resourcemanager.TestClientRMService |
   |   | hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL |
   |   | 
hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1 |
   |   | 
hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer |
   |   | hadoop.yarn.client.TestGetGroups |
   |   | 
hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl |
   |   | hadoop.crypto.key.kms.server.TestKMS |
   |   | hadoop.crypto.key.kms.server.TestKMSWithZK |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpc |
   |   | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   |   | hadoop.hdfs.server.namenode.TestFileTruncate |
   |   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
   |   | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
   |   | hadoop.hdfs.server.datanode.TestBPOfferService |
   |   | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.mapreduce.v2.TestMRJobs |
   |   | hadoop.mapreduce.v2.TestMRJobsWithProfiler |
   |   | hadoop.mapreduce.v2.TestRMNMInfo |
   |   | hadoop.mapreduce.v2.TestMRAppWithCombiner |
   |   | hadoop.mapreduce.v2.TestSpeculativeExecutionWithMRApp |
   |   | 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=763760=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763760
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 28/Apr/22 17:50
Start Date: 28/Apr/22 17:50
Worklog Time Spent: 10m 
  Work Description: saintstack opened a new pull request, #4246:
URL: https://github.com/apache/hadoop/pull/4246

   
   
   ### Description of PR
   
   Cherry-pick of 9ed8d60511dccf96108239c5c96e108a7d4bc975
   
   ### How was this patch tested?
   
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue ID 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




Issue Time Tracking
---

Worklog Id: (was: 763760)
Time Spent: 3h 40m  (was: 3.5h)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=763731=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763731
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 28/Apr/22 17:24
Start Date: 28/Apr/22 17:24
Worklog Time Spent: 10m 
  Work Description: saintstack merged PR #4170:
URL: https://github.com/apache/hadoop/pull/4170




Issue Time Tracking
---

Worklog Id: (was: 763731)
Time Spent: 3.5h  (was: 3h 20m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=762942=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762942
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 27/Apr/22 14:56
Start Date: 27/Apr/22 14:56
Worklog Time Spent: 10m 
  Work Description: saintstack commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-106375

   I'll merge later today (unless someone else beats me to it).




Issue Time Tracking
---

Worklog Id: (was: 762942)
Time Spent: 3h 20m  (was: 3h 10m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=762435=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762435
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 17:14
Start Date: 26/Apr/22 17:14
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1110051006

   Any more comments? Thanks.




Issue Time Tracking
---

Worklog Id: (was: 762435)
Time Spent: 3h 10m  (was: 3h)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=762042=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762042
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 21:13
Start Date: 25/Apr/22 21:13
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1109046711

   I ran the failed test [.TestReplaceDatanodeFailureReplication.testWithOnlyLastDatanodeIsAlive] locally multiple times, and it passed.




Issue Time Tracking
---

Worklog Id: (was: 762042)
Time Spent: 3h  (was: 2h 50m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=761326=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761326
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 23/Apr/22 18:44
Start Date: 23/Apr/22 18:44
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1107606815

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 56s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  16m  1s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m  2s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  24m 51s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  21m 32s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 29s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |  19m 53s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 37s | [/branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  root in trunk failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.  |
   | +1 :green_heart: |  javadoc  |   8m 28s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  38m 52s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  57m 54s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  26m  0s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 27s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |  24m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 38s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 38s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  19m 30s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 26s | [/patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  root in the patch failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.  |
   | +1 :green_heart: |  javadoc  |   8m 58s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  39m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  57m 59s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 1054m 18s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/patch-unit-root.txt) |  root in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   2m 16s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 1417m 34s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestReplaceDatanodeFailureReplication |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4170 |
   | Optional Tests | dupname asflicense codespell compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle |
   | uname | Linux 95b69934c683 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=761028=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761028
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 22/Apr/22 19:04
Start Date: 22/Apr/22 19:04
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r856498429


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java:
##
@@ -138,6 +138,35 @@ public void testNumVersionsCorrectAfterReregister()
 mapToCheck.get("version1").intValue(), 1);
   }
 
+  /**
+   * This test checks that if a node is re-registered with a different ip, its
+   * host2DatanodeMap is correctly updated with the new ip.
+   */
+  @Test

Review Comment:
   Yeah, the NetworkTopology map is another map. It is quite complicated to add a unit test for it. I am going to leave it to the HDFS experts to decide whether a unit test is needed.





Issue Time Tracking
---

Worklog Id: (was: 761028)
Time Spent: 2h 40m  (was: 2.5h)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=761024=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761024
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 22/Apr/22 18:56
Start Date: 22/Apr/22 18:56
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r856493693


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr());
+
   NameNode.stateChangeLog.info("BLOCK* registerDatanode: " + nodeS
   + " is replaced by " + nodeReg + " with the same storageID "
-  + nodeReg.getDatanodeUuid());
+  + nodeReg.getDatanodeUuid() + ", updateHost2DatanodeMap: " + updateHost2DatanodeMap);

Review Comment:
   I am going to upload the patch which does not log updateHost2DatanodeMap. 





Issue Time Tracking
---

Worklog Id: (was: 761024)
Time Spent: 2.5h  (was: 2h 20m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=761022=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761022
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 22/Apr/22 18:55
Start Date: 22/Apr/22 18:55
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r856493317


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr());

Review Comment:
   Yeah, since nodeS is not null, nodeS.getXferAddr() cannot be null.
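
The idea behind the check in the hunk above can be sketched as follows. This is a minimal illustration, not the actual DatanodeManager code: `NodeSketch`, `RegistrySketch`, `register`, and `findByAddr` are hypothetical stand-ins, and the address index here only plays the role of host2DatanodeMap. It assumes the fix amounts to dropping the entry keyed by the old transfer address and re-adding the node under the new one when the two addresses differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for a datanode record (not DatanodeDescriptor). */
final class NodeSketch {
  final String uuid;
  String xferAddr; // "ip:port" used for data transfer

  NodeSketch(String uuid, String xferAddr) {
    this.uuid = uuid;
    this.xferAddr = xferAddr;
  }
}

/** Hypothetical registry with a UUID index and an address index (stand-in for host2DatanodeMap). */
final class RegistrySketch {
  private final Map<String, NodeSketch> byUuid = new HashMap<>();
  private final Map<String, NodeSketch> byXferAddr = new HashMap<>();

  void register(String uuid, String xferAddr) {
    NodeSketch existing = byUuid.get(uuid);
    if (existing == null) {
      // First registration: add the node to both indexes.
      NodeSketch node = new NodeSketch(uuid, xferAddr);
      byUuid.put(uuid, node);
      byXferAddr.put(xferAddr, node);
      return;
    }
    // Re-registration with the same UUID: the address index only needs
    // to change when the transfer address changed (e.g. a pod got a new IP).
    boolean updateAddrIndex = !existing.xferAddr.equals(xferAddr);
    if (updateAddrIndex) {
      byXferAddr.remove(existing.xferAddr); // drop the stale key first
    }
    existing.xferAddr = xferAddr;
    if (updateAddrIndex) {
      byXferAddr.put(xferAddr, existing); // re-add under the new key
    }
  }

  NodeSketch findByAddr(String xferAddr) {
    return byXferAddr.get(xferAddr);
  }

  public static void main(String[] args) {
    RegistrySketch registry = new RegistrySketch();
    registry.register("uuid-1", "10.0.0.1:9866");
    registry.register("uuid-1", "10.0.0.2:9866"); // same node, new IP
    System.out.println("old address still mapped: "
        + (registry.findByAddr("10.0.0.1:9866") != null));
    System.out.println("new address mapped: "
        + (registry.findByAddr("10.0.0.2:9866") != null));
  }
}
```

Without the remove-and-re-add step, a lookup by the new address would miss even though the node is registered, which is the locality loss the issue describes.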





Issue Time Tracking
---

Worklog Id: (was: 761022)
Time Spent: 2h 20m  (was: 2h 10m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=760247=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-760247
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 21/Apr/22 17:33
Start Date: 21/Apr/22 17:33
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r855427229


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java:
##
@@ -138,6 +138,35 @@ public void testNumVersionsCorrectAfterReregister()
 mapToCheck.get("version1").intValue(), 1);
   }
 
+  /**
+   * This test checks that if a node is re-registered with a different ip, its
+   * host2DatanodeMap is correctly updated with the new ip.
+   */
+  @Test

Review Comment:
   When I checked the code, I found another map, the network topology one; its update code is what I followed for this host2DatanodeMap change. Let me check. I would rather avoid adding more test cases for code that is already there.





Issue Time Tracking
---

Worklog Id: (was: 760247)
Time Spent: 2h 10m  (was: 2h)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=760243=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-760243
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 21/Apr/22 17:28
Start Date: 21/Apr/22 17:28
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r855423782


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr());
+
   NameNode.stateChangeLog.info("BLOCK* registerDatanode: " + nodeS
   + " is replaced by " + nodeReg + " with the same storageID "
-  + nodeReg.getDatanodeUuid());
+  + nodeReg.getDatanodeUuid() + ", updateHost2DatanodeMap: " + updateHost2DatanodeMap);

Review Comment:
   Yeah, agree. Let me undo this change. 





Issue Time Tracking
---

Worklog Id: (was: 760243)
Time Spent: 2h  (was: 1h 50m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=760242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-760242
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 21/Apr/22 17:27
Start Date: 21/Apr/22 17:27
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r855422776


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr());

Review Comment:
   Thanks, Nick, for the comment. I think nodeS.getXferAddr() cannot be null; let me investigate more and get back to you.





Issue Time Tracking
---

Worklog Id: (was: 760242)
Time Spent: 1h 50m  (was: 1h 40m)

> Data locality is lost when DataNode pod restarts in kubernetes 
> ---
>
> Key: HDFS-16540
> URL: https://issues.apache.org/jira/browse/HDFS-16540
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.2
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We have HBase RegionServer and Hdfs DataNode running in one pod. When the pod 
> restarts, we found that data locality is lost after we do a major compaction 
> of hbase regions. After some debugging, we found that upon pod restarts, its 
> ip changes. In DatanodeManager, maps like networktopology are updated with 
> the new info. host2DatanodeMap is not updated accordingly. When hdfs client 
> with the new ip tries to find a local DataNode, it fails. 
>  






[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=760139=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-760139
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 21/Apr/22 15:45
Start Date: 21/Apr/22 15:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1105400155

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 58s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 45s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  24m 43s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  20m 54s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   3m 55s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |  25m 47s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 30s | [/branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  root in trunk failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.  |
   | +1 :green_heart: |  javadoc  |   8m 21s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  39m  0s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  57m 59s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  25m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 26s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |  24m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 29s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 15s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  20m 28s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 30s | [/patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  root in the patch failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.  |
   | +1 :green_heart: |  javadoc  |   8m 27s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  39m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  58m  8s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 1064m 10s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/patch-unit-root.txt) |  root in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   2m 12s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 1431m 51s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |
   |   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4170 |
   | Optional Tests | dupname asflicense codespell compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle |
   | uname | Linux c73e326f6bae 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=759978=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-759978
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 21/Apr/22 12:26
Start Date: 21/Apr/22 12:26
Worklog Time Spent: 10m 
  Work Description: ndimiduk commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r855124598


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr());
+
   NameNode.stateChangeLog.info("BLOCK* registerDatanode: " + nodeS
   + " is replaced by " + nodeReg + " with the same storageID "
-  + nodeReg.getDatanodeUuid());
+  + nodeReg.getDatanodeUuid() + ", updateHost2DatanodeMap: " + updateHost2DatanodeMap);

Review Comment:
   Is this extra information needed in the `INFO`-level log message? I understand that 
having the value printed is helpful during development, but I don't think it's 
meaningful to an operator.
   
   Also, since you're already changing this log message, could you switch it to 
the format-string version instead of string concatenation?
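
   A minimal sketch of the format-string style suggested above. It uses only the JDK (`java.util.logging` and `java.text.MessageFormat`) so that it runs standalone; the logger used in `DatanodeManager` may take a different placeholder syntax (SLF4J, for instance, writes `{}` rather than `{0}`). The class name, logger name, and sample values are illustrative assumptions, not part of the patch.

```java
import java.text.MessageFormat;
import java.util.logging.Level;
import java.util.logging.Logger;

class RegisterLogSketch {
  // Message pattern with positional placeholders instead of string concatenation.
  static final String PATTERN =
      "BLOCK* registerDatanode: {0} is replaced by {1} with the same storageID {2}";

  // Renders the pattern with MessageFormat, the same substitution
  // java.util.logging applies to parameterized log records.
  static String render(Object oldNode, Object newNode, Object uuid) {
    return MessageFormat.format(PATTERN, oldNode, newNode, uuid);
  }

  public static void main(String[] args) {
    Logger log = Logger.getLogger("StateChange"); // hypothetical logger name
    // The arguments are passed separately; no message string is built by hand.
    log.log(Level.INFO, PATTERN,
        new Object[] {"10.0.0.5:9866", "10.0.0.9:9866", "uuid-1"});
  }
}
```

   The point of the style is that the message text stays readable as one string and the arguments are substituted by the logging framework.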



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,25 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr());

Review Comment:
   Can `nodeS.getXferAddr()` ever be null? Use of `Objects.equals()` would be 
safer.
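
   A small sketch of the null-safe comparison suggested above, using `java.util.Objects.equals`. The class and method names are hypothetical, and plain strings stand in for the values returned by `getXferAddr()`; whether that method can actually return null is the open question in the comment, not something this sketch answers.

```java
import java.util.Objects;

class NullSafeCompareSketch {
  // True when the two transfer addresses differ. Unlike a.equals(b),
  // Objects.equals does not throw if the first argument is null.
  static boolean addressChanged(String oldXferAddr, String newXferAddr) {
    return !Objects.equals(oldXferAddr, newXferAddr);
  }

  public static void main(String[] args) {
    System.out.println(addressChanged("10.0.0.5:9866", "10.0.0.9:9866"));
    System.out.println(addressChanged(null, "10.0.0.9:9866"));
    System.out.println(addressChanged(null, null));
  }
}
```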



##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java:
##
@@ -138,6 +138,35 @@ public void testNumVersionsCorrectAfterReregister()
 mapToCheck.get("version1").intValue(), 1);
   }
 
+  /**
+   * This test checks that if a node is re-registered with a different ip, its
+   * host2DatanodeMap is correctly updated with the new ip.
+   */
+  @Test

Review Comment:
   It looks like there are a couple of caches like this one. Is there equivalent 
test coverage for all of the registered DN state that is updated on registration? 
Can you add coverage for any other state that is currently missing it?





Issue Time Tracking
---

Worklog Id: (was: 759978)
Time Spent: 1.5h  (was: 1h 20m)







[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=758019=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-758019
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/Apr/22 17:29
Start Date: 18/Apr/22 17:29
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r852277677


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getIpAddr()) ||
+  (nodeS.getXferPort() != nodeReg.getXferPort());
+

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 758019)
Time Spent: 1h 20m  (was: 1h 10m)







[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757979=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757979
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 18/Apr/22 16:45
Start Date: 18/Apr/22 16:45
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r852249004


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getIpAddr()) ||
+  (nodeS.getXferPort() != nodeReg.getXferPort());
+

Review Comment:
   Makes sense as getXferAddr() is "getIpAddr():getXferPort()". 
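
   To illustrate why the two conditions agree, here is a sketch under the assumption stated in the comment above, namely that `getXferAddr()` is the IP address and the transfer port joined by a colon. `XferAddrSketch` is a hypothetical stand-in, not the Hadoop `DatanodeID` class.

```java
class XferAddrSketch {
  final String ipAddr;
  final int xferPort;

  XferAddrSketch(String ipAddr, int xferPort) {
    this.ipAddr = ipAddr;
    this.xferPort = xferPort;
  }

  // Assumed shape of the transfer address: "ip:port".
  String getXferAddr() {
    return ipAddr + ":" + xferPort;
  }

  // Original condition: compare IP and port separately.
  static boolean changedByParts(XferAddrSketch a, XferAddrSketch b) {
    return !a.ipAddr.equals(b.ipAddr) || a.xferPort != b.xferPort;
  }

  // Suggested condition: compare the combined transfer address.
  static boolean changedByXferAddr(XferAddrSketch a, XferAddrSketch b) {
    return !a.getXferAddr().equals(b.getXferAddr());
  }

  public static void main(String[] args) {
    XferAddrSketch before = new XferAddrSketch("10.0.0.5", 9866);
    XferAddrSketch after = new XferAddrSketch("10.0.0.9", 9866);
    System.out.println(changedByParts(before, after));
    System.out.println(changedByXferAddr(before, after));
  }
}
```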





Issue Time Tracking
---

Worklog Id: (was: 757979)
Time Spent: 1h 10m  (was: 1h)







[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757552=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757552
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 16/Apr/22 04:10
Start Date: 16/Apr/22 04:10
Worklog Time Spent: 10m 
  Work Description: tomscut commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r851582741


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getIpAddr()) ||
+  (nodeS.getXferPort() != nodeReg.getXferPort());
+

Review Comment:
   Can we change 
   `updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getIpAddr()) ||
 (nodeS.getXferPort() != nodeReg.getXferPort());`
   to
   `updateHost2DatanodeMap = !nodeS.getXferAddr().equals(nodeReg.getXferAddr());`?





Issue Time Tracking
---

Worklog Id: (was: 757552)
Time Spent: 1h  (was: 50m)







[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757526=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757526
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 15/Apr/22 23:15
Start Date: 15/Apr/22 23:15
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r851552322


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getAddress()) ||

Review Comment:
   Updated and verified.





Issue Time Tracking
---

Worklog Id: (was: 757526)
Time Spent: 50m  (was: 40m)







[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757478
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 15/Apr/22 18:46
Start Date: 15/Apr/22 18:46
Worklog Time Spent: 10m 
  Work Description: huaxiangsun commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r851432390


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getAddress()) ||

Review Comment:
   Good catch! I'll change it to getIpAddr() and push an update shortly.





Issue Time Tracking
---

Worklog Id: (was: 757478)
Time Spent: 40m  (was: 0.5h)







[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=757262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757262
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 15/Apr/22 01:26
Start Date: 15/Apr/22 01:26
Worklog Time Spent: 10m 
  Work Description: tomscut commented on code in PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#discussion_r850967827


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##
@@ -1189,16 +1190,26 @@ public void registerDatanode(DatanodeRegistration nodeReg)
 nodes with its data cleared (or user can just remove the StorageID
 value in "VERSION" file under the data directory of the datanode,
 but this is might not work if VERSION file format has changed 
- */
+ */
+  // Check if nodeS's host information is same as nodeReg's, if not,
+  // it needs to update host2DatanodeMap accordringly.
+  updateHost2DatanodeMap = !nodeS.getIpAddr().equals(nodeReg.getAddress()) ||

Review Comment:
   `nodeReg.getAddress()` contains port, but `nodeS.getIpAddr()` doesn't, so 
`updateHost2DatanodeMap` is always `true`, right?
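
   A short sketch of the problem described above, assuming (as the comment states) that the address form carries a port while the IP form does not. The class and method names are hypothetical and plain strings stand in for the Hadoop accessors.

```java
class AddressMismatchSketch {
  // Mirrors the flagged comparison: a bare IP compared with "ip:port".
  // The two strings can never be equal, so the result is always true.
  static boolean flaggedComparison(String ipAddr, int port) {
    String addressWithPort = ipAddr + ":" + port; // assumed "ip:port" form
    return !ipAddr.equals(addressWithPort);
  }

  // Like-for-like comparison: bare IP against bare IP.
  static boolean ipChanged(String oldIp, String newIp) {
    return !oldIp.equals(newIp);
  }

  public static void main(String[] args) {
    // Same node, nothing changed, yet the flagged comparison reports a change.
    System.out.println(flaggedComparison("10.0.0.5", 9866));
    System.out.println(ipChanged("10.0.0.5", "10.0.0.5"));
  }
}
```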





Issue Time Tracking
---

Worklog Id: (was: 757262)
Time Spent: 0.5h  (was: 20m)







[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=756827=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-756827
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 14/Apr/22 03:46
Start Date: 14/Apr/22 03:46
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4170:
URL: https://github.com/apache/hadoop/pull/4170#issuecomment-1098680455

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  17m  3s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 35s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 31s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 14s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 25s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 32s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m  6s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 341m 40s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 41s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 474m  5s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4170 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 98cf76256475 4.15.0-162-generic #170-Ubuntu SMP Mon Oct 18 11:38:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 1316ff0eada1e29dec8ca56ab266c9bcbe60051c |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/testReport/ |
   | Max. process+thread count | 2175 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This 

[jira] [Work logged] (HDFS-16540) Data locality is lost when DataNode pod restarts in kubernetes

2022-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16540?focusedWorklogId=756643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-756643
 ]

ASF GitHub Bot logged work on HDFS-16540:
-

Author: ASF GitHub Bot
Created on: 13/Apr/22 19:51
Start Date: 13/Apr/22 19:51
Worklog Time Spent: 10m 
  Work Description: huaxiangsun opened a new pull request, #4170:
URL: https://github.com/apache/hadoop/pull/4170

   …etes
   
   
   
   ### Description of PR
   When a DataNode with the same UUID re-registers with a different IP address, 
host2DatanodeMap needs to be updated accordingly.
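
   A simplified sketch of the idea, not the actual `DatanodeManager` code: two hypothetical maps, one keyed by UUID and one keyed by IP, where re-registration with a new IP removes the stale IP entry and adds the new one. The real host map may hold several nodes per host; this sketch assumes one node per IP for brevity.

```java
import java.util.HashMap;
import java.util.Map;

class HostMapSketch {
  // Hypothetical stand-in for a registered DataNode: stable UUID, mutable IP.
  static final class Node {
    final String uuid;
    String ip;

    Node(String uuid, String ip) {
      this.uuid = uuid;
      this.ip = ip;
    }
  }

  private final Map<String, Node> byUuid = new HashMap<>();
  private final Map<String, Node> byIp = new HashMap<>();

  // Registers a node, or re-registers a known UUID under a possibly new IP.
  void register(String uuid, String ip) {
    Node existing = byUuid.get(uuid);
    if (existing == null) {
      Node node = new Node(uuid, ip);
      byUuid.put(uuid, node);
      byIp.put(ip, node);
      return;
    }
    if (!existing.ip.equals(ip)) {
      byIp.remove(existing.ip); // drop the stale entry for the old IP
      existing.ip = ip;
      byIp.put(ip, existing);   // make the node findable by its new IP
    }
  }

  // Lookup a client on the same host would rely on to find a local node.
  Node findByIp(String ip) {
    return byIp.get(ip);
  }
}
```

   Without the remove-and-add step, a lookup by the new IP would return nothing even though the node is registered, which matches the lost-locality symptom described in the issue.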
   
   ### How was this patch tested?
   Tested 3.3.2 with the patch on an EKS cluster: restarted the pod hosting the 
DataNode and HBase RegionServer, then ran a major compaction of the HBase 
regions and confirmed that locality is kept.
   
   A new unit test case is also added.
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




Issue Time Tracking
---

Worklog Id: (was: 756643)
Remaining Estimate: 0h
Time Spent: 10m



