[
https://issues.apache.org/jira/browse/HDFS-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199870#comment-17199870
]
Xiaoqiao He commented on HDFS-15589:
------------------------------------
Thanks [~zhengchenyu] for your report. I wonder whether there is any impact on
the NameNode when PMB (short for `PostponedMisreplicatedBlocks`) stays large for
a long time? In my experience the PMB count has reached nearly 100M, and I have
not seen any performance issue on my internal branch. What issues did you run
into? Thanks.
> Huge PostponedMisreplicatedBlocks can't decrease immediately when start
> namenode after datanode
> -----------------------------------------------------------------------------------------------
>
> Key: HDFS-15589
> URL: https://issues.apache.org/jira/browse/HDFS-15589
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Environment: CentOS 7
> Reporter: zhengchenyu
> Priority: Major
>
> In our test cluster, I restarted the namenode and then found a large number of
> PostponedMisreplicatedBlocks, which did not decrease immediately.
> I searched the log and found entries like these.
> {code:java}
> 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> {code}
> Note: the test cluster has only 6 datanodes.
> You can see that the block reports are processed before "Marking all datanodes
> as stale" is logged by startActiveServices. But
> DatanodeStorageInfo.blockContentsStale is only set to false by a block report,
> and startActiveServices then marks all datanodes as stale. So the datanodes
> stay stale until the next block report, and PostponedMisreplicatedBlocks stays
> at a huge number.
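
The ordering problem described above can be illustrated with a small standalone sketch. This is not HDFS code: `StaleOrderingSketch`, `Storage`, and `staleStoragesAfterStartup` are hypothetical names, and the model assumes one storage per datanode and keeps only the stale flag. It shows that when block reports are processed before the "mark all stale" step, every storage ends up stale again until the next report.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleOrderingSketch {
    // Hypothetical stand-in for DatanodeStorageInfo; only the stale flag is modelled.
    static final class Storage {
        boolean blockContentsStale = true;

        // In this model, a block report is the only thing that clears the flag.
        void receivedBlockReport() { blockContentsStale = false; }

        // Models the "Marking all datanodes as stale" step.
        void markStale() { blockContentsStale = true; }
    }

    /** Returns how many storages are still stale after the simulated startup. */
    public static int staleStoragesAfterStartup(int datanodes, boolean reportsArriveFirst) {
        List<Storage> storages = new ArrayList<>();
        for (int i = 0; i < datanodes; i++) {
            storages.add(new Storage());
        }

        if (reportsArriveFirst) {
            // Order described in the report: reports first, then everything marked stale.
            storages.forEach(Storage::receivedBlockReport);
            storages.forEach(Storage::markStale);
        } else {
            // Opposite order: the reports clear the flag after the stale marking.
            storages.forEach(Storage::markStale);
            storages.forEach(Storage::receivedBlockReport);
        }

        int stale = 0;
        for (Storage s : storages) {
            if (s.blockContentsStale) {
                stale++;
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        System.out.println("reports first: " + staleStoragesAfterStartup(6, true) + " stale");
        System.out.println("stale mark first: " + staleStoragesAfterStartup(6, false) + " stale");
    }
}
```

With 6 datanodes, as in the test cluster, the first ordering leaves all 6 storages stale, while the second leaves none. In the first case the flags can only clear at the next block report.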