[
https://issues.apache.org/jira/browse/HDFS-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828914#comment-17828914
]
Xiping Zhang commented on HDFS-16016:
-------------------------------------
HDFS-16016 is a good improvement. In our production environment we have
some large DNs with 24 disks and more than 10 million blocks in total.
As hardware develops, DNs are likely to become even larger, and if FBR and
IBR stay coupled together, the impact on the service is great. HDFS-16016
solves exactly this DN scaling problem. For the issue HDFS-17129, I have a
solution, which is to redefine the semantics of the FBR. Instead of requiring
the DN to align 100% of its blocks with the Namenode in a given FBR, we only
need to compare the blocks before the last block of that FBR, even though the
FBR may miss some blocks that arrived through incremental reports. I've drawn
a diagram for ease of understanding:
* step1: How the NN processes block reports before upgrading to
HDFS-16016
* step2: How the NN processes block reports after upgrading to HDFS-16016,
where the HDFS-17129 problem appears
* step3: We operate only on the blocks before the last zero bound point of the
FBR
* step4: Blocks not handled by the previous FBR are processed by the next
FBR, unless the DN does not add any new blocks between FBRs
!image-2024-03-20-18-31-23-937.png!
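To make the diagram concrete, here is a minimal Java sketch of one possible reading of step3 and step4. It is not code from the HDFS tree: the class BoundedFbrSketch and the method toRemove are hypothetical names, blocks are reduced to plain long IDs, and the boundary is assumed to be the largest block ID contained in the FBR. Whether the real boundary can be defined this way is exactly the open question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class BoundedFbrSketch {

    /**
     * Returns the block IDs the NN would schedule for removal from this DN.
     * Only blocks at or before the boundary are compared; blocks past the
     * boundary (for example, ones added by an IBR that raced with the FBR)
     * are left untouched and wait for the next FBR.
     */
    static List<Long> toRemove(TreeSet<Long> nnBlocks, TreeSet<Long> reported) {
        List<Long> result = new ArrayList<>();
        if (reported.isEmpty()) {
            // Assumption: with no reported block there is no boundary,
            // so this sketch defers everything to the next FBR.
            return result;
        }
        // Assumption: the boundary is the largest block ID in the FBR.
        long boundary = reported.last();
        for (long id : nnBlocks.headSet(boundary, true)) {
            if (!reported.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // NN view of the DN: blocks 4 and 5 arrived through IBR.
        TreeSet<Long> nnBlocks = new TreeSet<>(List.of(1L, 2L, 3L, 4L, 5L));
        // FBR snapshot taken before blocks 4 and 5 existed on the DN.
        TreeSet<Long> reported = new TreeSet<>(List.of(1L, 3L));
        System.out.println(toRemove(nnBlocks, reported)); // prints [2]
    }
}
```

In this example only block 2 is treated as stale, because it lies before the boundary and is missing from the FBR. Blocks 4 and 5 lie past the boundary, so they are not removed and would be checked by the next FBR.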
[~liuguanghua] [~hexiaoqiao] [~tasanuma] Hello, do you have any suggestions
on my current understanding of FBR and on this plan? Using lock
restrictions here would be like going back to square one. If we use this
solution, we only need to remove the remaining to_remove block, and we only
need to remove one piece of code.
> BPServiceActor add a new thread to handle IBR
> ---------------------------------------------
>
> Key: HDFS-16016
> URL: https://issues.apache.org/jira/browse/HDFS-16016
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: JiangHua Zhu
> Assignee: Viraj Jasani
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.3.6
>
> Attachments: image-2023-11-03-18-11-54-502.png,
> image-2023-11-06-10-53-13-584.png, image-2023-11-06-10-55-50-939.png,
> image-2024-03-20-18-31-23-937.png
>
> Time Spent: 5h 20m
> Remaining Estimate: 0h
>
> Currently BPServiceActor#offerService() does many things: FBR, IBR, and
> heartbeat. We can handle IBR independently to improve the performance of
> heartbeat and FBR.