[
https://issues.apache.org/jira/browse/HDFS-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18036559#comment-18036559
]
ASF GitHub Bot commented on HDFS-16668:
---------------------------------------
github-actions[bot] closed pull request #4577: HDFS-16668. Clean up
moverExecutor after each iterations.
URL: https://github.com/apache/hadoop/pull/4577
> Clean up MoverExecutor after each iteration to avoid potential thread leak
> --------------------------------------------------------------------------
>
> Key: HDFS-16668
> URL: https://issues.apache.org/jira/browse/HDFS-16668
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.3.3
> Reporter: Tai Zhou
> Priority: Major
> Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Hi,
> I have recently been working on an HDFS smart storage management project. It is
> based on the Mover in the Hadoop-hdfs project. I noticed that most of the code
> in Mover is similar to Balancer. However, Mover doesn't clean up MoverExecutor
> as Balancer does.
> If there are multiple NameSystems for the Namenode Connectors, or a large
> number of datanodes, Mover leaks threads because it may take numerous
> iterations to process these namespaces. In our project, we modified some
> source code so that we can call mover.run() whenever we find blocks that do
> not match the expected storage policies, so our application initializes the
> Namenode Connector and Mover repeatedly. As a result, we end up with
> thousands of threads or thread pools for MoverExecutor.
> Here is what it looks like. There are 9000+ threads like this in the WAIT
> state.
> !screenshot-2.png|width=558,height=209!
> I know that users generally may not use Mover the way we do; they might run
> it from the CLI. But more and more users are planning to adopt RBF or
> multiple NameSystems, or to run a large cluster of datanodes. In those cases
> the Mover CLI has to keep thousands of threads alive after the user presses
> the enter key.
> I have submitted a pull request with a quick fix. If you are interested,
> please take a look at it.
> Thanks.
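
The leak described in the issue can be illustrated with a minimal sketch
that uses only java.util.concurrent. It is not the Hadoop Mover code and
not the change in the pull request: the class ExecutorCleanupSketch and the
methods runIteration, shutdownAndWait and liveWorkersAfter are hypothetical
names chosen for illustration. Each "iteration" creates its own fixed-size
pool, as the report says happens with MoverExecutor; when the pool is not
shut down, its idle worker threads stay parked waiting for work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExecutorCleanupSketch {

    // Hypothetical stand-in for one iteration that owns its own executor.
    // This is an assumption for illustration, not the real Mover logic.
    static ThreadPoolExecutor runIteration(int poolSize, boolean cleanUp) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
        try {
            // Each submission below the core size starts a new worker thread.
            for (int i = 0; i < poolSize; i++) {
                pool.execute(() -> { /* pretend to move one block */ });
            }
        } finally {
            if (cleanUp) {
                // The cleanup step: release the workers when the iteration ends.
                shutdownAndWait(pool);
            }
        }
        return pool;
    }

    // Orderly shutdown, falling back to shutdownNow() on timeout or interrupt.
    static void shutdownAndWait(ThreadPoolExecutor pool) {
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }

    // Runs several iterations and counts worker threads that are still alive.
    static int liveWorkersAfter(int iterations, int poolSize, boolean cleanUp) {
        List<ThreadPoolExecutor> pools = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            pools.add(runIteration(poolSize, cleanUp));
        }
        int live = 0;
        for (ThreadPoolExecutor pool : pools) {
            live += pool.getPoolSize();
        }
        // Shut everything down at the end so this demo itself can exit.
        for (ThreadPoolExecutor pool : pools) {
            shutdownAndWait(pool);
        }
        return live;
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + liveWorkersAfter(3, 4, false));
        System.out.println("with cleanup:    " + liveWorkersAfter(3, 4, true));
    }
}
```

With 3 iterations and 4 threads per pool, the run without cleanup should
report 12 live workers and the run with cleanup should report 0. The same
growth, repeated over many iterations or namespaces, is consistent with the
thousands of waiting threads described above.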
--
This message was sent by Atlassian Jira
(v8.20.10#820010)