I get the same error ("metrics already exists") after adding -asService to
the command line. The only difference is that it now retries every 5 minutes:

2025-03-09 15:05:04,542 INFO balancer.Balancer: Finished one round, will
wait for 5.0 minutes for next round

That does not seem like a good workaround. My cluster has hundreds of TB to
rebalance whenever I add a data node, and I don't remember having such issues
when I was using Hadoop 2.9.1.
Is there any known issue with the balancer in recent Hadoop versions?

Thanks,
Sébastien

On Sun, Mar 9, 2025 at 16:02, Sébastien Rebecchi <srebec...@kameleoon.com>
wrote:

> OK, I can try that, hoping it will help.
> By the way, even if it works, it does not explain this metrics exception.
> Any idea how to solve it? I can't find a way to delete that metrics source
> in any Hadoop doc.
>
> Thanks
>
> Sébastien.
>
> On Sun, Mar 9, 2025 at 15:39, Zhanghaobo <hfutzhan...@163.com> wrote:
>
>> Got it. You can run it as a service and see what happens.
>>
>> ---- Replied Message ----
>> From: Sébastien Rebecchi <srebec...@kameleoon.com>
>> Date: 03/09/2025 22:22
>> To: Zhanghaobo <hfutzhan...@163.com>
>> Cc: user@hadoop.apache.org, hdfs-...@hadoop.apache.org
>> Subject: Re: Can not run HDFS balancer cause metrics already exists
>> Hi Zhanghaobo,
>>
>> Thanks for the message.
>>
>> No, I don't use -asService. As I said, the command line is the following:
>>
>> hdfs balancer -Ddfs.balancer.movedWinWidth=5400000
>> -Ddfs.balancer.moverThreads=1000 -Ddfs.balancer.dispatcherThreads=200
>> -Ddfs.datanode.balance.max.concurrent.moves=50
>> -Ddfs.datanode.balance.bandwidthPerSec=100m
>> -Ddfs.balancer.max-size-to-move=10737418240 -threshold 1
>>
>> Also, no other balancer is running concurrently on any other node.
>>
>> Sébastien
>>
>> On Sun, Mar 9, 2025 at 13:57, Zhanghaobo <hfutzhan...@163.com> wrote:
>>
>>>
>>> Hi @Sébastien Rebecchi,
>>> I don't know the details of how you start the balancer. Did you use
>>> -asService?
>>>
>>>
>>> ---- Replied Message ----
>>> From: Sébastien Rebecchi <srebec...@kameleoon.com.INVALID>
>>> Date: 3/9/2025 18:03
>>> To: user@hadoop.apache.org, hdfs-...@hadoop.apache.org
>>> Subject: Re: Can not run HDFS balancer cause metrics already exists
>>> Hello
>>>
>>> Could anyone help with this, please?
>>> The situation is still the same after several days.
>>> Some additional details:
>>> - Hadoop version 3.4.1
>>> - balancer command line: hdfs balancer
>>> -Ddfs.balancer.movedWinWidth=5400000 -Ddfs.balancer.moverThreads=1000
>>> -Ddfs.balancer.dispatcherThreads=200
>>> -Ddfs.datanode.balance.max.concurrent.moves=50
>>> -Ddfs.datanode.balance.bandwidthPerSec=100m
>>> -Ddfs.balancer.max-size-to-move=10737418240 -threshold 1
>>>
>>> Thank you
>>>
>>>
>>> On Tue, Mar 4, 2025, 16:59, Sébastien Rebecchi <srebec...@kameleoon.com>
>>> wrote:
>>>
>>>> Hello
>>>>
>>>> After adding a new node to my HDFS cluster, I tried running the
>>>> balancer, but it always fails with the following error, even after
>>>> retrying multiple times during the day and after restarting the name node.
>>>> What should I do to unblock it?
>>>>
>>>> Thanks,
>>>>
>>>> Sébastien
>>>>
>>>>
>>>> ERROR balancer.Balancer: Exiting balancer due an exception
>>>> org.apache.hadoop.metrics2.MetricsException: Metrics source Balancer-{HERE REPLACE BY CLUSTER'S BLOCK POOL ID} already exists!
>>>>         at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152)
>>>>         at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125)
>>>>         at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
>>>>         at org.apache.hadoop.hdfs.server.balancer.BalancerMetrics.create(BalancerMetrics.java:52)
>>>>         at org.apache.hadoop.hdfs.server.balancer.Balancer.<init>(Balancer.java:362)
>>>>         at org.apache.hadoop.hdfs.server.balancer.Balancer.doBalance(Balancer.java:824)
>>>>         at org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:868)
>>>>         at org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:975)
>>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>>>>         at org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:1133)
>>>>
>>>

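A note on reading the trace: the exception is raised in DefaultMetricsSystem.newSourceName while the Balancer constructor registers its metrics source. One possible reading, which is an assumption and not something confirmed in this thread, is that the same source name gets registered more than once inside a single balancer process, in which case there would be nothing stored on the cluster to delete. The sketch below is a simplified stand-in written for illustration only; it is not Hadoop's code, and the class name SourceNameRegistry, its methods, and the sample block pool ID are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for a per-process, name-keyed metrics registry.
// NOT Hadoop's DefaultMetricsSystem: every name here is hypothetical.
public class SourceNameRegistry {
    private final Set<String> names = new HashSet<>();

    // Registers a source name and rejects a second registration of the same name.
    public synchronized String register(String name) {
        if (!names.add(name)) {
            throw new IllegalStateException(
                "Metrics source " + name + " already exists!");
        }
        return name;
    }

    // Frees a name so that it can be registered again later.
    public synchronized void unregister(String name) {
        names.remove(name);
    }

    public static void main(String[] args) {
        SourceNameRegistry registry = new SourceNameRegistry();
        String source = "Balancer-BP-example"; // hypothetical block pool ID

        registry.register(source); // first registration is accepted
        try {
            registry.register(source); // same name, same process: rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        registry.unregister(source);
        registry.register(source); // accepted again once the name is freed
    }
}
```

If that reading is right, the fix would belong in the balancer code (registering once per process, or unregistering between rounds) and not in any cluster-side cleanup, but that remains to be confirmed by someone who knows the 3.4.1 balancer.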