Ren Koike created HDDS-12040:
--------------------------------
Summary: Fix ozone freon cr command
Key: HDDS-12040
URL: https://issues.apache.org/jira/browse/HDDS-12040
Project: Apache Ozone
Issue Type: Bug
Components: freon
Reporter: Ren Koike
Assignee: Ren Koike
"freon cr" command does not work because a map that record replication metrics
in ReplicationSupervisor doesn't have a given key.
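The failure mode can be illustrated with a minimal sketch. It is not the actual ReplicationSupervisor code and not necessarily the fix for this issue; the class CounterMapSketch, its methods, and the key name are hypothetical. It assumes only that the counters live in a Map of AtomicLong values, as the stack trace below suggests.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a per-key metrics counter map.
public class CounterMapSketch {
  private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

  // Mirrors the failing pattern: get() returns null for a key that was
  // never registered, so incrementAndGet() throws NullPointerException.
  public long incrementUnsafe(String key) {
    return counters.get(key).incrementAndGet();
  }

  // One defensive alternative (an assumption, not the project's fix):
  // create the counter on first use.
  public long incrementSafe(String key) {
    return counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
  }

  public static void main(String[] args) {
    CounterMapSketch sketch = new CounterMapSketch();
    try {
      sketch.incrementUnsafe("hypotheticalTaskType");
    } catch (NullPointerException e) {
      System.out.println("unregistered key caused a NullPointerException");
    }
    System.out.println("safe increment: " + sketch.incrementSafe("hypotheticalTaskType"));
  }
}
{code}

Registering every expected key up front would work just as well as computeIfAbsent; which approach suits ReplicationSupervisor depends on how its metrics are initialized.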
How to reproduce the error:
{code:bash}
# create keys
ozone freon ockg -n 10
# close a container
ozone admin container close 1
# run the command
ozone freon cr -n 1
{code}
The error:
{code:java}
bash-4.4$ ozone freon cr -n 1
2025-01-07 11:39:33,018 [main] INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2025-01-07 11:39:33,137 [main] INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2025-01-07 11:39:33,137 [main] INFO impl.MetricsSystemImpl: ozone-freon metrics system started
2025-01-07 11:39:33,525 [main] WARN cli.ContainerOperationClient: Attempting to list 1000000 containers. However, this exceeds the cluster's current limit of 4096. The results will be capped at the maximum allowed count.
2025-01-07 11:39:33,859 [main] INFO utils.LeakDetector: Starting leak detector thread ManagedRocksObject0.
2025-01-07 11:39:33,966 [main] INFO fs.SaveSpaceUsageToFile: Cached usage info file /data/hdds/scmUsed not found
2025-01-07 11:39:33,977 [main] INFO volume.HddsVolume: Creating HddsVolume: /data/hdds/hdds of storage type : DISK capacity : 62664830746
2025-01-07 11:39:33,978 [main] INFO volume.MutableVolumeSet: Added Volume : /data/hdds/hdds to VolumeSet
2025-01-07 11:39:33,984 [main] INFO impl.FilePerChunkStrategy: ozone.chunk.read.mapped.buffer.max.count is load with 0
2025-01-07 11:39:33,998 [main] INFO impl.FilePerBlockStrategy: ozone.chunk.read.mapped.buffer.max.count is load with 0
2025-01-07 11:39:34,260 [main] INFO replication.ReplicationSupervisor: Initializing replication supervisor with thread count = 10
2025-01-07 11:39:34,303 [main] INFO freon.BaseFreonGenerator: Executing test with prefix 0saivhku9q and number-of-tests 1
0.00% |? | 0/1 Time: 0:00:00|
2025-01-07 11:39:34,341 [pool-6-thread-1] WARN replication.ReplicationSupervisor: Failed FAILED replicateContainerCommand: containerId=1, replicaIndex=0, sourceNodes=[88b91fe4-eaa3-4d3f-a426-d72e70abbe19(ozone-datanode-1.ozone_default/172.18.0.8)], priority=NORMAL
java.lang.NullPointerException: Cannot invoke "java.util.concurrent.atomic.AtomicLong.incrementAndGet()" because the return value of "java.util.Map.get(Object)" is null
    at org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:370)
    at org.apache.hadoop.ozone.freon.ClosedContainerReplicator.lambda$replicateContainer$1(ClosedContainerReplicator.java:243)
    at com.codahale.metrics.Timer.time(Timer.java:101)
    at org.apache.hadoop.ozone.freon.ClosedContainerReplicator.replicateContainer(ClosedContainerReplicator.java:240)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.tryNextTask(BaseFreonGenerator.java:220)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.taskLoop(BaseFreonGenerator.java:200)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.lambda$startTaskRunners$0(BaseFreonGenerator.java:174)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
2025-01-07 11:39:34,343 [pool-6-thread-1] ERROR freon.BaseFreonGenerator: Error on executing task 0
java.lang.NullPointerException: Cannot invoke "java.util.concurrent.atomic.AtomicLong.decrementAndGet()" because the return value of "java.util.Map.get(Object)" is null
    at org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:417)
    at org.apache.hadoop.ozone.freon.ClosedContainerReplicator.lambda$replicateContainer$1(ClosedContainerReplicator.java:243)
    at com.codahale.metrics.Timer.time(Timer.java:101)
    at org.apache.hadoop.ozone.freon.ClosedContainerReplicator.replicateContainer(ClosedContainerReplicator.java:240)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.tryNextTask(BaseFreonGenerator.java:220)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.taskLoop(BaseFreonGenerator.java:200)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.lambda$startTaskRunners$0(BaseFreonGenerator.java:174)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
100.00% |?????????????????????????????????????????????????????????????????????????????????????????????????????| 1/1 Time: 0:00:01|
One or more freon test is failed.
1/7/25, 11:39:35 AM ============================================================

-- Timers ----------------------------------------------------------------------
replicate-container
count = 1
mean rate = 0.98 calls/second
1-minute rate = 0.00 calls/second
5-minute rate = 0.00 calls/second
15-minute rate = 0.00 calls/second
min = 2.42 milliseconds
max = 2.42 milliseconds
mean = 2.42 milliseconds
stddev = 0.00 milliseconds
median = 2.42 milliseconds
75% <= 2.42 milliseconds
95% <= 2.42 milliseconds
98% <= 2.42 milliseconds
99% <= 2.42 milliseconds
99.9% <= 2.42 milliseconds
Total execution time (sec): 1
Failures: 1
Successful executions: 0
Expected 1 --number-of-tests objects!, successfully executed 0
{code}