[
https://issues.apache.org/jira/browse/HDDS-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
GuoHao updated HDDS-10974:
--------------------------
Description:
Due to:
{code:java}
2024-06-03 13:54:45,740 [ContainerReplicationThread-2] INFO
org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask:
IN_PROGRESS reconstructECContainersCommand: containerID=886692,
replication=rs-10-3-2048k, missingIndexes=[8],
sources={1=24628bc9-5d7b-4310-a21f-9a35e2634fb4(10.175.134.200/10.175.134.200),
2=1ea19c39-5395-4158-bbf9-5ac9f6f057b6(10.175.137.7/10.175.137.7),
3=debf4e51-91b2-4b1b-b0ab-9b06a7939470(10.175.138.93/10.175.138.93),
4=8c0800ad-0026-4fdd-bd6e-6d866e166e49(10.175.137.25/10.175.137.25),
5=1c47e953-1738-43f7-ac16-8facd88be4c6(10.175.137.26/10.175.137.26),
6=2a598049-6f33-4f18-a32a-f9d1f2ad399d(10.175.137.43/10.175.137.43),
7=fe278928-331d-49f0-8525-b4f9a7a2ed41(10.175.134.154/10.175.134.154),
9=c23a4a3c-183a-4baf-ada4-e30800faa907(10.175.134.219/10.175.134.219),
10=6db736df-f4bd-4bd5-a7bd-adf25657487b(10.175.137.8/10.175.137.8),
11=22be4dbf-b278-4537-af37-dd36b2d746bc(10.175.138.74/10.175.138.74),
12=c02658fa-898a-4406-a778-87653c2723c2(10.175.137.27/10.175.137.27),
13=1eee02b5-8e7d-4046-8732-d932e56e0aa3(10.175.138.75/10.175.138.75)},
targets={8=e0ce60b3-75d5-4d00-bcb9-7781ef61e827(10.175.134.135/10.175.134.135)}
2024-06-03 13:54:46,828 [ContainerReplicationThread-2] DEBUG
org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator:
Creating container 886692 on datanode
e0ce60b3-75d5-4d00-bcb9-7781ef61e827(10.175.134.135/10.175.134.135) for index 8
2024-06-03 13:55:45,476 [ContainerReplicationThread-0] WARN
org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator:
Exception while reconstructing the container 872840. Cleaning up all the
recovering containers in the reconstruction process.
java.io.IOException: Chunk write failed at the new target node:
e0ce60b3-75d5-4d00-bcb9-7781ef61e827(10.175.134.135/10.175.134.135). Aborting
the reconstruction process.
at
org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.checkFailures(ECReconstructionCoordinator.java:332)
at
org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECBlockGroup(ECReconstructionCoordinator.java:299)
at
org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECContainerGroup(ECReconstructionCoordinator.java:176)
at
org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask.runTask(ECReconstructionCoordinatorTask.java:68)
at
org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:359)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Unexpected Storage Container Exception:
org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
Requested operation not allowed as ContainerState is UNHEALTHY
at
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.setIoException(BlockOutputStream.java:635)
at
org.apache.hadoop.hdds.scm.storage.ECBlockOutputStream.validateResponse(ECBlockOutputStream.java:323)
at
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$2(BlockOutputStream.java:747)
at
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
... 3 more
Caused by:
org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
Requested operation not allowed as ContainerState is UNHEALTHY
at
org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:707)
at
org.apache.hadoop.hdds.scm.storage.ECBlockOutputStream.validateResponse(ECBlockOutputStream.java:321)
... 7 more{code}
> Exclude containers executing ec reconstruct in the
> StaleRecoveringContainerScrubbingService
> -------------------------------------------------------------------------------------------
>
> Key: HDDS-10974
> URL: https://issues.apache.org/jira/browse/HDDS-10974
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: GuoHao
> Priority: Major
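The issue title proposes excluding containers that are still being EC-reconstructed from the StaleRecoveringContainerScrubbingService. The following is only a minimal sketch of that idea and not Ozone's actual code: the class `RecoveringContainerTracker`, its methods, and the deadline map are all hypothetical names introduced for illustration. It assumes the scrubber selects recovering containers whose deadline has passed, and shows how a set of in-progress container IDs could be consulted before a container is treated as stale.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Ozone: tracks recovering containers and
// which of them are still being written by a reconstruction task.
final class RecoveringContainerTracker {
    // containerID -> time (ms) after which the recovering container counts as expired
    private final Map<Long, Long> deadlines = new ConcurrentHashMap<>();
    // containerIDs with a reconstruction task currently in progress
    private final Set<Long> reconstructing = ConcurrentHashMap.newKeySet();

    void registerRecovering(long containerId, long deadlineMillis) {
        deadlines.put(containerId, deadlineMillis);
    }

    void beginReconstruction(long containerId) {
        reconstructing.add(containerId);
    }

    void endReconstruction(long containerId) {
        reconstructing.remove(containerId);
    }

    // Returns expired containers, skipping any that are still being reconstructed.
    List<Long> selectStale(long nowMillis) {
        List<Long> stale = new ArrayList<>();
        for (Map.Entry<Long, Long> e : deadlines.entrySet()) {
            boolean expired = e.getValue() <= nowMillis;
            if (expired && !reconstructing.contains(e.getKey())) {
                stale.add(e.getKey());
            }
        }
        Collections.sort(stale); // deterministic order for callers and tests
        return stale;
    }
}

final class ScrubberExclusionSketch {
    public static void main(String[] args) {
        RecoveringContainerTracker tracker = new RecoveringContainerTracker();
        tracker.registerRecovering(1L, 1000L);
        tracker.registerRecovering(2L, 1000L);
        tracker.beginReconstruction(1L);

        // Container 1 is expired but still being reconstructed, so it is skipped.
        System.out.println(tracker.selectStale(2000L)); // prints [2]

        tracker.endReconstruction(1L);
        // Once the task has finished, container 1 is eligible again.
        System.out.println(tracker.selectStale(2000L)); // prints [1, 2]
    }
}
```

In this sketch the reconstruction task would call `beginReconstruction` before creating the recovering container on the target and `endReconstruction` in a `finally` block, so a failed task does not leave its container excluded forever.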