[
https://issues.apache.org/jira/browse/HDDS-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154442#comment-17154442
]
Sammi Chen commented on HDDS-3852:
----------------------------------
I don't have more information, because the downloaded container archive file
is deleted when the import fails. As a first step, I think we should keep the
archive for debugging purposes, so that we can find the root cause quickly.
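
A minimal sketch of the idea, not the actual Ozone code: if the import throws,
the downloaded archive is moved to a separate directory instead of being
deleted, and the caller gets a boolean back so it does not report success (the
log below prints "replicated successfully" right after the import error). The
class KeepFailedArchive, the ContainerImporter interface and the failedDir
parameter are hypothetical names made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration; not part of Ozone. */
final class KeepFailedArchive {

  /** Hypothetical stand-in for the real container import step. */
  interface ContainerImporter {
    void importContainer(long containerId, Path archive) throws IOException;
  }

  private KeepFailedArchive() { }

  /**
   * Runs the import. On success the downloaded archive is deleted and true
   * is returned. On failure the archive is moved into failedDir so it can be
   * inspected later, and false is returned.
   */
  static boolean importOrKeep(long containerId, Path archive, Path failedDir,
      ContainerImporter importer) throws IOException {
    try {
      importer.importContainer(containerId, archive);
    } catch (IOException e) {
      // Keep the evidence: move the archive aside instead of deleting it.
      Files.createDirectories(failedDir);
      Path kept = failedDir.resolve(archive.getFileName());
      Files.move(archive, kept, StandardCopyOption.REPLACE_EXISTING);
      System.err.println("Import of container " + containerId
          + " failed, archive kept at " + kept + ": " + e.getMessage());
      return false;
    }
    // Import succeeded, so the temporary archive is no longer needed.
    Files.deleteIfExists(archive);
    return true;
  }
}
```

A caller would log "replicated successfully" only when importOrKeep returns
true. Kept archives would still need some cleanup policy (for example an age
or count limit) so that repeated failures do not fill the disk.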
> Failed to import replicated container
> -------------------------------------
>
> Key: HDDS-3852
> URL: https://issues.apache.org/jira/browse/HDDS-3852
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Sammi Chen
> Priority: Major
>
> Found several container replication failures in the logs after upgrading the
> Ozone cluster to the June 12th master branch. The tar file is deleted after
> the import fails.
>
> {code}
> 2020-06-23 14:11:19,662 [ContainerReplicationThread-0] INFO
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator:
> Starting replication of container 206 from
> [33b49c34-caa2-4b4f-894e-dce7db4f97b9{ip: 9.180.20.222, host:
> host-9-180-20-222, networkLocation: /rack1, certSerialId: null},
> f8d9ccf6-20c6-4dfa-8a49-012f43a1b27e{ip: 9.179.142.251, host: host251,
> networkLocation: /rack3, certSerialId: null},
> db854037-4846-4093-89de-e492e0f14239{ip: 9.179.142.198, host: host198,
> networkLocation: /rack3, certSerialId: null}]
> 2020-06-23 14:11:20,504 [grpc-default-executor-111] INFO
> org.apache.hadoop.ozone.container.replication.GrpcReplicationClient:
> Container 206 is downloaded to /tmp/container-copy/container-206.tar.gz
> 2020-06-23 14:11:20,505 [ContainerReplicationThread-0] INFO
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator:
> Container 206 is downloaded, starting to import.
> 2020-06-23 14:11:20,616 [ContainerReplicationThread-0] ERROR
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator:
> Can't import the downloaded container data id=206
> java.io.IOException: Container descriptor is missing from the container
> archive.
> at
> org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.unpackContainerDescriptor(TarContainerPacker.java:190)
> at
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.importContainer(DownloadAndImportReplicator.java:74)
> at
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:121)
> at
> org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:129)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2020-06-23 14:11:20,616 [ContainerReplicationThread-0] INFO
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator:
> Container 206 is replicated successfully
> 2020-06-23 14:11:20,616 [ContainerReplicationThread-0] INFO
> org.apache.hadoop.ozone.container.replication.ReplicationSupervisor:
> Container 206 is replicated.
> {code}