[
https://issues.apache.org/jira/browse/HDDS-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147984#comment-17147984
]
Marton Elek commented on HDDS-3852:
-----------------------------------
We discussed it during the Community Meeting. The problem seems hard to
reproduce, so we moved it out of 0.7.0. Feel free to move it back if you
think it's important to fix (especially since you, as the release manager,
have the final decision).
Personally, I think we need more tests with long-running Ozone clusters. The
upgrade tests introduced by Attila might also help.
If you have any more logs or other information, please share them so we can
investigate.
> Failed to import replicated container
> -------------------------------------
>
> Key: HDDS-3852
> URL: https://issues.apache.org/jira/browse/HDDS-3852
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Sammi Chen
> Priority: Major
>
> Found several container replication failures in the log after upgrading the
> Ozone cluster to the June 12th master branch. The tar file is deleted after
> the import failure.
>
> {code}
> 2020-06-23 14:11:19,662 [ContainerReplicationThread-0] INFO
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator:
> Starting replication of container 206 from
> [33b49c34-caa2-4b4f-894e-dce7db4f97b9{ip: 9.180.20.222, host:
> host-9-180-20-222, networkLocation: /rack1, certSerialId: null},
> f8d9ccf6-20c6-4dfa-8a49-012f43a1b27e{ip: 9.179.142.251, host: host251,
> networkLocation: /rack3, certSerialId: null},
> db854037-4846-4093-89de-e492e0f14239{ip: 9.179.142.198, host: host198,
> networkLocation: /rack3, certSerialId: null}]
> 2020-06-23 14:11:20,504 [grpc-default-executor-111] INFO
> org.apache.hadoop.ozone.container.replication.GrpcReplicationClient:
> Container 206 is downloaded to /tmp/container-copy/container-206.tar.gz
> 2020-06-23 14:11:20,505 [ContainerReplicationThread-0] INFO
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator:
> Container 206 is downloaded, starting to import.
> 2020-06-23 14:11:20,616 [ContainerReplicationThread-0] ERROR
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator:
> Can't import the downloaded container data id=206
> java.io.IOException: Container descriptor is missing from the container
> archive.
> at
> org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.unpackContainerDescriptor(TarContainerPacker.java:190)
> at
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.importContainer(DownloadAndImportReplicator.java:74)
> at
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:121)
> at
> org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:129)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2020-06-23 14:11:20,616 [ContainerReplicationThread-0] INFO
> org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator:
> Container 206 is replicated successfully
> 2020-06-23 14:11:20,616 [ContainerReplicationThread-0] INFO
> org.apache.hadoop.ozone.container.replication.ReplicationSupervisor:
> Container 206 is replicated.
> {code}