[ https://issues.apache.org/jira/browse/HDDS-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Attila Doroszlai resolved HDDS-10652.
-------------------------------------
Fix Version/s: 1.5.0
Resolution: Fixed

> [Upgrade][EC] Reconstruction failing with "java.io.IOException: None of the
> block data have checksum"
> -----------------------------------------------------------------------------------------------------
>
> Key: HDDS-10652
> URL: https://issues.apache.org/jira/browse/HDDS-10652
> Project: Apache Ozone
> Issue Type: Bug
> Components: EC, ECOfflineRecovery
> Reporter: Pratyush Bhatt
> Assignee: Siddhant Sangwan
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.5.0
>
>
> {color:#172b4d}*Upgrade versions:*
> Pre-upgrade hash:
> [https://github.com/apache/ozone/commit/6ee6c357678676661ebb3181a56622c79b487bc1]
> Post-upgrade hash:
> [https://github.com/apache/ozone/commit/46b6f3def1d84ca769affb4d3f0d84dece6e8567]
> {color}{color:#172b4d}*Scenario:*
> Write an EC file (5 GB) with the RS-3-2-1024K policy (in this case) before the
> upgrade. After the upgrade, shut down either 2 parity nodes (this case) or
> 2 data nodes, as the policy tolerates 2 DN failures. Check whether
> reconstruction happens after some time.
> *Observed Behavior:*
> 1. Data was successfully written pre-upgrade using Freon.
> File name:
> _o3://ozone1711558189/ec-construct-vol/ec-construct-buck/ec-construction/0_
> 2.
Post-upgrade, stop two of the DNs, in this case the parity nodes
> obtained from one of the containers that was storing the above file's
> data.{color}
> {code:java}
> ozone admin container info 1004 --json
> 2024-03-27 21:35:15,065|INFO|MainThread|machine.py:232 -
> run()||GUID=183f2d10-e3a7-407f-adb5-b87f3e3af53b|Exit Code: 0
> 2024-03-27 21:35:15,098|INFO|MainThread|ozone.py:723 -
> find_ec_data_parity_hosts()|parity hosts: ['DN-4', 'DN-3']
> 2024-03-27 21:35:15,098|INFO|MainThread|ozone.py:724 -
> find_ec_data_parity_hosts()|data hosts: ['DN-8', 'DN-5', 'DN-1'] {code}
> {code:java}
> 2024-03-27 21:35:15,311|INFO|MainThread|cm_apilib.py:1214 -
> stopComponent()|Initiating stop of OZONE_DATANODE at host DN-4
> 2024-03-27 21:35:15,349|INFO|MainThread|cm_apilib.py:1218 -
> stopComponent()|Command name = Stop , ID = 2860
> 2024-03-27 21:35:15,580|INFO|MainThread|cm_apilib.py:1214 -
> stopComponent()|Initiating stop of OZONE_DATANODE at host DN-3
> 2024-03-27 21:35:15,609|INFO|MainThread|cm_apilib.py:1218 -
> stopComponent()|Command name = Stop , ID = 2862 {code}
> {color:#172b4d}Nodes DN-3 and DN-4 are stopped.
> 3. Read the file's data (online reconstruction) and compute its checksum.
> The checksum matched.
> 4.
Wait for reconstruction to happen. The test waited for 20 minutes, but
> only 3 DNs were present even after 20 minutes:{color}
> {code:java}
> ['DN-5', 'DN-1', 'DN-8']{code}
> In fact, even after 10 hours (at the time of writing), there are still only
> 3 DNs:
> {code:java}
> date
> Thu Mar 28 08:39:16 UTC 2024
> ozone admin container info 1004 --json
> {
> "containerInfo" : {
> "state" : "CLOSED",
> "stateEnterTime" : "2024-03-27T18:43:51.934Z",
> "replicationConfig" : {
> "data" : 3,
> "parity" : 2,
> "ecChunkSize" : 1048576,
> "codec" : "RS",
> "requiredNodes" : 5,
> "replicationType" : "EC"
> },
> "usedBytes" : 1342177280,
> "numberOfKeys" : 5,
> "lastUsed" : "2024-03-28T08:39:24.535189Z",
> "owner" : "om1",
> "containerID" : 1004,
> "deleteTransactionId" : 0,
> "sequenceId" : 0,
> "deleted" : false,
> "open" : false
> },
> "pipeline" : {
> "id" : {
> "id" : "73532c14-40ac-4924-9353-2f18ab0d63f2"
> },
> "replicationConfig" : {
> "data" : 3,
> "parity" : 2,
> "ecChunkSize" : 1048576,
> "codec" : "RS",
> "requiredNodes" : 5,
> "replicationType" : "EC"
> },
> "nodesInOrder" : [ {
> "level" : 0,
> "cost" : 0,
> "uuid" : "6179347f-5824-41d4-b722-f1dbc5f14880",
> "uuidString" : "6179347f-5824-41d4-b722-f1dbc5f14880",
> "ipAddress" : "10.140.37.12",
> "hostName" : "DN-5",
> "ports" : [ {
> "name" : "HTTPS",
> "value" : 9883
> }, {
> "name" : "CLIENT_RPC",
> "value" : 9864
> }, {
> "name" : "REPLICATION",
> "value" : 9886
> }, {
> "name" : "RATIS",
> "value" : 9858
> }, {
> "name" : "RATIS_ADMIN",
> "value" : 9857
> }, {
> "name" : "RATIS_SERVER",
> "value" : 9856
> }, {
> "name" : "STANDALONE",
> "value" : 9859
> } ],
> "setupTime" : 0,
> "persistedOpState" : "IN_SERVICE",
> "persistedOpStateExpiryEpochSec" : 0,
> "initialVersion" : 0,
> "currentVersion" : 1,
> "decommissioned" : false,
> "maintenance" : false,
> "signature" : -662262523,
> "networkLocation" : "/default",
> "networkName" : "6179347f-5824-41d4-b722-f1dbc5f14880",
> "networkFullPath" :
"/default/6179347f-5824-41d4-b722-f1dbc5f14880", > "numOfLeaves" : 1 > }, { > "level" : 0, > "cost" : 0, > "uuid" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "uuidString" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "ipAddress" : "10.140.40.9", > "hostName" : "DN-1", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : -1387859873, > "networkLocation" : "/default", > "networkName" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "networkFullPath" : "/default/d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "numOfLeaves" : 1 > }, { > "level" : 0, > "cost" : 0, > "uuid" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "uuidString" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "ipAddress" : "10.140.137.128", > "hostName" : "DN-8", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : 1098159392, > "networkLocation" : "/default", > "networkName" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "networkFullPath" : "/default/ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "numOfLeaves" : 1 > } ], > 
"creationTimestamp" : "2024-03-28T08:39:24.480Z", > "stateEnterTime" : "2024-03-28T08:39:24.545517Z", > "leaderNode" : { > "level" : 0, > "cost" : 0, > "uuid" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "uuidString" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "ipAddress" : "10.140.37.12", > "hostName" : "DN-5", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : -662262523, > "networkLocation" : "/default", > "networkName" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "networkFullPath" : "/default/6179347f-5824-41d4-b722-f1dbc5f14880", > "numOfLeaves" : 1 > }, > "firstNode" : { > "level" : 0, > "cost" : 0, > "uuid" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "uuidString" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "ipAddress" : "10.140.37.12", > "hostName" : "DN-5", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : -662262523, > "networkLocation" : "/default", > "networkName" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "networkFullPath" : 
"/default/6179347f-5824-41d4-b722-f1dbc5f14880", > "numOfLeaves" : 1 > }, > "closestNode" : { > "level" : 0, > "cost" : 0, > "uuid" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "uuidString" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "ipAddress" : "10.140.37.12", > "hostName" : "DN-5", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : -662262523, > "networkLocation" : "/default", > "networkName" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "networkFullPath" : "/default/6179347f-5824-41d4-b722-f1dbc5f14880", > "numOfLeaves" : 1 > }, > "allocationTimeout" : false, > "healthy" : true, > "pipelineState" : "ALLOCATED", > "nodes" : [ { > "level" : 0, > "cost" : 0, > "uuid" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "uuidString" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "ipAddress" : "10.140.37.12", > "hostName" : "DN-5", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : -662262523, > "networkLocation" : "/default", > "networkName" : 
"6179347f-5824-41d4-b722-f1dbc5f14880", > "networkFullPath" : "/default/6179347f-5824-41d4-b722-f1dbc5f14880", > "numOfLeaves" : 1 > }, { > "level" : 0, > "cost" : 0, > "uuid" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "uuidString" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "ipAddress" : "10.140.40.9", > "hostName" : "DN-1", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : -1387859873, > "networkLocation" : "/default", > "networkName" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "networkFullPath" : "/default/d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "numOfLeaves" : 1 > }, { > "level" : 0, > "cost" : 0, > "uuid" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "uuidString" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "ipAddress" : "10.140.137.128", > "hostName" : "DN-8", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : 1098159392, > "networkLocation" : "/default", > "networkName" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "networkFullPath" : 
"/default/ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "numOfLeaves" : 1 > } ], > "empty" : false, > "type" : "EC" > }, > "replicas" : [ { > "containerID" : 1004, > "state" : "CLOSED", > "datanodeDetails" : { > "level" : 0, > "cost" : 0, > "uuid" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "uuidString" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "ipAddress" : "10.140.37.12", > "hostName" : "DN-5z", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : -662262523, > "networkLocation" : "/default", > "networkName" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "networkFullPath" : "/default/6179347f-5824-41d4-b722-f1dbc5f14880", > "numOfLeaves" : 1 > }, > "placeOfBirth" : "6179347f-5824-41d4-b722-f1dbc5f14880", > "sequenceId" : 0, > "keyCount" : 5, > "bytesUsed" : 1342177280, > "replicaIndex" : 2 > }, { > "containerID" : 1004, > "state" : "CLOSED", > "datanodeDetails" : { > "level" : 0, > "cost" : 0, > "uuid" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "uuidString" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "ipAddress" : "10.140.40.9", > "hostName" : "DN-1", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > 
"persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : -1387859873, > "networkLocation" : "/default", > "networkName" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "networkFullPath" : "/default/d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "numOfLeaves" : 1 > }, > "placeOfBirth" : "d8afb52b-5f4c-4d94-9286-7c3cfd6c315c", > "sequenceId" : 0, > "keyCount" : 5, > "bytesUsed" : 1342177280, > "replicaIndex" : 3 > }, { > "containerID" : 1004, > "state" : "CLOSED", > "datanodeDetails" : { > "level" : 0, > "cost" : 0, > "uuid" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "uuidString" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "ipAddress" : "10.140.137.128", > "hostName" : "DN-8", > "ports" : [ { > "name" : "HTTPS", > "value" : 9883 > }, { > "name" : "CLIENT_RPC", > "value" : 9864 > }, { > "name" : "REPLICATION", > "value" : 9886 > }, { > "name" : "RATIS", > "value" : 9858 > }, { > "name" : "RATIS_ADMIN", > "value" : 9857 > }, { > "name" : "RATIS_SERVER", > "value" : 9856 > }, { > "name" : "STANDALONE", > "value" : 9859 > } ], > "setupTime" : 0, > "persistedOpState" : "IN_SERVICE", > "persistedOpStateExpiryEpochSec" : 0, > "initialVersion" : 0, > "currentVersion" : 1, > "decommissioned" : false, > "maintenance" : false, > "signature" : 1098159392, > "networkLocation" : "/default", > "networkName" : "ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "networkFullPath" : "/default/ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e", > "numOfLeaves" : 1 > }, > "placeOfBirth" : "711656cf-a99e-4b2c-8c35-f015ee94889c", > "sequenceId" : 0, > "keyCount" : 5, > "bytesUsed" : 1342177280, > "replicaIndex" : 1 > } ] > } {code} > Checked the SCM Logs, it is still sending reconstructECContainersCommand, > {code:java} > 2024-03-28 08:36:56,748 INFO [Under Replicated > Processor]-org.apache.hadoop.hdds.scm.container.replication.ReplicationManager: > Sending command [reconstructECContainersCommand: containerID: 1004, > 
replicationConfig: EC{rs-3-2-1024k}, sources:
> [ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e(DN-8/10.140.137.128) replicaIndex: 1,
> 6179347f-5824-41d4-b722-f1dbc5f14880(DN-5/10.140.37.12) replicaIndex: 2,
> d8afb52b-5f4c-4d94-9286-7c3cfd6c315c(DN-1/10.140.40.9) replicaIndex: 3],
> targets: [572ed33d-a834-4d80-be35-7b1b19c8bd74(DN-7/10.140.234.130),
> 711656cf-a99e-4b2c-8c35-f015ee94889c(DN-2/10.140.45.129)], missingIndexes:
> [4, 5]] for container ContainerInfo{id=#1004, state=CLOSED,
> stateEnterTime=2024-03-27T18:43:51.934Z,
> pipelineID=PipelineID=53f5587f-9e6c-465d-a0cb-b82d10c227d3, owner=om1} to
> 572ed33d-a834-4d80-be35-7b1b19c8bd74(DN-7/10.140.234.130) with datanode
> deadline 1711615886747 and scm deadline 1711615916747 {code}
> Checked one of the target DNs, DN-7; it is throwing the warnings below.
> {code:java}
> 2024-03-28 08:37:14,982 WARN
> [ContainerReplicationThread-5]-org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask:
> FAILED reconstructECContainersCommand: containerID=1004,
> replication=rs-3-2-1024k, missingIndexes=[4, 5],
> sources={1=ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e(DN-8/10.140.137.128),
> 2=6179347f-5824-41d4-b722-f1dbc5f14880(DN-5/10.140.37.12),
> 3=d8afb52b-5f4c-4d94-9286-7c3cfd6c315c(DN-1/10.140.40.9)},
> targets={4=572ed33d-a834-4d80-be35-7b1b19c8bd74(DN-7/10.140.234.130),
> 5=711656cf-a99e-4b2c-8c35-f015ee94889c(DN-2/10.140.45.129)} after 10639 ms
> java.io.IOException: None of the block data have checksum which means
> 2(parity)+1 blocks are not present
> at
> org.apache.hadoop.hdds.scm.storage.ECBlockOutputStream.executePutBlock(ECBlockOutputStream.java:156)
> at
> org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECBlockGroup(ECReconstructionCoordinator.java:325)
> at
> org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECContainerGroup(ECReconstructionCoordinator.java:171)
> at
org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask.runTask(ECReconstructionCoordinatorTask.java:68)
> at
> org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:359)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> 2024-03-28 08:37:14,982 WARN
> [ContainerReplicationThread-5]-org.apache.hadoop.ozone.container.replication.ReplicationSupervisor:
> Failed FAILED reconstructECContainersCommand: containerID=1004,
> replication=rs-3-2-1024k, missingIndexes=[4, 5],
> sources={1=ef7ae3e9-5ec3-49d6-9b93-1c687009bc1e(DN-8/10.140.137.128),
> 2=6179347f-5824-41d4-b722-f1dbc5f14880(DN-5/10.140.37.12),
> 3=d8afb52b-5f4c-4d94-9286-7c3cfd6c315c(DN-1/10.140.40.9)},
> targets={4=572ed33d-a834-4d80-be35-7b1b19c8bd74(DN-7/10.140.234.130),
> 5=711656cf-a99e-4b2c-8c35-f015ee94889c(DN-2/10.140.45.129)} {code}
> *Expected Behavior:* Reconstruction should have happened.
> Note: This is fairly reproducible every time.
> cc: [~siddhant]
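
As a cross-check on the SCM log in the report (missingIndexes: [4, 5]), the missing replica indexes can be derived from the `ozone admin container info 1004 --json` output. The following is a minimal Python sketch and not part of Ozone or the test framework: the helper name `missing_replica_indexes` is hypothetical, and it assumes only the `containerInfo.replicationConfig.data`, `containerInfo.replicationConfig.parity`, and `replicas[].replicaIndex` fields that appear in the output quoted in the report. The sample input is trimmed from that output.

```python
import json


def missing_replica_indexes(container_info: dict) -> list:
    """Return the 1-based EC replica indexes that have no reported replica.

    Hypothetical helper: relies only on fields visible in the JSON output
    quoted in this report.
    """
    cfg = container_info["containerInfo"]["replicationConfig"]
    required = cfg["data"] + cfg["parity"]  # 3 + 2 = 5 for rs-3-2-1024k
    present = {r["replicaIndex"] for r in container_info.get("replicas", [])}
    return [i for i in range(1, required + 1) if i not in present]


# Trimmed from the `ozone admin container info 1004 --json` output in the report.
sample = json.loads("""
{
  "containerInfo": {
    "containerID": 1004,
    "replicationConfig": {"data": 3, "parity": 2, "codec": "RS", "replicationType": "EC"}
  },
  "replicas": [
    {"containerID": 1004, "state": "CLOSED", "replicaIndex": 2},
    {"containerID": 1004, "state": "CLOSED", "replicaIndex": 3},
    {"containerID": 1004, "state": "CLOSED", "replicaIndex": 1}
  ]
}
""")

print(missing_replica_indexes(sample))  # [4, 5]
```

For container 1004 this yields indexes 4 and 5, the two parity replicas that were on the stopped nodes, which matches the missingIndexes in the reconstructECContainersCommand sent by SCM.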