sodonnel opened a new pull request, #3649:
URL: https://github.com/apache/ozone/pull/3649

   ## What changes were proposed in this pull request?
   
   When two online replicas share the same index, e.g. due to 
decommission, over-replication or maintenance, and the container is 
under-replicated because another index is missing, an IllegalStateException 
can be thrown when collecting the source indexes:
   
   ```
   2022-08-02 10:54:26,939 WARN 
org.apache.hadoop.hdds.scm.container.replication.ECUnderReplicationHandler: 
Exception while processing for creating the EC reconstruction container 
commands for #2.
   java.lang.IllegalStateException: Duplicate key 3 (attempted merging values 
ContainerReplica{containerID=#2, state=CLOSED, 
datanodeDetails=fb63a3c8-2e5b-432e-be63-274c41aab79f{ip: 172.27.124.131, host: 
quasar-onjdpu-5.quasar-onjdpu.root.hwx.site, ports: [REPLICATION=9886, 
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
persistedOpStateExpiryEpochSec: 0}, 
placeOfBirth=2ff0ed60-a461-41a4-8fff-9da6cb4e52ad, sequenceId=0, keyCount=1, 
bytesUsed=34603008,replicaIndex= 3} and ContainerReplica{containerID=#2, 
state=CLOSED, datanodeDetails=2ff0ed60-a461-41a4-8fff-9da6cb4e52ad{ip: 
172.27.193.4, host: quasar-onjdpu-3.quasar-onjdpu.root.hwx.site, ports: 
[REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, 
STANDALONE=9859], networkLocation: /default, certSerialId: null, 
persistedOpState: DECOMMISSIONING, persistedOpStateExpiryEpochSec: 0}, 
placeOfBirth=2ff0ed60-a461-41a4-8fff-9da6cb4e52ad, sequenceId=0, keyCount=1, 
bytesUsed=34603008,replicaIndex= 3})
           at 
java.base/java.util.stream.Collectors.duplicateKeyException(Collectors.java:133)
           at 
java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180)
           at 
java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
           at 
java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
           at 
java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
           at 
java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
           at 
java.base/java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1603)
           at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
           at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
           at 
java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
           at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
           at 
java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
           at 
org.apache.hadoop.hdds.scm.container.replication.ECUnderReplicationHandler.processAndCreateCommands(ECUnderReplicationHandler.java:151)
           at 
org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processUnderReplicatedContainer(ReplicationManager.java:366)
           at 
org.apache.hadoop.hdds.scm.container.replication.UnderReplicatedProcessor.processContainer(UnderReplicatedProcessor.java:92)
           at 
org.apache.hadoop.hdds.scm.container.replication.UnderReplicatedProcessor.processAll(UnderReplicatedProcessor.java:76)
           at 
org.apache.hadoop.hdds.scm.ha.BackgroundSCMService.run(BackgroundSCMService.java:101)
           at java.base/java.lang.Thread.run(Thread.java:834)
   ```
   
   The exception is not handled, which causes the under-replication processing 
thread to exit.
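   
   The failure mode can be reproduced in isolation. The sketch below is not 
the code in this patch: `DuplicateIndexSketch`, `Replica`, `collectStrict` and 
`collectKeepFirst` are hypothetical names used only for illustration, and 
keeping the first replica seen is an arbitrary assumption. It shows that the 
two-argument `Collectors.toMap` throws `IllegalStateException` on a duplicate 
key, while the three-argument overload with a merge function does not.
   
   ```java
   import java.util.Arrays;
   import java.util.List;
   import java.util.Map;
   import java.util.stream.Collectors;

   public class DuplicateIndexSketch {

     // Hypothetical stand-in for ContainerReplica with only the fields needed here.
     public static final class Replica {
       public final int replicaIndex;
       public final String datanode;

       public Replica(int replicaIndex, String datanode) {
         this.replicaIndex = replicaIndex;
         this.datanode = datanode;
       }
     }

     // Two-argument toMap: throws IllegalStateException if two replicas share an index.
     public static Map<Integer, Replica> collectStrict(List<Replica> replicas) {
       return replicas.stream()
           .collect(Collectors.toMap(r -> r.replicaIndex, r -> r));
     }

     // Three-argument toMap: the merge function resolves duplicates.
     // Keeping the first replica encountered is an assumption made for this sketch.
     public static Map<Integer, Replica> collectKeepFirst(List<Replica> replicas) {
       return replicas.stream()
           .collect(Collectors.toMap(r -> r.replicaIndex, r -> r,
               (first, second) -> first));
     }

     public static void main(String[] args) {
       // Index 3 appears twice, as in the log above (IN_SERVICE and DECOMMISSIONING hosts).
       List<Replica> replicas = Arrays.asList(
           new Replica(1, "dn1"),
           new Replica(3, "dn2"),
           new Replica(3, "dn3"));

       try {
         collectStrict(replicas);
       } catch (IllegalStateException e) {
         System.out.println("strict collection failed: " + e.getMessage());
       }

       Map<Integer, Replica> byIndex = collectKeepFirst(replicas);
       System.out.println("distinct indexes collected: " + byIndex.keySet());
     }
   }
   ```
   
   In a real handler the merge function would probably need to choose between 
the duplicates deliberately, for example by node operational state, rather 
than by encounter order.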
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-7081
   
   ## How was this patch tested?
   
   Added a unit test that reproduces the issue and verifies the fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

