symious commented on a change in pull request #2172:
URL: https://github.com/apache/ozone/pull/2172#discussion_r619836369
##########
File path:
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java
##########
@@ -122,26 +122,87 @@ public void checkAndAddNewContainer(ContainerID containerID,
scmClient.getContainerWithPipeline(containerID.getId());
LOG.debug("Verified new container from SCM {}, {} ",
containerID, containerWithPipeline.getPipeline().getId());
- // If no other client added this, go ahead and add this container.
- if (!containerExist(containerID)) {
- addNewContainer(containerID.getId(), containerWithPipeline);
- }
+ // no need call "containerExist" to check, because
+ // 1 containerExist and addNewContainer can not be atomic
+ // 2 addNewContainer will double check the existence
Review comment:
Isn't the original check a form of optimistic locking here?
##########
File path:
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java
##########
@@ -122,26 +122,87 @@ public void checkAndAddNewContainer(ContainerID containerID,
scmClient.getContainerWithPipeline(containerID.getId());
LOG.debug("Verified new container from SCM {}, {} ",
containerID, containerWithPipeline.getPipeline().getId());
- // If no other client added this, go ahead and add this container.
- if (!containerExist(containerID)) {
- addNewContainer(containerID.getId(), containerWithPipeline);
- }
+ // no need call "containerExist" to check, because
+ // 1 containerExist and addNewContainer can not be atomic
+ // 2 addNewContainer will double check the existence
+ addNewContainer(containerWithPipeline);
} else {
- // Check if container state is not open. In SCM, container state
- // changes to CLOSING first, and then the close command is pushed down
- // to Datanodes. Recon 'learns' this from DN, and hence replica state
- // will move container state to 'CLOSING'.
- ContainerInfo containerInfo = getContainer(containerID);
- if (containerInfo.getState().equals(HddsProtos.LifeCycleState.OPEN)
- && !replicaState.equals(ContainerReplicaProto.State.OPEN)
- && isHealthy(replicaState)) {
- LOG.info("Container {} has state OPEN, but Replica has State {}.",
- containerID, replicaState);
- try {
- updateContainerState(containerID, FINALIZE);
- } catch (InvalidStateTransitionException e) {
- throw new IOException(e);
- }
+ checkContainerStateAndUpdate(containerID, replicaState);
+ }
+ }
+
+ /**
+ * Check and add new containers in batch if not already present in Recon.
+ *
+ * @param containerReplicaProtoList list of containerReplicaProtos.
+ * @throws IOException on Error.
+ */
+ public void checkAndAddNewContainerBatch(
+ List<ContainerReplicaProto> containerReplicaProtoList)
+ throws IOException {
+ Map<Boolean, List<ContainerReplicaProto>> containers =
+ containerReplicaProtoList.parallelStream()
Review comment:
I think this code can be simpler; the lambda-based grouping seems more verbose than it needs to be here.
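
One possible simplification, as a minimal sketch: `Collectors.partitioningBy` always returns a map holding both the `true` and `false` keys, so the `containsKey` checks and the null lists are no longer needed. The `PartitionSketch` class, the `known` set, and the `split` helper are hypothetical stand-ins for `containerExist` and the Recon types, not code from this PR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class PartitionSketch {
  // Splits reported IDs into already-known (true) and unknown (false).
  // "known" is a hypothetical stand-in for containerExist(...).
  static Map<Boolean, List<Long>> split(List<Long> reported, Set<Long> known) {
    return reported.stream()
        .collect(Collectors.partitioningBy(known::contains));
  }

  public static void main(String[] args) {
    Set<Long> known = new HashSet<>(Arrays.asList(2L));
    Map<Boolean, List<Long>> parts = split(Arrays.asList(1L, 2L, 3L), known);
    System.out.println("existing=" + parts.get(true)
        + " missing=" + parts.get(false));
  }
}
```

Both lists are always present (possibly empty), so the callers can iterate over them directly.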
##########
File path:
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java
##########
@@ -122,26 +122,87 @@ public void checkAndAddNewContainer(ContainerID containerID,
scmClient.getContainerWithPipeline(containerID.getId());
LOG.debug("Verified new container from SCM {}, {} ",
containerID, containerWithPipeline.getPipeline().getId());
- // If no other client added this, go ahead and add this container.
- if (!containerExist(containerID)) {
- addNewContainer(containerID.getId(), containerWithPipeline);
- }
+ // no need call "containerExist" to check, because
+ // 1 containerExist and addNewContainer can not be atomic
+ // 2 addNewContainer will double check the existence
+ addNewContainer(containerWithPipeline);
} else {
- // Check if container state is not open. In SCM, container state
- // changes to CLOSING first, and then the close command is pushed down
- // to Datanodes. Recon 'learns' this from DN, and hence replica state
- // will move container state to 'CLOSING'.
- ContainerInfo containerInfo = getContainer(containerID);
- if (containerInfo.getState().equals(HddsProtos.LifeCycleState.OPEN)
- && !replicaState.equals(ContainerReplicaProto.State.OPEN)
- && isHealthy(replicaState)) {
- LOG.info("Container {} has state OPEN, but Replica has State {}.",
- containerID, replicaState);
- try {
- updateContainerState(containerID, FINALIZE);
- } catch (InvalidStateTransitionException e) {
- throw new IOException(e);
- }
+ checkContainerStateAndUpdate(containerID, replicaState);
+ }
+ }
+
+ /**
+ * Check and add new containers in batch if not already present in Recon.
+ *
+ * @param containerReplicaProtoList list of containerReplicaProtos.
+ * @throws IOException on Error.
+ */
+ public void checkAndAddNewContainerBatch(
+ List<ContainerReplicaProto> containerReplicaProtoList)
+ throws IOException {
+ Map<Boolean, List<ContainerReplicaProto>> containers =
+ containerReplicaProtoList.parallelStream()
+ .collect(Collectors.groupingBy(c ->
+ containerExist(ContainerID.valueOf(c.getContainerID()))));
+
+ List<ContainerReplicaProto> existContainers = null;
+ if (containers.containsKey(true)) {
+ existContainers = containers.get(true);
+ }
+ List<Long> noExistContainers = null;
+ if (containers.containsKey(false)){
+ noExistContainers = containers.get(false).parallelStream().
+ map(ContainerReplicaProto::getContainerID)
+ .collect(Collectors.toList());
+ }
+
+ //for now , if any one container in noExistContainers is not found by SCM,
+ //an IOException will be throw and the whole noExistContainers will be drop.
+ //in some cases,this may slow the process for recon to learn new container,
+ //but it does not matter, just make it simple for the present
+ if (null != noExistContainers) {
+ List<ContainerWithPipeline> verifiedContainerPipeline =
+ scmClient.getContainerWithPipelineBatch(noExistContainers);
Review comment:
Can we fall back to the original per-container procedure here if the issue
described above occurs?
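
A rough sketch of what such a fallback could look like: try the batch lookup first and, only if it throws, retry each ID on its own so that a single unknown container does not discard the rest. The `Lookup` interface and its `getBatch`/`getOne` methods are hypothetical stand-ins for the SCM client calls, and `String` stands in for `ContainerWithPipeline`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class BatchFallbackSketch {
  /** Hypothetical stand-in for the SCM client; not the real API. */
  interface Lookup {
    List<String> getBatch(List<Long> ids) throws IOException;

    String getOne(long id) throws IOException;
  }

  // Tries the batch call first; if it fails, retries each ID individually
  // and keeps whatever could be resolved.
  static List<String> fetch(Lookup scm, List<Long> ids) {
    try {
      return scm.getBatch(ids);
    } catch (IOException batchFailure) {
      List<String> found = new ArrayList<>();
      for (long id : ids) {
        try {
          found.add(scm.getOne(id));
        } catch (IOException missing) {
          // Skip only the container that could not be resolved.
        }
      }
      return found;
    }
  }
}
```

The per-container path costs one call per ID, so it only makes sense as the exceptional path.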
##########
File path:
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerReportHandler.java
##########
@@ -48,26 +46,16 @@ public ReconContainerReportHandler(NodeManager nodeManager,
@Override
public void onMessage(final ContainerReportFromDatanode reportFromDatanode,
final EventPublisher publisher) {
-
- final ContainerReportsProto containerReport =
- reportFromDatanode.getReport();
ReconContainerManager containerManager =
(ReconContainerManager) getContainerManager();
-
- List<ContainerReplicaProto> reportsList = containerReport.getReportsList();
- for (ContainerReplicaProto containerReplicaProto : reportsList) {
- final ContainerID id = ContainerID.valueOf(
- containerReplicaProto.getContainerID());
- try {
- containerManager.checkAndAddNewContainer(id,
- containerReplicaProto.getState(),
- reportFromDatanode.getDatanodeDetails());
- } catch (IOException ioEx) {
- LOG.error("Exception while checking and adding new container.", ioEx);
- }
- LOG.debug("Got container report for containerID {} ",
- containerReplicaProto.getContainerID());
+ List<ContainerReplicaProto> containerReplicaProtoList =
+ reportFromDatanode.getReport().getReportsList();
+ try {
+ containerManager.checkAndAddNewContainerBatch(containerReplicaProtoList);
Review comment:
Maybe we can set a batch size limit here to avoid a performance impact
on SCM?
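
For illustration, one way to cap the batch size is to split the ID list into fixed-size chunks and issue one batch call per chunk. The `chunk` helper and the `maxBatchSize` parameter below are hypothetical names; a suitable limit would need to be chosen (or made configurable) separately.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchLimitSketch {
  // Splits items into consecutive chunks of at most maxBatchSize elements.
  static <T> List<List<T>> chunk(List<T> items, int maxBatchSize) {
    if (maxBatchSize <= 0) {
      throw new IllegalArgumentException("maxBatchSize must be positive");
    }
    List<List<T>> chunks = new ArrayList<>();
    for (int start = 0; start < items.size(); start += maxBatchSize) {
      int end = Math.min(start + maxBatchSize, items.size());
      // Copy the view so each chunk is independent of the source list.
      chunks.add(new ArrayList<>(items.subList(start, end)));
    }
    return chunks;
  }
}
```

Each chunk would then be passed to the batch lookup in turn, which bounds the work SCM does per request.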
##########
File path:
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java
##########
@@ -122,26 +122,87 @@ public void checkAndAddNewContainer(ContainerID containerID,
scmClient.getContainerWithPipeline(containerID.getId());
LOG.debug("Verified new container from SCM {}, {} ",
containerID, containerWithPipeline.getPipeline().getId());
- // If no other client added this, go ahead and add this container.
- if (!containerExist(containerID)) {
- addNewContainer(containerID.getId(), containerWithPipeline);
- }
+ // no need call "containerExist" to check, because
+ // 1 containerExist and addNewContainer can not be atomic
+ // 2 addNewContainer will double check the existence
Review comment:
`addNewContainer` still has to go through several steps before it reaches
`ContainerStateMap$contains`, so keeping the contains check here would give
better performance.
Besides, `ContainerStateManagerImpl` has to acquire the writeLock to check
whether the container exists, and it is not reasonable to involve that lock here.
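
To illustrate the point, here is a minimal sketch of a cheap, lock-free existence check in front of a lock-taking add that re-checks under the lock. `FastPathSketch`, `addIfAbsent`, and the `lockedAdds` counter are hypothetical and only model the pattern; they are not the Recon or SCM classes.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class FastPathSketch {
  private final Set<Long> index = ConcurrentHashMap.newKeySet();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Counts how often the locked path ran (for illustration only).
  private int lockedAdds = 0;

  boolean addIfAbsent(long id) {
    // Fast path: skip the write lock when the container is already known.
    if (index.contains(id)) {
      return false;
    }
    lock.writeLock().lock();
    try {
      lockedAdds++;
      // Authoritative re-check: add returns false if another thread won.
      return index.add(id);
    } finally {
      lock.writeLock().unlock();
    }
  }

  int lockedAdds() {
    return lockedAdds;
  }
}
```

The first check is not atomic with the add, but it does not need to be: it only filters out the common case, and the locked path remains the source of truth.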