[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606274#comment-16606274 ]
Anu Engineer commented on HDDS-351:
-----------------------------------

[~ajayydv] Thanks for the very patient explanations and for updating this patch. I appreciate it. I have some very minor comments on patch v10.

* nit: ContainerMapping.java:215: The size call returns the cardinality of the set, and the cardinality of a set cannot be negative. So I presume that you want to check {{if (dnWithReplicas.size() == 0)}}, which seems slightly more straightforward than the following fragment.
{code:java}
if (dnWithReplicas.size() < 1) {
{code}
* ContainerReportHandler.java#emitReplicationRequestEvent: We have this change:
{code:java}
if (existingReplicas == 0) {
  LOG.error("Container with containerId {} has no replicas. " +
      "ReplicationManager can't replicate.", containerID);
  return;
}
{code}
We have no replicas, and we return silently without even notifying the ReplicationManager. No one in the system would know about this, and no alerts would be fired. Shouldn't we let this call proceed to the ReplicationManager and have it raise the correct alerts/errors? From just reading this patch, I am not able to understand the reason for this change. Would you be kind enough to explain what issue we are trying to fix with this change? Does some test fail without it?
* ContainerStateManager.java#getContainerReplicas: I get it now; we removed the exception throwing from this function, and that is causing the cascading change, including my comment above. Since we throw an exception as soon as we find the size is zero, why not keep the throw as-is and add a try/catch where it is used, especially in {{emitReplicationRequestEvent}}? This is just a suggestion; I am still trying to understand the trade-offs involved here.
* ContainerStateMap.java#getContainerReplicas: There, too, we have decided to remove the exception when we find an error, and that cascades via {{ContainerStateManager.java#getContainerReplicas}}, which causes us to ignore the error condition in {{emitReplicationRequestEvent}}. Why remove this exception at all? I am a little confused and worried that we might have introduced a subtle bug with this code change.
* nit: HddsConfigkeys.java:Line 75: {{public static final String HDDS_SCM_CHILLMODE_THRESHOLD_PCT = "hdds.scm" + ".chillmode.threshold.pct";}} Generally, we just put the name as a single string on the line below. That avoids the dangerous space issues that can crop up if someone accidentally adds a space to " .chillmode" or similar.
* nit: MiniOzoneClusterImpl.java: Line 330: Since we have defined startHDDSDatanodes(), replace line 331 with that function.
* nit: This is easily changeable, so it does not matter much; I just wanted to point it out to you. For the setting "hdds.scm.chillmode.threshold.pct", the default value is 0.995. That means with 200 containers (5 GB each) or fewer, pretty much all the containers need to be reported in for the cluster to boot up. In the Ozone world, it implies that when we are booting up a 1 TB cluster, all machines and containers must be accounted for. Is this by design? In the case of HDFS, assuming 1 TB with a 128 MB block size, the same threshold allows us to come out of safe mode with about 40 blocks missing. Should we tune this number down a bit, considering the 128 MB block size of HDFS vs. the 5 GB container size of Ozone?
* SCMChillModeManager.java:ContainerChillModeRule: The function crashes with a NullPointerException if the container is null. Check whether the container is null before accessing its fields? It is easy to reproduce by passing a null instead of the container in TestSCMChillModeManager: Line 66.
* SCMChillModeManager.java#ContainerChillModeRule#process: Most probably a no-op comment, but you might want to check whether the EventQueue has a serialization guarantee.
That is, that it will only deliver one event at a time. If there is no such guarantee in the EventQueue class, we have a race condition in the process function:
{code:java}
if (containerMap.containsKey(c.getContainerID())) {
  containerMap.remove(c.getContainerID());
  containerWithMinReplicas.getAndAdd(1);
}
{code}
Here is the issue: thread 1 and thread 2 might both check {{c.getContainerID()}} for the same ID, since there are many datanodes in the system. Both then go on to remove the container; the remove will fail in one of the threads, which is no big deal, but the count that we do after that will get messed up. A straightforward fix is to add a "synchronized" block on the container map, so we have serialization at the point of check and the point of use. That will solve the issue of updating the container map and containerWithMinReplicas from multiple threads.

> Add chill mode state to SCM
> ---------------------------
>
>                 Key: HDDS-351
>                 URL: https://issues.apache.org/jira/browse/HDDS-351
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Ajay Kumar
>            Assignee: Ajay Kumar
>            Priority: Major
>             Fix For: 0.2.1
>
>      Attachments: HDDS-351.00.patch, HDDS-351.01.patch, HDDS-351.02.patch, HDDS-351.03.patch, HDDS-351.04.patch, HDDS-351.05.patch, HDDS-351.06.patch, HDDS-351.07.patch, HDDS-351.08.patch, HDDS-351.09.patch, HDDS-351.10.patch
>
> Add chill mode state to SCM

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
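The "synchronized at the point of check and point of use" fix suggested for the process function can be sketched as follows. This is a minimal standalone sketch, not the actual SCMChillModeManager code: the class, field, and method names here are stand-ins, and the check-then-remove is collapsed into a single {{Map.remove}} whose return value tells us whether this thread won the race.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch (hypothetical names) of guarding the check-then-act on the
 * container map with one lock, so two report-handling threads cannot
 * both count the same container ID.
 */
public class ChillModeRuleSketch {
  private final Map<Long, Object> containerMap = new HashMap<>();
  private final AtomicLong containerWithMinReplicas = new AtomicLong(0);

  public ChillModeRuleSketch(long... containerIds) {
    for (long id : containerIds) {
      containerMap.put(id, new Object());
    }
  }

  /** Called once per container report; safe under concurrent callers. */
  public void process(long containerId) {
    synchronized (containerMap) {
      // remove() returns the previous value, so the check and the
      // removal are a single atomic step under the lock; only the
      // thread that actually removed the entry bumps the counter.
      if (containerMap.remove(containerId) != null) {
        containerWithMinReplicas.getAndAdd(1);
      }
    }
  }

  public long reportedCount() {
    return containerWithMinReplicas.get();
  }

  public static void main(String[] args) throws InterruptedException {
    ChillModeRuleSketch rule = new ChillModeRuleSketch(1L, 2L, 3L);
    // Two threads report the same three containers concurrently,
    // mimicking duplicate reports from many datanodes.
    Runnable reporter = () -> {
      for (long id = 1; id <= 3; id++) {
        rule.process(id);
      }
    };
    Thread t1 = new Thread(reporter);
    Thread t2 = new Thread(reporter);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    // Each container is counted exactly once despite duplicate reports.
    System.out.println(rule.reportedCount()); // prints 3
  }
}
```

An alternative would be a ConcurrentHashMap, where {{remove(key) != null}} alone gives the same once-only guarantee without an explicit lock.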
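To make the threshold arithmetic in the comment concrete, here is a small worked sketch. The helper name and the 1 TB / 5 GB / 128 MB figures are just the example numbers from the comment above, not values read from any config file.

```java
/**
 * Worked example of "hdds.scm.chillmode.threshold.pct" = 0.995:
 * how many containers/blocks must be reported before leaving
 * chill/safe mode, and how many may be missing.
 */
public class ChillModeThresholdMath {
  /** Minimum number of reports needed to cross the threshold. */
  public static long requiredReports(long total, double thresholdPct) {
    return (long) Math.ceil(total * thresholdPct);
  }

  public static void main(String[] args) {
    // Ozone example: 1 TB cluster with 5 GB containers -> ~200 containers.
    long containers = 200;
    // 0.995 * 200 rounds up to 199, i.e. at most 1 container may be missing.
    System.out.println(requiredReports(containers, 0.995)); // prints 199

    // HDFS example: 1 TB with 128 MB blocks -> 8192 blocks.
    long blocks = 1024L * 1024 / 128; // 8192
    long required = requiredReports(blocks, 0.995);
    // Blocks that may still be missing when safe mode exits.
    System.out.println(blocks - required); // prints 40
  }
}
```

So the same 0.995 tolerates ~40 missing 128 MB blocks in the HDFS case, but effectively zero missing 5 GB containers in the small Ozone cluster, which is the asymmetry the comment questions.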
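The "keep the throw as-is and add a try/catch where it is used" suggestion for getContainerReplicas could look roughly like the sketch below. All types here (ReplicaLookupSketch, NoReplicasException, the String-based return values) are stand-ins invented for illustration, not actual HDDS classes or signatures.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of keeping the exception and catching it at the call site. */
public class ReplicaLookupSketch {
  /** Stand-in for the exception the real code used to throw. */
  static class NoReplicasException extends Exception {
    NoReplicasException(long id) {
      super("Container " + id + " has no replicas");
    }
  }

  private final Map<Long, Set<String>> replicas = new HashMap<>();

  void addReplica(long containerId, String datanode) {
    replicas.computeIfAbsent(containerId, k -> new HashSet<>()).add(datanode);
  }

  /** Keeps throwing when nothing is found, as in the pre-patch behavior. */
  Set<String> getContainerReplicas(long containerId) throws NoReplicasException {
    Set<String> dns = replicas.get(containerId);
    if (dns == null || dns.isEmpty()) {
      throw new NoReplicasException(containerId);
    }
    return Collections.unmodifiableSet(dns);
  }

  /** Call site catches and surfaces the error instead of returning silently. */
  String emitReplicationRequestEvent(long containerId) {
    try {
      return "replicate-to:" + getContainerReplicas(containerId);
    } catch (NoReplicasException e) {
      // Here the real code could raise an alert or still notify the
      // ReplicationManager, instead of a silent early return.
      return "alert:" + e.getMessage();
    }
  }
}
```

The point of the pattern is that the error condition stays visible: the caller must make an explicit decision (alert, log, notify) rather than the lookup quietly returning an empty result.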