Murtadha Hubail has posted comments on this change. Change subject: Asterix NCs Fault Tolerance ......................................................................
Patch Set 3: (47 comments) https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-common/src/main/java/org/apache/asterix/common/config/AsterixReplicationProperties.java File asterix-common/src/main/java/org/apache/asterix/common/config/AsterixReplicationProperties.java: Line 35: //we could centralized then a single configuration file. > I think that it would be better to put this into a JIRA issue (ideally with Removed from here and will open a JIRA about it. https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-common/src/main/java/org/apache/asterix/common/transactions/IRecoveryManager.java File asterix-common/src/main/java/org/apache/asterix/common/transactions/IRecoveryManager.java: Line 116: * Reply the logs that belong to the passed {@code partitions} starting from the {@code lowWaterMarkLSN} > s/Reply/Replay/ Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/clusterts/cluster_with_replication.xml File asterix-installer/src/test/resources/clusterts/cluster_with_replication.xml: Line 19: <cluster xmlns="cluster"> > Fix indentation? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/bulkload/bulkload.1.ddl.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/bulkload/bulkload.1.ddl.aql: Line 22: * start 2 nodes, bulkload a dataset, query data, kill one node and wait until the failover complete, query the data again. > line break? Done Line 28: create dataverse TinySocial; > Indentation? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/bulkload/bulkload.2.update.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/bulkload/bulkload.2.update.aql: Line 22: * start 2 nodes, bulkload a dataset, query data, kill one node and wait until the failover complete, query the data again. > line break? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/bulkload/bulkload.3.txnqbc.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/bulkload/bulkload.3.txnqbc.aql: Line 22: * start 2 nodes, bulkload a dataset, query data, kill one node and wait until the failover complete, query the data again. > line break? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/bulkload/bulkload.6.txnqar.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/bulkload/bulkload.6.txnqar.aql: Line 22: * start 2 nodes, bulkload a dataset, query data, kill one node and wait until the failover complete, query the data again. > line break? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/mem_component_recovery/mem_component_recovery.1.ddl.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/mem_component_recovery/mem_component_recovery.1.ddl.aql: Line 22: * start 2 nodes, bulkload a dataset, copy it to in-memory dataset, query data from memory, kill one node and wait until the failover complete, query the data again. > line break? Done Line 28: create dataverse TinySocial; > Indentation? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/mem_component_recovery/mem_component_recovery.2.update.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/mem_component_recovery/mem_component_recovery.2.update.aql: Line 22: * start 2 nodes, bulkload a dataset, copy it to in-memory dataset, query data from memory, kill one node and wait until the failover complete, query the data again. > line break? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/mem_component_recovery/mem_component_recovery.3.txnqbc.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/mem_component_recovery/mem_component_recovery.3.txnqbc.aql: Line 22: * start 2 nodes, bulkload a dataset, copy it to in-memory dataset, query data from memory, kill one node and wait until the failover complete, query the data again. > line break? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/mem_component_recovery/mem_component_recovery.6.txnqar.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/mem_component_recovery/mem_component_recovery.6.txnqar.aql: Line 22: * start 2 nodes, bulkload a dataset, copy it to in-memory dataset, query data from memory, kill one node and wait until the failover complete, query the data again. > line break Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/metadata_node/metadata_node.1.ddl.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/metadata_node/metadata_node.1.ddl.aql: Line 22: * start 2 nodes, create a dataset, kill metadata node and wait until the failover complete, verify the dataset still exists. > line break? Done Line 28: create dataverse TinySocial; > Indentation? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/metadata_node/metadata_node.2.txnqbc.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/metadata_node/metadata_node.2.txnqbc.aql: Line 22: * start 2 nodes, create a dataset, kill metadata node and wait until the failover complete, verify the dataset still exists. > line break? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-installer/src/test/resources/integrationts/replication/queries/failover/metadata_node/metadata_node.5.txnqar.aql File asterix-installer/src/test/resources/integrationts/replication/queries/failover/metadata_node/metadata_node.5.txnqar.aql: Line 22: * start 2 nodes, create a dataset, kill metadata node and wait until the failover complete, verify the dataset still exists. > line break? Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-om/src/main/java/org/apache/asterix/om/util/AsterixClusterProperties.java File asterix-om/src/main/java/org/apache/asterix/om/util/AsterixClusterProperties.java: Line 301: AsterixReplicationProperties repliactionProperties = AsterixAppContextInfo.getInstance() > a/repliactionProperties/replicationProperties/ Done Line 313: if (ncConfiguration.containsKey(replica)) { > What does it mean, if we got the replica from the replication properties, b NC configuration contains the current active NCs. Replication properties contains the static information of where each node replicas are. So, if any of these replica is not active, it is ignored. Line 337: e.printStackTrace(); > This looks a little scary. Why is it ok to ignore the exception? If we cannot send the request to the NC, which according to our discussion with Yingyi, we decided we should assume reliable networking, then this NC has failed but we haven't received the failure notification yet. Since this request has already been added to the pending requests, then when the failure notification arrives, these partitions will be directed to a different active replica (if we find any). Line 382: e.printStackTrace(); > Should we just continue to run if this fails? If we cannot send the request to the NC, which according to our discussion with Yingyi, we decided we should assume reliable networking, then this NC has failed but we haven't received the failure notification yet. When the failure notification arrives, some other active replica will handle the metadata partition. After that, another metadata node takeover request will be sent to this newly assigned replica. https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-replication/src/main/java/org/apache/asterix/replication/management/ReplicationChannel.java File asterix-replication/src/main/java/org/apache/asterix/replication/management/ReplicationChannel.java: Line 341: String compoentId = LSMComponentProperties.getLSMComponentID(afp.getFilePath()); > s/compoentId/componentId/ Done Line 381: Map<String, ClusterPartition[]> nodePartitions = ((IAsterixPropertiesProvider) asterixAppRuntimeContextProvider > line break? our formatter allows this length :) https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-replication/src/main/java/org/apache/asterix/replication/management/ReplicationManager.java File asterix-replication/src/main/java/org/apache/asterix/replication/management/ReplicationManager.java: Line 123: private String hostIPAddressFirstOctet = null; > This doesn't seem to be a good idea. Why would multiple network interfaces UCI wireless networks have different first byte than wired networks. I encountered an issue when I was doing demos and both wired and wireless interfaces were active. In a real setup however, this shouldn't be the case. This is used just in case the IP address of a node has changed since it failed and reconnected. The new IP address is broadcasted to other replicas so that they know where to reconnect this the node that just came back. This IP address configuration should be centralized on the CC or something like ZooKeeper and other replicas should fetch it from there. https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-replication/src/main/java/org/apache/asterix/replication/recovery/RemoteRecoveryManager.java File asterix-replication/src/main/java/org/apache/asterix/replication/recovery/RemoteRecoveryManager.java: Line 62: //Currently we will not allow a node to perform remote recovery since another replica already tookover its workload > line break? Done Line 73: //start recovery recovery steps > Just 1 recovery? Done Line 88: throw new IllegalStateException("no ACTIVE remote replica(s) exists to performe remote recovery"); > s/performe/perform/ Done Line 103: maxRemoteLSN = replicationManager.getMaxRemoteLSN(selectedRemoteReplicas.keySet()); > Why is maxRemoteLSN initialized separately? Removed. https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-replication/src/main/java/org/apache/asterix/replication/storage/LSMComponentProperties.java File asterix-replication/src/main/java/org/apache/asterix/replication/storage/LSMComponentProperties.java: Line 142: public void setComponentId(String componentId) { > unused? Removed. Line 150: public void setOriginalLSN(long lSN) { > unused? Removed. https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-replication/src/main/java/org/apache/asterix/replication/storage/LSMIndexFileProperties.java File asterix-replication/src/main/java/org/apache/asterix/replication/storage/LSMIndexFileProperties.java: Line 46: public LSMIndexFileProperties(String filePath, long fileSize, String nodeId, boolean lsmComponentFile, > unused Now it is used in create() Line 96: LSMIndexFileProperties fileProp = new LSMIndexFileProperties(); > Just use the constructor with arguments? Done. Line 105: public void setFilePath(String filePath) { > unused? Removed. Line 117: public void setFileName(String fileName) { > unused? Removed. Line 137: public void setFileSize(long fileSize) { > unused? Removed. Line 145: public void setIdxName(String idxName) { > unused? Removed. Line 153: public void setLsmComponentFile(boolean lsmComponentFile) { > unused? Removed. Line 161: public void setRequiresAck(boolean requiresAck) { > unused? Removed. https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-replication/src/main/java/org/apache/asterix/replication/storage/ReplicaResourcesManager.java File asterix-replication/src/main/java/org/apache/asterix/replication/storage/ReplicaResourcesManager.java: Line 78: } > Any reason why we don't use FileUtils.deleteQuietly here as well? Used. Line 126: } > Could we just use Files.deleteIfExists here? Done Line 337: } > "return name.endsWith(LSM_COMPONENT_MASK_SUFFIX);" ? Done. Line 347: } > "return !name.endsWith(LSM_COMPONENT_MASK_SUFFIX);"? Done Line 353: if (name.equalsIgnoreCase(PersistentLocalResourceRepository.METADATA_FILE_NAME)) { > "return name.equalsIgnoreCase(PersistentLocalResourceRepository.METADATA_FI Done https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/PersistentLocalResourceRepository.java File asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/PersistentLocalResourceRepository.java: Line 166: ObjectOutputStream oosToFos = new ObjectOutputStream(fos)) { > Nice, AutoCloseables :) :) https://asterix-gerrit.ics.uci.edu/#/c/573/3/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/service/recovery/RecoveryManager.java File asterix-transactions/src/main/java/org/apache/asterix/transaction/management/service/recovery/RecoveryManager.java: Line 1119: //------------------------------------------------------------------------- > I think it would be nice to split this into 2 methods - one for the analysi I will be modifying the recovery methods in the next patch. I will see how feasible it is. Line 1155: //if we don't have enough memory for one more job, we will force all jobs to spill their cached entities to disk. > Line breaks? Done Line 1268: datasetLifecycleManager.open(resourceAbsolutePath); > It seems that these don't get closed if something throws ... Nice catch :) I moved the closing part to the finally block and added a try-catch between getComponentLSN to close the recently opened index in case of exception. -- To view, visit https://asterix-gerrit.ics.uci.edu/573 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ice26d980912a315fcb3efdd571d6ce88717cfea4 Gerrit-PatchSet: 3 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Murtadha Hubail <hubail...@gmail.com> Gerrit-Reviewer: Ian Maxon <ima...@apache.org> Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu> Gerrit-Reviewer: Murtadha Hubail <hubail...@gmail.com> Gerrit-Reviewer: Till Westmann <ti...@apache.org> Gerrit-Reviewer: abdullah alamoudi <bamou...@gmail.com> Gerrit-HasComments: Yes