Murtadha Hubail has posted comments on this change. Change subject: Introduce Strategy Based Replication and Fault-Tolerance ......................................................................
Patch Set 14: (21 comments) @Yingyi, I addressed your comments in [1]. Regarding unit tests, currently we don't have a way to simulate node failures (on a single JVM) to properly test different fault-tolerance strategies. I have a plan to enable that and add recovery unit tests, but I have to do more refactoring first. [1] https://asterix-gerrit.ics.uci.edu/#/c/1529/ https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/nc/RecoveryManager.java File asterixdb/asterix-app/src/main/java/org/apache/asterix/app/nc/RecoveryManager.java: PS14, Line 185: LOGGER > Check logging level before calling log. Done https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/replication/MetadataNodeFaultToleranceStrategy.java File asterixdb/asterix-app/src/main/java/org/apache/asterix/app/replication/MetadataNodeFaultToleranceStrategy.java: PS14, Line 73: synchronized > why synchronized is needed for each method? That's true in the case of node join and node failure. However, the other methods which are message based, each message runs on a different thread. In addition, there is a plan to make an HTTP API to manage cluster state. All these events must be processed one at a time since each response depends on the current state of the cluster. PS14, Line 132: HyracksDataException > Use ErrorCode: Done PS14, Line 144: log > Check logging level before the log call? Done https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/replication/NoFaultToleranceStrategy.java File asterixdb/asterix-app/src/main/java/org/apache/asterix/app/replication/NoFaultToleranceStrategy.java: PS14, Line 84: > Use ErrorCode: Done https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-app/src/main/java/org/apache/asterix/util/FaultToleranceUtil.java File asterixdb/asterix-app/src/main/java/org/apache/asterix/util/FaultToleranceUtil.java: PS14, Line 62: LOGGER > Check logging level before calling log? Done https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-app/src/main/resources/cluster.xml File asterixdb/asterix-app/src/main/resources/cluster.xml: PS14, Line 35: > > How to set those parameters in NCService based installation? That's a question for Mike Blow :) As long as there is a documented way to set them there, it's a matter of parsing them correctly. We should deprecate managix and migrate everything to NCService. Otherwise we will continue to have these questions. https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-app/src/test/resources/runtimets/results/api/replication/replication.1.adm File asterixdb/asterix-app/src/test/resources/runtimets/results/api/replication/replication.1.adm: PS14, Line 3: batchsize > Where do the original information "enabled" and "factor" go now? In the high availability tag of the cluster.xml. The "factor" will not be there if the specified replication strategy doesn't require a factor configuration. https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/ClusterProperties.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/ClusterProperties.java: PS14, Line 78: getNode > getNode() -> getNodes() The getNode method is auto-generated from the cluster.xsd. If I change the Node element to Nodes, it will be more confusing when we replace all Node variables by Nodes. https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/replication/ChainedDeclusteringReplicationStrategy.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/replication/ChainedDeclusteringReplicationStrategy.java: PS14, Line 48: log > Check logging level before the log call? Done PS14, Line 79: IllegalStateException > Using error code? Done https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/replication/ReplicationStrategyFactory.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/replication/ReplicationStrategyFactory.java: PS14, Line 52: IllegalArgumentException > Use error code? Done PS14, Line 59: IllegalStateException > Error code? Done https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/transactions/IRecoveryManager.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/transactions/IRecoveryManager.java: PS14, Line 37: INITIAL_RUN > comment what's the difference between INITIAL_RUN and NEW_UNIVERSE? Done PS14, Line 125: startLocalRecovery > Document this interface method. Done https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-common/src/test/java/org/apache/asterix/test/aql/TestExecutor.java File asterixdb/asterix-common/src/test/java/org/apache/asterix/test/aql/TestExecutor.java: PS14, Line 1123: rm > Use File API to remove? Done https://asterix-gerrit.ics.uci.edu/#/c/1405/14/asterixdb/asterix-installer/src/test/java/org/apache/asterix/installer/test/MetadataReplicationIT.java File asterixdb/asterix-installer/src/test/java/org/apache/asterix/installer/test/MetadataReplicationIT.java: PS14, Line 44: integrationts > Use File separator for Windows compatibility. Done https://asterix-gerrit.ics.uci.edu/#/c/1405/5/asterixdb/asterix-replication/src/main/java/org/apache/asterix/replication/management/ReplicationManager.java File asterixdb/asterix-replication/src/main/java/org/apache/asterix/replication/management/ReplicationManager.java: Line 108: > MAJOR SonarQube violation: Done Line 936: > MAJOR SonarQube violation: Done https://asterix-gerrit.ics.uci.edu/#/c/1405/5/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/replication/MetadataNodeFaultToleranceStrategy.java File asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/replication/MetadataNodeFaultToleranceStrategy.java: Line 80 > MAJOR SonarQube violation: Done https://asterix-gerrit.ics.uci.edu/#/c/1405/5/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/util/ClusterStateManager.java File asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/util/ClusterStateManager.java: Line 101 > MAJOR SonarQube violation: Done -- To view, visit https://asterix-gerrit.ics.uci.edu/1405 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1d1012f5541ce786f127866efefb9f3db434fedd Gerrit-PatchSet: 14 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
