----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14946/#review28259 -----------------------------------------------------------
Great stuff here but I think there are still come design decisions to address. I didn't spend too much time looking into the RecoverProcess just yet, I'll do that in a second pass after we address these comments first. Awesome! src/log/log.hpp <https://reviews.apache.org/r/14946/#comment55017> We need some more comments when a log shouldn't be in strict mode. And also why it's not in strict mode by default. src/log/log.hpp <https://reviews.apache.org/r/14946/#comment55018> So I don't fully understand the semantics here. What if I try and write to the log while the local replica is recovering? Do we need to gate other log operations on recovery? See the Registrar for examples of this to understand what I mean by "gate". src/log/recover.cpp <https://reviews.apache.org/r/14946/#comment55024> Okay, this is cheating and you know it! ;) Given the unclear semantics around recovery in Log anyway, I'm not sure that we have to do this. I think we should clean up those semantics first. src/log/replica.hpp <https://reviews.apache.org/r/14946/#comment55004> Hmm, what do you mean by backwards compatibility here? Most old users will want to turn strict on, and most new users will want to have it on by default and they'll likely miss it since it's a defaulted parameter. What's the issue in making strict = true by default? src/log/replica.cpp <https://reviews.apache.org/r/14946/#comment55012> s/meta data/metadata/ src/log/replica.cpp <https://reviews.apache.org/r/14946/#comment55020> If I'm reading this correctly both the 'persist(const ReplicaState&)' and 'persist(uint64_t)' functions are only used in 'setState' and 'setPromised' respectively. Let's just inline them now (less functions). In addition, just like the "persist" naming, how about renaming setState and setPromised to 'update' overloaded by argument type? Then in the code we'd see things like 'update(proposal)' and 'update(state)'. Basically the 'update' family of functions will correspond to updating the metadata. src/log/replica.cpp <https://reviews.apache.org/r/14946/#comment55013> I'd probably still use the assignment operator here just because it's less abrupt with the surrounding code. src/log/replica.cpp <https://reviews.apache.org/r/14946/#comment55014> s/mater/matter/ src/log/replica.cpp <https://reviews.apache.org/r/14946/#comment55015> s/log/replica/ src/log/replica.cpp <https://reviews.apache.org/r/14946/#comment55016> Why can't the default metadata status just be EMPTY? Then no need to set it explicitly right? src/messages/log.proto <https://reviews.apache.org/r/14946/#comment55007> How about s/NORMAL/VOTING/. Also, see my comments below but I think we should embed this enum and then kill the 'REPLICA_' prefixes. src/messages/log.proto <https://reviews.apache.org/r/14946/#comment55003> s/meta data/metadata/ src/messages/log.proto <https://reviews.apache.org/r/14946/#comment55005> How about s/ReplicaInfo/Metadata/? This is more inline with things like 'Promise' too. src/messages/log.proto <https://reviews.apache.org/r/14946/#comment55006> Let's move the ReplicaState enum as an embedded message. See things like Value::Type for other examples of this. Also, I'd like to call it 'Status' instead since state is really overloaded throughout all of this code. ;) message Metadata { enum Status { VOTING = 1; .. } required Status status = 1 [default = EMPTY]; } This will let you address things cleanly, for example: Metadata::Status::VOTING. Also, I added a default, not sure if that is correct or not. src/messages/log.proto <https://reviews.apache.org/r/14946/#comment55009> // DEPRECATED! src/messages/log.proto <https://reviews.apache.org/r/14946/#comment55008> s/REPLICA_INFO/METADATA/ src/messages/log.proto <https://reviews.apache.org/r/14946/#comment55010> // DEPRECATED! src/messages/log.proto <https://reviews.apache.org/r/14946/#comment55011> I think reusing the RecoverRequest for both the network broadcast and a RecoverProcess adds unnecessarily complexity. Why can't the RecoverProcess just use it's Shared<Replica> to explicitly ask the Replica to set update it's metadata? - Benjamin Hindman On Nov. 5, 2013, 12:52 a.m., Jie Yu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/14946/ > ----------------------------------------------------------- > > (Updated Nov. 5, 2013, 12:52 a.m.) > > > Review request for mesos and Benjamin Hindman. > > > Bugs: MESOS-736 > https://issues.apache.org/jira/browse/MESOS-736 > > > Repository: mesos-git > > > Description > ------- > > This is the last patch of a series of patches that implement catch-up > replicated log. > > Here is summary of this patch: > 1) Introduced RecoverRequest/RecoverResponse in log.proto. > 2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery > information (maintain backwards compatibility). > 3) A 2-phase empty log start-up algorithm. > 4) Added a test to test two competing recover processes. > > This is a joint work with Yan Xu. > > > Diffs > ----- > > src/Makefile.am 9780d07 > src/log/log.hpp 77edc7a > src/log/recover.hpp PRE-CREATION > src/log/recover.cpp PRE-CREATION > src/log/replica.hpp d1f5ead > src/log/replica.cpp 59a6ff3 > src/messages/log.proto 3d5859f > src/tests/log_tests.cpp ff5f86c > > Diff: https://reviews.apache.org/r/14946/diff/ > > > Testing > ------- > > bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* > --gtest_repeat=100 > > > Thanks, > > Jie Yu > >
