[
https://issues.apache.org/jira/browse/HAMA-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419420#comment-13419420
]
Suraj Menon commented on HAMA-557:
----------------------------------
Thanks for the detailed review, Thomas.
I actually continued working from the point of this patch.
> Compile Problem TestCheckpoint overrides the method replayMessages, but it
> does not exist, should it exist?
Sorry, I have fixed the compile problem. You caught me running mvn install
-DskipTests=true.
> In hama-core there is a folder created called "nullzookeeper", from what
> testcase does that come from?
I think it comes from TestSyncService. I shall look into it.
> Can we remove the tilde's ~ from debug output?
The log lines with ~s are going to be removed completely. I only have them for
a quick check on progress.
> Do you think we should stick with defining interfaces with "I" in front? I'm
> naming interfaces without them and call the concrete implementations *Impl.
> What do you think is the best?
I can change it so that we continue with the naming convention that is already
there.
> Now we have a lot of services, we could extract init and close to a
> superinterface, WDYT? Don't know about the usage then if they can be composed.
Most of our services get initialized by ReflectionUtils.newInstance. So far we
don't have a common service that validates the interaction between these
services. Let's look into this once we have such a requirement. I also feel
the init methods may need different signatures for different services.
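
To illustrate the concern about differing init signatures, here is a minimal
sketch. It uses plain Java reflection as a stand-in for
ReflectionUtils.newInstance, and every class name, method and configuration
key in it is hypothetical rather than taken from the patch. Both services can
be created reflectively through a no-arg constructor, but their init methods
take different arguments, so a shared init() in a superinterface would not fit
both without extra plumbing.

```java
import java.util.HashMap;
import java.util.Map;

public class ServiceInitSketch {

  /** Hypothetical service whose init needs only the configuration. */
  public static class SyncServiceSketch {
    public String quorum;

    public SyncServiceSketch() {}

    public void init(Map<String, String> conf) {
      // "sync.quorum" is an invented key used only for this illustration.
      quorum = conf.getOrDefault("sync.quorum", "localhost");
    }
  }

  /** Hypothetical service whose init also needs per-task runtime context. */
  public static class CheckpointServiceSketch {
    public String location;

    public CheckpointServiceSketch() {}

    public void init(Map<String, String> conf, String taskId, long superstep) {
      // "checkpoint.dir" is likewise an invented key.
      String base = conf.getOrDefault("checkpoint.dir", "/tmp/checkpoint");
      location = base + "/" + taskId + "/" + superstep;
    }
  }

  /** Stand-in for a reflective factory: calls the no-arg constructor. */
  public static <T> T newInstance(Class<T> clazz) {
    try {
      return clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("checkpoint.dir", "/data/ckpt");

    // Construction is uniform, initialization is not.
    SyncServiceSketch sync = newInstance(SyncServiceSketch.class);
    sync.init(conf);

    CheckpointServiceSketch checkpoint = newInstance(CheckpointServiceSketch.class);
    checkpoint.init(conf, "task_0", 3L);

    System.out.println(sync.quorum);
    System.out.println(checkpoint.location);
  }
}
```

Only close() has an obviously common shape here, which is one reason to defer
the superinterface until there is a concrete need for it.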
Thanks, your additional notes had some good catches. I have already fixed a few
of those. In my final version, I am planning to add more unit test cases along
with fixes for some issues I am encountering on my cluster. Hopefully, you were
happy with the way the refactored pieces interact with each other. I wanted
these to be building blocks for future work.
> Implement Checkpointing service in Hama
> ---------------------------------------
>
> Key: HAMA-557
> URL: https://issues.apache.org/jira/browse/HAMA-557
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core
> Affects Versions: 0.6.0
> Reporter: Suraj Menon
> Assignee: Suraj Menon
> Fix For: 0.6.0
>
> Attachments: HAMA-505-557-610-611-v1.patch,
> HAMA-557-ft-framework.patch
>
>
> Implement checkpointing service in Apache Hama. My patches for HAMA-533 and
> HAMA-534 are blocked on this.
> - Checkpointing should be done as messages are either sent or received. I
> prefer while receiving messages, as we can achieve some parallelism with
> asynchronous messages. Please comment if you differ.
> - BSPMaster should hold the checkpoint status for each task. Checkpoint
> status includes superstep count and file information for which checkpointing
> is complete
> - MessageManager should notify Checkpointer of a new message at BSPPeer.
> - Implement/Reuse MessageBundle class as splitClass in BSPPeerImpl for
> recovery in initInput.
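
The bullet points above can be read as the following rough sketch of
checkpointing on the receive path. It is not the code in the attached patches.
All type names, the in-memory "storage" map and the checkpoint path layout are
assumptions made for illustration. A message manager would call
messageReceived for each incoming message, the checkpointer would persist the
buffered messages at the end of the superstep, and the master would record the
superstep count and file information per task.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CheckpointSketch {

  /** Hypothetical callback the message manager invokes per received message. */
  public interface MessageListener {
    void messageReceived(String message);
  }

  /** Hypothetical status record: completed superstep plus file information. */
  public static final class CheckpointStatus {
    public final long superstep;
    public final String file;

    public CheckpointStatus(long superstep, String file) {
      this.superstep = superstep;
      this.file = file;
    }
  }

  /** Buffers received messages and persists them when a superstep ends. */
  public static class Checkpointer implements MessageListener {
    private final String taskId;
    private final List<String> buffer = new ArrayList<>();
    // In-memory stand-in for a distributed file system.
    private final Map<String, List<String>> storage = new HashMap<>();

    public Checkpointer(String taskId) {
      this.taskId = taskId;
    }

    @Override
    public void messageReceived(String message) {
      buffer.add(message);
    }

    /** Writes the buffered messages and returns the status to report. */
    public CheckpointStatus completeSuperstep(long superstep) {
      // The path layout is invented for this sketch.
      String file = "checkpoint/" + taskId + "/" + superstep;
      storage.put(file, new ArrayList<>(buffer));
      buffer.clear();
      return new CheckpointStatus(superstep, file);
    }

    /** Returns the messages stored for a completed checkpoint. */
    public List<String> replay(CheckpointStatus status) {
      return storage.getOrDefault(status.file, new ArrayList<>());
    }
  }

  /** Master-side registry of the latest checkpoint status per task. */
  public static class MasterSketch {
    private final Map<String, CheckpointStatus> statusByTask = new HashMap<>();

    public void report(String taskId, CheckpointStatus status) {
      statusByTask.put(taskId, status);
    }

    public CheckpointStatus latest(String taskId) {
      return statusByTask.get(taskId);
    }
  }

  public static void main(String[] args) {
    Checkpointer checkpointer = new Checkpointer("task_0");
    MasterSketch master = new MasterSketch();

    checkpointer.messageReceived("a");
    checkpointer.messageReceived("b");
    master.report("task_0", checkpointer.completeSuperstep(1L));

    // On recovery, the task asks the master where to resume from.
    CheckpointStatus last = master.latest("task_0");
    System.out.println(last.superstep + " " + checkpointer.replay(last));
  }
}
```

In this sketch the recovering task replays the messages of the last completed
superstep, which corresponds to reusing the stored message bundle as input
during recovery.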