[
https://issues.apache.org/jira/browse/HAMA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102284#comment-13102284
]
Thomas Jungblut commented on HAMA-431:
--------------------------------------
Wow that's a wall of text :D
I'm no contributor (yet?), so I don't have SVN access; that was the main reason
I chose the Google Code repo.
Yes, we took a lot of Hadoop's old code for HamaV1. These days we don't have
failure recovery; detection should be on its way (HAMA-370).
*Fault tolerance* in HamaV2 should basically just check whether a container is
available through some kind of heartbeat. If a task isn't responding, we should
roll it back to the state it was in before. The task is responsible for saving
its state every superstep, e.g. the messages received from other peers. This
should be stored in HDFS along with the task ID so the AM can rerun the task
with this input. -> We need some kind of task attempts.
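To make the idea a bit more concrete, here is a rough sketch of the bookkeeping
the AM would need: heartbeat timestamps, the last saved superstep per task, and
an attempt counter. Everything in it (the HeartbeatMonitor class, its method
names, the string returned by planRerun) is made up for illustration and is not
existing Hama or YARN code; the actual state would live in HDFS, which is left
out here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: tracks task heartbeats and plans reruns from the last checkpoint. */
class HeartbeatMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();   // taskId -> last heartbeat time
    private final Map<String, Long> lastCheckpoint = new HashMap<>();  // taskId -> last saved superstep
    private final Map<String, Integer> attempts = new HashMap<>();     // taskId -> number of reruns

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called whenever a task reports in. */
    void heartbeat(String taskId, long nowMillis) {
        lastHeartbeat.put(taskId, nowMillis);
        attempts.putIfAbsent(taskId, 0);
    }

    /** Called after a task has saved its received messages for a superstep. */
    void checkpointed(String taskId, long superstep) {
        lastCheckpoint.put(taskId, superstep);
    }

    /** Returns the tasks whose last heartbeat is older than the timeout. */
    List<String> findUnresponsive(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Starts a new attempt and names the saved superstep it should be rerun from (-1 if none). */
    String planRerun(String taskId, long nowMillis) {
        int attempt = attempts.merge(taskId, 1, Integer::sum);
        lastHeartbeat.put(taskId, nowMillis); // give the new attempt a fresh deadline
        long superstep = lastCheckpoint.getOrDefault(taskId, -1L);
        return taskId + "/attempt-" + attempt + "/from-superstep-" + superstep;
    }
}
```

The AM would call findUnresponsive periodically and planRerun for each task it
returns; a real version would also need an upper bound on attempts.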
*Implementation of barrier synchronization:*
I would be very glad if we could get away from ZooKeeper's sync service. We had
a lot of ideas on how to get it running (see HAMA-387), but none of them helped.
Edward asked a question on their user list, but they just offered the same ideas
we had already tried.
{quote}
This should be agreeable, no?
{quote}
Polling is totally agreeable. I strongly suspect that ZooKeeper is polling
internally as well.
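A polling barrier could look roughly like the sketch below: each peer announces
that it has reached the barrier for a superstep and then polls until all peers
have arrived or a timeout expires. The PollingBarrier class and its in-memory
counter are assumptions for illustration only; in Hama the arrival count would
have to live somewhere all peers can reach (the AM, for example).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical polling barrier: peers announce arrival, then poll until all have arrived. */
class PollingBarrier {
    private final int numPeers;
    private final long pollIntervalMillis;
    // superstep -> number of peers that have reached the barrier
    private final ConcurrentMap<Long, AtomicInteger> arrivals = new ConcurrentHashMap<>();

    PollingBarrier(int numPeers, long pollIntervalMillis) {
        this.numPeers = numPeers;
        this.pollIntervalMillis = pollIntervalMillis;
    }

    /**
     * Registers the caller for the given superstep and polls until every peer
     * has arrived. Returns false on timeout or if the thread is interrupted.
     */
    boolean await(long superstep, long timeoutMillis) {
        AtomicInteger count = arrivals.computeIfAbsent(superstep, s -> new AtomicInteger());
        count.incrementAndGet();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (count.get() < numPeers) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // some peer never showed up
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A timeout here would also be a natural place to hook in failure detection,
since a peer that never reaches the barrier is most likely dead.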
{quote}
Reuse of MRV2 classes
{quote}
As you might see, I totally reuse your classes. It's cool, but cutting your
state machine handling down to something simpler is more work than rewriting it
from scratch.
{quote}
I do clearly see that we should re-use MRV2 components like ContainerLauncher
(launches containers on nodes), RMContainerAllocator (requests containers from
ResourceManager), I'll see how we can move these to a separate common library
module from MRV2 so that Hama (and possibly others) can use them.
{quote}
+1, that would be great.
{quote}
Instead of jumping into writing the implementation, I think it helps to spend
some time developing the design till it reaches some level of stability and
then writing down the module structure [...]
{quote}
You are right.
> MapReduce NG integration
> ------------------------
>
> Key: HAMA-431
> URL: https://issues.apache.org/jira/browse/HAMA-431
> Project: Hama
> Issue Type: New Feature
> Reporter: Thomas Jungblut
> Assignee: Thomas Jungblut
>
> We should take a look at how to integrate Hama's BSP Engine to Hadoop's
> nextGen application platform.
> Can be currently found in the 0.23 branch.