[ 
https://issues.apache.org/jira/browse/HAMA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102263#comment-13102263
 ] 

Vinod Kumar Vavilapalli commented on HAMA-431:
----------------------------------------------

Long comment, the leisure of the weekend :)

Good to see the ball rolling.

I had a browsing session on the current HAMA code (let's call this HamaV1 code) 
and the mapreduce-integration branch (actually this should be Yarn-integration; 
let's call this HamaV2).

Some thoughts follow. Some of the following may be naive as I am new around 
here :)

*Regarding the Job and Task state machines*: Yes it does look like you don't 
need a lot of states and their corresponding transitions here, from what I can 
see from HamaV1 _JobInProgress_ and _TaskInProgress_. Is that because you don't 
have good failure handling in HamaV1 (as I read in one of the presentations)? If 
that isn't true, ignore what follows. Otherwise, I think it is the right time 
to think about fault tolerance (if at all) and write down the state machines to 
include the faulty scenarios.
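
To make "write down the state machines" concrete, here is a minimal sketch of a task state machine that includes failure and retry. The state names, the transition table and the class names ({{TaskState}}, {{TaskStateMachine}}) are my assumptions for illustration only; they are not taken from HamaV1, HamaV2 or MRV2.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical states; a real design may need more (or fewer).
enum TaskState { NEW, RUNNING, SUCCEEDED, FAILED, KILLED }

// Hypothetical sketch: an explicit table of legal transitions,
// including the faulty scenarios (FAILED, KILLED, retry).
class TaskStateMachine {
    private static final Map<TaskState, Set<TaskState>> ALLOWED =
        new EnumMap<>(TaskState.class);

    static {
        ALLOWED.put(TaskState.NEW,
            EnumSet.of(TaskState.RUNNING, TaskState.KILLED));
        ALLOWED.put(TaskState.RUNNING,
            EnumSet.of(TaskState.SUCCEEDED, TaskState.FAILED, TaskState.KILLED));
        // Assumption: a failed task may be retried by running it again.
        ALLOWED.put(TaskState.FAILED, EnumSet.of(TaskState.RUNNING));
        // Terminal states: no outgoing transitions.
        ALLOWED.put(TaskState.SUCCEEDED, EnumSet.noneOf(TaskState.class));
        ALLOWED.put(TaskState.KILLED, EnumSet.noneOf(TaskState.class));
    }

    private TaskState state = TaskState.NEW;

    TaskState state() {
        return state;
    }

    // Moves to the next state, or throws if the table does not allow it.
    void transition(TaskState next) {
        if (!ALLOWED.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next);
        }
        state = next;
    }
}
```

Writing the table down first makes it easy to argue about which failure transitions we actually want before any of the surrounding code exists.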

*Implementation of barrier synchronization*: Not sure of the problems you ran 
into with ZooKeeper in HamaV1, but can't we use the _ApplicationMaster_ (AM) in 
HamaV2 as a barrier synchronization service? Each {{BSPPeer}} could 
periodically poll the AM to ask whether it can proceed to the next superstep. 
If and when the AM goes down, all the BSPPeers just wait there spinning until 
the AM is restarted by the Yarn _ResourceManager_.
   -- Pros: Avoiding ZooKeeper frees BSP from the ZK external dependency, one 
less service needed for running HAMA apps.
   -- Cons: It robs HAMA of the notification push via ZK's watcher 
mechanism (notification push vs. periodic pull). (This should be agreeable, no?)
  Thoughts?
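
A rough sketch of the polling idea, to show how little is needed on either side. Everything here is hypothetical: {{BarrierCoordinator}}, {{PollingPeer}}, {{mayProceed}} and {{awaitBarrier}} are names I made up, the RPC layer is replaced by a direct method call, and AM restart/recovery is not modelled.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical AM-side barrier service; not an existing Hama or YARN class.
class BarrierCoordinator {
    private final int numPeers;
    private long currentSuperstep = 0;
    private final Set<Integer> arrived = new HashSet<>();

    BarrierCoordinator(int numPeers) {
        this.numPeers = numPeers;
    }

    // Called by a peer that has finished `superstep`. Returns true once
    // all peers have reached that barrier, i.e. the caller may proceed.
    synchronized boolean mayProceed(int peerId, long superstep) {
        if (superstep < currentSuperstep) {
            return true; // barrier for that superstep was already released
        }
        arrived.add(peerId);
        if (arrived.size() == numPeers) {
            arrived.clear();    // reset for the next superstep
            currentSuperstep++; // release everyone still polling
            return true;
        }
        return false;
    }

    synchronized long currentSuperstep() {
        return currentSuperstep;
    }
}

// Hypothetical peer-side helper: periodic pull instead of a ZK watcher push.
class PollingPeer {
    static void awaitBarrier(BarrierCoordinator am, int peerId,
                             long superstep, long pollMillis)
            throws InterruptedException {
        // With a real RPC, a failed call (AM down) would be caught here
        // and simply retried on the next iteration.
        while (!am.mayProceed(peerId, superstep)) {
            Thread.sleep(pollMillis);
        }
    }
}
```

The poll interval is the price of the "Cons" above: peers learn about a released barrier up to one interval late, whereas a watcher would notify them immediately.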

*Regarding use of MR classes*:
 - _Reuse of MRV2 classes_: I was appalled by the amount of Hadoop MapReduce 
code (kinda) forked in HamaV1. Glad that with Yarn and HamaV2, most of the 
forking will be gone. Still, one look at the HamaV2 code you have at Google 
Code tells me you are trying to mimic MRV2 (MapReduce over YARN) internals. 
IMO, that isn't needed as the Job, Task, TaskAttempt etc in MR have concepts 
specific to MapReduce like Map/Reduce tasks. I think we can redesign the 
objects HAMA needs here with far more ease. And that's cleaner too.
 - _Code reuse from MRV2_: OTOH, I do clearly see that we should re-use MRV2 
components like ContainerLauncher (launches containers on nodes) and 
RMContainerAllocator (requests containers from ResourceManager). I'll see how we 
can move these to a separate common library module from MRV2 so that Hama (and 
possibly others) can use them.

*Meta comment*: Instead of jumping into writing the implementation, I think it 
helps to spend some time developing the design till it reaches some level of 
stability and then writing down the module structure (like BspAppMaster module, 
BspChild module etc.), followed by the interfaces of all the data objects and 
the components, and finally wiring them together. Once we have all the 
interfaces and communication patterns in place, implementation can be done in 
parallel. It did help us write MRV2 a lot cleaner; I am sure it will help us 
here too.

*General infra thought*: I think having this branch in Apache svn helps HAMA's 
incubation status. It will also be easy for anyone else on the current 
hama-dev list who is interested in working on this to use Apache lists, svn 
etc. (Oh, BTW, I am looking to collaborate too :) ). What do you think?

> MapReduce NG integration
> ------------------------
>
>                 Key: HAMA-431
>                 URL: https://issues.apache.org/jira/browse/HAMA-431
>             Project: Hama
>          Issue Type: New Feature
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>
> We should take a look at how to integrate Hama's BSP Engine to Hadoop's 
> nextGen application platform.
> Can be currently found in the 0.23 branch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
