[ 
https://issues.apache.org/jira/browse/HAMA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suraj Menon updated HAMA-567:
-----------------------------

    Attachment: Mapper.java

Hi, please check the simplest Mapper that I have written; it is a work in 
progress and not tested at all. The WritableKeyValues class is 
WritableComparable on the key. The idea is that every mapper reads its input 
and exchanges its key distribution with the other peers while writing 
everything to a disk queue. I am working on a spilling queue with a combiner. 
So in the first superstep every mapper learns the global key distribution, and 
each peer is assigned responsibility for a partition of keys such that a 
minimum of messages is exchanged. The message exchange happens in the next 
superstep. Hence I need to pass a reference to the message queue to the next 
superstep. I also want to achieve parallelism by having a thread work on the 
combiners during the expensive sync operation. Also, you can see how ugly 
getting the peer ID is today; we need a new API to find the peer ID from a 
given task ID. All this made me feel the need for the API changes.
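
To make the partition assignment step concrete, here is a rough sketch of one 
way it could work. This is not the code in Mapper.java: the class name 
PartitionAssigner, the counts[peer][bucket] layout, and the greedy rule (each 
key bucket goes to the peer that already holds most of it) are assumptions for 
illustration, and the sketch ignores load balancing between peers.

```java
/** Hypothetical helper for illustration; not part of Hama or Mapper.java. */
final class PartitionAssigner {

    /**
     * counts[p][b] is the number of records peer p holds for key bucket b,
     * as learned from the key distribution exchanged in the first superstep.
     * Returns owner[b], the peer with the largest local count for bucket b
     * (ties go to the lowest peer index).
     */
    static int[] assignOwners(long[][] counts) {
        int buckets = counts.length == 0 ? 0 : counts[0].length;
        int[] owner = new int[buckets];
        for (int b = 0; b < buckets; b++) {
            int best = 0;
            for (int p = 1; p < counts.length; p++) {
                if (counts[p][b] > counts[best][b]) {
                    best = p;
                }
            }
            owner[b] = best;
        }
        return owner;
    }

    /** Counts the records that must travel to another peer under this assignment. */
    static long recordsToSend(long[][] counts, int[] owner) {
        long total = 0;
        for (int p = 0; p < counts.length; p++) {
            for (int b = 0; b < owner.length; b++) {
                if (owner[b] != p) {
                    total += counts[p][b];
                }
            }
        }
        return total;
    }
}
```

Every peer can run assignOwners on the same exchanged counts and reach the 
same assignment without another round of messages, which is why the actual 
record exchange can wait for the next superstep.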
                
> BSPPeer should provide means for chaining supersteps to share data among them.
> ------------------------------------------------------------------------------
>
>                 Key: HAMA-567
>                 URL: https://issues.apache.org/jira/browse/HAMA-567
>             Project: Hama
>          Issue Type: Improvement
>          Components: bsp core
>    Affects Versions: 0.6.0
>            Reporter: Suraj Menon
>             Fix For: 0.6.0
>
>         Attachments: Mapper.java
>
>
> In most scenarios, a superstep needs certain values or objects that were 
> computed in the previous superstep. When using the chaining Superstep design 
> to implement BSP algorithms, this gets a little ugly and difficult to 
> implement. BSPPeer should provide a means (preferably a Map<String, Object>) 
> for the next Superstep to ask for values from the previous superstep, using 
> a String token to query the map. Also, this map could be checkpointed 
> periodically in the background so that we can completely recover the state 
> of a task after failure. The BSPPeer object should have dedicated get and 
> set functions for updating values in the peer.
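
The get/set map proposed in the description above could look roughly like the 
sketch below. The class name PeerSharedState and its methods are hypothetical 
and are not part of the current BSPPeer API; the snapshot method only shows 
where a background checkpointer might hook in and does no serialization.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-peer store shared by chained supersteps; not Hama API. */
final class PeerSharedState {

    // ConcurrentHashMap so a checkpoint thread can read while a superstep writes.
    // Note that it rejects null keys and null values.
    private final Map<String, Object> values = new ConcurrentHashMap<>();

    /** Stores a value under a String token for later supersteps to read. */
    void put(String token, Object value) {
        values.put(token, value);
    }

    /** Returns the value for the token cast to the given type, or null if absent. */
    <T> T get(String token, Class<T> type) {
        Object value = values.get(token);
        return value == null ? null : type.cast(value);
    }

    /** Returns a copy of the current contents that a checkpointer could serialize. */
    Map<String, Object> snapshot() {
        return new HashMap<>(values);
    }
}
```

One superstep would call put("keyRange", range) before sync, and the next 
would read it back with get("keyRange", ...). A typed get keeps the casts out 
of the superstep code and fails with a ClassCastException if the token holds a 
different type.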


        
