[
https://issues.apache.org/jira/browse/HAMA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Suraj Menon updated HAMA-567:
-----------------------------
Attachment: Mapper.java
Hi, please take a look at the simplest Mapper that I have written. It is a work
in progress and has not been tested at all. The WritableKeyValues class is
WritableComparable on the key. The idea is that every mapper reads its input
and exchanges its key distribution with the other peers while writing
everything to a disk queue. I am working on a Spilling Queue with a combiner.
So in the first step, the mapper supersteps learn the global key distribution
and assign each peer responsibility for a partition of the keys such that a
minimum number of messages is exchanged. The message exchange happens in the
next superstep, so I need to pass a reference to the message queue to the next
superstep. I also want to achieve parallelism by having a thread work on the
combiners during the expensive sync operation. You can also see how ugly
getting the peer ID is today; we need a new API to find the peer ID from a
given task ID. All of this made me feel that the API changes are necessary.
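
To make the planning step concrete, here is a minimal sketch in plain Java
that does not use any Hama API. It assumes each peer has already received
every other peer's key histogram (key -> record count). The class
KeyPartitionPlanner and its methods are hypothetical names, not code from the
attached Mapper.java. The rule used here is a simple greedy one: each key is
owned by the peer that already holds the most records for it, so the largest
share never has to be sent. It ignores load balancing across peers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Hama: plans key ownership from per-peer histograms.
class KeyPartitionPlanner {

  // histograms.get(p) maps each key seen by peer p to the number of records it holds.
  // Returns key -> index of the peer that should own that key.
  static Map<String, Integer> assignOwners(List<Map<String, Long>> histograms) {
    Map<String, Integer> owner = new TreeMap<>();
    Map<String, Long> best = new HashMap<>();
    for (int peer = 0; peer < histograms.size(); peer++) {
      for (Map.Entry<String, Long> e : histograms.get(peer).entrySet()) {
        Long current = best.get(e.getKey());
        // Strict '>' keeps the lowest peer index on ties, so every peer
        // computes the same plan from the same histograms.
        if (current == null || e.getValue() > current) {
          best.put(e.getKey(), e.getValue());
          owner.put(e.getKey(), peer);
        }
      }
    }
    return owner;
  }

  // Number of records that must be sent to another peer under the given plan.
  static long recordsToSend(List<Map<String, Long>> histograms,
                            Map<String, Integer> owner) {
    long sent = 0;
    for (int peer = 0; peer < histograms.size(); peer++) {
      for (Map.Entry<String, Long> e : histograms.get(peer).entrySet()) {
        if (owner.get(e.getKey()) != peer) {
          sent += e.getValue();
        }
      }
    }
    return sent;
  }

  public static void main(String[] args) {
    Map<String, Long> peer0 = new HashMap<>();
    peer0.put("a", 5L);
    peer0.put("b", 1L);
    Map<String, Long> peer1 = new HashMap<>();
    peer1.put("a", 2L);
    peer1.put("b", 4L);
    peer1.put("c", 3L);

    List<Map<String, Long>> histograms = new ArrayList<>();
    histograms.add(peer0);
    histograms.add(peer1);

    Map<String, Integer> owner = assignOwners(histograms);
    System.out.println("owners = " + owner);
    System.out.println("records to send = " + recordsToSend(histograms, owner));
  }
}
```

Because every peer runs the same deterministic rule on the same histograms,
no extra round of communication is needed to agree on the plan; the actual
record exchange can then happen in the following superstep.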
> BSPPeer should provide means for chaining supersteps to share data among them.
> ------------------------------------------------------------------------------
>
> Key: HAMA-567
> URL: https://issues.apache.org/jira/browse/HAMA-567
> Project: Hama
> Issue Type: Improvement
> Components: bsp core
> Affects Versions: 0.6.0
> Reporter: Suraj Menon
> Fix For: 0.6.0
>
> Attachments: Mapper.java
>
>
> In most scenarios, a superstep needs certain values or objects that were
> computed in the previous superstep. When using the chaining Superstep design
> to implement BSP algorithms, this gets a little ugly/difficult to implement.
> BSPPeer should provide a means (preferably a map<String,Object>) so that the
> next Superstep can ask for values from the previous superstep, using a String
> token to query the map. Also, this map could be checkpointed periodically in
> the background so that we can completely recover the state of a task after a
> failure. The BSPPeer object should have dedicated get and set functions for
> updating values in the peer.
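
A rough sketch of the kind of map described above, in plain Java. The class
SuperstepSharedState and its set/get/snapshot methods are hypothetical and
only illustrate the proposal; they are not existing BSPPeer API. A
ConcurrentHashMap is assumed here so that a background checkpoint thread could
read the map while a superstep is still writing to it. Note that
ConcurrentHashMap rejects null values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the proposed BSPPeer additions; not existing Hama API.
class SuperstepSharedState {
  private final Map<String, Object> values = new ConcurrentHashMap<>();

  // Stores a value under a String token for later supersteps to read.
  void set(String token, Object value) {
    values.put(token, value);
  }

  // Returns the value stored under the token, or null if nothing was stored.
  // Throws ClassCastException if the stored value is not of the requested type.
  <T> T get(String token, Class<T> type) {
    return type.cast(values.get(token));
  }

  // Shallow copy that a checkpointing thread could serialize in the background.
  Map<String, Object> snapshot() {
    return new HashMap<>(values);
  }

  public static void main(String[] args) {
    SuperstepSharedState state = new SuperstepSharedState();

    // "Superstep 1" computes a value and leaves it for the next superstep.
    state.set("recordCount", 42L);

    // "Superstep 2" queries it by token.
    Long recordCount = state.get("recordCount", Long.class);
    System.out.println("recordCount = " + recordCount);
    System.out.println("checkpoint copy = " + state.snapshot());
  }
}
```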
--
This message is automatically generated by JIRA.