Issues about Partitioning and Record converter

Edward J. Yoon Fri, 03 May 2013 17:32:47 -0700

Hi all,

I've been rereading our old discussions about the record converter,
superstep injection, and the common module:


- http://markmail.org/message/ol32pp2ixfazcxfc
- http://markmail.org/message/xwtmfdrag34g5xc4

To clarify goals and objectives:

1. Parallel input partitioning is necessary for the scalability and
elasticity of Bulk Synchronous Parallel processing. (It's not a
memory issue, a Disk/Spilling Queue issue, or HAMA-644. Please don't
shake.)
2. Input partitioning should be handled at the BSP framework level, and
it applies to every Hama job, not only to graph jobs.
3. Unnecessary I/O overhead needs to be avoided, and input from NoSQL
stores should also be considered.
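
To illustrate goal 2, here is a minimal sketch of key-based hash
partitioning applied to arbitrary records, independent of job type. The
class and method names (HashPartitionSketch, partitionFor, partition)
are hypothetical and are not Hama APIs; records are plain strings, and
the in-memory lists stand in for whatever the framework would hand to
each task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HashPartitionSketch {

    // Hypothetical helper: maps a record key to one of numTasks partitions.
    static int partitionFor(String key, int numTasks) {
        // Clear the sign bit so the modulo result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numTasks;
    }

    // Hypothetical helper: groups keys into numTasks buckets, one per task.
    static List<List<String>> partition(List<String> keys, int numTasks) {
        List<List<String>> parts = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            parts.add(new ArrayList<>());
        }
        for (String key : keys) {
            parts.get(partitionFor(key, numTasks)).add(key);
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        List<List<String>> parts = partition(keys, 3);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("task " + i + " -> " + parts.get(i));
        }
    }
}
```

The same key always lands in the same partition, so a graph job and a
non-graph job could share this step without a job-specific rewrite of
the input.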

The current problem is that every input of a graph job has to be
rewritten to HDFS. If you have a good idea, please let me know.

--
Best Regards, Edward J. Yoon
@eddieyoon
