I can't post to JIRA because it is either down or very slow.

I dislike the idea as well, but it is the most efficient way to write the
vertices.
Consider the Wikipedia link set: 1 GB of text data as an adjacency list.
With the current trunk version it takes up to 10 GB.
I haven't measured how it looks with the patch, but I assume that it
will be less than 1 GB.
Suppose you have a 64 MB chunk size in HDFS: that means 160 BSP tasks to be
launched for 10 GB of input, as opposed to 16 for the optimal case of 1 GB.
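
The task counts above can be checked with a quick calculation. This is only
a sketch: it assumes one BSP task per HDFS block, binary units (1 GB = 1024
MB), and the 10 GB upper bound mentioned above; `num_tasks` is a name made
up for this illustration.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def num_tasks(input_bytes, block_bytes=64 * MB):
    """Assumed model: one task per HDFS block, rounding up for a partial last block."""
    return (input_bytes + block_bytes - 1) // block_bytes


print(num_tasks(10 * GB))  # trunk format, up to 10 GB -> 160
print(num_tasks(1 * GB))   # compact format, about 1 GB -> 16
```
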
I don't know if that's a convincing argument for you. Compatibility with
MapReduce shouldn't be our first aim; we can make a BSP job out of the
random graph generator.
However, it might be worth considering that Giraph supports all input
formats and has an input key/value-to-vertex parser that runs when
loading vertices.
Doing the same would shift the responsibility to the user, and we could
drop Writability of the vertices, thus removing the VertexWritable classes.
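
To make the parser idea concrete, here is a rough sketch of what a
user-supplied key/value-to-vertex parser could look like. It is not the
Hama or Giraph API: `Vertex`, `parse_vertex` and `load_vertices` are
hypothetical names, and it assumes tab-separated adjacency-list lines
delivered as (byte offset, line) records, the way a text input format
would hand them over.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    # Hypothetical in-memory vertex: an id plus the ids of its neighbours.
    vertex_id: str
    edges: list = field(default_factory=list)


def parse_vertex(key, value, sep="\t"):
    """Hypothetical user parser: turn one adjacency-list line into a Vertex.

    The key (assumed to be a byte offset) is ignored here.
    """
    parts = value.strip().split(sep)
    return Vertex(vertex_id=parts[0], edges=parts[1:])


def load_vertices(records, parser=parse_vertex):
    """Run the parser once per input record while loading, skipping blank lines."""
    return [parser(key, value) for key, value in records if value.strip()]


vertices = load_vertices([(0, "a\tb\tc"), (6, "b\ta"), (10, "c")])
print(vertices)
```

With something like this the framework only needs the raw text input, so
the vertices themselves never have to be written out as Writables.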

If you have a good trade-off idea, let me know.


2012/5/24 Edward J. Yoon (JIRA) <[email protected]>

>
>    [
> https://issues.apache.org/jira/browse/HAMA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282244#comment-13282244]
>
> Edward J. Yoon commented on HAMA-580:
> -------------------------------------
>
> I dislike this idea. This makes programming complex and discourages use of
> existing Mapper/Reducer e.g., Reducer, LongSumReducer, ...
>
> > Improve input of graph module
> > -----------------------------
> >
> >                 Key: HAMA-580
> >                 URL: https://issues.apache.org/jira/browse/HAMA-580
> >             Project: Hama
> >          Issue Type: Improvement
> >          Components: graph
> >    Affects Versions: 0.5.0
> >            Reporter: Thomas Jungblut
> >            Assignee: Thomas Jungblut
> >             Fix For: 0.5.0
> >
> >         Attachments: HAMA-580.patch, HAMA-580_1.patch
> >
> >
> > Currently it is too verbose, the wikipedia dataset is going to be
> > bloated from 0.95gb to 5gb just because it is writing the classes x-times.
>


-- 
Thomas Jungblut
Berlin <[email protected]>
