Great that you share that view, so let's do it like this.

2012/5/24 Edward J. Yoon <[email protected]>
> > However it might be a good thing to consider that giraph is supporting all
> > inputformats and have a input key/value to vertex parser that runs when
> > loading vertices.
> > This would shift the responsibility to the user and we would remove
> > Writability of the vertices, thus removing the VertexWritable classes.
>
> +1
>
> On Thu, May 24, 2012 at 4:30 PM, Thomas Jungblut
> <[email protected]> wrote:
> > Can't post to jira because it is down or has high latency.
> >
> > I dislike the idea as well, but it is the most optimal case to write the
> > vertices.
> > Consider the Wikipedia linkset, 1gb of text data as adjacency list.
> > With current trunk version it has at most 10gb.
> > I have no clear check of how it is with that patch, but I assume that it
> > will be less than 1gb.
> > Suppose you have 64mb chunksize in HDFS, meaning 160 bsp tasks to be
> > launched, as opposed to 16 for the most optimal case.
> > I don't know if that's an argument for you. Compatibility to MapReduce
> > shouldn't be our first aim, we can make a BSP job out of the random graph
> > generator.
> > However it might be a good thing to consider that giraph is supporting all
> > inputformats and have a input key/value to vertex parser that runs when
> > loading vertices.
> > This would shift the responsibility to the user and we would remove
> > Writability of the vertices, thus removing the VertexWritable classes.
> >
> > If you have a good trade-off idea, let me know.
> >
> >
> > 2012/5/24 Edward J. Yoon (JIRA) <[email protected]>
> >
> >>
> >> [
> >> https://issues.apache.org/jira/browse/HAMA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282244#comment-13282244 ]
> >>
> >> Edward J. Yoon commented on HAMA-580:
> >> -------------------------------------
> >>
> >> I dislike this idea. This makes programming complex and discourages use of
> >> existing Mapper/Reducer e.g., Reducer, LongSumReducer, ...
> >>
> >> > Improve input of graph module
> >> > -----------------------------
> >> >
> >> >                 Key: HAMA-580
> >> >                 URL: https://issues.apache.org/jira/browse/HAMA-580
> >> >             Project: Hama
> >> >          Issue Type: Improvement
> >> >          Components: graph
> >> >    Affects Versions: 0.5.0
> >> >            Reporter: Thomas Jungblut
> >> >            Assignee: Thomas Jungblut
> >> >             Fix For: 0.5.0
> >> >
> >> >         Attachments: HAMA-580.patch, HAMA-580_1.patch
> >> >
> >> >
> >> > Currently it is too verbose, the wikipedia dataset is going to be
> >> bloated from 0.95gb to 5gb just because it is writing the classes x-times.
> >>
> >> --
> >> This message is automatically generated by JIRA.
> >> If you think it was sent incorrectly, please contact your JIRA
> >> administrators:
> >> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> >> For more information on JIRA, see: http://www.atlassian.com/software/jira
> >>
> >>
>
>
> --
> Thomas Jungblut
> Berlin <[email protected]>
>

--
Best Regards, Edward J. Yoon
@eddieyoon

--
Thomas Jungblut
Berlin <[email protected]>
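[Editor's note: the "input key/value to vertex parser" proposed in the thread could look roughly like the sketch below. This is purely illustrative, not Hama's actual API: the class names `TextVertexParser` and `ParsedVertex`, and the tab-separated adjacency-list format, are assumptions. The idea is that the framework feeds the user raw key/value records from any InputFormat, and user code turns each record into a vertex plus its outgoing edges, so vertices need not be Writable themselves.]

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed key/value-to-vertex parser that would
// run while loading vertices. All names here are illustrative assumptions.
public class TextVertexParser {

    // Minimal holder for a parsed vertex: an id plus its adjacency list.
    static final class ParsedVertex {
        final String id;
        final List<String> edges;

        ParsedVertex(String id, List<String> edges) {
            this.id = id;
            this.edges = edges;
        }
    }

    // Parse one tab-separated adjacency-list line: "id\tneighbor1\tneighbor2...".
    // In the proposed design, this is the only piece the user would implement.
    static ParsedVertex parse(String line) {
        String[] tokens = line.split("\t");
        return new ParsedVertex(tokens[0],
                new ArrayList<>(Arrays.asList(tokens).subList(1, tokens.length)));
    }

    public static void main(String[] args) {
        // E.g. one row of a Wikipedia-style link set.
        ParsedVertex v = parse("Hama\tBSP\tApache");
        System.out.println(v.id + " -> " + v.edges);
    }
}
```

As a side note, the task counts in the thread follow directly from the HDFS chunk size: 10 GB of bloated input at 64 MB per chunk gives 10240 / 64 = 160 splits (one BSP task each), while the compact 1 GB representation gives 1024 / 64 = 16.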
