Ed Kohlwey commented on GIRAPH-111:

I think there are a few ways to do this.

There are generalized parallel I/O libraries starting to appear, like HCatalog, 
so that's definitely one option. From what I can tell, though, HCatalog is 
primarily for tabular data, so it may not make the most sense given Giraph's 
current focus on representing data as ordinary Java objects.

We could also copy the relevant Hadoop I/O classes (InputFormat, 
OutputFormat, etc.) into Giraph, rename their packages, and begin reworking 
them to better suit Giraph.
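
As a rough illustration of this second option, here is a minimal sketch of 
what Hadoop-free counterparts to InputSplit, RecordReader and InputFormat 
might look like. Every name below (GraphInputSplit, GraphRecordReader, 
GraphInputFormat, ListSplit, InMemoryInputFormat) is hypothetical and made up 
for this example; none of them are existing Giraph or Hadoop classes, and the 
in-memory implementation is only a toy that makes the sketch runnable.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, Hadoop-independent I/O abstractions (assumed names, not a real API).

/** A chunk of input that a single worker can read on its own. */
interface GraphInputSplit {
    long length();
}

/** Iterates over the records of one split; closing releases any resources. */
interface GraphRecordReader<R> extends Iterator<R>, AutoCloseable {
    @Override
    void close();
}

/** Knows how to divide an input into splits and how to read each split. */
interface GraphInputFormat<S extends GraphInputSplit, R> {
    List<S> getSplits(int numWorkers);

    GraphRecordReader<R> createReader(S split);
}

/** Toy split: a half-open index range [start, end) over an in-memory list. */
final class ListSplit implements GraphInputSplit {
    final int start;
    final int end;

    ListSplit(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public long length() {
        return end - start;
    }
}

/** Toy format backed by a list, with no dependency on Map/Reduce types. */
final class InMemoryInputFormat<R> implements GraphInputFormat<ListSplit, R> {
    private final List<R> records;

    InMemoryInputFormat(List<R> records) {
        this.records = records;
    }

    @Override
    public List<ListSplit> getSplits(int numWorkers) {
        if (numWorkers < 1) {
            throw new IllegalArgumentException("numWorkers must be >= 1");
        }
        List<ListSplit> splits = new ArrayList<>();
        int n = records.size();
        int chunk = (n + numWorkers - 1) / numWorkers; // ceiling division
        for (int start = 0; start < n; start += chunk) {
            splits.add(new ListSplit(start, Math.min(start + chunk, n)));
        }
        return splits;
    }

    @Override
    public GraphRecordReader<R> createReader(ListSplit split) {
        final Iterator<R> it = records.subList(split.start, split.end).iterator();
        return new GraphRecordReader<R>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public R next() {
                return it.next();
            }

            @Override
            public void close() {
                // Nothing to release for an in-memory list.
            }
        };
    }
}
```

The shape deliberately mirrors Hadoop's InputFormat (get the splits, then 
create a reader per split) but takes a plain worker count instead of a job or 
task context, which is one possible way to drop the Map/Reduce dependency.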

Finally, we could just start designing an I/O package from scratch. I think 
this is the least incremental and least pragmatic approach, so it's probably 
not a fantastic option.

> Refactor I/O to be independent of Map/Reduce
> --------------------------------------------
>                 Key: GIRAPH-111
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-111
>             Project: Giraph
>          Issue Type: Improvement
>          Components: graph
>            Reporter: Ed Kohlwey
> The I/O mechanisms should probably be abstracted entirely from Map/Reduce in 
> order to support making Giraph an independent framework.
