Hi Eli,

Yes, GiraphFileInputFormat handles input splitting in all cases. Note that most of the logic is the same as in current Hadoop, and we extend Hadoop's FileInputFormat. I wish there were a way to avoid any code duplication, but doing so would mean messing with implementation-specific code that is mostly private.

Alessandro

On 2/8/13 2:58 PM, "Eli Reisman" <[email protected]> wrote:

>Hey (maybe @Alessandro, don't know...) I have been looking at the
>GiraphFileInputFormat. Am I crazy, or with the advent of edge or vertex
>based input files, do we now always generate our own input splits, from
>scratch, without Hadoop being involved? And if so, is this defaulted to
>"on" no matter what, or only when we have dual edge-vertex input
>information to process? If so, it's one less thing I will have to implement
>for the YARN implementation.
>
>Thanks, looking forward to hearing back,
>
>Eli
