My instinct is that you want to start from one of Giraph's higher-level abstractions, such as VertexInputFormat, instead of a raw Hadoop InputFormat.
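As a rough sketch of what that could look like: the class names RexsterVertexInputFormat and RexsterVertexReader below are hypothetical, as are the chosen I/V/E Writable types; the overridden methods are the ones Giraph's VertexInputFormat declares (check the version you're building against, since the exact signatures may differ).

```java
import java.io.IOException;
import java.util.List;
import org.apache.giraph.io.VertexInputFormat;
import org.apache.giraph.io.VertexReader;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Hypothetical skeleton of a Rexster-backed input format that extends
// Giraph's VertexInputFormat rather than a Hadoop InputFormat.
public class RexsterVertexInputFormat
    extends VertexInputFormat<LongWritable, DoubleWritable, FloatWritable> {

  @Override
  public List<InputSplit> getSplits(JobContext context, int minSplitCountHint)
      throws IOException, InterruptedException {
    // Partition the graph source into splits however makes sense for the
    // backend (e.g. vertex-id ranges), without touching Hadoop InputFormats.
    throw new UnsupportedOperationException("TODO: compute splits");
  }

  @Override
  public VertexReader<LongWritable, DoubleWritable, FloatWritable>
      createVertexReader(InputSplit split, TaskAttemptContext context)
      throws IOException {
    // The reader (elided here) would fetch vertices for its split, e.g. over
    // Rexster's REST API, and hand them to Giraph one at a time.
    return new RexsterVertexReader();
  }
}
```

The point being: Giraph drives input through VertexInputFormat/VertexReader, so extending those keeps you inside Giraph's abstraction instead of reimplementing Hadoop's split machinery the way Faunus does.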
On Thu, Jul 11, 2013 at 4:47 PM, Renato Marroquín Mogrovejo <[email protected]> wrote:

> Hi Armando,
>
> I really understand what you're saying about the input formats, because I am
> also writing an integration with Apache Gora and I am facing the same
> problems. This is because Gora does not rely directly on Hadoop input
> formats, but Giraph does.
> I think an alternative would be to write an abstraction for input formats
> which would have to be agnostic to how data is serialized. In this way,
> Giraph could read and write data from any data source without directly
> depending on Hadoop's input formats.
> On the other hand, we could extend Hadoop input formats and let them live in
> their corresponding modules. IMHO the former option would be the better
> choice for extensibility and modularity.
>
> Renato M.
>
> Hi guys,
>
> I am currently trying to implement a PoC for the issue GIRAPH-549 (which,
> btw, is the main topic of my GSoC project).
>
> As suggested in the issue by Claudio, I looked at the Faunus
> implementation to connect to Rexster and get the data, but at the moment
> I am overwhelmed by all the available classes.
>
> My question is the following: Faunus' approach is to create an
> InputFormat extending directly from the Hadoop InputFormat class. I saw,
> however, that some classes in Giraph extend directly from Hadoop
> classes, while others extend from VertexInputFormat (like
> TextVertexInputFormat). So what would be the best choice I could make? I
> started extending VertexInputFormat, but an opinion from you would be
> very much appreciated.
>
> If you need any additional details, just let me know.
>
> Cheers,
> Armando
