Re: A new InputFormat. What to extend?

Renato Marroquín Mogrovejo Thu, 11 Jul 2013 16:48:35 -0700

Hi Armando,

I understand what you're saying about the input formats, because I am
also writing an integration with Apache Gora and I am facing the same
problems. The cause is that Giraph relies directly on Hadoop input
formats, whereas Gora does not.
I think one alternative would be to write an abstraction for input formats
that is agnostic to how data is serialized. That way,
Giraph could read and write data from any data source without depending
directly on Hadoop's input formats.
The other option would be to extend the Hadoop input formats and let them
live in their corresponding modules. IMHO the former option would be the
better choice for extensibility and modularity.
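
To make the first option a bit more concrete, here is a rough sketch of what
such an abstraction could look like. Every name in it (VertexRecord,
VertexSource, InMemoryVertexSource, TextLineVertexSource, GraphLoader) is
hypothetical and made up for illustration; none of them exist in Giraph, Gora
or Hadoop, and the in-memory map only stands in for a real store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical record: one vertex and its outgoing edges, free of any serialization type. */
final class VertexRecord {
    final String id;
    final List<String> targetIds;

    VertexRecord(String id, List<String> targetIds) {
        this.id = id;
        this.targetIds = targetIds;
    }
}

/** Hypothetical abstraction: anything that can hand vertices to the loader. */
interface VertexSource {
    Iterator<VertexRecord> vertices();
}

/** Stand-in for a store-backed source; here the "store" is just a map. */
final class InMemoryVertexSource implements VertexSource {
    private final Map<String, List<String>> adjacency;

    InMemoryVertexSource(Map<String, List<String>> adjacency) {
        this.adjacency = adjacency;
    }

    @Override
    public Iterator<VertexRecord> vertices() {
        List<VertexRecord> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : adjacency.entrySet()) {
            out.add(new VertexRecord(e.getKey(), e.getValue()));
        }
        return out.iterator();
    }
}

/** Stand-in for a text-based source; each line is "id<TAB>target1,target2". */
final class TextLineVertexSource implements VertexSource {
    private final List<String> lines;

    TextLineVertexSource(List<String> lines) {
        this.lines = lines;
    }

    @Override
    public Iterator<VertexRecord> vertices() {
        List<VertexRecord> out = new ArrayList<>();
        for (String line : lines) {
            // Limit -1 keeps a trailing empty field for vertices without edges.
            String[] parts = line.split("\t", -1);
            List<String> targets = new ArrayList<>();
            if (parts.length > 1 && !parts[1].isEmpty()) {
                targets.addAll(Arrays.asList(parts[1].split(",")));
            }
            out.add(new VertexRecord(parts[0], targets));
        }
        return out.iterator();
    }
}

/** The consumer sees only VertexSource, never the storage or serialization format. */
final class GraphLoader {
    static int countEdges(VertexSource source) {
        int edges = 0;
        Iterator<VertexRecord> it = source.vertices();
        while (it.hasNext()) {
            edges += it.next().targetIds.size();
        }
        return edges;
    }
}
```

The point of the sketch is only that GraphLoader never learns where the
vertices come from; a real version would also have to deal with input splits
and with writing data back, which are left out here.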


Renato M.

Hi guys.

I am currently trying to implement a PoC for the issue GIRAPH-549 (which
btw is the main topic of my GSoC project).

As Claudio suggested in the issue, I looked at the Faunus
implementation to see how it connects to Rexster and gets the data, but at
the moment I am overwhelmed by all the available classes.

My question is the following: Faunus's approach is to create an
InputFormat that extends the Hadoop InputFormat class directly. However, I
saw that some classes in Giraph extend Hadoop classes directly
while others extend VertexInputFormat (like
TextVertexInputFormat). So which would be the best choice? I
started by extending VertexInputFormat, but your opinion would be
much appreciated.

If you need any additional details just let me know.

Cheers,
Armando
