Hi Armando, I understand your concern about the input formats: I am writing an integration with Apache Gora and am facing the same problem. The root cause is that Giraph relies directly on Hadoop input formats, whereas Gora does not. I see two options. The first is to write an abstraction for input formats that is agnostic to how data is serialized, so that Giraph could read and write data from any data source without depending directly on Hadoop's input formats. The second is to extend the Hadoop input formats and let each extension live in its corresponding module. IMHO the first option would be the better choice for extensibility and modularity.
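To make the first option a bit more concrete, here is a rough sketch of what a serialization-agnostic input abstraction might look like. Everything in it is hypothetical: VertexRecord, VertexSource, TextLineSource and countEdges are names made up for illustration and do not exist in Giraph, Gora or Hadoop, and the tab-separated line layout is just an assumed example backend.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: none of these types exist in Giraph, Gora or Hadoop.
class DataSourceSketch {

    /** A vertex as the engine would see it, independent of the storage format. */
    static final class VertexRecord {
        final String id;
        final double value;
        final List<String> neighbors;

        VertexRecord(String id, double value, List<String> neighbors) {
            this.id = id;
            this.value = value;
            this.neighbors = neighbors;
        }
    }

    /** Reads vertices from some store; how they are serialized stays hidden. */
    interface VertexSource {
        Iterator<VertexRecord> read();
    }

    /** One assumed backend: lines of the form "id<TAB>value<TAB>n1,n2". */
    static final class TextLineSource implements VertexSource {
        private final List<String> lines;

        TextLineSource(List<String> lines) {
            this.lines = lines;
        }

        @Override
        public Iterator<VertexRecord> read() {
            List<VertexRecord> out = new ArrayList<>();
            for (String line : lines) {
                String[] parts = line.split("\t");
                List<String> neighbors = new ArrayList<>();
                // The neighbor column is optional; a vertex may have no edges.
                if (parts.length > 2 && !parts[2].isEmpty()) {
                    neighbors.addAll(Arrays.asList(parts[2].split(",")));
                }
                out.add(new VertexRecord(parts[0], Double.parseDouble(parts[1]), neighbors));
            }
            return out.iterator();
        }
    }

    /** Engine-side code depends only on VertexSource, never on a backend. */
    static int countEdges(VertexSource source) {
        int edges = 0;
        Iterator<VertexRecord> it = source.read();
        while (it.hasNext()) {
            edges += it.next().neighbors.size();
        }
        return edges;
    }

    public static void main(String[] args) {
        VertexSource source = new TextLineSource(
                Arrays.asList("a\t1.0\tb,c", "b\t2.0\ta", "c\t0.5"));
        System.out.println(countEdges(source)); // prints 3
    }
}
```

The point of the sketch is only that code like countEdges never mentions a concrete backend, so a Gora-backed or Rexster-backed source could in principle be plugged in behind the same interface. Input splits and distributed reads are left out entirely.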
Renato M.

Hi guys,

I am currently trying to implement a PoC for issue GIRAPH-549 (which, btw, is the main topic of my GSoC project). As Claudio suggested in the issue, I looked at the Faunus implementation to see how it connects to Rexster and gets the data, but at the moment I am overwhelmed by all the available classes.

My question is the following: Faunus's approach is to create an InputFormat that extends the Hadoop InputFormat class directly. However, I saw that some classes in Giraph extend Hadoop classes directly, while others extend VertexInputFormat (like TextVertexInputFormat). Which would be the best choice? I have started by extending VertexInputFormat, but your opinion would be much appreciated.

If you need any additional details, just let me know.

Cheers,
Armando
