On 12 Mar 2012, at 20:04, Avery Ching wrote:

> My suggestion would be the following:
>
> Run a MR job to join all your RDFs on the vertex key and convert them to an
> easy format to parse with a custom VertexInputFormat of your choice. If these
> are one-way relationships, you need not create the target vertex. If they are
> undirected relationships, when you are processing your RDFs in the MR job,
> add a directed relationship to both vertices.
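For context, the join step Avery describes could be sketched roughly like this, outside of an actual Hadoop job (class and method names here are hypothetical, purely for illustration): group the RDF triples by source vertex to build an adjacency list, and mirror each edge when the relationship is undirected so both endpoints carry it.

```java
import java.util.*;

// Hypothetical sketch of the join/grouping step: triples are
// (subject, predicate, object) arrays; we group objects under
// their subject, and for undirected relations we also add the
// reverse edge so both vertices see the relationship.
public class RdfJoin {
    public static Map<String, List<String>> buildAdjacency(
            List<String[]> triples, boolean undirected) {
        Map<String, List<String>> adj = new HashMap<>();
        for (String[] t : triples) {
            String src = t[0], dst = t[2];  // t[1] is the predicate, ignored here
            adj.computeIfAbsent(src, k -> new ArrayList<>()).add(dst);
            if (undirected) {
                // mirror the edge: add a directed relationship to both vertices
                adj.computeIfAbsent(dst, k -> new ArrayList<>()).add(src);
            }
        }
        return adj;
    }
}
```

In a real MR job the subject would be the map output key, with the grouping happening in the reducer; the resulting adjacency lines are then what a custom VertexInputFormat would parse.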
Avery, thanks for the feedback. I was not thinking about using Map-Reduce in that way, but I guess that's a very good idea. However, besides the amount of pre-processing required for using Giraph/Hadoop, the transient nature of the Giraph graph is also an issue. The scenario I am thinking of is that for each run of my algorithm, just 1% or less of the data changes. So 99% stays the same every time, yet it needs to be loaded again for each run. That won't be a problem if the computation of the algorithm itself takes a lot longer than loading the graph data. However, that might not always be the case. So right now I am trying to get a feeling for that trade-off, and for the different alternatives to solving the main research problem ;) Thanks again for the reply, cheers, Benjamin.