Re: [Neo4j] Loading RDF data questions

Peter Neubauer Thu, 06 Oct 2011 03:17:56 -0700

Bruno,

RDF support is already provided on top of Neo4j via Josh Shinavier's
SAIL implementation.


Look at the SPARQL-plugin-in-the-making,
https://github.com/peterneubauer/sparql-plugin/blob/master/src/test/java/org/neo4j/server/plugin/sparql/BerlinDatasetTest.java
for how to load a file into Neo4j as an RDF store and how to query
it. The test uses a subset of the Berlin RDF dataset and queries; see,
for instance,
http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/ExploreUseCase/index.html

Does that help? I hope to get this into shape very soon, so you can
use the Neo4j Server with the SPARQL plugin in order to load and query
RDF and essentially turn the Neo4j Server into a Triple Store.

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Thu, Oct 6, 2011 at 7:50 AM, Bruno Paiva Lima da Silva
<[email protected]> wrote:
> Hello,
>
> I'm writing to ask whether I am using Neo4J correctly for loading and
> storing RDF datasets.
> So far my performance results have been quite bad. However, it seems
> to me that I haven't properly understood how to use the BatchInserter
> for what I want to do.
>
> So, I have RDF datasets that can go from 1K to 20M triples, and I want
> to store them into an empty Neo4J graph.
>
> The method I use for the insertion is the following:
>
> - For each triple of my RDF data:
> -- Check if there is a subject node in the graph. If yes, find it, if
> not, create it.
> -- Check if there is an object node in the graph. If yes, find it, if
> not, create it.
> -- Create an edge with a label "predicate" between subject and object.
>
> This method is quite simple and generic, but it also has a big
> problem:
> It spends more time reading and searching than inserting.
>
> Profiling shows that it spends almost 90% of its time checking
> whether a given node exists.
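
One common way to cut that lookup cost is to keep an in-memory map from
URI to node id in front of the index, so that a repeated subject or
object costs one hash lookup. The following is only a minimal sketch of
the per-triple steps above under that assumption. GraphWriter,
InMemoryWriter and TripleLoaderSketch are hypothetical names made up for
illustration; they are not part of the Neo4j API, and a real loader
would back GraphWriter with its own insertion code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TripleLoaderSketch {

    /** Hypothetical stand-in for whatever creates nodes and edges. */
    interface GraphWriter {
        long createNode(String uri);
        void createEdge(long from, long to, String predicate);
    }

    /** Trivial in-memory writer, only so the sketch runs without Neo4j. */
    static class InMemoryWriter implements GraphWriter {
        final List<String> nodes = new ArrayList<>();
        final List<String> edges = new ArrayList<>();

        public long createNode(String uri) {
            nodes.add(uri);
            return nodes.size() - 1; // node id = position in the list
        }

        public void createEdge(long from, long to, String predicate) {
            edges.add(from + " -[" + predicate + "]-> " + to);
        }
    }

    private final GraphWriter writer;
    // URI -> node id; answers "does this node exist?" without an index query
    private final Map<String, Long> seen = new HashMap<>();

    TripleLoaderSketch(GraphWriter writer) {
        this.writer = writer;
    }

    /** Returns the id of the node for this URI, creating it on first sight. */
    private long getOrCreate(String uri) {
        Long id = seen.get(uri);
        if (id == null) {
            id = writer.createNode(uri);
            seen.put(uri, id);
        }
        return id;
    }

    /** One triple = two get-or-create calls plus one edge. */
    void addTriple(String subject, String predicate, String object) {
        long from = getOrCreate(subject);
        long to = getOrCreate(object);
        writer.createEdge(from, to, predicate);
    }

    public static void main(String[] args) {
        InMemoryWriter w = new InMemoryWriter();
        TripleLoaderSketch loader = new TripleLoaderSketch(w);
        loader.addTriple("ex:alice", "ex:knows", "ex:bob");
        loader.addTriple("ex:bob", "ex:knows", "ex:alice");
        // prints "2 nodes, 2 edges"
        System.out.println(w.nodes.size() + " nodes, " + w.edges.size() + " edges");
    }
}
```

The trade-off is memory: the map holds every distinct URI for the whole
load, which may or may not be acceptable at the 20M-triple end of the
range.
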
>
> So far, I first tried Neo4J with simple transactions, then switched
> to BatchInserter + LuceneIndex, but I still think there is room to
> improve my program.
>
> That said, my questions are:
> - Can anyone tell me, knowing how Neo4J works, how to improve my
> insertion process or tell me if there is a better solution?
> - Are there any big errors in my code? It's not yet very well
> documented, but it is available here:
> https://bitbucket.org/bplsilva/alaska-project/src/e7fdf2e9341b/src/fr/lirmm/graphik/alaska/impl/graph/neo4j/Neo4jFact.java
>
> Thank you very much,
>
> --
> *PAIVA LIMA DA SILVA Bruno*
> PhD Student in Informatics @ Univ. Montpellier 2
> [ GraphIK Research Team: LIRMM, Montpellier (France) ]
> Website: http://bplsilva.com
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
>
