Re: [Neo4j] New Neo4j SPARQL Plugin

Michael Hunger Thu, 13 Nov 2014 03:48:13 -0800

Interesting.
I always wondered if it was possible to transform RDF to a more compact
property graph model on import but still allow RDF queries / export on top
of that.
This would be more efficient both in space and performance but more
involved at the import stage.


E.g. all rdf triples that identify "properties" would be transformed into
real properties and relevant type/ontology information would be (also)
transformed into Labels.
Only the "real" semantic relationships that add value to the domain would
be kept as actual relationships, potentially augmented with properties too.
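
To make the idea concrete, here is a rough sketch in plain Python (no Neo4j or RDF library involved) of how such an import-time transformation might classify triples. Everything in it is an assumption for illustration: the `Literal` and `Node` helpers, the `compact` function, and the rule that any `rdf:type` triple becomes a label are hypothetical, not part of the plugin or of Neo4j.

```python
from dataclasses import dataclass, field

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Literal:
    """Marks an object value as an RDF literal (hypothetical helper)."""
    value: object


@dataclass
class Node:
    """A property-graph node built from one RDF resource (hypothetical helper)."""
    uri: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def compact(triples):
    """Fold (subject, predicate, object) triples into nodes and relationships."""
    nodes = {}
    relationships = []

    def node(uri):
        if uri not in nodes:
            nodes[uri] = Node(uri)
        return nodes[uri]

    for s, p, o in triples:
        subject = node(s)
        if isinstance(o, Literal):
            # Literal-valued triple: store it as a property on the subject.
            # Values are kept in a list because RDF allows repeated predicates.
            subject.properties.setdefault(p, []).append(o.value)
        elif p == RDF_TYPE:
            # Type triple: the class URI becomes a label, no extra node.
            subject.labels.add(o)
        else:
            # Resource-to-resource triple: keep it as a real relationship.
            node(o)
            relationships.append((s, p, o))
    return nodes, relationships


EX = "http://example.com/"
triples = [
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "alice", EX + "name", Literal("Alice")),
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", EX + "name", Literal("Bob")),
]
nodes, rels = compact(triples)
```

In this toy input, four triples collapse into two nodes and one relationship, whereas a mapping that stores every triple as an edge would keep four edges. Keeping the full predicate URIs as property keys and relationship types is one way to leave RDF export possible afterwards.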

One could also imagine a "graph optimization" pass applied to the RDF graph
that does the above but leaves the original RDF model in Neo4j and uses the
optimized version for more efficient querying.

Cool, then perhaps we can meet somewhere in Germany (Berlin, Frankfurt), or
you can come over to our Malmö office for a meetup to show it off?

Looking forward to your blog post. I'd love to see a complete roundtrip
covered, from import to queries and inference.

If you need anything from me, please ping me.

Cheers, Michael


On Thu, Nov 13, 2014 at 12:11 PM, Niclas Hoyer <[email protected]>
wrote:

>> Do you have any information about the model you use to store RDF
>> efficiently and any performance numbers? Esp. comparing it with Cypher?
>> That would be really interesting.
>>
> The plugin is based on the blueprints framework and the "SAIL
> Ouplementation
> <https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation>". An
> RDF triple is mapped as a directed edge. All information is stored in the
> properties. So e.g. a URI node in RDF <http://example.com> is mapped as
> ({kind: "uri", value: "http://example.com"}). The URI of an edge is
> represented as the type of the edge in Neo4j. The mapping does not use
> labels on nodes.
>
> I tested performance against the Fuseki
> <https://jena.apache.org/documentation/serving_data/> Graphstore with
> different sizes of datasets. Unfortunately the RDF mapping has its
> drawbacks, because Neo4j needs much more space than Fuseki. The largest
> dataset I tried was 17.9 GB in N-Triples format.
> Fuseki uses ~9 GB of disk space after import, but Neo4j allocated 390 GB.
> That also results in query execution being about 27 times slower on this
> large dataset. With the smallest dataset, at just 2 MB, Neo4j is only 2.4
> times slower than Fuseki. I used the "Berlin SPARQL Benchmark
> <http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/>"
> for testing.
>
>> Do you have any examples for RDF / Turtle import using the plugin?
>>
> Yes, on the GitHub
> <https://github.com/niclashoyer/neo4j-sparql-extension#sparql-graph-protocol>
> page there is an example for turtle import using curl. A PUT request to the
> graph resource will replace all data in the graph:
>
> $ curl -v -X PUT \
>        localhost:7474/rdf/graph \
>        -H "Content-Type:text/turtle" --data-binary @data.ttl
>
>
>> And if you had a blog post, we could help you promote the plugin and also
>> link it from our website.
>>
> Yes, I don't have a blog post yet. I'll come back to you as soon as I've
> got something.
>
>
>> Where are you located?
>>
>  Kiel, Germany.
>
> Regards,
> Niclas
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

