Re: [Neo4j] New Neo4j SPARQL Plugin

Bo Ferri Fri, 21 Nov 2014 00:35:13 -0800

Hi all,

@Niclas: thanks a lot for your efforts in providing an up-to-date SPARQL 
extension for Neo4j.


On Thursday, November 20, 2014 5:33:47 PM UTC+1, Michael Hunger wrote:
>
> That's what I meant with the misfit of modeling RDF data 1:1 into the 
> property graph instead of having a "sensible" mapping of only real entities 
> to nodes, real semantic tuples to relationships and everything else to 
> properties.
>
>
I have also often thought about this approach to mapping RDF onto the 
property graph model. However, I think it wouldn't really scale, because 
you would usually double the cost of a query (afaik): for a given 
attribute you have to look at both the node's properties and its 
relationships, since you cannot assume in advance whether its values are 
literals or resources (and you would need to know that to request only one 
of the two). Furthermore, (afaik) node properties are not designed to store 
lists of values, i.e., storing multiple literal values for one attribute in 
a single node property requires further processing steps.
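
To make the double lookup concrete, here is a minimal sketch in plain 
Python. It is only an in-memory stand-in for a property graph, not the 
Neo4j API; the Node class, the values_for helper and the ex: names are 
hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a property graph node (not the Neo4j API)."""
    uri: str
    properties: dict = field(default_factory=dict)     # literal values
    relationships: list = field(default_factory=list)  # (predicate, target Node)


def values_for(node, predicate):
    """Return every value of `predicate`, wherever the mapping put it."""
    found = []
    # Lookup 1: literal values stored as a node property.
    stored = node.properties.get(predicate)
    if stored is not None:
        # A multi-valued attribute needs an extra list convention.
        found.extend(stored if isinstance(stored, list) else [stored])
    # Lookup 2: resource values stored as relationships.
    found.extend(target.uri for pred, target in node.relationships
                 if pred == predicate)
    return found


alice = Node("ex:alice")
bob = Node("ex:bob")
alice.properties["ex:knows"] = "Carol (no URI yet)"  # literal value
alice.relationships.append(("ex:knows", bob))        # resource value
result = values_for(alice, "ex:knows")
print(result)
```

Both places have to be checked for every attribute unless the caller 
already knows which kind of value to expect.
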
So I would tend to say that it's best to store everything as a 
(node)-[edge]->(node) relationship, which aligns perfectly with the basic 
structure of an RDF statement (a simple subject-predicate-object sentence). 
Then it doesn't matter whether you are querying for statements with literal 
values or statements with resource values. Moreover, you can deal (better) 
with metadata (external context, ...) about statements, since you can also 
add properties to the edges (relationships). For example, you can do 
statement-based versioning or clustering/partitioning (e.g. à la Named 
Graphs), introduce qualified attributes for ordering, or simply add a 
unique identifier for the statement itself. So you can design a (graph) 
data model with more comprehensive capabilities than RDF ;) (because of the 
flexibility of the property graph model). Finally, you can create indices 
as necessary, e.g., for resources (nodes), for statements (relationships) 
or for literals, to speed up the queries.
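
As a rough illustration of the statement-as-relationship idea, here is a 
small in-memory sketch in plain Python. It is not the d:swarm 
implementation and not the Neo4j API; the class names, the property keys 
(uuid, graph, order) and the ex: identifiers are assumptions made up for 
this example.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    kind: str   # "resource" or "literal"
    value: str  # URI or literal value


@dataclass
class Statement:
    """One RDF statement stored as a relationship with its own properties."""
    subject: Node
    predicate: str
    obj: Node
    properties: dict = field(default_factory=dict)


class StatementGraph:
    """Hypothetical store: every statement is (node)-[edge]->(node)."""

    def __init__(self):
        self.statements = []
        # Simple index from (subject, predicate) to statements.
        self.by_subject_predicate = defaultdict(list)

    def add(self, subject, predicate, obj, **props):
        # Each statement gets a unique identifier unless one is supplied.
        props.setdefault("uuid", str(uuid.uuid4()))
        stmt = Statement(subject, predicate, obj, props)
        self.statements.append(stmt)
        self.by_subject_predicate[(subject, predicate)].append(stmt)
        return stmt

    def objects(self, subject, predicate):
        # One lookup, regardless of whether the objects are literals or resources.
        found = self.by_subject_predicate.get((subject, predicate), [])
        return [stmt.obj for stmt in found]


g = StatementGraph()
alice = Node("resource", "ex:alice")
g.add(alice, "ex:knows", Node("resource", "ex:bob"), graph="ex:g1", order=1)
g.add(alice, "ex:knows", Node("literal", "Carol"), graph="ex:g2", order=2)

values = [(n.kind, n.value) for n in g.objects(alice, "ex:knows")]
# Statement metadata lives on the edge, e.g. a named-graph-like partition.
in_g1 = [s.obj.value for s in g.statements if s.properties.get("graph") == "ex:g1"]
print(values, in_g1)
```

Literal and resource values come back from the same lookup, and the edge 
properties carry the per-statement metadata.
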

Last but not least, we implemented this approach (prototypically (?)) as a 
Neo4j unmanaged extension that can be found at

https://github.com/dswarm/dswarm-graph-neo4j

More details about the design of the graph data model can be found at

https://github.com/dswarm/dswarm-documentation/wiki/Graph-Data-Model 
and https://github.com/dswarm/dswarm-documentation/wiki/Graph-Exploration

We are happy to receive any kind of feedback and are looking forward to an 
interesting discussion about RDF-based graph data models mapped onto the 
property graph data model ;)

Cheers,


Bo


PS: we also experimented with batch import, 
see 
https://github.com/dswarm/dswarm-graph-neo4j/tree/master/src/main/java/org/dswarm/graph/batch

 

> It would be stellar to resolve that in a good way with a sensible default 
> mapping that might be augmented.
> Wes and I discussed that when importing Freebase Data into Neo4j
>
> Michael
>
> On Thu, Nov 20, 2014 at 4:13 PM, Andrii Stesin <[email protected]> wrote:
>
>> On Thursday, November 13, 2014 1:11:47 PM UTC+2, Niclas Hoyer wrote:
>>>
>>> Fuseki uses ~ 9 GB disk space after import, but Neo4j allocated 390 GB. 
>>> That also results in about 27 times slower query execution on this large 
>>> dataset.
>>>
>>  
>> I suspect some data modelling issue here... the difference is way bigger 
>> than one can expect. Factor of 10 won't make me wonder too much, but 40+ ?? 
>> why and how?
>>
>>> Using the smallest dataset with just 2 MB Neo4j is just 2.4 times slower 
>>> than Fuseki.
>>>
>>
>> This also makes me wonder, does Neo4j introduce so big an overhead 
>> compared to Fuseki? (small example should completely fit in memory, doesn't 
>> it?)
>>
>> WBR,
>> Andrii
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
