For example, the TIM in-memory dataset impl uses 3 indexes on triples and 6 on 
quads to ensure that all one-variable queries (i.e. for triples ?s <p> <o>, <s> 
?p <o>, <s> <p> ?o) will be as direct as possible. The indexes are hashmaps 
(e.g. Map<Node, Map<Node, Set<Node>>>) and don't use the kind of node directory 
that TDB does. 

There are lots of other ways to play that out, according to the balance of 
times costs and storage costs desired and the expected types of queries.

Adam

> On Dec 23, 2017, at 2:56 AM, Lorenz Buehmann 
> <[email protected]> wrote:
> 
> 
> On 23.12.2017 00:47, Andrew U. Frank wrote:
>> are there some rules which queries are linear in the amount of data in
>> the graph? is it correct to assume that searching for a triples based
>> on a single condition (?p a X) is logarithmic in the size of the data
>> collection? 
> Why should it be logarithmic? The complexity of matching a single BGP
> depends on the implementation. I could search for matches by doing a
> scan on the whole dataset - that would for sure be not logarithmic but
> linear. Usually, if exists, a triple store would use the POS index in
> order to find bindings for variable ?p.
> 
> Cheers,
> Lorenz

Reply via email to