On 02/06/15 15:51, Andy Seaborne wrote:
Hi Michael,

A few facts please:

How many birds are there?

What do these two queries return?

SELECT (count(*) AS ?C1)
{ ?d1 <http://www.wikidata.org/entity/P171s> ?v }

SELECT (count(*) AS ?C2)
{ ?s <http://www.wikidata.org/entity/P171v>
<http://www.wikidata.org/entity/Q5113> }

[pressed <send> too quickly:]

One other experiment:

select count(*) where {
?x <http://www.wikidata.org/entity/P171v>+ <http://www.wikidata.org/entity/Q5113> .
   ?d1 <http://www.wikidata.org/entity/P171s> ?x
}

        Andy


It's probably a bad execution plan.  It's supposed to execute the path
backwards, starting from the fixed object, which, subject to the fan-out
of the reverse links, should be OK.
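
As a sketch of what "backwards" means here: the bird count can be written so
that the path starts at the fixed term and follows the links in reverse. This
is only an equivalent rewriting under SPARQL 1.1 path semantics, with the
aggregate in standard syntax; whether it changes the plan TDB actually picks
is not established in this thread. ?birds is an illustrative variable name.

```sparql
# Same pattern as "?d1 (P171s/P171v)+ Q5113", written from the object side.
# Inverting a sequence path reverses its steps: ^(a/b) is (^b/^a).
SELECT (count(*) AS ?birds)
WHERE {
  <http://www.wikidata.org/entity/Q5113>
      ( ^<http://www.wikidata.org/entity/P171v> /
        ^<http://www.wikidata.org/entity/P171s> )+
      ?d1
}
```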


On 02/06/15 14:58, Michael Brunnbauer wrote:

hi all,

I have performance problems with queries using property paths on a Fuseki
2.0.0 TDB with half a billion triples from Wikidata. Random disk access does
not seem to be the cause. I use an SSD and see low IO tps values during
queries but high CPU usage. I tried with and without the automatically
generated stats.opt.

Counting all birds takes ca. 8s if not called for the first time (no disk
access, everything in memory):

select count(*) where {
?d1 ( <http://www.wikidata.org/entity/P171s> /
<http://www.wikidata.org/entity/P171v> )+
<http://www.wikidata.org/entity/Q5113>
}

(That's not legal SPARQL :-) The joys of compatibility mode.
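
For reference, standard SPARQL 1.1 requires an aggregate in the SELECT clause
to be written as a parenthesised expression bound to a variable. A minimal
sketch of the bird count in that form follows; the variable name ?birds is an
illustrative choice and does not come from the original query.

```sparql
# Standard SPARQL 1.1 form: (expression AS ?variable) in the SELECT clause.
SELECT (count(*) AS ?birds)
WHERE {
  ?d1 ( <http://www.wikidata.org/entity/P171s> /
        <http://www.wikidata.org/entity/P171v> )+
      <http://www.wikidata.org/entity/Q5113>
}
```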


Counting all beetles does not seem to finish:

select count(*) where {
?d1 ( <http://www.wikidata.org/entity/P171s> /
<http://www.wikidata.org/entity/P171v> )+
<http://www.wikidata.org/entity/Q22671>
}

I tried with and without stats.opt and also with inverse paths (^property)
without success.

I guess this is not the "Counting Beyond a Yottabyte" problem?

http://www.w3.org/blog/SW/2012/04/19/no-more-counting-beyond-a-yottabyte-or-why-the-w3c-process-works/

https://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2012Apr/0003.html


No.  (The data isn't a clique unless the data is really bizarre.)

That is a theoretical piece of work on a different design.

(The fact it uses hyperbole and ridicule to make a technical point is
merely annoying.)


If I do a count(distinct ?d1) in the Bird query, I get the same number, so I
guess that the + makes the query "non-counting".

Yes.


Any idea if this slow performance is to be expected and why?

Regards,

Michael Brunnbauer


