On 21/09/2022 08:57, Simon Bin wrote:
> Hi,
>
> we have a data set with 500 million triples. Single named graph, fuseki
> tdb2.
>
> We observe different performance between
>
> select (count(*) as ?cnt) {
> ?s ?p ?o
> }
>
> ~4 minutes
>
> select (count(*) as ?cnt) {
> graph <https://data.coypu.org/> { ?s ?p ?o }
> }
>
> ~3 minutes
That difference is caching.
>
> select (count(*) as ?cnt)
> from <https://data.coypu.org/> {
> ?s ?p ?o
> }
>
> takes forever and longer...?
>
> Especially the last case is surprising, any thoughts?
If you use FROM, a dataset is created for the request. It has one graph
that points into the TDB database, but the dataset itself isn't TDB.
Every ?s ?p ?o is read in full because access through it makes triples.
If it's direct to TDB, and that includes GRAPH, the count counts rows
but does not actually fetch the values of ?s ?p ?o.
BindingTDB/BindingNodeId do lazy evaluation of variable values.
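As a rough illustration of why that matters, here is a toy Python model.
It is not Jena code: NodeTable, LazyBinding, count_rows and count_triples
are made-up names, and the only assumption is that turning an internal
node id into an RDF term costs a lookup.

```python
# Toy model only: these classes are hypothetical and are not Jena's API.

class NodeTable:
    """Maps internal node ids to RDF terms and counts how often it is consulted."""

    def __init__(self, terms):
        self._terms = terms
        self.lookups = 0

    def term(self, node_id):
        self.lookups += 1
        return self._terms[node_id]


class LazyBinding:
    """Holds node ids; a term is only looked up when a variable is read."""

    def __init__(self, ids, table):
        self._ids = ids          # e.g. {"s": 0, "p": 1, "o": 2}
        self._table = table

    def get(self, var):
        return self._table.term(self._ids[var])


def count_rows(rows, table):
    # GRAPH-style path: count bindings without reading any variable.
    bindings = (LazyBinding(dict(zip("spo", row)), table) for row in rows)
    return sum(1 for _ in bindings)


def count_triples(rows, table):
    # FROM-style path: every row becomes a full triple before it is counted.
    triples = (tuple(table.term(node_id) for node_id in row) for row in rows)
    return sum(1 for _ in triples)


TERMS = ["ex:a", "ex:knows", "ex:b", "ex:c"]
ROWS = [(0, 1, 2), (0, 1, 3), (2, 1, 3)]

lazy_table = NodeTable(TERMS)
lazy_count = count_rows(ROWS, lazy_table)

eager_table = NodeTable(TERMS)
eager_count = count_triples(ROWS, eager_table)

print(lazy_count, lazy_table.lookups)    # 3 0
print(eager_count, eager_table.lookups)  # 3 9
```

Both paths give the same count. Only the second pays for a term lookup
per variable per row, and with 500 million triples that is where the
time goes.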
There can be multiple FROM - the machinery is general and has to cope with that.
FROM != GRAPH.
Andy