> This is what was discovered before - the cost of scanning and filtering
> isn't that high and, while outlier cases may be measurably faster, the bulk
> of queries will be marginally faster. There are always a lot of things
> that can be done; it comes down to contributions and priorities.
>
> And the cost of the join? And of the CONSTRUCT? And, if Fuseki, the HTTP
> costs, which vary from trivial to a lot depending on result sizes and
> connection caching. The point is "it is complicated" and that means
> however good the point improvement is, it may not have a significant
> overall benefit.
>
> Investigation needed before jumping into implementation.

Well... I don't disagree that it is complicated. I now have a
relatively straightforward query that takes 1.7s even after a few
attempts:

select (count (distinct ?e) as ?count) where {
    {
        ?le adm:logDate ?sdate .
        FILTER(?sdate > "2020-08-20T00:00:00"^^xsd:dateTime)
        ?le a adm:Synced .
        ?va adm:logEntry ?le ;
            adm:adminAbout ?v .
        ?v bdo:volumeOf ?e .
    }
}

I'm reading the 1.7s from the Fuseki logs. Interestingly, if I use a
date in the future and get 0 results, the query still takes 1.7s,
for instance:

select ?le where {
    {
        ?le adm:logDate ?sdate .
        FILTER(?sdate > "2020-10-20T00:00:00"^^xsd:dateTime)
    }
}

So it's a bit hard for me to think the bottleneck could be
elsewhere... what other possible bottleneck should I look at?
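
One check that might help narrow it down (just a sketch, assuming the
same prefixes as the queries above): count the adm:logDate triples
with no FILTER at all. If this also takes around 1.7s, the time is
presumably spent scanning every logDate triple, not evaluating the
filter or in the HTTP layer:

select (count(*) as ?n) where {
    ?le adm:logDate ?sdate .
}

The value of ?n would also show how many triples the FILTER has to
test on each run.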

Note that this contradicts my previous findings, where a similar query
was faster (around 300ms) when the indexes were not cold. Oddly, I
can't reproduce that anymore: the 1.7s result has been consistent over
many queries in a short period of time, so the indexes were not cold.

I understand it's a lot of code to write and that it's a big project, sorry.

> ARQ does not use the Model API. It's an extension to the ARQ algebra,
> OpExecutor, and subclasses, and one or more optimization Transforms to
> detect the case in a query.
>
> Overall, this isn't an API issue - it's the cost of implementing that
> API vs not doing something elsewhere.

Ok yes

Best,
-- 
Elie
