Let's stick to facts and public information.

There is nothing new here. "TopbraidDataset.listNames is not working for our platform".

The TopQuadrant codebase has a mechanism to adapt the ARQ engine. It is even documented in the codebase, and there are examples in the git history as well.

On 29/01/2020 22:31, Holger Knublauch wrote:
Hi Andy,

thanks for the fast response.

On 30/01/2020 01:47, Andy Seaborne wrote:
> I am still trying to narrow down what has changed

It may be JENA-1813 - being about optimization, the details of the real case probably matter. Let us know what you discover.

JENA-1813 fixed a specific problem caused by the BIND inside the GRAPH ("AS ?dummy" - not the outer one). If the inside is a different pattern, it seems to optimize properly, if I understand the details. Maybe some small rewrite will work in the real case.

A small rewrite is not an option for us because too many queries are affected, including those written by customers. It is a very common pattern to first compute the GRAPH with a BIND and then walk into that graph (only).

I tried this variation:

SELECT *
WHERE {
      BIND(<http://topbraid.org/teamwork#teamGraph>($this) AS ?teamGraph)
      GRAPH ?teamGraph {
          BIND (str(?teamGraph) AS ?dummy)
      }
}

and ?teamGraph is still unbound inside the GRAPH clause:

You don't need the GRAPH to call BIND (function(..) AS ?dummy) for a proper function.

The key point is that the (extend) is at the top of the (graph) evaluation (i.e. BIND is last) in GRAPH ... {}.

So put FILTER(true) in:

GRAPH ?teamGraph {
   BIND (42 AS ?dummy)
   FILTER(true)
}

but JENA-1815 may change that.

What will happen is that ARQ will get the right answers. Correct before performant.

Using the graph name inside GRAPH introduces a different situation - it's a different optimization issue from the first example query.


(join
   (extend ((?teamGraph (<http://topbraid.org/teamwork#teamGraph> ?this)))
     (table unit))
   (graph ?teamGraph
     (extend ((?dummy (str ?teamGraph)))
       (table unit))))

This doesn't look correct to me.

Looks correct to me.

Even if it were formally correct, it would be very inefficient and thus render SPARQL's GRAPH keyword useless.

For the audience - ARQ is not executing in a quad-mode; iterating graph names has side effects.

I am not sure what else to provide but I am afraid we cannot use this version of Jena until the optimization is (back) in place.

Contributions welcome.


Thanks,
Holger



There is a follow-up (open) JENA-1815 as well.

Question: was this change intentional and is this behavior going to stay in Jena? If it does stay, how can I switch it off?

The first algebra expression below is the translation of the query, and it is what executes now that some optimization doesn't get applied. It gives the right answers.

    Andy

On 29/01/2020 05:08, Holger Knublauch wrote:
Hi,

on a branch of our product we have upgraded to Jena 3.14 (due to the Thrift issue). Since then various SPARQL queries have stopped working. Of particular concern appears to be the evaluation of GRAPH ?graph clauses where ?graph is determined by a BIND. I am still trying to narrow down what has changed but suspect it's due to

https://issues.apache.org/jira/browse/JENA-1813

Here is an example query - a simplified version derived from a query in the product:

SELECT *
WHERE {
     BIND(<http://topbraid.org/teamwork#teamGraph>($this) AS ?teamGraph)
     GRAPH ?teamGraph {
         BIND (42 AS ?dummy)
     }
}

which is executed with $this prebound to a graph resource (e.g. <urn:x-evn-master:geo>). In 3.14 this seems to produce the following algebra:

(join
   (extend ((?teamGraph (<http://topbraid.org/teamwork#teamGraph> ?this)))
     (table unit))
   (graph ?teamGraph
     (extend ((?dummy 42))
       (table unit))))

while in 3.13.1 it produces

(sequence
   (extend ((?teamGraph (teamwork:teamGraph ?this)))
     (table unit))
   (graph ?teamGraph
     (extend ((?dummy 42))
       (table unit))))

As a result, the new version enters the GRAPH ?teamGraph with unbound ?teamGraph and thus iterates over all graphs which is not working for our platform:

If my observations so far are correct then this change to Jena would have quite deep consequences and break not just our own queries but also those written by customers. We will likely have to roll back to a previous Jena version for the upcoming release of our product.
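
As a rough illustration of the difference between the (join ...) and (sequence ...) forms above, here is a toy Python model. It is a sketch under stated assumptions, not Jena or ARQ code: the three graph names are hypothetical, the teamGraph function is assumed to have already returned urn:example:g2, and eval_graph, eval_join and eval_sequence are made-up helper names.

```python
# Toy model only: this is not how ARQ is implemented.
# DATASET maps hypothetical graph names to (empty) sets of triples.
DATASET = {"urn:example:g1": set(), "urn:example:g2": set(), "urn:example:g3": set()}

def eval_graph(binding, visited):
    """Model (graph ?teamGraph (extend ((?dummy 42)) (table unit)))."""
    if "teamGraph" in binding:
        # Graph name already known: visit at most that one graph.
        name = binding["teamGraph"]
        names = [name] if name in DATASET else []
    else:
        # Graph name unknown: every named graph has to be visited.
        names = list(DATASET)
    rows = []
    for name in names:
        visited.append(name)
        rows.append({**binding, "teamGraph": name, "dummy": 42})
    return rows

def eval_join(left, visited):
    """Evaluate the right side on its own, then keep compatible pairs."""
    right = eval_graph({}, visited)
    return [{**l, **r} for l in left for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())]

def eval_sequence(left, visited):
    """Feed each left-hand row into the right side."""
    return [row for l in left for row in eval_graph(l, visited)]

# Assumed result of the outer BIND: ?teamGraph = urn:example:g2.
left = [{"teamGraph": "urn:example:g2"}]
join_visited, seq_visited = [], []
join_rows = eval_join(left, join_visited)
seq_rows = eval_sequence(left, seq_visited)

print(join_rows == seq_rows)  # same answers either way
print(join_visited)           # graphs touched by the join form
print(seq_visited)            # graphs touched by the sequence form
```

In this model both forms return the same single row, but the join form visits all three graphs while the sequence form visits only the one that was bound.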

Question: was this change intentional and is this behavior going to stay in Jena? If it does stay, how can I switch it off?

Thanks
Holger

