On 01/04/12 09:01, Claude Warren wrote:
On 30/03/12 16:52, Claude Warren wrote:
I have a case where I am using multiple federated calls where each call
is of the form
SERVICE SILENT <uri> {
--snip--
}
One of the endpoints is returning bad data: the XML does not parse, so
the XML parser throws an exception and my entire query dies.
Now the best answer would be to get the data corrected, but I don't own
that data and have no idea if they will fix it.
What I want to know is: shouldn't the SILENT keyword on the SERVICE
call indicate that if the remote fails it should be ignored?
From http://www.w3.org/2009/sparql/docs/fed/service#serviceFailure it
appears that a single solution with no bindings should be returned. If
this is a correct interpretation I am willing to report a bug and
implement a bug fix. The issue that I see is that the error is not
detected until hasNext() is called on the iterator. This means that the
service could have returned some data before the error was detected. I
would propose that the solution be to have the iterator return "false"
at that point and move forward with the partial data that was already
returned.
Does anyone have a different interpretation of the specification or see
an issue with the possible solution?
Many thanks,
Claude
Hi Claude,
Hmm - tricky :-)
The key sentence is:
[[
The SILENT keyword indicates that errors encountered while accessing a
remote SPARQL endpoint should be ignored while processing the query.
]]
but HTTP has a bit of an issue here.
Suppose the request is made and "200 OK" is received. That's a contract
that the results are going to be sent and be valid. Bad syntax of
results isn't considered nor are breaks in communications.
The only way to address this is for the service operations (class
QueryIterService) to consume and buffer all the results. I've just
added this in QueryIterService.
One effect of this is that you will not get the valid results that
arrived before the error; returning those, as you propose, is quite
sensible.
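
A minimal sketch of the buffering behaviour described above, written
against plain java.util.Iterator rather than ARQ's QueryIterator so it
stands alone. The class name BufferedSilentResults, the method bufferAll
and the "fallback" parameter are hypothetical and are not the code that
was added to QueryIterService. It assumes a parse or transport failure
surfaces as a RuntimeException from hasNext() or next(), and that the
caller supplies whatever should stand in for the failed call (for
SILENT, a single solution with no bindings).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of ARQ: illustrates "consume and buffer
// all the results" before handing anything to the rest of the query.
final class BufferedSilentResults {
    private BufferedSilentResults() {}

    // Reads the remote iterator to the end before returning anything.
    // If reading fails and silent is true, the partially read rows are
    // dropped and the caller-supplied fallback rows are returned instead.
    // If silent is false, the failure is rethrown unchanged.
    static <T> Iterator<T> bufferAll(Iterator<T> remote, boolean silent, List<T> fallback) {
        List<T> buffer = new ArrayList<>();
        try {
            while (remote.hasNext()) {
                buffer.add(remote.next());
            }
        } catch (RuntimeException ex) {
            if (!silent) {
                throw ex;
            }
            // Assumption: under SILENT the whole call counts as failed,
            // so rows read before the error are not passed on.
            return fallback.iterator();
        }
        return buffer.iterator();
    }
}
```

The trade-off is the one noted above: nothing is passed on until the
whole response has been read, and rows that arrived before the error
are lost.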
There needs to be a QueryIterator implementation that reads another
QueryIterator until some error occurs and signals end at that point.
That would be worthwhile - please do contribute such a thing.
There's a QueryIteratorWrapper that can be used to intercept
.hasNext/.next calls so you can add try-catch.
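
The idea of reading another iterator until an error occurs and then
signalling end could be sketched roughly as follows. This uses plain
java.util.Iterator, not ARQ's QueryIterator or QueryIteratorWrapper, and
the class name StopOnErrorIterator is made up for illustration. It
assumes the underlying failure shows up as a RuntimeException.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical wrapper, not part of ARQ: passes rows through until the
// wrapped iterator throws, then reports end of results.
final class StopOnErrorIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private T nextItem;                // row fetched ahead by hasNext()
    private boolean hasItem = false;   // true when nextItem holds a row
    private boolean finished = false;  // true after normal end or error

    StopOnErrorIterator(Iterator<T> inner) {
        this.inner = inner;
    }

    @Override
    public boolean hasNext() {
        if (hasItem) {
            return true;
        }
        if (finished) {
            return false;
        }
        try {
            // Fetch ahead so an error in either call is caught here.
            if (inner.hasNext()) {
                nextItem = inner.next();
                hasItem = true;
                return true;
            }
        } catch (RuntimeException ex) {
            // Assumption: any failure is treated as end of results;
            // rows already handed out stay valid.
        }
        finished = true;
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        hasItem = false;
        T item = nextItem;
        nextItem = null;
        return item;
    }
}
```

Fetching one row ahead inside hasNext() means next() never throws the
underlying error, so callers only need to loop on hasNext().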
Do you know which SPARQL implementation is generating bad results?
Andy
Andy,
I am not certain which SPARQL endpoint is generating the bad results
-- though I do intend to find out.
I will look at implementing a QueryIterator as you noted above. I
then need to plug it into the Fuseki query engine chain.
The place to plug it in is in QueryIterService in ARQ (Fuseki is the
protocol engine; ARQ ships with Fuseki).
Since I am querying multiple SPARQL endpoints and performing unions on
their results, and since some results take quite a long time to arrive,
I was considering implementing a union operator for service calls that
would effectively poll each service endpoint in turn, looking for the
next one that has a query result available: a polling query iterator,
if you will. The hope is to parallelize the queries as much as
possible.
There was a discussion of this recently on this list.
ARQ is rather prone to serial execution (parallelism in Fuseki is used
to execute multiple concurrent requests, not to give all system
resources for one query). There's nothing fundamental about ARQ's
serial execution of UNIONs - a different implementation of execution or
a different operator could make parallel SERVICE calls.
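
As a rough illustration of what parallel SERVICE calls under a UNION
might look like, here is a sketch using only java.util.concurrent. It is
not ARQ code: ParallelUnion and unionAll are hypothetical names, each
remote call is modelled as a Callable returning its complete row list,
and a failed call is simply skipped, as if every call were SILENT.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not part of ARQ: runs all service calls at once
// and merges their rows in the order the calls finish.
final class ParallelUnion {
    private ParallelUnion() {}

    static <T> List<T> unionAll(List<Callable<List<T>>> services) {
        List<T> merged = new ArrayList<>();
        if (services.isEmpty()) {
            return merged;
        }
        // One thread per call; a real engine would want a bounded pool.
        ExecutorService pool = Executors.newFixedThreadPool(services.size());
        try {
            CompletionService<List<T>> done = new ExecutorCompletionService<>(pool);
            for (Callable<List<T>> service : services) {
                done.submit(service);
            }
            for (int i = 0; i < services.size(); i++) {
                try {
                    // take() blocks until whichever call finishes next.
                    merged.addAll(done.take().get());
                } catch (ExecutionException ex) {
                    // Assumption: treat a failed call as SILENT and skip it.
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } finally {
            pool.shutdown();
        }
        return merged;
    }
}
```

Rows come back grouped per endpoint in completion order rather than
interleaved row by row; a streaming version would need a shared queue
that each call writes to as rows arrive.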
Andy
But I should probably open another discussion for that topic.
Claude