Re: Make SERVICE multi threaded

Andy Seaborne Mon, 17 Jun 2019 08:33:13 -0700

Good idea if combined with batching, or with switching to evaluating once and joining locally with any results from earlier in the execution.

... otherwise it can cause problems for the remote service and look like a denial-of-service attack.


    Andy
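A minimal sketch of the "evaluate once and join locally" option, assuming solutions are represented as plain maps from variable name to value and that the two sides share exactly one join variable. `LocalJoin` and its method are hypothetical names for illustration, not ARQ classes; ARQ has its own join machinery.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Jena/ARQ.
final class LocalJoin {

    // Hash join: the remote side is fetched once, indexed on the join variable,
    // and then probed with each solution produced earlier in the execution.
    // Assumption: `var` is the only variable the two sides have in common.
    static List<Map<String, String>> join(List<Map<String, String>> earlier,
                                          List<Map<String, String>> remote,
                                          String var) {
        Map<String, List<Map<String, String>>> index = new HashMap<>();
        for (Map<String, String> row : remote) {
            String key = row.get(var);
            if (key != null) {
                index.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
            }
        }
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : earlier) {
            for (Map<String, String> match : index.getOrDefault(row.get(var), List.of())) {
                Map<String, String> merged = new LinkedHashMap<>(row);
                merged.putAll(match);
                out.add(merged);
            }
        }
        return out;
    }
}
```

The remote endpoint sees a single request regardless of how many earlier solutions there are, which is the point of this option; the cost is that the whole remote result has to be transferred and held in memory.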

On 17/06/2019 09:38, Marco Neumann wrote:

very good Lorenz, looks like Dave is already on the job while we speak.

On Mon, Jun 17, 2019 at 9:32 AM Lorenz <[email protected]> wrote:

At least it looks like a good time to discuss it given that there is
already some ongoing work on the SERVICE feature?

The mailing list topic is called "Batching federated calls using VALUES
block", initial question was in April [1], latest status mail w.r.t.
external contribution was this month [2]

[1]

https://lists.apache.org/thread.html/ebfbeb950d43ef1f92057c1c4de12bb42f3db1f4a7afc6601243c9c2@%3Cusers.jena.apache.org%3E
[2]

https://lists.apache.org/thread.html/7d85cad8dc54c0bf8fde73ec879a3520127d1f8f70192785d2623874@%3Cusers.jena.apache.org%3E
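
To illustrate the general idea behind batching with a VALUES block, here is a minimal sketch that groups already-known IRIs into chunks and renders one remote query per chunk instead of one request per binding. It is not the implementation discussed in those threads. `ValuesBatcher`, its parameters, and the query template are assumptions made for illustration, and the IRIs are assumed to be valid and need no escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Jena/ARQ.
final class ValuesBatcher {

    // Renders one SELECT query per chunk of at most `batchSize` IRIs.
    // `var` is the variable name without the leading '?'; `pattern` is the
    // graph pattern to evaluate remotely for the bound values.
    static List<String> buildBatches(String var, List<String> iris,
                                     String pattern, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < iris.size(); i += batchSize) {
            List<String> chunk = iris.subList(i, Math.min(i + batchSize, iris.size()));
            StringBuilder values = new StringBuilder();
            for (String iri : chunk) {
                values.append(" <").append(iri).append(">");
            }
            queries.add("SELECT * WHERE { VALUES ?" + var + " {" + values + " } "
                    + pattern + " }");
        }
        return queries;
    }
}
```

With 1,000 bindings and a batch size of 100, the remote service receives 10 requests rather than 1,000.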

OK that's useful Lorenz, thank you. I see AKSW is evaluating a number of
solutions here

https://svn.aksw.org/papers/2017/FedEval-summary/public.pdf

Since fuseki is thread-safe one can certainly delegate the query
segmentation to the application logic and issue multiple queries to
individual (fuseki or any other) endpoints concurrently.
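
A minimal sketch of that application-side approach, using only the Java standard library (Java 9 or later). `EndpointFanOut` is a hypothetical name, and `fetch` stands in for whatever blocking call the application uses to send a query to one endpoint and read the response; neither is a Jena or Fuseki API. The pool size caps how many requests are in flight at once.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper for illustration only; not part of Jena or Fuseki.
final class EndpointFanOut {

    // Calls `fetch` once per endpoint on a bounded thread pool and returns the
    // responses keyed by endpoint URL, in the order the endpoints were given.
    // An endpoint that fails or exceeds `timeout` maps to null, so one slow
    // node does not hold up the others. Endpoint URLs are assumed distinct.
    static Map<String, String> queryAll(List<String> endpoints,
                                        Function<String, String> fetch,
                                        int maxParallel,
                                        Duration timeout) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
            for (String endpoint : endpoints) {
                pending.put(endpoint,
                        CompletableFuture.supplyAsync(() -> fetch.apply(endpoint), pool)
                                .completeOnTimeout(null, timeout.toMillis(), TimeUnit.MILLISECONDS)
                                .exceptionally(e -> null));
            }
            Map<String, String> results = new LinkedHashMap<>();
            pending.forEach((endpoint, future) -> results.put(endpoint, future.join()));
            return results;
        } finally {
            pool.shutdownNow(); // interrupt any request still running after a timeout
        }
    }
}
```

Note that the timeout clock starts when a task is submitted, not when it begins running, so with a small pool and many endpoints the queued requests may time out before they start.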

use case here is to work with a large sharded dataset from one query.
latency is currently not of essence to the use case but could be mitigated
by hording nodes on the same network segment.

I just wonder if the threading of SERVICE would require any significant
rewrite of ARQ or if this is already an encapsulated process that lends
itself to threading.



On Mon, Jun 17, 2019 at 6:36 AM Lorenz <[email protected]> wrote:

Honestly, with that extensive use of SERVICE feature it clearly would
make sense to make use of parallel execution. Never heard of such a
query, but sounds like fun.

What is the use-case here? Can you give some insights? Are all of them
remote SPARQL services?

By the way, did you ever consider or even try one of the existing
federated query engines like FedX, ANAPSID, HIBISCUS, etc.? I'm
wondering how those would work (if they even scale) with ~100 sources, as
looks to be the case in your query.

While using a query with a large number (100+) of remote sparql endpoints,
using the SERVICE keyword for a federated query, I have noticed that Jena
keeps waiting in the queue for slow responses to finish up before
proceeding to the next node.

Would it not be a good idea to make SERVICE a thread to speed up the
process in the query?


--
Lorenz Bühmann
AKSW group, University of Leipzig
Group: http://aksw.org - semantic web research center


Hello Marco,





Kind regards,
Lorenz

