Re: Parallel requests on multiple fuseki

2023-12-14 Thread Andy Seaborne

Jorge,

Have you looked at

https://jena.apache.org/documentation/query/service_enhancer.html

It might have features of use to you.

Andy

On 14/12/2023 08:25, George News wrote:

Hi,

I have deployed several Fuseki instances. The scenario in this email
uses just two.

I was testing the SERVICE option in order to send the same request to
both Fuseki instances and merge the results into one response.

The SPARQL request I launched is the following:

prefix rdf:
SELECT * WHERE {
  { SERVICE
    {
      SELECT ?anything WHERE { ?anything rdf:type ?bb }
    } BIND ( AS ?serviceLabel)
  }
  UNION
  { SERVICE
    {
      SELECT ?anything WHERE { ?anything rdf:type ?bb }
    } BIND ( AS ?serviceLabel)
  }
}

The result was as expected. However, when using Wireshark and analysing
the logs, I noticed that the requests are not sent in parallel, but one
after the other. This is something of a waste of time ;)

Is there any way to parallelize sending the same request to many Fuseki
instances and merge the responses? I guess I can make my own solution
using Jena, but I wanted to know if it would be possible using SPARQL.

Thanks.
Jorge


Re: Checking that SPARQL Update will not validate SHACL constraints

2023-12-14 Thread Martynas Jusevičius
Arne’s email got lost somehow but I see it in Andy’s reply.

Thanks for the suggestions.

On Wed, 13 Dec 2023 at 19.52, Andy Seaborne wrote:

>
>
> On 13/12/2023 15:49, Arne Bernhardt wrote:
> > Hello Martynas,
> >
> > I have no experience with implementing a validation layer for Fuseki.
> >
> > But I might have an idea for your suggested approach:
> > Instead of loading a copy of the graph and modifying it, you could create
> > an org.apache.jena.graph.compose.Delta based on the unmodified graph.
> > Then apply the update to the delta graph and validate the SHACL on the
> > delta graph. If the validation is successful, you can safely apply the
> > update to the original graph and discard the delta graph.
> >
> > You still have to deal with concurrency. For example, the original graph
> > could be changed by a second, faster update while you are still validating
> > the first update. It would not be safe to apply the validated changes to a
> > graph that has been changed in the meantime.
> >
> > Arne
>
> It'll depend on the SHACL. Many constraints don't need all the data
> available. Some need just the subject and all its properties (e.g.
> sh:maxCount). Some need all the data (SPARQL ones - they are opaque to
> analysis, so in general they need all the data).
>
> If the proxy layer is in the same JVM, BufferingDatasetGraph may help.
> It can be used to capture the adds and deletes. It can then be validated
> (all data or only the data changing). Flush the changes to the database
> just before the end of the request in the proxy level commit.
>
> If the proxy is in a different JVM, then only certain constraints can be
> supported but they do tend to be the most common checks.
>
>  Andy
>
> >
> >
> >
> >
> > On Wed, 13 Dec 2023 at 14:29, Martynas Jusevičius
> > <marty...@atomgraph.com> wrote:
> >
> >> Hi,
> >>
> >> I have an objective to only persist constraint-validated data in Fuseki.
> >>
> >> I have a proxy layer that validates all incoming GSP PUT and POST
> >> request graphs in memory and rejects the invalid ones. So far so good.
> >>
> >> What about SPARQL Update requests though? For simplicity's sake, let's
> >> say they are restricted to a single graph as in GSP PATCH [1].
> >> What I can think of is first loading the graph into memory and
> >> executing the update, and then validating the resulting graph against
> >> SHACL. But maybe there's a smarter way?
> >>
> >> Also interested in the more general case without the graph restriction.
> >>
> >> Martynas
> >>
> >> [1] https://www.w3.org/TR/sparql11-http-rdf-update/#http-patch
> >>
> >
>
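
For readers following the thread, Arne's suggestion (stage the update in a delta, validate the result, then apply it, while guarding against a concurrent change to the original graph) can be modelled roughly as below. This is only a toy sketch in plain Java, not Jena code: triples are plain strings, the validator is any predicate standing in for a SHACL check, and the class name StagedUpdate, its nested Delta class, and the version counter are all hypothetical. In Jena the staging role would be played by org.apache.jena.graph.compose.Delta, as Arne describes.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Toy model of validate-before-commit; triples are plain strings (assumption). */
final class StagedUpdate {
    private final Set<String> base = new HashSet<>();
    private long version = 0; // bumped on every successful commit

    /** Pending changes plus the base version they were computed against. */
    static final class Delta {
        final Set<String> additions = new HashSet<>();
        final Set<String> deletions = new HashSet<>();
        final long baseVersion;

        Delta(long baseVersion) {
            this.baseVersion = baseVersion;
        }
    }

    /** Starts a new, empty delta tied to the current base version. */
    synchronized Delta begin() {
        return new Delta(version);
    }

    /** The graph as it would look with the delta applied; base is untouched. */
    synchronized Set<String> preview(Delta d) {
        Set<String> view = new HashSet<>(base);
        view.removeAll(d.deletions);
        view.addAll(d.additions);
        return view;
    }

    /**
     * Applies the delta only if nobody committed since begin() and the
     * previewed graph passes validation. Returns false and leaves the
     * base unchanged otherwise.
     */
    synchronized boolean commit(Delta d, Predicate<Set<String>> isValid) {
        if (d.baseVersion != version) {
            return false; // a second, faster update changed the base
        }
        if (!isValid.test(preview(d))) {
            return false; // constraint violation, discard the delta
        }
        base.removeAll(d.deletions);
        base.addAll(d.additions);
        version++;
        return true;
    }

    /** Defensive copy of the current base graph. */
    synchronized Set<String> snapshot() {
        return new HashSet<>(base);
    }
}
```

A rejected commit leaves the base as it was, so the caller can either report the validation failure or start a fresh delta against the newer base and try again.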


Parallel requests on multiple fuseki

2023-12-14 Thread George News

Hi,

I have deployed several Fuseki instances. The scenario in this email
uses just two.

I was testing the SERVICE option in order to send the same request to
both Fuseki instances and merge the results into one response.

The SPARQL request I launched is the following:

prefix rdf:
SELECT * WHERE {
  { SERVICE
    {
      SELECT ?anything WHERE { ?anything rdf:type ?bb }
    } BIND ( AS ?serviceLabel)
  }
  UNION
  { SERVICE
    {
      SELECT ?anything WHERE { ?anything rdf:type ?bb }
    } BIND ( AS ?serviceLabel)
  }
}

The result was as expected. However, when using Wireshark and analysing
the logs, I noticed that the requests are not sent in parallel, but one
after the other. This is something of a waste of time ;)

Is there any way to parallelize sending the same request to many Fuseki
instances and merge the responses? I guess I can make my own solution
using Jena, but I wanted to know if it would be possible using SPARQL.

Thanks.
Jorge
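
As a rough illustration of the "own solution" route mentioned in the message, a client-side fan-out could look something like the sketch below. It uses only the Java standard library: every request is started before any result is awaited, and the rows are then merged with the endpoint they came from as a label, much like ?serviceLabel in the query. The class name ParallelFanOut and the fetch parameter are hypothetical. fetch stands in for whatever actually runs the remote SELECT against one endpoint (for example a Jena remote query execution) and is not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Sends one request to several endpoints concurrently and merges the rows. */
final class ParallelFanOut {

    /**
     * Runs fetch once per endpoint on a thread pool and concatenates the
     * results in endpoint order. Each merged entry has the endpoint as key
     * (the service label) and one result value as value.
     * fetch is a placeholder for a real remote SELECT (assumption).
     */
    static List<Map.Entry<String, String>> queryAll(
            List<String> endpoints,
            Function<String, List<String>> fetch) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, endpoints.size()));
        try {
            // Start every request before waiting on any of them.
            List<CompletableFuture<List<String>>> pending = new ArrayList<>();
            for (String endpoint : endpoints) {
                pending.add(CompletableFuture.supplyAsync(
                        () -> fetch.apply(endpoint), pool));
            }
            // Collect in endpoint order; join() rethrows a failed request.
            List<Map.Entry<String, String>> merged = new ArrayList<>();
            for (int i = 0; i < endpoints.size(); i++) {
                for (String value : pending.get(i).join()) {
                    merged.add(Map.entry(endpoints.get(i), value));
                }
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }
}
```

With one thread per endpoint the total wait is roughly that of the slowest instance rather than the sum of all of them. For a large number of instances a bounded pool size would probably be a better choice.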