So yes, an Apache proxy and a query timeout limit for the Fuseki instances
it will be.
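
A minimal sketch of the Apache side, assuming Apache 2.4 with mod_proxy,
mod_proxy_http, mod_authz_core and mod_authz_host enabled, and Fuseki
listening on its default port 3030 on the same host. The dataset path /ds
and the 30 second value are placeholders for illustration, not our actual
setup:

```apache
# Hypothetical dataset path "/ds"; adjust to the real service name.
<Location "/ds">
    <RequireAll>
        # Allow everyone by default ...
        Require all granted
        # ... except the address that keeps sending the expensive query.
        Require not ip 193.52.210.70
    </RequireAll>

    # Forward the remaining requests to the local Fuseki instance.
    ProxyPass "http://localhost:3030/ds"
    ProxyPassReverse "http://localhost:3030/ds"
</Location>

# Stop waiting on the backend after 30 seconds (assumed value).
# This only limits how long the proxy waits; it does not by itself
# cancel the query inside Fuseki, so a Fuseki-side timeout is still needed.
ProxyTimeout 30
```

For the block to have any effect, Fuseki would have to be reachable only
through the proxy, for example by keeping port 3030 closed to the outside.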

I just checked the same query on an open source Virtuoso instance
(7.2) with the same data, and it seems that Virtuoso handles the
request with far fewer resources and runs it to completion. Andy, can you
enlighten me as to what the main difference is in the treatment of the
query by Jena (39s) vs Virtuoso (1s)?

On Wed, Dec 19, 2018 at 6:56 AM Laura Morales <[email protected]> wrote:
>
> > and needs some explaining why we put open endpoints on the web without 
> > great restrictions
>
> I've always been puzzled by this as well. You never see a publicly reachable
> PostgreSQL or MariaDB server, or any other database. There is always a layer
> in between which defines a list of possible requests, and then every request
> is optimized to retrieve data from the database. With a public endpoint
> instead, this optimization is not possible since anybody can write any query.
> I think the reason is simply that a SPARQL endpoint is supposed to answer any
> type of query, traversing any path that is not well defined a priori. If
> you only want the server to serve specific kinds of queries instead,
> you can in fact put some kind of REST API in front of it and translate
> every request to a SPARQL query; in this scenario you don't need the endpoint
> to be public, but you're limiting the type of queries that a user can ask.
>
>
>
>
> Sent: Tuesday, December 18, 2018 at 11:40 PM
> From: "Marco Neumann" <[email protected]>
> To: "Bruno P. Kinoshita" <[email protected]>, [email protected]
> Subject: Re: blocking IP to prevent malicious sparql queries
> It's good to see people using sparql one way or another. It's still an
> unusual thing in the wild and needs some explaining why we put open
> endpoints on the web without great restrictions. But since this one is
> intended to be a sandbox to play with and learn, I do indeed take a positive
> view on this incident.
>
> On Tue 18 Dec 2018 at 21:34, Bruno P. Kinoshita
> <[email protected]> wrote:
>
> > I think Laura's option is the best/easiest one, and good on you for the
> > positive point-of-view on these spams Marco! :D
> > Bruno
> >
> > From: Marco Neumann <[email protected]>
> > To: [email protected]
> > Sent: Wednesday, 19 December 2018 8:58 AM
> > Subject: Re: blocking IP to prevent malicious sparql queries
> >
> > Thank you Laura,
> >
> > I was hoping for a quick fix and something along the lines of a fuseki
> > blacklist filter in the shiro.ini
> >
> > but yes the reverse proxy is probably a more sensible approach at this
> > point.
> >
> > In any event good to see sparql spam like this here, it means that the
> > Semantic Web has most certainly arrived in the mainstream ;)
> >
> >
> >
> > On Tue, Dec 18, 2018 at 5:35 PM Laura Morales <[email protected]> wrote:
> >
> > > While I think the correct answer is YES (perhaps by implementing a custom
> > > filter), I guess the answer is going to be "use a reverse proxy".
> > >
> > >
> > >
> > >
> > > Sent: Tuesday, December 18, 2018 at 6:16 PM
> > > From: "Marco Neumann" <[email protected]>
> > > To: [email protected]
> > > Subject: blocking IP to prevent malicious sparql queries
> > > Is it possible to block individual IPs with the shiro.ini?
> > >
> > > We have received a number of malicious SPARQL queries today from an IP
> > > in France (193.52.210.70)
> > >
> > > that continuously issues the following SPARQL query:
> > >
> > > SELECT ?r (count(*) AS ?count)
> > > WHERE{ ?x ?r ?s
> > > { SELECT ?s WHERE
> > > { ?s a ?o }
> > > OFFSET 124639 LIMIT 1000 }
> > > } GROUP BY ?s ?r OFFSET 0 LIMIT 10000
> > >
> > > resulting in:
> > >
> > > [2018-12-18 18:10:31] AbstractConnector WARN
> > > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > > [2018-12-18 18:10:34] Fuseki WARN [424] RC = 500 : GC overhead limit
> > > exceeded
> > > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > > [2018-12-18 18:10:34] Fuseki INFO [424] 500 GC overhead limit exceeded
> > > (39.946 s)
> > >
> > > and pushes fuseki offline for a few minutes.
> > >
> > >
> > > --
> > >
> > >
> > > ---
> > > Marco Neumann
> > > KONA
> > >
> >
> >
> > --
> >
> >
> > ---
> > Marco Neumann
> > KONA
> >
> >
> >
>
> --
>
>
> ---
> Marco Neumann
> KONA



--


---
Marco Neumann
KONA
