RE: Fuseki - SELECT during INSERT

Nouwt, B. (Barry) Mon, 20 Jan 2020 23:55:35 -0800

Hi Claude/Andy, thanks for the responses.

@Claude: Below is a shortened version of the configuration we are using
(the shortened version is untested). While an INSERT query is being processed,
we fire a new SELECT SPARQL HTTP request at the same secured dataset. That
request runs as a user with unlimited permissions, to avoid indefinite
permission checking. It is indeed very slow, but that is not a problem for now.
Our permissions structure uses graph patterns (BGPs) to encode which types of
triples users do or do not have access to, so I think this means no reification
is being done. The requested (or inserted) triples, including some context, are
matched against the graph patterns in the security policy, and the access
decision is based on that match.


@Andy: we haven't tested thoroughly yet, but I suspect the CPU is idle and
simply waiting for the INSERT query to finish (which cannot finish until the
SELECT query finishes). So, if SELECT queries are indeed postponed until the
INSERT query has finished, this is a deadlock. I'll see if we can make a
thread dump to clarify things.
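
A minimal sketch of the suspected situation, not Fuseki or Jena code. It
assumes the dataset is guarded by a single multiple-reader-or-single-writer
lock and that the nested SELECT is served on a different thread than the
INSERT. NestedSelectSketch, nestedSelect and the timeout are hypothetical
names; the JDK's ReentrantReadWriteLock stands in for whatever lock the
dataset really uses, and a timeout replaces the indefinite wait so that the
example terminates.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class NestedSelectSketch {

    /**
     * Runs an outer request that holds the dataset lock and, while holding it,
     * waits for a nested SELECT served on another thread.
     * Returns true if the nested SELECT obtained the read lock in time.
     */
    static boolean nestedSelect(boolean outerIsInsert, long timeoutMs) throws Exception {
        // Assumption: one multiple-reader-or-single-writer lock guards the dataset.
        ReentrantReadWriteLock datasetLock = new ReentrantReadWriteLock();
        Lock outer = outerIsInsert ? datasetLock.writeLock() : datasetLock.readLock();
        ExecutorService httpWorker = Executors.newSingleThreadExecutor();

        outer.lock(); // the outer INSERT (or SELECT) starts its transaction
        try {
            // The security evaluator's HTTP request is handled by another thread.
            Future<Boolean> select = httpWorker.submit(() -> {
                Lock read = datasetLock.readLock();
                // A real server would block here; the timeout keeps the sketch finite.
                boolean acquired = read.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
                if (acquired) {
                    read.unlock();
                }
                return acquired;
            });
            return select.get(); // the outer request cannot finish before the SELECT
        } finally {
            outer.unlock();
            httpWorker.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("SELECT during SELECT: " + nestedSelect(false, 200));
        System.out.println("SELECT during INSERT: " + nestedSelect(true, 200));
    }
}
```

Under these assumptions the nested SELECT gets its read lock while the outer
request is a SELECT, and times out while the outer request is an INSERT.
Whether Fuseki actually takes such a lock with this configuration is what the
thread dump should show.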

Again, thanks for your responses and thanks for Apache Jena!

Regards, Barry

----------------------------------------- config.ttl --------------------------------

@prefix : <http://www.example.org/#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix perm: <http://apache.org/jena/permissions/Assembler#> .


<#service1> rdf:type fuseki:Service ;
    fuseki:name                       "ds" ;
    fuseki:serviceQuery               "sparql" ;   # SPARQL query service
    fuseki:serviceQuery               "query" ;    # SPARQL query service (alt name)
    fuseki:serviceUpdate              "update" ;   # SPARQL update service
    fuseki:serviceUpload              "upload" ;   # Non-SPARQL upload service
    fuseki:serviceReadWriteGraphStore "data" ;     # SPARQL Graph store protocol (read and write)
    # A separate read-only graph store endpoint:
    fuseki:serviceReadGraphStore      "get" ;      # SPARQL Graph store protocol (read only)
    fuseki:dataset                   :dataset ;
    .
 
perm:Model rdfs:subClassOf ja:NamedModel .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
 
:dataset a ja:RDFDataset ;
        ja:defaultGraph <#securedGraph> ;
.

<#securedGraph> rdf:type perm:Model ;
    perm:baseModel <#infGraph> ;
    ja:modelName "https://www.example.org/securedModel" ;
    perm:evaluatorImpl <#secEvaluator> ;
.

<#secEvaluator> rdf:type perm:Evaluator ;
    perm:args [  
        rdf:_1 "http://fuseki:3030/ds/query" ;
    ] ;
    perm:evaluatorClass "nl.example.MySecurityEvaluator"
.

<#infGraph> a ja:InfModel ;
    ja:baseModel <#tdbGraph> ;
    ja:reasoner [
        ja:rulesFrom <file:virt/et.rules> ;
    ] ;
.

<#tdbGraph> rdf:type tdb:GraphTDB ;
        tdb:location "DB" ;
.

-----Original Message-----
From: Andy Seaborne <[email protected]> 
Sent: maandag 20 januari 2020 22:51
To: [email protected]
Subject: Re: Fuseki - SELECT during INSERT

Hi Barry,

"hangs indefinitely" -- is this the CPU is doing nothing or the CPU is doing 
work but never finishes?

If it is the former, CPU doing nothing, a JVM thread dump would be useful.
There should be one thread that is stuck (I'm assuming the SELECT runs on
the same thread as the INSERT, and also that this deadlocks every time, not,
for example, only when another request is happening at the same time).
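
For reference, a thread dump of a running JVM is usually taken from outside
with `jstack <pid>` or `jcmd <pid> Thread.print`. The sketch below produces
similar information from inside the JVM with the standard
java.lang.management API. ThreadDumpSketch and dump are illustrative names,
and nothing here is specific to Fuseki.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {

    /** Returns a textual dump of all live threads, including held locks where supported. */
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            // ThreadInfo.toString() prints the thread name, its state, the lock
            // it is waiting on (if any) and the top stack frames.
            out.append(info);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In such a dump, a thread in WAITING or BLOCKED state on a lock that another
thread owns is the pattern to look for.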

Related to Claude's point: TDB has multiple-reader-and-single-writer
(MR+SW) transactions, but general-purpose datasets have a
multiple-reader-*or*-single-writer (MRSW) lock so that they work in the
general case without knowing anything about the component parts. It could be
that these interact to produce a deadlock, or that locks are being taken
twice, which is again a potential deadlock.

    Andy



On Mon, 20 Jan 2020 at 15:16, Claude Warren <[email protected]> wrote:

> Barry,
>
> I just realised you said "Select against the same dataset".  Are you 
> selecting against an unrestricted model/graph? If you query a graph 
> with permissions to determine the permissions you can get into a 
> situation where things are running _very_ slowly.
>
> How have you designed your permissions structure?  Are you reifying 
> the triples and then granting access to the reified nodes?  If so this 
> is an extremely processor intensive way of doing permissions checking.  
> The issue being that your permissions graph will be larger than 3x the 
> size of the graph you are protecting.
>
> Claude
>
> On Mon, Jan 20, 2020 at 3:10 PM Claude Warren <[email protected]> wrote:
>
> > Barry,
> >
> > Can you provide the configuration for the Fuseki server?  I need to 
> > know how the dataset(s) are constructed.
> >
> > Claude
> >
> > On Mon, Jan 20, 2020 at 11:10 AM Claude Warren <[email protected]> wrote:
> >
> >> I am not certain if the lock is the reason but am providing more 
> >> background on the permissions processing so someone with more 
> >> dataset experience can answer.
> >>
> >> To use the permissions on a dataset requires that the dataset be 
> >> constructed from individual models.  As each of the models would 
> >> have to have permissions assigned.  I put this out there because I 
> >> know that TDB has an internal dataset implementation and I want to 
> >> make sure that we only look in the stand alone dataset
> >> implementations.
> >>
> >> Claude
> >>
> >> On Mon, Jan 20, 2020 at 10:46 AM Nouwt, B. (Barry) 
> >> <[email protected]> wrote:
> >>
> >>> Hi all,
> >>>
> >>> we have a Security related scenario where whenever an INSERT query
> >>> gets executed on our Fuseki dataset, we intercept the execution of
> >>> this query (using Jena Permissions and its Security Evaluator) and
> >>> during this interception we execute a SELECT query to the same
> >>> dataset. Whenever we did this during a SELECT query (instead of an
> >>> INSERT query), there was no problem, but when we do it during an
> >>> INSERT query, it seems like the SELECT query hangs indefinitely.
> >>> Could this be caused by a lock of the INSERT on that dataset?
> >>>
> >>> Regards, Barry
> >>> This message may contain information that is not intended for you.
> >>> If you are not the addressee or if this message was sent to you by
> >>> mistake, you are requested to inform the sender and delete the
> >>> message. TNO accepts no liability for the content of this e-mail,
> >>> for the manner in which you use it and for damage of any kind
> >>> resulting from the risks inherent to the electronic transmission of
> >>> messages.
> >>>
> >>
> >>
> >> --
> >> I like: Like Like - The likeliest place on the web 
> >> <http://like-like.xenei.com>
> >> LinkedIn: http://www.linkedin.com/in/claudewarren
> >>
