[jira] [Commented] (JENA-1826) Fuseki RDF/XML response never finishes

Andy Seaborne (Jira) Wed, 22 Jan 2020 10:22:15 -0800


    [ 
https://issues.apache.org/jira/browse/JENA-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021363#comment-17021363
 ]


Andy Seaborne commented on JENA-1826:
-------------------------------------

I just noticed that this only affects RDF/XML output.

The information on counts would still be useful to see what structure is 
triggering the issue.

Writing RDF/XML in pretty form can become very expensive because the writer 
tries a large number of possible prettification options.

We could switch to flat RDF/XML, which does not have the problem (nor does 
pretty Turtle).

 

> Fuseki RDF/XML response never finishes
> --------------------------------------
>
>                 Key: JENA-1826
>                 URL: https://issues.apache.org/jira/browse/JENA-1826
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Fuseki
>    Affects Versions: Jena 3.14.0
>         Environment: Ubuntu 16.04
> java version "1.8.0_201"
> Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
> Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
>            Reporter: Osma Suominen
>            Priority: Major
>         Attachments: W00067442800.ttl
>
>
> I have a web app running SPARQL CONSTRUCT queries against Fuseki and 
> generating web pages. I noticed that Fuseki started hogging all CPU cores a 
> few hours after it was restarted. It turned out that some of the CONSTRUCT 
> queries take a very long time to complete: at least 40 minutes, probably 
> more, and it seems quite likely they will never finish.
> I was able to turn this into a fairly minimal example. I've attached a 1.3MB 
> Turtle file (~29k triples) with all the data necessary to demonstrate the 
> problem. 
> Start Fuseki like this: {{./fuseki-server --file W00067442800.ttl /ds}}
> Then open the Fuseki web UI and run this SPARQL query against the dataset:
> {noformat}
> PREFIX schema: <http://schema.org/>
> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
> CONSTRUCT {
>   <http://urn.fi/URN:NBN:fi:bib:me:W00067442800> ?p ?o .
>   ?o schema:name ?oname ;
>     skos:prefLabel ?olabel .
>   ?inst ?instprop ?instval .
>   ?instval schema:name ?instvalName ;
>     skos:prefLabel ?instvalLabel .
> }
> WHERE {
>   {
>     <http://urn.fi/URN:NBN:fi:bib:me:W00067442800> ?p ?o .
>     OPTIONAL {
>       { ?o schema:name ?oname }
>       UNION
>       { ?o skos:prefLabel ?olabel }
>     }
>   }
>   UNION
>   {
>     { <http://urn.fi/URN:NBN:fi:bib:me:W00067442800> schema:workExample ?inst }
>     OPTIONAL {
>       {
>         ?inst ?instprop ?instval .
>         OPTIONAL {
>           { ?instval schema:name ?instvalName }
>           UNION
>           { ?instval skos:prefLabel ?instvalLabel }
>         }
>       }
>     }
>   }
> }
> {noformat}
> If you select Turtle as the content type, the query will finish in around 3 
> seconds (plus rendering the result in the browser takes a while). If instead 
> you select XML as the format, the query will just keep running, with Fuseki 
> taking over a single CPU core completely. With several such queries running, 
> all the CPU cores will eventually be used.
> This can also be demonstrated using curl (with the above query saved as 
> {{query.rq}}):
> {noformat}
> curl -H 'Accept: text/turtle' --data-urlencode "query@query.rq" \
>   http://localhost:3030/ds/sparql
> {noformat}
> works fine and gives you the Turtle output;
> {noformat}
> curl -H 'Accept: application/rdf+xml' --data-urlencode "query@query.rq" \
>   http://localhost:3030/ds/sparql
> {noformat}
> never seems to finish.
> What's perhaps even worse, even a query timeout setting doesn't help. If I 
> start Fuseki with a 10 second query timeout, i.e. {{--timeout 10000}}, it 
> still won't stop the query from hogging the CPU forever. I'm guessing that 
> the problem is in the final stages of the query processing, when the results 
> just have to be serialized into the correct syntax, and the timeout is no 
> longer applied in this stage.
> I discovered this problem while running Fuseki 3.5.0, but it happens with the 
> most recent release 3.14.0 as well.
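
As a client-side stopgap, the curl calls above can be mirrored in a short 
script that asks for Turtle and gives up if the server stops sending data. 
This is only a sketch: build_request and run_construct are hypothetical 
helper names, not part of Fuseki or Jena, and the 30 second default is an 
arbitrary assumption. The timeout is a client socket timeout, so it only 
limits how long the client waits and says nothing about what the server does 
afterwards.

```python
import urllib.parse
import urllib.request

# Endpoint taken from the report; adjust for your own dataset name.
ENDPOINT = "http://localhost:3030/ds/sparql"


def build_request(query, accept, endpoint=ENDPOINT):
    """Build a SPARQL POST request asking for a specific RDF syntax.

    Hypothetical helper: mirrors `curl -H 'Accept: ...' --data-urlencode`.
    """
    # Form-encode the query text under the `query` parameter.
    body = urllib.parse.urlencode({"query": query}).encode("ascii")
    # Supplying `data` makes urllib send this as a POST.
    return urllib.request.Request(endpoint, data=body, headers={"Accept": accept})


def run_construct(query, accept="text/turtle", timeout_s=30.0):
    """Send the query and return the response body as text.

    `timeout_s` is a socket timeout (assumed value): it applies to each
    blocking network operation, not to the total duration of the request.
    """
    request = build_request(query, accept)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read().decode("utf-8")
```

With the query saved as query.rq, calling run_construct(open("query.rq").read()) 
corresponds to the Turtle request; passing accept="application/rdf+xml" 
corresponds to the request that does not finish.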



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
