Re: Slow query when getting rdf:type

Mikael Pesonen Tue, 07 Nov 2017 03:30:53 -0800

Hi,

sorry, I don't understand how tdbstats works. I ran it against the same graph that produces the slow query and got the result below (some lines removed).


Br,
Mikael

(stats
  (meta
    (timestamp "2017-11-07T13:24:16.438+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
    (run@ "2017/11/07 13:24:16 EET")
    (count 165911))
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/vocab/frbr/core#Work>) 3)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject>) 1098)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/vocab/frbr/core#Manifestation>) 897)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept>) 36)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#InformationElement>) 1)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/vocab/frbr/core#Expression>) 3)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/dcat#CatalogRecord>) 29284)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Text>) 1623)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo/#InformationElement>) 2)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document>) 1100)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Collection>) 5)
  (<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 34052)
  (<http://xmlns.com/foaf/0.1/primaryTopic> 29264)
  (<http://www.semanticdesktop.org/ontologies/2007/03/22/nfo/#wordCount> 59)
...
  (<http://purl.org/dc/elements/1.1/format> 4697)
  (<http://purl.org/dc/terms/created> 725)
  (<http://www.w3.org/2004/02/skos/core#topConceptOf> 1)
  (<http://www.semanticdesktop.org/ontologies/2007/01/19/nie/#version> 1)
  (<http://purl.org/dc/elements/1.1/description> 6)
  (<http://www.w3.org/2004/02/skos/core#hiddenLabel> 35)
  (<http://purl.org/dc/terms/type> 1624)
  (<http://purl.org/dc/terms/accessRights> 2)
  (<http://purl.org/dc/terms/identifier> 78)
...
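
For anyone reading the stats output above: as far as I understand the TDB optimizer documentation, (count N) under meta is the number of triples scanned, each ((VAR rdf:type <class>) N) entry is the number of triples typing something as that class, and each (<predicate> N) entry is the number of triples using that predicate. A rough Python sketch to sanity-check those figures follows. The short class labels are my own abbreviations, not part of the tdbstats output, and the helper name is made up for illustration.

```python
# Counts copied by hand from the ((VAR rdf:type <class>) N) entries above.
# The keys are informal abbreviations chosen for this sketch only.
type_counts = {
    "frbr:Work": 3,
    "nfo#FileDataObject": 1098,
    "frbr:Manifestation": 897,
    "skos:Concept": 36,
    "nfo#InformationElement": 1,
    "frbr:Expression": 3,
    "dcat:CatalogRecord": 29284,
    "dcmitype:Text": 1623,
    "nfo/#InformationElement": 2,  # note the extra "/" in this IRI
    "foaf:Document": 1100,
    "dcmitype:Collection": 5,
}

TOTAL_TRIPLES = 165911     # (count 165911) in the meta block
RDF_TYPE_TRIPLES = 34052   # (<...rdf-syntax-ns#type> 34052)


def summarize(counts, rdf_type_total, total):
    """Return (sum of per-class counts, rdf:type share, largest class, its share)."""
    per_class_sum = sum(counts.values())
    largest = max(counts, key=counts.get)
    return (
        per_class_sum,
        rdf_type_total / total,
        largest,
        counts[largest] / rdf_type_total,
    )


per_class_sum, type_share, largest, largest_share = summarize(
    type_counts, RDF_TYPE_TRIPLES, TOTAL_TRIPLES
)
print("per-class sum equals rdf:type total:", per_class_sum == RDF_TYPE_TRIPLES)
print(f"rdf:type share of all triples: {type_share:.1%}")
print(f"largest class: {largest} ({largest_share:.1%} of rdf:type triples)")
```

The per-class counts add up to the rdf:type total, so the listing of type entries looks complete even though other lines were removed. About a fifth of all triples are rdf:type triples, and most of those are for a single class.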

On 30.10.2017 17:10, Andy Seaborne wrote:

Mikael,
I can't find anything that makes rdf:type special. Maybe some distribution of data is the cause, but I'm not seeing it.
Did you get a chance to get some stats?

    Andy


On 27/10/17 12:27, Mikael Pesonen wrote:
I also tried this with other properties such as dcterms:created, and it didn't slow down with them.
-Mikael


On 27.10.2017 13:02, Andy Seaborne wrote:
In this case, stats won't help. The <some resource> should be the starting point.
(quadpattern
  (quad ?g ?s ?p <some:resource>)
  (quad ?g ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type)
)

(quadpattern
  (quad ?g ?s ?p <some:resource>)
  (quad ?g ?s ?p2 ?o2)
)))

Are you using inference as well?

Is it the same <some resource>?

Is the timing for the rdf:type variant on a cold system?

    Andy



On 27/10/17 10:22, Mikael Pesonen wrote:
Hi,
thanks! I'll try that when I get a chance to stop Jena. Yes, we are using TDB.
On 26.10.2017 16:15, Rob Vesse wrote:
Is TDB the underlying database?

If so, is there a stats.opt file in your database directory?
I remember there being issues in the past with the statistics for rdf:type triples being wrongly prioritised. You might want to look at that file, assuming that it exists, and try adjusting the values associated with rdf:type based upon the guidance in the documentation:
http://jena.apache.org/documentation/tdb/optimizer.html#statistics-rule-file
Also, if this is a database which is being updated, then the statistics can get out of date relative to the database. You can use the command-line tdbstats tool to try regenerating this:
http://jena.apache.org/documentation/tdb/optimizer.html#generating-a-statistics-file
Note that you will need to stop Fuseki in order to run this, as only a single process is permitted to access a TDB database at a time.
Rob
On 26/10/2017 13:47, "Mikael Pesonen" <[email protected]> wrote:
     Hi, I have trouble understanding why the first query is slow and the second
     one is fast. Using Jena Fuseki 3.4.0.
     So I want to get all resources that reference <some resource>, and their
     types:
     SELECT * WHERE
     {
         GRAPH ?g
         {
             ?s ?p <some resource> .
             ?s a ?type
         }
     }

     SELECT * WHERE
     {
         GRAPH ?g
         {
             ?s ?p <some resource> .
             ?s ?p2 ?o2
         }
     }
     The first one takes 5 seconds, which is too slow for our application. Can it
     be rearranged somehow to make it fast? Sorry if this is not the correct forum
     for this.
     Thanks!
     --


--
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's 
Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: [email protected]
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND
