Hi,

Sorry, I don't understand how tdbstats works. I ran it against the same graph that the slow query runs against and got the result below (some lines removed).

Br,
Mikael

(stats
  (meta
    (timestamp "2017-11-07T13:24:16.438+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
    (run@ "2017/11/07 13:24:16 EET")
    (count 165911))
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/vocab/frbr/core#Work>)
   3)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject>)
   1098)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/vocab/frbr/core#Manifestation>)
   897)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept>)
   36)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#InformationElement>)
   1)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/vocab/frbr/core#Expression>)
   3)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/dcat#CatalogRecord>)
   29284)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Text>)
   1623)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo/#InformationElement>)
   2)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document>)
   1100)
  ((VAR <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Collection>)
   5)
  (<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 34052)
  (<http://xmlns.com/foaf/0.1/primaryTopic> 29264)
  (<http://www.semanticdesktop.org/ontologies/2007/03/22/nfo/#wordCount> 59)
...
  (<http://purl.org/dc/elements/1.1/format> 4697)
  (<http://purl.org/dc/terms/created> 725)
  (<http://www.w3.org/2004/02/skos/core#topConceptOf> 1)
  (<http://www.semanticdesktop.org/ontologies/2007/01/19/nie/#version> 1)
  (<http://purl.org/dc/elements/1.1/description> 6)
  (<http://www.w3.org/2004/02/skos/core#hiddenLabel> 35)
  (<http://purl.org/dc/terms/type> 1624)
  (<http://purl.org/dc/terms/accessRights> 2)
  (<http://purl.org/dc/terms/identifier> 78)
...
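
As a reading aid for the output above: as far as I understand the format, each `(<predicate> N)` entry is the number of triples with that predicate, and `(count N)` under `meta` is the total number of triples scanned. The following is a minimal Python sketch, not part of Jena, that pulls those numbers out of the text with a regular expression and prints each predicate's share of the total. The function name `predicate_shares` and the `SAMPLE` string are illustrative assumptions; the sample holds only a few lines copied from the output above, and the regex handles only the short `(<predicate> N)` form, not the `((VAR ...) N)` entries.

```python
import re

# A few lines copied from the stats output above (not the full file).
SAMPLE = """
(stats
  (meta
    (count 165911))
  (<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 34052)
  (<http://xmlns.com/foaf/0.1/primaryTopic> 29264)
  (<http://purl.org/dc/terms/created> 725)
"""


def predicate_shares(text):
    """Map each predicate URI to its fraction of the total triple count."""
    # Total number of triples, taken from the (count N) entry in the meta block.
    total = int(re.search(r"\(count\s+(\d+)\)", text).group(1))
    # Only the abbreviated (<predicate> N) entries; lines starting with "(("
    # such as ((VAR ... ...) N) are deliberately not matched.
    pairs = re.findall(r"^\s*\(<([^>]+)>\s+(\d+)\)\s*$", text, flags=re.MULTILINE)
    return {pred: int(n) / total for pred, n in pairs}


if __name__ == "__main__":
    shares = predicate_shares(SAMPLE)
    # Most frequent predicate first.
    for pred, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{share:7.2%}  {pred}")
```

On these numbers rdf:type accounts for roughly a fifth of all triples, so a pattern that fixes only the predicate to rdf:type is not very selective on its own.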

On 30.10.2017 17:10, Andy Seaborne wrote:
Mikael,

I can't find anything that makes rdf:type special.  Maybe some distribution of data is the cause but I'm not seeing it.

Did you get a chance to get some stats?

    Andy


On 27/10/17 12:27, Mikael Pesonen wrote:

I also tried this with other properties such as dcterms:created, and the query didn't slow down with them.

-Mikael


On 27.10.2017 13:02, Andy Seaborne wrote:
In this case, stats won't help.  The <some resource> should be the starting point.

(quadpattern
  (quad ?g ?s ?p <some:resource>)
  (quad ?g ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type)
)

(quadpattern
  (quad ?g ?s ?p <some:resource>)
  (quad ?g ?s ?p2 ?o2)
)))

Are you using inference as well?

Is it the same <some resource>?

Is the timing for the rdf:type variant on a cold system?

    Andy



On 27/10/17 10:22, Mikael Pesonen wrote:

Hi,

Thanks! I'll try that when I get a chance to stop Jena. Yes, we are using TDB.



On 26.10.2017 16:15, Rob Vesse wrote:
Is TDB the underlying database?

If so, is there a stats.opt file in your database directory?

I remember there being issues in the past with the statistics for rdf:type triples being wrongly prioritised. You might want to look at that file, assuming that it exists, and try adjusting the values associated with rdf:type based upon the guidance in the documentation:

http://jena.apache.org/documentation/tdb/optimizer.html#statistics-rule-file

Also, if this is a database which is being updated, then the statistics can get out of date relative to the database. You can use the command-line tdbstats tool to try regenerating them:

http://jena.apache.org/documentation/tdb/optimizer.html#generating-a-statistics-file

Note that you will need to stop Fuseki in order to run this, as only a single process is permitted to access a TDB database at a time.

Rob

On 26/10/2017 13:47, "Mikael Pesonen" <[email protected]> wrote:

     Hi, I have trouble understanding why the first query is slow and the second
     one is fast. Using Jena Fuseki 3.4.0.
     So I want to get all resources that reference <some resource>, and their
     types:
     SELECT * WHERE
     {
         GRAPH ?g
         {
             ?s ?p <some resource> .
             ?s a ?type
         }
     }
     SELECT * WHERE
     {
         GRAPH ?g
         {
             ?s ?p <some resource> .
             ?s ?p2 ?o2
         }
     }
     The first one takes 5 seconds, which is too slow for our application. Can it
     be rearranged somehow to make it fast? Sorry if this is not the correct forum
     for this.
     Thanks!
     --







