Hi,



On 24.9.2016 17:02, Andy Seaborne wrote:


On 23/09/16 09:35, Mikael Pesonen wrote:

Hi,

I have another query whose behavior seems illogical to me. I am searching
for terms in a SKOS vocabulary and also need to retrieve the topmost-level
concept for each search result.


This query returns the entire skos:broader hierarchy for the search results and
runs in a second (the relevant lines are marked in bold):

    SELECT ?graph ?concept
    (group_concat(DISTINCT concat(?prefLabelm,"@",lang(?prefLabelm));separator="NEWLINE") AS ?prefLabelms)
    (group_concat(DISTINCT concat(?prefLabel,"@",lang(?prefLabel));separator="NEWLINE") AS ?prefLabels)
    (group_concat(DISTINCT concat(?altLabelm,"@",lang(?altLabelm));separator="NEWLINE") AS ?altLabelms)
    (group_concat(DISTINCT concat(?altLabel,"@",lang(?altLabel));separator="NEWLINE") AS ?altLabels)
    (group_concat(DISTINCT concat(?def1,"@",lang(?def1));separator="NEWLINENEWLINE") AS ?defs1)
    (group_concat(DISTINCT concat(?def2,"@",lang(?def2));separator="NEWLINENEWLINE") AS ?defs2)
    *(group_concat(DISTINCT concat(?topConceptLabel,"@",lang(?topConceptLabel));separator="/") AS ?topConceptLabels)*
    WHERE
    {
        GRAPH ?graph { ?graph dcterms:subject "MEDICAL" }
        GRAPH ?graph
        {
            {
                SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
                {
                    { ?concept skos:prefLabel ?prefLabelm
                      FILTER ( (lang(?prefLabelm) = "fi" || lang(?prefLabelm) = "la-FI") && REGEX(?prefLabelm, "culo", "i")) }
                    UNION
                    { ?concept skos:altLabel ?altLabelm
                      FILTER ( (lang(?altLabelm) = "fi" || lang(?altLabelm) = "la-FI") && REGEX(?altLabelm, "culo", "i")) }
                }
                limit 200
            }
            ?concept skos:prefLabel ?prefLabel .

            OPTIONAL { ?concept skos:altLabel ?altLabel }
            OPTIONAL { ?concept skos:definition ?def1 . OPTIONAL { ?def1 rdf:value ?def2 } }
            *OPTIONAL { ?concept skos:broader* ?topConcept . ?topConcept skos:prefLabel ?topConceptLabel FILTER ( lang(?topConceptLabel) = "fi") }*
        }
    }
    GROUP BY ?graph ?concept



But this is what I tried first, to get only the one topmost broader concept for
each result, and it takes 17 seconds to run:

    SELECT ?graph ?concept *?topConceptLabel*
    (group_concat(DISTINCT concat(?prefLabelm,"@",lang(?prefLabelm));separator="NEWLINE") AS ?prefLabelms)
    (group_concat(DISTINCT concat(?prefLabel,"@",lang(?prefLabel));separator="NEWLINE") AS ?prefLabels)
    (group_concat(DISTINCT concat(?altLabelm,"@",lang(?altLabelm));separator="NEWLINE") AS ?altLabelms)
    (group_concat(DISTINCT concat(?altLabel,"@",lang(?altLabel));separator="NEWLINE") AS ?altLabels)
    (group_concat(DISTINCT concat(?def1,"@",lang(?def1));separator="NEWLINENEWLINE") AS ?defs1)
    (group_concat(DISTINCT concat(?def2,"@",lang(?def2));separator="NEWLINENEWLINE") AS ?defs2)
    WHERE
    {
        GRAPH ?graph { ?graph dcterms:subject "MEDICAL" }
        GRAPH ?graph
        {
            {
                SELECT DISTINCT ?concept ?prefLabelm ?altLabelm WHERE
                {
                    { ?concept skos:prefLabel ?prefLabelm
                      FILTER ( (lang(?prefLabelm) = "fi" || lang(?prefLabelm) = "la-FI") && REGEX(?prefLabelm, "culo", "i")) }
                    UNION
                    { ?concept skos:altLabel ?altLabelm
                      FILTER ( (lang(?altLabelm) = "fi" || lang(?altLabelm) = "la-FI") && REGEX(?altLabelm, "culo", "i")) }
                }
                limit 200
            }
            ?concept skos:prefLabel ?prefLabel .

            OPTIONAL { ?concept skos:altLabel ?altLabel }
            OPTIONAL { ?concept skos:definition ?def1 . OPTIONAL { ?def1 rdf:value ?def2 } }
            *OPTIONAL { ?topConcept skos:topConceptOf ?graph . ?concept skos:broader* ?topConcept . ?topConcept skos:prefLabel ?topConceptLabel FILTER ( lang(?topConceptLabel) = "fi") }*

You can format queries with qparse or use sparql.org.

First:
    GRAPH ?graph
    {
...
        OPTIONAL
          { ?concept (skos:broader)* ?topConcept .
            ?topConcept  skos:prefLabel  ?topConceptLabel
            FILTER ( lang(?topConceptLabel) = "fi" )
          }


Second:
   GRAPH ?graph
   {
...
        OPTIONAL
          { ?topConcept  skos:topConceptOf  ?graph .
            ?concept (skos:broader)* ?topConcept .
            ?topConcept  skos:prefLabel  ?topConceptLabel
            FILTER ( lang(?topConceptLabel) = "fi" )
          }

so the 2nd query does an extra
"?topConcept  skos:topConceptOf  ?graph ."
before the path.

Try putting it after.

Yes, that one reduced the time to 10 seconds.

Also, use a version with the path performance fix.

The new version was maybe half a second faster.

Okay, so maybe SPARQL is not such an optimized language yet? With a script I can query all the broader concepts and get the top-level one in one second. Of course that is not as elegant and is intuitively more work, but it gets the job done.

Br,
Mikael
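
For reference, the script-side approach mentioned above (fetch the skos:broader links, then climb to the top in application code) might look roughly like the sketch below. It is a minimal illustration, not the script actually used: the function name top_concepts, the BROADER dictionary and the ex: identifiers are all hypothetical, and it assumes the broader links have already been fetched into memory.

```python
from collections import deque


def top_concepts(concept, broader):
    """Return the ancestors of `concept` (or the concept itself) that have
    no broader concept, following child -> parents links in `broader`."""
    seen = {concept}
    queue = deque([concept])
    tops = set()
    while queue:
        node = queue.popleft()
        parents = broader.get(node, ())
        if not parents:
            # Nothing broader than this node: it is a top-level concept.
            tops.add(node)
        for parent in parents:
            if parent not in seen:  # guard against cycles and repeated work
                seen.add(parent)
                queue.append(parent)
    return tops


# Hypothetical toy data: each concept maps to its skos:broader parents.
BROADER = {
    "ex:c": ["ex:b"],
    "ex:b": ["ex:a"],
}

print(top_concepts("ex:c", BROADER))
```

A set is returned rather than a single value because a SKOS concept may have more than one broader concept, so more than one top-level concept can be reachable.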


Because you have a GROUP BY and DISTINCT, there may be several matches, but this is hidden.
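
To illustrate why several matches can hide behind the grouping: skos:broader* is a zero-or-more path, so ?topConcept can bind to the concept itself and to every ancestor, not just the topmost one. The sketch below mimics that on a hypothetical three-level hierarchy. The names broader_star, BROADER, LABEL and TOP are illustrative only, and the sorted join merely stands in for group_concat, whose ordering SPARQL does not specify.

```python
def broader_star(concept, broader):
    """Nodes reachable from `concept` via zero or more broader steps."""
    reached = [concept]  # the zero-length path matches the concept itself
    i = 0
    while i < len(reached):
        for parent in broader.get(reached[i], ()):
            if parent not in reached:
                reached.append(parent)
        i += 1
    return reached


# Hypothetical toy data: ex:c has broader ex:b, which has broader ex:a.
BROADER = {"ex:c": ["ex:b"], "ex:b": ["ex:a"]}
LABEL = {"ex:a": "A", "ex:b": "B", "ex:c": "C"}
TOP = {"ex:a"}  # stands in for "?topConcept skos:topConceptOf ?graph"

# Without the top-concept restriction: one row per node on the path.
rows = [("ex:c", LABEL[t]) for t in broader_star("ex:c", BROADER)]

# Grouping with a DISTINCT concatenation folds those rows into one value.
grouped = "/".join(sorted({label for _, label in rows}))

# With the restriction, only the top concept survives.
rows_top = [("ex:c", LABEL[t]) for t in broader_star("ex:c", BROADER) if t in TOP]

print(len(rows), grouped, rows_top)
```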

Also, GRAPH ?graph { ... OPTIONAL { use of ?graph .... } ... }

means that the engine may have to separate those two uses of ?graph at the point the BGP executes and sort it out later. The earlier
GRAPH ?graph { ?graph dcterms:subject "MEDICAL" }
may not have much effect in limiting the execution search.

    Andy




        }
    }
GROUP BY ?graph ?concept *?topConceptLabel*



The speed issue seems totally illogical to me, but there must be a
correct way to perform the latter query, then?

Br,
Mikael


--
www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's 
Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: [email protected]
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Linnankatu 10 A
FI-20100 Turku
FINLAND
