Hello Mikael,

attachments do not work, so I'll post a config that works for me at the
end of the message. You have to adapt the paths for

tdb:location

and

text:directory

And as said before, you have to create the text index once.

Note, as I also said in my previous response, your query will work only
for labels that contain the whole word (modulo what the Lucene analyzer
does during indexing and query parsing). Substring search needs
wildcards like "*", e.g. "Lei*" should also return labels like "Leipzig"
and "Leibnitz".

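To illustrate the wildcard point, here is a sketch of your query from
earlier in the thread with a trailing "*" added to the search term. I
have not run it against your data, and what it matches still depends on
the analyzers and language settings in the config, so treat it as a
starting point rather than a verified query.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX text: <http://jena.apache.org/text#>

SELECT ?concept ?prefLabel
WHERE {
  GRAPH <http://www.yso.fi/onto/mesh/> {
    # "medi*" asks Lucene for indexed terms that start with "medi"
    ?concept text:query (skos:prefLabel "medi*") .
    # This fetches every prefLabel of each matching concept,
    # not only the label that matched the text query
    ?concept skos:prefLabel ?prefLabel .
  }
}
LIMIT 10
```

If you only want the labels that actually matched, you would still need
to filter ?prefLabel afterwards, as discussed earlier in the thread.
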


@prefix :        <http://localhost/jena_example/#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .

## Example of a TDB dataset and text index
## Initialize TDB
[] ja:loadClass "org.apache.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

## Initialize text query
[] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
# Lucene index
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .


## ---------------------------------------------------------------


:text_dataset rdf:type     text:TextDataset ;
    text:dataset   :my_dataset ;
    text:index     <#indexLucene> ;
    .

# A TDB dataset used for RDF storage
:my_dataset rdf:type      tdb:DatasetTDB ;
    tdb:location "/tmp/tdb-dataset/" ;
#    tdb:unionDefaultGraph true ; # Optional
    .

# Text index description
<#indexLucene> a text:TextIndexLucene ;
    text:directory <file:/tmp/tdb-lucene-index> ;
    text:entityMap <#entMap> ;
    text:storeValues true ;
    text:analyzer [ a text:StandardAnalyzer ] ;
    text:queryAnalyzer [ a text:KeywordAnalyzer ] ;
    text:queryParser text:AnalyzingQueryParser ;
    text:multilingualSupport true ;
 .

<#entMap> a text:EntityMap ;
    text:defaultField     "label" ;
    text:entityField      "uri" ;
    text:uidField         "uid" ;
    text:langField        "lang" ;
    text:graphField       "graph" ;
    text:map (
         [ text:field "label" ;
           text:predicate skos:prefLabel ]
         ) .

<#service> rdf:type fuseki:Service ;
    fuseki:name                     "/ds" ;      # http://host:port/ds
    fuseki:serviceQuery             "query" ;    # SPARQL query service
    fuseki:serviceReadGraphStore    "data" ;     # SPARQL Graph store protocol (read only)
    fuseki:dataset           :text_dataset ;
    .

>
> Hi,
>
>
> On 18/01/2019 18:13, Chris Tomlinson wrote:
>> Hi,
>>
>> 1) If you’re using a default config, it does not have a working
>> jena-text configuration. The config will need to  include
>> skos:prefLabel in the entity map.
>>
>> 2) when you change the jena-text in significant ways, such as
>> changing what analyzer is used for a given property and so on, then
>> you’ll need to rebuild the Lucene index via reloading the dataset or
>> using the textIndexer
>> <https://jena.apache.org/documentation/query/text-query.html#building-a-text-index>.
>> I don’t recall this being mentioned as part of your testing
>>
>> 3) Please indicate exactly which item you’re using
>> jena-fuseki-war-3.9.0.war or jena-fuseki-webapp-3.9.0.jar etc, and
>> the config file itself. The error you’ve mentioned previously:
> We are running Fuseki as service
>
> -----
> [Unit]
> Description=Apache Jena Fuseki
>
> [Service]
> Type=simple
> User=fuseki
> #Environment=JAVA_HOME=/usr/lib/jvm/java-8-oracle/
> Environment=FUSEKI_HOME=/home/text/tools/apache-jena-fuseki-3.9.0
> Environment=FUSEKI_BASE=/home/text/tools/apache-jena-fuseki-3.9.0/run
> ExecStart=/usr/bin/java \
>   -Dlog4j.configuration=file:/home/text/tools/apache-jena-fuseki-3.9.0/log4j.properties \
>   -Xmx5600M -jar \
>   /home/text/tools/apache-jena-fuseki-3.9.0/fuseki-server.jar --update \
>   --port 3030  --loc=/home/text/tools/jena_data_test/ /ds
>
> [Install]
> WantedBy=multi-user.target
> -----
>
> All settings are default otherwise, we haven't changed any config file.
>
> Are there some minimal settings to this example config so that I could
> get skos:prefLabel working?
>
> https://jena.apache.org/documentation/query/text-query.html#configuration
>
> So when we have a working configuration/assembler file, all that is
> needed is to build the index
>
> java  -cp  $FUSEKI_HOME/fuseki-server.jar  jena.textindexer 
> --desc=assembler_file ?
>
>
> Thank everyone for the help
>>> Jan 17 17:00:28 semantic-dev java[16800]: [2019-01-17 17:00:28]
>>> Config     INFO  Load configuration:
>>> file:///home/text/tools/apache-jena-fuseki-3.9.0/run/configuration/text_index.ttl
>>>
>>> Jan 17 17:00:28 semantic-dev java[16800]: [2019-01-17 17:00:28]
>>> WebAppContext WARN  Failed startup of context
>>> o.e.j.w.WebAppContext@4159e81b{Apache Jena Fuseki
>>> Server,/,file:///home/text/tools/apache-jena-fuseki-3.9.0/webapp/,UNAVAILABLE}
>>> Jan 17 17:00:28 semantic-dev java[16800]:         at
>>> org.apache.jena.fuseki.build.FusekiConfig.readAssemblerFile(FusekiConfig.java:148)
>> suggests to me that something in the config file is confusing the
>> readAssemblerFile. It doesn’t look like it’s failing in the reading
>> the jena-text portion of the config.
>>
>> If http://api.finto.fi/download/mesh/mesh-skos.ttl is the dataset, then
>> can you cut it down to just a small test case with some concepts with
>> “medi” and a few without? That along with the other information
>> should help move this further along..
>>
>> 4) Your query:
>>
>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>> PREFIX text: <http://jena.apache.org/text#>
>>> SELECT *
>>> WHERE
>>> {
>>>    GRAPH <http://www.yso.fi/onto/mesh/>
>>>    {
>>>      ?concept text:query (skos:prefLabel "medi") .
>>>      ?concept skos:prefLabel ?prefLabel .
>>>
>>>      # FILTER (  REGEX(?prefLabel, "\\bmedi", "i"))
>>>    }
>>> }
>>> limit 10
>> might effectively just be executing:
>>
>>> ?concept skos:prefLabel ?prefLabel .
>> if there is actually no jena-text config - I haven’t checked what
>> happens when there is no TextIndex configured and the text:query is
>> invoked, but may be a noop
>>
>> Thanks,
>> Chris
>>
>>
>>> On Jan 18, 2019, at 8:08 AM, Mikael Pesonen
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> On 18/01/2019 13:40, Andy Seaborne wrote:
>>>>
>>>> On 17/01/2019 15:45, Mikael Pesonen wrote:
>>>>>
>>>>> On 17/01/2019 17:38, Andy Seaborne wrote:
>>>>>>
>>>>>> On 17/01/2019 12:51, Mikael Pesonen wrote:
>>>>>>>
>>>>>>> On 17/01/2019 13:58, Andy Seaborne wrote:
>>>>>>>>
>>>>>>>> On 16/01/2019 12:50, Mikael Pesonen wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm trying to get text search to work. SPARQL REGEX takes a few
>>>>>>>>> seconds to finish, so I'm hoping this would be faster. The
>>>>>>>>> application is term search using a SKOS ontology.
>>>>>>>>>
>>>>>>>>>    First tested if it's enabled by default
>>>>>>>>>
>>>>>>>>>    ?concept text:query (skos:prefLabel "medi") .
>>>>>>>>>     ?concept skos:prefLabel ?prefLabel
>>>>>>>>>
>>>>>>>>> That returns all concepts so I guess it's not enabled.
>>>>>>>> If it returns all concepts, the first line matched (otherwise
>>>>>>>> you get none). If so, there is a text index and "medi" (case
>>>>>>>> insensitive) matches Lucene rules, everything.
>>>>>>> What does this mean then, why is it matching everything?
>>>>>> If zero matches, you don't get to ?concept skos:prefLabel
>>>>>> ?prefLabel (if the text index is correct)
>>>>>>
>>>>>> The query above, if the index is setup correctly,  gets all
>>>>>> concepts where any skos:prefLabel matches "medi" (not just at the
>>>>>> start), then gets all skos:prefLabel for those concepts. That
>>>>>> does not mean ?prefLabel only matches "medi"
>>>>>>
>>>>>> :c skos:prefLabel "medi" ;
>>>>>>     skos:prefLabel "Other" .
>>>>>>
>>>>>> will return 2 matches including ?prefLabel="Other"
>>>>> Yes that is how I understood it. But  ?concept text:query
>>>>> (skos:prefLabel "medi")  returns all concepts, also those that
>>>>> don't have any label having "medi".
>>>> Then I don't understand what is going on.
>>>>
>>>> Do you have a complete, minimal example that someone can use to
>>>> recreate the situation?
>>>>
>>> This is the query:
>>>
>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>> PREFIX text: <http://jena.apache.org/text#>
>>> SELECT *
>>> WHERE
>>> {
>>>    GRAPH <http://www.yso.fi/onto/mesh/>
>>>    {
>>>      ?concept text:query (skos:prefLabel "medi") .
>>>      ?concept skos:prefLabel ?prefLabel .
>>>
>>>      # FILTER (  REGEX(?prefLabel, "\\bmedi", "i"))
>>>    }
>>> }
>>> limit 10
>>>
>>> and graph is dump copied from here:  https://finto.fi/mesh/en/
>>> end of page "Download this vocabulary"
>>>
>>> So to make clear, we have made zero configuration on jena/fuseki,
>>> all is default from 3.9.0 package.
>>>> Andy
>>> -- 
>>> Lingsoft - 30 years of Leading Language Management
>>>
>>> www.lingsoft.fi
>>>
>>> Speech Applications - Language Management - Translation - Reader's
>>> and Writer's Tools - Text Tools - E-books and M-books
>>>
>>> Mikael Pesonen
>>> System Engineer
>>>
>>> e-mail: [email protected]
>>> Tel. +358 2 279 3300
>>>
>>> Time zone: GMT+2
>>>
>>> Helsinki Office
>>> Eteläranta 10
>>> FI-00130 Helsinki
>>> FINLAND
>>>
>>> Turku Office
>>> Kauppiaskatu 5 A
>>> FI-20100 Turku
>>> FINLAND
>>>
>>
>
-- 
Lorenz Bühmann
AKSW group, University of Leipzig
Group: http://aksw.org - semantic web research center
