Re: Text search (lucene) with tdbquery

Samur Araujo Thu, 22 Dec 2016 04:04:25 -0800

Thank you Andy, this makes things clear.

As I understand it, the best way is to use the Fuseki engine and access the
data via the SPARQL service.


It would be ideal to have the tdbquery command working with the indexes as
well.
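
For anyone following the thread, a text query against this setup might look
like the sketch below. It is only an illustration: the search term "example",
the limit of 10 and the variable names are made up, and it assumes the
rdfs:label to "text" field mapping from the assembler quoted further down.
text:query is the jena-text property function; it only returns index hits
when the query runs against the text dataset rather than the plain TDB one.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Hypothetical search.rq: find subjects whose rdfs:label matches "example".
SELECT ?s ?label
WHERE {
  # Look up the Lucene index: (indexed property, query string, max hits)
  ?s text:query (rdfs:label "example" 10) .
  # Fetch the label itself from the RDF data
  ?s rdfs:label ?label .
}
```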

Best,
Samur

On 22 December 2016 at 12:43, Andy Seaborne <[email protected]> wrote:

>
>
> On 22/12/16 09:18, Samur Araujo wrote:
>
>> Hi all, Is it possible to use tdbquery for textsearch?
>>
>> I have configured, loaded and indexed the data using tdbloader and
>> jena.textindexer.
>>
>> However, when I use tdbquery to query the database it seems to ignore the
>> lucene index.
>>
>> I use the command:
>> tdbquery --desc textsearch.ttl --query search.rq
>>
>
> By default, the class files for jena-text will not be available to the
> command.  Did you not get some kind of warning? (which version are you
> using?)
>
> However, the main issue is that there are two datasets in the assembler
> file.  "tdbquery --desc" will look for the TDB dataset.
>
> I think it needs some coding to run a text backed query. Use
> TextDatasetFactory.create(filename) to build from the assembler.
>
> (Maybe there is a better way someone can point out - I have not used
> jena-text in a while).
>
>         Andy
>
>
>> Do I need to provide any extra configuration? Do I need to compile jena
>> with special features?
>>
>
> No - but the TDB command line tools are not sufficient.
>
>
>
>> The assembler is below:
>> @prefix :        <http://localhost/jena_example/#> .
>> @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>> @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
>> @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
>> @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
>> @prefix text:    <http://jena.apache.org/text#> .
>>
>> ## Example of a TDB dataset and text index
>> ## Initialize TDB
>> [] ja:loadClass "org.apache.jena.tdb.TDB" .
>> tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
>> tdb:GraphTDB    rdfs:subClassOf  ja:Model .
>>
>> ## Initialize text query
>> [] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
>> # A TextDataset is a regular dataset with a text index.
>> text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
>> # Lucene index
>> text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
>> # Solr index
>> text:TextIndexSolr    rdfs:subClassOf   text:TextIndex .
>>
>> ## ---------------------------------------------------------------
>> ## This URI must be fixed - it's used to assemble the text dataset.
>>
>> :text_dataset rdf:type     text:TextDataset ;
>>     text:dataset   <#dataset> ;
>>     text:index     <#indexLucene> ;
>>     .
>>
>> # A TDB dataset used for RDF storage
>> <#dataset> rdf:type      tdb:DatasetTDB ;
>>     tdb:location "DB2" ;
>>     tdb:unionDefaultGraph true ; # Optional
>>     .
>>
>> # Text index description
>> <#indexLucene> a text:TextIndexLucene ;
>>     text:directory <file:Lucene2> ;
>>     ##text:directory "mem" ;
>>     text:entityMap <#entMap> ;
>>     .
>>
>> # Mapping in the index
>> # URI stored in field "uri"
>> # rdfs:label is mapped to field "text"
>> <#entMap> a text:EntityMap ;
>>     text:entityField      "uri" ;
>>     text:defaultField     "text" ;
>>     text:map (
>>          [ text:field "text" ; text:predicate rdfs:label ]
>>          ) .
>>
>>
>>


-- 
Senior Data Scientist
Geophy
www.geophy.com

Nieuwe Plantage 54-55
2611XK  Delft
+31 (0)70 7640725

1 Fore Street
EC2Y 9DT  London
+44 (0)20 37690760
