OK, I think the problem is that my prebuilt TDB was loaded from RDF/XML
rather than Turtle. Is there a way to run textindexer on a TDB that was
not built from Turtle? If not, I will convert the data to Turtle using
Jena and then reload the TDB.


On Tue, Aug 20, 2013 at 4:02 PM, Andy Seaborne <[email protected]> wrote:

> Your mails are corrupted - see below.
>
> Works for me.  I fixed up your data [newlines in bad places]
> (please - use Turtle!), loaded it with tdbloader (one warning) and ran
> textindexer.
>
> 20:53:07.581 INFO  textindexer          :: 10 (10 per second) properties indexed
>
>         Andy
>
> On 20/08/13 18:15, Brad Moran wrote:
>
>> I am running jena.textindexer from bash on a Jena TDB that is similar
>> to:
>>
>
> "similar to"?
>
>  *
>> *
>>
>
> Too many stars.
>
> No rdf:RDF
>
>>      xmlns:mms="http://rdf.cdisc.org/mms#"
>>      xmlns:sdtm="http://rdf.cdisc.org/sdtm-1-2/std#"
>>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>      xmlns:skos="http://www.w3.org/2004/02/skos/core#"
>>      xmlns:sdtmigs="http://rdf.cdisc.org/sdtmig-3-1-2/schema#"
>>      xmlns:owl="http://www.w3.org/2002/07/owl#"
>>      xmlns:dc="http://purl.org/dc/elements/1.1/"
>>      xmlns="http://rdf.cdisc.org/sdtmig-3-1-2/std#"
>>      xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
>>      xmlns:sdtms="http://rdf.cdisc.org/sdtm-1-2/schema#"
>>      xmlns:sdtmct="http://rdf.cdisc.org/sdtm/ct#"
>>      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>>    xml:base="http://rdf.cdisc.org/sdtmig-3-1-2/std">
>>
>> <mms:DataElement rdf:ID="Column.EX.DOMAIN">
>>      <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>      >DOMAIN</mms:dataElementName>
>>      <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>      >Domain Abbreviation</mms:dataElementLabel>
>>      <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger"
>>      >2</mms:ordinal>
>>      <mms:broader rdf:resource="http://rdf.cdisc.org/sdtm-1-2/std#DE.Identifier.DOMAIN"/>
>>      <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RequiredVariable"/>
>>      <sdtmigs:references rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>      >SDTM 2.2.4, SDTMIG 4.1.2.2, SDTMIG Appendix C2</sdtmigs:references>
>>      <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>      >Two-character abbreviation for the domain.</mms:dataElementDescription>
>>      <mms:context rdf:resource="#Table.EX"/>
>>      <sdtmigs:controlledTermsOrFormat rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>      >EX</sdtmigs:controlledTermsOrFormat>
>>      <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
>>      <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName"
>>      >xsd:string</mms:dataElementType>
>>      <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.IdentifierVariable"/>
>>    </mms:DataElement>
>>
>> However, I receive:
>>
>>       INFO  0 (0 per second) properties indexed
>>
>> My assembly file is:
>>
>> @prefix :        <http://localhost/jena_example/#> .
>> @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>> @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
>> @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
>> @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
>> @prefix text:    <http://jena.apache.org/text#> .
>> @prefix mms:     <http://rdf.cdisc.org/mms#> .
>> @prefix sdtms:   <http://rdf.cdisc.org/sdtm-1-2/schema#> .
>> @prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .
>>
>> ## Example of a TDB dataset and text index
>> ## Initialize TDB
>> [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
>> tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
>> tdb:GraphTDB    rdfs:subClassOf  ja:Model .
>>
>> ## Initialize text query
>> [] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
>> # A TextDataset is a regular dataset with a text index.
>> text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
>> # Lucene index
>> text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
>>
>> ## ---------------------------------------------------------------
>> ## This URI must be fixed - it's used to assemble the text dataset.
>>
>> :text_dataset rdf:type     text:TextDataset ;
>>      text:dataset   <#dataset> ;
>>      text:index     <#indexLucene> ;
>>      .
>>
>> # A TDB dataset used for RDF storage
>> <#dataset> rdf:type      tdb:DatasetTDB ;
>>      tdb:location "tdb" ;
>>      .
>>
>> # Text index description
>> <#indexLucene> a text:TextIndexLucene ;
>>      text:directory <file:luceneIndexes> ;
>>      text:entityMap <#entMap> ;
>>      .
>>
>> # Mapping in the index
>> # URI stored in field "uri"
>> # rdfs:label is mapped to field "text"
>> <#entMap> a text:EntityMap ;
>>      text:entityField      "uri" ;
>>      text:defaultField     "text" ;
>>      text:map (
>>           [ text:field "text" ; text:predicate mms:dataElementName ]
>>           [ text:field "text" ; text:predicate mms:dataElementDescription ]
>>           [ text:field "text" ; text:predicate mms:dataElementLabel ]
>>           [ text:field "text" ; text:predicate mms:dataElementType ]
>>           [ text:field "text" ; text:predicate mms:ordinal ]
>>           [ text:field "text" ; text:predicate mms:broader ]
>>           [ text:field "text" ; text:predicate sdtms:dataElementType ]
>>           [ text:field "text" ; text:predicate sdtms:dataElementRole ]
>>           [ text:field "text" ; text:predicate sdtms:dataElementCompliance ]
>>           [ text:field "text" ; text:predicate sdtmigs:references ]
>>           ) .
>>
>>
>>
>> It looks like I have all the properties in my assembly file that could
>> appear in the nodes of my TDB, so this leads me to believe the problem is
>> with my assembly file. I tried indexing just one property and the same
>> problem occurred. When I run the text indexer, segments_1 and segments.gen
>> appear in my index directory, but each file holds just a couple of
>> seemingly random characters. Just to be sure nothing was indexed, I tried
>> running a text query on it with Java:
>>
>>                  QueryExecution qExec = QueryExecutionFactory.create(
>>                          "PREFIX text: <http://jena.apache.org/text#> "
>>                          + "PREFIX mms: <http://rdf.cdisc.org/mms#> "
>>                          + "SELECT * WHERE { ?s text:query (mms:dataElementName 'AE') }", ds);
>>
>>                  ResultSet rs = qExec.execSelect();
>>
>>
>> I get these warnings and an empty result set:
>>
>>          WARN  o.apache.jena.query.text.TextQueryPF - Failed to find the text index : tried context and as a text-enabled dataset
>>          WARN  o.apache.jena.query.text.TextQueryPF - No text index - no text search performed
>>
>> I am thinking that the problem is with the assembler file because I am
>> new to creating these. However, I cannot find much online about this
>> problem. Any suggestions?
>>
>>
>
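As a side note on the segments_1 / segments.gen observation in the quoted message: one rough way to check whether the indexer wrote anything, without running a query, is to look for files other than the segments bookkeeping files in the index directory. The sketch below is only a heuristic under the assumption that the index uses Lucene 4.x file naming, where a committed but empty index holds just segments_N and segments.gen. The class name LuceneIndexCheck and the method looksEmpty are made up for illustration and are not part of Jena or Lucene.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a Jena or Lucene API.
class LuceneIndexCheck {

    // Heuristic (an assumption about Lucene 4.x file naming): the index is
    // treated as empty when the directory contains nothing besides
    // segments* files and an optional write.lock.
    static boolean looksEmpty(Path indexDir) throws IOException {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                boolean bookkeeping =
                        name.startsWith("segments") || name.equals("write.lock");
                if (!bookkeeping) {
                    // Any other file (for example _0.cfs) means a segment was written.
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // "luceneIndexes" is the text:directory value from the assembler file above.
        Path dir = Paths.get(args.length > 0 ? args[0] : "luceneIndexes");
        System.out.println(dir + (looksEmpty(dir)
                ? ": no segment data files found"
                : ": segment data files present"));
    }
}
```

If this reports no segment data files, that would be consistent with the "0 (0 per second) properties indexed" line from textindexer; it says nothing about why the indexer found no matching properties.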
