*I am running jena.textindexer from bash on a Jena TDB dataset whose data is similar to:*

<rdf:RDF
    xmlns:mms="http://rdf.cdisc.org/mms#"
    xmlns:sdtm="http://rdf.cdisc.org/sdtm-1-2/std#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:sdtmigs="http://rdf.cdisc.org/sdtmig-3-1-2/schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns="http://rdf.cdisc.org/sdtmig-3-1-2/std#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:sdtms="http://rdf.cdisc.org/sdtm-1-2/schema#"
    xmlns:sdtmct="http://rdf.cdisc.org/sdtm/ct#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xml:base="http://rdf.cdisc.org/sdtmig-3-1-2/std">

  <mms:DataElement rdf:ID="Column.EX.DOMAIN">
    <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
    >DOMAIN</mms:dataElementName>
    <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
    >Domain Abbreviation</mms:dataElementLabel>
    <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger"
    >2</mms:ordinal>
    <mms:broader rdf:resource="http://rdf.cdisc.org/sdtm-1-2/std#DE.Identifier.DOMAIN"/>
    <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RequiredVariable"/>
    <sdtmigs:references rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
    >SDTM 2.2.4, SDTMIG 4.1.2.2, SDTMIG Appendix C2</sdtmigs:references>
    <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
    >Two-character abbreviation for the domain.</mms:dataElementDescription>
    <mms:context rdf:resource="#Table.EX"/>
    <sdtmigs:controlledTermsOrFormat rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
    >EX</sdtmigs:controlledTermsOrFormat>
    <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
    <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName"
    >xsd:string</mms:dataElementType>
    <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.IdentifierVariable"/>
  </mms:DataElement>
</rdf:RDF>

*However, I receive:*

     INFO  0 (0 per second) properties indexed

*My assembler file is:*

@prefix :        <http://localhost/jena_example/#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix mms:     <http://rdf.cdisc.org/mms#> .
@prefix sdtms:   <http://rdf.cdisc.org/sdtm-1-2/schema#> .
@prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .

## Example of a TDB dataset and text index
## Initialize TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

## Initialize text query
[] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
# Lucene index
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .

## ---------------------------------------------------------------
## This URI must be fixed - it's used to assemble the text dataset.

:text_dataset rdf:type     text:TextDataset ;
    text:dataset   <#dataset> ;
    text:index     <#indexLucene> ;
    .

# A TDB dataset used for RDF storage
<#dataset> rdf:type      tdb:DatasetTDB ;
    tdb:location "tdb" ;
    .

# Text index description
<#indexLucene> a text:TextIndexLucene ;
    text:directory <file:luceneIndexes> ;
    text:entityMap <#entMap> ;
    .

# Mapping in the index
# URI stored in field "uri"
# rdfs:label is mapped to field "text"
<#entMap> a text:EntityMap ;
    text:entityField      "uri" ;
    text:defaultField     "text" ;
    text:map (
         [ text:field "text" ; text:predicate mms:dataElementName ]
         [ text:field "text" ; text:predicate mms:dataElementDescription ]
         [ text:field "text" ; text:predicate mms:dataElementLabel ]
         [ text:field "text" ; text:predicate mms:dataElementType ]
         [ text:field "text" ; text:predicate mms:ordinal ]
         [ text:field "text" ; text:predicate mms:broader ]
         [ text:field "text" ; text:predicate sdtms:dataElementType ]
         [ text:field "text" ; text:predicate sdtms:dataElementRole ]
         [ text:field "text" ; text:predicate sdtms:dataElementCompliance ]
         [ text:field "text" ; text:predicate sdtmigs:references ]
         ) .
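
One way to cross-check the predicates listed in the entity map against those that actually occur in the data is sketched below. This is only a minimal illustration using the JDK's XML parser, not Jena: the class `PredicateCheck`, its methods, and the cut-down `SAMPLE` string are hypothetical names introduced here, and the sketch assumes the simple RDF/XML layout shown earlier (`rdf:RDF`, then typed nodes, then property elements).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Jena: lists the predicate URIs used in an
// RDF/XML snippet so they can be compared with the text:predicate entries.
class PredicateCheck {

    // Cut-down copy of the data (two properties only), for illustration.
    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
      + " xmlns:mms=\"http://rdf.cdisc.org/mms#\""
      + " xmlns:sdtmigs=\"http://rdf.cdisc.org/sdtmig-3-1-2/schema#\">"
      + "<mms:DataElement rdf:ID=\"Column.EX.DOMAIN\">"
      + "<mms:dataElementName>DOMAIN</mms:dataElementName>"
      + "<sdtmigs:controlledTermsOrFormat>EX</sdtmigs:controlledTermsOrFormat>"
      + "</mms:DataElement></rdf:RDF>";

    // Returns namespace URI + local name of every property element, assuming
    // the layout rdf:RDF > typed node > property elements.
    static Set<String> predicatesIn(String rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getNamespaceURI()
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(rdfXml.getBytes(StandardCharsets.UTF_8)));
            Set<String> predicates = new TreeSet<>();
            NodeList subjects = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < subjects.getLength(); i++) {
                NodeList props = subjects.item(i).getChildNodes();
                for (int j = 0; j < props.getLength(); j++) {
                    Node p = props.item(j);
                    if (p.getNodeType() == Node.ELEMENT_NODE) {
                        predicates.add(p.getNamespaceURI() + p.getLocalName());
                    }
                }
            }
            return predicates;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse RDF/XML", e);
        }
    }

    // Predicates present in the data but absent from the entity map.
    static Set<String> unmapped(Set<String> inData, Set<String> mapped) {
        Set<String> missing = new TreeSet<>(inData);
        missing.removeAll(mapped);
        return missing;
    }

    public static void main(String[] args) {
        // Full URIs of the text:predicate entries (only one listed here).
        Set<String> mapped = new TreeSet<>(Arrays.asList(
            "http://rdf.cdisc.org/mms#dataElementName"));
        System.out.println(unmapped(predicatesIn(SAMPLE), mapped));
    }
}
```

A check like this only confirms that the prefixes in the assembler file expand to the same URIs as the predicates in the data; it says nothing about whether the indexer is reading the intended TDB location.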



*It looks like I have all the properties in my assembler file that could
appear in the nodes of my TDB. So this leads me to believe the problem is
with my assembler file. I tried indexing just one property and the same
problem occurred. When I run the text indexer, segments_1 and segments.gen
appear in my index directory, however there are just a couple of random
characters in each file. Just to be sure nothing was indexed I tried
running a text query on it with Java:*

                QueryExecution qExec = QueryExecutionFactory.create(
                        "PREFIX text: <http://jena.apache.org/text#> PREFIX mms: <http://rdf.cdisc.org/mms#> "
                        + "SELECT * WHERE{?s text:query (mms:dataElementName 'AE')}", ds);

                ResultSet rs = qExec.execSelect();


*I get these warnings and an empty result set:*

        WARN  o.apache.jena.query.text.TextQueryPF - Failed to find the text index : tried context and as a text-enabled dataset
        WARN  o.apache.jena.query.text.TextQueryPF - No text index - no text search performed

*I suspect that the problem is with the assembler file because I am new
to creating these. However, I cannot find much online about this problem.
Any suggestions?*