Your mails are corrupted - see below.
Works for me. I fixed up your data [newlines in bad places]
(please - use Turtle!), loaded it with tdbloader (one warning) and ran
textindexer.
20:53:07.581 INFO textindexer :: 10 (10 per second) properties indexed
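For reference, the kind of fix-up the data needed can be sketched in Java as below. This is only an illustration, not necessarily how the file was actually repaired: it assumes the only damage is a line break straight after the opening quote of an attribute value (as in the rdf:datatype=" and rdf:resource=" lines quoted below), and the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper: rejoins attribute values that a mail client wrapped
// onto the next line right after the opening quote.
class FixWrappedAttributes {
    // An opening attribute quote, then optional whitespace, a line break,
    // and any indentation on the following line.
    private static final Pattern WRAPPED = Pattern.compile("=\"\\s*\\R\\s*");

    static String repair(String xml) {
        // Drop the line break so the IRI starts immediately after the quote.
        return WRAPPED.matcher(xml).replaceAll("=\"");
    }

    public static void main(String[] args) {
        String broken = "<mms:ordinal rdf:datatype=\"\n"
                + "http://www.w3.org/2001/XMLSchema#positiveInteger\"\n"
                + ">2</mms:ordinal>";
        System.out.println(repair(broken));
    }
}
```

The line break before the closing > is left alone, since whitespace there is legal XML; only the break inside the attribute value changes the IRI.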
Andy
On 20/08/13 18:15, Brad Moran wrote:
*I am running jena.textindexer from bash on a jena TDB that is similar to:*
"similar to"?
*
*
Too many stars.
No rdf:RDF
xmlns:mms="http://rdf.cdisc.org/mms#"
xmlns:sdtm="http://rdf.cdisc.org/sdtm-1-2/std#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:skos="http://www.w3.org/2004/02/skos/core#"
xmlns:sdtmigs="http://rdf.cdisc.org/sdtmig-3-1-2/schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns="http://rdf.cdisc.org/sdtmig-3-1-2/std#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:sdtms="http://rdf.cdisc.org/sdtm-1-2/schema#"
xmlns:sdtmct="http://rdf.cdisc.org/sdtm/ct#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://rdf.cdisc.org/sdtmig-3-1-2/std">
<mms:DataElement rdf:ID="Column.EX.DOMAIN">
<mms:dataElementName rdf:datatype="
http://www.w3.org/2001/XMLSchema#string"
>DOMAIN</mms:dataElementName>
<mms:dataElementLabel rdf:datatype="
http://www.w3.org/2001/XMLSchema#string"
>Domain Abbreviation</mms:dataElementLabel>
<mms:ordinal rdf:datatype="
http://www.w3.org/2001/XMLSchema#positiveInteger"
>2</mms:ordinal>
<mms:broader rdf:resource="
http://rdf.cdisc.org/sdtm-1-2/std#DE.Identifier.DOMAIN"/>
<sdtms:dataElementCompliance rdf:resource="
http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RequiredVariable"/>
<sdtmigs:references rdf:datatype="
http://www.w3.org/2001/XMLSchema#string"
>SDTM 2.2.4, SDTMIG 4.1.2.2, SDTMIG Appendix C2</sdtmigs:references>
<mms:dataElementDescription rdf:datatype="
http://www.w3.org/2001/XMLSchema#string"
>Two-character abbreviation for the domain.</mms:dataElementDescription>
<mms:context rdf:resource="#Table.EX"/>
<sdtmigs:controlledTermsOrFormat rdf:datatype="
http://www.w3.org/2001/XMLSchema#string"
>EX</sdtmigs:controlledTermsOrFormat>
<sdtms:dataElementType rdf:resource="
http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
<mms:dataElementType rdf:datatype="
http://www.w3.org/2001/XMLSchema#QName"
>xsd:string</mms:dataElementType>
<sdtms:dataElementRole rdf:resource="
http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.IdentifierVariable"/>
</mms:DataElement>
*However, I receive:*
INFO 0 (0 per second) properties indexed
*My assembly file is:*
@prefix : <http://localhost/jena_example/#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text: <http://jena.apache.org/text#> .
@prefix mms: <http://rdf.cdisc.org/mms#> .
@prefix sdtms: <http://rdf.cdisc.org/sdtm-1-2/schema#> .
@prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .
## Example of a TDB dataset and text index
## Initialize TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
## Initialize text query
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset rdfs:subClassOf ja:RDFDataset .
# Lucene index
text:TextIndexLucene rdfs:subClassOf text:TextIndex .
## ---------------------------------------------------------------
## This URI must be fixed - it's used to assemble the text dataset.
:text_dataset rdf:type text:TextDataset ;
text:dataset <#dataset> ;
text:index <#indexLucene> ;
.
# A TDB dataset used for RDF storage
<#dataset> rdf:type tdb:DatasetTDB ;
tdb:location "tdb" ;
.
# Text index description
<#indexLucene> a text:TextIndexLucene ;
text:directory <file:luceneIndexes> ;
text:entityMap <#entMap> ;
.
# Mapping in the index
# URI stored in field "uri"
# rdfs:label is mapped to field "text"
<#entMap> a text:EntityMap ;
text:entityField "uri" ;
text:defaultField "text" ;
text:map (
[ text:field "text" ; text:predicate mms:dataElementName ]
[ text:field "text" ; text:predicate mms:dataElementDescription ]
[ text:field "text" ; text:predicate mms:dataElementLabel ]
[ text:field "text" ; text:predicate mms:dataElementType ]
[ text:field "text" ; text:predicate mms:ordinal ]
[ text:field "text" ; text:predicate mms:broader ]
[ text:field "text" ; text:predicate sdtms:dataElementType ]
[ text:field "text" ; text:predicate sdtms:dataElementRole ]
[ text:field "text" ; text:predicate sdtms:dataElementCompliance ]
[ text:field "text" ; text:predicate sdtmigs:references ]
) .
*It looks like I have all the properties in my assembly file that could
appear in the nodes of my TDB. So this leads me to believe the problem is
with my assembly file. I tried indexing just one property and the same
problem occurred. When I run the text indexer, segments_1 and segments.gen
appear in my index directory; however, each file contains only a couple of
random characters. Just to be sure nothing was indexed, I tried
running a text query on it with Java:*
QueryExecution qExec = QueryExecutionFactory.create(
    "PREFIX text: <http://jena.apache.org/text#> PREFIX mms: <http://rdf.cdisc.org/mms#> "
    + "SELECT * WHERE{?s text:query (mms:dataElementName 'AE')}", ds);
ResultSet rs = qExec.execSelect();
*I get these warnings and an empty result set:*
WARN o.apache.jena.query.text.TextQueryPF - Failed to find the text index : tried context and as a text-enabled dataset
WARN o.apache.jena.query.text.TextQueryPF - No text index - no text search performed
*I am thinking that the problem is with the assembler file because I am new
to creating these. However, I cannot find much online about this problem.
Any suggestions?*