OK, I guess the problem was that my TDB was built from RDF/XML; I have a prebuilt TDB that was loaded from RDF/XML. Is there a way to run textindexer on a TDB that was not loaded from Turtle? If not, I will just convert the data to Turtle using Jena and then reload the TDB.
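For the conversion step, here is a minimal sketch of what I have in mind. It assumes Jena 2.x is on the classpath (the `com.hp.hpl.jena` packages, matching the class names in the assembler file quoted below); newer Jena releases use `org.apache.jena` package names instead. The class name `RdfXmlToTurtle` and the file names `data.rdf` and `data.ttl` are placeholders, not names from my actual setup.

```java
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.util.FileManager;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class RdfXmlToTurtle {

    /**
     * Reads an RDF/XML file into an in-memory model and writes it back out
     * as Turtle. Returns the number of triples that were read.
     */
    public static long convert(String rdfXmlPath, String turtlePath) throws IOException {
        // Parse the input explicitly as RDF/XML rather than guessing from the extension.
        Model model = FileManager.get().loadModel(rdfXmlPath, "RDF/XML");
        try (OutputStream out = new FileOutputStream(turtlePath)) {
            model.write(out, "TURTLE");
        }
        return model.size();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder file names; pass real paths as arguments.
        String in = args.length > 0 ? args[0] : "data.rdf";
        String out = args.length > 1 ? args[1] : "data.ttl";
        long triples = convert(in, out);
        System.out.println("Wrote " + triples + " triples to " + out);
    }
}
```

The resulting Turtle file could then be loaded into a fresh TDB directory with tdbloader before running textindexer again. This loads the whole file into memory, so it is only a reasonable approach for data that fits comfortably in the heap.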
On Tue, Aug 20, 2013 at 4:02 PM, Andy Seaborne <[email protected]> wrote:

> Your mails are corrupted - see below.
>
> Works for me. I fixed up your data [newlines in bad places]
> (please - use Turtle!), loaded it with tdbloader (one warning) and ran
> textindexer.
>
> 20:53:07.581 INFO textindexer :: 10 (10 per second) properties indexed
>
> Andy
>
> On 20/08/13 18:15, Brad Moran wrote:
>
>> *I am running jena.textindexer from bash on a jena TDB that is similar
>> to:*
>
> "similar to"?
>
>> *
>> *
>
> Too many stars.
>
> No rdf:RDF
>
>> xmlns:mms="http://rdf.cdisc.org/mms#"
>> xmlns:sdtm="http://rdf.cdisc.org/sdtm-1-2/std#"
>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>> xmlns:skos="http://www.w3.org/2004/02/skos/core#"
>> xmlns:sdtmigs="http://rdf.cdisc.org/sdtmig-3-1-2/schema#"
>> xmlns:owl="http://www.w3.org/2002/07/owl#"
>> xmlns:dc="http://purl.org/dc/elements/1.1/"
>> xmlns="http://rdf.cdisc.org/sdtmig-3-1-2/std#"
>> xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
>> xmlns:sdtms="http://rdf.cdisc.org/sdtm-1-2/schema#"
>> xmlns:sdtmct="http://rdf.cdisc.org/sdtm/ct#"
>> xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>> xml:base="http://rdf.cdisc.org/sdtmig-3-1-2/std">
>>
>> <mms:DataElement rdf:ID="Column.EX.DOMAIN">
>>   <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>   >DOMAIN</mms:dataElementName>
>>   <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>   >Domain Abbreviation</mms:dataElementLabel>
>>   <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger"
>>   >2</mms:ordinal>
>>   <mms:broader rdf:resource="http://rdf.cdisc.org/sdtm-1-2/std#DE.Identifier.DOMAIN"/>
>>   <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RequiredVariable"/>
>>   <sdtmigs:references rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>   >SDTM 2.2.4, SDTMIG 4.1.2.2, SDTMIG Appendix C2</sdtmigs:references>
>>   <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>   >Two-character abbreviation for the domain.</mms:dataElementDescription>
>>   <mms:context rdf:resource="#Table.EX"/>
>>   <sdtmigs:controlledTermsOrFormat rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>>   >EX</sdtmigs:controlledTermsOrFormat>
>>   <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
>>   <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName"
>>   >xsd:string</mms:dataElementType>
>>   <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.IdentifierVariable"/>
>> </mms:DataElement>
>>
>> *However, I receive:*
>>
>> INFO 0 (0 per second) properties indexed
>>
>> *My assembly file is:*
>>
>> @prefix :        <http://localhost/jena_example/#> .
>> @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>> @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
>> @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
>> @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
>> @prefix text:    <http://jena.apache.org/text#> .
>> @prefix mms:     <http://rdf.cdisc.org/mms#> .
>> @prefix sdtms:   <http://rdf.cdisc.org/sdtm-1-2/schema#> .
>> @prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .
>>
>> ## Example of a TDB dataset and text index
>> ## Initialize TDB
>> [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
>> tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
>> tdb:GraphTDB rdfs:subClassOf ja:Model .
>>
>> ## Initialize text query
>> [] ja:loadClass "org.apache.jena.query.text.TextQuery" .
>> # A TextDataset is a regular dataset with a text index.
>> text:TextDataset rdfs:subClassOf ja:RDFDataset .
>> # Lucene index
>> text:TextIndexLucene rdfs:subClassOf text:TextIndex .
>>
>> ## ---------------------------------------------------------------
>> ## This URI must be fixed - it's used to assemble the text dataset.
>>
>> :text_dataset rdf:type text:TextDataset ;
>>     text:dataset <#dataset> ;
>>     text:index <#indexLucene> ;
>>     .
>>
>> # A TDB dataset used for RDF storage
>> <#dataset> rdf:type tdb:DatasetTDB ;
>>     tdb:location "tdb" ;
>>     .
>>
>> # Text index description
>> <#indexLucene> a text:TextIndexLucene ;
>>     text:directory <file:luceneIndexes> ;
>>     text:entityMap <#entMap> ;
>>     .
>>
>> # Mapping in the index
>> # URI stored in field "uri"
>> # rdfs:label is mapped to field "text"
>> <#entMap> a text:EntityMap ;
>>     text:entityField "uri" ;
>>     text:defaultField "text" ;
>>     text:map (
>>         [ text:field "text" ; text:predicate mms:dataElementName ]
>>         [ text:field "text" ; text:predicate mms:dataElementDescription ]
>>         [ text:field "text" ; text:predicate mms:dataElementLabel ]
>>         [ text:field "text" ; text:predicate mms:dataElementType ]
>>         [ text:field "text" ; text:predicate mms:ordinal ]
>>         [ text:field "text" ; text:predicate mms:broader ]
>>         [ text:field "text" ; text:predicate sdtms:dataElementType ]
>>         [ text:field "text" ; text:predicate sdtms:dataElementRole ]
>>         [ text:field "text" ; text:predicate sdtms:dataElementCompliance ]
>>         [ text:field "text" ; text:predicate sdtmigs:references ]
>>     ) .
>>
>> *It looks like I have all the properties in my assembly file that could
>> appear in the nodes of my TDB. So this leads me to believe the problem is
>> with my assembly file. I tried indexing just one property and the same
>> problem occurred. When I run the text indexer, segments_1 and segments.gen
>> appear in my index directory, however there are just a couple of random
>> characters in each file. Just to be sure nothing was indexed I tried
>> running a text query on it with java:*
>>
>> QueryExecution qExec = QueryExecutionFactory.create(
>>         "PREFIX text: <http://jena.apache.org/text#> PREFIX mms: <http://rdf.cdisc.org/mms#> "
>>         + "SELECT * WHERE{?s text:query (mms:dataElementName 'AE')}", ds);
>>
>> ResultSet rs = qExec.execSelect();
>>
>> *I get these warnings and an empty result set:*
>>
>> WARN o.apache.jena.query.text.TextQueryPF - Failed to find the
>> text index : tried context and as a text-enabled dataset
>> WARN o.apache.jena.query.text.TextQueryPF - No text index - no
>> text search performed
>>
>> *I am thinking that the problem is with the assembler file because I am
>> new to creating these. However, I cannot find much online about this
>> problem. Any suggestions?*
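On the "Failed to find the text index" warning quoted above: the quoted query snippet does not show how `ds` was created, so this is only a guess, but the warning text suggests the query ran against a dataset that was not text-enabled. The sketch below builds the dataset from the assembler description instead, assuming Jena 2.x package names and that the assembler file quoted above is saved as `assembler.ttl` (a placeholder name). The class and method names are mine, for illustration only.

```java
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.DatasetFactory;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;

public class TextQueryExample {

    /** Builds the same text query as in the quoted mail, for a given search term. */
    public static String buildQuery(String term) {
        return "PREFIX text: <http://jena.apache.org/text#> "
             + "PREFIX mms: <http://rdf.cdisc.org/mms#> "
             + "SELECT * WHERE { ?s text:query (mms:dataElementName '" + term + "') }";
    }

    public static void main(String[] args) {
        // Placeholder file name; the resource URI is ":text_dataset" from the
        // assembler file, expanded with its "@prefix :" declaration.
        String assemblerFile = args.length > 0 ? args[0] : "assembler.ttl";
        Dataset ds = DatasetFactory.assemble(
                assemblerFile, "http://localhost/jena_example/#text_dataset");

        ds.begin(ReadWrite.READ);
        try {
            QueryExecution qExec = QueryExecutionFactory.create(buildQuery("AE"), ds);
            try {
                ResultSet rs = qExec.execSelect();
                ResultSetFormatter.out(System.out, rs);
            } finally {
                qExec.close();
            }
        } finally {
            ds.end();
        }
    }
}
```

If the dataset is instead opened directly from the TDB directory, the Lucene index described in the assembler file is never attached to it, which would be consistent with the warning and the empty result set.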
