Hi, I am having trouble getting the results I want from my text queries. I have attached one example of the RDF that I start with. I run tdbloader and then successfully create an index with jena.textindexer, using this assembler file:

@prefix :        <http://localhost/jena_example/#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix mms:     <http://rdf.cdisc.org/mms#> .
@prefix sdtms:   <http://rdf.cdisc.org/sdtm-1-2/schema#> .
@prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .
@prefix sends:   <http://rdf.cdisc.org/send/schema#> .
@prefix sendigs: <http://rdf.cdisc.org/send-3.0/schema#> .
@prefix cts:     <http://rdf.cdisc.org/ct/schema#> .

## Example of a TDB dataset and text index

## Initialize TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB   rdfs:subClassOf ja:Model .

## Initialize text query
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset     rdfs:subClassOf ja:RDFDataset .
# Lucene index
text:TextIndexLucene rdfs:subClassOf text:TextIndex .

## ---------------------------------------------------------------
## This URI must be fixed - it's used to assemble the text dataset.

:text_dataset rdf:type text:TextDataset ;
    text:dataset <#dataset> ;
    text:index   <#indexLucene> ;
    .

# A TDB dataset used for RDF storage
<#dataset> rdf:type tdb:DatasetTDB ;
    tdb:location "tdb" ;
    # if from command line use: "NetBeansProjects/mdr-older/trunk/tdb"
    .

# Text index description
<#indexLucene> a text:TextIndexLucene ;
    text:directory <file:luceneIndexes> ;
    text:entityMap <#entMap> ;
    .

# Mapping in the index
# URI stored in field "uri"
# rdfs:label is mapped to field "text"
<#entMap> a text:EntityMap ;
    text:entityField  "uri" ;
    text:defaultField "text" ;
    text:map (
        [ text:field "text" ; text:predicate mms:dataElementName ]
        [ text:field "text" ; text:predicate mms:dataElementDescription ]
        [ text:field "text" ; text:predicate mms:dataElementLabel ]
        [ text:field "text" ; text:predicate mms:dataElementType ]
        [ text:field "text" ; text:predicate mms:ordinal ]
        [ text:field "text" ; text:predicate mms:broader ]
        [ text:field "text" ; text:predicate mms:Dataset ]
        [ text:field "text" ; text:predicate mms:contextName ]
        [ text:field "text" ; text:predicate mms:contextLabel ]
        [ text:field "text" ; text:predicate mms:contextDescription ]
        [ text:field "text" ; text:predicate sdtms:dataElementType ]
        [ text:field "text" ; text:predicate sdtms:dataElementRole ]
        [ text:field "text" ; text:predicate sdtms:dataElementCompliance ]
        [ text:field "text" ; text:predicate sdtms:supportedBySDTMIG ]
        [ text:field "text" ; text:predicate sdtms:supportedBySEND ]
        [ text:field "text" ; text:predicate sdtmigs:references ]
        [ text:field "text" ; text:predicate sdtmigs:domainStructure ]
        [ text:field "text" ; text:predicate sdtmigs:domainCode ]
        [ text:field "text" ; text:predicate sdtmigs:controlledTermsOrFormat ]
        [ text:field "text" ; text:predicate sends:dataElementCompliance ]
        [ text:field "text" ; text:predicate sends:dataElementRole ]
        [ text:field "text" ; text:predicate sendigs:domainStructure ]
        [ text:field "text" ; text:predicate sendigs:domainCode ]
        [ text:field "text" ; text:predicate sendigs:controlledTermsOrFormat ]
        [ text:field "text" ; text:predicate cts:cdiscDefinition ]
        [ text:field "text" ; text:predicate cts:nciPreferredTerm ]
        [ text:field "text" ; text:predicate cts:nciCode ]
        [ text:field "text" ; text:predicate cts:cdiscSynonyms ]
        [ text:field "text" ; text:predicate cts:cdiscSubmissionValue ]
        [ text:field "text" ; text:predicate cts:codelistName ]
        [ text:field "text" ; text:predicate cts:isExtensibleCodelist ]
    ) .

I then try to run queries against this dataset. As an example, say I want to search for "AE": I would expect every dataElement within the AE domain to be returned. However, I cannot get the desired result. If I search:

PREFIX : <http://localhost/jena_example/#>
PREFIX text: <http://jena.apache.org/text#>
PREFIX mms: <http://rdf.cdisc.org/mms#>
SELECT * { ?s text:query (mms:dataElementName 'AE') }

I get:

<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.DOMAIN>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Table.AE>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.FA.FACAT>

when I would expect to get:

<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AERELNST>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEENDY>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEMODIFY>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AETOXGR>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEREFID>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESCAT>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESEQ>
<http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESMIE>

(and the rest of the .AE dataElements; I have listed only a few here).

I also experimented with this query a lot but could not get the desired result. For example, I tried the other form of query as well:

PREFIX : <http://localhost/jena_example/#>
PREFIX text: <http://jena.apache.org/text#>
PREFIX mms: <http://rdf.cdisc.org/mms#>
SELECT * { ?subject mms:contextName ?o . ?s text:query (mms:contextName 'SE') }

I am not sure whether the problem is that my query is formed incorrectly, or whether it lies in the assembler file that creates the index (is there a better or more complete way to create an index for this RDF model?). Any suggestions would help. As I mentioned at the beginning, one of the RDF files from TDB is attached.

Thanks.
