OK, this is a sample from several large RDF files I am working with:

  <mms:DataElement rdf:ID="Column.AE.AERELNST">
    <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName">xsd:string</mms:dataElementType>
    <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">AERELNST</mms:dataElementName>
    <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
    <mms:context>
      <mms:Dataset rdf:ID="Table.AE">
        <sdtmigs:domainStructure rdf:datatype="http://www.w3.org/2001/XMLSchema#string">One record per adverse event per subject</sdtmigs:domainStructure>
        <mms:contextName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">AE</mms:contextName>
        <mms:contextLabel rdf:parseType="Literal">Adverse Events</mms:contextLabel>
        <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger">8</mms:ordinal>
        <sdtmigs:domainCode rdf:datatype="http://www.w3.org/2001/XMLSchema#string">AE</sdtmigs:domainCode>
        <mms:context rdf:resource="#EventsObservationClass"/>
      </mms:Dataset>
    </mms:context>
    <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.PermissibleVariable"/>
    <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger">21</mms:ordinal>
    <sdtmigs:references rdf:datatype="http://www.w3.org/2001/XMLSchema#string">SDTM 2.2.2</sdtmigs:references>
    <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RecordQualifier"/>
    <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Relationship to Non-Study Treatment</mms:dataElementLabel>
    <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Records the investigator's opinion as to whether the event may have
been due to a treatment other than study drug. May be reported as free
text. Example: "MORE LIKELY RELATED TO ASPIRIN USE.".</mms:dataElementDescription>
    <mms:broader rdf:resource="http://rdf.cdisc.org/sdtm-1-2/std#DE.Event.--RELNST"/>
  </mms:DataElement>
  <mms:DataElement rdf:ID="Column.SU.SUMODIFY">
    <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Modified Substance Name</mms:dataElementLabel>
    <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
    <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName">xsd:string</mms:dataElementType>
    <sdtmigs:references rdf:datatype="http://www.w3.org/2001/XMLSchema#string">SDTM 2.2.1, SDTMIG 4.1.3.6</sdtmigs:references>
    <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string">If SUTRT is modified, then the modified text is placed here.</mms:dataElementDescription>
    <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger">8</mms:ordinal>
    <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.PermissibleVariable"/>
    <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">SUMODIFY</mms:dataElementName>
    <mms:context rdf:resource="#Table.SU"/>
    <mms:broader rdf:resource="http://rdf.cdisc.org/sdtm-1-2/std#DE.Intervention.--MODIFY"/>
    <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.SynonymQualifier"/>
  </mms:DataElement>
  <mms:DataElement rdf:ID="Column.CO.IDVAR">
    <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RecordQualifier"/>
    <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">IDVAR</mms:dataElementName>
    <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.PermissibleVariable"/>
    <sdtmigs:controlledTermsOrFormat rdf:datatype="http://www.w3.org/2001/XMLSchema#string">*</sdtmigs:controlledTermsOrFormat>
    <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Identifying Variable</mms:dataElementLabel>
    <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Identifying variable in the parent dataset that identifies the
record(s) to which the comment applies. Examples AESEQ or CMGRPID. Used
only when individual comments are related to domain records. Null for
comments collected on separate CRFs.</mms:dataElementDescription>
    <mms:context>
      <mms:Dataset rdf:ID="Table.CO">
        <sdtmigs:domainCode rdf:datatype="http://www.w3.org/2001/XMLSchema#string">CO</sdtmigs:domainCode>
        <mms:contextName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">CO</mms:contextName>
        <sdtmigs:domainStructure rdf:datatype="http://www.w3.org/2001/XMLSchema#string">One record per comment per subject</sdtmigs:domainStructure>
        <mms:contextLabel rdf:parseType="Literal">Comments</mms:contextLabel>
        <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger">2</mms:ordinal>
        <mms:context rdf:resource="#SpecialPurposeDomain"/>
      </mms:Dataset>
    </mms:context>
    <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName">xsd:string</mms:dataElementType>
    <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger">6</mms:ordinal>
    <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
  </mms:DataElement>


I then index this data with jena.textindexer, using this assembler file:

@prefix :        <http://localhost/jena_example/#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix mms:     <http://rdf.cdisc.org/mms#> .
@prefix sdtms:   <http://rdf.cdisc.org/sdtm-1-2/schema#> .
@prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .

## Example of a TDB dataset and text index
## Initialize TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

## Initialize text query
[] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
# Lucene index
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .

## ---------------------------------------------------------------
## This URI must be fixed - it's used to assemble the text dataset.

:text_dataset rdf:type     text:TextDataset ;
    text:dataset   <#dataset> ;
    text:index     <#indexLucene> ;
    .

# A TDB dataset used for RDF storage
<#dataset> rdf:type      tdb:DatasetTDB ;
    tdb:location "tdb" ;
    # if from command line use: "NetBeansProjects/mdr-older/trunk/tdb"
    .

# Text index description
<#indexLucene> a text:TextIndexLucene ;
    text:directory <file:luceneIndexes> ;
    text:entityMap <#entMap> ;
    .

# Mapping in the index
# URI stored in field "uri"
# rdfs:label is mapped to field "text"
<#entMap> a text:EntityMap ;
    text:entityField      "uri" ;
    text:defaultField     "text" ;
    text:map (
         [ text:field "text" ; text:predicate mms:dataElementName ]
         [ text:field "text" ; text:predicate mms:dataElementDescription ]
         [ text:field "text" ; text:predicate mms:dataElementLabel ]
         [ text:field "text" ; text:predicate mms:dataElementType ]
         [ text:field "text" ; text:predicate mms:ordinal ]
         [ text:field "text" ; text:predicate mms:broader ]
         [ text:field "text" ; text:predicate mms:Dataset ]
         [ text:field "text" ; text:predicate mms:contextName ]
         [ text:field "text" ; text:predicate mms:contextLabel ]
         [ text:field "text" ; text:predicate mms:contextDescription ]
         [ text:field "text" ; text:predicate sdtms:dataElementType ]
         [ text:field "text" ; text:predicate sdtms:dataElementRole ]
         [ text:field "text" ; text:predicate sdtms:dataElementCompliance ]
         [ text:field "text" ; text:predicate sdtms:supportedBySDTMIG ]
         [ text:field "text" ; text:predicate sdtms:supportedBySEND ]
         [ text:field "text" ; text:predicate sdtmigs:references ]
         [ text:field "text" ; text:predicate sdtmigs:domainStructure ]
         [ text:field "text" ; text:predicate sdtmigs:domainCode ]
         [ text:field "text" ; text:predicate sdtmigs:controlledTermsOrFormat ]
         ) .

Finally, I try to run a query against the dataset with the index:

PREFIX :     <http://localhost/jena_example/#>
PREFIX text: <http://jena.apache.org/text#>
PREFIX mms:  <http://rdf.cdisc.org/mms#>
SELECT * { ?s text:query (mms:dataElementName 'AE') }

I would expect this to return the first DataElement, AERELNST. I am unsure
whether the problem lies in the form of my query or in my assembler file.
Any thoughts?
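
One thing that might be worth checking (a rough sketch, not a confirmed diagnosis): if the index analyzer stores each literal as whole lowercase tokens, then a plain term search for 'AE' would match the contextName "AE" but not the dataElementName "AERELNST", whereas a prefix search would match both. The tokenizer below is only a crude stand-in for a real Lucene analyzer, and the helper names (tokenize, term_match, prefix_match) are hypothetical, made up for illustration:

```python
import re

def tokenize(text):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def term_match(term, text):
    """True if the term equals one whole token of the text."""
    return term.lower() in tokenize(text)

def prefix_match(prefix, text):
    """True if any token of the text starts with the prefix."""
    p = prefix.lower()
    return any(tok.startswith(p) for tok in tokenize(text))

# Two literals taken from the sample data above.
literals = {
    "Column.AE.AERELNST": "AERELNST",  # mms:dataElementName
    "Table.AE": "AE",                  # mms:contextName
}

term_hits = [s for s, lit in literals.items() if term_match("AE", lit)]
prefix_hits = [s for s, lit in literals.items() if prefix_match("AE", lit)]

print(term_hits)    # subjects matched by a whole-token search
print(prefix_hits)  # subjects matched by a prefix search
```

If the query string is handed to Lucene's query parser, a prefix form such as 'AE*' may be the equivalent thing to try, but that is an assumption on my part.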



On Sun, Sep 1, 2013 at 7:43 AM, Andy Seaborne wrote:

> On 01/09/13 00:02, Brad Moran wrote:
>
>> sorry the file type should be saved as .owl
>>
>
> I see no data.  If you had an attachment, then they don't get through to
> the mailing list.
>
> Would it be possible to create a complete, minimal example of your setup?
>  A small amount of data that shows the situation.
> This description is quite long - is it all needed or can you see the same
> issues in a smaller configuration?
>
>         Andy
>
>
>>
>> On Sat, Aug 31, 2013 at 7:00 PM, Brad Moran wrote:
>>
>>     Hi,
>>     I am currently having a problem getting the exact results I want
>>     from my text queries. I attached one example of my rdf that I begin
>>     with. Then I run tdbloader and successfully create an index using
>>     this assembler file with jena.textindexer:
>>
>>     @prefix :        <http://localhost/jena_example/#> .
>>     @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>     @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
>>     @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
>>     @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
>>     @prefix text:    <http://jena.apache.org/text#> .
>>     @prefix mms:     <http://rdf.cdisc.org/mms#> .
>>     @prefix sdtms:   <http://rdf.cdisc.org/sdtm-1-2/schema#> .
>>     @prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .
>>     @prefix sends:   <http://rdf.cdisc.org/send/schema#> .
>>     @prefix sendigs: <http://rdf.cdisc.org/send-3.0/schema#> .
>>     @prefix cts:     <http://rdf.cdisc.org/ct/schema#> .
>>
>>     ## Example of a TDB dataset and text index
>>     ## Initialize TDB
>>     [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
>>     tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
>>     tdb:GraphTDB    rdfs:subClassOf  ja:Model .
>>
>>     ## Initialize text query
>>     [] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
>>     # A TextDataset is a regular dataset with a text index.
>>     text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
>>     # Lucene index
>>     text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
>>
>>     ## ---------------------------------------------------------------
>>     ## This URI must be fixed - it's used to assemble the text dataset.
>>
>>     :text_dataset rdf:type     text:TextDataset ;
>>          text:dataset   <#dataset> ;
>>          text:index     <#indexLucene> ;
>>          .
>>
>>     # A TDB dataset used for RDF storage
>>     <#dataset> rdf:type      tdb:DatasetTDB ;
>>          tdb:location "tdb" ;
>>          # if from command line use: "NetBeansProjects/mdr-older/trunk/tdb"
>>          .
>>
>>     # Text index description
>>     <#indexLucene> a text:TextIndexLucene ;
>>          text:directory <file:luceneIndexes> ;
>>          text:entityMap <#entMap> ;
>>          .
>>
>>     # Mapping in the index
>>     # URI stored in field "uri"
>>     # rdfs:label is mapped to field "text"
>>     <#entMap> a text:EntityMap ;
>>          text:entityField      "uri" ;
>>          text:defaultField     "text" ;
>>          text:map (
>>               [ text:field "text" ; text:predicate mms:dataElementName ]
>>               [ text:field "text" ; text:predicate mms:dataElementDescription ]
>>               [ text:field "text" ; text:predicate mms:dataElementLabel ]
>>               [ text:field "text" ; text:predicate mms:dataElementType ]
>>               [ text:field "text" ; text:predicate mms:ordinal ]
>>               [ text:field "text" ; text:predicate mms:broader ]
>>               [ text:field "text" ; text:predicate mms:Dataset ]
>>               [ text:field "text" ; text:predicate mms:contextName ]
>>               [ text:field "text" ; text:predicate mms:contextLabel ]
>>               [ text:field "text" ; text:predicate mms:contextDescription ]
>>               [ text:field "text" ; text:predicate sdtms:dataElementType ]
>>               [ text:field "text" ; text:predicate sdtms:dataElementRole ]
>>               [ text:field "text" ; text:predicate sdtms:dataElementCompliance ]
>>               [ text:field "text" ; text:predicate sdtms:supportedBySDTMIG ]
>>               [ text:field "text" ; text:predicate sdtms:supportedBySEND ]
>>               [ text:field "text" ; text:predicate sdtmigs:references ]
>>               [ text:field "text" ; text:predicate sdtmigs:domainStructure ]
>>               [ text:field "text" ; text:predicate sdtmigs:domainCode ]
>>               [ text:field "text" ; text:predicate sdtmigs:controlledTermsOrFormat ]
>>               [ text:field "text" ; text:predicate sends:dataElementCompliance ]
>>               [ text:field "text" ; text:predicate sends:dataElementRole ]
>>               [ text:field "text" ; text:predicate sendigs:domainStructure ]
>>               [ text:field "text" ; text:predicate sendigs:domainCode ]
>>               [ text:field "text" ; text:predicate sendigs:controlledTermsOrFormat ]
>>               [ text:field "text" ; text:predicate cts:cdiscDefinition ]
>>               [ text:field "text" ; text:predicate cts:nciPreferredTerm ]
>>               [ text:field "text" ; text:predicate cts:nciCode ]
>>               [ text:field "text" ; text:predicate cts:cdiscSynonyms ]
>>               [ text:field "text" ; text:predicate cts:cdiscSubmissionValue ]
>>               [ text:field "text" ; text:predicate cts:codelistName ]
>>               [ text:field "text" ; text:predicate cts:isExtensibleCodelist ]
>>               ) .
>>
>>
>>     I then try to run queries against this dataset. For example, if I
>>     search for "AE", I would expect every dataElement within the AE
>>     domain to be returned. However, I cannot get the desired result.
>>     If I search:
>>
>>     PREFIX :     <http://localhost/jena_example/#>
>>     PREFIX text: <http://jena.apache.org/text#>
>>     PREFIX mms:  <http://rdf.cdisc.org/mms#>
>>     SELECT * { ?s text:query (mms:dataElementName 'AE') }
>>
>>     I get:
>>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.DOMAIN>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Table.AE>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.FA.FACAT>
>>
>>     when I would expect to get:
>>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AERELNST>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEENDY>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEMODIFY>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AETOXGR>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEREFID>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESCAT>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESEQ>
>>     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESMIE>
>>     (and the rest of the .AE dataElements; I have listed just a few here)
>>
>>     I also experimented with this query a lot but could not get the
>>     desired result. For example, I tried the other form of query as well:
>>
>>     PREFIX :     <http://localhost/jena_example/#>
>>     PREFIX text: <http://jena.apache.org/text#>
>>     PREFIX mms:  <http://rdf.cdisc.org/mms#>
>>     SELECT * { ?subject mms:contextName ?o .
>>                ?s text:query (mms:contextName 'SE') }
>>
>>
>>     I am not sure whether the problem is that my query is formed
>>     incorrectly or that my assembler file, which creates the index, is
>>     at fault (is there a better or more complete way to create an index
>>     for this RDF model?). Any suggestions would help. As I mentioned at
>>     the beginning, one of the RDF files loaded into TDB is attached.
>>     Thanks.
>>
>>
>>
>
