Brad,

That didn't make it through (and it's truncated). Do you have a smaller data sample? Does the data have to be that long to show the issue? Can you put it up somewhere I can pull it from?

        Andy

On 11/09/13 05:57, Brad Moran wrote:
Ok, this is a sample of several large rdf files I am working with:

<?xml version="1.0"?>
<rdf:RDF
     xmlns:mms="http://rdf.cdisc.org/mms#"
     xmlns:sdtm="http://rdf.cdisc.org/sdtm-1-2/std#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:skos="http://www.w3.org/2004/02/skos/core#"
     xmlns:sdtmigs="http://rdf.cdisc.org/sdtmig-3-1-2/schema#"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns="http://rdf.cdisc.org/sdtmig-3-1-2/std#"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:sdtms="http://rdf.cdisc.org/sdtm-1-2/schema#"
     xmlns:sdtmct="http://rdf.cdisc.org/sdtm/ct#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
   xml:base="http://rdf.cdisc.org/sdtmig-3-1-2/std">
   <mms:DataElement rdf:ID="Column.AE.AERELNST">
     <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName"
     >xsd:string</mms:dataElementType>
     <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >AERELNST</mms:dataElementName>
     <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
     <mms:context>
       <mms:Dataset rdf:ID="Table.AE">
         <sdtmigs:domainStructure rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
         >One record per adverse event per subject</sdtmigs:domainStructure>
         <mms:contextName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
         >AE</mms:contextName>
         <mms:contextLabel rdf:parseType="Literal">Adverse Events</mms:contextLabel>
         <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger"
         >8</mms:ordinal>
         <sdtmigs:domainCode rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
         >AE</sdtmigs:domainCode>
         <mms:context rdf:resource="#EventsObservationClass"/>
       </mms:Dataset>
     </mms:context>
     <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.PermissibleVariable"/>
     <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger"
     >21</mms:ordinal>
     <sdtmigs:references rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >SDTM 2.2.2</sdtmigs:references>
     <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RecordQualifier"/>
     <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >Relationship to Non-Study Treatment</mms:dataElementLabel>
     <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >Records the investigator's opinion as to whether the event may have been due to a treatment other than study drug. May be reported as free text. Example: "MORE LIKELY RELATED TO ASPIRIN USE.".</mms:dataElementDescription>
     <mms:broader rdf:resource="http://rdf.cdisc.org/sdtm-1-2/std#DE.Event.--RELNST"/>
   </mms:DataElement>
   <mms:DataElement rdf:ID="Column.SU.SUMODIFY">
     <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >Modified Substance Name</mms:dataElementLabel>
     <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
     <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName"
     >xsd:string</mms:dataElementType>
     <sdtmigs:references rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >SDTM 2.2.1, SDTMIG 4.1.3.6</sdtmigs:references>
     <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >If SUTRT is modified, then the modified text is placed here.</mms:dataElementDescription>
     <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger"
     >8</mms:ordinal>
     <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.PermissibleVariable"/>
     <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >SUMODIFY</mms:dataElementName>
     <mms:context rdf:resource="#Table.SU"/>
     <mms:broader rdf:resource="http://rdf.cdisc.org/sdtm-1-2/std#DE.Intervention.--MODIFY"/>
     <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.SynonymQualifier"/>
   </mms:DataElement>
   <mms:DataElement rdf:ID="Column.CO.IDVAR">
     <sdtms:dataElementRole rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RecordQualifier"/>
     <mms:dataElementName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >IDVAR</mms:dataElementName>
     <sdtms:dataElementCompliance rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.PermissibleVariable"/>
     <sdtmigs:controlledTermsOrFormat rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >*</sdtmigs:controlledTermsOrFormat>
     <mms:dataElementLabel rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >Identifying Variable</mms:dataElementLabel>
     <mms:dataElementDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
     >Identifying variable in the parent dataset that identifies the record(s) to which the comment applies. Examples AESEQ or CMGRPID. Used only when individual comments are related to domain records. Null for comments collected on separate CRFs.</mms:dataElementDescription>
     <mms:context>
       <mms:Dataset rdf:ID="Table.CO">
         <sdtmigs:domainCode rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
         >CO</sdtmigs:domainCode>
         <mms:contextName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
         >CO</mms:contextName>
         <sdtmigs:domainStructure rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
         >One record per comment per subject</sdtmigs:domainStructure>
         <mms:contextLabel rdf:parseType="Literal">Comments</mms:contextLabel>
         <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger"
         >2</mms:ordinal>
         <mms:context rdf:resource="#SpecialPurposeDomain"/>
       </mms:Dataset>
     </mms:context>
     <mms:dataElementType rdf:datatype="http://www.w3.org/2001/XMLSchema#QName"
     >xsd:string</mms:dataElementType>
     <mms:ordinal rdf:datatype="http://www.w3.org/2001/XMLSchema#positiveInteger"
     >6</mms:ordinal>
     <sdtms:dataElementType rdf:resource="http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.Character"/>
   </mms:DataElement>


I then index this data with jena.textindexer, using this assembler file:

@prefix :        <http://localhost/jena_example/#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix mms:     <http://rdf.cdisc.org/mms#> .
@prefix sdtms:   <http://rdf.cdisc.org/sdtm-1-2/schema#> .
@prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .

          ) .

Finally I try to run a query on the dataset with the index:

PREFIX : <http://localhost/jena_example/#>
PREFIX text: <http://jena.apache.org/text#>
PREFIX mms: <http://rdf.cdisc.org/mms#>
SELECT * {?s text:query (mms:dataElementName 'AE')}

I would expect to get the first dataElement, AERELNST. I am unsure whether
the problem lies in the form of my query or in my assembler file. Any
thoughts?
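
One possible reading of the mismatch, offered as a rough sketch rather than a diagnosis: a Lucene-backed text index usually stores whole word tokens, so a query for the term 'AE' would match a literal that is exactly "AE" but not one that merely begins with those letters, such as "AERELNST". The Python below is a toy model of that behaviour using literals from the sample data above. The `tokenize`, `term_match` and `prefix_match` helpers are hypothetical and are not part of Jena or Lucene.

```python
import re

def tokenize(text):
    """Split a literal into lowercase word tokens (a crude stand-in for an analyzer)."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

# Hypothetical mini-index: subject -> literal values copied from the sample data.
literals = {
    "Column.AE.AERELNST": ["AERELNST", "Relationship to Non-Study Treatment"],
    "Table.AE": ["AE", "One record per adverse event per subject"],
    "Column.SU.SUMODIFY": ["SUMODIFY", "Modified Substance Name"],
}

def term_match(term):
    """Subjects having at least one token equal to the query term."""
    q = term.lower()
    return sorted(s for s, values in literals.items()
                  if any(q in tokenize(v) for v in values))

def prefix_match(prefix):
    """Subjects having at least one token that starts with the prefix."""
    p = prefix.lower()
    return sorted(s for s, values in literals.items()
                  if any(tok.startswith(p) for v in values for tok in tokenize(v)))

print(term_match("AE"))    # whole-token match
print(prefix_match("AE"))  # prefix match
```

In this toy model only the prefix search reaches Column.AE.AERELNST. Lucene's query syntax accepts a trailing `*` for prefix matching, so a query string such as 'AE*' may be worth trying; whether it behaves as hoped with this particular index has not been checked here.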


On Sun, Sep 1, 2013 at 7:43 AM, Andy Seaborne <[email protected]> wrote:

On 01/09/13 00:02, Brad Moran wrote:

Sorry, the file should be saved as .owl.


I see no data. If you sent an attachment, note that attachments don't get
through to the mailing list.

Would it be possible to create a complete, minimal example of your setup?
  A small amount of data that shows the situation.
This description is quite long - is it all needed or can you see the same
issues in a smaller configuration?

         Andy



On Sat, Aug 31, 2013 at 7:00 PM, Brad Moran <[email protected]> wrote:

     Hi,
     I am currently having a problem getting the exact results I want
     from my text queries. I attached one example of my rdf that I begin
     with. Then I run tdbloader and successfully create an index using
     this assembler file with jena.textindexer:

     @prefix :        <http://localhost/jena_example/#> .
     @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
     @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
     @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
     @prefix text:    <http://jena.apache.org/text#> .
     @prefix mms:     <http://rdf.cdisc.org/mms#> .
     @prefix sdtms:   <http://rdf.cdisc.org/sdtm-1-2/schema#> .
     @prefix sdtmigs: <http://rdf.cdisc.org/sdtmig-3-1-2/schema#> .
     @prefix sends:   <http://rdf.cdisc.org/send/schema#> .
     @prefix sendigs: <http://rdf.cdisc.org/send-3.0/schema#> .
     @prefix cts:     <http://rdf.cdisc.org/ct/schema#> .

     ## Example of a TDB dataset and text index
     ## Initialize TDB
     [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
     tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
     tdb:GraphTDB    rdfs:subClassOf  ja:Model .

     ## Initialize text query
     [] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
     # A TextDataset is a regular dataset with a text index.
     text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
     # Lucene index
     text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .

     ## ---------------------------------------------------------------
     ## This URI must be fixed - it's used to assemble the text dataset.

     :text_dataset rdf:type     text:TextDataset ;
          text:dataset   <#dataset> ;
          text:index     <#indexLucene> ;
          .

     # A TDB dataset used for RDF storage
     <#dataset> rdf:type      tdb:DatasetTDB ;
          tdb:location "tdb" ;
          # if from command line use: "NetBeansProjects/mdr-older/trunk/tdb"
          .

     # Text index description
     <#indexLucene> a text:TextIndexLucene ;
          text:directory <file:luceneIndexes> ;
          text:entityMap <#entMap> ;
          .

     # Mapping in the index
     # URI stored in field "uri"
     # rdfs:label is mapped to field "text"
     <#entMap> a text:EntityMap ;
          text:entityField      "uri" ;
          text:defaultField     "text" ;
          text:map (
               [ text:field "text" ; text:predicate mms:dataElementName ]
               [ text:field "text" ; text:predicate mms:dataElementDescription ]
               [ text:field "text" ; text:predicate mms:dataElementLabel ]
               [ text:field "text" ; text:predicate mms:dataElementType ]
               [ text:field "text" ; text:predicate mms:ordinal ]
               [ text:field "text" ; text:predicate mms:broader ]
               [ text:field "text" ; text:predicate mms:Dataset ]
               [ text:field "text" ; text:predicate mms:contextName ]
               [ text:field "text" ; text:predicate mms:contextLabel ]
               [ text:field "text" ; text:predicate mms:contextDescription ]
               [ text:field "text" ; text:predicate sdtms:dataElementType ]
               [ text:field "text" ; text:predicate sdtms:dataElementRole ]
               [ text:field "text" ; text:predicate sdtms:dataElementCompliance ]
               [ text:field "text" ; text:predicate sdtms:supportedBySDTMIG ]
               [ text:field "text" ; text:predicate sdtms:supportedBySEND ]
               [ text:field "text" ; text:predicate sdtmigs:references ]
               [ text:field "text" ; text:predicate sdtmigs:domainStructure ]
               [ text:field "text" ; text:predicate sdtmigs:domainCode ]
               [ text:field "text" ; text:predicate sdtmigs:controlledTermsOrFormat ]
               [ text:field "text" ; text:predicate sends:dataElementCompliance ]
               [ text:field "text" ; text:predicate sends:dataElementRole ]
               [ text:field "text" ; text:predicate sendigs:domainStructure ]
               [ text:field "text" ; text:predicate sendigs:domainCode ]
               [ text:field "text" ; text:predicate sendigs:controlledTermsOrFormat ]
               [ text:field "text" ; text:predicate cts:cdiscDefinition ]
               [ text:field "text" ; text:predicate cts:nciPreferredTerm ]
               [ text:field "text" ; text:predicate cts:nciCode ]
               [ text:field "text" ; text:predicate cts:cdiscSynonyms ]
               [ text:field "text" ; text:predicate cts:cdiscSubmissionValue ]
               [ text:field "text" ; text:predicate cts:codelistName ]
               [ text:field "text" ; text:predicate cts:isExtensibleCodelist ]
               ) .
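
A point that may matter in the entity map above is that every predicate is mapped to the same field, "text". The sketch below is a simplified Python model, built on the assumption that the predicate named in a text query only selects which field to search. Under that assumption, a query on mms:dataElementName would also see literals indexed from other predicates. The `build_index` and `text_query` helpers and the separate field names ("name", "context", "code") are hypothetical; this does not reproduce Jena's actual implementation.

```python
def build_index(entity_map, triples):
    """Index each literal's tokens under the field its predicate maps to."""
    index = set()
    for subject, predicate, literal in triples:
        field = entity_map[predicate]
        for token in literal.lower().split():
            index.add((field, token, subject))
    return index

def text_query(index, entity_map, predicate, term):
    """Look up the predicate's field, then match the term within that field only."""
    field = entity_map[predicate]
    return sorted(s for f, t, s in index if f == field and t == term.lower())

# A few triples taken from the sample data.
triples = [
    ("Column.AE.AERELNST", "mms:dataElementName", "AERELNST"),
    ("Table.AE", "mms:contextName", "AE"),
    ("Table.AE", "sdtmigs:domainCode", "AE"),
]

# All predicates share one field, as in the assembler file.
shared = {predicate: "text" for _, predicate, _ in triples}
# Hypothetical alternative: one field per predicate.
separate = {
    "mms:dataElementName": "name",
    "mms:contextName": "context",
    "sdtmigs:domainCode": "code",
}

shared_hits = text_query(build_index(shared, triples), shared,
                         "mms:dataElementName", "AE")
separate_hits = text_query(build_index(separate, triples), separate,
                           "mms:dataElementName", "AE")
print(shared_hits)    # Table.AE appears although it has no dataElementName
print(separate_hits)  # nothing matches in the dedicated "name" field
```

If the real index behaves like this model, giving each predicate its own field would at least keep a search on one predicate from returning matches indexed from another.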


     I then try to run queries against this dataset. As an example, say I
     want to search for "AE"; I would then expect every dataElement within
     the AE domain to be returned. However, I cannot get the desired result.
     If I search:

     PREFIX : <http://localhost/jena_example/#>
     PREFIX text: <http://jena.apache.org/text#>
     PREFIX mms: <http://rdf.cdisc.org/mms#>
     SELECT * {?s text:query (mms:dataElementName 'AE')}

     I get:

     
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.DOMAIN>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Table.AE>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.FA.FACAT>


     when I would expect to get:

     
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AERELNST>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEENDY>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEMODIFY>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AETOXGR>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEREFID>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESCAT>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESEQ>
     <http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AESMIE>

     (And the rest of the .AE dataElements; I have listed just a few here.)

     I also experimented with this query a lot but could not get the
     desired result. For example, I tried the other form of query as well:

     PREFIX : <http://localhost/jena_example/#>
     PREFIX text: <http://jena.apache.org/text#>
     PREFIX mms: <http://rdf.cdisc.org/mms#>
     SELECT * {?subject mms:contextName ?o .
               ?s text:query (mms:contextName 'SE')}


     I am not sure whether the problem is that my query is formed
     incorrectly or whether it lies in the assembler file that creates the
     index (is there a better or more complete way to create an index for
     this rdf model?). Any suggestions would help. As I mentioned at the
     beginning, one of the rdf files loaded into tdb is attached. Thanks.





