Hi Karl,

The Search API has extensibility hooks that make it possible to wire in all 
sorts of queries that aren't yet supported out of the box.

In this case, I think you could implement it using a custom constraint (see the 
Search Developer's Guide for some background reading on that).  Pasted below is 
a generic example that creates a nested element query during parsing when your 
query string contains something like:  myelement:foo or myelement:bar.  (The 
qtextconst attribute is used to support search:unparse().)

Hopefully you can extrapolate from here to extend the Search API to meet your 
specific needs...

Cheers,

--Colleen


**** Sample options node ****
<options xmlns="http://marklogic.com/appservices/search">
  <constraint name="myelement">
    <custom facet="false">
      <parse apply="my-custom-query" ns="http://example/custom"
             at="/lib/custom-constraints.xqy"/>
    </custom>
  </constraint>
</options>
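
As a rough sketch of how these options might be used, assuming the library module shown in this message is installed at /lib/custom-constraints.xqy: the query string "myelement:foo" and the extra search:parse() call (handy for inspecting the generated query) are illustrative choices, not part of the original example.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Same options node as above, bound to a variable for reuse. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <constraint name="myelement">
      <custom facet="false">
        <parse apply="my-custom-query" ns="http://example/custom"
               at="/lib/custom-constraints.xqy"/>
      </custom>
    </constraint>
  </options>
return (
  (: Show the query that the custom parse function produces. :)
  search:parse("myelement:foo", $options),
  (: Run the actual search with the same query string and options. :)
  search:search("myelement:foo", $options)
)
```

Checking the search:parse() output first is a quick way to confirm that the custom function is being picked up before looking at search results.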


**** Sample XQuery library at /lib/custom-constraints.xqy ************

xquery version "1.0-ml";

module namespace custom = "http://example/custom";

declare default function namespace "http://www.w3.org/2005/xpath-functions";

(: Parse function for the "myelement" constraint.  $right carries the parsed
   text to the right of the constraint name; its cts:text holds the term. :)
declare function custom:my-custom-query($qtext, $right) {
  element cts:element-query {
    (: qtextconst supports search:unparse() :)
    attribute qtextconst { fn:concat("myelement:", $right//cts:text/string()) },
    element cts:element { xs:QName("custom:a") },
    element cts:element-query {
      element cts:element { xs:QName("custom:b") },
      element cts:element-word-query {
        element cts:element { xs:QName("custom:target-node") },
        element cts:text {
          attribute xml:lang { "en" },
          $right//cts:text/string()
        },
        element cts:option { "child" }
      },
      element cts:option { "child" }
    },
    element cts:option { "child" }
  }
};
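
For the specific case in the message below (a <Text> value that must sit inside a <Relation> element), a minimal sketch along the same lines might look like this. The function name custom:relation-query and the module layout are hypothetical, the qtextconst attribute is left out for brevity (so search:unparse() would not round-trip this constraint), and the <root>{...}</root>/* wrapper is assumed here as a way to get the XML form of a query built with the cts: constructor functions. Note that cts:element-query() expresses containment, so <Text> would match anywhere inside <Relation>, not strictly as a direct child.

```xquery
xquery version "1.0-ml";

module namespace custom = "http://example/custom";

(: Hypothetical parse function for a "relation_constraint" constraint.
   Wraps a value query on <Text> in an element query on <Relation>,
   so a <Text> outside of <Relation> cannot satisfy the constraint. :)
declare function custom:relation-query($qtext, $right) {
  (: Take the first text term to the right of the constraint name. :)
  let $value := fn:string(($right//cts:text)[1])
  return
    (: Assumed idiom: wrapping the cts:query in an element and selecting
       its child yields the query in XML form. :)
    <root>{
      cts:element-query(
        xs:QName("Relation"),
        cts:element-value-query(xs:QName("Text"), $value))
    }</root>/*
};
```

It would be wired in the same way as the sample above, with a <custom> constraint whose <parse> element has apply="relation-query" and the same ns and at values.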

________________________________________
From: [email protected] 
[[email protected]] On Behalf Of Karl Erisman 
[[email protected]]
Sent: Tuesday, November 24, 2009 11:41 AM
To: [email protected]
Subject: [MarkLogic Dev General] RE: XML structure/schema design for MLS

Kelly,

Your cts:element-query() example is exactly how I was planning on
performing the search (I also mentioned something about
post-processing with a FLWOR, but that would not be necessary).

So, we've established that our goal is possible (clearly) in ML.  Now,
can the search be completed using the SearchAPI?

This is one way:

search:search(
  'actor_constraint:"John C Doe" relation_constraint:"Healthcare Provider"',
  <options xmlns="http://marklogic.com/appservices/search">
    <constraint name="actor_constraint">
      <word><element ns="" name="DisplayName"/></word>
    </constraint>
    <constraint name="relation_constraint">
      <value><element ns="" name="Text"/></value>
    </constraint>
  </options>)

...but I'm not quite satisfied with that.  I want to be a bit more
specific and enforce that the <Text> element must be the child of a
<Relation> element.  How would that be done?

Thanks for your input on the issue of structuring data for ML -- those
approaches are very sensible.  Transformation is far easier than
internal processing of content, probably for all of us.

Karl

> Date: Tue, 24 Nov 2009 05:49:49 -0800
> From: Kelly Stirman <[email protected]>
> Subject: [MarkLogic Dev General] RE: XML structure/schema design for
>       MLS
> To: "[email protected]"
>       <[email protected]>
> Message-ID:
>       <[email protected]>
> Content-Type: text/plain; charset="us-ascii"
>
> Karl,
>
> Can you say how you would do this without using the SearchAPI?
>
> I would tend to use cts:element-query():
>
> cts:search(doc(),cts:element-query(xs:QName("Actor"),
> cts:and-query((
>   cts:element-value-query(xs:QName("Text"),"Healthcare Provider"),
>   cts:element-word-query(xs:QName("DisplayName"),"John C Doe")
> ))
> ))
>
> I think the first thing is establishing that what you want to do is possible 
> in ML, then seeing if it is possible with the SearchAPI.
>
> As for the general discussion regarding what to do with your XML in ML, I 
> think it is possible to design XML that is not well suited for 
> searching/updating/manipulating in ML. In your case where data is normalized 
> in the document, this seems like an artifact of some other system or process. 
> I would normalize those values into the document at load time plus any other 
> upfront transformation that makes your life easier, then transform at 
> delivery time. Another alternative is to create an XML structure that has 
> everything you need for searches that is a sibling to the original node:
>
> <root>
> <searchable-xml/>
> <original-xml/>
> </root>
>
> Kelly

>> Message: 1
>> Date: Mon, 23 Nov 2009 15:36:26 -0600
>> From: Karl Erisman <[email protected]>
>> Subject: RE: [MarkLogic Dev General] XML structure/schema design for
>>       MLS
>> To: [email protected]
>> Message-ID:
>>       <[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> Geert, thanks for responding.  Let me provide background using an 
>> explanation involving the Continuity of Care Record (CCR) schema.  CCR is 
>> one of multiple standards for storing clinical data for a single patient.
>>
>> CCR includes an <Actors> node that contains individual <Actor> nodes, each 
>> with an ID.  These represent people or other entities who are in some way 
>> involved with treatment of the patient.  The Actors section is a normalized 
>> set of actors referenced by ID inside other sections of the CCR (the main 
>> section being the "Body").  Here's an example
>> <Actor>:
>>
>> <Actor>
>> <ActorObjectID>X32423982309E30</ActorObjectID>
>> <Person>
>>   <Name>
>>     <CurrentName>
>>       <Given>John</Given>
>>       <Family>Doe</Family>
>>     </CurrentName>
>>     <DisplayName>John C. Doe, M.D.</DisplayName>
>>   </Name>
>> </Person>
>> <Relation>
>>   <Text>Healthcare Provider</Text>
>> </Relation>
>> </Actor>
>>
>> The normalization resulting from the use of the Actors section makes it 
>> difficult to search.  For example, to perform a search against part of the 
>> body section that is constrained to elements that are associated with a 
>> particular actor, you cannot simply use the name of the actor directly.  You 
>> must search the Actors section for the actor name, find the corresponding 
>> ID, then perform your search against the body using your criteria and 
>> ensuring that the results are constrained to those that reference that ID.
>>
>> That, I believe, illustrates why I want to simplify representation for 
>> internal storage.  Using a different internal storage format does not 
>> preclude exchange of data with other parties because, as I alluded to 
>> previously, document transformation could be performed on export to the 
>> "standard du jour".
>>
>> Much of my concern stems from being a new user of the Search API.  I don't 
>> see how to constrain searches to specific hierarchical paths in order to 
>> ensure that results don't include elements of the same name that are at a 
>> different path; <value> and <word> constraints only seem to allow you to 
>> generate simple queries.  If you have many <Actor> sections like the one 
>> above in a single document, how could you use the Search API to find the 
>> "Healthcare Provider" named "John C Doe"?
>> I know how to do it using composed cts:query functions and FLWOR 
>> post-processing, but that strategy is hard to generalize (and I need a 
>> general solution to support search criteria resulting from web forms).
>>
>> Thanks for reading; hopefully that is enough background to at least spark 
>> further discussion.
>> Karl
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general