Karl,
Can you say how you would do this without using the Search API?
I would tend to use cts:element-query():
cts:search(doc(),
  cts:element-query(xs:QName("Actor"),
    cts:and-query((
      cts:element-value-query(xs:QName("Text"), "Healthcare Provider"),
      cts:element-word-query(xs:QName("DisplayName"), "John C Doe")
    ))
  )
)
I think the first step is establishing that what you want to do is possible in
ML at all, then seeing whether it is also possible with the Search API.
As for the general discussion regarding what to do with your XML in ML, I think
it is possible to design XML that is not well suited for
searching/updating/manipulating in ML. In your case, where data is normalized
within the document, this seems like an artifact of some other system or
process. I would denormalize those values into the document at load time, along
with any other upfront transformation that makes your life easier, then
transform at delivery time. Another alternative is to create an XML structure
that has everything you need for searches and is a sibling to the original node:
<root>
  <searchable-xml/>
  <original-xml/>
</root>
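
A minimal sketch of how such a wrapper could be built at load time, in plain XQuery 1.0. The element names <item>, <description>, <actor-name> and <actor-role>, the function local:wrap, and the <Source>/<Actor>/<ActorID> reference layout in the sample input are all assumptions made for illustration; the sample also omits any namespace the real documents may use.

```xquery
xquery version "1.0";

(: Builds the wrapper: a denormalized, search-friendly copy placed next to
   the untouched original. All names under <searchable-xml> are hypothetical. :)
declare function local:wrap($original as element()) as element(root)
{
  <root>
    <searchable-xml>{
      (: Every body entry that points at an actor by ID (assumed layout). :)
      for $entry in $original/Body//*[Source/Actor/ActorID]
      let $actor :=
        $original/Actors/Actor[ActorObjectID = $entry/Source/Actor/ActorID][1]
      return
        <item kind="{local-name($entry)}">
          <description>{ string(($entry/Description/Text)[1]) }</description>
          <actor-name>{ string($actor/Person/Name/DisplayName) }</actor-name>
          <actor-role>{ string($actor/Relation/Text) }</actor-role>
        </item>
    }</searchable-xml>
    <original-xml>{ $original }</original-xml>
  </root>
};

(: Hypothetical, heavily trimmed input document. :)
local:wrap(
  <ContinuityOfCareRecord>
    <Body>
      <Medications>
        <Medication>
          <Description><Text>Example medication</Text></Description>
          <Source><Actor><ActorID>X32423982309E30</ActorID></Actor></Source>
        </Medication>
      </Medications>
    </Body>
    <Actors>
      <Actor>
        <ActorObjectID>X32423982309E30</ActorObjectID>
        <Person><Name><DisplayName>John C. Doe, M.D.</DisplayName></Name></Person>
        <Relation><Text>Healthcare Provider</Text></Relation>
      </Actor>
    </Actors>
  </ContinuityOfCareRecord>
)
```

With a structure like this, each <item> carries the actor's name and role directly, so a cts:element-query() scoped to the hypothetical <item> element could combine the actor criteria with the body criteria in a single query.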
Kelly
Message: 1
Date: Mon, 23 Nov 2009 15:36:26 -0600
From: Karl Erisman <[email protected]>
Subject: RE: [MarkLogic Dev General] XML structure/schema design for MLS
To: [email protected]
Geert, thanks for responding. Let me provide background using an explanation
involving the Continuity of Care Record (CCR) schema. CCR is one of multiple
standards for storing clinical data for a single patient.
CCR includes an <Actors> node that contains individual <Actor> nodes, each with
an ID. These represent people or other entities who are in some way involved
with treatment of the patient. The Actors section is a normalized set of
actors referenced by ID inside other sections of the CCR (the main section
being the "Body"). Here's an example <Actor>:

<Actor>
  <ActorObjectID>X32423982309E30</ActorObjectID>
  <Person>
    <Name>
      <CurrentName>
        <Given>John</Given>
        <Family>Doe</Family>
      </CurrentName>
      <DisplayName>John C. Doe, M.D.</DisplayName>
    </Name>
  </Person>
  <Relation>
    <Text>Healthcare Provider</Text>
  </Relation>
</Actor>

The normalization resulting from the use of the Actors section makes it
difficult to search. For example, to perform a search against part of the body
section that is constrained to elements that are associated with a particular
actor, you cannot simply use the name of the actor directly. You must search
the Actors section for the actor name, find the corresponding ID, then perform
your search against the body using your criteria and ensuring that the results
are constrained to those that reference that ID.
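
The two-step lookup described above can be sketched in plain XQuery 1.0 roughly as follows. The sample document is hypothetical and heavily trimmed: the <Medications> content, the second actor ID, and the <Source>/<Actor>/<ActorID> reference layout are assumptions for illustration, and any namespace is omitted.

```xquery
xquery version "1.0";

(: Hypothetical, trimmed CCR-like document (layout of <Body> is assumed). :)
declare variable $ccr :=
  <ContinuityOfCareRecord>
    <Body>
      <Medications>
        <Medication>
          <Description><Text>Example medication A</Text></Description>
          <Source><Actor><ActorID>X32423982309E30</ActorID></Actor></Source>
        </Medication>
        <Medication>
          <Description><Text>Example medication B</Text></Description>
          <Source><Actor><ActorID>SOME-OTHER-ID</ActorID></Actor></Source>
        </Medication>
      </Medications>
    </Body>
    <Actors>
      <Actor>
        <ActorObjectID>X32423982309E30</ActorObjectID>
        <Person><Name><DisplayName>John C. Doe, M.D.</DisplayName></Name></Person>
        <Relation><Text>Healthcare Provider</Text></Relation>
      </Actor>
    </Actors>
  </ContinuityOfCareRecord>;

(: Step 1: resolve the actor's name and role to the matching ID(s). :)
let $ids :=
  $ccr/Actors/Actor
    [Relation/Text = "Healthcare Provider"]
    [contains(Person/Name/DisplayName, "John C. Doe")]
    /ActorObjectID/string()

(: Step 2: keep only the body entries that reference one of those IDs. :)
return $ccr/Body/Medications/Medication[Source/Actor/ActorID = $ids]
```

The extra join on the ID is the step that a single simple constraint cannot express, which is why the normalized layout is awkward to search directly.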
That, I believe, illustrates why I want to simplify representation for internal
storage. Using a different internal storage format does not preclude exchange
of data with other parties because, as I alluded to previously, document
transformation could be performed on export to the "standard du jour".
Much of my concern stems from being a new user of the Search API. I don't see
how to constrain searches to specific hierarchical paths in order to ensure
that results don't include elements of the same name that are at a different
path; <value> and <word> constraints only seem to allow you to generate simple
queries. If you have many <Actor> sections like the one above in a single
document, how could you use the Search API to find the "Healthcare Provider"
named "John C Doe"?
I know how to do it using composed cts:query functions and FLWOR
post-processing, but that strategy is hard to generalize (and I need a general
solution to support search criteria resulting from web forms).
Thanks for reading; hopefully that is enough background to at least spark
further discussion.
Karl