Hello,

 

I've been experimenting with cts:search, trying to improve performance on
our data set, and I've run into a few problems.

 

The data (simplified here) is a set of nested <auth> elements which
represent the root, chapter, section, etc. levels of a document. Each <auth>
contains a <meta>, which in turn contains <field> elements; each field
element has a @type attribute which indicates its type. The structure of the
data cannot be changed.

 

Firstly, how do I search for a word in an element with a particular
attribute value? For instance, in the example below I want to find the word
"queries" only in field[@type='title']. Example 1 is what I thought would
work, but it doesn't: it also returns the hits on the precis field.

 

Secondly, how do I search for something only at the root level? That is, how
would I match a word in /d/auth/meta/field[@type='title'] as opposed to
/d/auth//field[@type='title']?
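
To make the distinction concrete, here is how the two paths differ as plain XPath against the sample document at the end of this message (no cts:search involved; this only illustrates which fields I mean by "root level" versus "any level"):

```xquery
(: root level only: title fields of the top-level auth elements :)
doc('domtest.xml')/d/auth/meta/field[@type = 'title'],
(: any depth: title fields of top-level and nested auth elements :)
doc('domtest.xml')/d/auth//field[@type = 'title']
```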

 

I know that I could use the path in the cts:search to narrow the search down
to a particular level/field, but the trouble then comes when I want to
combine two searches with and/or.

 

For example, I can do cts:search(doc()/d/auth/meta/field[@type='title'], 'precis'),
but how would I search for, say, "beer" in title and "pdfs" in precis?

 

I'm sure I had some of this working about 4 years ago but I can't remember
it!

 

At the moment I've got it doing something like:

 

for $r in cts:search(/d/auth/meta/field[@type='title'], 'beer')
return <item id='{ $r/ancestor::auth[not(parent::auth)]/@id }' />
,
for $r in cts:search(/d/auth/meta/field[@type='precis'], 'pdfs')
return <item id='{ $r/ancestor::auth[not(parent::auth)]/@id }' />

 

 

The application then does the combination and score arithmetic. The trouble
is that the $r/ancestor::auth step is quite slow on large result sets, and a
lot of information is returned to the app from the server, which for an "and"
is usually much larger than the intersection. I could probably do the
intersection in XQuery too, but I suspect that combining the results and
finding the root id would still be slow?
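
For reference, a rough sketch of what I mean by doing the intersection in XQuery. It is untested on the full data set, it ignores the score arithmetic, and the variable names are only illustrative. It still walks the ancestor axis for every hit, which is the part I suspect is slow:

```xquery
(: ids of the root-level auth elements whose title matches "beer" :)
let $title-ids :=
  for $r in cts:search(/d/auth/meta/field[@type = 'title'], 'beer')
  return string($r/ancestor::auth[not(parent::auth)]/@id)
(: ids of the root-level auth elements whose precis matches "pdfs" :)
let $precis-ids :=
  for $r in cts:search(/d/auth/meta/field[@type = 'precis'], 'pdfs')
  return string($r/ancestor::auth[not(parent::auth)]/@id)
(: "and": keep only the ids that appear in both lists :)
for $id in distinct-values($title-ids[. = $precis-ids])
return <item id="{ $id }"/>
```

This would at least cut down what is sent back to the app, but it runs the same two searches as before.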

 

Any pointers appreciated!

 

D

 

 

xdmp:document-insert('domtest.xml',
<d>
<auth id="1werwer">
      <meta>
            <field type="title">The Book of Queries</field>
            <field type="precis">One man's struggle to overcome confusion and reach enlightenment</field>
      </meta>
</auth>
<auth id="2sdfsfsdf">
      <meta>
            <field type="title">Misunderstanding CTS</field>
            <field type="precis">My autobiography</field>
      </meta>
      <auth id="xcvzxcvzxcv">
            <meta>
                  <field type="number">Chapter 1</field>
                  <field type="title">Queries explained</field>
            </meta>
      </auth>
</auth>
<auth id="4cvzxcvxcv">
      <meta>
            <field type="title">The good beer guid</field>
            <field type="precis">What you really should be reading instead of pdfs about queries</field>
      </meta>
      <auth id="3wersvscdv">
            <meta>
                  <field type="number">Chapter 1</field>
                  <field type="title">Timoth Taylor</field>
            </meta>
      </auth>
</auth>
</d>
);

 

EXAMPLE 1: NOT WORKING - look for "queries" in field "title"

 

declare namespace xm="http://xsd.oup.com/xauth/meta";

 

for $r in cts:search(doc('domtest.xml')/d/auth,
      cts:element-query(xs:QName('meta'),
            cts:and-query((
                  cts:element-attribute-value-query(xs:QName('field'), xs:QName('type'), 'title')
            ,     cts:element-word-query(xs:QName('field'), 'queries')
            ))
      )
)
return
      <res id='{ $r/@id }' score='{ cts:score($r) }' />
