Hi Dom,

First off, please use the latest MarkLogic 7 (7.0-4 right now) as your 
base-line, as it has a lot of bug fixes since 7.0-1.

In MarkLogic 7, you can specify a path for what is indexed as your field.  This 
is much more precise than the old include/exclude approach (which is still 
there, really only for compatibility purposes).  Here is a little info on it:

http://docs.marklogic.com/guide/admin/fields#id_23934
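
As a rough illustration of querying such a field (a sketch only, not tested against this data set): assume a path field with the hypothetical name title-field has been configured on the database, defined on the path /d/auth/meta/field[@type='title']. The field name and the path are assumptions made for illustration, and whether an attribute predicate is accepted in a field path should be confirmed against the documentation linked above.

```xquery
xquery version "1.0-ml";

(: Assumption: a path field named "title-field" (hypothetical) exists on the
   database, defined on the path /d/auth/meta/field[@type='title']. :)

(: Query the field directly instead of nesting element queries. :)
for $r in cts:search(doc('domtest.xml')/d/auth,
      cts:field-word-query('title-field', 'queries'))
return
      <res id='{ $r/@id }' score='{ cts:score($r) }' />
```

Because the field path starts at /d/auth/meta, titles in nested auth elements would fall outside the field, which is the root-level restriction asked about in the original question.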

-Danny

From: Dominic Beesley
Sent: Monday, October 13, 2014 5:01 AM
To: 'MarkLogic Developer Discussion'
Subject: Re: [MarkLogic Dev General] CTS Path restrictions

Thanks Danny,

We're moving to 7 at the moment and I'm using 7.0-1 as the base-line version, 
though I've not seen anything in the documentation that is any different from 
4.2, which was my previous base-line version.

I'm not sure what you mean by a path field. Do you mean a pre-built attribute 
or something, or is there another type of lexicon or index that I've missed?

My main issue now is that iterating through the result set to move from the 
meta/field element up to the root auth element to return the id is relatively 
slow. On some searches I can use cts:uris to just get the URIs, which kind of 
works, but then I lose the relevance score...

Cheers

Dom


From: Danny Sokolsky
Sent: 12 October 2014 20:19
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] CTS Path restrictions

Hi Dominic,

It sounds like you got this working somewhat, but here are a few observations.

You did not say what version of MarkLogic this is or what index settings you 
have.  For this type of search to resolve accurately unfiltered, you would need 
to have word positions and element positions enabled on the db config.  Also, 
newer versions of MarkLogic have a lot of improvements in this area.
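
For reference, a minimal sketch of turning those settings on through the Admin API rather than the Admin UI. The database name "Documents" is an assumption, and "element positions" is taken here to mean the element word positions setting.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
      at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
(: Assumption: the content lives in a database called "Documents". :)
let $db := xdmp:database("Documents")
(: Enable word positions and element word positions on that database. :)
let $config := admin:database-set-word-positions($config, $db, fn:true())
let $config := admin:database-set-element-word-positions($config, $db, fn:true())
return admin:save-configuration($config)
```

Changing these settings causes the content to be reindexed if reindexing is enabled on the database, and position data makes the indexes larger.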

Another thing that might be worth trying here is to create a path field 
containing the data you are interested in.  That is likely to be more efficient.

-Danny

________________________________
From: Dominic Beesley
Sent: Sunday, October 12, 2014 3:35 AM
To: 'MarkLogic Developer Discussion'
Subject: Re: [MarkLogic Dev General] CTS Path restrictions

I've partially answered my own questions... (I think)

for $r in cts:search(doc('domtest.xml')/d/auth,
      cts:element-query(xs:QName('meta'),
            cts:element-query(xs:QName('field'),
                  cts:and-query((
                        cts:element-attribute-value-query(xs:QName('field'), xs:QName('type'), 'title'),
                        cts:word-query('queries')
                  ))
            )
      ),
      ('filtered', 'checked')
)
return
      <res id='{ $r/@id }' score='{ cts:score($r) }' />

Will match the word "queries" in field[@type='title']

And

for $r in cts:search(doc('domtest.xml')/d/auth/meta,
      cts:element-query(xs:QName('field'),
            cts:and-query((
                  cts:element-attribute-value-query(xs:QName('field'), xs:QName('type'), 'title'),
                  cts:word-query('queries')
            ))
      ),
      ('filtered', 'checked')
)
return
      <res id='{ $r/ancestor-or-self::auth[last()]/@id }' score='{ cts:score($r) }' />

The second will match only at the root level, though with the added work of 
having to find the root id in the result, and I still have to do extra work 
combining any res items returned for multiple hits in the same root.
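
One way the combining step could be done in XQuery is sketched below. The input is a hypothetical sequence of per-hit res elements in the shape the queries above return: the ids come from the sample data in this thread, the scores are made up, and taking the best single score per root is an assumption (a sum would be another reasonable choice).

```xquery
xquery version "1.0-ml";

(: Hypothetical per-hit results; ids are from the sample data, scores are
   invented for illustration. :)
let $hits := (
  <res id='2sdfsfsdf' score='8'/>,
  <res id='4cvzxcvxcv' score='16'/>,
  <res id='2sdfsfsdf' score='24'/>
)
(: Group the hits by root id. :)
for $id in fn:distinct-values($hits/@id)
let $group := $hits[@id = $id]
(: Assumption: the combined score is the best single hit in the group. :)
let $score := fn:max(for $s in $group/@score return xs:integer($s))
order by $score descending
return
  <res id='{ $id }' score='{ $score }' hits='{ fn:count($group) }' />
```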

D

From: Dominic Beesley
Sent: 10 October 2014 16:12
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] CTS Path restrictions

Hello,

I've been playing around with cts:search trying to get better performance on 
our data set. I'm having a few problems!

The data (simplified here) is a set of nested <auth> elements which represent 
the root, chapter, section, etc. levels of a document. Each <auth> contains a 
<meta> which in turn contains <field> elements. Each field element has a @type 
attribute which indicates its type. The structure of the data cannot be 
changed...

Firstly, how do I search for a word in an element with a particular attribute? 
For instance, in the example below, look for the word "queries" in 
field[@type='title']. (Example 1 is what I thought would work, but it doesn't: 
it returns the hits on the precis field too.)

Secondly, how do I search for something only at the root level? For instance, 
if I just wanted to match a word in /d/auth/meta/field[@type='title'] as 
opposed to /d/auth//field[@type='title'].

I know that I could use the path in the cts:search to narrow down the search to 
a particular level/field, but the trouble then is when I want to combine two 
searches with and/or.

For example, I can do cts:search(doc()/d/auth/meta/field[@type='title'], 
'precis'), but how would I search for, say, "beer" in title and "pdfs" in 
precis?

I'm sure I had some of this working about 4 years ago but I can't remember it!

At the moment I've got it doing something like:

for $r in cts:search(/d/auth/meta/field[@type='title'], 'beer')
return <item id='{ $r/ancestor::auth[not(parent::auth)]/@id }' />
,
for $r in cts:search(/d/auth/meta/field[@type='precis'], 'pdfs')
return <item id='{ $r/ancestor::auth[not(parent::auth)]/@id }' />


The application then does the combination and score arithmetic. The trouble is 
that the $r/ancestor::... step is quite slow on large sets, and a lot of 
information is returned to the app from the server, which for an "and" is 
usually a lot larger than the intersection. I could probably do the 
intersection in XQuery too, but I suspect the combining and finding the root id 
would still be slow?
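
A sketch of how the "and" might be pushed into a single search, reusing the nested cts:element-query pattern from the 12 October reply in this thread. The helper local:in-field is a hypothetical name introduced for illustration, and this has not been run against the real data set.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: builds a query for $word inside a <field> whose
   @type equals $type, nested under a <meta> element. :)
declare function local:in-field($type as xs:string, $word as xs:string)
as cts:query
{
  cts:element-query(xs:QName('meta'),
    cts:element-query(xs:QName('field'),
      cts:and-query((
        cts:element-attribute-value-query(
          xs:QName('field'), xs:QName('type'), $type),
        cts:word-query($word)
      ))))
};

(: "beer" in a title field AND "pdfs" in a precis field, one result per
   top-level auth, with a single relevance score from the server. :)
for $r in cts:search(doc('domtest.xml')/d/auth,
      cts:and-query((
        local:in-field('title', 'beer'),
        local:in-field('precis', 'pdfs')
      )),
      ('filtered', 'checked'))
return
      <res id='{ $r/@id }' score='{ cts:score($r) }' />
```

Swapping the outer cts:and-query for cts:or-query would give the "or" case. Because the searchable expression is /d/auth, there is no ancestor walk per hit, but note that the two words may match at any depth under each top-level auth, not only in its own meta.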

Any pointers appreciated!

D


xdmp:document-insert('domtest.xml',
<d>
<auth id="1werwer">
      <meta>
            <field type="title">The Book of Queries</field>
            <field type="precis">One man's struggle to overcome confusion and reach enlightenment</field>
      </meta>
</auth>
<auth id="2sdfsfsdf">
      <meta>
            <field type="title">Misunderstanding CTS</field>
            <field type="precis">My autobiography</field>
      </meta>
      <auth id="xcvzxcvzxcv">
            <meta>
                  <field type="number">Chapter 1</field>
                  <field type="title">Queries explained</field>
            </meta>
      </auth>
</auth>
<auth id="4cvzxcvxcv">
      <meta>
            <field type="title">The good beer guid</field>
            <field type="precis">What you really should be reading instead of pdfs about queries</field>
      </meta>
      <auth id="3wersvscdv">
            <meta>
                  <field type="number">Chapter 1</field>
                  <field type="title">Timoth Taylor</field>
            </meta>
      </auth>
</auth></d>
);

EXAMPLE 1: NOT WORKING - look for "queries" in field "title"

declare namespace xm="http://xsd.oup.com/xauth/meta";

for $r in cts:search(doc('domtest.xml')/d/auth,
      cts:element-query(xs:QName('meta'),
            cts:and-query((
                  cts:element-attribute-value-query(xs:QName('field'), xs:QName('type'), 'title')
            ,     cts:element-word-query(xs:QName('field'), 'queries')
            ))
      )
)
return
      <res id='{ $r/@id }' score='{ cts:score($r) }' />
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
