Re: [MarkLogic Dev General] Are xml attributes indexed by default

seme...@hotmail.com Wed, 11 Jan 2012 14:40:00 -0800


/*[itemMeta/url/@href eq $public_url]
is fully searchable on MarkLogic 5, but not 4.1

dunno why

From: evan.l...@marklogic.com
To: general@developer.marklogic.com
Date: Wed, 11 Jan 2012 14:36:40 -0800
Subject: Re: [MarkLogic Dev General] Are xml attributes indexed by default

Or slightly more simply:

    /*[itemMeta/url/@href eq $public_url]

However, I gave this to xdmp:plan() and, for some reason, it's not fully
leveraging the indexes here, despite my assumption that it would. At index
resolution, the above query selects all the XML documents in the database (same
as /*).

It turns out xdmp:plan() is pretty essential here to figure out a formulation
that works to fully leverage the indexes (even though they theoretically all
should work).

The following reformulation is no better:

    xdmp:plan(collection()[/*/itemMeta/url/@href eq $public_url]/*)

However, strangely, the following (equivalent) reformulation hits the mark
(selects 0 fragments in my database, which has no <itemMeta> elements):

    xdmp:plan(collection()[*/itemMeta/url/@href eq $public_url]/*)

And it's the only one that gives me this reassuring message:

    Step 1 predicate 1 contributed 3 constraints: */itemMeta/url/@href eq "some_string"

Moral of the story is that xdmp:plan() is your friend. And the last query above
is the most efficient.
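
For anyone wanting to reproduce the comparison, a minimal sketch that runs both formulations through xdmp:plan() in a single query might look like the following. The value bound to $public_url is a placeholder borrowed from the sample document in this thread (an assumption, not data from the test database), and the plans returned will depend on the database's contents, index settings and server version.

```xquery
xquery version "1.0-ml";

(: Placeholder value, assumed for illustration only :)
let $public_url := "news/business-1234"
return (
  (: formulation reported above as selecting every document at index resolution :)
  xdmp:plan(/*[itemMeta/url/@href eq $public_url]),
  (: formulation reported above as contributing constraints from the predicate :)
  xdmp:plan(collection()[*/itemMeta/url/@href eq $public_url]/*)
)
```

Comparing the two plans side by side should make it easier to see which predicate steps contribute constraints and which are skipped.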

Evan Lenz
Software Developer, Community
MarkLogic Corporation
http://developer.marklogic.com

From:  Geert Josten <geert.jos...@dayon.nl>
Reply-To:  General MarkLogic Developer Discussion 
<general@developer.marklogic.com>
Date:  Wed, 11 Jan 2012 13:47:33 -0800
To:  General MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject:  Re: [MarkLogic Dev General] Are xml attributes indexed by default

Hi Jon,

Your original expression was like this:

/*[
    /*/itemMeta/url[@href=$public_url]
    ]

That would return every root element in the database if there is any doc
which meets the predicate expression. I think writing it as follows might
improve things:

/*[
    ./itemMeta/url[@href=$public_url]
    ]

(note the period before /itemMeta)

You could also use cts functions to explicitly rely on indexes:

cts:search(collection(),
    cts:element-query(xs:QName("itemMeta"),
        cts:element-attribute-value-query(xs:QName("url"), xs:QName("href"),
            $public_url)))/*

Using CQ or Query Console Profile features, or functions like xdmp:plan, can
also help to pinpoint and analyze performance problems.

Kind regards,
Geert

From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Jonathan Cook - FM&T
Sent: Wednesday, 11 January 2012 15:02
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Are xml attributes indexed by default

Thanks for the reply,

In words, I want to select the document that matches a given href, i.e. return
the whole document which matches itemMeta/url[@href=$public_url]

Xml could look like this:

<?xml version="1.0" encoding="UTF-8"?>
<story>
  <itemMeta>
    <url href="news/business-1234"/>
    <assetTypeCode>STY</assetTypeCode>
    <headline>Full Headline</headline>
    <shortHeadline>Headline</shortHeadline>
    <summary>Summary</summary>
    <section id="1" name="Technology"/>
    <firstCreated>2011-10-17T14:47:52+00:00</firstCreated>
    <lastUpdated>2011-12-05T13:40:19+00:00</lastUpdated>
    <publicationStatus>PUBLISHED</publicationStatus>
    <language>en-GB</language>
    <siteId>News v6</siteId>
    <provider>cps</provider>
  </itemMeta>
  <pageOptions/>
  <body/>
  <media>
    <images/>
  </media>
  <relatedGroups/>
</story>

So you would want to return the whole story document. However, other documents
could have a different root node. itemMeta will always exist under the root
node with this structure, so, as you say, I suspect there is a more efficient
way of doing it.

I had a quick play with let $results := 
xinc:node-expand(//bbc:itemMeta/bbc:url[@href=$public_url]) but that just 
returns the actual url element not the whole document.

Thanks
Jon
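
One possible way to get from the matched url element back to the document that contains it is fn:root(). The sketch below assumes the elements are in no namespace, as in the sample XML above (the bbc: prefix used in the attempt above is not declared anywhere in this thread), and uses a placeholder href value. Whether the path is resolved efficiently from the indexes would still need checking with xdmp:plan().

```xquery
xquery version "1.0-ml";

(: Placeholder value, assumed for illustration only :)
let $public_url := "news/business-1234"
(: first url element whose href matches; no namespace assumed :)
let $url := (//itemMeta/url[@href eq $public_url])[1]
(: fn:root() climbs from the element to its document node;
   it yields the empty sequence when nothing matched :)
return fn:root($url)
```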

On 11/01/2012 13:39, "Geert Josten" <geert.jos...@dayon.nl> wrote:

Hi Jon,

You always have the word index, which can be explicitly used with
cts:element-attribute-word-query (and the element counterpart
cts:element-word-query). You will have to enable wildcard switches on the
database to use wildcards within the searched values. If you want to look up
exact values, you might be better off with the value index, which can be
explicitly used with cts:element-value-query and
cts:element-attribute-value-query. A range index is only needed for the
range-query functions such as cts:element-attribute-range-query.

Whether or not XPath expressions are backed by these indexes and lexicons
depends on how well the query optimizer is able to match them. You might be
interested in this blog item from Evan, which explains how you can analyze this:

http://developer.marklogic.com/blog/learning-with-query-trace

However, I think you are asking something different here, looking at your code. 
To be honest, the argument of xinc:node-expand looks rather suspicious. /* 
accesses the entire database, but you do it a second time in the predicate. 
That is indeed very expensive. Can you put in words what you are trying to do? 
We might be able to suggest better ways of handling this.

Kind regards,
Geert

From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Jonathan Cook - FM&T
Sent: Wednesday, 11 January 2012 12:56
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Are xml attributes indexed by default

Hi,

Are attributes indexed by default in Mark Logic Server? The documentation says 
the whole document structure is indexed but that doesn’t answer the question 
really.

For example I want to ensure the href attribute is indexed for the following 
xquery:
xquery version "1.0-ml";

import module namespace xinc = "http://marklogic.com/xinclude"
    at "/MarkLogic/xinclude/xinclude.xqy";

let $public_url := "%s"

let $results := xinc:node-expand(/*[
    /*/itemMeta/url[@href=$public_url]
    ])

let $asset := $results[1]
return $asset

Perhaps the above xquery isn’t very efficient, as I also read that wildcards
couldn’t be used with indexes?

This option is set to false but I suspect this is something different:
“Index attribute value positions for faster near searches involving 
element-attribute-value-query (slower document loads and larger database 
files).”

Thanks
Jon 

http://www.bbc.co.uk
This e-mail (and any attachments) is confidential and may contain personal
views which are not the views of the BBC unless specifically stated.
If you have received it in error, please delete it from your system.
Do not use, copy or disclose the information in any way nor act in reliance on
it and notify the sender immediately.
Please note that the BBC monitors e-mails sent or received.
Further communication will signify your consent to this.

_______________________________________________
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general

