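This is where a range index really is useful. With an attribute range
index on keyword/@value, the distinct values and their frequencies come
straight from the index: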

for $keyword in cts:element-attribute-values(xs:QName("keyword"),
    xs:QName("value"))
let $count := cts:frequency($keyword)
order by $count descending
return <keyword value="{$keyword}" count="{$count}"/>
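
For a rough idea of the shape of result that query returns, here is a
plain XQuery sketch that computes it without any index, over two made-up
inline documents modelled on the sample <doc> quoted further down. It is
only an illustration, not MarkLogic-specific code: fn:distinct-values
and fn:count stand in for cts:element-attribute-values and
cts:frequency, and it assumes each keyword appears at most once per
document so that the counts agree with a per-document frequency.

```xquery
xquery version "1.0";

(: Hypothetical sample data; the element and attribute names follow the
   sample document in this thread, the contents are invented. :)
let $docs := (
  <doc>
    <keywords>
      <keyword value="word1"/>
      <keyword value="word2"/>
    </keywords>
  </doc>,
  <doc>
    <keywords>
      <keyword value="word1"/>
    </keywords>
  </doc>
)
(: Every keyword/@value in the sample, duplicates included. :)
let $values := $docs/keywords/keyword/@value
(: One pass per distinct value: this scan is the work a range index avoids. :)
for $keyword in fn:distinct-values($values)
let $count := fn:count($values[. eq $keyword])
order by $count descending
return <keyword value="{$keyword}" count="{$count}"/>
```

On this sample it should return word1 with a count of 2 followed by
word2 with a count of 1. The indexed version gives the same kind of
output but reads values and counts from the range index instead of
visiting every keyword node.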

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Steve
Sent: 13 November 2008 10:13
To: Michael Blakeley
Cc: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Improving XPath Performance on
SearchResults

Sorry if I seem to keep moving the goalposts. I've been playing with
some of the parameters, and experimenting with my XQuery and the best
performance I can get is using the following query. Unfortunately, it
still takes a considerable amount of time to execute.  Profiling shows
that the XPath $nodes/@count is taking the time.

let $allKeywords := fn:collection()/doc/keywords/keyword/@value
let $distinct := fn:distinct-values($allKeywords)

for $k in $distinct
  return
    let $search := cts:element-attribute-word-query(fn:QName("",
"keyword"), fn:QName("", "value"), $k)
    let $results := cts:search(fn:collection()/doc/keywords/keyword,
$search)
    let $nodes := $results[@value eq $k]
    let $counts := $nodes/@count
       return
          <keyword value="{$k}" total="{fn:sum($counts)}" />


On Thu, Nov 13, 2008 at 9:21 AM, Steve <[EMAIL PROTECTED]>
wrote:
> I've been applying what I've learnt so far from this thread, but I'm
> having a bit of trouble getting good performance when I put it all
> together. The query I'm trying to execute in order to get the sum of a
> count of keywords is below:
>
> let $allKeywords := fn:collection()/doc/keywords/keyword/@value
> let $distinct := fn:distinct-values($allKeywords)
>
> for $k in $distinct
>  return
>    let $search := cts:element-attribute-word-query(fn:QName("",
> "keyword"), fn:QName("", "value"), $k)
>    let $results := cts:search(fn:collection()/doc/keywords/keyword,
> $search)/@count
>       return
> fn:sum($results)
>
> Running profiling on the query shows me that it's the XPath stuff I do
> on the search results that's holding everything up. Can anyone advise
> how I can improve this?
>
> Thanks
>
> On Wed, Nov 12, 2008 at 7:24 PM, Michael Blakeley 
> <[EMAIL PROTECTED]> wrote:
>> To be fair, absorbing the architecture and indexing behavior of a
>> modern RDBMS isn't trivial either. XML content adds another
>> dimension, but I hope you find the performance guide at
>> http://developer.marklogic.com/pubs/4.0/
>> helpful. There are also useful bits of server architecture discussion
>> in the dev and admin guides.
>>
>> In the general case I wouldn't expect adding a range index to greatly
>> improve value query performance. The list cache is pretty efficient
>> at keeping frequently-used terms in memory.
>>
>> Usually the range indexes are created for applications that need
>> particular features: fast sorting on a node value, fast range queries,
>> fast access to distinct values, etc.
>>
>> -- Mike
>>
>> Whitby, Rob, CMG wrote:
>>>
>>> Wow, I didn't realise that. It will improve performance though,
>>> right? On a large database I assume the index of all XML elements
>>> and attributes can't be held in memory.
>>>
>>> Understanding how the functions relate to the indexes is probably
>>> one of the areas I've found hardest with MarkLogic.
>>>
>>> Thanks
>>> Rob
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: [EMAIL PROTECTED]
>>> [mailto:[EMAIL PROTECTED] On Behalf Of 
>>> Michael Blakeley
>>> Sent: 12 November 2008 16:55
>>> To: General Mark Logic Developer Discussion
>>> Cc: [EMAIL PROTECTED]
>>> Subject: Re: [MarkLogic Dev General] Improving XPath Performance
>>> on SearchResults
>>>
>>> Actually that query does *not* require any special indexes. The
>>> server always indexes all XML element values and element-attribute
>>> values.
>>>
>>> You would only need an attribute range index for a fast "order by"
>>> on keyword/@value, or for a cts:element-attribute-range-query term, or
>>> for cts:element-attribute-values() and its associated functions.
>>>
>>> -- Mike
>>>
>>> Whitby, Rob, CMG wrote:
>>>>
>>>> If you put an attribute range index on keyword/@value you can do 
>>>> something like this:
>>>>
>>>> cts:search(
>>>>  /doc/classifications/classification,
>>>>  cts:element-attribute-value-query(xs:QName("keyword"),
>>>>    xs:QName("value"), "something")
>>>> )
>>>>
>>>> (untested!)
>>>>
>>>> Rob
>>>>
>>>> -----Original Message-----
>>>> From: [EMAIL PROTECTED]
>>>> [mailto:[EMAIL PROTECTED] On Behalf Of Steve
>>>> Sent: 12 November 2008 14:41
>>>> To: James Clippinger
>>>> Cc: General Mark Logic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] Improving XPath Performance
>>>> on SearchResults
>>>>
>>>> I should probably add that I'm trying to extract all classification
>>>> values for the documents that have a specific keyword value.
>>>>
>>>> On Wed, Nov 12, 2008 at 2:40 PM, Steve <[EMAIL PROTECTED]>
>>>> wrote:
>>>>>
>>>>> Thanks for your response.
>>>>>
>>>>> I've tried your suggestion and it doesn't really help. Looking at
>>>>> the profiling document, I can see that it's clearly the XPath on the
>>>>> document results that is slowing the whole thing down. Are there any
>>>>> other ways that I can improve this? I've included a sample
>>>>> document (small), so you can see what I'm trying to achieve.
>>>>>
>>>>> <doc>
>>>>>  <classifications>
>>>>>   <classification value="123" />
>>>>>   <classification value="324" />
>>>>>  </classifications>
>>>>>  <keywords>
>>>>>   <keyword value="word1" />
>>>>>   <keyword value="word2" />
>>>>>
>>>>>  </keywords>
>>>>> </doc>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Nov 12, 2008 at 2:24 PM, James Clippinger 
>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> Steve, your query is doing some heavyweight filtering for the 
>>>>>> XPath because it is doing two steps:
>>>>>>
>>>>>> 1) Execute the cts:search(): generate a list of all documents 
>>>>>> matching the query in relevance order.
>>>>>>
>>>>>> 2) Execute the XPath: reorder the documents into document order 
>>>>>> and find only those with /doc/classifications/classification 
>>>>>> elements, returning those classification elements.
>>>>>>
>>>>>> Since you are using XPath and thus returning results in document
>>>>>> order, you probably want to use cts:contains() in an XPath
>>>>>> predicate rather than cts:search().  cts:contains() in a rooted
>>>>>> XPath expression will use the search indexes when appropriate, so
>>>>>> it's as fast as the equivalent cts:search() expression.  Try this:
>>>>>>
>>>>>> let $search := cts:element-attribute-word-query(fn:QName("",
>>>>>>   "keyword"), fn:QName("", "value"), "something")
>>>>>> return
>>>>>>   fn:collection()/doc[cts:contains(., $search)]/classifications/classification
>>>>>>
>>>>>> James
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: [EMAIL PROTECTED]
>>>>>>> [mailto:[EMAIL PROTECTED] On Behalf Of 
>>>>>>> Steve
>>>>>>> Sent: Wednesday, November 12, 2008 8:54 AM
>>>>>>> To: [email protected]
>>>>>>> Subject: [MarkLogic Dev General] Improving XPath Performance on 
>>>>>>> SearchResults
>>>>>>>
>>>>>>> I've written a query which I use to search my data set and I am
>>>>>>> able to get the results back very quickly. However, the results
>>>>>>> that I get back show the complete document that the search
>>>>>>> matched, whereas I want a particular node returned.
>>>>>>> At the moment I'm doing this:
>>>>>>>
>>>>>>> let $search := cts:element-attribute-word-query(fn:QName("",
>>>>>>>   "keyword"), fn:QName("", "value"), "something")
>>>>>>> let $results :=
>>>>>>>   cts:search(fn:collection(), $search)/doc/classifications/classification
>>>>>>>   return $results
>>>>>>>
>>>>>>> I've tried profiling this query and I've found that there is a 
>>>>>>> big lag filtering the $results of the search using XPath.
>>>>>>> Is there any way, either through using a different query or
>>>>>>> search notation, or by indexes etc., that I can speed this up?
>>>>>>>
>>>>>>> Thanks in advance...
>>>>>>> _______________________________________________
>>>>>>> General mailing list
>>>>>>> [email protected]
>>>>>>> http://xqzone.com/mailman/listinfo/general
>>>>>>>
>>>
>>
>>
>
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
