Sorry if I seem to keep moving the goalposts. I've been playing with
some of the parameters and experimenting with my XQuery, and the best
performance I can get is with the following query. Unfortunately, it
still takes a considerable amount of time to execute. Profiling shows
that the XPath step $nodes/@count is taking up the time.

let $allKeywords := fn:collection()/doc/keywords/keyword/@value
let $distinct := fn:distinct-values($allKeywords)

for $k in $distinct
return
  let $search := cts:element-attribute-word-query(
    fn:QName("", "keyword"), fn:QName("", "value"), $k)
  let $results := cts:search(fn:collection()/doc/keywords/keyword, $search)
  let $nodes := $results[@value eq $k]
  let $counts := $nodes/@count
  return
    <keyword value="{$k}" total="{fn:sum($counts)}" />

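
For comparison, here is a minimal sketch that avoids issuing one
cts:search per distinct value: it reads the keyword elements once and
sums @count per value in memory. It assumes, as in the query above, that
every keyword element carries both a value and a count attribute. It has
not been profiled against this data, and whether it is any faster will
depend on how many keyword elements the database holds.

```xquery
(: Sketch only: fetch every keyword element a single time. :)
let $keywords := fn:collection()/doc/keywords/keyword

(: Group in memory by the distinct @value strings. :)
for $k in fn:distinct-values($keywords/@value)
(: Assumes each keyword element has a numeric @count attribute. :)
let $counts := $keywords[@value eq $k]/@count
return
  <keyword value="{$k}" total="{fn:sum($counts)}" />
```

The trade-off is that all keyword elements are held in memory at once,
so this is only a candidate to profile alongside the search-based
version, not a guaranteed improvement.
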
On Thu, Nov 13, 2008 at 9:21 AM, Steve <[EMAIL PROTECTED]> wrote:
> I've been applying what I've learnt so far from this thread, but I'm
> having a bit of trouble getting good performance when I put it all
> together. The query I'm trying to execute in order to get the sum of a
> count of keywords is below:
>
> let $allKeywords := fn:collection()/doc/keywords/keyword/@value
> let $distinct := fn:distinct-values($allKeywords)
>
> for $k in $distinct
> return
>   let $search := cts:element-attribute-word-query(fn:QName("",
>     "keyword"), fn:QName("", "value"), $k)
>   let $results := cts:search(fn:collection()/doc/keywords/keyword,
>     $search)/@count
>   return
>     fn:sum($results)
>
> Running profiling on the query shows me that it's the XPath I apply
> to the search results that's holding everything up. Can anyone advise
> how I can improve this?
>
> Thanks
>
> On Wed, Nov 12, 2008 at 7:24 PM, Michael Blakeley
> <[EMAIL PROTECTED]> wrote:
>> To be fair, absorbing the architecture and indexing behavior of a modern
>> RDBMS isn't trivial either. XML content adds another dimension, but I hope
>> you find the performance guide at http://developer.marklogic.com/pubs/4.0/
>> helpful. There are also useful bits of server architecture discussion in the
>> dev and admin guides.
>>
>> In the general case I wouldn't expect adding a range index to greatly
>> improve value query performance. The list cache is pretty efficient at
>> keeping frequently-used terms in memory.
>>
>> Usually the range indexes are created for applications that need particular
>> features: fast sorting on a node value, fast range queries, fast access to
>> distinct values, etc.
>>
>> -- Mike
>>
>> Whitby, Rob, CMG wrote:
>>>
>>> Wow, I didn't realise that. It will improve performance though, right? On
>>> a large database I assume the index of all XML elements and attributes
>>> can't be held in memory.
>>>
>>> Understanding how the functions relate to the indexes is probably one of
>>> the areas I've found hardest with MarkLogic.
>>>
>>> Thanks
>>> Rob
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: [EMAIL PROTECTED]
>>> [mailto:[EMAIL PROTECTED] On Behalf Of Michael
>>> Blakeley
>>> Sent: 12 November 2008 16:55
>>> To: General Mark Logic Developer Discussion
>>> Cc: [EMAIL PROTECTED]
>>> Subject: Re: [MarkLogic Dev General] Improving XPath Performance on
>>> SearchResults
>>>
>>> Actually that query does *not* require any special indexes. The server
>>> always indexes all XML element values and element-attribute values.
>>>
>>> You would only need an attribute range index for a fast "order by" on
>>> keyword/@value, or for a cts:attribute-value-range-query term, or for
>>> cts:element-attribute-values() and its associated functions.
>>>
>>> -- Mike
>>>
>>> Whitby, Rob, CMG wrote:
>>>>
>>>> If you put an attribute range index on keyword/@value you can do
>>>> something like this:
>>>>
>>>> cts:search(
>>>>   /doc/classifications/classification,
>>>>   cts:element-attribute-value-query(xs:QName("keyword"),
>>>>     xs:QName("value"), "something")
>>>> )
>>>>
>>>> (untested!)
>>>>
>>>> Rob
>>>>
>>>> -----Original Message-----
>>>> From: [EMAIL PROTECTED]
>>>> [mailto:[EMAIL PROTECTED] On Behalf Of Steve
>>>> Sent: 12 November 2008 14:41
>>>> To: James Clippinger
>>>> Cc: General Mark Logic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] Improving XPath Performance on
>>>> SearchResults
>>>>
>>>> I should probably add that I'm trying to extract all classification
>>>> values for the documents that have a specific keyword value.
>>>>
>>>> On Wed, Nov 12, 2008 at 2:40 PM, Steve <[EMAIL PROTECTED]>
>>>> wrote:
>>>>>
>>>>> Thanks for your response.
>>>>>
>>>>> I've tried your suggestion and it doesn't really help. Looking at the
>>>>> profiling document, I can see that it's clearly the XPath on the
>>>>> document results that is slowing the whole thing down. Are there any
>>>>> other ways that I can improve this? I've included a (small) sample
>>>>> document so you can see what I'm trying to achieve.
>>>>>
>>>>> <doc>
>>>>>   <classifications>
>>>>>     <classification value="123" />
>>>>>     <classification value="324" />
>>>>>   </classifications>
>>>>>   <keywords>
>>>>>     <keyword value="word1" />
>>>>>     <keyword value="word2" />
>>>>>
>>>>>   </keywords>
>>>>> </doc>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Nov 12, 2008 at 2:24 PM, James Clippinger
>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> Steve, your query is doing some heavyweight filtering for the XPath
>>>>>> because it is doing two steps:
>>>>>>
>>>>>> 1) Execute the cts:search(): generate a list of all documents matching
>>>>>> the query in relevance order.
>>>>>>
>>>>>> 2) Execute the XPath: reorder the documents into document order and
>>>>>> find only those with /doc/classifications/classification elements,
>>>>>> returning those classification elements.
>>>>>>
>>>>>> Since you are using XPath and thus returning results in document order,
>>>>>> you probably want to use cts:contains() in an XPath predicate rather
>>>>>> than cts:search(). cts:contains() in a rooted XPath expression will use
>>>>>> the search indexes when appropriate, so it's as fast as the equivalent
>>>>>> cts:search() expression. Try this:
>>>>>>
>>>>>> let $search := cts:element-attribute-word-query(
>>>>>>   fn:QName("", "keyword"), fn:QName("", "value"), "something")
>>>>>> return
>>>>>>   fn:collection()/doc[cts:contains(., $search)]/classifications/classification
>>>>>>
>>>>>> James
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: [EMAIL PROTECTED]
>>>>>>> [mailto:[EMAIL PROTECTED] On Behalf Of Steve
>>>>>>> Sent: Wednesday, November 12, 2008 8:54 AM
>>>>>>> To: [email protected]
>>>>>>> Subject: [MarkLogic Dev General] Improving XPath Performance on
>>>>>>> SearchResults
>>>>>>>
>>>>>>> I've written a query which I use to search my data set and I am able
>>>>>>> to get the results back very quickly. However, the results that I get
>>>>>>> back show the complete document that the search matched, whereas I
>>>>>>> want a particular node returned.
>>>>>>> At the moment I'm doing this:
>>>>>>>
>>>>>>> let $search := cts:element-attribute-word-query(fn:QName("",
>>>>>>>   "keyword"), fn:QName("", "value"), "something")
>>>>>>> let $results :=
>>>>>>>   cts:search(fn:collection(), $search)/doc/classifications/classification
>>>>>>> return $results
>>>>>>>
>>>>>>> I've tried profiling this query and I've found that there is a big lag
>>>>>>> filtering the $results of the search using XPath.
>>>>>>> Is there any way, either through using a different query or search
>>>>>>> notation, or by indexes etc., that I can speed this up?
>>>>>>>
>>>>>>> Thanks in advance...
>>>>>>> _______________________________________________
>>>>>>> General mailing list
>>>>>>> [email protected]
>>>>>>> http://xqzone.com/mailman/listinfo/general
>>>>>>>
>>
>>
>