Hi Kelly,

I tried your solution, and it seems much faster. The problem for me now is that
our production machine does not have element range indexes on coden and volume.
I set them up on the development machine, and the query is pretty fast there: it
takes about 0.1 to 0.2 seconds to get a result, which makes me pretty excited.
But the development machine does not have enough data to show me how the query
performs on a large data set.

I tried the following query on the development box (this one uses
cts:element-range-query):
let $coden := "AAA"
for $vol in
  cts:element-values(fn:QName("http://www.aip.org/phoenixml","volume"), (),
    ("collation=http://marklogic.com/collation/en/MO"),
    cts:element-range-query(fn:QName("http://www.aip.org/phoenixml","coden"),
      "=", $coden,
      "collation=http://marklogic.com/collation/en/MO"))
order by $vol collation "http://marklogic.com/collation/en/MO"
return
<volume vol="{$vol}">
  {
     for $iss in
       cts:element-values(fn:QName("http://www.aip.org/phoenixml","issno"), (),
         ("collation=http://marklogic.com/collation/en/MO"),
         cts:and-query((
           cts:element-range-query(fn:QName("http://www.aip.org/phoenixml","coden"),
             "=", $coden,
             "collation=http://marklogic.com/collation/en/MO"),
           cts:element-range-query(fn:QName("http://www.aip.org/phoenixml","volume"),
             "=", $vol,
             "collation=http://marklogic.com/collation/en/MO")
         ))
       )
     order by $iss collation "http://marklogic.com/collation/en/MO"
     return
       <issue>{ $iss }</issue>
  }
</volume>


The element-range-query version took about 0.2 seconds.


Then I tried the following query on production using cts:element-value-query:
let $coden := "AAA"
for $vol in
  cts:element-values(fn:QName("http://www.aip.org/phoenixml","volume"), (),
    ("collation=http://marklogic.com/collation/en/MO"),
    cts:element-value-query(fn:QName("http://www.aip.org/phoenixml","coden"),
      $coden, "exact"))
order by $vol collation "http://marklogic.com/collation/en/MO"
return
<volume vol="{$vol}">
  {
     for $iss in
       cts:element-values(fn:QName("http://www.aip.org/phoenixml","issno"), (),
         ("collation=http://marklogic.com/collation/en/MO"),
         cts:and-query((
           cts:element-value-query(fn:QName("http://www.aip.org/phoenixml","coden"),
             $coden, "exact"),
           cts:element-value-query(fn:QName("http://www.aip.org/phoenixml","volume"),
             $vol, "exact")
         ))
       )
     order by $iss collation "http://marklogic.com/collation/en/MO"
     return
       <issue>{ $iss }</issue>
  }
</volume>

This is the production box. The very first time I ran it, it took 4.4 seconds;
after that, maybe because everything was already in memory, it took about 0.3
seconds to run.

I also ran this query on the development box to see the difference between
cts:element-range-query and cts:element-value-query. Maybe the data is not big
enough, or maybe all the indexes are already in memory, but I could not see a
big difference.
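
One way to compare the two variants more reliably might be to time each one in
its own request and look at the cache counters, since a warm cache can hide the
difference. The following is only a minimal sketch, assuming element range
indexes on coden and volume with the collation used above; it relies on the
built-in xdmp:elapsed-time() and xdmp:query-meters() functions, and the
variable names are just for illustration:

```xquery
xquery version "1.0-ml";

(: Sketch only: times the volume lookup for a single coden.
   Assumes element range indexes on coden and volume with this collation. :)
let $ns := "http://www.aip.org/phoenixml"
let $collation := "collation=http://marklogic.com/collation/en/MO"
let $coden := "AAA"
return (
  (: count the results so the lexicon call is fully evaluated :)
  fn:count(
    cts:element-values(fn:QName($ns, "volume"), (), $collation,
      cts:element-range-query(fn:QName($ns, "coden"), "=", $coden, $collation))),
  (: time elapsed since this request started :)
  xdmp:elapsed-time(),
  (: cache hit and miss counters for this request :)
  xdmp:query-meters()
)
```

Swapping the cts:element-range-query call for the cts:element-value-query form
and running each version more than once should show how much of the first-run
time is cache warm-up rather than query cost.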

So my questions are:

1. In theory, what is the performance difference between these two queries?
Which one is supposed to be faster in my situation? I need to understand more
before I add the indexes to the production box.

2. This query is going to be used as part of the home page for every coden, so
it will be called a lot, and I also have other data to display on the home
page. This means the query should be as fast as possible. If it takes 0.2
seconds on the development machine, it might take 0.5 seconds on production (I
have not tried yet; this is just a guess). Is there anything I can do to
optimize it? I'm trying to decide whether it is fast enough for a real-time
call or whether I have to precompute the result up front.

Thanks a lot,

Helen




 


 
>>> Kelly Stirman <[email protected]> 04/07/10 10:05 AM >>> 
Hi Helen,

You can take Danny's approach a few steps further to use range indexes for
everything.

Something like:

let $codens := cts:element-values(xs:QName("coden"),(),(),cts:and-query(()))

for $c in $codens
return
        <coden name="{$c}">
        {
                (: the query is the fourth argument; the third is the options :)
                let $volumes := cts:element-values(xs:QName("volume"),(),(),
                        cts:element-range-query(xs:QName("coden"),"=",$c))
                for $v in $volumes
                return
                <volume name="{$v}">
                {
                        let $issues := cts:element-values(xs:QName("article"),(),(),
                                cts:and-query((
                                        cts:element-range-query(xs:QName("volume"),"=",$v),
                                        cts:element-range-query(xs:QName("coden"),"=",$c))))
                        for $i in $issues
                        return  <issue name="{$i}"/>
                }
                </volume>
        }
        </coden>

Let us know if this helps and we may be able to do some further optimization.

Kelly

PS -  you could probably add cts:frequency() to each level in the tree to get
counts if you'd like.
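
A minimal sketch of that idea for the top level only, assuming the same
un-namespaced coden range index as in the query above. cts:frequency() needs
values that come straight from a lexicon call such as cts:element-values(), and
the count attribute name is just an illustration:

```xquery
xquery version "1.0-ml";

(: Sketch only: one element per coden, with the number of fragments
   that contain that coden value. Assumes a range index on coden. :)
for $c in cts:element-values(xs:QName("coden"))
return
  <coden name="{$c}" count="{cts:frequency($c)}"/>
```

The same call could be wrapped around $v and $i in the nested loops; with a
constraining query, the frequency should reflect only the fragments that match
that query.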

Message: 1
Date: Tue, 6 Apr 2010 14:48:31 -0700
From: Danny Sokolsky <[email protected]>
Subject: RE: [MarkLogic Dev General] how to build the tree based on
        data
To: General Mark Logic Developer Discussion
        <[email protected]>
Message-ID:
        <c9924d15b04672479b089f7d55ffc13254498...@exchg-BE.marklogic.com>
Content-Type: text/plain; charset="us-ascii"

If there is a range index on coden, then you can substitute the:

fn:distinct-values(//coden)

with 

cts:element-values(xs:QName("coden"))

That should speed things up a bit....

- Danny


_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
