Hi Geert,

Your point is really useful to me. I finally put the range index on production and tested the speed. The result is amazing.
I really don't need to expand the whole tree the first time; I only need to expand one volume at a time. So I tested the following:

1. I build the volume list and then expand the issue list for all the volumes. The time is PT0.297228S.
2. I build the volume list and then expand the issue list for only one volume. The time is PT0.010312S.

That is almost 29 times faster. Thanks for your suggestion.

The above tests were done without extra ordering. For the order by part, I have a question: I built the range index using collation http://marklogic.com/collation/en/MO, which is the same one I need for order by. When cts:element-values returns values, does it use the collation that I set on the range index? The list seems correct, I just want to double check.

Another question about cts:element-values: can I change the order of the returned values? Currently the list comes back in ascending order. If I want it in descending order, is there any way to do that, or do I have to use order by?

Thanks a lot,
Helen

>>> Geert Josten <[email protected]> 04/12/10 9:35 AM >>>
Hi Helen,

Element-value-query works better with a range index, but will revert to word indexes and filtering afterwards if there is no range index. That could explain part of the performance difference.

Secondly, ordering and looping through many values can slow things down. Do you need to show them all at once? Or could you apply pagination somehow? Don't forget that you will be doing many index lookups if your first element-values returns a lot of values.

Also, you are using order by twice; that might be redundant. Values returned by element-values are already ordered, though perhaps not with the collation you are using.

Kind regards,
Geert

drs. G.P.H. (Geert) Josten
Consultant
Daidalos BV
Hoekeindsehof 1-4
2665 JZ Bleiswijk
T +31 (0)10 850 1200
F +31 (0)10 850 1199
mailto:[email protected]
http://www.daidalos.nl/
KvK 27164984

Please consider the environment before printing this mail.

The information sent in or with this e-mail message originates from Daidalos BV and is intended solely for the addressee. If you have received this message unintentionally, we request that you delete it. No rights can be derived from this message.

> From: general-[email protected]
> [mailto:general-[email protected]] On Behalf Of Helen Chen
> Sent: Thursday 8 April 2010 22:58
> To: [email protected]
> Cc: Helen Chen
> Subject: [MarkLogic Dev General] RE: how to build the tree based on data
>
> Hi Kelly,
>
> I tried your solution and it seems much faster. The problem for me now
> is that our production machine does not have element range indexes for
> coden and volume. I set them up on the development machine and it is
> pretty fast: it took about 0.1 ~ 0.2 seconds to get the result, which
> makes me pretty excited. But the development machine does not have
> enough data to let me judge the speed on large data.
>
> I tried the following query on the development box (this uses
> cts:element-range-query):
>
> let $coden := "AAA"
> for $vol in
>   cts:element-values(fn:QName("http://www.aip.org/phoenixml","volume"),(),
>     ("collation=http://marklogic.com/collation/en/MO"),
>     cts:element-range-query(fn:QName("http://www.aip.org/phoenixml","coden"),
>       "=", $coden,
>       "collation=http://marklogic.com/collation/en/MO") )
> order by $vol collation "http://marklogic.com/collation/en/MO"
> return
>   <volume vol="{$vol}">
>   {
>     for $iss in
>       cts:element-values(fn:QName("http://www.aip.org/phoenixml","issno"),(),
>         ("collation=http://marklogic.com/collation/en/MO"),
>         cts:and-query((
>           cts:element-range-query(fn:QName("http://www.aip.org/phoenixml","coden"),
>             "=", $coden,
>             "collation=http://marklogic.com/collation/en/MO"),
>           cts:element-range-query(fn:QName("http://www.aip.org/phoenixml","volume"),
>             "=", $vol,
>             "collation=http://marklogic.com/collation/en/MO")
>         ))
>       )
>     order by $iss collation "http://marklogic.com/collation/en/MO"
>     return
>       <issue>{ $iss }</issue>
>   }
>   </volume>
>
> The element-range-query version took about 0.2 seconds.
>
> Then I tried the following query on production, using
> cts:element-value-query:
>
> let $coden := "AAA"
> for $vol in
>   cts:element-values(fn:QName("http://www.aip.org/phoenixml","volume"),(),
>     ("collation=http://marklogic.com/collation/en/MO"),
>     cts:element-value-query(fn:QName("http://www.aip.org/phoenixml","coden"),
>       $coden, "exact"))
> order by $vol collation "http://marklogic.com/collation/en/MO"
> return
>   <volume vol="{$vol}">
>   {
>     for $iss in
>       cts:element-values(fn:QName("http://www.aip.org/phoenixml","issno"),(),
>         ("collation=http://marklogic.com/collation/en/MO"),
>         cts:and-query((
>           cts:element-value-query(fn:QName("http://www.aip.org/phoenixml","coden"),
>             $coden, "exact"),
>           cts:element-value-query(fn:QName("http://www.aip.org/phoenixml","volume"),
>             $vol, "exact")
>         ))
>       )
>     order by $iss collation "http://marklogic.com/collation/en/MO"
>     return
>       <issue>{ $iss }</issue>
>   }
>   </volume>
>
> This is the production box. The very first time I ran it, it took 4.4
> seconds; after that, maybe because everything was already in memory, it
> took about 0.3 seconds to run.
>
> I also ran this query on the development box to see the difference
> between cts:element-range-query and cts:element-value-query. Maybe the
> data is not big enough, or maybe all the indexes are already in memory,
> but I could not see a big difference.
>
> So my questions here are:
>
> 1. In theory, what is the difference in performance between these two
> queries? Which one is supposed to be faster in my situation? I need to
> understand more before I add the index to the production box.
>
> 2. This query is going to be used as part of the home page for all the
> codens, so it is going to be called a lot, and I also have other data
> to display on the home page. This means this query should be as fast as
> possible. If the query takes 0.2 seconds on the development machine, it
> might take 0.5 seconds on production. (I have not tried yet, this is
> just a guess.)
> Is there anything I can do to optimize it? I'm trying to see whether it
> can work as a real-time call or whether I have to do the calculation up
> front.
>
> Thanks a lot,
>
> Helen
>
> >>> Kelly Stirman <[email protected]> 04/07/10 10:05 AM >>>
> Hi Helen,
>
> You can take Danny's approach a few steps further to use range indexes
> for everything.
>
> Something like:
>
> let $codens := cts:element-values(xs:QName("coden"),(),(),cts:and-query(()))
>
> for $c in $codens
> return
>   <coden name="{$c}">
>   {
>     (: the query is the fourth argument; the third is the options list :)
>     let $volumes := cts:element-values(xs:QName("volume"),(),(),
>       cts:element-range-query(xs:QName("coden"),"=",$c))
>     for $v in $volumes
>     return
>       <volume name="{$v}">
>       {
>         let $issues := cts:element-values(xs:QName("article"),(),(),
>           cts:and-query((
>             cts:element-range-query(xs:QName("volume"),"=",$v),
>             cts:element-range-query(xs:QName("coden"),"=",$c))))
>         for $i in $issues
>         return <issue name="{$i}"/>
>       }
>       </volume>
>   }
>   </coden>
>
> Let us know if this helps and we may be able to do some further
> optimization.
>
> Kelly
>
> PS - you could probably add cts:frequency() to each level in the tree
> to get counts if you'd like.
>
> Message: 1
> Date: Tue, 6 Apr 2010 14:48:31 -0700
> From: Danny Sokolsky <[email protected]>
> Subject: RE: [MarkLogic Dev General] how to build the tree based on data
> To: General Mark Logic Developer Discussion <[email protected]>
> Message-ID: <c9924d15b04672479b089f7d55ffc13254498...@exchg-BE.marklogic.com>
> Content-Type: text/plain; charset="us-ascii"
>
> If there is a range index on coden, then you can substitute the:
>
> fn:distinct-values(//coden)
>
> with
>
> cts:element-values(xs:QName("coden"))
>
> That should speed things up a bit....
>
> - Danny
>
> _______________________________________________
> General mailing list
> [email protected]
> http://xqzone.com/mailman/listinfo/general
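
On the two questions at the top of the thread (which collation cts:element-values uses, and how to get the values in descending order), here is a minimal sketch, not a confirmed answer from the list. It assumes string range indexes on coden and volume in the http://www.aip.org/phoenixml namespace with the en/MO collation, as described above; the ph prefix and the variable names are made up for illustration. As far as the documented options go, "collation=URI" selects the range index with that collation and the values come back in that index's order, while "descending" reverses the order without a separate order by.

```xquery
xquery version "1.0-ml";

(: Assumption: range indexes on ph:coden and ph:volume exist with this
   collation. The prefix "ph" and both variables are illustrative only. :)
declare namespace ph = "http://www.aip.org/phoenixml";

declare variable $collation := "collation=http://marklogic.com/collation/en/MO";
declare variable $coden := "AAA";  (: sample value used in the thread :)

(: "descending" asks the range index for its values in reverse collation
   order, so no "order by" clause is needed on top of it. :)
for $vol in
  cts:element-values(
    xs:QName("ph:volume"), (),
    ("descending", $collation),
    cts:element-range-query(xs:QName("ph:coden"), "=", $coden, $collation))
return
  <volume vol="{$vol}"/>
```

If only the first few values are needed, the documented "limit=N" option can be added to the same options sequence, which is one way to approach the pagination Geert mentions.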
