Right, predicates only look like array indexes. The predicate expression will be evaluated for every item in the input sequence.
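A minimal sketch of that point in plain XQuery, with no MarkLogic functions involved. The sequence values and variable names are made up for illustration. A numeric predicate is shorthand for a comparison against position(), and that comparison is made once per item, so a predicate whose value differs from item to item can select zero, one, or several items.

```xquery
xquery version "1.0";

(: Illustrative values only; not taken from the thread. :)
let $list := (10, 2, 30, 4)
let $idx := 2
return (
  (: Value bound once: selects exactly the item at position 2. :)
  $list[$idx],
  (: The long form of the same predicate. :)
  $list[position() = $idx],
  (: "." is evaluated again for each item and compared with that item's
     position, so every item equal to its own position is kept. :)
  $list[.]
)
```

Evaluated as a whole, this should yield 2, 2, 2, 4. The last predicate keeps two items because its value changes per item, which is the same effect a call such as xdmp:random() has inside a predicate.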
I touched on this at http://blakeley.com/blogofile/archives/518/

> Function calls inside an XPath predicate can be horrible for performance,
> since the function must be called for every item in the predicate's input
> sequence. If the result of the function call is static, simply bind the
> result to a variable. This is also true of operators: even simple math
> operations like:
>     $list[$start to $start + $size]
> should be rewritten as
>     $list[$start to $stop]
> If you have trouble seeing why this might be a problem, consider a list with
> 100 items. Now consider this expression:
>     $list[ xdmp:sleep(100) ]
> Evaluation will cost 100 ms per item, or 10 seconds total. Every expression
> takes a finite amount of time to evaluate, and performance optimization is
> sometimes a matter of reducing the expression count.

-- Mike

On 3 Oct 2012, at 13:06, Will Thompson wrote:

> Thanks for the explanation David. I thought it seemed odd, but this makes
> sense.
>
> From: [email protected]
> [mailto:[email protected]] On Behalf Of David Lee
> Sent: Wednesday, October 03, 2012 12:54 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Unexpected behavior for xdmp:random() in
> predicate
>
> At first this seemed like a bug but profiling it makes more sense ...
> xdmp:random() is not a typical xquery function as it produces a different
> value every call.
> ( unlike most xquery functions ... looking for the word "imdopedant" but can't
> find it ).
>
> Profiling these 2 expressions on my own DB which has /*:twitter elements
>
> for $i in (1 to 10)
> let $idx := xdmp:random(1000)+1
> return count((//*:twitter)[$idx])
>
> vs
>
> for $i in (1 to 10)
> return count((//*:twitter)[xdmp:random(1000)+1])
>
> I get similar results as you.
>
> But the profile shows all
>
> #1
>
> Profile: 52 Expressions, PT6.13123S
>
> Module:Line No.:Col No.  Count    Shallow %  Shallow µs  Deep %    Deep µs  Expression
> .main:3:16               10       91         5580242     91        5580242  fn:collection()/descendant::*:twitter
> .main:3:16               10       9          549566      100       6129808  (fn:collection()/descendant::*:twitter)[$idx]
> .main:3:7                10       0.013      795         100       6130603  fn:count((fn:collection()/descendant::*:twitter)[$idx])
> .main:1:0                1        0.0022     134         100       6130892  for $i in 1 to 10 let $idx := xdmp:random(1000) + 1 return
>                                                                             fn:count((fn:collection()/descendant::*:twitter)[$idx])
> .main:2:12               10       0.0015     89          0.0015    89       xdmp:random(1000)
> .main:2:29               10       0.001      63          0.0025    152      xdmp:random(1000) + 1
> .main:1:13               1        0.000049   3           0.000049  3        1 to 10
>
> #2
>
> Profile: 2134232 Expressions, PT8.293776S
>
> Module:Line No.:Col No.  Count    Shallow %  Shallow µs  Deep %    Deep µs  Expression
> .main:3:16               10       68         5680234     68        5680234  fn:collection()/descendant::*:twitter
> .main:3:7                10       13         1109878     100       8290044  fn:count((fn:collection()/descendant::*:twitter)[xdmp:random(1000) + 1])
> .main:3:27               1067100  8.1        668253      8.1       668253   xdmp:random(1000)
> .main:3:44               1067100  5.3        438670      13        1106923  xdmp:random(1000) + 1
> .main:3:16               10       4.7        393009      79        6524821  (fn:collection()/descendant::*:twitter)[xdmp:random(1000) + 1]
> .main:1:0                1        0.0012     101         100       8290147  for $i in 1 to 10 return
>                                                                             fn:count((fn:collection()/descendant::*:twitter)[xdmp:random(1000) + 1])
> .main:1:13               1        0.000024   2           0.000024  2        1 to 10
>
> ----------------
>
> So in the first case, xdmp:random gets called 10 times, and its value is
> used to select a single element at that position.
>
> In the second case xdmp:random gets called 1067100 times (10 x the number of
> docs) and the count is the number of times xdmp:random returned *the same
> value* % 1000 as the position of the document.
>
> The above example is equivalent to:
>
> for $i in (1 to 10)
> return count((//*:twitter)[position() = xdmp:random(1000)+1])
>
> which amounts to this really
>
> for $i in ( 1 to 10 )
> return count(
>   for $t at $pos in //*:twitter
>   where $pos eq xdmp:random(1000) + 1
>   return $t
> )
>
> So this count is returning the number of times xdmp:random(1000) + 1 returns
> the same value as the position
>
> -----------------------------------------------------------------------------
> David Lee
> Lead Engineer
> MarkLogic Corporation
> [email protected]
> Phone: +1 812-482-5224
> Cell: +1 812-630-7622
> www.marklogic.com
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Will Thompson
> Sent: Wednesday, October 03, 2012 3:19 PM
> To: MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] Unexpected behavior for xdmp:random() in
> predicate
>
> When I use xdmp:random() like this, I expect this to always return 1, but
> that's not the result.
>
> for $i in (1 to 10)
> return count((//element)[xdmp:random(1000)+1])
> => 0 1 0 0 1 2 0 2 0 1
>
> But when I assign the random value to a variable first, the output is as
> expected and execution time is much faster.
>
> for $i in (1 to 10)
> let $idx := xdmp:random(1000)+1
> return count((//element)[$idx])
> => 1 1 1 1 1 1 1 1 1 1
>
> Is there an explanation for this?
>
> Thanks,
>
> Will
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
