Hi Ron,

That's a great write-up.

A real-world example that comes to mind is using cts:element-values with
the 'concurrent' option in a FLWOR expression that has an order-by clause.
Via the concurrent option we are essentially telling the server to go off
and do the work in another thread, asking only that the final results are
in the requested order by the time they get to us.  Combined with lazy
evaluation, this means that evaluation does not necessarily happen in any
particular order (we even told the server, via the 'concurrent' option, to
do this in whatever way it sees fit).  It gets even more fun if we use
cts:frequency within this FLWOR expression: the server is again evaluating
and returning values in an unpredictable order.  But of course, by the time
we drop out of the FLWOR expression, the final order-by has been applied.
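
As a rough sketch of the kind of query I mean (the element name "category"
is made up for illustration, and this assumes an element range index exists
on it):

```xquery
xquery version "1.0-ml";

(: "category" is a hypothetical element name; cts:element-values
   needs an element range index on whatever element is used. :)
for $v in cts:element-values(xs:QName("category"), (), ("concurrent"))
(: cts:frequency may be evaluated for the values in any order :)
let $freq := cts:frequency($v)
(: only the order of the final result sequence is fixed here :)
order by $freq descending
return text { $v, $freq }
```

The order in which the cts:frequency calls actually run is up to the
server; the order by only pins down the order of the sequence we get back.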

Regards,
David



David Ennis
*Content Engineer*


On 5 October 2014 15:23, Ron Hitchens <r...@ronsoft.com> wrote:

>
>    It's important to remember that a "for" expression in XQuery, and
> similar expressions such as "foreach" in other functional languages, do not
> represent an imperative order of evaluation as in procedural languages.  An
> XQuery "for" is not a loop, it's a selector that applies an expression to
> each of a set of selected values, producing a sequence of result values.
>
>    In a functional language like XQuery, side effects are not allowed
> (ignoring MarkLogic extensions like map:map and xdmp:set for the moment).
> If there are no side effects that must be accounted for between iterations
> of the "loop", then there is no particular reason to evaluate each
> iteration of the loop in sequence.  In fact, they can run in any order or
> in parallel (hint: multi-core optimization).
>
>    When an "order by" clause is used, that controls the order of the items
> in the final result sequence.  But it does not mandate the evaluation order
> of the "loop" iterations.  Once all the values from all iterations have
> been computed, they can then be sorted into the requested order.  Because
> there are no side effects allowed, it doesn't matter which order they were
> actually evaluated in, or if they were done concurrently.
>
>    It is entirely reasonable and correct for log messages or wall-clock
> timestamps to be "out of order" from the order you intuitively expect.  As
> long as the resulting sequence is in the order your code specified, then
> it's correct.  Writing a log message is a sort of "soft side effect".  It
> doesn't modify the state of the running program, but it causes an action
> that's independent of the evaluation of the functional result.
>
>    You should always think of a "for" as a single expression that produces
> a sequence of values.  Don't use it as a flow control mechanism.  Work with
> the ordered result sequence you get from a FLWOR, don't use it to control
> code execution order.
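>
>    As a minimal sketch of that idea applied to the earliest/latest date
> problem further down the thread (assuming the publication structure from
> the original message; the <dates> result element is purely illustrative):
>
> ```xquery
> xquery version "1.0-ml";
>
> (: Compute earliest/latest per document id as plain values,
>    with no maps and no reliance on evaluation order. :)
> let $pubs := doc()/publication
> for $id in fn:distinct-values($pubs/documents/document/id)
> let $dates :=
>   for $d in $pubs[documents/document/id = $id]/publicationDateTime
>   return xs:dateTime($d)
> return
>   <dates id="{$id}" earliest="{fn:min($dates)}" latest="{fn:max($dates)}"/>
> ```
>
>    Nothing here depends on which publication is visited first, so it
> should give the same answer however the server schedules the work.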
>
>    Hope that helps.
>
> ---
> Ron Hitchens {r...@overstory.co.uk}  +44 7879 358212
>
> On Oct 2, 2014, at 8:41 PM, Michael Blakeley <m...@blakeley.com> wrote:
>
> > I try not to rely on evaluation order anywhere, if I can help it. But
> > I'm not sure whether MarkLogic's behavior matches the spec or not.
> >
> > According to http://www.w3.org/TR/xquery/#id-orderby-return "the
> > resulting order determines the order in which the return clause is
> > evaluated".
> >
> > I like tests, so here's one:
> >
> > for $i at $x in 1 to 5
> > let $_ := xdmp:sleep(1)
> > let $start := xdmp:elapsed-time()
> > order by $x descending
> > return text { $x, $start, xdmp:elapsed-time() }
> > =>
> > 5 PT0.006194S PT0.006221S
> > 4 PT0.004826S PT0.004854S
> > 3 PT0.003606S PT0.00364S
> > 2 PT0.002314S PT0.002325S
> > 1 PT0.001134S PT0.001211S
> >
> > With 7.0-4, the smallest elapsed-time values are at the end and both
> > elapsed-time values match closely. To me this suggests that the return
> > clause was evaluated before the results were sorted, not after.
> >
> > -- Mike
> >
> > On 2 Oct 2014, at 11:19 , Rachel Wilson <rachel.wil...@bbc.co.uk> wrote:
> >
> >> Thanks Mike,
> >>
> >> Are you saying never to do anything in the return statement that depends
> >> on the order - that the order by clause only orders the final results?
> >> It's so surprising because it does *mostly* work; until I added the order
> >> by clause it was *completely* wrong.  And it's worked fine for other
> >> simple data sets - just not against production data sadly :-)
> >>
> >> I think I've been led astray in my expectations by the order by examples
> >> for the return statement section in this article
> >> http://www.stylusstudio.com/xquery_flwor.html
> >>
> >> Oh and a quick check - I'm assuming this is the same for any xquery
> >> provider, it's not just a quirk of marklogic optimisation.
> >>
> >> Thanks for the lessons!
> >>
> >> Rachel
> >>
> >> -----Original Message-----
> >> From: Michael Blakeley <m...@blakeley.com>
> >> Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> >> Date: Thursday, 2 October 2014 17:42
> >> To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> >> Subject: Re: [MarkLogic Dev General] order by clause ignored?
> >>
> >> The order of operations in a FLWOR isn't guaranteed. The return expression
> >> could be evaluated before the sorting is done.
> >>
> >> You might also be interested to know that maps are unordered. So you can
> >> put those items into the map in any order you like, but the keys will end
> >> up in whatever internal order the map uses. That should be consistent, but
> >> won't be under your control.
> >>
> >> As an alternative you can order the keys when processing the map. Or
> >> consider using json:object instead. That's like a map but has an ordered
> >> sequence of keys.
> >>
> >> -- Mike
> >>
> >> On 2 Oct 2014, at 09:17 , Rachel Wilson <rachel.wil...@bbc.co.uk> wrote:
> >>
> >>> I realise this is more of an xquery question but everyone here is
> >>> stumped :-s
> >>>
> >>> I have a query that is trying to calculate some earliest and latest
> >>> dates that an event occurred.  The events are represented in xml like
> >>> this:
> >>>
> >>> <publication>
> >>> <publicationDateTime>2013-08-14T15:06:51.921302Z</publicationDateTime>
> >>> <documents>
> >>>   <document>
> >>>       <id>b119c2a3-5436-59d2-9771-a67f8e2b4172</id>
> >>>       <version>1</version>
> >>>   </document>
> >>> </documents>
> >>> </publication>
> >>>
> >>> There are two events for this particular document (i.e. with the
> >>> same document id), and therefore two publication documents on different
> >>> days.  In order to work out the first and last dates I'm sorting all my
> >>> publication docs by publication date, then, taking the sorted publication
> >>> documents in order, I'm iterating through the publication document ids in
> >>> order and remembering them in the correct maps for earliest seen and
> >>> latest seen dates.  However the results in the map are as if the
> >>> publication documents are coming out in document order.
> >>>
> >>> If I run this query the document dates are coming out in the right order
> >>> as I'd expect:
> >>> ------------------------------------------------------------------
> >>> let $earliestPubDates := map:map()
> >>> let $latestPubDates := map:map()
> >>> let $buildMaps :=
> >>>  for $x in doc()/publication
> >>>  let $publishedDate := $x/publicationDateTime/string()
> >>>  let $documents := $x/documents/document
> >>>  order by $x/publicationDateTime
> >>>  return
> >>>    (: map building stuff commented out in order to see how the dates
> >>> are being ordered :)
> >>>    $publishedDate
> >>>
> >>> return $buildMaps
> >>> ---------------------------------------------------------------------
> >>> returns:
> >>> 2013-08-14T15:06:51.921302Z
> >>> 2014-09-03T14:46:07.757612Z
> >>>
> >>>
> >>>
> >>> However, after someone showed me how to debug the loop a little with an
> >>> error statement, I'm finding that the first publication processed is the
> >>> later one, not the earlier one.  It looks like the order by in this case
> >>> is being ignored.  Could someone tell me how and why?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> ------------------------------------------------------------------
> >>> let $earliestPubDates := map:map()
> >>> let $latestPubDates := map:map()
> >>>
> >>> let $buildMaps :=
> >>>  for $x in doc()/publication
> >>>  let $documents := $x/documents/document
> >>>  order by $x/publicationDateTime
> >>>  return
> >>>     for $contentId in $documents/id/string()
> >>>     let $publishedDate := $x/publicationDateTime/string()
> >>>     let $putEarliest :=
> >>>       if (not(map:contains($earliestPubDates, $contentId))) then (
> >>>         let $dummy := map:put($earliestPubDates, $contentId, $x/publicationDateTime)
> >>>         let $a := error("Adding earliest published date " || $x/publicationDateTime)
> >>>         return $dummy
> >>>       ) else ()
> >>>     let $putLatest := map:put($latestPubDates, $contentId, $publishedDate)
> >>>     return ()
> >>>
> >>> return $buildMaps
> >>> ----------------------------------
> >>> Returns (in an error stack trace): "Adding earliest published date
> >>> 2014-09-03T14:46:07.757612Z"
> >>>
> >>>
> >>> Is it something to do with when the order by is applied to the statement?
> >>>
> >>> Any help gratefully received
> >>> Rachel
> >>> _______________________________________________
> >>> General mailing list
> >>> General@developer.marklogic.com
> >>> http://developer.marklogic.com/mailman/listinfo/general