OK. I guess I'll wade into this discussion since I wrote the original code
that Adam is asking about.
Something must have changed with that code if it's failing now, because it
was debugged and tested long before I left the project. There are, or at least
were, automated API tests checking that code path. Maybe some source control
archeology is in order?
Don't disable MarkLogic's function mapping arbitrarily in that codebase,
because there is quite a lot of code that depends on it. I usually leave a
comment where it's used to call attention to it. Disabling function mapping
will cause a lot of things to break, which would lead you down a rat hole of
unnecessarily rewriting a lot of working code. Migrating to the XQuery 3.0 map
operator would be an option (which wasn't available when we started writing the
code on ML 5), but proceed with caution.
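
For anyone unfamiliar with function mapping, here is a minimal sketch of the behaviour, not code from the project. `local:shout` is a hypothetical function invented for illustration, and the last expression assumes a MarkLogic release that supports the XQuery 3.0 simple map operator (`!`) in 1.0-ml.

```xquery
xquery version "1.0-ml";

(: Hypothetical function; the parameter is declared as exactly one string. :)
declare function local:shout($s as xs:string) as xs:string
{
  fn:upper-case($s)
};

(
  (: Function mapping (the 1.0-ml default): the function is called once
     per item of the sequence passed for the singleton parameter. :)
  local:shout(("a", "b")),

  (: With an empty sequence there are no items to map over, so the function
     body is never entered and the call yields the empty sequence. :)
  fn:count(local:shout(())),

  (: The same iteration written explicitly with the simple map operator. :)
  ("a", "b") ! local:shout(.)
)
```

With `declare option xdmp:mapping "false";` in the prolog, the first two calls would be expected to raise a type error instead, which is why switching mapping off in a codebase that relies on it breaks things.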
Let vs declare: I do often eschew FLWORs for main module bodies in favor of
declared variables. To some extent this is personal taste but there is also
some method to the madness. Declared variables belong to the module prolog, so
they are in scope before the module body begins evaluation, and MarkLogic
evaluates their initializers lazily, on first reference. I often have a
series of variable declarations like the one Adam describes:
    declare variable $content-type as xs:string :=
        slib:validated-bulk-load-content-type();
This, when eventually evaluated, sets $content-type to a value returned by a
function that checks the validity of the Content-Type header. The variable
also has an explicit type and cardinality of one and only one. This means
that, when used, $content-type will either cause an exception, thrown from the
validation function or from a type mismatch on the assignment, or have a valid
value for its type. There is an error handler configured on that appserver
which will trap any such exceptions and return a structured error message.
I use this pattern to assign all the values that need validation to declared
variables, then make use of the variables as appropriate. Any referenced
variables are then known to be good if no exceptions are thrown. It also lets
me leverage the lazy evaluation characteristic of declared variables because
only the ones I actually use are assigned. That's a feature. Basically the
validation rules are written into the variable assignments and only the
variables referenced in the run-time execution path of the module body need to
be reified.
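
A minimal sketch of that pattern follows. It is an illustration, not the project's code: `local:validated-content-type` is a hypothetical stand-in for `slib:validated-bulk-load-content-type()`, the accepted content types and error names are invented, and the claim that `$unused` never fires rests on the lazy evaluation of declared variables described above.

```xquery
xquery version "1.0-ml";

(: Hypothetical validator: returns the Content-Type header if it is
   acceptable, otherwise throws. The accepted values are assumptions. :)
declare function local:validated-content-type() as xs:string
{
  let $ct := xdmp:get-request-header("Content-Type")
  return
    if ($ct = ("application/xml", "application/zip"))
    then $ct
    else fn:error(xs:QName("BAD-CONTENT-TYPE"), "Unsupported Content-Type")
};

(: The validation rule lives in the initializer; the declared type enforces
   exactly one xs:string. :)
declare variable $content-type as xs:string := local:validated-content-type();

(: Never referenced below, so under lazy evaluation its initializer
   (and therefore this error) is not expected to be evaluated. :)
declare variable $unused as xs:string :=
  fn:error(xs:QName("NEVER-REFERENCED"), "this initializer should not run");

(: Referencing $content-type is what triggers the validation. :)
xdmp:set-response-content-type($content-type)
```

The flip side of this design is that a variable which is never referenced on the executed path is never validated, so a missing error can simply mean the reference was never reached.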
Now, apart from style preference, there should be no effective difference
between using "let" bindings and declaring variables for a given operation set.
This:
    declare variable $foo as xs:string := lib:functionA();
    declare variable $bar as xs:string := lib:functionB ($foo);
    lib:functionC ($foo, $bar)
should always produce the same result as this:
    let $foo as xs:string := lib:functionA()
    let $bar as xs:string := lib:functionB ($foo)
    return lib:functionC ($foo, $bar)
There are differences in when and how the values are evaluated, but the end
result must be the same. If it's not, either there is a bug in the XQuery
implementation or the two are not actually equivalent.
I would contend that if you're changing from declared variables to let
bindings and getting different results then you've made a translation mistake
or something non-obvious is happening in the original code (which led to a
non-obvious translation mistake).
Do let us know when you get to the bottom of it.
---
Ron Hitchens {[email protected]} +44 7879 358212
On Jun 19, 2014, at 8:39 PM, David Lee <[email protected]> wrote:
> I don't believe you can count on the depth of precision of your statement.
> We start delving into exactly what "evaluate" means ... and then we have a
> really long interesting discussion
> which has very few concrete answers in the general case (or often in the
> specific case).
>
> For example ... are you *positive* that these two let statements are
> "evaluated sequentially"
> (whatever that means) in MarkLogic ?
>
> let $a := cts:search( /foo , "word1" ),
> $b := cts:search( /bar , "word2" )
> return xdmp:estimate( ($a, $b) )
>
> I wouldn't count on it ... (for any definition of "evaluated sequentially")
> And what does "evaluate" and "evaluate sequentially" actually mean ? Does
> it mean that the final results of $a are available in memory and ready to be
> served up before cts:search( /bar , "word2" ) is started ?
> If that's what you think it means, then you'll be in for a surprise.
> XQuery Specs certainly have nothing to say about it ... specifically and
> intentionally nothing.
>
> I'll leave it at that... In general it would be wise to make fewer rather
> than more assumptions about
> what is done when ... and focus more on the results. Anything with side
> effects is eventually going to surprise you.
> If you absolutely need to be in control over ordering and timing then there
> are transactional statements,
> and "hints" like xdmp:eager and xdmp:lazy ... but the less you depend on
> ordering the better.
> In general extensions which knowingly produce side effects (like maps,
> arrays) try to do so in an 'obvious' order,
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Keith L.
> Breinholt
> Sent: Thursday, June 19, 2014 3:14 PM
> To: MarkLogic Developer Discussion
> Cc: Sahay, Saurabh (RBI-UK)
> Subject: Re: [MarkLogic Dev General] Evaluation of `declare variable` vs `let`
>
> Thanks for the clarification David.
>
> However, I believe the observation stands. Variables created using 'declare'
> statements will never get evaluated in parallel since they are only evaluated
> on demand which is inherently sequential.
>
> However, if you have a 'declared' variable in a module it will get evaluated
> exactly once no matter how many times it is used. Whereas a 'let' variable
> will get evaluated every time, it may get pulled from a cache but the let
> variable will be evaluated each time the FLWOR it resides in is evaluated.
> Is this observation correct David?
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of David Lee
> Sent: Thursday, June 19, 2014 11:09 AM
> To: MarkLogic Developer Discussion
> Cc: Sahay, Saurabh (RBI-UK)
> Subject: Re: [MarkLogic Dev General] Evaluation of `declare variable` vs `let`
>
> I would be cautious with the phrase ' let variables are executed in parallel'
>
> They are not necessarily evaluated in "parallel" , (prefer 'evaluated' to
> 'executed'), that would imply all let clauses are run in separate threads,
> cores, processes, systems without any synchronization.
> They are not (necessarily).
> Rather they are evaluated in an implementation dependent fashion that
> produces the required results (as defined by the XQuery language).
> For the most part any 'parallelism' of XQuery evaluation is not detectable by
> the program itself ( requires using a side-effect producing statement) and is
> not guaranteed even on the same code run twice.
> ( the second run may be able to use cached results from the first run and not
> bother doing anything 'in parallel').
>
> Also 'parallelism' is fundamentally different from lazy evaluation. Lazy
> evaluation can occur in a completely single threaded non parallel fashion -
> it's the order (or presence) of evaluation which is not defined.
> 'Parallel' implies actual concurrency.
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Keith L.
> Breinholt
> Sent: Thursday, June 19, 2014 12:57 PM
> To: MarkLogic Developer Discussion
> Cc: Sahay, Saurabh (RBI-UK)
> Subject: Re: [MarkLogic Dev General] Evaluation of `declare variable` vs `let`
>
> If your other variable $etag-map is the empty sequence you may have run into
> function mapping behavior.
>
> Look for a good explanation of function mapping here:
> http://nelsonwells.net/2013/05/two-reasons-your-marklogic-code-is-failing-silently-part-2/
>
> And here:
> http://docs.marklogic.com/guide/xquery/enhanced#id_55459
>
> An easy way to know if function mapping is causing this behavior is to turn
> it off at the top of your module with this statement.
>
> declare option xdmp:mapping "false";
>
> Unless you understand all the intricacies of function mapping I highly
> suggest you turn it off.
>
> As to the usage of declare vs. let I agree with Danny that it is mostly a
> matter of preference.
>
> The only distinguishing ... declare variables are lazily evaluated and let
> variables are executed in parallel unless a let variable's definition
> expression uses another let variable which will force sequential evaluation
> of any parent variables in the definition expression.
>
> -Keith
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Retter, Adam
> (RBI-UK)
> Sent: Thursday, June 19, 2014 9:29 AM
> To: MarkLogic Developer Discussion
> Cc: Sahay, Saurabh (RBI-UK)
> Subject: [MarkLogic Dev General] Evaluation of `declare variable` vs `let`
>
> Given a main module with no user defined functions, is it considered better
> practice to use `declare variable` as opposed to `let`, the reason I ask is
> twofold:
>
> 1) I have started on a large code-base that seems to eschew `let` in favour
> of using `declare variable`. This seems somewhat strange to me.
>
> 2) We have seen different evaluation strategies, whereby a function which
> returns an empty-sequence which is bound in a declare variable clause, is
> never executed. Yet when we re-write that as a let binding, the code is
> executed. I think perhaps the query optimiser in ML is being too aggressive
> here?
>
>
> For example -
>
> declare variable $content-type as xs:string :=
> slib:validated-bulk-load-content-type();
>
> and then a few lines later we have -
>
> slib:normal-bulk-load-response ($etag-map, fn:count
> ($etag-map/store:etag-entry), $content-type)
>
> Subsequently the `slib:normal-bulk-load-response` calls
> xdmp:set-response-content-type($content-type). However the
> `slib:validated-bulk-content-type()` function is never evaluated, we are
> certain of this because it eventually calls `fn:error`, yet the error never
> occurs!
>
> If we switch the `declare variable $content-type` for a `let $content-type`
> then we do see the error occurring!
>
> Cheers Adam.
>
> DISCLAIMER
> This message is intended only for the use of the person(s) ("Intended
> Recipient") to whom it is addressed. It may contain information, which is
> privileged and confidential. Accordingly any dissemination, distribution,
> copying or other use of this message or any of its content by any person
> other than the Intended Recipient may constitute a breach of civil or
> criminal law and is strictly prohibited. If you are not the Intended
> Recipient, please contact the sender as soon as possible.
> Reed Business Information Limited. Registered Office: Quadrant House, The
> Quadrant, Sutton, Surrey, SM2 5AS, UK.
> Registered in England under Company No. 151537
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
>
>
> NOTICE: This email message is for the sole use of the intended recipient(s)
> and may contain confidential and privileged information. Any unauthorized
> review, use, disclosure or distribution is prohibited. If you are not the
> intended recipient, please contact the sender by reply email and destroy all
> copies of the original message.
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general