As I said, it's personal taste to some extent. I generally prefer module
variables for declaring the required start state of a module. When a module
becomes more complicated, those values, which are preconditions, are globally
available to all functions and can be relied upon to be correct. Using a FLWOR
as the module main body means you'd need to pass them around to function calls.
The same pattern works for simple modules like this as well as more
complicated ones.
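
A minimal sketch of that difference. The names local:validated-limit, local:page and $limit are hypothetical illustrations, not code from the module under discussion:

```xquery
xquery version "1.0";

(: Hypothetical validator: stands in for any function that checks a
   precondition and either returns a good value or raises an error. :)
declare function local:validated-limit() as xs:integer
{
  10
};

(: Module variable: declared once, visible to every function below. :)
declare variable $limit as xs:integer := local:validated-limit();

(: No extra parameter is needed to reach the validated value. :)
declare function local:page($items as item()*) as item()*
{
  fn:subsequence($items, 1, $limit)
};

(: With a FLWOR body instead, $limit would be a let binding that is
   not visible inside local:page, so it would have to be passed in
   as an additional argument. :)
local:page(1 to 100)
```
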
However, there is still the mystery of why changing to a let binding appears
to exhibit different behavior. The two options should be semantically
equivalent. I still think something else is going on. If the declared
variable is this:

declare variable $content-type as xs:string :=
  slib:validated-bulk-load-content-type();

It has a type and cardinality of "xs:string". An exception will be thrown
if it's not assigned a single string. You later reference $content-type like
this:

slib:normal-bulk-load-response ($etag-map, fn:count
  ($etag-map/store:etag-entry), $content-type)

That function call references $content-type, which must ultimately be
evaluated to create the parameter value. At that point, if it isn't a single,
non-empty string value, then there should be a type-clash exception thrown by
the runtime.
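
A small sketch of that expectation, assuming a hypothetical local function in place of slib:validated-bulk-load-content-type. Here the stand-in returns the empty sequence, so referencing the variable should raise a type error rather than produce a string:

```xquery
xquery version "1.0";

(: Hypothetical stand-in for the real validation function. It returns
   the empty sequence to simulate a value that fails the declared type. :)
declare function local:validated-content-type() as xs:string?
{
  ()
};

(: The declared type demands exactly one xs:string. :)
declare variable $content-type as xs:string :=
  local:validated-content-type();

(: Referencing $content-type forces the binding to be checked. An
   empty sequence does not match xs:string, so this is expected to
   fail with a type error instead of returning a value. :)
fn:concat("Content-Type: ", $content-type)
```
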
So what I don't understand is how slib:normal-bulk-load-response could
ever run if slib:validated-bulk-load-content-type didn't.
If this is really what's happening, there is a serious XQuery bug here. But
I'd still bet that something else is actually happening.
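
One candidate for that "something else" was raised earlier in the thread: function mapping. The sketch below is only an illustration with hypothetical names (local:checked-type, local:respond); it assumes the 1.0-ml dialect with function mapping left at its default (on). If the first argument is the empty sequence and the parameter is declared as a single item, the call is mapped over zero items, so the function body is never entered and a lazily evaluated $content-type might never be touched:

```xquery
xquery version "1.0-ml";

(: Hypothetical validator that always fails, so that any evaluation
   of $content-type would be visible as an error. :)
declare function local:checked-type() as xs:string
{
  fn:error(xs:QName("local:BAD-TYPE"), "invalid Content-Type")
};

declare variable $content-type as xs:string := local:checked-type();

(: $map is declared as exactly one element. :)
declare function local:respond(
  $map as element(),
  $type as xs:string
) as xs:string
{
  fn:concat("type=", $type)
};

(: Assumption: function mapping is on. The empty sequence is mapped
   over the singleton parameter, the function is invoked zero times,
   and the call yields the empty sequence. Adding
     declare option xdmp:mapping "false";
   to the prolog should turn this call into a type error instead. :)
local:respond((), $content-type)
```

If $etag-map were empty in the real module, the same thing could apply to the slib:normal-bulk-load-response call, but that is a guess that would need checking against the actual function signature.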
---
Ron Hitchens {[email protected]} +44 7879 358212
On Jun 20, 2014, at 10:44 AM, "Retter, Adam (RBI-UK)" <[email protected]>
wrote:
> I am not particularly interested in who did what to the code, rather I just
> want to fix the issue and move on.
>
> I noticed the issue, actually by studying the JUnit tests. There are several
> tests which were passing, which were badly written and should never pass,
> i.e. they were each sending an invalid Content-Type header (these are recent
> tests written by others). I spent some time working out why the tests were
> passing when they should obviously be failing and that led to this email.
> i.e. some XQuery validation code which is called via a `declare variable`,
> yet is never executed even though it does appear to be referenced elsewhere.
>
> By literally changing only the responsible `declare variable` to a `let`,
> this validation code was then executed and the JUnit tests failed (as they
> should have). I then went back and fixed the bad JUnit tests so that they
> always send the correct required Content-Type header.
>
> So why changing from a `declare variable` to a `let` fixed the issue is still
> a mystery to me, as that variable binding is referenced elsewhere in the code
> and so from what everyone has told me so far, should have been lazily
> evaluated. However, it was not!
>
> Where we are updating code, we are turning off MarkLogic's function mapping
> because it is simply too easy to shoot yourself in the foot. In several
> places now where we have switched it off, we have found bugs where function
> mapping was being used implicitly and the developer did not realise and so
> thought they had the correct behaviour but ultimately did not. Actually in
> some instances, simply turning off function mapping enabled us to fix a whole
> swathe of bugs which we did not yet realise existed.
>
> I am not sure if I understand your argument for using `declare variable`
> instead of let in Main Modules, certainly the modules I have seen so far do
> not have any code branching (e.g. if, typeswitch, for, simple map operator
> expressions), so all variables will be eventually evaluated anyway (unless
> you have some dead code that needs to be removed). In that instance I cannot
> imagine any difference between using `declare variable` or `let`, as in
> either instance all code has to be evaluated eventually anyway. There may be
> further examples, in the code-base which benefit from this, but I am only
> just getting started...
>
> " I would contend that if you're changing from declared variables to let
> bindings and getting different results then you've made a translation mistake
> or something non-obvious is happening in the original code (which led to a
> non-obvious translation mistake)." <-- Pretty certain I have not, I changed
> one line from a `declare variable` to a `let` and I have had several other
> developers here also check it to make sure I was not doing something stupid.
>
> Not sure we will get to the bottom of this one, but well, I have a workaround
> (i.e. use `let` for this one), so I am ok :-)
>
> Cheers Adam.
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Ron Hitchens
> Sent: 20 June 2014 01:16
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Evaluation of `declare variable` vs `let`
>
>
> OK. I guess I'll wade into this discussion since I wrote the original code
> that Adam is asking about.
>
> Something must have changed with that code if it's failing now, because it
> was debugged and tested long before I left the project. There are, or at
> least were, automated API tests checking that code path. Maybe some source
> control archeology?
>
> Don't disable MarkLogic's function mapping arbitrarily in that codebase,
> because there is quite a lot of code that depends on it. I usually leave a
> comment where it's used to call attention to it. Disabling function mapping
> will cause a lot of things to break, which would lead you down a rat hole of
> unnecessarily rewriting a lot of working code. Migrating to the XQuery 3.0
> map operator would be an option (which wasn't available when we started
> writing the code on ML 5), but proceed with caution.
>
> Let vs declare: I do often eschew FLWORs for main module bodies in favor of
> declared variables. To some extent this is personal taste but there is also
> some method to the madness. Declared variables are evaluated during the
> static analysis phase, before the module body begins evaluation. I often
> have a series of variable declarations like the one Adam describes:
>
> declare variable $content-type as xs:string :=
> slib:validated-bulk-load-content-type()
>
> This, when eventually evaluated, sets $content-type to a value returned by
> a function that checks the validity of the Content-Type header. The variable
> also has an explicit type and cardinality of one and only one. This means
> that, when used, $content-type will either cause an exception, thrown from
> the validation function or from a type mismatch on the assignment, or have a
> valid value for its type. There is an error handler configured on that
> appserver which will trap any such exceptions and return a structured error
> message.
>
> I use this pattern to assign all the values that need validation to
> declared variables. Then make use of the variables as appropriate. Any
> referenced variables are then known to be good if no exceptions are thrown.
> It also lets me leverage the lazy evaluation characteristic of declared
> variables because only the ones I actually use are assigned. That's a
> feature. Basically the validation rules are written into the variable
> assignments and only the variables referenced in the run-time execution path
> of the module body need to be reified.
>
> Now, apart from style preference, there should be no effective difference
> between using "let" bindings and declaring variables for a given operation
> set.
>
> This:
>
> declare variable $foo as xs:string := lib:functionA();
> declare variable $bar as xs:string := lib:functionB ($foo);
> functionC ($foo, $bar)
>
> Should always produce the same result as this:
>
> let $foo as xs:string := lib:functionA()
> let $bar as xs:string := lib:functionB ($foo)
> return functionC ($foo, $bar)
>
> There are differences in when and how the values are evaluated, but the end
> result must be the same. If it's not, either there is a bug in the XQuery
> implementation or the two are not actually equivalent.
>
> I would contend that if you're changing from declared variables to let
> bindings and getting different results then you've made a translation mistake
> or something non-obvious is happening in the original code (which led to a
> non-obvious translation mistake).
>
> Do let us know when you get to the bottom of it.
>
> ---
> Ron Hitchens {[email protected]} +44 7879 358212
>
> On Jun 19, 2014, at 8:39 PM, David Lee <[email protected]> wrote:
>
>> I don't believe you can count on the depth of precision of your statement.
>> We start delving into exactly what "evaluate" means ... and then we
>> have a really long interesting discussion which has very few concrete
>> answers in the general case (or often in the specific case).
>>
>> For example ... are you *positive* that these two let statements are
>> "evaluated sequentially"
>> (whatever that means) in MarkLogic ?
>>
>> let $a := cts:search( /foo , "word1" ),
>> $b := cts:search( /bar , "word2" )
>> return xdmp:estimate( ($a,$b) )
>>
>> I wouldn't count on it ... (for any definition of "evaluated sequentially")
>> And what does "evaluate" and "evaluate sequentially" actually mean ? Does
>> it mean that the final results of $a are available in memory and ready to
>> be served up before cts:search( /bar , "word2" ) is started ?
>> If that's what you think it means, then you'll be in for a surprise.
>> XQuery Specs certainly have nothing to say about it ... specifically and
>> intentionally nothing.
>>
>> I'll leave it at that... In general it would be wise to make fewer rather
>> than more assumptions about
>> what is done when ... and focus more on the results. Anything with side
>> effects is eventually going to surprise you.
>> If you absolutely need to be in control over ordering and timing then
>> there are transactional statements, and "hints" like xdmp:eager and
>> xdmp:lazy ... but the less you depend on ordering the better.
>> In general extensions which knowingly produce side effects (like maps,
>> arrays) try to do so in an 'obvious' order,
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Keith L.
>> Breinholt
>> Sent: Thursday, June 19, 2014 3:14 PM
>> To: MarkLogic Developer Discussion
>> Cc: Sahay, Saurabh (RBI-UK)
>> Subject: Re: [MarkLogic Dev General] Evaluation of `declare variable`
>> vs `let`
>>
>> Thanks for the clarification David.
>>
>> However, I believe the observation stands. Variables created using
>> 'declare' statements will never get evaluated in parallel since they are
>> only evaluated on demand which is inherently sequential.
>>
>> However, if you have a 'declared' variable in a module it will get evaluated
>> exactly once no matter how many times it is used. Whereas a 'let' variable
>> will get evaluated every time; it may get pulled from a cache, but the let
>> variable will be evaluated each time the FLWOR it resides in is evaluated.
>> Is this observation correct David?
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of David
>> Lee
>> Sent: Thursday, June 19, 2014 11:09 AM
>> To: MarkLogic Developer Discussion
>> Cc: Sahay, Saurabh (RBI-UK)
>> Subject: Re: [MarkLogic Dev General] Evaluation of `declare variable`
>> vs `let`
>>
>> I would be cautious with the phrase ' let variables are executed in parallel'
>>
>> They are not necessarily evaluated in "parallel" , (prefer 'evaluated' to
>> 'executed'), that would imply all let clauses are run in separate threads,
>> cores, processes, systems without any synchronization.
>> They are not (necessarily).
>> Rather they are evaluated in an implementation dependent fashion that
>> produces the required results (as defined by the XQuery language).
>> For the most part any 'parallelism' of XQuery evaluation is not detectable
>> by the program itself ( requires using a side-effect producing statement)
>> and is not guaranteed even on the same code run twice.
>> ( the second run may be able to use cached results from the first run and
>> not bother doing anything 'in parallel').
>>
>> Also 'parallelism' is fundamentally different from lazy evaluation. Lazy
>> evaluation can occur in a completely single threaded non parallel fashion -
>> it's the order (or presence) of evaluation which is not defined.
>> 'Parallel' implies actual concurrency.
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Keith L.
>> Breinholt
>> Sent: Thursday, June 19, 2014 12:57 PM
>> To: MarkLogic Developer Discussion
>> Cc: Sahay, Saurabh (RBI-UK)
>> Subject: Re: [MarkLogic Dev General] Evaluation of `declare variable`
>> vs `let`
>>
>> If your other variable $etag-map is the empty sequence you may have run into
>> function mapping behavior.
>>
>> Look for a good explanation of function mapping here:
>> http://nelsonwells.net/2013/05/two-reasons-your-marklogic-code-is-failing-silently-part-2/
>>
>> And here:
>> http://docs.marklogic.com/guide/xquery/enhanced#id_55459
>>
>> An easy way to know if function mapping is causing this behavior is to turn
>> it off at the top of your module with this statement.
>>
>> declare option xdmp:mapping "false";
>>
>> Unless you understand all the intricacies of function mapping I highly
>> suggest you turn it off.
>>
>> As to the usage of declare vs. let I agree with Danny that it is mostly a
>> matter of preference.
>>
>> The only distinguishing ... declare variables are lazily evaluated and let
>> variables are executed in parallel unless a let variable's definition
>> expression uses another let variable which will force sequential evaluation
>> of any parent variables in the definition expression.
>>
>> -Keith
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Retter,
>> Adam (RBI-UK)
>> Sent: Thursday, June 19, 2014 9:29 AM
>> To: MarkLogic Developer Discussion
>> Cc: Sahay, Saurabh (RBI-UK)
>> Subject: [MarkLogic Dev General] Evaluation of `declare variable` vs
>> `let`
>>
>> Given a main module with no user defined functions, is it considered better
>> practice to use `declare variable` as opposed to `let`, the reason I ask is
>> twofold:
>>
>> 1) I have started on a large code-base that seems to eschew `let` in favour
>> of using `declare variable`. This seems somewhat strange to me.
>>
>> 2) We have seen different evaluation strategies, whereby a function which
>> returns an empty-sequence which is bound in a declare variable clause, is
>> never executed. Yet when we re-write that as a let binding, the code is
>> executed. I think perhaps the query optimiser in ML is being too aggressive
>> here?
>>
>>
>> For example -
>>
>> declare variable $content-type as xs:string :=
>> slib:validated-bulk-load-content-type();
>>
>> and then a few lines later we have -
>>
>> slib:normal-bulk-load-response ($etag-map, fn:count
>> ($etag-map/store:etag-entry), $content-type)
>>
>> Subsequently the `slib:normal-bulk-load-response` calls
>> xdmp:set-response-content-type($content-type). However the
>> `slib:validated-bulk-load-content-type()` function is never evaluated, we are
>> certain of this because it eventually calls `fn:error`, yet the error never
>> occurs!
>>
>> If we switch the `declare variable $content-type` for a `let $content-type`
>> then we do see the error occurring!
>>
>> Cheers Adam.
>>
>> DISCLAIMER
>> This message is intended only for the use of the person(s) ("Intended
>> Recipient") to whom it is addressed. It may contain information, which is
>> privileged and confidential. Accordingly any dissemination, distribution,
>> copying or other use of this message or any of its content by any person
>> other than the Intended Recipient may constitute a breach of civil or
>> criminal law and is strictly prohibited. If you are not the Intended
>> Recipient, please contact the sender as soon as possible.
>> Reed Business Information Limited. Registered Office: Quadrant House, The
>> Quadrant, Sutton, Surrey, SM2 5AS, UK.
>> Registered in England under Company No. 151537
>>
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
>>
>>
>> NOTICE: This email message is for the sole use of the intended recipient(s)
>> and may contain confidential and privileged information. Any unauthorized
>> review, use, disclosure or distribution is prohibited. If you are not the
>> intended recipient, please contact the sender by reply email and destroy all
>> copies of the original message.
>>
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general