Dear Christian,

I have to admit that everything you say makes sense, and I was even digging in my swapped-out long-term memory for something like the annotation you mention.

The background information you share makes me feel more comfortable and I'll take your advice.

Thanks!

Marco.

On 07/02/23 19:22, Christian Grün wrote:
Hi Marco,

let $ops := (
    for $i in (1 to 5)
    let $url := "http://www.google.com"
    return function(){
      file:write(file:create-temp-file("ttt", string($i)), fetch:content-type($url))
    }
)
let $download := xquery:fork-join($ops)
return count($ops)

I've often noticed that the archive arrives empty. After investigating, I've found that
query [1] is not predictable. It is often optimized to "count(0)".

That’s surprising, and I didn’t manage to reproduce it. count($ops)
should consistently yield 5, as the number of (non-executed) function
items attached to $ops is 5, regardless of what is supposed to happen
before or after the variable declaration. Maybe you wrote
"count($download)" or something similar? Is there any other way to get
it reproduced?

However, I can confirm that xquery:fork-join($ops) is not evaluated,
as its result, which is bound to $download, is never referenced.
Here’s a better way to write it:

let $ops := (... for $i in ...)
return (
   xquery:fork-join($ops),
   count($ops)
)

Another solution to enforce the evaluation of the function call is the
basex:non-deterministic pragma [1]…

let $ops := (... for $i in ...)
let $download := (# basex:non-deterministic #) { xquery:fork-join($ops) }
return count($ops)

…but in general, it’s better to get rid of unreferenced variables in
the code whenever possible.

Some background noise: Non-deterministic and side-effecting functions
carve out a niche existence in the official W3C standards, as they
contradict the nature of functional languages. It’s tricky for the
optimizer to treat them properly: Function items are deterministic,
but when they are evaluated, they may trigger side effects.
Deterministic code that seems irrelevant is removed from the original
query whenever possible, so the ways to circumvent this are to either
wrap the expression with the pragma (thus annotating it as
non-deterministic) or move it to the result sequence.

And a monologic side note: Maybe we should internally annotate
xquery:fork-join as non-deterministic. Even if it may contain purely
deterministic code, it’s almost always used for non-deterministic
operations in practice.

Hope this helps,
Christian

[1] https://docs.basex.org/wiki/XQuery_Extensions
