Re: [basex-talk] Too aggressive optimizations?
Dear Christian,

I have to admit that everything you say makes sense, and I was even digging in my swapped-out long-term memory for something like the annotation you mention. The background information you share makes me feel more comfortable, and I'll take your advice.

Thanks!
Marco.

On 07/02/23 19:22, Christian Grün wrote:
> Hi Marco,
>
> > let $ops := (
> >   for $i in (1 to 5)
> >   let $url := "http://www.google.com"
> >   return function() {
> >     (file:write(file:create-temp-file("ttt", string($i)), fetch:content-type($url)))
> >   }
> > )
> > let $download := xquery:fork-join($ops)
> > return count($ops)
> >
> > I've noticed often the archive arrives empty. So after investigation I've
> > found that query [1] is not predictable. It is often optimized to "count(0)".
>
> That’s surprising, and I didn’t manage to reproduce it. count($ops) should
> consistently yield 5, as the number of (non-executed) function items bound
> to $ops is 5, regardless of what is supposed to happen before or after the
> variable declaration. Maybe you wrote "count($download)" or something
> similar? Is there any other way to get it reproduced?
>
> However, I can confirm that xquery:fork-join($ops) is not evaluated, as its
> result, which is bound to $download, is never referenced. Here’s a better
> way to write it:
>
>   let $ops := (... for $i in ...)
>   return (
>     xquery:fork-join($ops),
>     count($ops)
>   )
>
> Another solution to enforce the evaluation of the function call is the
> basex:non-deterministic pragma [1]…
>
>   let $ops := (... for $i in ...)
>   let $download := (# basex:non-deterministic #) {
>     xquery:fork-join($ops)
>   }
>   return count($ops)
>
> …but in general, it’s better to get rid of unreferenced variables in the
> code whenever possible.
>
> Some background noise: Non-deterministic and side-effecting functions carve
> out a niche existence in the official W3C standards, as they contradict the
> nature of functional languages. It’s tricky for the optimizer to treat them
> properly: Function items are deterministic, but when they are evaluated,
> they may trigger side effects. Deterministic code that seems irrelevant is
> removed from the original query whenever possible, so the ways to circumvent
> this are either to wrap the expression with the pragma (thus annotating it
> as non-deterministic) or to move it to a result sequence.
>
> And a monologic side note: Maybe we should internally annotate
> xquery:fork-join as non-deterministic. Even if it may contain purely
> deterministic code, it’s almost always used for non-deterministic
> operations in practice.
>
> Hope this helps,
> Christian
>
> [1] https://docs.basex.org/wiki/XQuery_Extensions
Re: [basex-talk] Too aggressive optimizations?
Hi Marco,

> let $ops := (
>   for $i in (1 to 5)
>   let $url := "http://www.google.com"
>   return function() {
>     (file:write(file:create-temp-file("ttt", string($i)), fetch:content-type($url)))
>   }
> )
> let $download := xquery:fork-join($ops)
> return count($ops)
>
> I've noticed often the archive arrives empty. So after investigation I've
> found that query [1] is not predictable. It is often optimized to "count(0)".

That’s surprising, and I didn’t manage to reproduce it. count($ops) should consistently yield 5, as the number of (non-executed) function items bound to $ops is 5, regardless of what is supposed to happen before or after the variable declaration. Maybe you wrote "count($download)" or something similar? Is there any other way to get it reproduced?

However, I can confirm that xquery:fork-join($ops) is not evaluated, as its result, which is bound to $download, is never referenced. Here’s a better way to write it:

  let $ops := (... for $i in ...)
  return (
    xquery:fork-join($ops),
    count($ops)
  )

Another solution to enforce the evaluation of the function call is the basex:non-deterministic pragma [1]…

  let $ops := (... for $i in ...)
  let $download := (# basex:non-deterministic #) {
    xquery:fork-join($ops)
  }
  return count($ops)

…but in general, it’s better to get rid of unreferenced variables in the code whenever possible.

Some background noise: Non-deterministic and side-effecting functions carve out a niche existence in the official W3C standards, as they contradict the nature of functional languages. It’s tricky for the optimizer to treat them properly: Function items are deterministic, but when they are evaluated, they may trigger side effects. Deterministic code that seems irrelevant is removed from the original query whenever possible, so the ways to circumvent this are either to wrap the expression with the pragma (thus annotating it as non-deterministic) or to move it to a result sequence.

And a monologic side note: Maybe we should internally annotate xquery:fork-join as non-deterministic. Even if it may contain purely deterministic code, it’s almost always used for non-deterministic operations in practice.

Hope this helps,
Christian

[1] https://docs.basex.org/wiki/XQuery_Extensions
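For reference, the result-sequence variant suggested above can be spelled out in full. This is only a minimal sketch that fills in the elided "for" clause with the loop from the quoted query; it assumes network access to the URL and a writable temporary directory, and it has not been run against a specific BaseX version.

```xquery
(: Sketch: the fork-join call is part of the result sequence, so it is
   referenced by the query and is not dropped as an unused variable. :)
let $url := "http://www.google.com"
let $ops := (
  for $i in (1 to 5)
  return function() {
    (: each function item writes the fetched content type to a new temp file :)
    file:write(
      file:create-temp-file("ttt", string($i)),
      fetch:content-type($url)
    )
  }
)
return (
  xquery:fork-join($ops),  (: evaluated for its side effects :)
  count($ops)              (: number of function items :)
)
```

Since file:write returns an empty sequence, the only visible item in the result should be the count of the function items.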
[basex-talk] Too aggressive optimizations?
Dear all,

my scenario is a RESTXQ function that should:

- download resources and store them in a temporary directory
- do it with fork-join in order to obtain lower latency
- compress them to a zip archive and return the archive data

I've noticed that the archive often arrives empty. After investigation, I've found that query [1] is not predictable. It is often optimized to "count(0)". I can manage to produce results from time to time with [2], but not consistently. [3] seems to be the safer solution. The behavior is the same with 9.x and 10.

Since I do not feel very comfortable, can someone tell me whether I'm doing it wrong, whether there is a secure solution, or whether I should abandon fork-join tout court?

Thanks a lot.
Regards,
Marco.

[1]
  let $ops := (
    for $i in (1 to 5)
    let $url := "http://www.google.com"
    return function() {
      (file:write(file:create-temp-file("ttt", string($i)), fetch:content-type($url)))
    }
  )
  let $download := xquery:fork-join($ops)
  return count($ops)

[2]
  let $ops := (
    for $i in (1 to 5)
    let $url := "http://www.google.com"
    return function() {
      (file:write(file:create-temp-file("ttt", string($i)), fetch:content-type($url)), 1)
    }
  )
  let $download := xquery:fork-join($ops)
  return count($ops)

[3]
  let $ops := xquery:fork-join(
    for $i in (1 to 5)
    let $url := "http://www.google.com"
    return function() {
      (1, file:write(file:create-temp-file("ttt", string($i)), fetch:content-type($url)))
    }
  )
  return count($ops)
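For the scenario described in this message (parallel download, then zip), one possible variation is to let each function item return its data instead of writing a temporary file, so that the fork-join result is referenced by the archive step. This is not a query from the thread, only a hedged sketch: the $urls list and the "resource-N" entry names are hypothetical, and it assumes the fetch:binary and archive:create functions of the BaseX Fetch and Archive modules.

```xquery
(: Hypothetical list of resources to download :)
let $urls := ("http://www.google.com", "http://www.google.com")
(: Each function item returns a map with an entry name and the fetched bytes;
   the result of fork-join is used below, so the call has to be evaluated. :)
let $results := xquery:fork-join(
  for $url at $pos in $urls
  return function() {
    map {
      'name': 'resource-' || $pos,
      'data': fetch:binary($url)
    }
  }
)
(: Build the zip archive from the collected names and contents :)
return archive:create($results?name, $results?data)
```

Keeping the name and the data together in one map means the pairing of archive entries and contents does not depend on the order in which the parallel results come back.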