Thank you Nico for taking the time to analyze and comment on my code. I
will study your answer, but not this evening :-) Much appreciated!

All the best,
Chuck


On Fri, Dec 20, 2024 at 1:12 PM Nico Verwer (Rakensi) <nver...@rakensi.com>
wrote:

> Hello Chuck,
>
> The reason why you have infinite recursion is, that the signature of
> `map:merge` is
>
> map:merge($maps as map(*)*, $options as map(*)) as map(*)
>
> The first parameter is a sequence of maps, and the second parameter is a
> map with options.
> This means that in your code
>
>       map:merge($pmid_map, map {
>         $key : subsequence($pmids, $start, $end)
>       })
>
> the map containing the subsequence is interpreted as a (meaningless) map
> with options, and you only keep the empty map that you started with.
> To fix this, insert parentheses like so:
>
>       map:merge(
>         ( $pmid_map
>         , map {$key : subsequence($pmids, $start, $end)}
>         )
>       )
>
> The $options parameter is optional, so I omitted it.
> However, there are more mistakes, which I will go into below. And a better
> approach is to use the *for - group by - return* that I posted earlier.
> This is a lot faster, more concise, and more readable in my opinion.
> My comments below are only meant to highlight some learning points.
>
> declare function local:by_hundreds($pmids as xs:string*, $pmid_map as
> map(*))
>
> as map(*) {
>   let $key_count := count(map:keys($pmid_map))
>
> Better is map:size($pmid_map)
>
>   return
>     (: base case: each value except the last should have 100 items
>        in it; the last should have <= 100; in this case we return the map
>        goal is to build a map with 63 keys, with each value
>        a sequence of 100 or 93 (in the last case) strings :)
>     if ((($key_count + 1) * 100) >= count($pmids))
>       then $pmid_map
>
> This means that you will miss the last 93 strings, because *$key_count *
> 100* will be less then *count($pmids)*, so there are strings left.
>
>     else
>       (: starting index for subsequence() :)
>       let $start := 1 + ($key_count * 100)
>       (: ending index for subsequence() :)
>       let $end := $start + 99
>       (: value of the map key for this 100 items :)
>       let $key := $key_count + 1
>
> If you pass a subsequence of $pmids as the first parameter, instead of
> passing the whole list every time, there is less computation to do, and the
> base case becomes easier to detect.
>
>       (: attempt to log all recurrent executions to files
>       let $foo :=
> local:write_results(<res><start>{$start}</start><end>{$end}</end></res>,
> $key_count)
>       :)
>       (: create a map with $key as key and the subsequence from n to n + 99
>          of $pmids and merge it with input accumulator map to make a new
> map :)
>       let $pmid_map_new := map:merge($pmid_map, map {
>         $key : subsequence($pmids, $start, $end)
>       })
>
> Since there is only one key in this map, it is better to use *map:entry($key,
> subsequence($pmids, $start, $end))*.
>
>       (: recur; the number of keys in $pmid_map_new will enable the next
>          execution to determine which items from $pmids to use :)
>       return local:by_hundreds($pmids, $pmid_map_new)
> };
>
>
> The recursive pattern you use can be written more efficiently as a
> fold-left. See the excellent XQuery book by Priscilla Walmsley for examples.
>
> But as I said, the conventional way to do this in XQuery would be
>
>   map:merge(
>     for $pmidset at $i in $pmids
>     group by $chunk as xs:integer := ($i idiv 100)
>     return map:entry($chunk, $pmidset)
>   )
>
>
> Happy XQuerying!
>
> Best regards,
> Nico Verwer
>
>
>
>
>

Reply via email to