Hi,

For (my personal) clarity, I have split up the original function into two
parts:

declare function local:step_one($nodes as node()*) as array(*)*
{
  let $text := for $node in $nodes
     return $node/text() =>
     tokenize() => distinct-values()
  let $idf := $text   =>
     tidyTM:wordCount_arr()
  return $idf
};

In local:step_one(), I first create a sequence with the distinct tokens
for each $node. All the sequences are joined in $text.
I then call wordCount_arr to count the occurrences of each word in $text,
so each count is the number of nodes in which that word appears:

declare function tidyTM:wordCount_arr(
  $Words as xs:string*)
  as array(*) {
for $w in $Words
  let $f := $w
  group by $f
  order by count($w) descending
return ([$f, count($w)])
} ;

I would say that tidyTM:wordCount_arr returns a sequence of arrays, but I
am not certain whether I have specified the correct return type?
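
For comparison, here is a minimal sketch of how the declaration might look
if the function is indeed meant to return one array per distinct word. The
only change to the body-independent part is the occurrence indicator on the
return type: array(*)* (zero or more arrays) instead of array(*) (exactly
one array). The namespace URI is a made-up placeholder, since the real
tidyTM module URI is not shown here.

```xquery
xquery version "3.1";

(: Hypothetical namespace URI; the real tidyTM module URI is not shown. :)
declare namespace tidyTM = "urn:example:tidyTM";

declare function tidyTM:wordCount_arr(
  $Words as xs:string*
) as array(*)* {   (: zero or more arrays, one per distinct word :)
  for $w in $Words
  let $f := $w
  group by $f
  order by count($w) descending
  return [$f, count($w)]
};

(: Small made-up input: "a" occurs twice, "b" once. :)
tidyTM:wordCount_arr(("a", "b", "a"))
```

With this input the query should yield two arrays, ["a", 2] followed by
["b", 1].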

Calling local:step_one(tidyTM:remove_Stopwords($nodes, "Stp", $Stoppers))
returns:
["probleem", 703]
["opgelost.", 248]
....

I had hoped that calling the following local:wordFreq_idf would add the
idf to each array, but instead I get an error:

declare function local:wordFreq_idf($nodes as node()*)  as array(*)
{
  let $count := count($nodes)
  let $idf := local:step_one($nodes)
  let $result := for-each( $idf,
    function($z) {array:append ($z, math:log($count div $z(2) ) ) } )
  return $result
};
[XPTY0004] Cannot promote (array(xs:anyAtomicType))+ to array(*): $idf
:= ([ "probleem", 703 ], [ "opgelost.", 248 ], ...).
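
The message suggests that a sequence of arrays is being checked against a
type that allows only a single array. A possible sketch of the second step
with array(*)* as the declared type is given here. It is not a confirmed
fix: local:append_idf is a hypothetical stand-in that takes the
[word, count] arrays as a parameter instead of calling local:step_one(),
and the value 1000 for the node count is invented for the example.

```xquery
xquery version "3.1";

(: Declared explicitly in case the processor does not predeclare them. :)
declare namespace math  = "http://www.w3.org/2005/xpath-functions/math";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

(: Hypothetical stand-in for local:wordFreq_idf: receives the
   [word, count] arrays directly rather than computing them from nodes. :)
declare function local:append_idf(
  $count as xs:integer,
  $idf   as array(*)*
) as array(*)* {   (: a sequence of arrays, not a single array :)
  for-each($idf, function($z) {
    (: append log(total div count) as a third member of each array :)
    array:append($z, math:log($count div $z(2)))
  })
};

(: 1000 is a made-up node count used only for illustration. :)
local:append_idf(1000, (["probleem", 703], ["opgelost.", 248]))
```

Each returned array then has three members: the word, its count, and the
natural logarithm of the total divided by that count.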


Cheers, Ben

On 31-03-2020 at 16:29, Martin Honnen wrote:
> So does the working function return a sequence of arrays? That doesn't
> match the
>   as array(*)
> return type declaration, it seems.
> 
> What does tidyTM:wordCount_arr() return, a single array (of atomic items)?

