Hello -- So as part of building tests, I'm regularizing the text contents of some Word documents into single strings. (Which makes it relatively easy to make sure no words have gotten lost or changed order when compared to other stages of the process.)
Regularization is a tactful way to put this particular atrocity:

    let $stringTidy as function(xs:string+) as xs:string :=
      function($in as xs:string+) as xs:string {
        $in
        => string-join(' ')
        => replace(xquery:eval($menuMatch), '')
        => replace('&#xA;', ' ')
        => replace('&#x9;', ' ')
        => replace('&#xD;', ' ')
        => replace('\p{Zs}', ' ')
        => replace(' +', ' ')
        => replace(' ([,\.;:])', '$1')
        => replace('^ ', '')
        => replace(' $', '')
      }

$menuMatch gets stripped out of the Word text because it's added by processing, rather than being present in the source file which generates the other half of the compare. (It's currently U+1405, ᐅ, though I devoutly hope this doesn't matter!) It gets read from an XSL source document, which I've included in minimal form, along with some sample data and a minimal-ish query.

If I use $menuMatch directly in the replace, it doesn't work, in the sense that the ᐅ character is NOT removed from the string. If I xquery:eval() it, as here, the replace does remove the ᐅ from the string.

I don't expect to need xquery:eval to use a variable as the second argument of replace(). Am I wrong? Has the pile of arrow operators exceeded the bounds of reason?

Thanks!
Graydon
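P.S. For reference, a minimal sketch of what I expected to work without xquery:eval. It assumes $menuMatch holds nothing but the bare character; the literal value and the sample string below are stand-ins for illustration, not what is actually read from the XSL document.

```xquery
(: Stand-in value: assumed to be the bare character, with no quotes or markup around it :)
let $menuMatch as xs:string := 'ᐅ'
(: Hypothetical sample string standing in for the Word text :)
let $sample as xs:string := 'File ᐅ Open'
return $sample
  => replace($menuMatch, '')   (: variable used directly as the pattern :)
  => replace(' +', ' ')        (: collapse the doubled space left behind :)
```

With those stand-in values, the variable behaves like any other pattern string, which is why the need for xquery:eval on the real value surprises me.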
<<attachment: basex-test.zip>>