Hi Hans-Jürgen,

You are right. I’ve created an issue to get this fixed [1].

Best,
Christian

[1] https://github.com/BaseXdb/basex/issues/2141



On Tue, Sep 13, 2022 at 4:43 PM Hans-Juergen Rennau <hren...@yahoo.de> wrote:
>
> Dear BaseX people,
>
> it seems to me there is a bug concerning Full Text Search, using option 
> "window":
>
> let $text1 := '1 The usability of a Web site is how well the site'
> let $text2 := '2 The usability of a Web site is how well the sitx'
> let $text3 := '3 The usability of a Web site is how well the site'
> return (
>   $text1[. contains text 'usability web site' all words window 5 words],
>   $text2[. contains text 'usability web site' all words window 5 words],
>   $text3[. contains text 'usability web site' all words window 10 words]
> )
>
> This query should return all three, $text1, $text2 and $text3, but it only 
> returns $text2 and $text3.
>
> So it seems to me that the implemented logic is: "all matches of the 'all
> words' search must be within a 5-word window", but it should be: "there is a
> match of the 'all words' search which is within a 5-word window". A more
> detailed argument is in the PS.
>
> Kind regards,
> Hans-Jürgen
>
> PS: Compare https://www.w3.org/TR/xpath-full-text-30/#ftwindow
>
> "A window selection examines the matches generated by the preceding portion 
> of the FTSelection, and selects those for which the matched tokens and 
> phrases (more precisely, the individual StringIncludes of that match) are all 
> found within a window whose size is a specified number of FTUnits (words, 
> sentences, or paragraphs); for each such window, the window selection then 
> generates a match containing the merge of those StringIncludes, plus any 
> StringExcludes that fall within the window."
>
> (Italic added by me)
>
> The detailed semantics are given in 4.2.4 FTWords [1]. Pseudo-function
> fts:applyFTWordsAllWord() constructs for each of the words an fts:allMatches
> element and then performs conjunction of these elements, based on recursive 
> application of 4.3.6.2 FTAnd [2]. Each fts:allMatches element contains one 
> fts:match element for each occurrence of the word in question. The "ANDing" 
> of two fts:allMatches is described by pseudo-function fts:ApplyFTAnd(), which 
> creates one match for each pair of matches found in the operands - in other 
> words, all combinations of operand matches are considered.
>
> As an example consider:
>     "foo bar" all words
>
> and assume "foo" occurs two times and "bar" occurs two times. The 
> fts:allMatches element representing the result is the result of applying 
> fts:ApplyFTAnd() to two fts:allMatches elements, one obtained for word "foo", 
> one obtained for word "bar". Schematically:
>
> $fts:allMatches_foo:
>     <fts:match>foo(1)</fts:match>
>     <fts:match>foo(2)</fts:match>
>
> $fts:allMatches_bar:
>     <fts:match>bar(1)</fts:match>
>     <fts:match>bar(2)</fts:match>
>
> fts:ApplyFTAnd($fts:allMatches_foo, $fts:allMatches_bar) is a single
> fts:allMatches containing four matches, each of which is a combination of
> matches found in the operands:
>
> <fts:allMatches_allwords> =
>     <fts:match>foo(1), bar(1)</fts:match>
>     <fts:match>foo(1), bar(2)</fts:match>
>     <fts:match>foo(2), bar(1)</fts:match>
>     <fts:match>foo(2), bar(2)</fts:match>
>
> Now extend the query to
>     "foo bar" all words window 5 words
>
> The result is true if there is at least one combination of occurrences of
> "foo" and "bar" that lies within a window of at most 5 words. In terms of
> the semantic intermediates: if <fts:allMatches_allwords> contains at least
> one <fts:match> element whose contained occurrences all satisfy the window
> condition.
>
> [1] https://www.w3.org/TR/xpath-full-text-30/#tq-ft-fs-FTWords
> [2] https://www.w3.org/TR/xpath-full-text-30/#tq-ft-fs-FTAnd
>
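
The argument in the PS can be checked with a small model. The following is only a sketch, not BaseX code: it assumes a naive tokenizer (`\w+`, lower-cased), counts positions from 1, and treats a window of N words as "last position minus first position plus 1 is at most N". All function names are hypothetical. `contains_window` implements the "at least one match fits" reading described above; `contains_window_all` implements the "every match must fit" reading that the reported behaviour suggests.

```python
import re
from itertools import product


def positions(text, word):
    """Return the 1-based token positions at which `word` occurs."""
    tokens = re.findall(r"\w+", text.lower())  # assumed tokenizer, not BaseX's
    return [i for i, tok in enumerate(tokens, start=1) if tok == word.lower()]


def all_words_matches(text, words):
    """One match per combination of occurrences, as in the pairwise FTAnd."""
    return list(product(*(positions(text, w) for w in words)))


def in_window(match, size):
    """True if all positions of the match span at most `size` words."""
    return max(match) - min(match) + 1 <= size


def contains_window(text, words, size):
    """Reading argued for in the PS: some match lies within the window."""
    return any(in_window(m, size) for m in all_words_matches(text, words))


def contains_window_all(text, words, size):
    """Reading suggested by the reported result: every match must fit."""
    matches = all_words_matches(text, words)
    return bool(matches) and all(in_window(m, size) for m in matches)


text1 = "1 The usability of a Web site is how well the site"
text2 = "2 The usability of a Web site is how well the sitx"
words = ["usability", "web", "site"]

print(contains_window(text1, words, 5), contains_window_all(text1, words, 5))
print(contains_window(text2, words, 5), contains_window_all(text2, words, 5))
```

Under these assumptions, only the "every match" variant rejects `text1` with a 5-word window, because the second occurrence of "site" produces an additional match spanning 10 words.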
