The following XQuery, run in the GUI (pulled from GitHub and built from source a few minutes ago)

ft:tokens('testdata'),
ft:search('testdata', 'r.ḥ', map {'wildcards': true()})/..,
'----------',
collection('testdata')//*[text() contains text 'r.ḥ' using wildcards]

yields

<entry count="4">rwḥ</entry>
----------

with collection('testdata') containing:

<_>
  <gram xmlns="http://www.tei-c.org/ns/1.0" type="root" xml:lang="ar-aeb-x-vicav">rwḥ</gram>
  <gram xmlns="http://www.tei-c.org/ns/1.0" type="root" xml:lang="ar-aeb-x-vicav">rwḥ</gram>
  <gram xmlns="http://www.tei-c.org/ns/1.0" type="root" xml:lang="ar-aeb-x-vicav">rwḥ</gram>
  <gram xmlns="http://www.tei-c.org/ns/1.0" type="root" xml:lang="ar-aeb-x-tunis-vicav">rwḥ</gram>
</_>

However, the gh1800() test works when changed like this:

final String text = "999 aa 1111 rwḥ";
[...]
query("ft:search('" + NAME + "', 'r.ḥ', " + options + ")", text);

On 06.02.2020 at 13:45, Christian Grün wrote:
I just tried to use the gh1800 test to replicate my problem and it does
not show there. It fails using the GUI.
I need your help: What does not show there? What fails, what happens?




On 06.02.2020 at 13:35, Christian Grün wrote:
Hi Omar,

Yes, that seems to solve the problem partly. Using wildcards now yields the 
same result as no wildcards.
Glad to hear.

But if there is a complex Unicode character in the search string, "." for one 
character loses its meaning.
…
Would you like a PR for the test gh1800 using complex unicode characters?
A little test case would be helpful indeed. It seems to be a different issue:

• The first expression is evaluated without the full-text expression.
The reason is that the full-text index algorithms are limited to basic
regular expressions; not all of them can be answered by an index (and
'r.{1,1}' is currently not detected as being identical to `r.`). If I
remember correctly, the index will not be accessed either if a pattern
starts with `.*` (this pattern would lead to a full index scan).

• The second expression is rewritten for index access. I tried to
build a little command script (test.bxs), but it doesn’t seem to
reflect the case you encountered:

set ftindex true
create db test <xml>rwḥ</xml>
xquery /*[text() contains text 'r.{1,1}ḥ' using wildcards]
xquery /*[text() contains text 'r.ḥ' using wildcards]
close

Could you extend this example script a little so that it
demonstrates what goes wrong?
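
To make the two index rules from the bullets above more concrete (a pattern starting with `.*` is not answered by the index, and a `{min,max}` quantifier is not recognized as equivalent to a plain `.`), here is a minimal Java sketch. The class WildcardIndexSketch and the method mayUseIndex are hypothetical names made up for this illustration. This is not the BaseX implementation, only a simplified restatement of the behaviour as described above, and it ignores details such as escaped characters.

```java
// Hypothetical illustration only: NOT the BaseX implementation.
final class WildcardIndexSketch {

    /**
     * Returns true if, under the simplified rules described in this thread,
     * a wildcard pattern would be a candidate for full-text index access.
     */
    static boolean mayUseIndex(final String pattern) {
        // A leading ".*" would amount to a full index scan, so no index access.
        if (pattern.startsWith(".*")) return false;
        // Quantifiers such as ".{1,1}" are not reduced to a plain ".",
        // so (in this sketch) any "{" rules out index access.
        if (pattern.contains("{")) return false;
        return true;
    }

    public static void main(final String[] args) {
        // \u1E25 is the precomposed character "h with dot below".
        final String[] patterns = { "r.\u1E25", "r.{1,1}\u1E25", ".*\u1E25" };
        for (final String p : patterns) {
            System.out.println(p + " -> index candidate: " + mayUseIndex(p));
        }
    }
}
```

Under these assumptions, the two queries in test.bxs would take different evaluation paths: the `.{1,1}` variant would be evaluated sequentially, the plain `.` variant via the index.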

Thanks in advance,
Christian
