Hello Alexandre,
A phrase is searched as a group of words, so each word is first looked
up in the free-text index on its own. For the phrase 'social d*' this
means the engine searches for 'social' and for 'd*', then checks word
positions to find a place where 'social' and 'd*' are in the required
proximity within some document that contains both. So 'd*' has to be
selective enough by itself, without any connection to 'social', and it
is not.
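To illustrate the two steps above, here is a minimal sketch in C. It is a toy model, not Virtuoso code: the posting structure, the sample data and the function names (word_matches, count_postings, count_phrase_docs) are all made up for the example, and "proximity" is simplified to "directly adjacent".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One posting of a toy free-text index: word, document id, word position. */
struct posting { const char *word; int doc; int pos; };

/* Hypothetical sample data, sorted by document id. Not from a real index. */
static const struct posting toy_index[] = {
  {"social", 1, 0}, {"data", 1, 1}, {"web", 1, 2},
  {"social", 2, 0}, {"web", 2, 1}, {"design", 2, 5},
  {"dog", 3, 0}, {"door", 3, 1},
};
static const size_t toy_index_len = sizeof (toy_index) / sizeof (toy_index[0]);

/* Returns 1 if word matches pattern; a '*' in pattern matches any suffix. */
static int
word_matches (const char *pattern, const char *word)
{
  const char *star = strchr (pattern, '*');
  assert (word != NULL);
  if (star)
    return 0 == strncmp (pattern, word, (size_t) (star - pattern));
  return 0 == strcmp (pattern, word);
}

/* Step 1: a pattern is looked up on its own; count the postings it selects. */
static int
count_postings (const char *pattern)
{
  int n = 0;
  for (size_t i = 0; i < toy_index_len; i++)
    n += word_matches (pattern, toy_index[i].word);
  return n;
}

/* Step 2: count documents where 'first' is directly followed by 'second'.
   Relies on toy_index being sorted by document id to count each one once. */
static int
count_phrase_docs (const char *first, const char *second)
{
  int hits = 0, last_doc = -1;
  for (size_t i = 0; i < toy_index_len; i++)
    for (size_t j = 0; j < toy_index_len; j++)
      {
        const struct posting *a = &toy_index[i], *b = &toy_index[j];
        if (a->doc != last_doc && a->doc == b->doc && b->pos == a->pos + 1
            && word_matches (first, a->word) && word_matches (second, b->word))
          {
            hits++;
            last_doc = a->doc;
          }
      }
  return hits;
}
```

In this toy data count_postings ("d*") selects four postings across all three documents, although only one document actually contains the phrase. The wildcard is expanded before the position check can narrow anything down, which is why it has to be selective on its own.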
Re. "litterals < 4 chars" :
This can be tweaked in
  wp_wildcard_range (const char * word, caddr_t * lower, caddr_t * higher)
  {
    char * star = strchr (word, '*');
    int leading = star ? (int) (star - word) : 0;
    if (star)
      {
        if (leading < 4)
          return RANGE_ERROR;
  ...
in libsrc/Wi/text.c, by replacing "leading < 4" with "leading < 2" or
a similar value. However, that may simply trade one error for another:
if the #define for WST_WILDCARD_MAX is kept unchanged and too many words
match the wildcard, a different problem will be reported instead.
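The interaction between the two limits can be sketched as follows. This is a standalone toy in C, not the code from text.c: expand_wildcard, the two error codes, the vocabulary and the limit values passed in are all hypothetical, and max_words only stands in for the role of WST_WILDCARD_MAX (its real value is not assumed here).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXPAND_TOO_SHORT (-1) /* hypothetical: literal prefix below the minimum */
#define EXPAND_TOO_MANY  (-2) /* hypothetical: more matching words than the cap */

/* Hypothetical vocabulary of indexed words. */
static const char *const toy_vocab[] = {
  "data", "database", "design", "dog", "door", "social", "society",
};
static const size_t toy_vocab_len = sizeof (toy_vocab) / sizeof (toy_vocab[0]);

/* Counts vocabulary words matching a "prefix*" pattern, or returns an error
   code if the prefix is too short or the expansion exceeds max_words. */
static int
expand_wildcard (const char *pattern, const char *const *vocab,
    size_t vocab_len, int min_leading, int max_words)
{
  const char *star = strchr (pattern, '*');
  int leading, matched = 0;
  assert (star != NULL);        /* this sketch handles wildcard patterns only */
  leading = (int) (star - pattern);
  if (leading < min_leading)
    return EXPAND_TOO_SHORT;
  for (size_t i = 0; i < vocab_len; i++)
    if (0 == strncmp (pattern, vocab[i], (size_t) leading))
      matched++;
  if (matched > max_words)
    return EXPAND_TOO_MANY;
  return matched;
}
```

With a minimum of 4 and a cap of 3, "d*" is rejected as too short. Lowering the minimum to 1 while keeping the cap at 3 lets "d*" through the first check, but its five matches then exceed the cap, so the query still fails, only for a different reason.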
Best Regards,
Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com
On Tue, 2011-03-15 at 14:01 -0500, Alexandre Passant wrote:
> Hi all,
>
>
> I have trouble to understand how the free-text index works in
> Virtuoso, especially the minimal length of the indexed string.