On 03/20/2018 02:31 PM, zheng xiaojin wrote:
(I am not sure whether you have seen my message, so I am resending it.)
Hi,
When using FTS5, I have found cases where a prefix search returns 
results that do not match the prefix.

For example, with "select * from tbl_fts where tbl_fts match 'lucy*';" I expect to get records like 
"lucya", "lucyabc", etc.,

but "lux" or "lulu" is also returned, which is not what I want.

Such problems are not common, but I have built a test case that 
reproduces this problem quite easily.
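
To make the expected behaviour concrete, here is a minimal sketch of the prefix test that a query like 'lucy*' implies for a single term. The helper name term_has_prefix is hypothetical and is not part of FTS5; it only restates the comparison quoted later in this message (nTerm < nToken || memcmp(pToken, pTerm, nToken)) as a standalone function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not an FTS5 function.
** Returns 1 if the nTerm-byte term begins with the nToken-byte token,
** and 0 otherwise. */
static int term_has_prefix(const char *pTerm, int nTerm,
                           const char *pToken, int nToken){
  assert( pTerm!=0 && pToken!=0 && nTerm>=0 && nToken>=0 );
  if( nTerm<nToken ) return 0;          /* term is shorter than the prefix */
  return memcmp(pToken, pTerm, (size_t)nToken)==0;
}
```

With the token "lucy", this accepts "lucya" and "lucyabc" and rejects "lux" and "lulu", which is the result the query is expected to give.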

Thanks for looking into and reporting this problem. Are you able to post the source for the program you used to execute the test described below?

Dan.


Here is how I generate it:

1) Create an fts5 table and insert some records like "lucya", "lucyb".

2) Prepare some base records: a) lusheng; b) lulu; c) lunix; d) luma; e) pengyu. a, b, c and d 
share a common prefix (lu); e is an unrelated case.

3) Before inserting the records from 2) into the fts table, append some random 
letters to each one so that every record is different.

For example: "lulu", "luluabc", "luluefg", also "lunix", "lunixabc", etc.

4) Insert the records in a for-loop, and on each iteration run the query 'lucy*' and

check the result for mismatches. The correct results should be 
"lucya", "lucyb", not "luluabc"...
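
Steps 2) to 4) could be sketched roughly as follows. This is not the original test program. It covers only the record generation and the mismatch check, and it leaves out the SQLite calls that create the table, insert the rows and run the query. The names azStem, make_record and count_mismatches are made up for illustration, and the check assumes each returned row holds a single token.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Base records from step 2. Illustrative only. */
static const char *azStem[] = {"lusheng", "lulu", "lunix", "luma", "pengyu"};
#define N_STEM ((int)(sizeof(azStem)/sizeof(azStem[0])))

/* Step 3: write stem number (i % N_STEM) followed by nSuffix random
** lowercase letters into zOut. Returns the record length, or -1 if
** the nOut-byte buffer is too small. */
static int make_record(int i, int nSuffix, char *zOut, int nOut){
  const char *zStem;
  int nStem, j;
  assert( i>=0 && nSuffix>=0 && zOut!=0 );
  zStem = azStem[i % N_STEM];
  nStem = (int)strlen(zStem);
  if( nStem+nSuffix+1>nOut ) return -1;
  memcpy(zOut, zStem, (size_t)nStem);
  for(j=0; j<nSuffix; j++){
    zOut[nStem+j] = (char)('a' + rand()%26);
  }
  zOut[nStem+nSuffix] = '\0';
  return nStem+nSuffix;
}

/* Step 4: count the rows in a query result that do not start with
** zPrefix. A non-zero count means the prefix query mis-matched. */
static int count_mismatches(const char **azRow, int nRow, const char *zPrefix){
  size_t nPrefix = strlen(zPrefix);
  int i, n = 0;
  for(i=0; i<nRow; i++){
    if( strncmp(azRow[i], zPrefix, nPrefix)!=0 ) n++;
  }
  return n;
}
```

In a full test, each generated record would be inserted into the fts table, and the rows returned for 'lucy*' would be passed to count_mismatches with the prefix "lucy".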


When the mismatch happens, I tried to analyze the prefix search mechanism and found 
two points which I think are problems:

1) fts5LeafSeek: when the search fails and execution goes to search_failed, 
the two if conditions in search_failed are usually not satisfied. I 
think the function should return at that point,

but it does not, and the search_success logic then executes.

2) fts5SetupPrefixIter: when gathering results, the logic that sets the flag bNewTerm 
has a gap, which leaves bNewTerm=false

even though the record is not one we want.


These two logic problems lead to the mismatched results. I tried removing the bNewTerm 
check and making it compare on every loop iteration,

and then the mismatched results disappeared.

// relevant code

Change this:

    if (bNewTerm) {
      if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
    }

to:

    if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
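
The effect of the two variants can be illustrated with a toy model. This is not FTS5 code: ToyEntry and toy_collect are hypothetical names, and the assumption that the iterator can report bNewTerm=0 on an entry whose term has actually changed is exactly the claim under discussion, not something the sketch establishes.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for one step of a term iterator. NOT an FTS5 structure. */
typedef struct ToyEntry {
  const char *zTerm;   /* term the iterator currently points at */
  int bNewTerm;        /* flag the iterator reports for this step */
} ToyEntry;

/* Return how many entries are accepted before the loop breaks.
** bAlwaysCheck==0: run the prefix test only when bNewTerm is set
**                  (the original form of the condition).
** bAlwaysCheck!=0: run the prefix test on every step
**                  (the proposed form). */
static int toy_collect(const ToyEntry *a, int n,
                       const char *zToken, int bAlwaysCheck){
  int nToken, i;
  assert( a!=0 && zToken!=0 );
  nToken = (int)strlen(zToken);
  for(i=0; i<n; i++){
    int nTerm = (int)strlen(a[i].zTerm);
    if( bAlwaysCheck || a[i].bNewTerm ){
      if( nTerm<nToken || memcmp(zToken, a[i].zTerm, (size_t)nToken) ) break;
    }
  }
  return i;
}
```

If the entries are "lucya", "lucyb", "luluabc" and the last one arrives with bNewTerm=0, the original form accepts all three, while the unconditional form stops after two. The unconditional form costs one extra comparison per entry.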


Please help recheck the FTS5 prefix search logic. Thank you very much.

Yours,

xiaojin zheng


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

