Hi,
I downloaded the amalgamation sources in order to create a build of SQLite
with FTS3 enabled. The problem for me is that the default "simple" tokenizer
is not behaving quite how I want. Specifically, I'd prefer that it didn't
treat punctuation as a delimiter, and split purely on whitespace.
In the simpleCreate() function there's some code that initializes an array
recording which characters are delimiters:
    for(i=1; i<0x80; i++){
      t->delim[i] = !isalnum(i);
    }
I thought that if I made a simple edit to use the isspace() function then
I'd achieve what I was after, i.e.,
    for(i=1; i<0x80; i++){
      t->delim[i] = isspace(i);
    }
However, when I build this version, create my FTS virtual tables, and then
query them, I get zero results. When I revert to !isalnum I get results,
but I'm seeing words split where I don't want them to be.
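One thing that may be worth checking (a guess, not something confirmed against the FTS3 source): the C standard only promises that isspace() returns a nonzero value for a match, not 1, whereas !isalnum() always yields exactly 0 or 1. If delim is an array of char, a large nonzero return value could be truncated to 0 when stored, leaving no delimiters at all. The sketch below normalizes the result before storing it. The delim_table struct and the helper names are hypothetical stand-ins for illustration, not the actual FTS3 definitions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the tokenizer's delimiter table. The real
** FTS3 struct is ASSUMED here to store its flags in a char array. */
typedef struct {
  char delim[128];
} delim_table;

/* Mark only ASCII whitespace as delimiters. isspace() is only specified
** to return "nonzero" for a match, so the result is normalized to 0 or 1
** before it is stored in a char. */
static void init_whitespace_delims(delim_table *t){
  int i;
  assert( t!=NULL );
  t->delim[0] = 0;
  for(i=1; i<0x80; i++){
    t->delim[i] = isspace(i) ? 1 : 0;
  }
}

/* Return nonzero if c is an ASCII character flagged as a delimiter. */
static int is_delim(const delim_table *t, unsigned char c){
  return c<0x80 && t->delim[c];
}

/* Count the tokens in a NUL-terminated string, splitting on delimiters. */
static int count_tokens(const delim_table *t, const char *z){
  int n = 0;
  int in_token = 0;
  for(; *z; z++){
    if( is_delim(t, (unsigned char)*z) ){
      in_token = 0;
    }else if( !in_token ){
      in_token = 1;
      n++;
    }
  }
  return n;
}
```

With a table built this way, punctuation stays inside the token, so a string such as "hello, world. foo-bar" splits into three tokens rather than four. Whether this is what actually causes the empty results depends on the platform's isspace() and on how delim is really declared.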
I must admit my C experience isn't great, but I've been trying for far too
many hours now with little gain. I'd really appreciate some pointers!
Thanks in advance,
Andy