Hello! FTS3 works fine but is really unfriendly to developers. For example, it is easy to write Tcl interface code for the Snowball stemmer utility "stemwords" and for a stopword dictionary, but there is no way to use them in FTS3. User functions can easily be written in C or any other language, but FTS3 does not work with them.

1. There are no interfaces for a stemmer, a stopword dictionary, etc. in the FTS3 extension. It is very difficult to understand the code of the FTS3 extension and to change it. Is it possible to add calls to user-defined functions for these tasks? The command that creates the virtual table could be extended as follows:

    sql-command ::= CREATE VIRTUAL TABLE [ database-name .] table-name USING fts3 [( [ argument [, argument, [argument, ...] ]*] )]
    argument ::= name | TOKENIZE tokenizer | FUNCTION user_function
    tokenizer ::= SIMPLE | PORTER | user-defined

When a FUNCTION returns NULL, the word must be ignored; otherwise the tokenized word is replaced by the value the function returns. For example, an application could bind these functions like this:

    #!/usr/bin/tclsh8.5
    package require sqlite3
    sqlite3 db :memory:
    proc stopword {word} { ... }
    proc stemmer {word} { ... }
    db function stopword stopword
    db function stemmer stemmer
    db eval {CREATE VIRTUAL TABLE t USING fts3(content, TOKENIZE icu ru_RU, FUNCTION stopword, FUNCTION stemmer);}

Of course, we can extend the example above with a synonym dictionary function, the internal soundex() function, or others. I think this feature is a "must have".

2. The snippet function has no way to change the snippet text size and returns a very small text fragment. For example, the standalone Unix diff -u command returns 3 lines of context before and after, and this can easily be changed with command-line arguments. Yes, an application can use its own snippet implementation based on offsets() information, but that creates additional difficulties.

3. A user-defined tokenizer function would be very helpful. The tokenizer is a stream interface and must track the stream position, so a user-defined tokenizer could have an interface like

    tokenizer(document_text, document_position)

This function could be called from the xNext() interface function. I am not sure about the implementation, and the interface may turn out to be different.

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
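
P.S. To make the semantics proposed in points 1 and 3 more concrete, here is a minimal sketch in plain Python. It is not FTS3 code and does not touch SQLite at all; it only models the idea. The stopword list, the toy stemmer, the return convention of tokenizer() and the helper names apply_functions() and index_terms() are all assumptions made for illustration.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "is"}

def stopword(word):
    """Return None for a stopword (meaning: ignore it), else the word unchanged."""
    return None if word in STOPWORDS else word

def stemmer(word):
    """Toy stand-in for a real Snowball stemmer: strips one trailing 's'."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word

_WORD = re.compile(r"\w+")

def tokenizer(document_text, document_position):
    """Assumed interface from point 3: scan from the given position and
    return (token, next_position), or None when the text is exhausted."""
    match = _WORD.search(document_text, document_position)
    if match is None:
        return None
    return match.group().lower(), match.end()

def apply_functions(word, functions):
    """Proposed FUNCTION semantics from point 1: each function replaces the
    word with its return value; a None (NULL) result drops the word."""
    for function in functions:
        word = function(word)
        if word is None:
            return None
    return word

def index_terms(document_text, functions):
    """Drive the tokenizer as a stream and collect the terms that survive."""
    terms = []
    position = 0
    while True:
        result = tokenizer(document_text, position)
        if result is None:
            break
        word, position = result
        word = apply_functions(word, functions)
        if word is not None:
            terms.append(word)
    return terms

print(index_terms("The cats chase a mouse", [stopword, stemmer]))
# prints ['cat', 'chase', 'mouse']
```

The functions are applied in the order they are listed, which is why the stopword check sees the original word and the stemmer only sees words that were kept. Whether FTS3 could call SQL user functions in this way from inside xNext() is exactly the open question of this message.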