Hi, I know I'm a newcomer to the SQLite project, but I'm excited about what FTS5 has to offer. It seems simple and powerful to me, and it has some really nice ideas.
Is it possible for me to contribute to the module, or is it too late for that? I would like to propose two new ideas.

First, a customizable list of stopwords: https://en.wikipedia.org/wiki/Stop_words (I didn't find anything like that in the documentation; am I missing something?) I know I can add it via a custom tokenizer, but wouldn't it be useful to have it straight out of the box?

Second, I would like to point out how useful some extra statistics would be for building more advanced ranking formulas. Things like the Longest Common Subsequence between query and document, the number of unique matched keywords, etc. These and other values are really useful in applications where bm25 is not suitable or not enough.

I come from using an engine called Sphinx Search (used by very large sites such as Craigslist), which offers such factors. Using them, they have defined rankers that mix bm25 with proximity, and another one they call SPH_RANK_SPH04, which adds a weighting boost when the match appears at the beginning of the text field, and a bigger boost if it's an exact match: http://sphinxsearch.com/docs/latest/builtin-rankers.html

The formulas for them (in Sphinx, higher is better) are here: http://sphinxsearch.com/docs/latest/formulas-for-builtin-rankers.html

And the list of supported factors is here: http://sphinxsearch.com/docs/latest/ranking-factors.html

Of course having all of them would be overkill, but if you find them interesting, we could pick the most useful ones and let people build rankers to suit their own needs.

Once again, you people are the experts and know whether such ideas are feasible and where the right place to include them would be, so please tell me your opinions.
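
P.S. To make the two suggestions a bit more concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under my own assumptions: the function names (tokenize, lcs_length, unique_matched) and the tiny stopword list are made up for this example, nothing here is FTS5 or Sphinx API, and the LCS is the textbook longest common subsequence over tokens, which may not match exactly how Sphinx defines its own lcs factor.

```python
# Hypothetical helpers for illustration only; not part of FTS5 or Sphinx.

def tokenize(text, stopwords=frozenset()):
    """Lowercase, split on whitespace, strip basic punctuation, drop stopwords."""
    tokens = (t.strip(".,;:!?\"'()") for t in text.lower().split())
    return [t for t in tokens if t and t not in stopwords]


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists.

    Classic dynamic programming, keeping only the previous row.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def unique_matched(query_tokens, doc_tokens):
    """Number of distinct query keywords that occur in the document."""
    return len(set(query_tokens) & set(doc_tokens))


# A made-up, user-supplied stopword list.
STOPWORDS = frozenset({"the", "a", "of", "in"})

query = tokenize("the quick fox", STOPWORDS)
doc = tokenize("A quick brown fox jumps in the garden", STOPWORDS)

factors = {
    "lcs": lcs_length(query, doc),
    "unique_matched": unique_matched(query, doc),
}
```

A custom ranker could then combine values like these with bm25 in whatever formula fits the application.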