I accidentally sent this message to the sqlite-dev mailing list and finally
found this thread to post it in the appropriate location:


So I was having a peruse of SQLite documentation and found this FTS5 branch
in the timeline.
http://www.sqlite.org/src/timeline?n=100&r=fts5

From what I gather, some of the main improvements include Okapi BM25 ranking
as a default ranking option and hooks for plugging in custom ranking functions.

https://github.com/neozenith/sqlite-okapi-bm25

I have spent some time forking this implementation from 'rads' to allow for
weighted fields (BM25F), and it is great to see this implemented in this
changeset:
http://www.sqlite.org/src/info/1cd15a1759004d5d

However, I see no provision for the BM25+ lower-bounding fix. Without it, a
document with 1 of 10 tokens matching ranks the same as a document with
10 of 100 tokens matching.

Although both have 10% of their tokens matching, a document with 10 matching
terms should rank higher than a document with only 1 matching term.
http://en.wikipedia.org/wiki/Okapi_BM25#Modifications
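
To make the comparison concrete, here is a rough Python sketch of the per-term
BM25 score next to the BM25+ variant, following the formulas on the Wikipedia
page above. The function names, the defaults (k1=1.2, b=0.75, delta=1.0) and
the choice to add delta only for terms that actually occur in the document are
my assumptions for illustration; none of this is taken from the FTS5 source.

```python
def bm25_term(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75):
    """Plain BM25 contribution of a single query term (hypothetical helper)."""
    if tf == 0:
        return 0.0
    # Length normalisation: b=0 ignores document length, b=1 applies it fully.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1.0)) / (tf + norm)


def bm25_plus_term(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75, delta=1.0):
    """BM25+ contribution: the BM25 term score plus a lower bound of idf * delta.

    Assumption: delta is only added when the term occurs in the document.
    """
    if tf == 0:
        return 0.0
    return bm25_term(tf, doc_len, avg_doc_len, idf, k1, b) + idf * delta


if __name__ == "__main__":
    avg_len = 50.0  # assumed average document length for the example
    for tf, doc_len in [(1, 10), (10, 100)]:
        plain = bm25_term(tf, doc_len, avg_len, idf=1.0, b=1.0)
        plus = bm25_plus_term(tf, doc_len, avg_len, idf=1.0, b=1.0)
        print(tf, doc_len, plain, plus)
```

With full length normalisation (b=1) the plain term score depends only on the
ratio tf/doc_len, which is where the 1-of-10 versus 10-of-100 tie comes from.
The delta term adds a fixed floor for every query term that matches, so a
document matching more distinct query terms collects more of it.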



Also since there is work already being done on the FTS engine I would like
to call attention to this thread I started some time ago regarding exposing
token position (as opposed to byte position) to allow for proximity ranking
algorithms.
http://sqlite.1065341.n5.nabble.com/Proximity-ranking-with-FTS-td76149.html
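
As an illustration of what token positions would enable, here is a small
Python sketch that finds the shortest window of tokens containing every query
term at least once. A ranking function could then favour documents with a
smaller window. The function name and the input shape (a dict from term to a
list of token positions) are hypothetical; this is not an existing FTS
interface.

```python
def min_cover_span(positions_by_term):
    """Return the length (in tokens) of the shortest window that contains
    at least one occurrence of every term, or None if a term is missing.

    positions_by_term: dict mapping each query term to a list of
    zero-based token positions within one document (assumed input shape).
    """
    if not positions_by_term or any(not p for p in positions_by_term.values()):
        return None

    # Merge all occurrences into one list ordered by token position.
    events = sorted(
        (pos, term)
        for term, plist in positions_by_term.items()
        for pos in plist
    )

    need = len(positions_by_term)
    counts = {}   # term -> occurrences inside the current window
    best = None
    left = 0
    for pos, term in events:
        counts[term] = counts.get(term, 0) + 1
        # Shrink from the left while the window still covers every term.
        while len(counts) == need:
            span = pos - events[left][0] + 1
            if best is None or span < best:
                best = span
            left_term = events[left][1]
            counts[left_term] -= 1
            if counts[left_term] == 0:
                del counts[left_term]
            left += 1
    return best


if __name__ == "__main__":
    print(min_cover_span({"sqlite": [0, 20], "fts": [5, 21]}))
```

Byte offsets cannot feed this directly, because the distance in bytes between
two matches depends on the length of the words in between rather than on how
many tokens separate them.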

Otherwise, I am super excited about this branch's development.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/fts5-tp77822p80578.html
Sent from the SQLite mailing list archive at Nabble.com.
