Package: wnpp
Severity: wishlist
Owner: Edward Betts <edw...@4angle.com>
X-Debbugs-Cc: debian-de...@lists.debian.org, debian-python@lists.debian.org

* Package name    : sqlite-fts4
  Version         : 1.0.1
  Upstream Author : Simon Willison
* URL             : https://github.com/simonw/sqlite-fts4
* License         : Apache 2.0
  Programming Lang: Python
  Description     : Document scoring Python library for SQLite FTS4

Custom SQLite functions written in Python for ranking documents indexed
using the FTS4 extension.

## rank_score()

This is an extremely simple ranking function, based on an example in the
SQLite documentation. It scores each document by summing a score for
each column. The score for a column is the number of search matches in
that column of the document divided by the number of search matches in
that column across every document in the index, a simple TF-IDF-style
calculation.
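
The per-column ratio described above can be sketched in a few lines of
Python. This is only an illustration, not the library's code: the
function name simple_rank_score is hypothetical, and it assumes the
input is an already-decoded matchinfo list in SQLite's "pcx" layout
(phrase count, column count, then three integers per phrase/column
pair).

```python
def simple_rank_score(matchinfo_ints):
    """Sum hits-in-this-row / hits-in-all-rows over every phrase/column pair.

    Assumes `matchinfo_ints` follows the "pcx" layout: [p, c, x...], where x
    holds (hits_this_row, hits_all_rows, docs_with_hits) for each pair.
    """
    num_phrases, num_columns = matchinfo_ints[0], matchinfo_ints[1]
    x_values = matchinfo_ints[2:]
    score = 0.0
    for phrase in range(num_phrases):
        for column in range(num_columns):
            offset = 3 * (phrase * num_columns + column)
            hits_this_row = x_values[offset]
            hits_all_rows = x_values[offset + 1]
            # Skip pairs with no hit in this row; this also avoids 0 / 0.
            if hits_this_row > 0:
                score += hits_this_row / hits_all_rows
    return score
```

For one phrase and two columns, the list [1, 2, 1, 4, 3, 0, 2, 2]
contributes 1/4 from the first column and nothing from the second.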

## rank_bm25()

An implementation of the Okapi BM25 scoring algorithm.
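
For orientation, the textbook Okapi BM25 contribution of a single query
term can be sketched as below. This is not taken from the library: the
function name, parameter names, and the defaults k1=1.2 and b=0.75 are
assumptions chosen because they are common in the literature, and the
IDF term is the classic form without smoothing.

```python
import math


def bm25_term_score(term_freq, doc_len, avg_doc_len, num_docs,
                    docs_with_term, k1=1.2, b=0.75):
    """Textbook BM25 score for one query term in one document."""
    # Classic inverse document frequency: rarer terms weigh more.
    idf = math.log((num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Term-frequency saturation with document length normalisation.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * term_freq * (k1 + 1) / (term_freq + k1 * length_norm)
```

A document's score for a query is the sum of this value over the query
terms. Note that the classic IDF goes negative for terms that appear in
more than half of the documents.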

## decode_matchinfo()

SQLite's built-in matchinfo() function returns its results as a binary
string that represents a list of 32-bit unsigned integers, which is not
particularly human-friendly to read. decode_matchinfo() decodes that
string into a JSON list of integers.
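
Unpacking such a binary string is straightforward with Python's struct
module. The sketch below is a minimal illustration rather than the
library's implementation; the name decode_uint32_list is hypothetical,
and it assumes the integers are stored in the machine's native byte
order.

```python
import struct


def decode_uint32_list(blob):
    """Unpack a bytes object into a list of 32-bit unsigned integers."""
    if len(blob) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    count = len(blob) // 4
    # "=" selects native byte order with standard sizes, so "I" is 4 bytes.
    return list(struct.unpack(f"={count}I", blob))
```

For example, decode_uint32_list(struct.pack("=3I", 1, 2, 3)) gives back
the list [1, 2, 3].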

## annotate_matchinfo()

This function decodes the matchinfo binary string into a verbose JSON
structure that describes what each of the returned integers means.
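
To give a feel for what such an annotation involves, the sketch below
labels a decoded list in SQLite's "pcx" layout. The function name
describe_pcx and the shape of the returned dictionary are hypothetical
and do not reproduce the library's actual JSON output.

```python
def describe_pcx(values):
    """Attach human-readable labels to a decoded "pcx" matchinfo list."""
    num_phrases, num_columns = values[0], values[1]
    described = {
        "p": {"value": num_phrases,
              "meaning": "number of matchable phrases in the query"},
        "c": {"value": num_columns,
              "meaning": "number of user-defined columns in the FTS table"},
        "x": [],
    }
    triples = values[2:]
    for phrase in range(num_phrases):
        for column in range(num_columns):
            offset = 3 * (phrase * num_columns + column)
            hits_this_row, hits_all_rows, docs_with_hits = triples[offset:offset + 3]
            described["x"].append({
                "phrase_index": phrase,
                "column_index": column,
                "hits_this_row": hits_this_row,
                "hits_all_rows": hits_all_rows,
                "docs_with_hits": docs_with_hits,
            })
    return described
```

The result can be passed to json.dumps() to obtain a JSON document.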

Blog post about the creation of this library:
https://simonwillison.net/2019/Jan/7/exploring-search-relevance-algorithms-sqlite/

This is a dependency of the sqlite-utils tool by the same author.

I plan to maintain this package as part of the Python modules team.
