On Fri, 04 Dec 2009 06:52:38 -0500, Aaron Ecay <aaronecay at gmail.com> wrote:
> The same algorithm is implemented in C here:
> http://www.mnogosearch.org/guesser/
>
> Licensed under the GPL and includes presets for ~50 languages.

That does indeed look very interesting (at least from what I can get from Google's cache of the website, as the server seems to be down just now). Oh, but I can just "apt-get source mnogosearch" and find src/mguesser.c and src/guesser.c at least.

> A potential drawback is that it doesn't handle raw HTML very well,
> according to the documentation.

Shouldn't really be an issue. Notmuch will already want to de-tagify HTML before indexing anyway.

-Carl
