Hi,

On Wed, 15 Jan 2020 at 06:44, Arun Isaac <arunis...@systemreboot.net> wrote:

> > Well, the issue is 'scoring' a query.
>
> I think the issue of whether to use an inverted index is orthogonal to
> the quest to improve the relevance of the search results. Implementing
> tf-idf like you suggested could greatly benefit from having fast
> searches.

I think it is not so orthogonal. In general, a fast and good system
combines relevant scoring with a data structure suited to it, IMHO.

However, I agree that adding an inverted index will improve the current
situation of "guix search" -- keeping the current scoring function --
and ease the end-user experience.

> Pierre Neidhardt <m...@ambrevar.xyz> writes:
>
> > By the way, what about using Xapian in Guix?
> >
> > https://en.wikipedia.org/wiki/Xapian
> >
> > If it's relevant, maybe we can follow up with a discussion in a new
> > thread.
>
> I feel xapian is too much work (considering that we don't yet have guile
> bindings) compared to our own simple implementation of an inverted
> index. But, of course, I am biased since I wrote the inverted index
> code! :-)

It depends on how long a run we are talking about. :-)
Xapian avoids reinventing the wheel. ;-)

> But, on a more serious note, if we move to xapian, we will not be able
> to support regular expression based search queries that we support
> today.

I am not convinced...

> On the question of whether xapian is too heavy, I think we should make
> it an optional dependency of Guix so that it remains possible to build
> and use a more minimalistic Guix for those interested in such things.

Guix comes with SQLite and it is ok. The question is how minimalist
Xapian is. :-) (This needs some investigation.)

All the best,
simon