Hi,

I am aware that the corpus is a bit small (it was meant to be presentable,
not usable), but it surprised me that I found no way, even when playing
with the parameters, to get at least one common word from the set.

I will play around a bit more and add some documents.

Thanks for the hints.

Greetings
Florian Gilcher

Jens Kraemer wrote:
> Hi,
>
> first of all, 6 documents is not really a corpus to judge the usability
> of more_like_this - by default it will only consider terms occurring in
> at least 5 documents to be of any relevance (:min_doc_freq option). So
> if you have very different documents where the only common words are
> filtered out as noise words, you'll end up without any terms to use
> for finding similar documents, which would lead to the query you
> mentioned.
>
> However more_like_this should indeed return an empty result set in this
> case ;-)
>
> Besides that, you should store term vectors (give :term_vector => :yes
> for the fields you want to use more_like_this on in your call to
> acts_as_ferret), this will speed up the search for relevant terms.
>
>
> Jens
>
>
> On Tue, Jul 17, 2007 at 12:11:55PM +0200, Florian Gilcher wrote:
> Hi,
>
> I have the following problem:
>
> I created a fairly simple sample project to try out acts_as_ferret and
> present the results.
>
> The test set is relatively easy: I have extracts from 6
> Wikipedia articles about several topics, which are copied into a model
> that has two fields: title and text. This works quite well, until I try
> to use #more_like_this, which returns all of the other articles, even if
> they have nothing to do with the active article. I debugged a bit and
> found out that the query built by #more_like_this is nothing more than
> "-id:<id of the active record>".
> (so the _result_ is correct)
>
> To try that out on the console, I used:
>
>   entry = Entry.find(1)
>   entry.more_like_this(:field_names => ['text'])
>
> Either I'm doing something entirely wrong or there is a bug. ;) Before
> filing a ticket, I want to rule out the first case.
>
> Ferret version is 0.11.4, aaf version is the current stable version
> (although trunk didn't work either).
>
> I uploaded the demo project together with a dump of the database to:
>
> Project: http://putstuff.putfile.com/95477/8752808
> Dump: http://putstuff.putfile.com/95479/6169502
>
> Thanks in advance.
> Florian Gilcher
>
> P.S.: There is another minor bug. Although #more_like_this does set a
> default option for :field_names (line #35), this option leads to a crash
> in #retrieve_terms. The default option is nil and #retrieve_terms thus
> tries to call #each on nil. (line #113)

_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
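
The document-frequency cutoff Jens describes can be illustrated with a small plain-Ruby sketch. This is a toy model, not acts_as_ferret's actual code: the helper surviving_terms, the NOISE_WORDS list and the six sample sentences are all made up for illustration. It only shows why a cutoff of 5 can leave no candidate terms at all in a six-document set.

```ruby
# Toy model of a document-frequency cutoff. This is NOT how acts_as_ferret
# is implemented; surviving_terms and NOISE_WORDS are hypothetical names.
NOISE_WORDS = %w[the a an and of is in to].freeze

# Returns the terms of +doc+ that are not noise words and that occur in at
# least +min_doc_freq+ documents of +corpus+ (the document itself included).
def surviving_terms(doc, corpus, min_doc_freq)
  # Count, for every term, the number of documents it appears in.
  doc_freq = Hash.new(0)
  corpus.each do |text|
    text.downcase.scan(/[a-z]+/).uniq.each { |term| doc_freq[term] += 1 }
  end

  doc.downcase.scan(/[a-z]+/).uniq
     .reject { |term| NOISE_WORDS.include?(term) }
     .select { |term| doc_freq[term] >= min_doc_freq }
end

# Six short, mostly unrelated "articles" (invented sample data).
corpus = [
  "The river flows to the sea",
  "A violin is a string instrument",
  "The volcano erupted in Iceland",
  "Chess is a game of strategy",
  "The river delta is fertile",
  "An opera is sung in a theatre"
]

strict  = surviving_terms(corpus[0], corpus, 5) # cutoff of 5, as in the default Jens mentions
relaxed = surviving_terms(corpus[0], corpus, 2) # a lower, hypothetical cutoff

puts "min_doc_freq 5: #{strict.inspect}"
puts "min_doc_freq 2: #{relaxed.inspect}"
```

With the cutoff at 5 nothing survives, because the only words shared by five or more documents are noise words. With the cutoff at 2 the term "river" remains. Whether lowering :min_doc_freq alone is enough in the real library depends on the other filters more_like_this applies, which this sketch does not model.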

