Jeff Cc wrote:
>> Do you mean the "*" feature (prefix* and *infix*)? Where
>> the search term "program*" matches the database text
>> "program", "programmer", "programs" ...
>>
>> Those work for me in version sphinx-0.9.8-svn-r1065 and
>> sphinx-0.9.8-svn-r1112 ... I have done quite some testing
>> on r1065 (still testing the newest r1112) and that seems
>> to work OK for me. Set "enable_star" to 1 and set a
>> min_prefix_len or a min_infix_len.
>
> No that's the stemming feature I believe.. it just changes prefixes
> and suffixes on words and is language dependent. Awesome feature
> however. Not sure how (or if) Ferret implements it.
>
> What I meant was just straight wildcards as in a MySQL LIKE clause,
> example: "[EMAIL PROTECTED]" to find all emails @gmail.com

I have the impression that enable_star really _is_ the feature that
allows a search for "[EMAIL PROTECTED]" to find all emails @gmail.com,
provided you add the '@' sign to the charset_table (which is another
problem, since '@' also has a special meaning as a field indicator for
field-specific search). With enable_star, the user must explicitly type
a '*'. Without a '*', only an exact match is returned.

I give an example at the end of my blog post
(http://www.vandenabeele.com/Ultrasphinx-performance), where I tested
with and without the enable_star feature, and always without stemming
(since I had no stemmer for the Dutch language):

  0.001 sec [ext/0/rel 1409 (0,20)] [complete] c
  0.001 sec [ext/0/rel 1409 (0,20)] [complete] c*
  0.000 sec [ext/0/rel 35 (0,20)] [complete] co
  0.000 sec [ext/0/rel 35 (0,20)] [complete] co*
  0.000 sec [ext/0/rel 5 (0,20)] [complete] com
  0.000 sec [ext/0/rel 5 (0,20)] [complete] com*
  0.000 sec [ext/0/rel 10 (0,20)] [complete] comp
  0.003 sec [ext/0/rel 5343 (0,20)] [complete] comp*
  0.000 sec [ext/0/rel 0 (0,20)] [complete] compl
  0.000 sec [ext/0/rel 1473 (0,20)] [complete] compl*
  0.000 sec [ext/0/rel 0 (0,20)] [complete] comple
  0.000 sec [ext/0/rel 1214 (0,20)] [complete] comple*
  0.000 sec [ext/0/rel 0 (0,20)] [complete] complet
  0.000 sec [ext/0/rel 793 (0,20)] [complete] complet*
  0.000 sec [ext/0/rel 458 (0,20)] [complete] complete
  0.000 sec [ext/0/rel 642 (0,20)] [complete] complete*
  0.000 sec [ext/0/rel 30 (0,20)] [complete] completed
  0.000 sec [ext/0/rel 30 (0,20)] [complete] completed*
  0.000 sec [ext/0/rel 0 (0,20)] [complete] completel
  0.000 sec [ext/0/rel 130 (0,20)] [complete] completel*
  0.000 sec [ext/0/rel 10 (0,20)] [complete] completely.

What happens is that with fewer than 4 characters the * has no effect,
but from 4 characters on, the * expands to all words that start with
the typed prefix. And that is an interesting feature the major public
search engines do not offer.
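
To make the behaviour in the log above concrete, here is a toy model in
Python. It is not Sphinx code, just a sketch of what I observe: the word
list and the names matches() and search() are made up for illustration,
and the minimum prefix length of 4 is an assumption taken from my test
results.

```python
# Toy model of the observed star behaviour (NOT how Sphinx is implemented).
# MIN_PREFIX_LEN = 4 is an assumption based on the query log above.
MIN_PREFIX_LEN = 4

# Hypothetical indexed words, for illustration only.
WORDS = ["com", "comp", "company", "complete", "completed",
         "completely", "computer"]


def matches(term, word, min_prefix_len=MIN_PREFIX_LEN):
    """Return True if an indexed word matches the search term."""
    if term.endswith("*"):
        prefix = term[:-1]
        if len(prefix) >= min_prefix_len:
            # Long enough: the star expands to every word with this prefix.
            return word.startswith(prefix)
        # Too short: the star has no effect, so fall back to an exact match.
        return word == prefix
    # No star at all: exact match only.
    return word == term


def search(term, words=WORDS):
    """Return all indexed words matching the term, in index order."""
    return [w for w in words if matches(term, w)]


if __name__ == "__main__":
    for term in ["com", "com*", "comp", "comp*", "complete", "complete*"]:
        print(term, "->", search(term))
```

With this word list, "com" and "com*" return the same single word, while
"comp*" returns every word except "com", which mirrors the jump in hit
counts between "comp" and "comp*" in my log.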

At this time, with the relatively small database I initially expect for
our project (< 10 MByte or so), it should not be a problem to keep
indices with star expansion after 4 letters in memory.

An issue that I still have is that the final '.' of a sentence is
attached to the word in the index, so the word is not found without
appending a '.' or '*' to the search term.

++++ In the meantime I solved the '.' issue with the crude solution of
removing the '.' character from the charset_table list (which causes
other problems ...).

Stemming will reduce e.g. 'companies' and 'company' to a stem of
'compani' (both in the search term and in the database index), without
the user needing to add a special * to the search, so any combination
of 'company' and 'companies' will match.

HTH,

Peter
-- 
Posted via http://www.ruby-forum.com/.