Hi

> On Jul 24, 1:59 am, Pat Allan <[email protected]> wrote:
>> Hi Robb
>>
>> Current versions of TS don't use stemming by default, but the english
>> stemmer is something bundled with Sphinx. If it stems international to
>> intern, then I guess it's not perfect :)
>> ..
>>
>>> Is this English morphology stemming going on? Surprises me if so.
>>> Because here in English, this is obviously an incorrect result.

http://www.google.com/search?client=safari&rls=en-us&q=%22intern+relations%22&ie=UTF-8&oe=UTF-8

Some people abbreviate "international relations" to "intern. relations". I guess the stemmer is picking that up, unfortunately.

It'd be nice if one could select that sort of thing as a parameter (i.e. perform stemming, but ignore abbreviations). Anyone know if that's feasible?

This is the sort of thing that makes me avoid the stemmer setting... I'd hate to confuse users.

As an aside: There's a mobile service provider in South Africa called MTN. Whenever I searched for them on Google, it expanded "mtn" to "mountain", thinking I was abbreviating. Very irritating... Seems they have fixed it now, though.

Oskar
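
For what it's worth, here is a rough Python sketch of the "stem, but leave certain words alone" idea above. It is not Sphinx's stemmer and not how TS is configured: `toy_stem`, `TOY_SUFFIXES`, `PROTECTED` and `index_term` are all hypothetical names made up for illustration, and the suffix rules are deliberately crude. It only shows the general shape of an exceptions list sitting in front of a stemmer.

```python
# Toy sketch only: NOT Sphinx's stemmer, just a crude suffix stripper.
TOY_SUFFIXES = ("ational", "ing", "s")  # hypothetical rules, tried in order

# Hypothetical list of words that should be indexed as-is, bypassing the stemmer.
PROTECTED = {"international", "mtn"}


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in TOY_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_term(word, protected=PROTECTED):
    """Return the term that would be indexed for a word."""
    word = word.lower()
    if word in protected:
        return word  # exception list wins; no stemming applied
    return toy_stem(word)
```

With this, `toy_stem("international")` collapses to `"intern"` (the collision discussed above), while `index_term("International")` stays `"international"` because the word is on the protected list. Whether Sphinx exposes something equivalent is a separate question; this just illustrates the behaviour being asked for.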
