Hi! I'd like to respond to this point:
> 5. Can someone imagine situation when more than one Analyzers are used in an application?

Not only can I imagine such a situation, I'd strongly recommend it for any high-quality application! If you are only targeting speed and light CPU usage, sure, a single analyzer is enough. But then your application will get the precision/recall it deserves.

A good search engine should be flexible enough to use several analyzers and combine their results to get the best possible precision and recall.

For example, say you are looking for something related to "selling toothbrushes". The application should retrieve all the occurrences matching exactly "selling toothbrushes" (using a strict analyzer), but it may also retrieve "sell toothbrush" (using a stemming normalizer). Why not retrieve "buy toothbrush" or "sell dental tools" as well (a kind of semantic normalizer/analyzer)? One could also imagine retrieving "Selin Toothbrushies" (phonetic normalizer).

OK, so this increases recall, but unfortunately it drastically reduces precision, right? Wrong: all these analyzers should be ordered, and the final result should be computed from the results of all those indexes. For instance, results from the strict-analyzer index should weigh more than those from the stemming index, which should weigh more than those from the phonetic index, etc. The reason is simple: the more aggressive the normalization, the less likely a match is to be exactly what the user is looking for.

Sure, it's CPU intensive, but that is the dilemma of search engines: be fast or be smart. My belief is that Lucene, as a search engine, should allow both kinds of application (and I personally prefer smart search engines to fast ones).

Rodrigo
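P.S. To make the weighting idea concrete, here is a minimal sketch in plain Python. It does not use the Lucene API. The three normalizers are toy stand-ins I made up for illustration (a real system would use a proper stemmer and a proper phonetic algorithm), and the weights 4/2/1 are arbitrary assumptions. The semantic analyzer is left out. A document's score is simply the sum of the weights of the analyzers under which it matches the whole query.

```python
import re

def strict(text):
    # Lowercase and keep alphabetic tokens; no further normalization.
    return re.findall(r"[a-z]+", text.lower())

def stem(text):
    # Toy suffix stripper standing in for a real stemmer (assumption).
    out = []
    for tok in strict(text):
        for suffix in ("ing", "es", "s"):
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        out.append(tok)
    return out

def phonetic(text):
    # Toy phonetic key (assumption): drop vowels, skip a consonant equal
    # to the last one kept, and truncate the result to 3 characters.
    keys = []
    for tok in strict(text):
        skeleton = ""
        for ch in tok:
            if ch not in "aeiouy" and not skeleton.endswith(ch):
                skeleton += ch
        keys.append(skeleton[:3])
    return keys

# Hypothetical weights: the stricter the analyzer, the heavier its vote.
ANALYZERS = [(strict, 4.0), (stem, 2.0), (phonetic, 1.0)]

def rank(query, docs):
    """Return (doc, score) pairs, best first; unmatched docs are dropped."""
    scored = []
    for doc in docs:
        score = 0.0
        for analyze, weight in ANALYZERS:
            query_terms = set(analyze(query))
            # A doc matches under an analyzer if it contains every query term.
            if query_terms and query_terms <= set(analyze(doc)):
                score += weight
        if score > 0:
            scored.append((doc, score))
    return sorted(scored, key=lambda pair: -pair[1])

if __name__ == "__main__":
    docs = ["selling toothbrushes", "sell toothbrush",
            "Selin Toothbrushies", "buy a car"]
    for doc, score in rank("selling toothbrushes", docs):
        print(score, doc)
```

With these toy normalizers the exact phrase ranks first, the stemmed variant second, and the phonetic look-alike last, while the unrelated document is not returned at all. Recall goes up because the looser analyzers contribute extra hits, but those hits cannot outrank a strict match.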
