Thanks Daniel,

But when searching, I will run my "standardization" tools on the query as
well before passing it to Lucene, so what you mentioned will not be a problem.
If someone searches for "mainstrasse", my tools will split it into "main"
and "strasse" again, and Lucene will then be able to find it.
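
To make the idea concrete, here is a rough sketch in plain Java. It does not
use the Lucene API at all: the class StandardizeSketch, the COMPOUNDS table,
and the toy index are all made up for illustration, and my real tools are more
involved. It only shows the same standardize() step being applied to the
documents before indexing and to the query before searching.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Plain-Java model, NOT Lucene API: the same standardization runs on
// documents before indexing and on queries before searching.
final class StandardizeSketch {

    // Hypothetical compound table; a real tool would need a proper word list.
    static final Map<String, List<String>> COMPOUNDS =
            Map.of("mainstrasse", List.of("main", "strasse"));

    // Lowercase, split on whitespace, and replace known compounds by their parts.
    static List<String> standardize(String text) {
        List<String> out = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields a single empty string from split()
            }
            out.addAll(COMPOUNDS.getOrDefault(word, List.of(word)));
        }
        return out;
    }

    // Toy inverted index: term -> ids of the documents that contain it.
    static Map<String, Set<Integer>> index(List<String> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : standardize(docs.get(id))) {
                postings.computeIfAbsent(term, t -> new HashSet<>()).add(id);
            }
        }
        return postings;
    }

    // AND search: documents that contain every standardized query term.
    static Set<Integer> search(Map<String, Set<Integer>> postings, String query) {
        Set<Integer> hits = null;
        for (String term : standardize(query)) {
            Set<Integer> docs = postings.getOrDefault(term, Set.of());
            if (hits == null) {
                hits = new HashSet<>(docs);
            } else {
                hits.retainAll(docs);
            }
        }
        return hits == null ? Set.of() : hits;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> postings =
                index(List.of("Mainstrasse 5", "Berliner Strasse 12"));
        System.out.println(search(postings, "mainstrasse"));
        System.out.println(search(postings, "strasse"));
    }
}
```

With these two sample documents, a search for "mainstrasse" matches only the
first one, while "strasse" matches both. Note that this toy index ignores
token positions entirely, so it says nothing about phrase queries.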


Daniel Naber-5 wrote:
> 
> On Monday 21 May 2007 22:05, bhecht wrote:
> 
>> Is there any point in my creating custom analyzers with filters
>> for stop words and synonyms, and implementing my own "sub string" filter
>> for separating tokens into "sub words" (like "mainstrasse" => "main",
>> "strasse")?
> 
> Yes: I assume your document should be found both with "strasse" and with 
> "mainstrasse". You will then need to put main, strasse, and mainstrasse at 
> the same position (setPositionIncrement(0)). If you don't do that, phrase 
> queries will no longer work as expected. Thus you need an analyzer; 
> modifying the strings before they are put into Lucene is not enough.
> 
> Regards
>  Daniel
> 
> -- 
> http://www.danielnaber.de
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/stop-words%2C-synonyms...-what%27s-in-it-for-me--tf3792510.html#a10726812
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

