Hi!

On Thu, Mar 09, 2006 at 04:31:23AM -0800, Raul Raja Martinez wrote:
> Hi I have a lot of html indexed such as:
> 
> Martínez
> 
> Of course my users are gonna search for Martínez and they're not gonna 
> get a match.
> 
> Is there a common approach to solve this kind of problem in lucene, 
> Maybe some utility class or something?

there is a class named Entities in Jakarta commons-lang, which can
be used to resolve html entities before indexing. Maybe you could
integrate this into a custom analyzer.

http://jakarta.apache.org/commons/lang/xref/org/apache/commons/lang/Entities.html

regards,
Jens

-- 
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       [EMAIL PROTECTED]
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to