Hi,

I'm working on a simple search robot. I want to index some pages of my site, a bit like ht://Dig does. The first step is to retrieve all the important words in the page.

As I'm French, I also have to deal with accents. I chose to lose a bit of information by replacing accented characters with the plain character, i.e. replacing é by e, and also &eacute; by e.

The problem is that there are *A LOT* of characters I'd like to replace (and the list has to be doubled because of the HTML character entities).

Do you guys know of a simple way to do that? Maybe a way to restrict the characters to the unaccented ones?

Best regards,
Sylvain

Computers are like air conditioners - they stop working properly when you open Windows
UNIX _IS_ user friendly. It's just selective about who its friends are.
"If Bill Gates had a nickel for every time Windows crashes... Oh, wait! He does!"
______________________________________________
Sylvain Roche
Responsable développement
Add-Online
www.add-online.fr
80 rue d'Alsace
69100 VILLEURBANNE
France
tel: +33 437431260
fax: +33 437431269
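One possible way to avoid listing every accented character by hand is sketched below. It is only an illustration under stated assumptions, not a definitive answer: it assumes a JVM that provides java.text.Normalizer (Java 6 or later), and the class and method names (AccentStripper, stripEntities, stripAccents, simplify) are hypothetical. The idea has two parts. HTML accent entities are named as "base letter + accent name" (&eacute;, &agrave;, &ccedil;), so one regular expression can keep just the base letter. Raw accented characters can be decomposed into base letter plus combining mark (Unicode NFD), after which the marks are dropped.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

class AccentStripper {
    // Named entities of the form &<letter><accent>; such as &eacute; or &Ccedil;.
    // Group 1 captures the unaccented base letter.
    private static final Pattern ACCENT_ENTITY =
        Pattern.compile("&([A-Za-z])(acute|grave|circ|uml|tilde|cedil|ring);");

    // Unicode combining marks, left behind after NFD decomposition.
    private static final Pattern COMBINING_MARKS = Pattern.compile("\\p{M}+");

    // Replaces each accent entity with its base letter: "&eacute;" becomes "e".
    public static String stripEntities(String html) {
        return ACCENT_ENTITY.matcher(html).replaceAll("$1");
    }

    // Decomposes accented characters, then removes the accent marks.
    public static String stripAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return COMBINING_MARKS.matcher(decomposed).replaceAll("");
    }

    // Handles both the entity form and the raw character form.
    public static String simplify(String html) {
        return stripAccents(stripEntities(html));
    }

    public static void main(String[] args) {
        // \u00e9 is a raw e-acute; the other accents use entity notation.
        System.out.println(simplify("d\u00e9veloppement &agrave; No&euml;l"));
        // prints: developpement a Noel
    }
}
```

Characters that have no decomposition, such as the ligatures œ and æ, and entities that do not follow the letter-plus-accent naming (for example &oelig; or &szlig;) pass through this sketch unchanged, so they would still need a small explicit replacement table.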
