According to Amélie Frenette:
> I have asked you questions a couple of time and I thank you much for your
> work ! Another challenge is following...
>
> search characters "l'" gives no results (it is in the stop words list)
> search characters "l'asthme" 55 results
> search characters "asthme" 111 results
>
> I wonder if you can help me with this one ?

Well, that depends on what the problem is. Note that htdig's bad_words list is not a list of stop words in the sense that htdig would stop indexing when it encounters one of these words. It merely suppresses these words from the search database, and from the search query.

So, given that, what you describe above doesn't sound like a problem to me. If the apostrophe is in valid_punctuation, as it is by default, then when htdig encounters a word like "l'asthme" it will try to index the whole word, stripped of punctuation, plus any component parts of it. In this case, the "l" is suppressed, so only "asthme" and "lasthme" go into the word database.

htsearch doesn't do the splitting, however, so a query for "l'asthme" would only match "lasthme" in the database, while a query for "asthme" would match "asthme" in the database, whether it came from the word by itself, or from "l'asthme" after the "l'" was stripped off. So, it's logical that "asthme" would get more matches.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
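
To make the behaviour described above concrete, here is a rough Python model of it. This is only a sketch of the logic as explained in this message, not ht://Dig's actual code: the function names, the two sample documents, and the choice to model the suppression of "l" as a one-entry bad-words set are all assumptions made for illustration.

```python
import re

# Assumptions for this sketch (not taken from a real htdig configuration):
VALID_PUNCTUATION = "'"   # only the apostrophe is modelled here
BAD_WORDS = {"l"}         # stands in for whatever suppresses "l" at index time

def index_terms(word):
    """Terms stored for one word: the whole word with punctuation stripped,
    plus its component parts, minus anything in BAD_WORDS."""
    pattern = "[" + re.escape(VALID_PUNCTUATION) + "]"
    parts = [p.lower() for p in re.split(pattern, word) if p]
    whole = "".join(parts)
    return ({whole} | set(parts)) - BAD_WORDS

def query_term(word):
    """Query side: punctuation is stripped but the word is NOT split."""
    stripped = "".join(ch for ch in word if ch not in VALID_PUNCTUATION).lower()
    if not stripped or stripped in BAD_WORDS:
        return None  # suppressed from the query, so nothing can match
    return stripped

def build_index(docs):
    """Map each stored term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            for term in index_terms(word):
                index.setdefault(term, set()).add(doc_id)
    return index

def search(index, query):
    term = query_term(query)
    return index.get(term, set()) if term else set()

# Two hypothetical documents, one with the article attached and one without.
docs = {1: "l'asthme chronique", 2: "asthme severe"}
index = build_index(docs)

print(sorted(index_terms("l'asthme")))     # terms stored for the joined word
print(sorted(search(index, "l'asthme")))   # only the document with "l'asthme"
print(sorted(search(index, "asthme")))     # both documents
print(sorted(search(index, "l'")))         # nothing: "l" is suppressed
```

In this model a query for "asthme" always returns at least as many documents as a query for "l'asthme", which is the same pattern as the 111 versus 55 results reported above.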

