Hi there! I suppose the bad words list contains the most frequently used words of the language. Would it be possible for htdig to go through all the files to be indexed, work out which words occur most often and print them out, so that I could decide which words to exclude from the index to speed up searching?
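To make the idea concrete, here is a rough sketch of such a word count done outside htdig, in Python. It is not an htdig feature, and the names `count_words`, `top_words` and `read_files` are made up for illustration. It treats every file under a directory as plain text, which is an assumption; real HTML documents would need their markup stripped first.

```python
import re
from collections import Counter
from pathlib import Path

# Runs of letters only (Unicode-aware); digits and underscores are skipped.
WORD_RE = re.compile(r"[^\W\d_]+")


def count_words(texts):
    """Count case-insensitive word occurrences over an iterable of strings."""
    counts = Counter()
    for text in texts:
        counts.update(word.lower() for word in WORD_RE.findall(text))
    return counts


def top_words(texts, n=100):
    """Return the n most frequent words, most frequent first."""
    return [word for word, _ in count_words(texts).most_common(n)]


def read_files(root):
    """Yield the contents of every regular file below root.

    Assumption: files are readable as UTF-8 text; undecodable bytes
    are replaced rather than raising an error.
    """
    for path in Path(root).rglob("*"):
        if path.is_file():
            yield path.read_text(encoding="utf-8", errors="replace")
```

Calling `top_words(read_files("/path/to/docs"), 100)` would give the 100 most frequent words in that tree (the path is a placeholder). The result could then be reviewed by hand and written out one word per line, assuming that is the layout the bad words file expects.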
Would it help to know that the University of Leipzig has published word lists containing the 100, 1000 and 10000 most frequently used words of English, German, French and Dutch at http://woclu2.informatik.uni-leipzig.de/html/wliste.html? No copyright or other restrictions seem to apply to the downloadable files.

Peter Asemann

