After patching release 3.2.0b3 to compile with Mac OS X Server 1.2r3 (see the patch below), I found that rundig does not index words containing German umlauts coded as plain HTML character entities.
I have seen many discussions around locales, but I don't think the locale is the real problem. The system is set up correctly (in this case German) and even htsearch seems to behave "German".

Example: assume you have an HTML file containing the word "B&uuml;ro". Run rundig and try to find the word "Büro" with htsearch. htsearch correctly interprets umlauts like ä, ö, ü, etc. and encodes them as their HTML entities "&auml;", "&ouml;", "&uuml;", etc. You can see this by typing the search phrase "Büro": htsearch returns a page whose HTML source reads "No matches were found for 'B&uuml;ro'".

When looking into the ASCII word list retrieved by "htdig -t ..." you don't find any word containing umlauts. However, most of the documents I am indexing contain words encoded with HTML character entities. Thus, I think the problem is related to indexing (htdig?) and not to locales.

Any more experiences? Any hints for further patches?

----------------------------
Patch for htdig 3.2.0b3 / Mac OS X Server 1.2r3

1. Edit file htlib/mktime.c
2. Change the debug section (lines 48-55) to be

    #if DEBUG
    # include <stdio.h>
    # if STDC_HEADERS
    # include <stdlib.h>
    # endif
    /* Make it work even if the system's libc has its own mktime routine. */
    //# define mktime my_mktime
    #endif /* DEBUG */

    // AS THE COMMENT ABOVE INTRODUCES:
    // SHOULD BE OUTSIDE FOR OS X Server 1.2r3:
    // system libc.a contains mktime here!
    # define mktime my_mktime

3. Now htdig should compile without further maintenance
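
To make the indexing question above more concrete: if words like "B&uuml;ro" are missing from the word list, one plausible cause is that entities are not turned into their 8-bit characters before the text is split into words. The following is only a minimal sketch of that kind of decoding step, assuming ISO-8859-1 output. It is not taken from the htdig sources; the names decode_entities and entities are hypothetical, and the table covers only the German characters discussed here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table: entity name -> ISO-8859-1 byte.
 * Only the German characters are listed; a real parser would need
 * the complete HTML entity list. */
struct entity {
    const char *name;
    unsigned char latin1;
};

static const struct entity entities[] = {
    { "auml", 0xE4 }, { "ouml", 0xF6 }, { "uuml", 0xFC },
    { "Auml", 0xC4 }, { "Ouml", 0xD6 }, { "Uuml", 0xDC },
    { "szlig", 0xDF },
};

/* Copy src into dst (capacity dstsize bytes, including the trailing NUL),
 * replacing every known "&name;" entity by its ISO-8859-1 byte.
 * Unknown entities and stray '&' characters are copied unchanged.
 * Returns the number of bytes written, not counting the NUL. */
size_t decode_entities(const char *src, char *dst, size_t dstsize)
{
    size_t out = 0;

    if (dstsize == 0)
        return 0;

    while (*src != '\0' && out + 1 < dstsize) {
        int matched = 0;

        if (*src == '&') {
            size_t i;
            for (i = 0; i < sizeof entities / sizeof entities[0]; i++) {
                size_t n = strlen(entities[i].name);
                /* Entity names are case sensitive: &auml; differs from &Auml; */
                if (strncmp(src + 1, entities[i].name, n) == 0
                    && src[1 + n] == ';') {
                    dst[out++] = (char)entities[i].latin1;
                    src += n + 2;   /* skip '&', the name and ';' */
                    matched = 1;
                    break;
                }
            }
        }

        if (!matched)
            dst[out++] = *src++;
    }

    dst[out] = '\0';
    return out;
}
```

Calling decode_entities("B&uuml;ro", buf, sizeof buf) leaves the four bytes 'B', 0xFC, 'r', 'o' in buf, which a word extractor running with a German locale should then treat as a single word. Whether htdig performs such a step, and where, would have to be checked in its HTML parser.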

