According to Thilo Bauer:
> Being aware of the fact that htdig 3.2.0b3 is
> still beta and while trying to evaluate the general
> problem with special characters and HTML character
> entities I got it to compile and run on at least two
> different platforms:
>
> - Mac OS X Server (1.2r3)
> - Cygwin (Win NT, latest release)

Not only is 3.2.0b3 a beta, it's also a very old beta. The 3.2.0b4 release is still under development, but it fixes lots of known problems with 3.2.0b3, which we don't recommend you spend any more time on. Please grab the latest 3.2.0b4 development snapshot from http://www.htdig.org/files/snapshots/ and give it a try.

> Finally, there still remains one common problem
> on those environments:
>
> htdig doesn't index words containing
> special characters (umlaut: ä, etc.).
>
> Wouldn't it be possible to create a more portable
> solution which does not make assumptions on locales
> or something other operating system and installation
> dependent stuff?

Yes, I think the use of locales was to make it easy to add a bit of support for internationalisation to ht://Dig. Unfortunately, there are lots of limitations, problems, and configuration issues involved in this. Ultimately, it would be better to make ht://Dig locale-independent, but it's not a trivial change and so far no one has stepped up to the challenge.

In the meantime, though, if htdig isn't indexing accented characters properly, it's almost certainly a problem with your locale setting or perhaps with the locale implementation on your system. See http://www.htdig.org/FAQ.html#q5.8 for details.

The fact that htsearch is converting ISO-8859-1 characters back to their SGML entities does NOT imply that locale support is working: this conversion is done independently of the locale setting. (This in itself is a problem, because it does the conversion even if you're indexing using a different encoding. Fixing this is on our to-do list.)

> As a hint for others, evaluating this beta release
> I'll provide bugfixes to be corrected manually to
> successfully build the binaries:
>
> 1. Mac OS X Server 1.2r3:
>
> "mktime" resides in libc.a, which doesn't
> seem to be detected automatically (configure).
>
> In file "htlib/mktime.c" (48-55):
> move "#define my_mktime..." out of the "#ifdef DEBUG"
> block section.
> It now should compile (I've tuned
> the standard version including OpenSSL, libz.a, etc.).

Please let us know if this is still a problem with a recent 3.2.0b4 snapshot.

> 2. Cygwin:
>
> ... has problems with some "#ifdef" sections.
> Neither "DBL_MAX" nor "FLOAT_MAX" is defined here
> (file: htsearch/Display.cc). Another problem was
> related to an "#ifdef" section in file "htlib/HtRegex.h":
> "regex.h" and <regex.h> includes have to be exchanged
> (inverse logic).

I believe all this is fixed in current 3.2.0b4 development. Please let us know if that's not the case.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

