According to Thilo Bauer:
> Being aware of the fact that htdig 3.2.0b3 is
> still beta, and while trying to evaluate the general
> problem with special characters and HTML character
> entities, I got it to compile and run on at least two
> different platforms:
> 
> - Mac OS X Server (1.2r3)
> - Cygwin (Win NT, latest release)

Not only is 3.2.0b3 a beta, it's also a very old beta.  The 3.2.0b4
release is still under development, but it fixes lots of known
problems with 3.2.0b3, which we don't recommend you spend any more
time on.  Please grab the latest 3.2.0b4 development snapshot from
http://www.htdig.org/files/snapshots/ and give it a try.

> Finally, there still remains one common problem
> on those environments:
> 
>    htdig doesn't index words containing
>    special characters (umlaut: ä, etc.).
> 
> Wouldn't it be possible to create a more portable
> solution which does not make assumptions about locales
> or other operating system and installation
> dependent stuff?

Yes, I think the use of locales was to make it easy to add a bit of
support for internationalisation to ht://Dig.  Unfortunately, there are
lots of limitations, problems, and configuration issues involved in this.
Ultimately, it would be better to make ht://Dig locale-independent,
but it's not a trivial change and so far no one has stepped up to the
challenge.

In the meantime, though, if htdig isn't indexing accented characters
properly, it's almost certainly a problem with your locale
setting or perhaps with the locale implementation on your system.
See http://www.htdig.org/FAQ.html#q5.8 for details.  The fact that
htsearch is converting ISO-8859-1 characters back to their SGML entities
does NOT imply that locale support is working - this conversion is done
independently of the locale setting.  (This in itself is a problem,
because it does the conversion even if you're indexing using a different
encoding.  Fixing this is on our to-do list.)
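
A quick way to tell a locale problem apart from an htdig problem is to
test the C library directly, outside of htdig.  The sketch below is not
part of ht://Dig; it assumes that word characters are classified with
the C library's ctype functions (which is what the locale dependence
implies), and the helper names are made up for illustration.  It asks
whether byte 0xE4 (a-umlaut in ISO-8859-1) counts as alphanumeric under
a given locale.

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper, not an ht://Dig function.
 * Returns 1 if the C library treats byte c as alphanumeric under the
 * named locale, 0 if it does not, and -1 if setlocale() rejects the
 * locale name (typically because it is not installed). */
int is_word_char_in_locale(const char *name, unsigned char c)
{
    if (setlocale(LC_CTYPE, name) == NULL)
        return -1;
    return isalnum(c) ? 1 : 0;
}

/* Print a one-line report for byte 0xE4 under the named locale. */
void report(const char *name)
{
    int r = is_word_char_in_locale(name, 0xE4);
    const char *verdict;

    if (r < 0)
        verdict = "locale not available";
    else if (r > 0)
        verdict = "word character";
    else
        verdict = "NOT a word character";

    printf("%-20s byte 0xE4: %s\n", name, verdict);
}
```

Calling report("C") and then report() with the locale name you gave
htdig from a small main() shows whether the system's locale does what
you expect.  Locale names vary from system to system (something like
"de_DE.ISO8859-1" is only an example), so use a name your system
actually lists.  If the byte is not reported as a word character under
your chosen locale, htdig built on that system is unlikely to index it
either.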

> As a hint for others evaluating this beta release,
> here are the fixes that have to be applied manually to
> successfully build the binaries:
> 
> 1. Mac OS X Server 1.2r3:
> 
> "mktime" resides in libc.a, which doesn't
> seem to be detected automatically (configure).
> 
> In file "htlib/mktime.c" (48-55):
> move "#define my_mktime..." out of the "#ifdef DEBUG"
> block section. It now should compile (I've tuned
> the standard version including OpenSSL, libz.a, etc.).

Please let us know if this is still a problem with a recent 3.2.0b4
snapshot.

> 2. Cygwin:
> 
> ... has problems with some "#ifdef" sections.
> Neither "DBL_MAX" nor "FLOAT_MAX" is defined here
> (file: htsearch/Display.cc). Another problem was
> related to an "#ifdef" section in file "htlib/HtRegex.h":
> "regex.h" and <regex.h> includes have to be exchanged
> (inverse logic).

I believe all this is fixed in current 3.2.0b4 development.  Please
let us know if that's not the case.
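
For anyone stuck on an older tree in the meantime: DBL_MAX comes from
the standard header <float.h>, and the standard name for the float
limit is FLT_MAX rather than FLOAT_MAX.  A guard along the following
lines is a common workaround.  This is only a sketch, not the change
that was made in the 3.2.0b4 sources; the fallback values are just the
minimum magnitudes the C standard guarantees, and the two helper
functions are hypothetical, there only to show the macros in use.

```c
#include <float.h>

/* Fallbacks for the unusual case where <float.h> does not provide the
 * limits.  1e37 is the smallest maximum the C standard allows, so it
 * is a safe (if conservative) stand-in, not the platform's real limit. */
#ifndef DBL_MAX
#define DBL_MAX 1e37
#endif
#ifndef FLT_MAX
#define FLT_MAX 1e37F
#endif

/* Hypothetical helpers, not ht://Dig functions. */
double largest_double(void) { return DBL_MAX; }
float largest_float(void) { return FLT_MAX; }
```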

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

