This is all quite fascinating, but it's not my issue, which is that in 
HTML, accented and otherwise modified characters of the Roman character 
set are supposed to be represented with HTML character entities (such as 
&eacute;), not as high-bit-set ASCII, since the conventions for 
interpreting extended ASCII vary from platform to platform. htdig is 
mapping these HTML entities to a single-byte internal 
representation on text acquisition but then not mapping them back to HTML 
on output, leading to weird-looking search displays -- try searching for 
"Rabelais" at www.maroney.org, and look at how the words "Francois" and 
"Theleme" come out in Mac Explorer. Is there any way to fix this, short 
of writing a wrapper around htsearch that does character mapping? 
Shouldn't htdig just do the right thing to start with?
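
For concreteness, here is a minimal sketch of the character-mapping step such a wrapper would need. It assumes the bytes coming out of htsearch are ISO-8859-1 (an assumption, not something confirmed here), and the function name latin1_to_entities is hypothetical. It is not part of htdig.

```python
from html.entities import codepoint2name


def latin1_to_entities(raw: bytes) -> str:
    """Replace high-bit-set characters with HTML entities.

    Assumes `raw` is ISO-8859-1 (Latin-1) encoded; that is a guess about
    what htsearch emits, not a documented guarantee.
    """
    out = []
    for ch in raw.decode("latin-1"):
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII, including HTML markup, passes through untouched.
            out.append(ch)
        elif cp in codepoint2name:
            # Named entity, e.g. 0xE9 becomes &eacute;
            out.append("&" + codepoint2name[cp] + ";")
        else:
            # No named entity available: fall back to a numeric reference.
            out.append("&#" + str(cp) + ";")
    return "".join(out)
```

Because only bytes at or above 0x80 are rewritten, the existing tags and entities in the result page are left alone. A real wrapper would still have to invoke htsearch and pass its output through this function, which is the part I was hoping to avoid.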

Also, is the answer to my other questions (about unwanted backslash 
removal in search results, and restriction to exclude subdirectories) 
that there's nothing that can be done short of modifying htdig source 
code? My ISP is finicky about binary executables, and even if I dug up a 
UNIX shell login somewhere and made these changes, I wouldn't be able to use 
a custom version on my web site. If the answer is that these things can't 
be done in vanilla htdig, I'd like to know. Thanks!

--
Tim Maroney    [EMAIL PROTECTED]    http://www.maroney.org
"The world is made possible, in part, by murk."

----------------------------------------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the body of the message.
