According to gunis:
> I am using ht//Dig on my web server. How can I search on my web if my page
> is using windows 1250 (text include diacritical mark) encoding?

The FAQ entry http://www.htdig.org/FAQ.html#q4.10 describes how to index
documents in other languages, but the snag is that you have to choose a
"locale" that makes use of the character encoding used in your documents.

I don't know of any locales on UNIX/Linux systems that use windows-1250
encoding.  If you can find or build such a locale, that would certainly
be one way to proceed.  (I unfortunately wouldn't know how to do either,
so I can't advise you any further on those approaches.)  You'd then need
to indicate that htsearch results are also in windows-1250 encoding, by means
of a tag like this in your header.html:

    <meta http-equiv="Content-Type" content="text/html; charset=windows-1250">

The other option is to use an external converter that maps all
windows-1250 characters to their iso-8859-1 or iso-8859-2 equivalents,
and to use a locale for that new encoding.  Your search results would
then be in that new encoding, so you'd need to change the meta tag above
in your header.html accordingly.  For iso-8859-1, which is the default
encoding, you can remove the tag altogether.  You can call this external
converter (say we call it w1250toiso) from htdig via this config
attribute in your htdig.conf:

    external_parsers:       text/html->text/html-internal /path/to/your/w1250toiso

See http://www.htdig.org/attrs.html#external_parsers for details, and
note the distinction between parsers and converters.  I'm suggesting you
use a converter, not a parser.  The arguments to the script are the same,
but the output of the converter will be HTML, not parser records.
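
As a rough illustration of what such a converter could look like, here is
a minimal sketch in Python.  The name w1250toiso is just the placeholder
used above, and the choice of iso-8859-2 as the target is an assumption.
The sketch also assumes that htdig passes the path of the file holding
the document as the first argument and reads the converted HTML from
standard output; check the external_parsers documentation for the exact
argument list before relying on this.  Characters that have no
iso-8859-2 equivalent (the euro sign, for example) are written as numeric
HTML character references rather than dropped.

```python
#!/usr/bin/env python3
"""w1250toiso: hypothetical windows-1250 to iso-8859-2 converter sketch."""
import sys

SOURCE_ENCODING = "cp1250"      # Python's codec name for windows-1250
TARGET_ENCODING = "iso-8859-2"  # assumption: Latin-2 is the target encoding


def convert(data: bytes) -> bytes:
    """Re-encode windows-1250 bytes as iso-8859-2 bytes."""
    # Bytes that are undefined in windows-1250 become U+FFFD.
    text = data.decode(SOURCE_ENCODING, errors="replace")
    # Characters missing from iso-8859-2 become references such as &#8364;
    return text.encode(TARGET_ENCODING, errors="xmlcharrefreplace")


def main(argv):
    # Assumption: argv[1] is the path of the file containing the document.
    # Any further arguments are ignored in this sketch.
    with open(argv[1], "rb") as infile:
        sys.stdout.buffer.write(convert(infile.read()))


# Only run when an input file was actually given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

Note that this sketch does not touch any charset meta tag inside the
documents themselves, so a page that declares windows-1250 in its own
header would still carry that declaration after conversion.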

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

