[htdig] pdf2text, catdoc and French accents

stephan . bastiaens Thu, 09 Jan 2003 09:43:12 -0800

Hi,
I've installed htDig on a Red Hat 8.0 box and have run into problems with ISO-8859 to Unicode (UTF-8) conversion.
The website to index is ISO-8859 encoded, as are the documents it links to (pdf, doc, xls and ppt).
Parsing and searching work fine, except for special characters such as French accents.
This is caused by a Unicode conversion done on my Linux box.
For plain HTML and text files we can avoid the conversion by turning Unicode mode off on the Linux box (unicode_stop command). But I can't find an equivalent solution for the doc2html (pdf2text) or catdoc parsers.
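
To illustrate the kind of fix this would need: the converter output would have to be re-encoded from UTF-8 back to ISO-8859-1 before htdig reads it. Below is a minimal Python sketch of that step, assuming the converter really does emit UTF-8. The function name utf8_to_latin1 is hypothetical, and nothing here is an htdig, doc2html or catdoc feature.

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1 (hypothetical helper)."""
    # Decode leniently: byte sequences that are not valid UTF-8
    # become U+FFFD instead of raising an exception.
    text = data.decode("utf-8", errors="replace")
    # Characters with no ISO-8859-1 equivalent (including U+FFFD)
    # are written as "?" so the output stays single-byte.
    return text.encode("iso-8859-1", errors="replace")


# Example: "é" is two bytes in UTF-8 and one byte (0xE9) in ISO-8859-1.
sample = "accentué".encode("utf-8")
converted = utf8_to_latin1(sample)
```

Such a function could, in principle, sit in a small wrapper script that runs the converter and filters its standard output, but how to hook that into the parser chain is left open here.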

Does anybody have a hint, clue or solution?
Thanks
