peter karlsson wrote:

> > Anyone out there indexing pdf-files with ISO-Latin characters in them
> > (åäöÅÄÖ mainly)? Seems that htdig doesn't understand the meaning of the
> > special characters, and shows them w/o conversion;
>
> I don't believe that htdig *cares* what character set the external parsers
> feed it, but assumes that it is the one specified in the current
> locale. You would probably want to hack the program you're using to parse
> PDF files to correctly convert its output to the local character set.

That's true; these weird characters show up in excerpts and can be searched for.

But I'm not using anything special to parse PDFs; just Adobe Acrobat and
then indexing with htdig. That's why I'm asking here. And the PS created by
Acrobat works fine.

Further, I don't think simply converting characters would help, since the
converter would have to distinguish between PC and Mac character sets, which
overlap.
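
To illustrate the overlap, here is a minimal Python sketch. It assumes the
text extracted from the PDF arrives as raw 8-bit bytes with no charset label.
The helper names (decode_candidates, to_latin1) are made up for this example
and are not part of htdig or Acrobat. The single byte 0x8A is a valid letter
in both Mac Roman and Windows-1252, but a different letter in each, so a
converter cannot tell from the byte alone which one was meant.

```python
# Illustrative sketch only; nothing here is part of htdig or Acrobat.
# Assumption: the extracted text is raw 8-bit bytes with no charset label.

RAW = b"\x8a"  # one byte as it might come out of a PDF-to-text step


def decode_candidates(data: bytes) -> dict:
    """Decode the same bytes under each candidate source charset."""
    return {name: data.decode(name) for name in ("mac_roman", "cp1252")}


def to_latin1(data: bytes, source_charset: str) -> bytes:
    """Re-encode to ISO-8859-1 once the source charset is known.

    Characters that do not exist in ISO-8859-1 are replaced with '?'.
    """
    return data.decode(source_charset).encode("latin_1", errors="replace")


candidates = decode_candidates(RAW)
for charset, text in candidates.items():
    # Same input byte, different character depending on the assumed charset.
    print(charset, repr(text), to_latin1(RAW, charset))
```

So the conversion itself is easy once the source charset is known; the hard
part is knowing it, which has to come from somewhere other than the bytes.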

--
- Antti Rauramo, WWW and database specialist, Edita Verkkoviestintä
- [EMAIL PROTECTED], +358-9-8501 4004 (mobile)


