htdig and lyx-pdf-documents

M.B. Schiekel Fri, 09 Jan 2004 09:33:32 -0800

Hello,

I'm trying to set up a document server with htdig under SuSE-8.2. In
this context I also tried to build an index of PDF files created with
LyX, using the htdig external_parsers method.
This works for all my PDF files except the ones generated from LyX sources.


I spent a long time debugging through the following chain of programs:
genhtdig.pl -> htdig -> doc2html.pl -> pdf2html.pl -> pdftotext

In the end I found the following in the man page of pdftotext:

BUGS
       Some  PDF  files  contain  fonts whose encodings have been
       mangled beyond recognition.  There is  no  way  (short  of
       OCR) to extract text from these files.

Question: is that really the reason why pdftotext fails to process
LyX-generated PDF files?
And if so, is there another way to index LyX PDF files?
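
To narrow this down outside of htdig, one could run pdftotext directly on a
single file and look at how much of the output is readable. The following is
only a rough sketch under some assumptions: pdftotext is on the PATH, the
function names (readable_ratio, extract_text, check) and the 0.5 threshold
are made up for illustration, and counting ASCII letters and digits is just
a crude heuristic, not a real test for mangled font encodings.

```python
import subprocess


def readable_ratio(text):
    """Fraction of non-whitespace characters that are ASCII letters or digits.

    A low value hints that the extracted "text" is mostly garbage, which is
    what one would expect if the font encodings could not be mapped.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    good = sum(1 for c in chars if c.isascii() and c.isalnum())
    return good / len(chars)


def extract_text(pdf_path):
    """Run `pdftotext <file> -`, which writes the extracted text to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    # latin-1 maps every byte to a character, so decoding cannot fail here.
    return result.stdout.decode("latin-1")


def check(pdf_path, threshold=0.5):
    """Return True if the extracted text looks readable (arbitrary threshold)."""
    return readable_ratio(extract_text(pdf_path)) >= threshold
```

Calling check() on one LyX-generated file and on one file that htdig indexes
correctly should show whether the difference already appears at the
pdftotext step, before doc2html.pl or htdig are involved.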

Thank you
bernhard

-- 
http://home.t-online.de/home/mb.schiekel/
GPG-Key available: GnuPG-1.2.2
