2007/9/19, Behdad Esfahbod <[EMAIL PROTECTED]>:
> Anyway, I wrote about PDF text extraction from the point of view of what
> cairo should be doing to generate perfectly text-extractable PDFs.
> Forwarding the message here as people may be interested. I also point
> out a few poppler bugs. I plan to fix them at some point, but it may be
> an obvious small fix to those familiar with the code base.

Two things to note, since you are talking about extracting information from
PDFs you created yourself:

- Tagged PDF can embed more information in the PDF than pure glyphs, and may
  help.
- If tagged PDF is not enough, you can embed even more information yourself
  using private structures.

Best,
Martin
