Re: [poppler] Using the poppler utils in my software

Alec Taylor Mon, 16 Jul 2012 09:06:53 -0700

Actually you probably want to use pdftoxml, included in
poppler-utils. If that gives you too many mistakes, maybe check out
an OCR engine such as Tesseract.


Reparsing the XML shouldn't give you too much trouble; if it does, ask
me and I'll give you access to the poppler repo I modified ~8 months
ago, which includes my algorithms for extracting logical structure
information from PDFs by post-processing the XML and regenerating XML
containing the modified markup.
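
To give a rough idea of what reparsing the XML can look like, here is a
minimal sketch in Python. It is not the algorithm from my repo. It assumes
the XML has the layout that `pdftohtml -xml` emits (a `<pdf2xml>` root,
`<page>` elements, and `<text>` elements carrying `top` and `left`
attributes); check that against your own output first. The function name
`extract_rows`, the `tolerance` parameter, and the sample data are made up
for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample, shaped like the XML that `pdftohtml -xml` is assumed to emit.
SAMPLE_XML = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1262" width="892">
<text top="100" left="50" width="80" height="12" font="0">Monday</text>
<text top="100" left="200" width="90" height="12" font="0">08:00-16:00</text>
<text top="120" left="50" width="80" height="12" font="0"><b>Tuesday</b></text>
<text top="121" left="200" width="90" height="12" font="0">10:00-18:00</text>
</page>
</pdf2xml>
"""


def extract_rows(xml_text, tolerance=2):
    """Group text fragments into rows per page (hypothetical helper).

    Fragments whose `top` is within `tolerance` units of the first
    fragment in a row are treated as belonging to that row.
    Returns {page_number: [[cell, cell, ...], ...]}.
    """
    root = ET.fromstring(xml_text)
    pages = {}
    for page in root.iter("page"):
        cells = []
        for text in page.iter("text"):
            # itertext() also picks up text nested in <b>/<i> children.
            content = "".join(text.itertext()).strip()
            if content:
                cells.append((int(text.get("top")), int(text.get("left")), content))
        cells.sort()  # top-to-bottom, then left-to-right

        rows = []
        row_top = None
        for top, left, content in cells:
            # Start a new row when this fragment sits clearly below the current one.
            if row_top is None or top - row_top > tolerance:
                rows.append([])
                row_top = top
            rows[-1].append((left, content))

        # Order each row by horizontal position and keep only the text.
        pages[int(page.get("number"))] = [
            [content for _, content in sorted(row)] for row in rows
        ]
    return pages


if __name__ == "__main__":
    for number, rows in extract_rows(SAMPLE_XML).items():
        for row in rows:
            print(number, row)
```

For a real file you would load the XML with `ET.parse(path).getroot()`
instead of the inline sample. The tolerance is there because fragments on
the same visual line often differ by a unit or two in `top`; the right
value depends on the PDF.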

On Tue, Jul 17, 2012 at 1:32 AM, Jean-Philippe Green
<[email protected]> wrote:
> Hello. I asked about this on IRC with no luck, so I'll try here instead.
>
> How can I use the poppler utils (such as pdftotext) in my software without
> executing a platform-dependent executable? Is that functionality included
> somewhere in the library?
>
>
> If you want to know, I'm trying to write software that reads a PDF schedule
> from the company where I work and makes an iCalendar file that can be used in
> Google Calendar and elsewhere. They can't provide a calendar file, so I feel I
> need to do this. I want all of my co-workers to be able to use it too, so it
> needs to be platform independent.
>
> Thank you!
>
> _______________________________________________
> poppler mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/poppler
>
