The easiest way I can think of is to grab it from the headers and footers. I am about to submit a patch (any day now) that separates the headers and footers into their own tags, which you can then access via pdftohtml -xml.
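Until the patch is available, here is a rough Python sketch of the header/footer idea. It assumes the existing pdftohtml -xml layout, where each page element (with a height attribute) holds text elements carrying top and height attributes. The function name, the 10% margin, and the "appears on at least half the pages" rule are my own placeholders, not anything in poppler, and a repeated header is only a guess at the title.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def guess_title_from_running_text(xml_text, margin=0.1, min_share=0.5):
    """Guess a title from text repeated in page headers/footers.

    Hypothetical helper: `margin` is the fraction of the page height
    treated as header/footer area, `min_share` the fraction of pages a
    string must appear on. Returns None when nothing repeats enough.
    """
    root = ET.fromstring(xml_text)
    pages = root.findall("page")
    counts = Counter()
    for page in pages:
        page_height = float(page.get("height", 0))
        if page_height <= 0:
            continue  # cannot locate margins without a page height
        seen = set()  # count each string at most once per page
        for text in page.findall("text"):
            top = float(text.get("top", 0))
            box_height = float(text.get("height", 0))
            in_header = top < margin * page_height
            in_footer = top + box_height > (1 - margin) * page_height
            if not (in_header or in_footer):
                continue
            # itertext() also collects content nested in <b>/<i> tags.
            content = "".join(text.itertext())
            # Drop digits so page numbers do not make each line unique.
            content = re.sub(r"\d+", "", content)
            content = " ".join(content.split())
            if len(content) > 3:
                seen.add(content)
        counts.update(seen)
    threshold = min_share * len(pages)
    frequent = [(hits, len(s), s) for s, hits in counts.items()
                if hits >= threshold]
    if not frequent:
        return None
    # Prefer the most repeated string, then the longest one.
    return max(frequent)[2]
```

You would feed it the contents of the .xml file that pdftohtml -xml writes. Documents without running headers simply return None, so it is best treated as a fallback for when getDocInfo() gives an empty title.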
I will then work on incorporating it all back into the PDF, with ToC linkage (I will make a new pdftopdf utility).

On Wed, Nov 9, 2011 at 5:17 PM, 杨辉强 <[email protected]> wrote:
> Hi, all:
> I want to extract title from pdf file. Although PDFDoc has a function
> getDocInfo() to get title info, it is
> empty most of the time. Thus, I have to guess it by myself. I wish you can
> give me some advices.
>
> Thank you!
> Best wishes!
> _______________________________________________
> poppler mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/poppler
