On Thursday 10 November 2011 14:36:39 Leonard Rosenthol wrote:
> EXCEPT that Poppler (and by extension, pdftoxml) does NOT process the
> tagging & structure of the PDF :(.

This is not true, you can at least get Outline text with poppler.

> That's why I was hoping that you were
> ADDING THIS FEATURE to Poppler's core.
>
> Leonard
>
> On 11/9/11 10:44 PM, "Alec Taylor" <[email protected]> wrote:
> >Running pdftohtml -xml, analysing XML, processing information back into
> >PDF
> >
> >On Thu, Nov 10, 2011 at 2:01 PM, Leonard Rosenthol <[email protected]>
> >wrote:
> >> On 11/9/11 10:02 AM, "Alec Taylor" <[email protected]> wrote:
> >>>>Are you also submitting patches to read & process any tags & structure
> >>>>in the PDF? If the PDF is already tagged, then it will have any
> >>>>headers/footers already identified accordingly. You should be using
> >>>>this when present.
> >>>
> >>>Yes, I am using the RapidXML library, which I specifically chose for
> >>>speed and that it is header only.
> >>>
> >> What does an XML library have to do with processing PDF structure &
> >> tagging (ISO 32000-1:2008, 14.7-14.9)???
> >>
> >> Leonard

--
Пётр Керзум
Search platform development group
St. Petersburg, tel. 8508
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
