Ah, good. So is there any way we can get this optional information?

On Thu, Jan 28, 2010 at 6:19 PM, Leonard Rosenthol <[email protected]> wrote:
> PDF DOES support rich semantic structure, including all of the things listed
> below (ISO 32000-1:2008, 14.7, 14.8 and 14.9). HOWEVER, it is optional, and
> therefore many PDF documents do not contain the necessary elements. And,
> as pointed out, without the presence of such elements already in the PDF,
> the best you can do is GUESS.
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
> Sent: Thursday, January 28, 2010 7:04 AM
> To: amit aggarwal
> Cc: [email protected]
> Subject: Re: [poppler] Extract pdf
>
> Hi,
>
> I think PDF is a page description language and defines
> nothing for semantic structure, such as how to store the titles
> of sections, subsections, figures and tables. Therefore, I
> guess, poppler cannot extract them, because PDF does not have them.
>
> Is there any reliable framework that defines such structure and that
> your target documents follow?
>
> Regards,
> mpsuzuki
>
> On Thu, 28 Jan 2010 17:23:17 +0530
> amit aggarwal <[email protected]> wrote:
>
> >Hi All,
> >
> >I want to extract the following information from a PDF:
> >1) All chapter, section and subsection titles,
> >2) Names of the figures and tables
> >
> >Can anyone please help me with this?
> >
> >--
> >Thanks
> >Amit Aggarwal

--
Thanks
Amit Aggarwal
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
