Re: [poppler] Extract pdf

amit aggarwal Thu, 28 Jan 2010 05:17:48 -0800

Ah, good. So is there any way we can get this optional information?
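
As a rough first check for whether a given file carries any of the optional structure described in the quoted reply below, one could scan its raw bytes for the relevant catalog keys. This is only a heuristic sketch, not poppler's API: the function name find_structure_hints is made up for illustration, and the scan can miss keys when the catalog sits inside a compressed object stream, or report a key that only appears in unrelated data.

```python
import re

# Catalog keys that signal optional structure in a PDF.
# StructTreeRoot and MarkInfo belong to the logical structure / Tagged PDF
# features (ISO 32000-1, 14.7 and 14.8); Outlines is the bookmark tree.
STRUCTURE_KEYS = ("StructTreeRoot", "MarkInfo", "Outlines")


def find_structure_hints(pdf_bytes: bytes) -> dict:
    """Hypothetical helper: report which keys occur as PDF names in the raw bytes.

    Heuristic only. It does not parse the file, so it cannot see keys stored
    in compressed object streams and may match bytes outside the catalog.
    """
    hints = {}
    for key in STRUCTURE_KEYS:
        # Require the name to end here, so /Outlines does not match /OutlinesX.
        pattern = rb"/" + key.encode("ascii") + rb"(?![A-Za-z0-9])"
        hints[key] = re.search(pattern, pdf_bytes) is not None
    return hints
```

To try it, read a file in binary mode (the path is a placeholder), for example find_structure_hints(open("example.pdf", "rb").read()). If every value is False, the document most likely has no tagging or outline to extract, and titles would have to be guessed from layout.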

On Thu, Jan 28, 2010 at 6:19 PM, Leonard Rosenthol <[email protected]> wrote:


> PDF DOES support rich semantic structure, including all of the things listed
> below (ISO 32000-1:2008, 14.7, 14.8 and 14.9). HOWEVER, it is optional, and
> therefore many PDF documents do not contain the necessary elements. And,
> as pointed out, unless such elements are already present in the PDF,
> the best you can do is GUESS.
>
> -----Original Message-----
> From: [email protected] [mailto:
> [email protected]] On Behalf Of
> [email protected]
> Sent: Thursday, January 28, 2010 7:04 AM
> To: amit aggarwal
> Cc: [email protected]
> Subject: Re: [poppler] Extract pdf
>
> Hi,
>
> I think PDF is a page description language and defines
> nothing for semantic structure, such as how to store the titles
> of sections, subsections, figures and tables. Therefore, I
> guess, poppler cannot extract them, because PDF does not have them.
>
> Is there any reliable framework that defines such structure and that
> your target documents follow?
>
> Regards,
> mpsuzuki
>
> On Thu, 28 Jan 2010 17:23:17 +0530
> amit aggarwal <[email protected]> wrote:
>
> >Hi All,
> >
> >I want to extract the following information from a PDF:
> >1) All chapter, section and subsection titles
> >2) Names of the figures and tables
> >
> >Can anyone please help me with this?
> >
> >--
> >Thanks
> >Amit Aggarwal
> >
> _______________________________________________
> poppler mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/poppler
>



-- 
Thanks
Amit Aggarwal

