Re: [poppler] Extract pdf

mpsuzuki Thu, 28 Jan 2010 05:01:58 -0800

Oh, I was not aware of the semantic features in PDF 1.7
(the specification says that the logical structure facility
was already introduced in PDF 1.3). Thank you for pointing out
my misunderstanding. What is the most popular workflow for
generating such semantically structured PDFs?
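
As a rough way to tell whether a given file carries this optional
structure at all, one could scan its raw bytes for the catalog's
/StructTreeRoot entry and for "/Marked true" in the /MarkInfo
dictionary. The sketch below is only a heuristic under stated
assumptions: the function name structure_hints is made up for
illustration, it does not parse the PDF, it will miss these keys
when the catalog is stored in a compressed object stream, and it can
be fooled by the same byte sequence appearing elsewhere in the file.

```python
import re

# Byte patterns for the two markers discussed above. Both are plain
# PDF name/keyword sequences; this is a text scan, not a PDF parser.
STRUCT_ROOT = re.compile(rb"/StructTreeRoot\b")
MARKED_TRUE = re.compile(rb"/Marked\s+true\b")


def structure_hints(pdf_bytes):
    """Return heuristic hints that a PDF declares logical structure.

    Hypothetical helper: a False result does not prove the structure
    is absent, because the catalog may sit in a compressed stream.
    """
    return {
        "struct_tree_root": STRUCT_ROOT.search(pdf_bytes) is not None,
        "marked": MARKED_TRUE.search(pdf_bytes) is not None,
    }
```

Typical use would be structure_hints(open(path, "rb").read()). If both
values come back False, extracting section titles or figure names
will most likely have to rely on guessing from layout.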


Regards,
mpsuzuki

On Thu, 28 Jan 2010 04:49:45 -0800
Leonard Rosenthol <[email protected]> wrote:

>PDF DOES support rich semantic structure, including all of the things listed
>below (ISO 32000-1:2008, 14.7, 14.8 and 14.9). HOWEVER, it is optional and
>therefore many PDF documents do not contain the necessary elements. And, as
>pointed out, without such elements already present in the PDF, the best you
>can do is GUESS.
>
>-----Original Message-----
>From: [email protected] 
>[mailto:[email protected]] On Behalf Of 
>[email protected]
>Sent: Thursday, January 28, 2010 7:04 AM
>To: amit aggarwal
>Cc: [email protected]
>Subject: Re: [poppler] Extract pdf
>
>Hi,
>
>I think PDF is a page description language and defines
>nothing for semantic structure, such as how to store the
>titles of sections, subsections, figures and tables. Therefore,
>I guess poppler cannot extract them, because PDF does not have them.
>
>Is there any reliable framework that defines such structure
>and that your target documents follow?
>
>Regards,
>mpsuzuki
>
>On Thu, 28 Jan 2010 17:23:17 +0530
>amit aggarwal <[email protected]> wrote:
>
>>Hi All,
>>
>>I want to extract the following information from a PDF:
>>1) All chapter, section and subsection titles
>>2) The names of the figures and tables
>>
>>Can anyone please help me with this?
>>
>>-- 
>>Thanks
>>Amit Aggarwal
>>
>_______________________________________________
>poppler mailing list
>[email protected]
>http://lists.freedesktop.org/mailman/listinfo/poppler
