Hi guys and thanks.
I was thinking that maybe we could extract only a section of a page.
If one bookmark marks the start of the section I want to extract, that
gives me the start point of the extraction. The next bookmark could then
mark the end of the area I want to extract. Wouldn't that be feasible?
I would then only need to extract from the start point to the end point.
If one of you has some code for this, that would be very helpful!
cheers
David
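
A minimal sketch of the start-to-next-bookmark idea, in plain Java with no
iText calls. It assumes the page text has already been extracted into one
String (for example with iText's PdfTextExtractor), that the bookmark titles
have been read separately (for example with SimpleBookmark), and that each
title appears verbatim in the extracted text, in document order. As Leonard
points out below, none of that is guaranteed for a real PDF. BookmarkSlicer,
Mark and slice are hypothetical names invented for this illustration. A
section ends at the next bookmark of the same or a higher level, so a parent
section keeps its sub-sections.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: cuts already-extracted text into sections by bookmark title. */
class BookmarkSlicer {

    /** A bookmark reduced to its title and nesting depth (0 = top level). */
    static final class Mark {
        final String title;
        final int depth;

        Mark(String title, int depth) {
            this.title = title;
            this.depth = depth;
        }
    }

    /** Returns title -> section text, in bookmark order. */
    static Map<String, String> slice(String text, List<Mark> marks) {
        int n = marks.size();
        int[] start = new int[n];
        int from = 0;
        // Locate each title, searching forward from the end of the previous hit.
        // Assumption: titles occur verbatim and in document order.
        for (int i = 0; i < n; i++) {
            String title = marks.get(i).title;
            int at = text.indexOf(title, from);
            if (at < 0) {
                throw new IllegalArgumentException("Title not found in text: " + title);
            }
            start[i] = at;
            from = at + title.length();
        }

        Map<String, String> sections = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            // The section ends at the next bookmark that is not nested deeper,
            // or at the end of the text if there is none.
            int end = text.length();
            for (int j = i + 1; j < n; j++) {
                if (marks.get(j).depth <= marks.get(i).depth) {
                    end = start[j];
                    break;
                }
            }
            int bodyStart = start[i] + marks.get(i).title.length();
            sections.put(marks.get(i).title, text.substring(bodyStart, end).trim());
        }
        return sections;
    }
}
```

Each map entry could then be written to its own .txt file. Matching on title
text is only a stand-in for a real end marker: a bookmark destination gives
at most a page and a position, so this breaks as soon as a title is repeated
in the body text or is not rendered exactly as in the outline.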
2011/11/10 Leonard Rosenthol <lrose...@adobe.com>
> 1) Bookmarks in a PDF do NOT HAVE TO be associated with content – they
> can refer to any Action as well.
>
> 2) Bookmarks that do point to content, normally only point to the START of
> the content – there is no END.
>
> 3) UNLESS the PDF is tagged and structured properly, and the bookmarks are
> connected to that structure. IN THAT ONE CASE, then you have the actual
> content per bookmark.
>
> Unfortunately, the number of PDFs that meet all three criteria is almost 0
> :(.
>
> Leonard
>
> From: david guede <guede.da...@gmail.com>
> Reply-To: Post here <itext-questions@lists.sourceforge.net>
> Date: Wed, 9 Nov 2011 04:16:26 -0800
> To: Post here <itext-questions@lists.sourceforge.net>
> Subject: [iText-questions] question about extracting text with itext
>
> Hi everybody,
>
> I have never used iText and need to manipulate content from PDF files.
> Basically, I have one PDF file called sample.pdf.
> In this file, I have some bookmarks (corresponding to the sections of
> my file). Here is an example:
>
> Sample.pdf
>
> SAMPLE
> 1 Introduction
> 1.1 IntroductionPart1
> blablablablablablablabla
> 1.2 IntroductionPart2
> blibliblibliblibliblibliblibliblibliblibi
> 2 Section1
> 2.1 Section1Part1
> blublublublublublublublbu
> 2.2 Section1Part2
> blobloblobloblo
>
> Of course, this is a very simplified example of the PDF file I need to work
> with...
>
> My question is very simple: is there any way to extract the text for each
> bookmark to a .txt file? With my example, I would like to
> create 6 .txt files:
>
> Introduction.txt
> "1.1 IntroductionPart1
> blablablablablablablabla
> 1.2 IntroductionPart2
> blibliblibliblibliblibliblibliblibliblibi"
>
> IntroductionPart1.txt
> " blablablablablablablabla"
>
> IntroductionPart2.txt
> " blibliblibliblibliblibliblibliblibliblibi"
>
>
> Section1.txt
> " 2.1 Section1Part1
> blublublublublublublublbu
> 2.2 Section1Part2
> blobloblobloblo"
>
> Section1Part1.txt
> "blublublublublublublublbu"
>
> Section1Part2.txt
> "blobloblobloblo"
>
> If so, could you please send me an example or tell me where I could find
> one...
>
> Thanks in advance
>
> David
>
>
> ------------------------------------------------------------------------------
> RSA(R) Conference 2012
> Save $700 by Nov 18
> Register now
> http://p.sf.net/sfu/rsa-sfdev2dev1
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> iText(R) is a registered trademark of 1T3XT BVBA.
> Many questions posted to this list can (and will) be answered with a
> reference to the iText book: http://www.itextpdf.com/book/
> Please check the keywords list before you ask for examples:
> http://itextpdf.com/themes/keywords.php
>