I tried your experiment with the Split tool in the iText Toolbox, splitting once at page 2, then splitting the resulting two-page second part again at page 2. Visual examination of the resulting pages suggests there would be no problem with text extraction, using my own primitive text extraction tool.
Please try using PDFBox to extract the text from the attached two-page file and see what happens. If all of the text is extracted, I think we've isolated your problem to PDFBox, rather than its being an iText problem.

Best regards,
Bill Segraves

----- Original Message ----
From: rosette <rose...@arx.com>
To: itext-questions@lists.sourceforge.net
Sent: Sunday, October 4, 2009 2:21:41 AM
Subject: Re: [iText-questions] Splitting by Itext

Hi all,

In the attachment you can find the following documents:

1. The full document (test_split1.pdf), 3 pages. test_split1.pdf was split into two documents by iText: Doc_0.PDF (1 page, also in the attachment) and another document that contains the remaining 2 pages. I can extract the text from Doc_0.PDF by iText.

2. I now take the file that contains the last 2 pages and split it again by iText. This is where it fails. You can see the file in Doc_1.PDF. PDFBox says that the extraction succeeded, but it returns an empty string!

Any help will be appreciated.

Thanks,
Rosette

rosette wrote:
>
> Hi,
>
> I have a PDF file that I'm splitting with iText, and it works well!
> My problem begins when I want to extract text from the PDF that was
> created by iText.
>
> If I have a PDF document with 10 pages and I split this document into 2
> documents,
>
> the first with pages 1-3 and the second with pages 4-10, I'm able to
> extract the text from both documents.
> But if I take the first or the second document, split it again, and
> then try to extract the text, it fails.
> Since iText can't extract the text, I investigated the problem with a couple
> of APIs, and it seems that the core of the problem is splitting with iText
> files that were themselves produced by an iText split.
>
> Please let me know if you have any idea how to solve this problem.
>
> Rosette
>

http://www.nabble.com/file/p25734860/test_split1.pdf test_split1.pdf
http://www.nabble.com/file/p25734860/Doc_0.PDF Doc_0.PDF
http://www.nabble.com/file/p25734860/Doc_1.PDF Doc_1.PDF

--
View this message in context: http://www.nabble.com/Splitting-by-Itext-tp25695119p25734860.html
Sent from the iText - General mailing list archive at Nabble.com.

------------------------------------------------------------------------------
Come build with us! The BlackBerry® Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9-12, 2009. Register now! http://p.sf.net/sfu/devconf
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

Buy the iText book: http://www.1t3xt.com/docs/book.php
Check the site with examples before you ask questions: http://www.1t3xt.info/examples/
You can also search the keywords list: http://1t3xt.info/tutorials/keywords/
test_split3.pdf
Description: Adobe PDF document
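
For readers retracing the experiment, the page bookkeeping of the two successive splits can be sketched as follows. This is only an illustration of which original pages end up in which part, assuming that "split at page N" means the second part starts at page N (consistent with the 3-page test_split1.pdf yielding a one-page part and a two-page part). The helper `split_at` is hypothetical; it is not part of iText or PDFBox and does not read or write any PDF file.

```python
def split_at(pages, n):
    """Split a list of page labels so that the second part starts at
    1-based position n (assumed meaning of "split at page n")."""
    if not 1 < n <= len(pages):
        raise ValueError("split point must leave both parts non-empty")
    return pages[:n - 1], pages[n - 1:]


# Original page numbers of the 3-page test_split1.pdf.
original = [1, 2, 3]

# First split at page 2: a one-page part and a two-page part.
first_part, remainder = split_at(original, 2)

# Second split, applied to the two-page remainder, again at page 2.
second_part, third_part = split_at(remainder, 2)

print(first_part, remainder, second_part, third_part)
```

Tracking original page numbers this way makes it easier to say exactly which output file a failing text extraction came from.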