> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of sjf
> Sent: Wednesday, June 21, 2006 2:17 AM
> To: [email protected]
> Subject: [iText-questions] getPageContent Bug ?
>
> > > I download the latest itext and itextsharp and find a bug. If I burst
> > > a PDF file into pages and merge them into one PDF file again using
> > > pdfsam (http://sourceforge.net/projects/pdfsam), getPageContent will
> > > not return the correct content of the remerged PDF file, while
> > > ExtractText from PDFbox (http://www.pdfbox.org) can
> > > extract all the text correctly from the same PDF file.
> >
> > It's not a bug.
> > You are mixing two different concepts.
> > 1. you DO get the correct content of the remerged PDF,
> >    but it's different from the content of the original PDF.
> >    In the merged PDF the content is added as a PDF Form XObject.
> > 2. The text extracted with PDFBox is the text that is in the
> >    Form XObject. PDFBox parses the page content and discovers
> >    that the real content is in a different object. It gets
> >    that object to retrieve the text.
>
> Is there any examples about how to get the text in the Form XObject?
>
> A New Bug:
>
> When I use itextsharp to getPageContent, I got an Exception:
>
> System.IO.EndOfStreamException: Trying to read content after the end of the stream
>    iTextSharp.text.pdf.RandomAccessFileOrArray.ReadFully(Byte[] b, Int32 off, Int32 len)
>    iTextSharp.text.pdf.PdfReader.GetStreamBytesRaw(PRStream stream, RandomAccessFileOrArray file)
>    iTextSharp.text.pdf.PdfReader.GetStreamBytes(PRStream stream, RandomAccessFileOrArray file)
>    iTextSharp.text.pdf.PdfReader.GetPageContent(Int32 pageNum, RandomAccessFileOrArray file)
>
> But there IS a picture (and nothing else) in the page and
> GetImportedPage runs well.
>

Post the PDF and a standalone example with the error.

Paulo

> Thanks,
> sjf
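
To illustrate points 1 and 2 from the quoted reply: when a page's content has been wrapped in a Form XObject, the page content stream itself usually holds little more than a `Do` operator that paints the named XObject. The sketch below is a rough, library-free illustration and does not use iText's own API. The class name `FormXObjectRefs`, the method `findInvokedXObjects` and the sample content string are hypothetical, and the tokenizer assumes whitespace-separated operators, which a real content stream does not guarantee.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of iText or PDFBox).
class FormXObjectRefs {

    // Returns the names of all XObjects painted by "Do" operators in an
    // already-decoded content stream. Simplification: tokens are assumed to
    // be separated by whitespace; string literals and inline images are not
    // handled.
    static List<String> findInvokedXObjects(String content) {
        List<String> names = new ArrayList<>();
        String[] tokens = content.trim().split("\\s+");
        for (int i = 1; i < tokens.length; i++) {
            // A name operand such as /Xf1 directly precedes the Do operator.
            if (tokens[i].equals("Do") && tokens[i - 1].startsWith("/")) {
                names.add(tokens[i - 1].substring(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed example of what a merged page's content might look like.
        String pageContent = "q 1 0 0 1 0 0 cm /Xf1 Do Q";
        System.out.println(findInvokedXObjects(pageContent));
    }
}
```

Each name found this way would then have to be looked up in the /XObject entry of the page's /Resources dictionary. The stream stored there is where the text operators actually live.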

_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions
