Re: [iText-questions] Help with parsing the PDF generated by Crystal reports-V9

Paulo Soares Wed, 29 Oct 2008 10:32:02 -0700

See DocumentFont.fillMetrics() for some ideas.

Paulo
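
A rough illustration of the CMap issue discussed in this thread. When a font uses a custom encoding, the strings in the page content stream are character codes, not readable text, so decoding the bytes as ASCII or UTF-8 gives nothing useful. The font's /ToUnicode CMap, when present, maps those codes back to Unicode. The sketch below is independent of iText and is not a copy of what DocumentFont.fillMetrics() does. The class name, the CMap fragment and the hex string are made up for illustration, only bfchar sections are handled (bfrange is ignored), and two-byte codes are assumed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToUnicodeSketch {

    // Matches one "<srcCode> <dstUnicode>" pair inside a bfchar section.
    private static final Pattern BFCHAR_PAIR =
            Pattern.compile("<([0-9A-Fa-f]+)>\\s*<([0-9A-Fa-f]+)>");

    /** Collects code -> Unicode mappings from the beginbfchar sections of a CMap. */
    public static Map<Integer, String> parseBfChars(String cmap) {
        Map<Integer, String> map = new HashMap<>();
        int pos = 0;
        while (true) {
            int start = cmap.indexOf("beginbfchar", pos);
            if (start < 0) break;
            int end = cmap.indexOf("endbfchar", start);
            if (end < 0) break;
            Matcher m = BFCHAR_PAIR.matcher(cmap.substring(start, end));
            while (m.find()) {
                int code = Integer.parseInt(m.group(1), 16);
                map.put(code, utf16BeHexToString(m.group(2)));
            }
            pos = end + "endbfchar".length();
        }
        return map;
    }

    /** Converts UTF-16BE hex digits (4 per code unit) to a Java string. */
    public static String utf16BeHexToString(String hex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 4 <= hex.length(); i += 4) {
            sb.append((char) Integer.parseInt(hex.substring(i, i + 4), 16));
        }
        return sb.toString();
    }

    /** Decodes a hex string operand such as "00010002", assuming two-byte codes. */
    public static String decode(String hexOperand, Map<Integer, String> toUnicode) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 4 <= hexOperand.length(); i += 4) {
            int code = Integer.parseInt(hexOperand.substring(i, i + 4), 16);
            // Codes without a mapping become U+FFFD (replacement character).
            sb.append(toUnicode.getOrDefault(code, "\uFFFD"));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical ToUnicode fragment: code 0001 -> "H", code 0002 -> "i".
        String cmap = "2 beginbfchar\n<0001> <0048>\n<0002> <0069>\nendbfchar";
        Map<Integer, String> toUnicode = parseBfChars(cmap);
        // A content stream would carry this as: <00010002> Tj
        System.out.println(decode("00010002", toUnicode)); // prints "Hi"
    }
}
```

In a real document the CMap text would come from the /ToUnicode stream of each font referenced by the page, and the current font changes with every Tf operator, so the mapping has to be looked up per font rather than once per page.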


> -----Original Message-----
> From: Kevin Day [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, October 29, 2008 5:07 PM
> To: itext-questions-request@
> Subject: Re: [iText-questions] Help with parsing the PDF 
> generated by Crystal reports-V9
> 
> Paulo-
> 
> Are there any examples or documentation that show how to use 
> iText to work with CMaps (esp when parsing PDF content)?
> 
> I have code that does the required spatial analysis, etc.
> I wound up abandoning the project when I hit the CMap 
> situation and started using PDFBox for parsing PDF text 
> content - but that library does not handle cross reference 
> streams, so I'm faced with either adding cross reference 
> stream support to PDFBox, or re-visiting the text parsing in 
> iText.  Personally, I'd prefer to do the work to get the text 
> parsing working in iText...
> 
> Thanks,
> 
> - K
> 
> 
> ----------------------- Original Message -----------------------
>  
> From: Paulo Soares <[EMAIL PROTECTED]>
> To: Post all your questions about iText here 
> <itext-questions@lists.sourceforge.net>
> Cc: 
> Date: Wed, 29 Oct 2008 16:36:36 +0000
> Subject: Re: [iText-questions] Help with parsing the PDF 
> generated by Crystal reports-V9
>  
> Text extraction in PDF is rocket science. Your simplistic 
> method may or may not work, depending on the encoding used. Your 
> workflow is flawed; you'd better look for other ways to do 
> it, ways that don't involve looking for text in the PDF.
> 
> Paulo 
> 
> > -----Original Message-----
> > From: Umashankar Palani [mailto:[EMAIL PROTECTED] 
> > Sent: Wednesday, October 29, 2008 3:46 PM
> > To: itext-questions@lists.sourceforge.net
> > Subject: [iText-questions] Help with parsing the PDF 
> > generated by Crystal reports-V9
> > 
> > Hi,
> > 
> > I am trying to parse the contents of the PDF with iTextSharp using:
> > 
> > PdfReader reader = new PdfReader("Test.pdf");
> > // Content stream bytes of the page (operators and operands, not plain text)
> > byte[] pageContentByteArray = reader.GetPageContent(pageNumber);
> > 
> > I convert this byte array to a string and search it for particular 
> > text, based on a delimiter pattern:
> > 
> > string test = Encoding.ASCII.GetString(pageContentByteArray);
> > 
> > I am able to match the required text pattern inside the 
> > string generated using the above statement. The above logic 
> > works absolutely fine if we use a normal PDF input file.
> > 
> > My requirement is to read a PDF file which is created by 
> > CRYSTAL REPORTS (Version-9).
> > 
> > I have the byte array for the page, and I tried converting it to a 
> > string using ASCII, Unicode, UTF8 and UnicodeBig:
> > 
> >             string test = Encoding.ASCII.GetString(invoicePageContentByteArray);
> >             string test = Encoding.Unicode.GetString(invoicePageContentByteArray);
> >             string test = Encoding.UTF8.GetString(invoicePageContentByteArray);
> >                         ..... also using UnicodeBig
> >  
> > The output is not readable. I could not find any of the text 
> > that appears on the page in the output string. I guess the PDF 
> > generated by Crystal Reports uses some other encoding.
> > 
> > (Note: We verified the template used by Crystal Reports to 
> > generate the PDF. The search delimiter pattern is defined as 
> > a Text object.)
> > 
> > There should be some way of doing this. I am not sure what 
> > I am missing here. Can anyone please suggest ideas to 
> > resolve this problem?
> > 
> > -- 
> > Regards,
> > Uma




