Many thanks for your reply, Cameron.

Slight error in my post - I meant to say non-1252.

Anyway, Paulo's answer solves the problem.
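
For anyone who finds this thread later: Paulo's reply is not quoted in this message, so what follows is only a rough sketch of the underlying idea as I understand it, not his exact answer. A PDF text string is stored either as UTF-16BE with a leading FE FF byte order mark or in PDFDocEncoding, so the raw bytes have to be decoded accordingly rather than round-tripped through ToString(). The class and method names below are hypothetical and not part of iTextSharp, and the Latin-1 fallback is an assumption that only approximates PDFDocEncoding.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of iTextSharp.
static class PdfTextStringSketch
{
    // Decodes the raw bytes of a PDF text string.
    public static string Decode(byte[] raw)
    {
        if (raw == null) throw new ArgumentNullException("raw");

        // A leading FE FF byte order mark signals UTF-16BE.
        if (raw.Length >= 2 && raw[0] == 0xFE && raw[1] == 0xFF)
        {
            return Encoding.BigEndianUnicode.GetString(raw, 2, raw.Length - 2);
        }

        // Assumption: treat everything else as Latin-1, which matches
        // PDFDocEncoding for ASCII and most accented Western characters
        // but not for every byte value.
        return Encoding.GetEncoding("iso-8859-1").GetString(raw);
    }

    static void Main()
    {
        // U+005A U+0142 stored as UTF-16BE with a byte order mark.
        byte[] utf16 = { 0xFE, 0xFF, 0x00, 0x5A, 0x01, 0x42 };
        Console.WriteLine(Decode(utf16) == "Z\u0142");

        // "caf" followed by U+00E9, stored as single bytes.
        byte[] singleByte = { 0x63, 0x61, 0x66, 0xE9 };
        Console.WriteLine(Decode(singleByte) == "caf\u00E9");
    }
}
```

As far as I know, iTextSharp's PdfString already offers ToUnicodeString() for this kind of decoding, so in the loop from my original post the call would be content.ToUnicodeString() instead of content.ToString().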

Thanks again.
  
Kind regards
William
William Bell
E: [email protected]

From: Cameron Conacher <[email protected]>
>To: "[email protected]" 
><[email protected]> 
>Sent: Tuesday, 10 January 2012, 2:56
>Subject: Re: [iText-questions] iText-questions Digest, Vol 68, Issue 13
>
>
>Hello,
>I am not sure if this helps or not, but CodePage 1252 is not Unicode.
>I believe that UTF-8 encoded Unicode data uses CodePage 1208 (IBM's numbering; on Windows it is code page 65001).
>And, UTF-16 Little Endian is CodePage 1200. UTF-16 Big Endian is CodePage 1201.
> 
>Sorry I can't really help much more than that, but perhaps your data is 
>either not Unicode or not CodePage 1252, and when interpreting and 
>transforming it, it becomes scrambled?
> 
>
>
>From: "[email protected]" 
><[email protected]>
>To: [email protected] 
>Sent: Monday, January 9, 2012 5:10:22 PM
>Subject: iText-questions Digest, Vol 68, Issue 13
>
>Today's Topics:
>
>  1. Reading annotations containing Unicode characters (William Bell)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Mon, 9 Jan 2012 22:10:09 -0000
>From: "William Bell" <[email protected]>
>Subject: [iText-questions] Reading annotations containing Unicode
>    characters
>To: <[email protected]>
>Message-ID: <[email protected]>
>Content-Type: text/plain; charset="us-ascii"
>
>Good evening,
>
>I am trying to extract the annotations in a PDF file.  This is
>straightforward:
>
>reader = new PdfReader(pdfFile.fullname);
>for (int n = 1; n <= reader.NumberOfPages; n++) {
>  PdfDictionary page = reader.GetPageN(n);
>  PdfArray annotsArray = page.GetAsArray(PdfName.ANNOTS);
>  if (annotsArray != null) {
>    for (int k = 0; k < annotsArray.Size; k++) {
>      PdfDictionary annot = (PdfDictionary)PdfReader.GetPdfObject(annotsArray[k]);
>      PdfString content = (PdfString)PdfReader.GetPdfObject(annot.Get(PdfName.CONTENTS));
>      if (content != null) {
>        System.Windows.Forms.MessageBox.Show(content.ToString());
>      }
>    }
>  }
>}
>
>However, if the annotation contains Unicode (more specifically, the 1252 code page),
>the annotation is not read correctly.
>
>I tried modifying the above code as follows:
>
>if (content != null) {
>  byte[] byteArray = Encoding.Unicode.GetBytes(((PdfString)PdfReader.GetPdfObject(annot.Get(PdfName.CONTENTS))).ToString());
>  string s = Encoding.Unicode.GetString(byteArray);
>  System.Windows.Forms.MessageBox.Show(s);
>}
>
>Unfortunately, this does not resolve the issue.
>
>I have attached a sample file with the troublesome annotation.
>
>I was wondering if someone could point me in the right direction.
>
>Thanks.
>
>William Bell
>
>-------------- next part --------------
>An HTML attachment was scrubbed...
>-------------- next part --------------
>A non-text attachment was scrubbed...
>Name: mypd1f.pdf
>Type: application/pdf
>Size: 30244 bytes
>Desc: not available
>
>------------------------------
>
>
>_______________________________________________
>iText-questions mailing list
>[email protected]
>https://lists.sourceforge.net/lists/listinfo/itext-questions
>
>iText(R) is a registered trademark of 1T3XT BVBA
>
>End of iText-questions Digest, Vol 68, Issue 13
>***********************************************
>
>
>
>_______________________________________________
>iText-questions mailing list
>[email protected]
>https://lists.sourceforge.net/lists/listinfo/itext-questions
>
>iText(R) is a registered trademark of 1T3XT BVBA.
>Many questions posted to this list can (and will) be answered with a reference 
>to the iText book: http://www.itextpdf.com/book/
>Please check the keywords list before you ask for examples: 
>http://itextpdf.com/themes/keywords.php
>
>