Many thanks for your reply Cameron.
Slight error in my post - I meant to say non-1252.
Anyway, Paulo's answer solves the problem.
Thanks again.
Kind regards
William
William Bell
E: [email protected]
From: Cameron Conacher <[email protected]>
>To: "[email protected]"
><[email protected]>
>Sent: Tuesday, 10 January 2012, 2:56
>Subject: Re: [iText-questions] iText-questions Digest, Vol 68, Issue 13
>
>
>Hello,
>I am not sure if this helps or not, but CodePage 1252 is not Unicode.
>I believe that UTF-8 encoded Unicode data uses CodePage 1208.
>And, UTF-16 Little Endian is CodePage 1200. UTF-16 Big Endian is CodePage 1201.
>
>Sorry I can't really help much more than that, but perhaps your data is
>either not Unicode or not CodePage 1252, and it becomes scrambled when it
>is interpreted and transformed?
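To make the scrambling described above concrete, here is a minimal C# sketch (not from the thread). It encodes the same text as UTF-8 and UTF-16, then decodes the UTF-8 bytes with a single-byte code page. Assumptions: the class and method names are hypothetical, and ISO-8859-1 stands in for code page 1252 because it is available on every .NET runtime without registering an extra encoding provider; for the two bytes involved here (0xC3, 0xA9) both code pages map to the same characters. In .NET itself the code page identifiers are 1200 (UTF-16LE), 1201 (UTF-16BE) and 65001 (UTF-8).

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes text as UTF-8, then (deliberately wrongly) decodes those bytes
    // as ISO-8859-1, a single-byte code page used here in place of 1252.
    public static string DecodeUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        return latin1.GetString(utf8Bytes);
    }

    public static void Main()
    {
        string original = "caf\u00E9"; // "cafe" with an acute accent on the e

        // Same text, three different byte sequences.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(original)));
        // 63-61-66-C3-A9
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes(original)));
        // 63-00-61-00-66-00-E9-00
        Console.WriteLine(BitConverter.ToString(Encoding.BigEndianUnicode.GetBytes(original)));
        // 00-63-00-61-00-66-00-E9

        // Decoding with the wrong encoding turns one character into two.
        string scrambled = DecodeUtf8AsLatin1(original);
        Console.WriteLine(scrambled.Length); // 5 instead of 4
    }
}
```

The bytes never change in this example; only the encoding used to interpret them does, which is why the fix usually belongs at the point where bytes are first turned into a string.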
>
>
>
>From: "[email protected]"
><[email protected]>
>To: [email protected]
>Sent: Monday, January 9, 2012 5:10:22 PM
>Subject: iText-questions Digest, Vol 68, Issue 13
>
>
>Today's Topics:
>
> 1. Reading annotations containing Unicode characters (William Bell)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Mon, 9 Jan 2012 22:10:09 -0000
>From: "William Bell" <[email protected]>
>Subject: [iText-questions] Reading annotations containing Unicode
> characters
>To: <[email protected]>
>Message-ID: <[email protected]>
>Content-Type: text/plain; charset="us-ascii"
>
>Good evening,
>
>
>
>I am trying to extract the annotations in a PDF file. This is
>straightforward:
>
>
>
>reader = new PdfReader(pdfFile.fullname);
>for (int n = 1; n <= reader.NumberOfPages; n++) {
>    PdfDictionary page = reader.GetPageN(n);
>    PdfArray annotsArray = page.GetAsArray(PdfName.ANNOTS);
>    if (annotsArray != null) {
>        for (int k = 0; k < annotsArray.Size; k++) {
>            PdfDictionary annot = (PdfDictionary)PdfReader.GetPdfObject(annotsArray[k]);
>            PdfString content = (PdfString)PdfReader.GetPdfObject(annot.Get(PdfName.CONTENTS));
>            if (content != null) {
>                System.Windows.Forms.MessageBox.Show(content.ToString());
>            }
>        }
>    }
>}
>
>
>
>
>
>However, if the annotation contains Unicode (more specifically, 1252 code
>page) characters, the annotation is not read correctly.
>
>
>
>I tried modifying the above code as follows:
>
>
>
>if (content != null) {
>    byte[] byteArray = Encoding.Unicode.GetBytes(((PdfString)PdfReader.GetPdfObject(annot.Get(PdfName.CONTENTS))).ToString());
>    string s = Encoding.Unicode.GetString(byteArray);
>    System.Windows.Forms.MessageBox.Show(s);
>}
>
>
>
>
>
>Unfortunately, this does not resolve the issue.
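A short sketch (not from the thread) of why the attempt above cannot help: encoding a string with Encoding.Unicode and decoding it again with the same encoding returns the same string, so any damage done when the bytes were first turned into characters is still there. The PDF specification stores a text string either in PDFDocEncoding or as UTF-16BE preceded by the byte order mark FE FF, so the decision has to be made on the raw bytes. Assumptions: the class and the DecodeTextString helper are hypothetical and are not iText APIs, and ISO-8859-1 is used as a rough stand-in for PDFDocEncoding, which differs from it for some byte values. iTextSharp's PdfString also has a ToUnicodeString() method that may be worth trying; the thread does not say whether that is what Paulo suggested.

```csharp
using System;
using System.Text;

public static class PdfTextStringDemo
{
    // Encode and decode with the same encoding: the result equals the input.
    public static string RoundTrip(string s)
    {
        return Encoding.Unicode.GetString(Encoding.Unicode.GetBytes(s));
    }

    // Hypothetical helper (not an iText API): decodes the raw bytes of a
    // PDF text string by checking for the UTF-16BE byte order mark.
    public static string DecodeTextString(byte[] raw)
    {
        if (raw.Length >= 2 && raw[0] == 0xFE && raw[1] == 0xFF)
        {
            // Skip the two BOM bytes and decode the rest as UTF-16 big endian.
            return Encoding.BigEndianUnicode.GetString(raw, 2, raw.Length - 2);
        }
        // Assumption: ISO-8859-1 approximates PDFDocEncoding for this sketch.
        return Encoding.GetEncoding("iso-8859-1").GetString(raw);
    }

    public static void Main()
    {
        // BOM followed by U+0416 (a Cyrillic letter outside code page 1252).
        byte[] raw = { 0xFE, 0xFF, 0x04, 0x16 };

        // One char per byte, as a naive byte-to-char conversion would produce.
        string bytePerChar = Encoding.GetEncoding("iso-8859-1").GetString(raw);

        // The round trip changes nothing, so the text stays scrambled.
        Console.WriteLine(RoundTrip(bytePerChar) == bytePerChar); // True

        // Decoding from the raw bytes recovers the single intended character.
        Console.WriteLine(DecodeTextString(raw) == "\u0416"); // True
    }
}
```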
>
>
>
>I have attached a sample file with the troublesome annotation.
>
>
>
>I was wondering if someone could point me in the right direction.
>
>
>
>Thanks.
>
>
>
>William Bell
>
>-------------- next part --------------
>An HTML attachment was scrubbed...
>-------------- next part --------------
>A non-text attachment was scrubbed...
>Name: mypd1f.pdf
>Type: application/pdf
>Size: 30244 bytes
>Desc: not available
>
>
>_______________________________________________
>iText-questions mailing list
>[email protected]
>https://lists.sourceforge.net/lists/listinfo/itext-questions
>
>iText(R) is a registered trademark of 1T3XT BVBA
>
>End of iText-questions Digest, Vol 68, Issue 13
>***********************************************
>
>
>
>Many questions posted to this list can (and will) be answered with a reference
>to the iText book: http://www.itextpdf.com/book/
>Please check the keywords list before you ask for examples:
>http://itextpdf.com/themes/keywords.php
>