We read in the Word document and store it in our internal format:
paragraphs with properties, and within each paragraph a sequence of
character formatting and text strings. If the document is WordML or
DOCX, the text in the file is already Unicode. If it's RTF, the file
specifies a codepage, and we use that to convert the text to Unicode
(and then throw away the codepage info).

As DOCX is the future, we have to handle the case where we start with
Unicode and are given no codepage.
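
To make the above concrete, here is a rough sketch of that kind of model
in Java. All class, field and method names are made up for illustration
and are not our actual code, and mapping an RTF codepage number N to the
Java charset "windows-N" is an assumption that only holds for the common
Windows codepages.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; an illustration, not the real implementation.
final class InternalFormatSketch {

    /** A stretch of text that shares one set of character formatting. */
    static final class Run {
        final String fontName; // e.g. "Verdana", as selected in Word
        final String text;     // always Unicode; no codepage is kept

        Run(String fontName, String text) {
            this.fontName = fontName;
            this.text = text;
        }
    }

    /** A paragraph: its properties plus a sequence of runs. */
    static final class Paragraph {
        final String alignment; // stand-in for the paragraph properties
        final List<Run> runs = new ArrayList<>();

        Paragraph(String alignment) {
            this.alignment = alignment;
        }
    }

    /** RTF path: decode codepage bytes to Unicode; the codepage is then dropped. */
    static String decodeRtfBytes(byte[] raw, int windowsCodepage) {
        // Assumption: codepage N corresponds to the Java charset "windows-N".
        return new String(raw, Charset.forName("windows-" + windowsCodepage));
    }

    public static void main(String[] args) {
        Paragraph p = new Paragraph("left");

        // RTF path: bytes 0xC4 0xE0 in codepage 1251 are two Cyrillic letters.
        byte[] cyrillic = {(byte) 0xC4, (byte) 0xE0};
        p.runs.add(new Run("Verdana", decodeRtfBytes(cyrillic, 1251)));

        // DOCX/WordML path: the text is already Unicode (a Polish word,
        // written with escapes so the source file stays ASCII).
        p.runs.add(new Run("Verdana", "\u0141\u00f3d\u017a"));

        for (Run r : p.runs) {
            System.out.println(r.fontName + ": " + r.text);
        }
    }
}
```

Once both runs are in the model they look identical: same font name,
Unicode text, and nothing that records which codepage (if any) the text
came from.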

Thanks - dave


-----Original Message-----
From: 1T3XT info [mailto:[email protected]] 
Sent: Monday, December 22, 2008 12:59 PM
To: Post all your questions about iText here
Subject: Re: [iText-questions] Code page or unicode

David Thielen wrote:
> We build our documents from a Word document. So the customer has already
> selected the fonts in Word - and we must use those fonts. So I don't
> think that approach will work. And if it's Russian & Polish using
> Verdana, don't we then have to get 2 different Verdana fonts, one for
> each code page?

If you leave the path of using Unicode, you have to take codepages into 
account, yes. It's not an obvious question. Are you working with an 
intermediate format? Do you have the encodings in Word?
-- 
This answer is provided by 1T3XT BVBA
http://www.1t3xt.com/ - http://www.1t3xt.info

------------------------------------------------------------------------------
_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions

Buy the iText book: http://www.1t3xt.com/docs/book.php