Re: [iText-questions] charset problems

Felipe Gaúcho Mon, 06 Aug 2007 04:39:34 -0700

from the old article:

Unicode was a brave effort to create a single character set that
included every reasonable writing system on the planet and some
make-believe ones like Klingon, too. Some people are under the
misconception that Unicode is simply a 16-bit code where each
character takes 16 bits and therefore there are 65,536 possible
characters. This is not, actually, correct. It is the single most
common myth about Unicode, so if you thought that, don't feel bad.
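
A quick way to see the point of that excerpt in Java. This is only a minimal sketch, and the class and method names are made up for illustration. A character outside the 16-bit range occupies two char units in a String but still counts as one code point:

```java
class CodePointDemo {
    // Number of 16-bit char units the text occupies in a Java String.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points (actual characters) in the text.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF lies outside the 16-bit range.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(utf16Units(clef)); // 2
        System.out.println(codePoints(clef)); // 1
    }
}
```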


On 8/6/07, Felipe Gaúcho <[EMAIL PROTECTED]> wrote:
> because Unicode is wrong :(
>
> check it out: http://www.joelonsoftware.com/articles/Unicode.html
>
> section Unicode...
>
> Actually I am not worried about that. I am just having problems with
> char encoding and I was looking for a library solution. I can work
> around it by using a String as an intermediate store for the chars and
> handling the encoding there, but it would be much easier and sounder if
> the library supported it for me :)
>
> On 8/6/07, Paulo Soares <[EMAIL PROTECTED]> wrote:
> > I'm sorry but I've no idea what you are talking about. The OutputStream in
> > the PdfStamper context produces a PDF that, as we all know, is binary and
> > has nothing to do with Unicode (the PDF as a whole). Why do you need a
> > Writer if there's no Unicode here?
> >
> > Paulo
> >
> > ----- Original Message -----
> > From: "Felipe Gaúcho" <[EMAIL PROTECTED]>
> > To: "Post all your questions about iText here"
> > <[email protected]>
> > Sent: Monday, August 06, 2007 12:11 PM
> > Subject: Re: [iText-questions] charset problems
> >
> >
> > > > inspecting the PdfStamper class, I notice that all the code relies on
> > > > the OutputStream class, which does not consider character encoding while
> > > > flushing its contents:
> > >
> > >            while ((n = file.read(buf)) > 0)
> > >                this.os.write(buf, 0, n); <-- output stream is just
> > > byte printer :(
> > >
> > > > if we can use Writers instead, we gain full control over which
> > > > encoding is used to write the characters:
> > >
> > >          writer = OutputStreamWriter(this.os, new Charset("UTF-8",
> > > new String("utf-8"));
> > >            while ((n = file.read(buf)) > 0)
> > >                writer.write(buf, 0, n); <-- now we can handle the
> > > correct encoding :)
> > >
> > > @see:
> > > http://java.sun.com/j2se/1.4.2/docs/api/java/io/OutputStreamWriter.html
> > >
> > > just a guess....
> > >
> > >
> > > On 8/6/07, Felipe Gaúcho <[EMAIL PROTECTED]> wrote:
> > >> I suppose it is Unicode.. my question is about a configuration to change
> > >> it...
> > >>
> > > >> Since iText offers me reader and writer classes, I was looking for
> > > >> some configuration somewhere :))
> > >>
> > >> On 8/6/07, Felipe Gaúcho <[EMAIL PROTECTED]> wrote:
> > > >> > What encoding does the PdfStamper use?
> > >> >
> > >> > On 8/6/07, Felipe Gaúcho <[EMAIL PROTECTED]> wrote:
> > >> > >                InputStream templateStream = ...
> > >> > >                PdfReader templateReader = new
> > >> > > PdfReader(templateStream);
> > >> > >
> > > >> > > What encoding is this PdfReader using?
> > > >> > >
> > > >> > > Can I configure it to use a charset, or must I create a String before
> > > >> > > using the PdfReader?
> > >> > >
> > >> > > The other option is to use the toString method of the
> > >> > > ByteArrayOutputStream in order to set the encoding....
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > On 8/6/07, Paulo Soares <[EMAIL PROTECTED]> wrote:
> > >> > > >
> > >> > > > ----- Original Message -----
> > >> > > > From: "Felipe Gaúcho" <[EMAIL PROTECTED]>
> > >> > > > To: "Post all your questions about iText here"
> > >> > > > <[email protected]>
> > >> > > > Sent: Monday, August 06, 2007 11:35 AM
> > >> > > > Subject: Re: [iText-questions] charset problems
> > >> > > >
> > >> > > >
> > > >> > > > > depending on the charset of the input I get wrong output. How can I
> > > >> > > > > handle whether it is UTF-8 or ISO-...? Is there a configuration
> > > >> > > > > somewhere?
> > >> > > >
> > > >> > > > You are confusing Unicode with a Unicode representation. A Java
> > > >> > > > String is Unicode. UTF-8 is a Unicode representation. The Java String
> > > >> > > > class has constructors that convert from a Unicode byte representation
> > > >> > > > to the String itself. All this is beyond the scope of iText, but if you
> > > >> > > > are more specific we may be able to help you (posting the code is OK,
> > > >> > > > but you must point to the problematic line).
> > >> > > >
> > >> > > > Paulo
> >
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by: Splunk Inc.
> > Still grepping through log files to find problems?  Stop.
> > Now Search log events and configuration files using AJAX and a browser.
> > Download your FREE copy of Splunk now >>  http://get.splunk.com/
> > _______________________________________________
> > iText-questions mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/itext-questions
> > Buy the iText book: http://itext.ugent.be/itext-in-action/
> >
>
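
The two points Paulo makes in the quoted thread can be sketched in Java as follows. This is an illustration under assumptions, not iText code: the class and method names are hypothetical, the sample bytes are arbitrary, and StandardCharsets needs Java 7 or later (older JVMs would pass the charset name "UTF-8" as a String instead). Text bytes should be decoded with the charset they were written in, while binary data such as a PDF should never be pushed through a charset at all:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetDemo {
    // Decode text bytes using the charset they were actually written in.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decode the same bytes with a different charset (wrong for UTF-8 input).
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Treat arbitrary binary bytes as UTF-8 text and encode them again,
    // which is roughly what routing binary output through a Writer would do.
    static byte[] throughUtf8Text(byte[] binary) {
        return new String(binary, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Ga\u00facho" is written with an escape so the source file encoding does not matter.
        byte[] utf8 = "Ga\u00facho".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeUtf8(utf8).equals("Ga\u00facho"));   // true
        System.out.println(decodeLatin1(utf8).equals("Ga\u00facho")); // false

        // Arbitrary non-text bytes of the kind found in a binary file.
        byte[] binary = {(byte) 0x25, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        System.out.println(Arrays.equals(binary, throughUtf8Text(binary))); // false
    }
}
```

The last line prints false because the invalid UTF-8 sequences are replaced during decoding, so the bytes that come out no longer match the bytes that went in. That is why a plain OutputStream is the right tool for copying PDF content.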