Ok! Let me see if I can explain myself - I am not an expert on this so please correct me if I am wrong!
A UTF-8 representation of one character consists of a combination of several characters (bytes). Now Java is a Unicode language, and this means that one character can represent "any" type of character in the world! Basically, UTF-8 only makes sense when working on an "old" 7-bit ASCII system where you need to use characters not available in the given codepage. Both UTF-8 and UTF-16 use a varying number of bytes to represent one character, whereas Unicode always uses 32-bit characters (maybe it is 24-bit).

This was my understanding of the UTF standards and Unicode - am I wrong here?

/Jacob

-----Original Message-----
From: Michael Smith [mailto:[EMAIL PROTECTED]
Sent: 30 January 2004 01:44
To: Slide Users Mailing List
Subject: Re: TXFileStore and local filesystem

Oliver Zeigermann wrote:
> Jacob Lund wrote:
>> The correct solution might be to convert from UTF-8 to Unicode before
>> storing the data and then change the database scheme to Unicode char
>> in all fields containing strings.
>
> Hmmmm. You might be confusing certain things here. On one side there is
> Unicode having a number for each character. On the other side there is
> the representation in bytes. Now, UTF-8 *is* Unicode, but on the other
> side, i.e. the representation in bytes. Thus it does not make too much
> sense to compare Unicode with UTF-8. Do you agree?

A lot of Microsoft's documentation confusingly uses "unicode" when it
actually means "UTF-16" or "UCS-2" (I can never remember what the
difference between those two is, and I don't know if it matters). I
suspect rereading Jacob's mail mentally substituting "UTF-16" for
"unicode" will make it clearer.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
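
The byte counts discussed in the message above can be checked directly. The following is only a minimal sketch, not something from the thread: the class name `EncodingSizes`, the helper names and the sample characters are illustrative assumptions, and it assumes Java 7 or later for `StandardCharsets`. It prints how many bytes one code point takes in UTF-8 and in UTF-16, and how many Java `char` values it occupies.

```java
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as UTF-16.
    // UTF_16BE is used so that no byte-order mark is added to the count.
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Hypothetical samples: U+0041, U+00E6, U+20AC, and U+1D11E
        // (the last one lies outside the Basic Multilingual Plane).
        String[] samples = {
            "A",
            "\u00e6",
            "\u20ac",
            new String(Character.toChars(0x1D11E))
        };
        for (String s : samples) {
            System.out.printf(
                "U+%04X: %d UTF-8 byte(s), %d UTF-16 byte(s), %d Java char(s)%n",
                s.codePointAt(0), utf8Bytes(s), utf16Bytes(s), s.length());
        }
    }
}
```

Running it should show the UTF-8 length varying from one to four bytes, and the last sample taking two Java `char` values, which suggests that a Java `char` is a 16-bit UTF-16 code unit rather than a fixed 32-bit (or 24-bit) character.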
