Re: [poi] Problem with encoding

Christian Gosch Wed, 09 Nov 2005 22:58:48 -0800

However, there are other Windows localizations around the world that don't
use Cp1252. To take the extreme cases, look at the Asian versions supporting
Traditional Chinese, Japanese and the like. Russian, Korean and Hebrew
versions are sold too, and they all have completely different charsets. The
interesting thing is that I can view and edit files created there on other
localizations! So there *must* be *some* way to (a) encode the text and
(b) tell what the actual encoding is.
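
To illustrate the point, here is a minimal sketch in plain Java (no POI
involved) that checks whether a string fits into a given charset. The class
and method names are made up for this example. The use of UTF-16LE as the
fallback rests on my understanding that Excel's BIFF8 format can store
strings as 16-bit Unicode with a flag byte saying so; please treat that as an
assumption, not as a statement about what POI currently does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    /** Returns true if every character of text can be represented in the named charset. */
    public static boolean fits(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String western = "Gr\u00fc\u00dfe";                        // German greeting with u-umlaut and sharp s
        String russian = "\u041f\u0440\u0438\u0432\u0435\u0442";  // Russian greeting, six Cyrillic letters

        System.out.println(fits(western, "windows-1252")); // true
        System.out.println(fits(russian, "windows-1252")); // false: no Cyrillic in Cp1252
        System.out.println(fits(russian, "UTF-16LE"));     // true

        // A 16-bit encoding sidesteps the codepage question entirely:
        // six characters become twelve bytes.
        byte[] bytes = russian.getBytes(StandardCharsets.UTF_16LE);
        System.out.println(bytes.length);                  // 12
    }
}
```

A writer could use such a check to decide, string by string, whether an
8-bit representation is safe or whether it has to fall back to 16-bit text.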


Look at the implementation of JXL. It is not my work, but we used it a lot
before POI. The first versions had no notion of supporting non-US
characters, but after some weeks of discussion the developer of JXL got the
message and *did* implement support for different encodings. Since this
product is open source, it should be possible to look at how it was done.
Look for JExcelAPI (http://www.andykhan.com/jexcelapi/).

Please don't let all of us non-US developers in 'good old Europe' and
wherever else starve for lack of custom encoding support...

Regards
Christian Gosch
inovex GmbH


On Wednesday, November 09, 2005 11:07 PM [GMT+1=CET],
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Excel wants cp1252 for most things...  It just does.  When I get home
> (I'm on the road) I'll look at the dev kit...it may be that by
> changing the codepage record we can handle things a bit nicer, but
> it's kinda picky about that, and regardless of what AIX may support,
> when you open the Excel sheet it will generally be on Windows (or a
> semi-emulation of it on Mac/Linux) and you'll have to write it in an
> encoding supported by Excel for Windows...
>
> -Andy
>
> Rainer Klute wrote:
>
>> On Wednesday, 09.11.2005, at 07:25 -0500, [EMAIL PROTECTED] wrote:
>>
>>
>>> We should be universally handling the issues mentioned here:
>>> http://en.wikipedia.org/wiki/Windows-1252 by intercepting character
>>> differences and writing them out properly.  Thus HSSF should force
>>> 8859-1 encoding but should then kind of do a replace on the
>>>  characters. If someone wants to contribute I can point them in the
>>> right direction.
>>>
>>>
>>
>> Um, no. Enforcing ISO 8859-1 as the character encoding would be of
>> limited use only. The reason is that, like Windows codepage 1252, it
>> represents only a limited set of characters. UTF-8 is the preferred
>> character encoding. However, POI should not forbid creating strings
>> in other character encodings, be it ISO 8859-1, cp1252 or whatever.
>>
>> By the way, HPSF does a nice job of supporting a lot of different
>> character encodings. At least there are no problems I am aware of. I
>> suggest you have a look at it.
>>
>> Best regards
>> Rainer Klute
>>
>>                           Rainer Klute IT-Consulting GmbH
>>  Dipl.-Inform.
>>  Rainer Klute             E-Mail:  [EMAIL PROTECTED]
>>  Körner Grund 24          Phone:   +49 172 2324824
>> D-44143 Dortmund           Fax:     +49 231 5349423
>>
>> Public key fingerprint: E4E4386515EE0BED5C162FBB5343461584B5A42E
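
To make the Windows-1252 versus ISO 8859-1 point from the quoted discussion
concrete, here is a rough sketch in plain Java. It counts the bytes in the
range 0x80-0x9F that Cp1252 assigns to printable characters which ISO 8859-1
cannot represent, and shows the euro sign as one example. The class and
method names are invented for illustration; this is not HSSF code.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class Cp1252VsLatin1 {

    private static final Charset CP1252 = Charset.forName("windows-1252");
    private static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    /** Counts bytes 0x80..0x9F whose Cp1252 character has no ISO 8859-1 encoding. */
    public static int countCp1252Only() {
        CharsetEncoder latin1 = LATIN1.newEncoder();
        int count = 0;
        for (int b = 0x80; b <= 0x9F; b++) {
            char c = new String(new byte[] {(byte) b}, CP1252).charAt(0);
            // U+FFFD marks a byte that Cp1252 leaves undefined; skip those.
            if (c != '\uFFFD' && !latin1.canEncode(c)) {
                count++;
            }
        }
        return count;
    }

    /** Returns the single Cp1252 byte for the euro sign, as an unsigned value. */
    public static int euroByteInCp1252() {
        return "\u20ac".getBytes(CP1252)[0] & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(countCp1252Only());
        System.out.println(Integer.toHexString(euroByteInCp1252()));      // 80
        System.out.println(LATIN1.newEncoder().canEncode('\u20ac'));      // false
    }
}
```

So forcing ISO 8859-1 and then replacing characters would mean handling this
whole block by hand, and it still would not help with anything outside
Western European text.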

Regards,
-- 
Dipl.-Inform. Christian Gosch
Systems Development
inovex GmbH
Karlsruher Strasse 71
D-75179 Pforzheim
Tel.: +49 (0)72 31 - 31 91 - 85
Fax: +49 (0)72 31 - 31 91 - 91
mailto:[EMAIL PROTECTED]
http://www.inovex.de


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/
