However, there are other Windows localizations around the world which don't use Cp1252 -- to take the extreme cases, look at the Asian versions supporting Traditional Chinese, Japanese or the like. Russian, Korean and Hebrew versions are sold as well, and they all have completely different charsets -- and the interesting thing is that I can view and edit files created there on other localizations! So there *must* be *some* way to (a) encode it and (b) tell what the actual encoding is.
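To make points (a) and (b) concrete, here is a minimal sketch in plain Java. It uses only the standard library and is not taken from POI or JExcelAPI; the class and method names (EncodingDemo, fitsIn, encode) are made up for illustration. It checks whether a string fits into a given charset, and falls back to 16-bit Unicode when Cp1252 cannot represent it. The choice of UTF-16LE as the fallback is an assumption of this sketch, not a statement about what POI does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not POI or JExcelAPI code).
class EncodingDemo {

    /** True if every character of text can be represented in the named charset. */
    static boolean fitsIn(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    /**
     * Encodes text as Cp1252 when that loses nothing, otherwise as UTF-16LE.
     * A real file format would also have to record which of the two was used.
     */
    static byte[] encode(String text) {
        if (fitsIn(text, "windows-1252")) {
            return text.getBytes(Charset.forName("windows-1252"));
        }
        return text.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        String euro = "\u20ac";                 // the euro sign
        String japanese = "\u65e5\u672c\u8a9e"; // three Japanese characters

        // The euro sign exists in Cp1252 but not in ISO 8859-1.
        System.out.println("euro in windows-1252: " + fitsIn(euro, "windows-1252"));
        System.out.println("euro in ISO-8859-1:   " + fitsIn(euro, "ISO-8859-1"));

        // Japanese text does not fit into Cp1252 at all.
        System.out.println("japanese in windows-1252: " + fitsIn(japanese, "windows-1252"));

        // One byte per character in Cp1252, two per character in UTF-16LE.
        System.out.println("euro bytes:     " + encode(euro).length);
        System.out.println("japanese bytes: " + encode(japanese).length);
    }
}
```

The check-then-encode step is the part that matters: a writer that always forces one 8-bit codepage silently loses every character outside it.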
Look at the implementation of JXL. It is not my work, but we used it a lot before POI. The first versions had no notion of supporting non-US characters, but after some weeks of discussion the developer of JXL got the message and *did* implement support for different encodings. Since this product is open source, it should be possible. Look for JExcelAPI (http://www.andykhan.com/jexcelapi/).

Please don't let all of us non-US developers in 'good old Europe' and wherever else starve for lack of custom encoding support...

Regards
Christian Gosch
inovex GmbH

On Wednesday, November 09, 2005 11:07 PM [GMT+1=CET],
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Excel wants cp1252 for most things... It just does. When I get home
> (I'm on the road) I'll look at the dev kit...it may be that by
> changing the codepage record we can handle things a bit nicer, but
> eeez kinda picky about that and regardless of what AIX may support,
> when you open the Excel sheet it will be on Windows generally (or a
> semi-emulation of it on Mac/Linux) and you'll have to write it in an
> encoding supported by Excel for Windows...
>
> -Andy
>
> Rainer Klute wrote:
>
>> On Wednesday, 09.11.2005, 07:25 -0500, [EMAIL PROTECTED] wrote:
>>
>>> We should be universally handling the issues mentioned here:
>>> http://en.wikipedia.org/wiki/Windows-1252 by intercepting character
>>> differences and writing them out properly. Thus HSSF should force
>>> 8859-1 encoding but should then kind of do a replace on the
>>> characters. If someone wants to contribute I can point them in the
>>> right direction.
>>
>> Um, no. Enforcing ISO 8859-1 as character code would be of limited
>> use only. The reason is that like Windows Codepage 1252 it
>> represents only a limited set of characters. UTF-8 is the preferred
>> character encoding. However, POI should not forbid to create strings
>> in other character encodings, be it ISO 8859-1, cp1252 or whatever.
>>
>> By the way, HPSF does a nice job of supporting a lot of different
>> character encodings. At least there are no problems I am aware of. I
>> suggest you have a look at it.
>>
>> Best regards
>> Rainer Klute
>>
>> Rainer Klute IT-Consulting GmbH
>> Dipl.-Inform.
>> Rainer Klute             E-Mail:  [EMAIL PROTECTED]
>> Körner Grund 24          Telefon: +49 172 2324824
>> D-44143 Dortmund         Telefax: +49 231 5349423
>>
>> Public key fingerprint: E4E4386515EE0BED5C162FBB5343461584B5A42E

Regards,
-- 
Dipl.-Inform. Christian Gosch
Systems Development
inovex GmbH
Karlsruher Strasse 71
D-75179 Pforzheim
Tel.: +49 (0)72 31 - 31 91 - 85
Fax: +49 (0)72 31 - 31 91 - 91
mailto:[EMAIL PROTECTED]
http://www.inovex.de

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/
