HSSF should universally handle the issues described here:
http://en.wikipedia.org/wiki/Windows-1252 by intercepting the characters
that differ between the two encodings and writing them out properly. That
is, HSSF should force ISO-8859-1 encoding and then replace the characters
that differ.
If someone wants to contribute, I can point them in the right direction.
-andy
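
One possible reading of the "replace" step above, as a minimal sketch and
not what HSSF actually does: after decoding bytes as ISO-8859-1, any
character in the range U+0080 to U+009F is a C1 control code in ISO-8859-1
but usually a printable character in Windows-1252, so it can be swapped
through a lookup table. The class and method names (Cp1252Fixup, fix) are
hypothetical and not part of POI.

```java
// Hypothetical helper, not part of POI: remaps the 0x80-0x9F range, where
// ISO-8859-1 and Windows-1252 disagree, to the Windows-1252 characters.
final class Cp1252Fixup {
    // Windows-1252 characters for bytes 0x80..0x9F; 0 marks the five bytes
    // that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).
    private static final char[] C1_TO_CP1252 = {
        0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
        0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
    };

    // Takes text that was decoded as ISO-8859-1 and returns it with the
    // C1 range replaced by the corresponding Windows-1252 characters.
    static String fix(String latin1Decoded) {
        StringBuilder out = new StringBuilder(latin1Decoded.length());
        for (int i = 0; i < latin1Decoded.length(); i++) {
            char c = latin1Decoded.charAt(i);
            if (c >= 0x80 && c <= 0x9F && C1_TO_CP1252[c - 0x80] != 0) {
                c = C1_TO_CP1252[c - 0x80];
            }
            out.append(c); // undefined bytes and everything else pass through
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Byte 0x9A decoded as ISO-8859-1 gives U+009A; fix() maps it to
        // U+0161, the "s with caron".
        String fixed = fix("\u009A");
        System.out.println(Integer.toHexString(fixed.charAt(0))); // prints 161
    }
}
```

Characters outside the 0x80-0x9F range are identical in both encodings, so
the table only needs 32 entries.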
Christian Gosch wrote:
Hi,
that would be of particular interest for me, too.
We have some international names in our application, although it runs in an
ISO-Latin-1 (ISO-8859-1) [db, appserver] / Cp1252 [client] environment with
de_DE locale by default.
We have several areas of "visibility" like DB (VarChar fields), Java source
files, appserver console, JSP source / rendering / display, PDF and XLS
download.
Currently we use the latest POI final release (should be 2.5.1?), and I do
not remember any way to set the encoding for String values in a sheet.
Since the XLS file format is a kind of "hybrid", mixing binary structure /
control data with textual content data, it is crucial to write all textual
"content" with the appropriate encoding -- and that encoding should be
configurable.
Testing some examples I found that
- the vast majority of characters in our data are displayed correctly, in
JSP and XLS (by POI).
- the Czech "s with v on top" is displayed well in JSPs, but not in
POI-generated XLS: there it shows up as a "little rectangle".
I know that ISO-8859-1 also has problems with the Danish "o with slash",
but currently I have no test data. I would also expect problems with
Turkish letters like "i without dot" or "c with bottom accent", as in the
city name "Incirlik" when written correctly.
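
Rather than guessing which of these characters a charset can represent, the
JDK can be asked directly. The following is a small sketch using
CharsetEncoder.canEncode; the class name EncodableCheck is hypothetical, and
it assumes the JVM provides a charset under the name "windows-1252".

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports whether a charset has a mapping for a character.
final class EncodableCheck {
    static boolean canEncode(String charsetName, char c) {
        // A fresh encoder is created per call; canEncode only checks for a
        // mapping and does not write any output.
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        // The sample characters mentioned in this message.
        Map<String, Character> samples = new LinkedHashMap<>();
        samples.put("s with caron (U+0161)", '\u0161');
        samples.put("o with slash (U+00F8)", '\u00F8');
        samples.put("dotless i (U+0131)", '\u0131');
        samples.put("c with cedilla (U+00E7)", '\u00E7');

        for (Map.Entry<String, Character> e : samples.entrySet()) {
            System.out.printf("%-24s ISO-8859-1=%b windows-1252=%b%n",
                    e.getKey(),
                    canEncode("ISO-8859-1", e.getValue()),
                    canEncode("windows-1252", e.getValue()));
        }
    }
}
```

ISO-8859-1 covers exactly U+0000 to U+00FF, so U+00F8 and U+00E7 are
encodable there, while U+0161 is only available in Windows-1252 and U+0131
in neither of the two.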
btw:
In JXL (JExcelAPI) it is possible to set up an encoding for a generated XLS
file, which by default is "the default encoding of the hosting VM", but it
took a while to make that happen.
Regards
Christian Gosch
inovex GmbH
On Tuesday, November 08, 2005 11:59 PM [GMT+1=CET],
Olivier Matt <[EMAIL PROTECTED]> wrote:
Hello,
I'm reading Excel files and I get a String from a CELL_TYPE_STRING cell.
That string has problems with accented characters (I guess the file is
encoded using some Latin character encoding): they are not displayed
properly.
How can I avoid this behavior? Can I specify the encoding of the cells
somewhere?
Or is there a method for transforming misinterpreted strings into correct
Latin strings?
Thanks for help,
Olivier
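
Before choosing a fix for strings like these, it may help to see which code
points the cell String actually contains, since a console or font problem
can look the same as a decoding problem. A minimal diagnostic sketch follows;
the class name CodePointDump is hypothetical, and the literal in main stands
in for the String obtained from the cell.

```java
// Hypothetical diagnostic helper: lists the Unicode code points of a String.
final class CodePointDump {
    static String dump(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // In real use, pass the String read from the cell instead of this literal.
        System.out.println(dump("caf\u00E9")); // prints U+0063 U+0061 U+0066 U+00E9
    }
}
```

If an accented letter shows up as the expected single code point (for
example U+00E9 for "e with acute"), the String is fine and the problem is
more likely in how it is displayed or written out. A pair such as U+00C3
U+00A9 in its place suggests UTF-8 bytes were decoded with a single-byte
charset, and U+FFFD or U+003F suggests the character was already lost.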
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/
--
Andrew C. Oliver
SuperLink Software, Inc.
Java to Excel using POI
http://www.superlinksoftware.com/services/poi
Commercial support including features added/implemented, bugs fixed.