A little more information:
On Aug 29, 2011, at 10:22 , James Wilde wrote:
Can't think of a better subject line.
This is about the Ubuntu repository version of LibO running under Ubuntu
11.04 linux. The repository version is described as follows:
LibreOffice 3.3.3
OOO 330m 19 (Build:301)
LibreOffice 3.3.3.1 Ubuntu package 1.3.3.3ubuntu2
I have been testing Gnucash on linux, and in particular the ability to export
data to an html file and read that into other programs, specifically
LibreOffice Calc. The whole process went excellently with one small
exception: the translation of the Swedish characters, å, ä and ö (and of
course the capitals, Å, Ä and Ö which I have not tested).
The html file quite clearly includes the following at the beginning:
meta http-equiv=content-type content=text-html; charset=utf-8
However, I have been told that the conversion described below results from
LibO reading the input as ISO-8859-1 and converting it to UTF-8. If someone
here can confirm this, I will write it up as a bug. As mentioned in the
original mail below, this does not occur on Windows or on the Mac (I just
tested it on the Mac).
Instead of seeing each of the three letters, in each case I saw two
characters, Ã¥, for example instead of å. I checked the three characters in
a hex editor, and found the following:
å was represented by C3A5 in other applications, but as C383C2A5 in LibO
ä was represented by C3A4 in other applications, but as C383C2A4 in LibO
ö was represented by C3B6 in other applications, but as C383C2B6 in LibO
I checked the original data in Gnucash, the intermediate form in the html
file, and the contents of content.xml in LibO. The first two had the short
(two-byte) form; LibO had the long (four-byte) form.
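
The ISO-8859-1 explanation can be checked against the hex values above. What
follows is a minimal Python sketch, not anything LibO itself runs: the
function names double_encode and undo_double_encode are made up for
illustration, and the only assumption is that the UTF-8 bytes were misread as
ISO-8859-1 and then written out again as UTF-8.

```python
# Sketch only: reproduces the byte patterns from the hex editor, assuming the
# UTF-8 input was misread as ISO-8859-1 and then re-encoded as UTF-8.
# Both function names are hypothetical, chosen for this example.

def double_encode(text: str) -> bytes:
    """Encode as UTF-8, misread those bytes as ISO-8859-1, encode again."""
    utf8_bytes = text.encode("utf-8")          # å -> C3 A5
    misread = utf8_bytes.decode("iso-8859-1")  # C3 A5 -> two characters
    return misread.encode("utf-8")             # -> C3 83 C2 A5


def undo_double_encode(data: bytes) -> str:
    """Reverse the round trip above to recover the original text."""
    return data.decode("utf-8").encode("iso-8859-1").decode("utf-8")


for ch in "åäö":
    correct = ch.encode("utf-8").hex().upper()
    doubled = double_encode(ch).hex().upper()
    print(ch, correct, doubled)
```

If the explanation holds, the third column printed for each letter should
match the four-byte value seen in LibO, and undo_double_encode applied to
those four bytes should give back the original letter.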
I have also checked the same information in Windows, although I have not yet
used a hex editor in Windows. Both Excel and LibO Calc imported the data
correctly, or rather, displayed it correctly.
Before I report this as a bug, I'd like to be sure it's not something I'm
missing in my configuration. All programs are set to use UTF-8, insofar as
one can set it in the program rather than in the operating system.
A late thought occurs to me. I use LibO Writer for writing Mandarin and have
set 'Enabled for Asian languages' under Languages. Could this have affected
the way LibO translates the non-English characters? Then again, LibO has
actually _added_ two bytes to each character.
//James