On Mon, 2007-08-06 at 02:19 -0300, John Coppens wrote:
> Hello people.
>
> Using version 1.6.3 of gnumeric, I tried to read an xls file and save it
> as csv. There was a problem with the resulting csv file, in that iconv
> didn't want to convert it into another coding. I'm not sure where the
> problem lies.
>
> The original .xls had the following string in it:
>
> 0068 0069 0070 [201A] 0072 ...
>
> I've marked the 201A code, apparently a valid utf-16 code (according to
> the xls specs).

Yes, this is U+201A (SINGLE LOW-9 QUOTATION MARK), which is a type of
quotation mark. Someone might have used it (wrongly) instead of a comma, or
it might be the usual style of quotation mark for some non-English text. If
there's just one such character and you're sure it's a mistake (e.g. it
should obviously be a comma), you can fix it in the spreadsheet and ignore
the rest of my post.

> Gnumeric (or ssconvert) saved this in the csv as:
>
> 68 69 70 E2 80 9A 72 ...
>
> Again 201A, and it seems to be the shortest utf-8 code that can represent
> it. But iconv -f utf8 -t iso-8859-1 chokes on the sequence and aborts
> with:
>
> illegal input sequence at position xxxx
>
> I know -c can make iconv skip the error, but that doesn't seem elegant.
> Can anyone indicate where to look for a solution?

This is not a Gnumeric problem. The bytes E2 80 9A are the correct UTF-8
encoding of U+201A, so the csv file is fine. The ISO 8859-1 character set
does not include U+201A, so iconv is objecting because this transformation
would lose information. If you use the iconv -c switch, this quotation mark
will just vanish from the output.

There is no way to do what you're asking: the U+201A character simply cannot
be written in ISO-8859-1. Either you need to choose a different target
encoding, or re-think the whole plan.

Nick.
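For anyone who wants to reproduce the behaviour outside iconv, here is a minimal Python sketch of the same situation. It is an illustration only, not part of Gnumeric or iconv. The sample string is just the five code points quoted above, and the three alternatives shown (dropping the character, substituting a comma, or targeting Windows-1252, which does contain U+201A at byte 0x82) are my own suggestions for "a different encoding or a different plan", so pick whichever matches what the data really means.

```python
import unicodedata

# The five code points quoted from the .xls: 0068 0069 0070 201A 0072
sample = "hip\u201ar"

# Name of the odd character, straight from the Unicode database.
char_name = unicodedata.name(sample[3])

# UTF-8 encoding, formatted the same way as the hex dump in the post.
utf8_hex = " ".join(f"{b:02X}" for b in sample.encode("utf-8"))

# ISO 8859-1 has no slot for U+201A, so a strict encode fails,
# much like iconv stopping with "illegal input sequence".
try:
    sample.encode("iso-8859-1")
    latin1_failed_at = None
except UnicodeEncodeError as err:
    latin1_failed_at = err.start  # index of the offending character

# Alternative 1: silently drop the character (comparable to iconv -c).
dropped = sample.encode("iso-8859-1", errors="ignore")

# Alternative 2: decide it was meant to be a comma and substitute first.
as_comma = sample.replace("\u201a", ",").encode("iso-8859-1")

# Alternative 3: use a target encoding that does contain U+201A.
as_cp1252 = sample.encode("cp1252")

if __name__ == "__main__":
    print(char_name)
    print(utf8_hex)
    print("strict ISO 8859-1 encode failed at index:", latin1_failed_at)
    print(dropped, as_comma, as_cp1252)
```

Alternative 2 changes the data, so it only makes sense when you are sure the character was a typing mistake. Alternative 3 keeps the character but produces a file that is Windows-1252, not ISO 8859-1, even though the two agree on most bytes.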
