On Mon, 2007-08-06 at 02:19 -0300, John Coppens wrote:
> Hello people.
> 
> Using version 1.6.3 of gnumeric, I tried to read an xls file and save it
> as csv. There was a problem with the resulting csv file, in that iconv
> didn't want to convert it into another coding. I'm not sure where the
> problem lies.
> 
> The original .xls had the following string in it:
> 
> 0068 0069 0070 [201A] 0072 ...
> 
> I've marked the 201A code, apparently a valid utf-16 code (according to
> the xls specs).

Yes, this is U+201A, SINGLE LOW-9 QUOTATION MARK. Someone might have
used it (wrongly) instead of a comma, which it closely resembles, or it
might be the usual opening quotation mark in some non-English text. If
there's just one such character and you're sure it's a mistake (e.g. it
should obviously be a comma), you can fix it in the spreadsheet and
ignore the rest of my post.

> Gnumeric (or ssconvert) saved this in the csv as:
> 
> 68 69 70 E2 80 9A 72 ...
> 
> Again 201A, and it seems to be the shortest utf-8 code that can represent
> it. But iconv -f utf8 -t iso-8859-1 chokes on the sequence and aborts
> with:
> 
> illegal input sequence at position xxxx
> 
> I know -c can make iconv skip the error, but that doesn't seem elegant.
> Can anyone indicate where to look for a solution?

This is not a Gnumeric problem.

The ISO 8859-1 character set does not include U+201A, so iconv is
objecting because this transformation loses information. If you use the
iconv -c switch this quotation mark will just vanish from the output.
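
To illustrate the point outside of iconv, here is a minimal Python sketch.
It is only an analogy: Python's strict encoding stands in for plain iconv,
and errors="ignore" stands in for iconv -c. The byte values are the ones
quoted above; the variable names are made up for the example.

```python
# Bytes from the CSV as quoted above: "hip", then U+201A, then "r".
raw = bytes([0x68, 0x69, 0x70, 0xE2, 0x80, 0x9A, 0x72])

# The UTF-8 in the CSV is valid and decodes to U+201A as expected.
text = raw.decode("utf-8")
assert text[3] == "\u201a"

# Strict conversion fails, because ISO 8859-1 has no slot for U+201A.
try:
    text.encode("iso-8859-1")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Ignoring errors (roughly what iconv -c does) silently drops the character.
dropped = text.encode("iso-8859-1", errors="ignore")

print(strict_ok, dropped)
```

The strict attempt raises an error, and the lenient one produces output
with the quotation mark missing.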

There is no way to do exactly what you're asking: U+201A simply cannot
be written in ISO-8859-1. Either choose a different target encoding that
does include the character, or rethink the whole plan.
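
If a different encoding is an option, a quick check like the following
Python sketch shows which candidates can hold the character. The candidate
list is my own assumption, not something from your setup; Windows-1252
(cp1252) is included because it is a common superset-style replacement for
ISO-8859-1, but whether your downstream tools accept it is for you to
verify.

```python
def can_encode(ch: str, encoding: str) -> bool:
    """Return True if `ch` is representable in `encoding`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

ch = "\u201a"  # the character from the spreadsheet

# Hypothetical candidate list, chosen only for illustration.
candidates = ["iso-8859-1", "iso-8859-15", "cp1252", "utf-8"]
results = {enc: can_encode(ch, enc) for enc in candidates}

for enc, ok in results.items():
    print(f"{enc}: {'ok' if ok else 'cannot represent U+201A'}")
```

Of these, only cp1252 and utf-8 can represent the character; the two
ISO 8859 variants cannot.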

Nick.

_______________________________________________
gnumeric-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gnumeric-list
