Hi. I decided to write a small program in Go to convert UTF-8 text to plain ASCII. The need arose when I copied a file created on Ubuntu 16.04 amd64 to a Windows 10 computer.
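
A minimal sketch of the kind of conversion described above, assuming only a handful of typographic characters need replacing. The helper name toASCII, the replacement table, and the choice of "--" as the stand-in for an em dash are all hypothetical and only for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// asciiReplacer maps a few common typographic characters to ASCII stand-ins.
// The table is illustrative, not exhaustive; "--" for the em dash is an
// assumption, not a standard.
var asciiReplacer = strings.NewReplacer(
	"\u201c", `"`, // left double quotation mark
	"\u201d", `"`, // right double quotation mark
	"\u2019", "'", // right single quotation mark (apostrophe)
	"\u2014", "--", // em dash
)

// toASCII is a hypothetical helper that applies the replacements above.
// Characters not listed in the table pass through unchanged.
func toASCII(s string) string {
	return asciiReplacer.Replace(s)
}

func main() {
	fmt.Println(toASCII("\u201cIt\u2019s done\u201d \u2014 Rob"))
}
```

strings.NewReplacer works on whole UTF-8 sequences, so the table can be written in terms of characters rather than individual bytes.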

I decided to first change the curly double quote, apostrophe, and em dash characters. According to hexdump -C on Ubuntu, these characters appear in the file as:

open quote = 0xE2809C

close quote = 0xE2809D

apostrophe = 0xE28099

emdash = 0xE28094


However, when I write a simple program to display these runes from the file using the routines in unicode/utf8, I get very different values, and I do not understand why:

open quote = 0x201C

close quote = 0x201D

apostrophe = 0x2019

emdash = 0x2014


Why are the runes returned by utf8.DecodeRuneInString different from what hexdump shows when inspecting the file directly?
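
A minimal sketch that reproduces the observation, assuming the four characters listed above. It is not the original program, and the helper name describe is hypothetical. It prints the raw bytes of each character (the view hexdump gives) next to the value returned by utf8.DecodeRuneInString, so the two sets of numbers can be compared side by side:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper. It returns the raw UTF-8 bytes of the
// first character in s, formatted as hex, and the code point decoded from
// those bytes by utf8.DecodeRuneInString.
func describe(s string) (rawBytes, codePoint string) {
	r, size := utf8.DecodeRuneInString(s)
	return fmt.Sprintf("% X", s[:size]), fmt.Sprintf("U+%04X", r)
}

func main() {
	// Open quote, close quote, apostrophe, em dash.
	for _, s := range []string{"\u201c", "\u201d", "\u2019", "\u2014"} {
		b, cp := describe(s)
		fmt.Printf("bytes: %s  rune: %s\n", b, cp)
	}
}
```

For the open quote this gives the bytes E2 80 9C alongside U+201C: the three bytes are the UTF-8 encoding stored in the file, and the rune is the single Unicode code point those bytes decode to.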

--rob solomon
