On Sun, Jan 17, 2010 at 22:47, Riccardo Cohen wrote:

> Using numeric is more difficult because I have to make a program to
> write these values as I don't know them,

You most likely will have to know which characters you want to replace;
I don't see any way around that.  Anyway, the more interesting change
is probably to replace a character at a time instead of a byte at a time.
How to specify each character literal in your source file is a separate
issue.
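
To illustrate the character-versus-byte difference, here is a minimal
sketch in Python (not necessarily the language you are using).  The
replacement table and the function names are made up for the example;
which characters you actually want to replace is for you to decide.
The literals are written as numeric escapes so the result does not
depend on the encoding of the source file.

```python
# Hypothetical replacement table, keyed by numeric code point escapes.
REPLACEMENTS = {
    "\u00e9": "e",  # e with acute accent
    "\u00e0": "a",  # a with grave accent
    "\u00e7": "c",  # c with cedilla
}

def replace_chars(text: str) -> str:
    """Replace one character (code point) at a time."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in text)

def replace_bytes_naively(data: bytes) -> bytes:
    """Replace one byte at a time: every non-ASCII byte becomes '_'."""
    return bytes(b if b < 0x80 else ord("_") for b in data)

name = "caf\u00e9"

# Character-wise: the accented letter is one unit and maps to one letter.
by_char = replace_chars(name)

# Byte-wise on UTF-8: the same letter is two bytes, so it is replaced twice.
by_byte = replace_bytes_naively(name.encode("utf-8"))

print(by_char)  # cafe
print(by_byte)  # b'caf__'
```

The byte-wise version produces two replacement characters for a single
accented letter because that letter occupies two bytes in UTF-8.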

> But
> in the same time, source code is not used at run time, and the problem
> is at run time.

In general, the problem could be at compile time even if you don't see
the symptoms until runtime.
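
As a hedged illustration of how that can happen (in Python, and assuming
a toolchain that reads the source file as Latin-1 while the editor saved
it as UTF-8, which may or may not match your setup): the literal is
already wrong once the file has been read, long before the program runs.

```python
# The character the author typed, written as a numeric escape.
literal = "\u00e9"

# What the editor wrote to disk, assuming the file was saved as UTF-8.
source_bytes = literal.encode("utf-8")

# What a compiler sees if it assumes the file is Latin-1 (an assumption
# made for this example): two characters instead of one.
misread = source_bytes.decode("latin-1")

print(len(literal))   # 1
print(len(misread))   # 2
print(misread == "\u00c3\u00a9")  # True
```

The compiled program then carries the two-character string, and the
mismatch only becomes visible when that string is compared or printed.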

> I checked my locales on unix and there is no other locale installed than
> UTF-8. On mac it is more difficult to check.

Try "man locale" in your terminal window.

Good luck,


PS: If you get all the character set issues under control you probably don't
even need to replace characters in the URLs.  See for example Wikipedia
with URLs like 
