thanks, that worked.

2008/10/31 Ben Wiley Sittler <[EMAIL PROTECTED]>:
> if you need to fix a lot of these automatically from a shell script,
> you might consider something like this:
>
> python -c 'import sys, urllib; print urllib.unquote(" ".join(sys.argv[1:])).decode("utf-8").encode("iso-8859-1")' \
>   '%C3%83%C2%A9' \
>   '%C3%A4%C2%B8%C2%93%C3%A8%C2%BE%C2%91'
>
> é 专辑
>
> it works like "echo", but decodes the %-escaping and one of the levels
> of utf-8 encoding.
>
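The one-liner quoted above uses Python 2 syntax (print statement, urllib.unquote). A rough Python 3 sketch of the same repair might look like the following. The function name undo_double_utf8 is made up for illustration, and the sketch assumes the extra encoding level came from reading UTF-8 bytes as ISO-8859-1, as in this thread.

```python
from urllib.parse import unquote_to_bytes


def undo_double_utf8(escaped: str) -> str:
    """Decode %-escapes, then remove one extra level of UTF-8 encoding.

    Hypothetical helper: assumes the text was encoded to UTF-8, misread
    as ISO-8859-1, and encoded to UTF-8 a second time.
    """
    raw = unquote_to_bytes(escaped)            # %C3%83... -> doubly encoded bytes
    once = raw.decode("utf-8")                 # undo the second UTF-8 encoding
    original = once.encode("iso-8859-1")       # code points 0..255 back to single bytes
    return original.decode("utf-8")            # read the original UTF-8 bytes as text
```

Calling undo_double_utf8 on the two escaped strings from this thread should give back the two file names. Input that was not double-encoded will typically raise UnicodeEncodeError or UnicodeDecodeError rather than be silently mangled.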
> On Fri, Oct 31, 2008 at 1:31 PM, Andries E. Brouwer
> <[EMAIL PROTECTED]> wrote:
>> On Sat, Nov 01, 2008 at 01:51:42AM +0800, Ray Chuan wrote:
>>
>>> using an edonkey client, which has a function to convert file names to
>>> url-friendly strings (aka ed2k links), i was able to see that "é"
>>> showed up as %C3%83%C2%A9, while the more complex "专辑"
>>> (&#19987;&#36753;) would be %C3%A4%C2%B8%C2%93%C3%A8%C2%BE%C2%91.
>>
>> You converted twice to UTF-8, so have to go back once.
>>
>> (é is U+00e9 which is 11000011 10101001 in UTF-8, but if you read
>> the latter as Latin-1 and convert once more to UTF-8 you get
>> 11000011 10000011 11000010 10101001, that is, %C3%83%C2%A9 as you reported)
>>
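The byte arithmetic in the explanation above can be reproduced in a few lines of Python 3. This is only a sketch; the helper names bits and percent are made up for illustration.

```python
def bits(data: bytes) -> str:
    """Render each byte as eight binary digits, separated by spaces."""
    return " ".join(f"{b:08b}" for b in data)


def percent(data: bytes) -> str:
    """Render each byte as an uppercase %XX escape."""
    return "".join(f"%{b:02X}" for b in data)


# U+00E9 (e with acute accent) encoded to UTF-8 once: the correct form.
once = "\u00e9".encode("utf-8")

# The same bytes misread as Latin-1 and encoded to UTF-8 a second time.
twice = once.decode("iso-8859-1").encode("utf-8")
```

Here bits(once) gives the two-byte pattern, and bits(twice) and percent(twice) give the four-byte pattern and the %-escaped form reported for the file name.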
>>
>> --
>> Linux-UTF8:   i18n of Linux on all levels
>> Archive:      http://mail.nl.linux.org/linux-utf8/
>>
>>
>



-- 
Cheers,
Ray Chuan
