Update of bug #50383 (project wget):

                  Status:                    None => Confirmed

    _______________________________________________________

Follow-up Comment #1:

Two problems here:

1. The command-line URL is converted by 'remote_to_utf8()' in url_parse().
This is wrong; locale_to_utf8() must be used. On many locales this makes no
difference for the tilde, but I noticed it while tracing wget.

2. After dequeuing (before download), wget converts the complete URL with
remote_to_utf8(). This is wrong: only the part coming from the remote should
be converted (~foo came from local input).

Suggested fix:
The charset conversion to UTF-8 should take place whenever input is taken
(from the command line or from the remote). Internally, wget should work with
UTF-8 only. That is what Wget2 already does.

I am attaching my Python test script to reproduce this issue, in case someone
wants to work on it. Copy it to testenv/ and start it manually, or add it to
Makefile.am.

(file #39814)
    _______________________________________________________

Additional Item Attachment:

File name: Test-link-shiftjis.py             Size:1 KB

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?50383>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/
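
The suggested fix from Follow-up Comment #1 (convert each piece of input to UTF-8 where it enters the program, using the charset of its origin, and work with UTF-8 internally) might look roughly like the sketch below. This is not wget code and not the attached test script: the helper names input_to_utf8() and resolve_link(), the ISO-8859-1 locale, and the example URLs are assumptions made for illustration only.

```python
from urllib.parse import urljoin


def input_to_utf8(raw: bytes, charset: str) -> str:
    """Decode bytes at an input boundary using the charset of their origin.

    Hypothetical helper: everything past this point is handled as Unicode
    text and never re-converted with a different charset.
    """
    return raw.decode(charset)


def resolve_link(base_url: str, raw_link: bytes, remote_charset: str) -> str:
    """Resolve a link found in a remote document against an already-decoded base.

    Only the bytes that came from the remote document are decoded with the
    remote charset; the base URL (from local input) is left untouched.
    """
    link = input_to_utf8(raw_link, remote_charset)
    return urljoin(base_url, link)


if __name__ == "__main__":
    # Assumed scenario: the command-line URL arrives in an ISO-8859-1 locale,
    # and the remote page declares Shift_JIS.
    base = input_to_utf8(b"http://localhost/~foo/\xe9/", "iso-8859-1")
    raw_link = "\u30c6\u30b9\u30c8.html".encode("shift_jis")
    print(resolve_link(base, raw_link, "shift_jis"))
```

The point of the design is that each byte sequence is decoded exactly once, with the charset of its own source. Decoding the already-joined URL with the remote charset, as described in problem 2, would apply Shift_JIS rules to the locally supplied part as well.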