I am writing some URL parsing code in Python (sssh! :-) and decided to see
how LWP solves my problem ... well, it doesn't.

What my web client is doing is logging in to a site using basic
authentication. The problem arises because the userid has a slash in it.

Example: userid = test/ing
         password = foobar

note that %2f is the encoded version of /.

hence, the URL would look like:
   http://test%2fing:[EMAIL PROTECTED]/page.html

this works fine in netscape 4.7/linux, but it chokes LWP.

use LWP::Simple;
print get("http://test%2fing:foobar\@www.whatever.com/bar.html");

putting the / in unencoded doesn't work either:
print get("http://test/ing:foobar\@www.whatever.com/bar.html");

(if you don't encode the slash, you have an ambiguous URL, and you could
interpret the host as being "test" in this case.)
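To make the ambiguity concrete, here is a minimal Python sketch using the standard urllib.parse module. It is only an illustration, not what LWP or Netscape actually do; build_auth_url is a hypothetical helper name, and the host and path are the example values from above.

```python
from urllib.parse import quote, unquote, urlsplit


def build_auth_url(userid, password, host, path):
    """Hypothetical helper: embed credentials in an http URL.

    safe="" makes quote() encode "/" too (by default it is left alone).
    """
    return "http://%s:%s@%s%s" % (
        quote(userid, safe=""),
        quote(password, safe=""),
        host,
        path,
    )


# Encoded slash: the userinfo part survives splitting intact.
encoded = build_auth_url("test/ing", "foobar", "www.whatever.com", "/bar.html")
parts = urlsplit(encoded)
# urlsplit() leaves the username percent-encoded, so decode it afterwards.
userid = unquote(parts.username)

# Raw slash: the authority ends at the first "/", so the host becomes "test"
# and everything after it is treated as the path.
raw = urlsplit("http://test/ing:foobar@www.whatever.com/bar.html")

print(encoded)
print(userid, parts.password, parts.hostname)
print(raw.hostname, raw.username, raw.path)
```

The point of the sketch is the ordering: split the URL first, then decode the userid, so an encoded slash can never be mistaken for the end of the host part.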

I've looked at RFCs 2617 (HTTP Auth), 1738 (URLs) and 2616 (HTTP 1.1).

[1] RFC 2617 says userids can contain anything besides a colon (:). Passwords
can be *anything*.

[2] RFC 2616 and 1738 do not state that userid:password@ is an allowed part of
an http URL.

[3] RFC 1738 says to encode characters such as ":" and "@" when used in the
userid/password portions of URLs; however, they're mostly talking about ftp
and so on, since per [2] an http URL has no userid/password portion.
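One way to sidestep the URL question, given [1] and [2], is to keep the credentials out of the URL and build the Authorization header by hand. Below is a rough Python sketch of that idea. basic_auth_header is a hypothetical helper name, and the UTF-8 encoding is my assumption, since RFC 2617 does not pin down a character set.

```python
import base64
import urllib.request


def basic_auth_header(userid, password):
    """Hypothetical helper: build the value of a Basic Authorization header.

    Per RFC 2617 the credentials are base64("userid:password"), so the
    userid may contain "/" but must not contain ":".
    """
    if ":" in userid:
        raise ValueError("userid must not contain ':'")
    # Assumption: encode as UTF-8 before base64.
    raw = ("%s:%s" % (userid, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


header = basic_auth_header("test/ing", "foobar")

# The URL itself carries no userinfo, so the slash is never parsed as a path.
# Building the Request object does not contact the server.
req = urllib.request.Request(
    "http://www.whatever.com/bar.html",
    headers={"Authorization": header},
)
print(req.get_header("Authorization"))
```

Passing req to urllib.request.urlopen() would then send the header with the request. Whether the server accepts a slash in the userid is a separate matter.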

any thoughts?

Paul
