I also noticed that there's a Set-Cookie header in there. If you're not handling cookies, that could cause some trouble too, though I suspect this is not the problem here.
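In case cookies do turn out to matter, here is a minimal sketch of an opener that keeps them across the redirects. It is written for Python 3 (`urllib.request` and `http.cookiejar`); on the Python 2 used in this thread the equivalents are `urllib2.HTTPCookieProcessor` and `cookielib.CookieJar`. The name `build_cookie_opener` is just an illustrative helper, not something from the standard library.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor records Set-Cookie headers from every response,
    # including intermediate 302 responses, and adds matching Cookie
    # headers to later requests made through the same opener.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar
```

Calling `opener.open(url)` instead of `urlopen(url)` would then carry the cookies set by the first two responses into the final request. I have not checked whether that changes the outcome for wikispaces.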

On Wed, Aug 13, 2008 at 11:17 AM, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> On Wed, 13 Aug 2008 18:14:19 +0530, "O.R.Senthil Kumaran"
> <[EMAIL PROTECTED]> wrote:
>>
>> I am trying to write a fix for this bug http://bugs.python.org/issue2464
>> - urllib2 can't handle http://www.wikispaces.com
>>
>> What actually happening here is:
>>
>> 1) urllib2 tries to open http://www.wikispaces.com
>> 2) It gets 302 Redirected to
>> https://session.wikispaces.com/session/auth?authToken=1bd8784307f89a495cc1aafb075c4983
>> 3) It again gets 302 Redirected to:
>> 'http://www.wikispaces.com?responseToken=1bd8784307f89a495cc1aafb075c4983
>>
>> After this, gets a 200 code, but when the page it retrived it 400 Bad
>> Request!
>>
>> Firefox has NO problem in getting the actual page though.
>>
>> Here is the O/P of the session (I have made print header.items() at
>> http_error_302 method in HTTPRedirectHandler):
>>
>> >>> obj1 = urllib2.urlopen("http://www.wikispaces.com")
>>
>> [('content-length', '0'), ('x-whom', 'w9-prod-http, p1'), ('set-cookie',
>> 'slave=1; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/, test=1;
>> expires=Wed, 13-Aug-2008 13:03:51 GMT; path=/'), ('server', 'nginx/0.6.30'),
>> ('connection', 'close'), ('location',
>> 'https://session.wikispaces.com/session/auth?authToken=4b3eecb5c1ab301689e446cf03b3a585'),
>> ('date', 'Wed, 13 Aug 2008 12:33:51 GMT'), ('p3p', 'CP: ALL DSP COR CURa
>> ADMa DEVa CONo OUR IND ONL COM NAV INT CNT STA'), ('content-type',
>> 'text/html; charset=utf-8')]
>> [('content-length', '0'), ('x-whom', 'w8-prod-https, p1'), ('set-cookie',
>> 'master=1; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/,
>> master=7de5d46e15fd23b1ddf782c565d4fb3a; expires=Thu, 14-Aug-2008 13:03:53
>> GMT; path=/; domain=session.wikispaces.com'), ('server', 'nginx/0.6.30'),
>> ('connection', 'close'), ('location',
>> 'http://www.wikispaces.com?responseToken=4b3eecb5c1ab301689e446cf03b3a585'),
>> ('date', 'Wed, 13 Aug 2008 12:33:53 GMT'), ('p3p', 'CP: ALL DSP COR CURa
>> ADMa DEVa CONo OUR IND ONL COM NAV INT CNT STA'), ('content-type',
>> 'text/html; charset=utf-8')]
>> >>> print obj1.geturl()
>> http://www.wikispaces.com?responseToken=4b3eecb5c1ab301689e446cf03b3a585
>> >>> print obj1.code
>> 200
>> >>> print obj1.headers
>> >>> print obj1.info()
>> >>> print obj1.read()
>> <html>
>> <head><title>400 Bad Request</title></head>
>> <body bgcolor="white">
>> <center><h1>400 Bad Request</h1></center>
>> <hr><center>nginx/0.6.30</center>
>> </body>
>> </html>
>>
>> With all this happening with urllib2, firefox is able to handle this
>> properly.
>> Also I notice that I suffix the url with a dummy path say
>> url = "http://www.wikispaces.com/dummy_url_path". The urlopen request will
>> still to through 302-302-200. but with dummy_url_path appended in the
>> redirections and then read() will succeed!
>>
>> Please share your opinion on where do you think, that urllib2 is going
>> wrong here! I am not able to drill down to the fault point.
>> This has NOT got to do with null characters in the redirection url as
>> noted in the bug report.
>>
>
> Some things:
>
> http://foo.com
>
> This is not a valid URL. The correct URL for the intended location here
> is:
>
> http://foo.com/
>
> This is the root of the problem, I suspect. Firefox notices this problem
> and fixes it when deciding what requests to make. For example, while
> urllib2 ultimately asks for this URL:
>
> ?responseToken=f02a955460b2cc180e9bf1faa8efd383
>
> Firefox recognizes that this is silly and instead asks for:
>
> /?responseToken=5007a08643c2b4dd719a8848024b2c7a
>
> The tokens are different because these are values from actual requests.
> Notice the important difference, though - Firefox's request begins with
> a /.
>
> Likely, urllib2 should do a bit more validation of its input and make
> sure it is only making requests which follow the protocol.
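
The normalization described above can be sketched roughly as follows. This is only an illustration in Python 3 (`urllib.parse`), not the actual urllib2 fix; `normalize_empty_path` and `request_target` are hypothetical helper names, and the token `abc` in the usage note is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_empty_path(url):
    """Return url with an empty path replaced by "/" (hypothetical helper)."""
    parts = urlsplit(url)
    # Only touch URLs that name a host but carry no path at all,
    # e.g. http://host or http://host?query
    if parts.netloc and not parts.path:
        parts = parts._replace(path="/")
    return urlunsplit(parts)


def request_target(url):
    """Build the path-plus-query string that would go on the request line."""
    parts = urlsplit(normalize_empty_path(url))
    target = parts.path
    if parts.query:
        target += "?" + parts.query
    return target
```

With this, `request_target("http://www.wikispaces.com?responseToken=abc")` gives `/?responseToken=abc`, which begins with a `/` like the Firefox request, while a URL that already has a path such as `http://foo.com/dummy_url_path` passes through unchanged.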
>
> Jean-Paul
> _______________________________________________
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe:
> http://mail.python.org/mailman/options/web-sig/sidnei%40enfoldsystems.com
>

-- 
Sidnei da Silva
Enfold Systems
http://enfoldsystems.com
Fax +1 832 201 8856
Office +1 713 942 2377 Ext 214