[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2014-02-03 Thread Mark Lawrence

Changes by Mark Lawrence breamore...@yahoo.co.uk:


--
nosy:  -BreamoreBoy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1755841
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-07-17 Thread Mark Lawrence

Mark Lawrence breamore...@yahoo.co.uk added the comment:

Just a prod in case it has gone under the radar.

--
nosy: +BreamoreBoy
versions: +Python 3.2 -Python 2.7




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

Antoine: I got your point. Yes, I was missing the purpose of the redirection 
itself, and the patch was wrong.

If the 301 is to be cached, the cache map should be maintained at a higher 
level so that later requests can refer to it.
I have created a redirect_map at the OpenerDirector level. It is populated by 
the RedirectHandler and consulted when a Request is created. I believe this is 
along the correct lines: I could verify it with simple scripts that check the 
fetches from the cache map for repeated redirects.

Your comments please.

--
Added file: http://bugs.python.org/file15745/urllib2-301-redirection-proper.diff




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Well, first it would be better with some tests.

Second, what does it do for chained redirects? E.g. let's say that there's a 
chain of 301 redirects: A -> B -> C. Does it cache the whole A -> C mapping, 
or only A -> B? If the latter, will the chaining occur when looking up the 
redirected url from the cache (it doesn't seem to)?
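
As a rough illustration of the chaining question (not taken from any attached 
patch): if the cache stores only single hops such as A -> B and B -> C, a 
lookup can follow the stored hops until it reaches a URL with no entry. The 
function name `resolve_cached_redirect`, the plain-dict cache and the 
`max_hops` limit are all assumptions made for this sketch.

```python
def resolve_cached_redirect(redirect_map, url, max_hops=10):
    """Follow cached single-hop redirects from url as far as they go.

    redirect_map is a plain dict of {source_url: target_url}. Both the
    function and the dict layout are hypothetical, for illustration only.
    """
    seen = {url}
    for _ in range(max_hops):
        target = redirect_map.get(url)
        if target is None:
            # No cached redirect for this URL: this is where we request.
            break
        if target in seen:
            # A -> B -> A would otherwise keep us going in circles.
            raise ValueError("redirect loop in cache at %r" % target)
        seen.add(target)
        url = target
    # If max_hops is exhausted, return the URL reached so far.
    return url
```

With `{"A": "B", "B": "C"}` in the map, looking up A yields C, so a chain of 
cached 301s costs one request instead of three.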

Third, it seems to use a global OpenerDirector object. Are there situations 
where it should rather use a request-specific object?

Fourth, you shouldn't need to define a separate http_error_301 method. Just add 
the `cacheable` argument to `http_error_302`. Also, the `_cache_301_redirect` 
attribute seems useless.

--




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

Thanks for the comments.
I shall come up with the tests.

Yes, it currently does not handle chained redirects via the cache. I don't know 
the RFC's stance on it. The RFC does not say anything about chained 301 
redirects, and there are trickier issues in caching anything other than 301. 
Basically, 302 and the other codes require the client to check and comply with 
the Cache-Control and Expires headers. The feature request was, reasonably, for 
caching only 301 redirects, and I also feel it is a good one to have. This is 
the reason for the separate http_error_301 method.

The global opener seems to be a straightforward way to do this, and not a
harmful one.  I can't think of a request-specific object for this one.

--




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread John J Lee

John J Lee jj...@users.sourceforge.net added the comment:

To make sure I understood something Antoine said:  By per-request, I assume 
you mean the same kind of thing as the current use of .redirect_dict -- the 
multiple urllib2.Request instances that may result from a single request passed 
by the user to .open()/urlopen all sharing the same cache state.

In addition to what Antoine said:

 0. patch reports that your latest patch is malformed (see below).
 1. I'm afraid I think any 301 caching that's not per-request should be off by 
default.  Defaulting to on would be a significant change in behaviour, because 
urllib2.urlopen (and OpenerDirector.open) currently retains no state between 
calls (unless you add a handler that keeps state, such as HTTPCookieProcessor, 
but no such handlers are added by default).
 2. I imagine the code changes should be entirely (or almost entirely) confined 
to RedirectHandler and/or AbstractHTTPHandler.  Is there any justification for 
changing OpenerDirector?  Certainly no need to add any globals!
 3. http_error_30x is a documented interface, but it's not frequently used.  
The argument(s) used to control caching should be somewhere else (see questions 
below).
 4. Please do post the doc changes for review once the implementation is 
decided on.
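
A minimal sketch of points 1 and 2, written against Python 3's 
`urllib.request` (the successor of `urllib2`) and not taken from any attached 
patch. The class name `PermanentRedirectCache` and its `cache_permanent` and 
`permanent_redirects` attributes are hypothetical. It hooks the documented 
`redirect_request` method, because that method receives the new URL directly; 
this is a choice made for the sketch, not the approach the patch takes. As one 
possible answer to the POST question, it records only GET and HEAD requests.

```python
import urllib.request


class PermanentRedirectCache(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler that remembers 301 targets (not stdlib API)."""

    def __init__(self, cache_permanent=False):
        # Off by default, so an opener keeps no state between calls
        # unless the caller asks for it.
        self.cache_permanent = cache_permanent
        # The state lives on this handler instance, not on a global opener.
        self.permanent_redirects = {}

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler decide whether the redirect is allowed.
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if (self.cache_permanent and code == 301 and new_req is not None
                and req.get_method() in ("GET", "HEAD")):
            # Record a single hop: original URL -> redirected URL.
            self.permanent_redirects[req.full_url] = new_req.full_url
        return new_req
```

Passing an instance to `urllib.request.build_opener()` replaces the default 
redirect handler for that opener only. The sketch only records the mapping; 
something still has to consult it before the first request is sent.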


Some questions to consider:

 a. How should POST requests be handled when there is a cached permanent 
redirect URI?
 b. How useful is per-request caching, in the sense defined above (as opposed 
to per-user agent caching -- i.e. per-handler in our case)?  Best answered with 
data from the web.
 c. Should URIs be normalised before being used as a cache key?
 d. Might the cache get big?

For all #a, #b, #c, #d: What do existing implementations (e.g. Firefox) do?
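
One possible answer to questions c and d, as a sketch only. The normalisation 
rules (lower-case scheme and host, drop the default port, the fragment and any 
userinfo, treat an empty path as "/") and the least-recently-used eviction are 
assumptions made for illustration; they are not what Firefox or any attached 
patch does. `normalize_cache_key` and `BoundedRedirectCache` are hypothetical 
names.

```python
from collections import OrderedDict
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_cache_key(url):
    """Return a canonical form of url for use as a cache key."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # .hostname is already lower-cased and excludes userinfo and port.
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literal: put the brackets back.
        host = "[%s]" % host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        host = "%s:%d" % (host, parts.port)
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is dropped.
    return urlunsplit((scheme, host, path, parts.query, ""))


class BoundedRedirectCache:
    """Redirect map that evicts the least recently used entry when full."""

    def __init__(self, max_entries=1024):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def store(self, source, target):
        key = normalize_cache_key(source)
        self._entries[key] = target
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            # The oldest entry sits at the front of the OrderedDict.
            self._entries.popitem(last=False)

    def lookup(self, url):
        key = normalize_cache_key(url)
        target = self._entries.get(key)
        if target is not None:
            # Mark the entry as recently used.
            self._entries.move_to_end(key)
        return target

    def __len__(self):
        return len(self._entries)
```

Capping the number of entries keeps a long-running process from growing the 
map without limit, at the cost of occasionally repeating a redirect.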


$ patch -p0 < urllib2-301-redirection-proper.diff
patching file Lib/urllib2.py
Hunk #3 succeeded at 549 with fuzz 2 (offset 18 lines).
Hunk #4 succeeded at 562 (offset 18 lines).
patch:  malformed patch at line 55: @@ -604,8 +618,12 @@

$ patch --version
patch 2.5.9
Copyright (C) 1988 Larry Wall
Copyright (C) 2003 Free Software Foundation, Inc.

This program comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of this program
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.

written by Larry Wall and Paul Eggert

--




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread John J Lee

John J Lee jj...@users.sourceforge.net added the comment:

> Yes, it currently does not handle chained redirects via cache. I don't know 
> RFC's stance on it. RFC does not say anything about 301 chained redirects

I don't see anything in the RFC that prevents us caching chained 301 
redirections.  Caching the chained redirections is better, because it reduces 
the number of requests, which is the purpose of the cache.

> there are trickier issues of caching anything other than 301.

Antoine wasn't suggesting caching URLs from non-301 responses, just that you 
don't need to add a new method named http_error_301.  Just delete it and 
test if code == 301 in http_error_302.

> Basically, 302 and other require the client to check and comply with the 
> Cache-Control and Expires header.

I don't think references to caching in the descriptions of the 30* response 
codes are relevant, because urllib2 doesn't implement response caching.  I 
think the part that's relevant is this: "The requested resource has been 
assigned a new permanent URI and any future references to this resource SHOULD 
use one of the returned URIs."

> The global opener seems to be a straightforward way to do this activity and 
> not a harmful one.  I can't think of a request-specific object for this one.

No.  The global OpenerDirector instance is a convenience to allow having a 
global urlopen() function.  Having a handler pick a random opener object (the 
global one) would be insane.

--




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

Attaching a non-malformed patch. The previous one contained incomplete tests, 
which I removed by hand before submitting it for review. Something went wrong 
there, I see.

Okay, I get the points you are making, specifically about using a 
request-specific object and maintaining the map at 
AbstractHTTPHandler/HTTPRedirectHandler. I shall come up with such an approach.

--
Added file: http://bugs.python.org/file15754/urllib2-301-patch.diff




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Senthil Kumaran

Changes by Senthil Kumaran orsent...@gmail.com:


Removed file: 
http://bugs.python.org/file15745/urllib2-301-redirection-proper.diff




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Senthil Kumaran

Changes by Senthil Kumaran orsent...@gmail.com:


Removed file: 
http://bugs.python.org/file15722/urllib2-301-redirection-CORRECTED.diff




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Senthil Kumaran

Changes by Senthil Kumaran orsent...@gmail.com:


Removed file: http://bugs.python.org/file8117/liburllib2.patch




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Senthil Kumaran

Changes by Senthil Kumaran orsent...@gmail.com:


Removed file: http://bugs.python.org/file8116/test_urllib2-cache.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1755841
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-05 Thread Senthil Kumaran

Changes by Senthil Kumaran orsent...@gmail.com:


Removed file: http://bugs.python.org/file8115/urllib2-301-cache.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1755841
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-03 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> Here is the corrected patch for caching the 301 redirections.
> 
> * It caches only the redirection not the response.
> * It retains cacheable=True kwarg for http_error_301 method. ( I feel, it 
> should be useful)
> * Have made the cached dict as private.

I'm still not sure what this patch is trying to do. It seems you are using the 
cached URL *after* getting the 301 response. But the whole point of caching 
redirections is to avoid emitting the initial request at all.
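
To make the ordering concrete, here is a small sketch in which the cache is 
consulted before anything is sent, so a cached 301 means the original URL is 
never requested. `RecordingOpener` is a hypothetical stand-in that records 
URLs instead of doing network I/O, and `open_with_redirect_cache` is likewise 
an invented helper, not urllib2 API.

```python
class RecordingOpener:
    """Stand-in for an opener: records requested URLs, does no I/O."""

    def __init__(self):
        self.requested = []

    def open(self, url):
        self.requested.append(url)
        return url


def open_with_redirect_cache(opener, url, redirect_map):
    """Open url, substituting a cached permanent redirect target if any."""
    # The lookup happens first; on a hit the original URL is never fetched.
    target = redirect_map.get(url, url)
    return opener.open(target)


opener = RecordingOpener()
cache = {"http://example.com/old": "http://example.com/new"}
open_with_redirect_cache(opener, "http://example.com/old", cache)
# opener.requested now holds only the redirected URL.
```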

--




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-02 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

Here is the corrected patch for caching the 301 redirections.

* It caches only the redirection, not the response.
* It retains the cacheable=True kwarg for the http_error_301 method. (I feel it 
should be useful.)
* I have made the cache dict private.

I have updated the tests. The existing tests for 301 are unchanged as well.

If you have any review comments, please pitch in. (I shall add the docs and 
news entry before commit.)

Thanks!

--
Added file: 
http://bugs.python.org/file15722/urllib2-301-redirection-CORRECTED.diff




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2010-01-02 Thread Senthil Kumaran

Changes by Senthil Kumaran orsent...@gmail.com:


Removed file: http://bugs.python.org/file15677/urllib2-301-redirection.diff




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2009-12-28 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

> Antoine Pitrou added the comment:
>
> I have trouble understanding what the patch does. I would expect it to
> cache the original URL -> redirected URL mapping, but it seems to
> cache the final HTTP response instead.

Oops. My mistake. I got carried away by my misunderstanding of the RFC
section on 301. The patch is wrong. I coded it to cache the response
from the redirection instead of just the redirected URL.
I shall write the correct one to cache just the redirection.

> Aren't http_error_301 and friends for internal use?

Not really. They are exposed methods and I believe are being used by
clients. There have been bug reports related to those redirection
methods. Even the related Issue735515 explains pretty clearly the
redirection required. (My bad again; my comment in that issue
is irrelevant.) In this issue, John says that there is no obvious need
to change the interface.

The RFC has a statement along the lines that a 301 redirection is cacheable
by default, unless indicated otherwise. This is why I thought an option
to turn off the cacheable behavior might be needed.

Thanks for looking at this quickly.

--




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2009-12-28 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

I haven't reviewed the patch, but I would like the caching behavior to
be settable.  I can easily imagine a use case where I would not want the
URLs cached: when using urllib in a test suite or test tool.  (I just
ran into this problem trying to use Firefox as a test tool; at the time,
I wasn't aware that Firefox cached the 301 responses.)

--
nosy: +r.david.murray




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2009-12-27 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

I have trouble understanding what the patch does. I would expect it to
cache the original URL -> redirected URL mapping, but it seems to
cache the final HTTP response instead.

Also, it's not obvious in which situations the default for `cacheable`
would be overridden. Aren't http_error_301 and friends for internal use?

--
nosy: +pitrou




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2009-12-26 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

I am attaching an updated patch for caching the 301 redirects.
As per RFC 2616:

10.3.2 301 Moved Permanently
   ...
   ...references returned by the server, where possible. This response
   is cacheable unless indicated otherwise.

So, I have included an additional argument to the 301 method,
cacheable=True, which can be used to turn off the cache if required.

I would like to seek some comments on the patch, specifically on adding
this cacheable=True keyword argument.  If it's fine, I would go ahead and
check it in.

--
Added file: http://bugs.python.org/file15677/urllib2-301-redirection.diff




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2009-08-20 Thread Senthil

Changes by Senthil orsent...@gmail.com:


--
assignee:  - orsenthil




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2009-04-27 Thread Daniel Diniz

Changes by Daniel Diniz aja...@gmail.com:


--
nosy: +jjlee
stage:  - patch review
type:  - feature request
versions: +Python 2.7 -Python 2.6




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2009-04-27 Thread Skip Montanaro

Changes by Skip Montanaro s...@pobox.com:


--
nosy:  -skip.montanaro




[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

2008-01-05 Thread vila

Changes by vila:


--
nosy: +vila
