Re: httplib incredibly slow :-(
Dieter Maurer wrote:
> Chris Withers ch...@simplistix.co.uk writes on Thu, 13 Aug 2009 08:20:37 +0100:
>> ... I've already established that the file downloads in seconds with
>> [something else], so I'd like to understand why python isn't doing the
>> same and fix the problem...
>
> A profile might help to understand what the time is used for. As almost
> all operations are not done in Python itself (httplib is really a very
> tiny wrapper above a socket), a C level profile may be necessary to
> understand the behaviour.

Actually, the problem *was* in Python: http://bugs.python.org/issue2576

Found and fixed :-)

Chris

--
Simplistix - Content Management, Batch Processing Python Consulting - http://www.simplistix.co.uk
--
http://mail.python.org/mailman/listinfo/python-list
Re: httplib incredibly slow :-(
In article mailman.133.1250270175.2903.python-l...@python.org,
Chris Withers ch...@simplistix.co.uk wrote:
> Aahz wrote:
>> What do you need to know for a decent example?
>
> Simple download of a file from a url with some auth headers added would
> do me.

Well, I've hacked up some sample code from my company's codebase:

    # !!! UNTESTED !!!
    # url, user and pwd are assumed to be defined by the caller.
    import pycurl
    from StringIO import StringIO

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.USERPWD, '%s:%s' % (user, pwd))
    c.setopt(pycurl.FOLLOWLOCATION, 1)
    c.setopt(pycurl.MAXREDIRS, 5)
    c.setopt(pycurl.CONNECTTIMEOUT, 30)
    f = StringIO()
    c.setopt(pycurl.WRITEDATA, f)
    c.perform()
    c.close()

--
Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/

Given that C++ has pointers and typecasts, it's really hard to have a
serious conversation about type safety with a C++ programmer and keep a
straight face. It's kind of like having a guy who juggles chainsaws
wearing body armor arguing with a guy who juggles rubber chickens wearing
a T-shirt about who's in more danger. --Roy Smith
Re: httplib incredibly slow :-(
i3dmaster wrote:
> Just wanted to check if you can try turning on the debug mode for
> httplib and see if you can read a bit more debug info on where the
> calls get hung. In your example, it would be conn.set_debuglevel(1)

I had a look through the code this debug level controls and I don't see
any information that this provides which would help here...

Chris
Re: httplib incredibly slow :-(
Aahz wrote:
> Sorry, I mostly have been working on our Mac port, so I'm not sure
> what's needed to make this work on Windows. Did you try downloading the
> PyCurl binary? Maybe it statically links libcurl on Windows.

Shame it's not available as a bdist_egg, that's what I'm really after...

> What do you need to know for a decent example?

Simple download of a file from a url with some auth headers added would
do me. Other than that, nice to haves would be how to build a http post
with fields in a multi-part body, some of which might be files.

cheers,

Chris
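For the multi-part POST asked about above, a minimal standard-library sketch follows. It is written for Python 3 (an assumption; the thread is about Python 2.6), and the function name `encode_multipart` and its argument layout are hypothetical, not part of httplib or PyCurl. It does no escaping of quotes in field names or filenames, so treat it as an illustration of the wire format rather than a finished encoder.

```python
import uuid


def encode_multipart(fields, files):
    """Encode form fields and files as a multipart/form-data body.

    fields: dict mapping field name -> str value
    files:  dict mapping field name -> (filename, content bytes, content type)
    Returns (value for the Content-Type header, body as bytes).
    """
    boundary = uuid.uuid4().hex  # random delimiter, unlikely to occur in the data
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"))
        lines.append(b"")  # blank line separates part headers from part body
        lines.append(value.encode("utf-8"))
    for name, (filename, content, content_type) in files.items():
        lines.append(delimiter)
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode("utf-8")
        )
        lines.append(f"Content-Type: {content_type}".encode("ascii"))
        lines.append(b"")
        lines.append(content)
    lines.append(delimiter + b"--")  # closing delimiter
    lines.append(b"")
    return "multipart/form-data; boundary=" + boundary, b"\r\n".join(lines)
```

The returned content type would go into the request headers and the bytes into the request body, for example as the `body` argument of `HTTPConnection.request('POST', ...)`.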
Re: httplib incredibly slow :-(
Chris Withers ch...@simplistix.co.uk writes on Thu, 13 Aug 2009 08:20:37 +0100:
> ... I've already established that the file downloads in seconds with
> [something else], so I'd like to understand why python isn't doing the
> same and fix the problem...

A profile might help to understand what the time is used for.

As almost all operations are not done in Python itself (httplib is
really a very tiny wrapper above a socket), a C level profile may be
necessary to understand the behaviour.

Dieter
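A Python-level profile, as a first step before the C-level one mentioned above, could be sketched like this with the standard `cProfile` and `pstats` modules. The helper name `profile_call` is hypothetical, and the code assumes Python 3. It only shows time attributed to Python and built-in function calls, so it may not reveal where time goes inside the socket layer.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report).

    The report lists the ten entries with the highest cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()
```

Wrapping the download in a function and passing it to `profile_call` would show whether the time is spent in many small `read` or `recv` calls or somewhere else.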
Re: httplib incredibly slow :-(
David Robinow wrote:
> On Wed, Aug 12, 2009 at 12:37 PM, Chris Withers ch...@simplistix.co.uk wrote:
>> David Stanek wrote:
>>> Also on the same box where you run this script can you test with curl
>>> or wget?
>>
>> It's a Windows box, so no :-(
>
> Why not?
> http://users.ugent.be/~bpuype/wget/
> http://curl.haxx.se/download.html

Fair point, but I don't see what this will achieve...

I've already established that the file downloads in seconds with
[something else], so I'd like to understand why python isn't doing the
same and fix the problem...

Chris
Re: httplib incredibly slow :-(
On Thu, Aug 13, 2009 at 3:20 AM, Chris Withers ch...@simplistix.co.uk wrote:
> David Robinow wrote:
>> On Wed, Aug 12, 2009 at 12:37 PM, Chris Withers ch...@simplistix.co.uk wrote:
>>> David Stanek wrote:
>>>> Also on the same box where you run this script can you test with
>>>> curl or wget?
>>>
>>> It's a Windows box, so no :-(
>>
>> Why not?
>> http://users.ugent.be/~bpuype/wget/
>> http://curl.haxx.se/download.html
>
> Fair point, but I don't see what this will achieve...
>
> I've already established that the file downloads in seconds with
> [something else], so I'd like to understand why python isn't doing the
> same and fix the problem...

My post was simply to correct the implication that curl and wget cannot
be used on Windows. It's up to you whether you want to use one or the
other.

I'm not the OP, and this is not my area of expertise, but ... You've got
two data points. You've jumped to the conclusion that there's something
wrong with Python or your code. You're probably right. However, if you
try wget, for example, and it's as slow as your code, you can look
elsewhere. If, on the other hand, wget is as fast as IE, you'll have
more proof that your code is the problem. Then, since wget is open
source, you can look at the source code and see what wget is doing right
that you (or httplib) are doing wrong.
Re: httplib incredibly slow :-(
In article mailman.4613.1250033136.8015.python-l...@python.org,
Chris Withers ch...@simplistix.co.uk wrote:
> Aahz wrote:
>> In article mailman.4598.1250022343.8015.python-l...@python.org,
>> Chris Withers ch...@simplistix.co.uk wrote:
>>> Does anyone know of an alternative library for creating http requests
>>> and getting their responses that's faster but hopefully has a similar
>>> interface?
>>
>> PyCurl
>
> This seems to be a wrapper around libcurl.
> Does it work on Windows?

Yes.

> If so, where can I find some decent examples?
> (the ones listed on the pycurl website are not what I'd call decent :-S)

Sorry, I mostly have been working on our Mac port, so I'm not sure
what's needed to make this work on Windows. Did you try downloading the
PyCurl binary? Maybe it statically links libcurl on Windows.

What do you need to know for a decent example?

--
Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/

...string iteration isn't about treating strings as sequences of
strings, it's about treating strings as sequences of characters. The
fact that characters are also strings is the reason we have problems,
but characters are strings for other good reasons. --Aahz
Re: httplib incredibly slow :-(
Answering myself...

Chris Withers wrote:
>> In article mailman.4598.1250022343.8015.python-l...@python.org,
>> Chris Withers ch...@simplistix.co.uk wrote:
>>> Does anyone know of an alternative library for creating http requests
>>> and getting their responses that's faster but hopefully has a similar
>>> interface?
>>
>> PyCurl
>
> This seems to be a wrapper around libcurl.
> Does it work on Windows?

Not by my definition of work:

- there are no windows binaries for libcurl
- getting https support on windows seems pretty hit'n'miss:
  http://stackoverflow.com/questions/197444/building-libcurl-with-ssl-support-on-windows

I'm still reeling from what seems to be such a huge problem with httplib
that seems to be largely ignored :-(

Chris
Re: httplib incredibly slow :-(
Chris Withers ch...@simplistix.co.uk wrote:
> I'm still reeling from what seems to be such a huge problem with
> httplib that seem to be largely ignored :-(
>
> Chris

There is an httplib2 (but I don't know anything further about it...):

http://code.google.com/p/httplib2/

Calling wget or curl using a subprocess is probably as easy as it is
ugly, I use the wget build from here:

http://gnuwin32.sourceforge.net/packages/wget.htm

max
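The subprocess route mentioned above might look roughly like the sketch below (Python 3 assumed). The helper names `build_wget_command` and `fetch_with_wget` are hypothetical, and it assumes a `wget` executable is on the PATH. Note that passing a password on the command line can expose it to other users on the same machine.

```python
import subprocess


def build_wget_command(url, dest, user=None, password=None):
    """Build the argument list for a wget download saved to dest."""
    cmd = ["wget", "-O", dest]
    if user is not None:
        cmd.append("--user=" + user)
    if password is not None:
        cmd.append("--password=" + password)
    cmd.append(url)
    return cmd


def fetch_with_wget(url, dest, user=None, password=None):
    """Run wget; raises subprocess.CalledProcessError on a non-zero exit."""
    subprocess.run(build_wget_command(url, dest, user, password), check=True)
```

Keeping the command construction separate from the call makes it possible to inspect or log the arguments without running wget.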
Re: httplib incredibly slow :-(
We use PyCURL on Windows. http://pycurl.sourceforge.net/ provides
pre-built versions for Windows and it works out of the box.

- Shailesh

On Aug 12, 7:14 pm, Max Erickson maxerick...@gmail.com wrote:
> Chris Withers ch...@simplistix.co.uk wrote:
>> I'm still reeling from what seems to be such a huge problem with
>> httplib that seem to be largely ignored :-(
>>
>> Chris
>
> There is an httplib2 (but I don't know anything further about it...):
>
> http://code.google.com/p/httplib2/
>
> Calling wget or curl using a subprocess is probably as easy as it is
> ugly, I use the wget build from here:
>
> http://gnuwin32.sourceforge.net/packages/wget.htm
>
> max
Re: httplib incredibly slow :-(
On Tue, Aug 11, 2009 at 4:25 PM, Chris Withers ch...@simplistix.co.uk wrote:
> Hi All,
>
> I'm using the following script to download a 150Mb file:
>
>     from base64 import encodestring
>     from httplib import HTTPSConnection
>     from datetime import datetime
>
>     conn = HTTPSConnection('localhost')
>     headers = {}
>     auth = 'Basic '+encodestring('username:password').strip()
>     headers['Authorization']=auth
>     t = datetime.now()
>     print t
>     conn.request('GET','/somefile.zip',None,headers)
>     print 'request:',datetime.now()-t
>     response = conn.getresponse()
>     print 'response:',datetime.now()-t
>     data = response.read()
>     print 'read:',datetime.now()-t
>
> The output shows it takes over 20 minutes to do this.
> However, this is on a local network, and downloading the same file in IE
> takes under 3 seconds!
>
> I saw this issue:
>
> http://bugs.python.org/issue2576
>
> I tried changing the buffer size to 4096 in a subclass as the issue
> suggested, but I didn't see the reported speed improvement.
> I'm using Python 2.6.2.
>
> Does anyone know of an alternative library for creating http requests
> and getting their responses that's faster but hopefully has a similar
> interface?

I tried to reproduce this, but I could not. Could you paste in the
output of your script? Also on the same box where you run this script
can you test with curl or wget?

--
David
blog: http://www.traceback.org
twitter: http://twitter.com/dstanek
Re: httplib incredibly slow :-(
Max Erickson wrote:
> There is an httplib2 (but I don't know anything further about it...):
>
> http://code.google.com/p/httplib2/

I had a look, it uses httplib, so will likely suffer from the same
problems...

> Calling wget or curl using a subprocess is probably as easy as it is
> ugly, I use the wget build from here:
>
> http://gnuwin32.sourceforge.net/packages/wget.htm

Yeah, no ;-)

Chris
Re: httplib incredibly slow :-(
Yes, it includes libcurl. I didn't have to install it separately.

I still use Python 2.4, so I cannot say about Python 2.6.

- Shailesh

On Wed, Aug 12, 2009 at 10:23 PM, Chris Withers ch...@simplistix.co.uk wrote:
> shaileshkumar wrote:
>> We use PyCURL on Windows. http://pycurl.sourceforge.net/ provides
>> pre-built versions for Windows and it works out of the box.
>
> Does it include libcurl? Are these builds available for Python 2.6?
>
> Chris
Re: httplib incredibly slow :-(
shaileshkumar wrote:
> We use PyCURL on Windows. http://pycurl.sourceforge.net/ provides
> pre-built versions for Windows and it works out of the box.

Does it include libcurl? Are these builds available for Python 2.6?

Chris
Re: httplib incredibly slow :-(
David Stanek wrote:
> I tried to reproduce this, but I could not. Could you paste in the
> output of your script?

Not sure how that'll help, but sure:

    2009-08-11 21:27:59.153000
    request: 0:00:00.109000
    response: 0:00:00.109000
    read: 0:24:31.266000

> Also on the same box where you run this script can you test with curl
> or wget?

It's a Windows box, so no :-(

But it really does download in a few seconds with IE, and 20min+ using
the script I included...

Chris
Re: httplib incredibly slow :-(
On Aug 12, 9:37 am, Chris Withers ch...@simplistix.co.uk wrote:
> David Stanek wrote:
>> I tried to reproduce this, but I could not. Could you paste in the
>> output of your script?
>
> Not sure how that'll help, but sure:
>
>     2009-08-11 21:27:59.153000
>     request: 0:00:00.109000
>     response: 0:00:00.109000
>     read: 0:24:31.266000
>
>> Also on the same box where you run this script can you test with curl
>> or wget?
>
> It's a Windows box, so no :-(
>
> But it really does download in a few seconds with IE, and 20min+ using
> the script I included...
>
> Chris

Just wanted to check if you can try turning on the debug mode for
httplib and see if you can read a bit more debug info on where the calls
get hung. In your example, it would be conn.set_debuglevel(1)
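As a small illustration of the suggestion above, here is how the debug level could be switched on for a connection. The sketch assumes Python 3, where httplib is named `http.client`; the helper name `make_debug_connection` is hypothetical. No network traffic happens until a request is made.

```python
import http.client


def make_debug_connection(host, level=1):
    """Create an (unconnected) HTTPS connection with debug output enabled.

    With a level above 0, http.client prints the request and the response
    headers to stdout as they are sent and received.
    """
    conn = http.client.HTTPSConnection(host)
    conn.set_debuglevel(level)
    return conn
```

The debug output covers the request and the response headers, so it is most useful for spotting stalls before the body starts arriving.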
Re: httplib incredibly slow :-(
On Wed, Aug 12, 2009 at 12:37 PM, Chris Withers ch...@simplistix.co.uk wrote:
> David Stanek wrote:
>> Also on the same box where you run this script can you test with curl
>> or wget?
>
> It's a Windows box, so no :-(

Why not?
http://users.ugent.be/~bpuype/wget/
http://curl.haxx.se/download.html
httplib incredibly slow :-(
Hi All,

I'm using the following script to download a 150Mb file:

    from base64 import encodestring
    from httplib import HTTPSConnection
    from datetime import datetime

    conn = HTTPSConnection('localhost')
    headers = {}
    auth = 'Basic '+encodestring('username:password').strip()
    headers['Authorization']=auth
    t = datetime.now()
    print t
    conn.request('GET','/somefile.zip',None,headers)
    print 'request:',datetime.now()-t
    response = conn.getresponse()
    print 'response:',datetime.now()-t
    data = response.read()
    print 'read:',datetime.now()-t

The output shows it takes over 20 minutes to do this.
However, this is on a local network, and downloading the same file in IE
takes under 3 seconds!

I saw this issue:

http://bugs.python.org/issue2576

I tried changing the buffer size to 4096 in a subclass as the issue
suggested, but I didn't see the reported speed improvement.
I'm using Python 2.6.2.

Does anyone know of an alternative library for creating http requests
and getting their responses that's faster but hopefully has a similar
interface?

Chris
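One way to narrow down where the read time goes is to read the body in fixed-size chunks and time it, rather than calling read() once. The sketch below is for Python 3 (an assumption; on Python 2.6 the module is `httplib` and the base64 call differs). The names `basic_auth_header`, `read_in_chunks` and `download`, and the 64 KiB default chunk size, are illustrative and not from the script above. Chunked reading is not claimed to be the fix for the issue linked above; it only makes the timing easier to observe.

```python
import base64
import http.client
import time


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def read_in_chunks(response, chunk_size=64 * 1024):
    """Read a file-like response to EOF; return (data, seconds taken)."""
    start = time.perf_counter()
    parts = []
    while True:
        chunk = response.read(chunk_size)
        if not chunk:  # empty bytes signals end of the body
            break
        parts.append(chunk)
    # Join once at the end instead of concatenating on every iteration.
    return b"".join(parts), time.perf_counter() - start


def download(host, path, username, password):
    """Hypothetical wrapper mirroring the script above; needs a live server."""
    conn = http.client.HTTPSConnection(host)
    try:
        headers = {"Authorization": basic_auth_header(username, password)}
        conn.request("GET", path, headers=headers)
        return read_in_chunks(conn.getresponse())
    finally:
        conn.close()
```

Because `read_in_chunks` only needs an object with a `read(size)` method, it can be tried against a local file or an in-memory buffer before pointing it at the real server.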
Re: httplib incredibly slow :-(
In article mailman.4598.1250022343.8015.python-l...@python.org,
Chris Withers ch...@simplistix.co.uk wrote:
> Does anyone know of an alternative library for creating http requests
> and getting their responses that's faster but hopefully has a similar
> interface?

PyCurl

--
Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/
Re: httplib incredibly slow :-(
Aahz wrote:
> In article mailman.4598.1250022343.8015.python-l...@python.org,
> Chris Withers ch...@simplistix.co.uk wrote:
>> Does anyone know of an alternative library for creating http requests
>> and getting their responses that's faster but hopefully has a similar
>> interface?
>
> PyCurl

This seems to be a wrapper around libcurl.
Does it work on Windows?
If so, where can I find some decent examples?
(the ones listed on the pycurl website are not what I'd call decent :-S)

Chris