Re: httplib incredibly slow :-(

2009-09-07 Thread Chris Withers

Dieter Maurer wrote:

Chris Withers ch...@simplistix.co.uk writes on Thu, 13 Aug 2009 08:20:37 
+0100:

...
I've already established that the file downloads in seconds with
[something else], so I'd like to understand why python isn't doing the
same and fix the problem...


A profile might help to understand what the time is used for.

As almost all operations are not done in Python itself (httplib is really
a very tiny wrapper above a socket), a C level profile may be necessary
to understand the behaviour.


Actually, the problem *was* in Python:

http://bugs.python.org/issue2576

Found and fixed :-)

Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk
--
http://mail.python.org/mailman/listinfo/python-list


Re: httplib incredibly slow :-(

2009-08-19 Thread Aahz
In article mailman.133.1250270175.2903.python-l...@python.org,
Chris Withers  ch...@simplistix.co.uk wrote:
Aahz wrote:

 What do you need to know for a decent example?

Simple download of a file from a url with some auth headers added would 
do me.

Well, I've hacked up some sample code from my company's codebase:

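# !!! UNTESTED !!!
import pycurl
from StringIO import StringIO

c = pycurl.Curl()
c.setopt(pycurl.URL, url)
# Credentials for HTTP auth, in "user:password" form
c.setopt(pycurl.USERPWD, "%s:%s" % (user, pwd))
c.setopt(pycurl.FOLLOWLOCATION, 1)
c.setopt(pycurl.MAXREDIRS, 5)
c.setopt(pycurl.CONNECTTIMEOUT, 30)
f = StringIO()
# Collect the response body in memory through the buffer's write method
c.setopt(pycurl.WRITEFUNCTION, f.write)
c.perform()
c.close()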
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Given that C++ has pointers and typecasts, it's really hard to have a
serious conversation about type safety with a C++ programmer and keep a
straight face.  It's kind of like having a guy who juggles chainsaws
wearing body armor arguing with a guy who juggles rubber chickens wearing
a T-shirt about who's in more danger.  --Roy Smith


Re: httplib incredibly slow :-(

2009-08-17 Thread Chris Withers

i3dmaster wrote:

Just wanted to check if you can try turning on the debug mode for
httplib and see if you can read a bit more debug info on where the
calls get hung. In your example, it would be conn.set_debuglevel(1)


I had a look through the code this debug level controls and I don't see 
any information that this provides which would help here...


Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk


Re: httplib incredibly slow :-(

2009-08-14 Thread Chris Withers

Aahz wrote:

Sorry, I mostly have been working on our Mac port, so I'm not sure what's
needed to make this work on Windows.  Did you try downloading the PyCurl
binary?  Maybe it statically links libcurl on Windows.


Shame it's not available as a bdist_egg, that's what I'm really after...


What do you need to know for a decent example?


Simple download of a file from a url with some auth headers added would 
do me.
Other than that, nice to haves would be how to build a http post with 
fields in a multi-part body, some of which might be files.
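
For the multi-part case, here is a minimal sketch of building such a body by hand. It is written for Python 3 and is not taken from PyCurl or httplib; the helper name encode_multipart, the field names and the fixed boundary are all made up for illustration.

```python
import uuid


def encode_multipart(fields, files, boundary=None):
    """Build a multipart/form-data body.

    fields: dict of name -> str value
    files:  dict of name -> (filename, bytes content, content type)
    Returns (content_type_header_value, body_bytes).
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(
            ('Content-Disposition: form-data; name="%s"' % name).encode("utf-8")
        )
        lines.append(b"")  # blank line separates part headers from content
        lines.append(value.encode("utf-8"))
    for name, (filename, content, content_type) in files.items():
        lines.append(delimiter)
        lines.append(
            ('Content-Disposition: form-data; name="%s"; filename="%s"'
             % (name, filename)).encode("utf-8")
        )
        lines.append(("Content-Type: %s" % content_type).encode("ascii"))
        lines.append(b"")
        lines.append(content)
    lines.append(delimiter + b"--")  # closing delimiter
    lines.append(b"")
    return "multipart/form-data; boundary=%s" % boundary, b"\r\n".join(lines)


# Hypothetical field and file names, with a fixed boundary for readability.
content_type, body = encode_multipart(
    {"title": "report"},
    {"upload": ("data.bin", b"\x00\x01", "application/octet-stream")},
    boundary="XBOUNDARYX",
)
```

The returned pair can then be passed to whichever client is in use, for example as the body and the Content-Type header of a POST request.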


cheers,

Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk


Re: httplib incredibly slow :-(

2009-08-14 Thread Dieter Maurer
Chris Withers ch...@simplistix.co.uk writes on Thu, 13 Aug 2009 08:20:37 
+0100:
 ...
 I've already established that the file downloads in seconds with
 [something else], so I'd like to understand why python isn't doing the
 same and fix the problem...

A profile might help to understand what the time is used for.

As almost all operations are not done in Python itself (httplib is really
a very tiny wrapper above a socket), a C level profile may be necessary
to understand the behaviour.
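
A minimal sketch of what a Python-level profile could look like, using the standard cProfile and pstats modules (Python 3 syntax). The fetch loop and fake_read are hypothetical stand-ins for the real download so that the sketch runs offline. Note that cProfile only reports the time spent inside C-implemented calls as a single entry each; it cannot look inside them.

```python
import cProfile
import io
import pstats


def fetch(read_chunk, total_bytes, chunk_size=4096):
    """Stand-in for the download loop: pull total_bytes via read_chunk."""
    received = 0
    while received < total_bytes:
        received += len(read_chunk(min(chunk_size, total_bytes - received)))
    return received


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(10)
    return result, buf.getvalue()


def fake_read(n):
    # Hypothetical replacement for response.read(n).
    return b"x" * n


result, report = profile_call(fetch, fake_read, 1_000_000)
print(report)
```

In the real case, the function handed to profile_call would be the one that issues the request and reads the response.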

Dieter


Re: httplib incredibly slow :-(

2009-08-13 Thread Chris Withers

David Robinow wrote:

On Wed, Aug 12, 2009 at 12:37 PM, Chris Withers ch...@simplistix.co.uk wrote:

David Stanek wrote:

Also on the same box where you run this script
can you test with curl or wget?

It's a Windows box, so no :-(


Why not?

http://users.ugent.be/~bpuype/wget/
http://curl.haxx.se/download.html


Fair point, but I don't see what this will achieve...

I've already established that the file downloads in seconds with 
[something else], so I'd like to understand why python isn't doing the 
same and fix the problem...


Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk


Re: httplib incredibly slow :-(

2009-08-13 Thread David Robinow
On Thu, Aug 13, 2009 at 3:20 AM, Chris Withers ch...@simplistix.co.uk wrote:
 David Robinow wrote:

 On Wed, Aug 12, 2009 at 12:37 PM, Chris Withers ch...@simplistix.co.uk
 wrote:

 David Stanek wrote:

 Also on the same box where you run this script
 can you test with curl or wget?

 It's a Windows box, so no :-(

 Why not?

 http://users.ugent.be/~bpuype/wget/
 http://curl.haxx.se/download.html

 Fair point, but I don't see what this will achieve...

 I've already established that the file downloads in seconds with [something
 else], so I'd like to understand why python isn't doing the same and fix the
 problem...
 My post was simply to correct the implication that curl and wget can
not be used on Windows. It's up to you whether you want to use one or
the other.
 I'm not the OP, and this is not my area of expertise, but ...
  You've got two data points. You've jumped to the conclusion that
there's something wrong with Python or your code. You're probably
right. However, if you try wget, for example, and it's as slow as your
code, you can look elsewhere.  If, on the other hand, wget is as fast
as IE, you'll have more proof that your code is the problem.
 Then, since wget is open source, you can look at the source code and
see what wget is doing right that you (or httplib) are doing wrong.


Re: httplib incredibly slow :-(

2009-08-13 Thread Aahz
In article mailman.4613.1250033136.8015.python-l...@python.org,
Chris Withers  ch...@simplistix.co.uk wrote:
Aahz wrote:
 In article mailman.4598.1250022343.8015.python-l...@python.org,
 Chris Withers  ch...@simplistix.co.uk wrote:

 Does anyone know of an alternative library for creating http requests 
 and getting their responses that's faster but hopefully has a similar 
 interface?
 
 PyCurl

This seems to be a wrapper around libcurl.
Does it work on Windows?

Yes.

If so, where can I find some decent examples?
(the ones listed on the pycurl website are not what I'd call decent :-S)

Sorry, I mostly have been working on our Mac port, so I'm not sure what's
needed to make this work on Windows.  Did you try downloading the PyCurl
binary?  Maybe it statically links libcurl on Windows.

What do you need to know for a decent example?
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

...string iteration isn't about treating strings as sequences of strings, 
it's about treating strings as sequences of characters.  The fact that
characters are also strings is the reason we have problems, but characters 
are strings for other good reasons.  --Aahz


Re: httplib incredibly slow :-(

2009-08-12 Thread Chris Withers

Answering myself...

Chris Withers wrote:

In article mailman.4598.1250022343.8015.python-l...@python.org,
Chris Withers  ch...@simplistix.co.uk wrote:
Does anyone know of an alternative library for creating http requests 
and getting their responses that's faster but hopefully has a similar 
interface?


PyCurl


This seems to be a wrapper around libcurl.

Does it work on Windows?


Not by my definition of work:

- there are no Windows binaries for libcurl

- getting https support on Windows seems pretty hit'n'miss:
http://stackoverflow.com/questions/197444/building-libcurl-with-ssl-support-on-windows

I'm still reeling from what seems to be such a huge problem with httplib 
that seems to be largely ignored :-(


Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk


Re: httplib incredibly slow :-(

2009-08-12 Thread Max Erickson
Chris Withers ch...@simplistix.co.uk wrote:
 
 I'm still reeling from what seems to be such a huge problem with
 httplib that seems to be largely ignored :-(
 
 Chris
 

There is an httplib2 (but I don't know anything further about it...):

http://code.google.com/p/httplib2/

Calling wget or curl using a subprocess is probably as easy as it is 
ugly, I use the wget build from here:

http://gnuwin32.sourceforge.net/packages/wget.htm
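
A minimal sketch of the subprocess route, in Python 3 syntax. It assumes a wget executable is on the PATH and that the build accepts the --user and --password options; the helper names build_wget_command and download are made up for illustration.

```python
import subprocess


def build_wget_command(url, dest, user=None, password=None):
    """Return the argument list for a wget download (no shell involved)."""
    cmd = ["wget", "--quiet", "--output-document=" + dest]
    if user is not None:
        cmd.append("--user=" + user)
        cmd.append("--password=" + (password or ""))
    cmd.append(url)
    return cmd


def download(url, dest, **auth):
    """Run wget and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_wget_command(url, dest, **auth), check=True)


# Build (but do not run) the command for the file discussed in this thread.
command = build_wget_command(
    "https://localhost/somefile.zip", "somefile.zip",
    user="username", password="password",
)
print(command)
```

Passing the arguments as a list avoids shell quoting problems, but the password is still visible in the process list while wget runs.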


max



Re: httplib incredibly slow :-(

2009-08-12 Thread shaileshkumar
We use PyCURL on Windows. http://pycurl.sourceforge.net/ provides pre-
built versions for Windows and it works out of the box.

- Shailesh



On Aug 12, 7:14 pm, Max Erickson maxerick...@gmail.com wrote:
 Chris Withers ch...@simplistix.co.uk wrote:

  I'm still reeling from what seems to be such a huge problem with
  httplib that seems to be largely ignored :-(

  Chris

 There is an httplib2 (but I don't know anything further about it...):

 http://code.google.com/p/httplib2/

 Calling wget or curl using a subprocess is probably as easy as it is
 ugly, I use the wget build from here:

 http://gnuwin32.sourceforge.net/packages/wget.htm

 max



Re: httplib incredibly slow :-(

2009-08-12 Thread David Stanek
On Tue, Aug 11, 2009 at 4:25 PM, Chris Withers ch...@simplistix.co.uk wrote:
 Hi All,

 I'm using the following script to download a 150MB file:

 from base64 import encodestring
 from httplib import HTTPSConnection
 from datetime import datetime

 conn = HTTPSConnection('localhost')
 headers = {}
 auth = 'Basic '+encodestring('username:password').strip()
 headers['Authorization']=auth
 t = datetime.now()
 print t
 conn.request('GET','/somefile.zip',None,headers)
 print 'request:',datetime.now()-t
 response = conn.getresponse()
 print 'response:',datetime.now()-t
 data = response.read()
 print 'read:',datetime.now()-t

 The output shows it takes over 20 minutes to do this.
 However, this is on a local network, and downloading the same file in IE
 takes under 3 seconds!

 I saw this issue:

 http://bugs.python.org/issue2576

 I tried changing the buffer size to 4096 in a subclass as the issue
 suggested, but I didn't see the reported speed improvement.
 I'm using Python 2.6.2.

 Does anyone know of an alternative library for creating http requests and
 getting their responses that's faster but hopefully has a similar interface?


I tried to reproduce this, but I could not. Could you paste in the
output of your script? Also on the same box where you run this script
can you test with curl or wget?

-- 
David
blog: http://www.traceback.org
twitter: http://twitter.com/dstanek


Re: httplib incredibly slow :-(

2009-08-12 Thread Chris Withers

Max Erickson wrote:

There is an httplib2 (but I don't know anything further about it...):

http://code.google.com/p/httplib2/


I had a look, it uses httplib, so will likely suffer from the same 
problems...


Calling wget or curl using a subprocess is probably as easy as it is 
ugly, I use the wget build from here:


http://gnuwin32.sourceforge.net/packages/wget.htm


Yeah, no ;-)

Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk


Re: httplib incredibly slow :-(

2009-08-12 Thread Shailesh Kumar
Yes, it includes libcurl. I didn't have to install it separately. I still
use Python 2.4, so I cannot say anything about Python 2.6.

- Shailesh

On Wed, Aug 12, 2009 at 10:23 PM, Chris Withers ch...@simplistix.co.uk wrote:

 shaileshkumar wrote:

 We use PyCURL on Windows. http://pycurl.sourceforge.net/ provides pre-
 built versions for Windows and it works out of the box.


 Does it include libcurl? Are these builds available for Python 2.6?

 Chris

 --
 Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk



Re: httplib incredibly slow :-(

2009-08-12 Thread Chris Withers

shaileshkumar wrote:

We use PyCURL on Windows. http://pycurl.sourceforge.net/ provides pre-
built versions for Windows and it works out of the box.


Does it include libcurl? Are these builds available for Python 2.6?

Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk


Re: httplib incredibly slow :-(

2009-08-12 Thread Chris Withers

David Stanek wrote:

I tried to reproduce this, but I could not. Could you paste in the
output of your script? 


Not sure how that'll help, but sure:

2009-08-11 21:27:59.153000
request: 0:00:00.109000
response: 0:00:00.109000
read: 0:24:31.266000

 Also on the same box where you run this script

can you test with curl or wget?


It's a Windows box, so no :-(
But it really does download in a few seconds with IE, and 20min+ using 
the script I included...


Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk


Re: httplib incredibly slow :-(

2009-08-12 Thread i3dmaster
On Aug 12, 9:37 am, Chris Withers ch...@simplistix.co.uk wrote:
 David Stanek wrote:
  I tried to reproduce this, but I could not. Could you paste in the
  output of your script?

 Not sure how that'll help, but sure:

 2009-08-11 21:27:59.153000
 request: 0:00:00.109000
 response: 0:00:00.109000
 read: 0:24:31.266000

   Also on the same box where you run this script

  can you test with curl or wget?

 It's a Windows box, so no :-(
 But it really does download in a few seconds with IE, and 20min+ using
 the script I included...

 Chris

 --
 Simplistix - Content Management, Batch Processing  Python Consulting
             -http://www.simplistix.co.uk

Just wanted to check if you can try turning on the debug mode for
httplib and see if you can read a bit more debug info on where the
calls get hung. In your example, it would be conn.set_debuglevel(1)
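
A minimal sketch of what the debug mode prints, written for Python 3, where httplib is called http.client. The local Handler is a hypothetical stand-in for the real server so that the sketch runs offline, and stdout is captured only to make the output easy to inspect. The output covers the request sent, the reply status line and the headers.

```python
import contextlib
import http.client
import io
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    """Tiny local stand-in for the real server."""

    def do_GET(self):
        payload = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the server quiet


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    conn.set_debuglevel(1)  # debug lines are printed to stdout
    conn.request("GET", "/somefile.zip")
    response = conn.getresponse()
    data = response.read()
    conn.close()

server.shutdown()
server.server_close()

debug_output = captured.getvalue()
print(debug_output)
```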


Re: httplib incredibly slow :-(

2009-08-12 Thread David Robinow
On Wed, Aug 12, 2009 at 12:37 PM, Chris Withers ch...@simplistix.co.uk wrote:
 David Stanek wrote:
 Also on the same box where you run this script
 can you test with curl or wget?
 It's a Windows box, so no :-(

Why not?

http://users.ugent.be/~bpuype/wget/
http://curl.haxx.se/download.html


httplib incredibly slow :-(

2009-08-11 Thread Chris Withers

Hi All,

I'm using the following script to download a 150MB file:

from base64 import encodestring
from httplib import HTTPSConnection
from datetime import datetime

conn = HTTPSConnection('localhost')
headers = {}
auth = 'Basic '+encodestring('username:password').strip()
headers['Authorization']=auth
t = datetime.now()
print t
conn.request('GET','/somefile.zip',None,headers)
print 'request:',datetime.now()-t
response = conn.getresponse()
print 'response:',datetime.now()-t
data = response.read()
print 'read:',datetime.now()-t

The output shows it takes over 20 minutes to do this.
However, this is on a local network, and downloading the same file in IE 
takes under 3 seconds!


I saw this issue:

http://bugs.python.org/issue2576

I tried changing the buffer size to 4096 in a subclass as the issue 
suggested, but I didn't see the reported speed improvement.

I'm using Python 2.6.2.

Does anyone know of an alternative library for creating http requests 
and getting their responses that's faster but hopefully has a similar 
interface?
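
For reference, a rough Python 3 sketch of the same measurement, reading the response in fixed-size chunks rather than with one read() call. In Python 3 httplib is named http.client, and base64.b64encode takes the place of encodestring. The local Handler, the 1 MB payload and the 64 KB chunk size are assumptions made so that the sketch runs offline; they say nothing about the real server.

```python
import base64
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(("%s:%s" % (username, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def timed_download(conn, path, headers, chunk_size=64 * 1024):
    """GET path over conn, reading in chunks; return (bytes_read, seconds)."""
    start = time.perf_counter()
    conn.request("GET", path, headers=headers)
    response = conn.getresponse()
    total = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total, time.perf_counter() - start


class Handler(BaseHTTPRequestHandler):
    """Hypothetical local server standing in for the real one."""

    payload = b"x" * (1024 * 1024)

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(self.payload)))
        self.end_headers()
        self.wfile.write(self.payload)

    def log_message(self, *args):
        pass  # keep the server quiet


server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
headers = {"Authorization": basic_auth_header("username", "password")}
size, seconds = timed_download(conn, "/somefile.zip", headers)
conn.close()
server.shutdown()
server.server_close()

print("read %d bytes in %.3f s" % (size, seconds))
```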


Chris

--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk


Re: httplib incredibly slow :-(

2009-08-11 Thread Aahz
In article mailman.4598.1250022343.8015.python-l...@python.org,
Chris Withers  ch...@simplistix.co.uk wrote:

Does anyone know of an alternative library for creating http requests 
and getting their responses that's faster but hopefully has a similar 
interface?

PyCurl
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

...string iteration isn't about treating strings as sequences of strings, 
it's about treating strings as sequences of characters.  The fact that
characters are also strings is the reason we have problems, but characters 
are strings for other good reasons.  --Aahz


Re: httplib incredibly slow :-(

2009-08-11 Thread Chris Withers

Aahz wrote:

In article mailman.4598.1250022343.8015.python-l...@python.org,
Chris Withers  ch...@simplistix.co.uk wrote:
Does anyone know of an alternative library for creating http requests 
and getting their responses that's faster but hopefully has a similar 
interface?


PyCurl


This seems to be a wrapper around libcurl.

Does it work on Windows?
If so, where can I find some decent examples?
(the ones listed on the pycurl website are not what I'd call decent :-S)

Chris


--
Simplistix - Content Management, Batch Processing  Python Consulting
   - http://www.simplistix.co.uk