Re: [paramiko] SCPClient slower than scp
On Mon, Feb 1, 2010 at 2:52 AM, Roman Yakovenko roman.yakove...@gmail.com wrote:
> Hello.
>
> I am using the SCPClient class from this branch
> ( http://bazaar.launchpad.net/~jbardin/paramiko/paramiko_scp/annotate/500?file_id=scp.py-20081117202350-5q0ozjv6zz9ww66y-1 )
> with paramiko 1.7.6 and Python 2.6 on Ubuntu Karmic Koala.
>
> I am testing my code with a 1 GB file.
>
> The SCPClient upload rate starts at 10 MB/s and then drops to 5.2 MB/s.
> The average is 5.2 MB/s. I tried changing the buffer size, but this didn't help.
>
> The scp command upload rate starts at 20 MB/s and then drops to 10 MB/s.
> The average is 10 MB/s.
>
> To complete the statistics, the average rate of paramiko's built-in
> SFTPClient is 2.2 MB/s. I use the put method as is, with the default
> configuration.
>
> I am not sure where to start to solve the problem. Initially, I suspected
> that local file reading was the problem, but that functionality works
> pretty well.

You would normally start by using a profiler to see where the performance bottleneck is, before you start speculating. You would have seen that most of the time is spent in paramiko.Transport manipulating data and waiting for pyCrypto. SCPClient adds almost nothing to the overall time.

> Right now, I am using a workaround (executing scp with subprocess), but it
> is a less than optimal solution. Any help is appreciated.

Yes, a solution written entirely in C will be significantly faster. Since this is mostly Python, the CPU is the limiting factor. There may be some places where optimizations could be made in paramiko and pyCrypto, but I haven't looked into it myself.

-jim
___
paramiko mailing list
paramiko@lag.net
http://www.lag.net/cgi-bin/mailman/listinfo/paramiko
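A minimal sketch of the "start with a profiler" advice, using only the standard library. The function fake_upload is a hypothetical stand-in for a real transfer loop (it only MACs chunks of zeros, roughly the kind of per-packet work an SSH transport does); in practice you would pass your own upload function to profile_call instead. It is written for Python 3, not the Python 2.x versions discussed in this thread.

```python
import cProfile
import hashlib
import hmac
import io
import pstats


def fake_upload(chunks=200, chunk_size=16384):
    """Hypothetical stand-in for an upload loop: compute a MAC over every chunk."""
    key = b"k" * 20
    payload = b"\0" * chunk_size
    sent = 0
    for _ in range(chunks):
        hmac.new(key, payload, hashlib.sha1).digest()
        sent += len(payload)
    return sent


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(10)  # show the top 10 entries
    return result, report.getvalue()


if __name__ == "__main__":
    sent, report = profile_call(fake_upload)
    print(report)
```

The functions at the top of the cumulative-time listing are where the time goes; if they sit in the transport or crypto layers rather than in your own code, changing the SCP wrapper is unlikely to help.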
Re: [paramiko] SCPClient slower than scp
On Mon, Feb 1, 2010 at 5:36 PM, james bardin jbar...@bu.edu wrote:
> On Mon, Feb 1, 2010 at 2:52 AM, Roman Yakovenko wrote:
>> I am testing my code with a 1 GB file.
>>
>> The SCPClient upload rate starts at 10 MB/s and then drops to 5.2 MB/s.
>> The average is 5.2 MB/s. I tried changing the buffer size, but this didn't help.
>>
>> The scp command upload rate starts at 20 MB/s and then drops to 10 MB/s.
>> The average is 10 MB/s.
>>
>> To complete the statistics, the average rate of paramiko's built-in
>> SFTPClient is 2.2 MB/s. I use the put method as is, with the default
>> configuration.
>>
>> I am not sure where to start to solve the problem. Initially, I suspected
>> that local file reading was the problem, but that functionality works
>> pretty well.
>
> You would normally start by using a profiler to see where the performance
> bottleneck is, before you start speculating. You would have seen that most
> of the time is spent in paramiko.Transport manipulating data and waiting
> for pyCrypto. SCPClient adds almost nothing to the overall time.

Thanks for the advice. I'll follow it. It was not complete speculation: the CPU usage was pretty much the same for all solutions.

>> Right now, I am using a workaround (executing scp with subprocess), but it
>> is a less than optimal solution. Any help is appreciated.
>
> Yes, a solution written entirely in C will be significantly faster. Since
> this is mostly Python, the CPU is the limiting factor.

I have zero experience with ssh and encryption, but my expectation was that, at least when transferring 10+ GB files, the process would be bound by the network and not the CPU.

> There may be some places where optimizations could be made in paramiko and
> pyCrypto, but I haven't looked into it myself.

Thank you.

--
Roman Yakovenko
C++ Python language binding
http://www.language-binding.net/
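The workaround mentioned in this thread (running the scp binary through subprocess) might look roughly like the sketch below. The function names, the example host and the paths are hypothetical; the code assumes the OpenSSH scp client is on PATH and that key-based authentication is already set up, since -B disables password prompts. It is written for Python 3.

```python
import subprocess


def build_scp_command(local_path, host, remote_path, user=None, port=22):
    """Build the argument list for the OpenSSH scp client (nothing is executed here)."""
    target = "%s:%s" % (host, remote_path)
    if user:
        target = "%s@%s" % (user, target)
    # -B: batch mode (never prompt for a password); -P: remote port.
    return ["scp", "-B", "-P", str(port), local_path, target]


def upload_with_scp(local_path, host, remote_path, user=None, port=22):
    """Run scp; raises subprocess.CalledProcessError if it exits non-zero."""
    subprocess.check_call(
        build_scp_command(local_path, host, remote_path, user, port))


if __name__ == "__main__":
    # Only prints the command; call upload_with_scp() to actually transfer.
    print(build_scp_command("big.bin", "example.org", "/tmp/big.bin", user="roman"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.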
Re: [paramiko] SCPClient slower than scp
On Mon, Feb 1, 2010 at 11:49 AM, Roman Yakovenko roman.yakove...@gmail.com wrote:
>> You would normally start by using a profiler to see where the performance
>> bottleneck is, before you start speculating. You would have seen that most
>> of the time is spent in paramiko.Transport manipulating data and waiting
>> for pyCrypto. SCPClient adds almost nothing to the overall time.
>
> Thanks for the advice. I'll follow it. It was not complete speculation:
> the CPU usage was pretty much the same for all solutions.

There are some other limiting factors in both paramiko and openssh (http://www.psc.edu/networking/projects/hpn-ssh/), but I have always hit the CPU wall with paramiko long before anything else is relevant. If you're maxing out one processor core in each case, you're just seeing the difference in efficiency between the C code and the Python+C code (pyCrypto does the heavy lifting in C).

>> Yes, a solution written entirely in C will be significantly faster. Since
>> this is mostly Python, the CPU is the limiting factor.
>
> I have zero experience with ssh and encryption, but my expectation was
> that, at least when transferring 10+ GB files, the process would be bound
> by the network and not the CPU.

The size of the file has nothing to do with it once the connection and negotiation time become irrelevant. It's an encrypted stream of data, so you're limited by how fast you can process it, not by how long it is.
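The point about file size can be illustrated with a toy model, assuming the transfer is a pipeline whose steady-state rate is set by its slowest stage and that setup (connection and negotiation) is a fixed cost. The function names and all the numbers below are illustrative assumptions, not measurements from this thread.

```python
def transfer_time(size_bytes, network_rate, cpu_rate, setup_seconds=0.0):
    """Seconds to move size_bytes when the slower of the two stages sets the pace."""
    bottleneck = min(network_rate, cpu_rate)  # bytes per second
    return setup_seconds + size_bytes / bottleneck


def effective_rate(size_bytes, network_rate, cpu_rate, setup_seconds=0.0):
    """Average bytes per second over the whole transfer, setup included."""
    return size_bytes / transfer_time(
        size_bytes, network_rate, cpu_rate, setup_seconds)


if __name__ == "__main__":
    MB = 10 ** 6
    # Assumed figures: 10 MB/s network, 5 MB/s encryption, 1 s of setup.
    for size in (10 * MB, 1000 * MB, 10000 * MB):
        rate = effective_rate(size, 10 * MB, 5 * MB, setup_seconds=1.0)
        print("%6d MB -> %.2f MB/s" % (size // MB, rate / MB))
```

Under these assumptions the average rate climbs toward the CPU-bound 5 MB/s as the file grows and then stays there; a larger file only makes the fixed setup cost matter less, it does not move the bottleneck to the network.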
Re: [paramiko] SCPClient slower than scp
On Mon, Feb 1, 2010 at 12:16 PM, james bardin jbar...@bu.edu wrote:
>>> Yes, a solution written entirely in C will be significantly faster. Since
>>> this is mostly Python, the CPU is the limiting factor.
>>
>> I have zero experience with ssh and encryption, but my expectation was
>> that, at least when transferring 10+ GB files, the process would be bound
>> by the network and not the CPU.

Your email got me thinking, so I did a few tests:

The biggest boost in performance came from using the latest pycrypto (2.1.0). You'll get a deprecation warning from paramiko that you can ignore for now (a bug has already been submitted on GitHub). There was a change to the HMAC code that made a huge difference in paramiko's performance.

I tried using a limited-bandwidth connection, and paramiko was on par with openssh when CPU wasn't a concern.

When bandwidth wasn't an issue (using loopback), paramiko was about 85% of the speed of openssh on my machine.

Each newer version of Python 2.x was slightly faster as well.
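To get a rough feel for how much the per-packet HMAC costs on a given machine, one could time it in isolation. The sketch below uses the standard library's hmac and hashlib modules (Python 3), not pyCrypto's HMAC implementation, so it does not reproduce the 2.1.0 change described above; the function name, the key and the packet size are arbitrary assumptions.

```python
import hashlib
import hmac
import time


def hmac_throughput(total_bytes=8 * 1024 * 1024, packet_size=32768,
                    digest=hashlib.sha1):
    """Return bytes MACed per second for packet_size-byte packets."""
    key = b"\x0b" * 20
    packet = b"\0" * packet_size
    packets = max(1, total_bytes // packet_size)
    start = time.perf_counter()
    for sequence in range(packets):
        # SSH computes the MAC over a sequence number followed by the packet.
        mac = hmac.new(key, digestmod=digest)
        mac.update(sequence.to_bytes(4, "big"))
        mac.update(packet)
        mac.digest()
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return packets * packet_size / elapsed


if __name__ == "__main__":
    print("HMAC-SHA1: %.1f MB/s" % (hmac_throughput() / 1e6))
```

Comparing the printed figure with the network bandwidth gives a first hint of whether the MAC step alone could be a bottleneck; encryption and the transport's own bookkeeping come on top of it.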
Re: [paramiko] SCPClient slower than scp
On Tue, Feb 2, 2010 at 12:26 AM, james bardin jbar...@bu.edu wrote:
> On Mon, Feb 1, 2010 at 12:16 PM, james bardin jbar...@bu.edu wrote:
>>>> Yes, a solution written entirely in C will be significantly faster. Since
>>>> this is mostly Python, the CPU is the limiting factor.
>>>
>>> I have zero experience with ssh and encryption, but my expectation was
>>> that, at least when transferring 10+ GB files, the process would be bound
>>> by the network and not the CPU.
>
> Your email got me thinking, so I did a few tests:
>
> The biggest boost in performance came from using the latest pycrypto
> (2.1.0). You'll get a deprecation warning from paramiko that you can ignore
> for now (a bug has already been submitted on GitHub). There was a change to
> the HMAC code that made a huge difference in paramiko's performance.
>
> I tried using a limited-bandwidth connection, and paramiko was on par with
> openssh when CPU wasn't a concern.

As expected, since sending the data takes much more time than encrypting it.

> When bandwidth wasn't an issue (using loopback), paramiko was about 85% of
> the speed of openssh on my machine.

That is really good news. I will try to upgrade the code.

I am using the real IP address to test my code (I found the following code on the internet):

    import socket
    import struct
    import fcntl

    def get_ip_address(fname='eth0'):
        # Ask the kernel for the IPv4 address of the interface (Linux only).
        SIOCGIFADDR = 0x8915
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        io_result = fcntl.ioctl(s.fileno(),
                                SIOCGIFADDR,
                                struct.pack('256s', fname[:15]))
        # Bytes 20-23 of the returned ifreq structure hold the address.
        return socket.inet_ntoa(io_result[20:24])

On the one hand, all requests go via the router; on the other hand, I have local access to both ends. For file transfers, md5sum is executed on both files and the results are compared.

> Each newer version of Python 2.x was slightly faster as well.

I am using Python 2.4 (production sysadmins are so conservative :-) ) and 2.6 in development, but as you noted, there is not a big difference between them.

Thank you for the help.
--
Roman Yakovenko
C++ Python language binding
http://www.language-binding.net/
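The md5sum comparison described in the message above can also be done from Python. This is a minimal sketch with hypothetical function names, written for Python 3; it reads each file in fixed-size chunks so that multi-gigabyte files never have to fit in memory. For a real transfer, the remote digest would have to be computed on the remote host (for example by running md5sum there) and compared with the local one.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

MD5 is adequate here for detecting accidental corruption during a transfer; it should not be relied on to detect deliberate tampering.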