[issue17552] Add a new socket.sendfile() method
Giampaolo Rodola' added the comment: Charles, Antoine, any final thought about this given the reasons I stated above? If you're still -1 about adding 'send_blocksize' argument I guess I can get rid of it and perhaps reintroduce it later if we see there's demand for it. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Charles-François Natali added the comment: I agree it is not necessary for sendfile() (you were right). Good we agree :-) Do not introducing it for send(), though, poses some questions. For instance, should we deprecate or ignore 'blocksize' argument in ftplib as well? Honestly, we should deprecate the whole ftplib module :-) More seriously, it's really low-level, I don't understand the point of this whole callback-based API: FTP.storbinary(command, file[, blocksize, callback, rest])Why not simply a: FTP.store(source, target, binary=True) If you have time and it interest you, trying to improve this module would be great :-) Generally speaking, when using send() there are circumstances where you might want to adjust the number of bytes to read from the file, for instance: - 1: set a small blocksize (say 512 bytes) on systems where you have a limited amount of memory - 2: set a big blocksize (say 256000) in order to speed up the transfer / use less CPU cycles; on very fast networks (e.g. LANs) this may result in a considerable speedup (I know 'cause I conducted these kind of tests in pyftpdlib: https://code.google.com/p/pyftpdlib/issues/detail?id=94). I agree, but both points are addressed by sendfile(): internally, the kernel will use whatever block size it likes, depending on the source file type, page size, target NIC ring buffer size. So to reply to your above question, I wouldn't feel too bad about simply ignoring the blocksize argument in e.g. shutil.copyfile(). For ftplib, it's a little harder since we're supposed to support an optional callback, but calling a callback for every block will drive performance down... So I'd really like it if you could push the version without the blocksize, and then we'll see (and hopefully fix the non-sensical libraries that depend on it). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Giampaolo Rodola' added the comment: I agree, but both points are addressed by sendfile() I'm talking about send(), not sendfile(). Please remember that send() will be used as the default on Windows or when non-regular files are passed to the function. My argument is about introducing an argument to use specifically with send(), not sendfile(). In summary: sendfile(self, file, offset=0, count=None, send_blocksize=16384): [...] If os.sendfile() is not available (e.g. Windows) or file is not a regular file socket.send() will be used instead. [...] *send_blocksize* is the maximum number of bytes to transmit at one time in case socket.send() is used. Honestly, we should deprecate the whole ftplib module :-) More seriously, it's really low-level, I don't understand the point of this whole callback-based API: FTP.storbinary(command, file[, blocksize, callback, rest]) Why not simply a: FTP.store(source, target, binary=True) ftplib module API may be a bit rusty but there's a reason why it was designed like that. 'callback' and 'blocksize' arguments can be used to implement progress bars, in-place transformations of the source file's data and bandwidth throttling (by having your callback 'sleep' if more than N bytes were sent in the last second). 'rest' argument is necessary for resuming uploads. I'm not saying ftplib API cannot be improved: maybe we can provide two higher level put and get methods but please let's discuss that into another thread. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Charles-François Natali added the comment: I agree, but both points are addressed by sendfile() I'm talking about send(), not sendfile(). Please remember that send() will be used as the default on Windows or when non-regular files are passed to the function. My argument is about introducing an argument to use specifically with send(), not sendfile(). Which makes even less sense if it's not needed for sendfile() :-) I really don't see why we're worrying so much about allocating 8K or 16K buffers, it's really ridiculous. For example, buffered file I/O uses a default block size of 8K. We're not targeting embedded systems. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Giampaolo Rodola' added the comment: OK then, I'll trust your judgement. I'll use 8K as the default and will commit the patch soon. -- assignee: - giampaolo.rodola ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Roundup Robot added the comment: New changeset 001895c39fea by Giampaolo Rodola' in branch 'default': fix issue #17552: add socket.sendfile() method allowing to send a file over a socket by using high-performance os.sendfile() on UNIX. Patch by Giampaolo Rodola'· http://hg.python.org/cpython/rev/001895c39fea -- nosy: +python-dev ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Changes by Giampaolo Rodola' g.rod...@gmail.com: -- keywords: +3.2regression -needs review, patch resolution: - fixed status: open - closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Giampaolo Rodola' added the comment: Looking back at this I think a send_blocksize argument is necessary after all. shutil.copyfileobj() has it, so is ftplib.FTP.storbinary() and httplib (issue 13559) which will both be using socket.sendfile() once it gets included. Updated patch is in attachment. If there are no other objections I'd be for committing this next week or something as I'm pretty confident with the current implementation. -- Added file: http://bugs.python.org/file35507/socket-sendfile7.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Charles-François Natali added the comment: Looking back at this I think a send_blocksize argument is necessary after all. shutil.copyfileobj() has it, so is ftplib.FTP.storbinary() and httplib (issue 13559) which will both be using socket.sendfile() once it gets included. Those APIs are really poor, please don't cripple sendfile() to mirror them. Once again, a send_blocksize argument doesn't make sense, you won't find it anywhere else. All that's needed is start offset and a size/length argument. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Giampaolo Rodola' added the comment: I agree it is not necessary for sendfile() (you were right). Do not introducing it for send(), though, poses some questions. For instance, should we deprecate or ignore 'blocksize' argument in ftplib as well? Generally speaking, when using send() there are circumstances where you might want to adjust the number of bytes to read from the file, for instance: - 1: set a small blocksize (say 512 bytes) on systems where you have a limited amount of memory - 2: set a big blocksize (say 256000) in order to speed up the transfer / use less CPU cycles; on very fast networks (e.g. LANs) this may result in a considerable speedup (I know 'cause I conducted these kind of tests in pyftpdlib: https://code.google.com/p/pyftpdlib/issues/detail?id=94). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Giampaolo Rodola' added the comment: ...speaking of which, now that I look back at those benchmarks it looks like 65536 bytes is the best compromise (in my latest patch I used 16348). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17552] Add a new socket.sendfile() method
Changes by STINNER Victor victor.stin...@gmail.com: -- title: socket.sendfile() - Add a new socket.sendfile() method ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17552 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com