On 7/25/10 8:10 PM, Kyle Markley wrote:
> Brian,
>
> Yeah, I think we're approximately saturating the network during the large
> file transfers. But for the small files, both network and CPU load are
> very low (under 10%).

Yup. I suspect that your large files are running into python's performance
limits: the best way to speed those up will be to move our transport to
something with less overhead (signed HTTP is our current idea, ticket #510),
then to start looking at what pieces can be rewritten in a faster language.
The obvious parts are already in C or C++.

> The large number of recvfrom calls on the tahoe backup of small files
> looks odd because it's a huge mismatch against the sendto calls, but
> nothing else stands out.

Yeah, that's odd, it makes me wonder if the receivers are pulling small
chunks of data off the wire, even when the sender is writing large chunks.
Twisted's TCP transport code should be pulling up to 64KiB at a time out of
socket.recv(). Maybe we could find out the size of some of those recvfrom()
calls? It depends a lot upon how large of a chunk the kernel is passing up,
of course.

> I wonder whether temporary file creation for the small file transfers
> might be part of the problem. I know that temporary files are created on
> occasion; could someone explain precisely when?

There aren't very many. The most relevant one here is that "large" HTTP
request bodies (the threshold is 100kB) are held on disk (instead of in RAM)
between the HTTP PUT/POST request and the actual encrypt+encode+push upload.
For files smaller than a few hundred MB, I'd expect the kernel's filesystem
caches to hold this data, and it probably wouldn't even touch the platters.
(the python stdlib tempfile.TemporaryFile API is used, which I think opens
the file in /tmp/ with a random name and unlinks it right away).

The other two places are:

* when you do a download with a Range: header that doesn't cover the whole
  file, the 1.7.1 code creates a temporary file to hold the incoming data
  (and then does seek()+read() to return just the desired segment).
* if you're using a Helper, then the helper holds encrypted upload data on
  disk until the ciphertext transfer is complete and it can start the real
  encode+push.

But other than that,

> Ping to the storage node through the wired interface is about 0.66ms.
> Ping to the storage node through the wireless interface is about 1.25ms.

Yeah, it might be nice to find out how many roundtrips are involved (perhaps
by analyzing the tcpdump of a single upload, counting the trips manually,
and then multiplying), to see how many seconds this latency represents.. it
might be considerable.

cheers,
 -Brian
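
The roundtrip estimate suggested above (count the trips for one upload, then
multiply) could be sketched roughly as follows. The two RTT values are the
ping figures quoted in this thread; the roundtrips-per-file and file-count
numbers are hypothetical placeholders, not measurements, and the function
name is made up for illustration. It also assumes the roundtrips are fully
serialized, which may overstate the cost if any requests are pipelined.

```python
def latency_cost_seconds(roundtrips_per_file, rtt_ms, num_files):
    """Time spent purely waiting on network latency, assuming every
    roundtrip is serialized (no pipelining or overlap)."""
    return roundtrips_per_file * (rtt_ms / 1000.0) * num_files


# RTTs come from the ping results quoted in the thread.
WIRED_RTT_MS = 0.66
WIRELESS_RTT_MS = 1.25

# Hypothetical placeholders: replace with the count taken from a tcpdump
# of a single upload and the real number of files in the backup.
ASSUMED_ROUNDTRIPS_PER_FILE = 50
ASSUMED_NUM_FILES = 1000

if __name__ == "__main__":
    for name, rtt in (("wired", WIRED_RTT_MS), ("wireless", WIRELESS_RTT_MS)):
        total = latency_cost_seconds(
            ASSUMED_ROUNDTRIPS_PER_FILE, rtt, ASSUMED_NUM_FILES
        )
        print(f"{name}: about {total:.1f} s spent on latency alone")
```

Comparing that total against the observed wall-clock time of the small-file
backup would indicate whether latency is a significant share of it.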
