Hi again,

Thanks for all the feedback so far, it was really helpful for debugging and narrowing things down further. Currently I'm a little stuck with the testing again, so time to share some of the results I'm getting, I think :). I'll be a little more verbose about the things I fixed and tried along the way, rather than just describing the point I'm currently stuck at. Maybe some people will find the intermediate steps useful later, too.
The first issue is (as already stated in the IRC channel) that scp actually is not working / is not supposed to work. scp never gave me any error messages, and since I was always using the same destination for the downloaded file, I assumed the download via scp had worked - but in fact it did nothing at all. The sftp command worked fine, however - except for having the same latency issues.

I then discovered that the "tahoe put", which saturates my whole upload capacity of about 800-900 kbit/s, introduced quite a lot of latency: about 1 s on average, highly variable. So it looked very similar to the bufferbloat phenomenon [1]. To rule out that the problem is the interaction of "tahoe get" and "tahoe put", I switched to creating the upload traffic via netcat6 instead of "tahoe put" (nc6 192.168.145.3 12345 < /dev/urandom &). And I still had the same awful 20 s and higher latencies for tahoe-lafs itself.

I then added some tc/qdisc rules via wondershaper on my Linux router, limiting the upload rate to 800 kbit/s, which helped partially: the big upload of the VM on a dedicated PC running tahoe no longer had an impact on the latency of other machines on my network (e.g. the internet latency was normal again for my laptop). However, the latency within the tinc VPN (which is where tahoe is currently running; actually it was running within the tinc VPN layer within the BATMAN mesh layer, but I removed the mesh layer for testing for now) was still at the highly variable 1 s.

To further verify that the tahoe-lafs and network latency issue was somewhere on my local network, I set up wondershaper to shape the upload rate to 400 kbit/s (and verified via vnstat that really only about 400 kbit/s leave my internet ppp connection). And yes, the 1 s latency was still present within the VM, and the tahoe-lafs latency was still 20 s+.
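For reference, the wondershaper-style shaping on the router corresponds roughly to a tc setup like the following. This is a minimal sketch, not what wondershaper literally installs (it sets up additional priority classes): the interface name ppp0 and the exact HTB parameters are assumptions here.

```shell
#!/bin/sh
# Minimal egress-shaping sketch (assumed interface ppp0, assumed rate).
# Requires root; wondershaper does roughly this, plus prioritization.
DEV=ppp0
RATE=800kbit

# Remove any existing root qdisc (ignore the error if none is present).
tc qdisc del dev $DEV root 2> /dev/null

# HTB root with a single class capped slightly below the link rate, so
# the queue builds up here (where we control it) and not in the modem.
tc qdisc add dev $DEV root handle 1: htb default 10
tc class add dev $DEV parent 1: classid 1:10 htb rate $RATE ceil $RATE

# Keep the shaper's own queue fair and short to bound the added delay.
tc qdisc add dev $DEV parent 1:10 handle 10: sfq perturb 10
```

The idea of shaping slightly below the real link rate is the standard bufferbloat workaround: the bottleneck queue moves from the (deep, uncontrollable) modem buffer into a qdisc we can manage.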
So I think the issue is somewhere at my place, at home, as I'm no longer saturating my ISP's routers or anything at or towards the storage node on the other side. I first tried also using wondershaper on the VPN interface, limiting the upload rate to something slightly lower than the one set on the router's ppp interface, to move any buffering issues as close to the sender as possible. But I discovered that, for one thing, wondershaper does not work for IPv6, seemingly because the Linux kernel's tc filters cannot deal with anything other than IPv4 [2] (I was using netcat6 via IPv6 in the beginning) - and for the final use case the BATMAN packets are neither IPv6 nor IPv4. For another thing, a fixed upload capacity via wondershaper would probably not work when I'm simultaneously uploading something from my laptop at, for instance, 200 kbit/s.

So I ended up playing with the txqueuelen on the VPN interface tap0, which of course is not really ideal either... For 1280 byte packets, a txqueuelen of 2 works great (it limits the throughput to about ~800 kbit/s), but with 700 byte packets (as they come from the mesh network layer due to internal fragmentation) it only saturates the link to about 200 kbit/s; for 700 byte packets a txqueuelen of 4 would be ideal. Anyway, for the tests I went with a txqueuelen of 2, and will go with 4 for the real setup later, until I find a better solution.

So great, within the VPN I can now successfully upload with netcat6 at 800-900 kbit/s, and an ICMP ping over the VPN has a nice latency of only 70 ms. Time to get back to tahoe-lafs:

---
/usr/bin/time -f "%e" sh -c "for i in \`seq 1 10\`; do ~/allmydata-tahoe-1.8.2/bin/tahoe get root:music/8bitpeoples/8BP102.gif /tmp/8BP102.gif 2> /dev/null; done"
-> 10.39!
---

Cool, that finally works nicely and has an acceptable delay of 1 s per transfer on average.
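A pattern in those txqueuelen numbers: txqueuelen 2 at 1280 bytes and txqueuelen 4 at 700 bytes hold almost the same amount of data (2560 vs. 2800 bytes), which suggests it is the buffer depth in bytes, not the packet count, that matters. A quick back-of-the-envelope sketch of the resulting worst-case queueing delay (the 800 kbit/s rate and the packet sizes are taken from the tests above; this is just arithmetic, not a claim about tinc internals):

```shell
#!/bin/sh
# Back-of-the-envelope: buffer depth in bytes and worst-case queueing
# delay for a given txqueuelen and packet size at an 800 kbit/s uplink.
RATE_BPS=800000   # shaped upload rate in bit/s

queue_delay_ms() {  # args: txqueuelen packet_size_bytes
    buffer_bytes=$(( $1 * $2 ))
    echo $(( buffer_bytes * 8 * 1000 / RATE_BPS ))
}

# txqueuelen 2 with 1280-byte packets buffers 2560 bytes...
queue_delay_ms 2 1280    # -> 25 (ms)
# ...while txqueuelen 4 with 700-byte packets buffers a similar 2800 bytes,
queue_delay_ms 4 700     # -> 28 (ms)
# so both keep the added queueing delay in the tens of milliseconds.
# A default-ish txqueuelen 500 with 1280-byte packets buffers 640000 bytes:
queue_delay_ms 500 1280  # -> 6400 (ms)
```

Which would fit the observation that a short queue keeps the VPN ping near its base latency, while a deep one adds multiple seconds on its own.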
(see [3] -> 03:21:29 - 03:21:38 for details)

To verify that this also solved the issue for a parallel 'tahoe put' and 'tahoe get', I stopped netcat6 and started the 'tahoe put' of a 1 GB file. I again waited until the upload started and checked that the ICMP ping was fine. Then I started the 'tahoe get' loop again:

---
/usr/bin/time -f "%e" sh -c "for i in \`seq 1 10\`; do ~/allmydata-tahoe-1.8.2/bin/tahoe get root:music/8bitpeoples/8BP102.gif /tmp/8BP102.gif 2> /dev/null; done"
-> 220.42!?
---

Pfeh, and now it seems to have something to do with 'tahoe put' and 'tahoe get' again, although I had thought I had ruled that out in the beginning... The issue is there again, 22 s per transfer on average (see [4] -> 03:34:03 - 03:37:25 for details).

Finally, two more tests with a txqueuelen of 500 on tap0 instead of 2, just to show the impact of that again:

With 'tahoe put':
---
/usr/bin/time -f "%e" sh -c "for i in \`seq 1 10\`; do ~/allmydata-tahoe-1.8.2/bin/tahoe get root:music/8bitpeoples/8BP102.gif /tmp/8BP102.gif 2> /dev/null; done"
-> 178.23
---
(not quite sure why it is lower now; maybe some variance, or maybe the txqueuelen of 2 was so low that it hurt performance with multiple streams; 18 s average; see [5] -> 03:50:57 - 03:53:38 for details)

With netcat6:
---
/usr/bin/time -f "%e" sh -c "for i in \`seq 1 10\`; do ~/allmydata-tahoe-1.8.2/bin/tahoe get root:music/8bitpeoples/8BP102.gif /tmp/8BP102.gif 2> /dev/null; done"
-> 51.28!?
---
5 s average. 'tahoe put' definitely seems to have something to do with the issue, as with netcat6 generating the traffic it's not nearly as bad as with 'tahoe put'. (see [6] -> 03:59:34 - 04:00:19 for details)

Can anyone make any sense of these test results?

Cheers, Linus

PS: Just for the record, I'm using tinc with the IFFOneQueue option.
[1]: http://gettys.wordpress.com/2010/12/03/introducing-the-criminal-mastermind-bufferbloat/
     http://www.bufferbloat.net/
[2]: http://lartc.org/howto/lartc.adv-filter.ipv6.html
[3]: http://x-realis.dyndns.org/tahoe/logs/round1
[4]: http://x-realis.dyndns.org/tahoe/logs/round2
[5]: http://x-realis.dyndns.org/tahoe/logs/round3
[6]: http://x-realis.dyndns.org/tahoe/logs/round4

_______________________________________________
tahoe-dev mailing list
[email protected]
http://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
