Brian,

> A side-effect (and arguably a benefit) of low utilization is that
> uploads don't mess up your other traffic very badly, which was
> convenient for a consumer application that ran for days at a time. Every
> couple of seconds, you get a gap in which other applications can get
> their data through. Having that sort of "breathing room" let us defer
> the development of more intelligent bandwidth-management schemes.

True... but on the other hand, an initial backup takes a long time, and
I'd like it to be able to finish overnight (because my network can
support that) instead of needing to let it run for several days.

It would be nice to expose more of the tuning parameters as
configuration options. Some of my machines have oodles of spare memory
that I would be happy for tahoe to use. The round-trip time and
bandwidth of a LAN are very unlike the asymmetric DSL environment
assumed by the code. And some users have unusually good Internet
connectivity; I actually have symmetric 25Mbps. In short, the defaults
are all wrong for me. :)

> servers. This thrashed the disk and used a lot of RAM. So Tahoe streams
> the file out: it encrypts+encodes segment[0] (typically 128KiB), uploads
> the blocks, waits for those transfers to complete, then forgets about
> seg[0] and starts the process on seg[1]. If the time it takes to push
> 128KiB over the network is not significantly larger than the round-trip
> time, your upload pipe will be underutilized.

This was exciting to read. The encrypt+encode/transfer ping-pong
guarantees that we will be using either the CPU or the network, but
never both simultaneously, leading to low utilization of both.

I'm very handy with threading and googled up some information on the
Python threading model... and then I learned about the GIL, which
guarantees very low returns to multithreading. (And this sort of
circumstance is best solved by multithreading, not multiprocessing.) I
was excited about writing a bit of code that would use my threading
skills while getting me to learn a new language and contribute to a
great project, then had my dreams crushed by learning that the dominant
Python interpreter is thread-hostile... so why bother? :(

> If there are only a few slower servers, we could probably afford to
> store those shares on disk, but there'd be a funny policy question of
> how far to let the fast servers race ahead versus how much storage we're
> allowed to use.

For my part, I'd be happiest allocating a chunk of memory (not disk) for
tahoe to use for general-purpose buffering. I hate it when my disk is
slow -- virus scanners are evil :) -- but my computers tend to have more
memory than they really need.

--
Kyle Markley
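
P.S. To put rough numbers on the underutilization point in the quoted
description, here is a minimal sketch. It assumes a pure stop-and-wait
model: each segment is pushed and acknowledged before the next one
starts, and encode time and erasure-coding expansion are ignored. The
function name, the RTT figures, and the 1 Gbps LAN speed are
illustrative assumptions, not anything taken from the Tahoe code; only
the 128KiB segment size and the 25Mbps link come from this thread.

```python
def stop_and_wait_utilization(segment_bytes, link_bps, rtt_s):
    """Fraction of time the uplink is busy when every segment must be
    acknowledged (one round trip) before the next one is sent.
    Hypothetical helper: a simplified model, not Tahoe's real behavior."""
    transfer_s = segment_bytes * 8 / link_bps  # time to push one segment
    return transfer_s / (transfer_s + rtt_s)   # busy time / total time


SEGMENT = 128 * 1024  # 128 KiB segment, from the quoted description

# (label, link speed in bits/s, round-trip time in seconds)
# The RTTs and the LAN speed are assumed values for illustration.
SCENARIOS = [
    ("25 Mbps Internet link, 50 ms RTT", 25_000_000, 0.050),
    ("1 Gbps LAN, 0.5 ms RTT", 1_000_000_000, 0.0005),
]

for label, link_bps, rtt_s in SCENARIOS:
    u = stop_and_wait_utilization(SEGMENT, link_bps, rtt_s)
    print(f"{label}: about {u:.0%} of the link in use")
```

Under those assumptions the 25Mbps case sits a bit under half
utilization, since pushing 128KiB takes about 42 ms, which is comparable
to the assumed round trip. A larger segment or more than one segment in
flight would raise that figure, at the cost of more memory for
buffering.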
