Tahoe-LAFS Weekly Dev Chat, 07-Aug-2013
In attendance: Brian, Daira, Zooko, Mark, Nathan, Oleksandr(?)

We started by investigating the severe (2x) slowdown of the unit test
suite on the 1819-cloud-merge branch (ticket #1870). We're pretty sure
it involves the new leasedb operations. Brian specifically suspects the
DB writes, since he's heard that SQLite tries to be honest about
durability and does an fsync() after every db.commit(). Daira and Zooko
have done some analysis to count DB ops and tempfile generation.

Brian noticed that the specific test in question
(test_cli.Cp.test_copy_using_filecap) is mistakenly copying the entire
base directory, *including the servers and their shares*, into the
virtual filesystem, creating 174 files instead of four (and making it
remarkable that it ever terminates, since it's copying shares that are
created by the process of copying the other shares). Fixing that
reduces the non-leasedb runtime from 24s to 1s. The buggy test snuck
past code review last November.

The larger performance problem still remains. Daira will add some
instrumentation to measure time spent in db.commit() specifically, to
see if the slowdown can be attributed to it. If so, we need to look at
the DB writes and see if there's any clean way to consolidate them
(reducing the number of writes to one-per-new-share). If that's
insufficient, we may need to tell SQLite to stop fsyncing (i.e. reduce
durability) during unit tests. Daira is hopeful we won't have to do
that.

We also talked about moving the Buildbot config into a git repo where
it would be more accessible (Zooko has started on this), adding a
buildbot tool to warn us when a single test's runtime suddenly
increases (to spot things like the buggy test above), and reviewing
other patches.

Please join us in the Dev Chat next week!

cheers,
 -Brian

_______________________________________________
tahoe-dev mailing list
https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
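
For anyone who wants to try the db.commit() measurement on their own
machine, here is a rough sketch of what such instrumentation might look
like. It is not the instrumentation Daira will write and it does not
touch the real leasedb code: the TimedConnection class, the "shares"
table and its columns, and the measure() helper are all made up for
illustration, and it assumes Python 3 and the standard sqlite3 module.
It counts and times commits, and lets you compare one commit per insert
against a single consolidated commit, with or without
"PRAGMA synchronous = OFF".

```python
import os
import sqlite3
import tempfile
import time


class TimedConnection(sqlite3.Connection):
    """Hypothetical connection subclass that counts and times commit() calls."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.commit_count = 0
        self.commit_seconds = 0.0

    def commit(self):
        start = time.perf_counter()
        try:
            super().commit()
        finally:
            self.commit_count += 1
            self.commit_seconds += time.perf_counter() - start


def measure(n=50, commit_each=True, synchronous="FULL"):
    """Insert n rows and return (row_count, commit_count, commit_seconds).

    commit_each=True commits after every insert; False commits once at the end.
    synchronous is passed to "PRAGMA synchronous" (OFF skips the fsync).
    """
    if synchronous not in ("OFF", "NORMAL", "FULL"):
        raise ValueError("unsupported synchronous mode: %r" % (synchronous,))

    with tempfile.TemporaryDirectory() as tmpdir:
        # Use an on-disk file: an in-memory database would never fsync.
        path = os.path.join(tmpdir, "example.sqlite")
        db = sqlite3.connect(path, factory=TimedConnection)
        try:
            db.execute("PRAGMA synchronous = %s" % synchronous)
            # Made-up schema, not the real leasedb schema.
            db.execute("CREATE TABLE shares (storage_index TEXT, shnum INTEGER)")
            db.commit()

            # Only measure the inserts, not the schema setup.
            db.commit_count = 0
            db.commit_seconds = 0.0

            for i in range(n):
                db.execute(
                    "INSERT INTO shares (storage_index, shnum) VALUES (?, ?)",
                    ("si%d" % i, 0),
                )
                if commit_each:
                    db.commit()
            if not commit_each:
                db.commit()

            rows = db.execute("SELECT COUNT(*) FROM shares").fetchone()[0]
            return rows, db.commit_count, db.commit_seconds
        finally:
            db.close()


if __name__ == "__main__":
    for commit_each, mode in [(True, "FULL"), (False, "FULL"), (True, "OFF")]:
        rows, commits, seconds = measure(commit_each=commit_each, synchronous=mode)
        print("commit_each=%s synchronous=%s: %d rows, %d commits, %.4fs in commit()"
              % (commit_each, mode, rows, commits, seconds))
```

The timings depend heavily on the disk and filesystem, so treat any
numbers from this as a hint about where the time goes rather than as a
benchmark.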
