Folks: I'm finally getting caught up on the Elk Point design. (David-Sarah: where were you when you first thought of it? Because the tradition is to name a design after the geographic location where it was conceived.)
I have several things to contribute to the discussion. This note is about "how to convert roadblocks into speedbumps". This is a possible feature that could be added to the NewCap design to reduce the number of "roadblock prevention bits" in caps. This feature may not turn out to be needed, or to be worth its cost in complexity and efficiency, but I want to document it as a tool in our design toolbox.

As you know, a "roadblock attack" is a kind of denial-of-service in which the attacker knows or guesses the storage index that you are going to use to upload your file, and persuades a storage server to put something else into that storage index before you upload your share. If the part of the storage index which the storage server can verify (labelled 't' in the Elk Point diagrams [1]) is short enough, then an attacker can use brute force to generate a share which matches t, and the server won't be able to tell that it isn't a legitimate share for the entire storage index (t, S).

One can imagine dealing with a roadblock attack by choosing a different key and re-uploading (or, if there is a destroy cap 'D', choosing a different 'D' so that you don't have to re-encrypt with a different key), but even if the uploader could win that race against the roadblocker, a repairer would not have that option, since the repairer has to match the already-extant immutable file read cap.

Here is an extension to the upload/download protocol that converts roadblocks into speedbumps: the storage server holds all shares that have been uploaded under a given storage index (and share num), and it stores them in the order that they arrived. Downloaders can retrieve and check the K1 and V values of each of the candidates in the order that they were uploaded, and they can tell which one matches their read-cap. Therefore, if an attacker goes to the work of setting up a roadblock, it only inconveniences the downloader (and the storage server) a tiny amount -- it is only a shallow speedbump.

Here are the details. The protocol to upload shares is that the client invokes the server's "allocate_buckets()" method [2], passing a storage index and a set of sharenums. To download shares, the client invokes the server's "get_buckets()" method [3], passing a storage index. The storage server returns a dict mapping from sharenum to BucketReader objects. The extension to the protocol is that the storage server instead returns a dict mapping from sharenum to a list of BucketReader objects, sorted in the order that they were uploaded.

This doesn't sound like too much added complication, but perhaps that's just because the Foolscap library is so Pythonic and expressive. If you think about how the storage server lays out its share data on disk, and how many network round-trips and bytes are required for downloading, the cost of this improvement looks a bit worse.

The storage server in Tahoe-LAFS v1.5 stores shares in its local filesystem in the following layout:

  storage/shares/$START/$STORAGEINDEX/$SHARENUM

where "$START" denotes the first 10 bits worth of $STORAGEINDEX (that's 2 base-32 chars). You can see this in storage/server.py [4]. The easiest way to extend this to handle multiple share candidates would be something like:

  storage/shares/$START/$STORAGEINDEX/$SHARENUM_$CANDIDATENUM

Then the server could see all the candidates (just by inspecting the $STORAGEINDEX directory, without having to read the files themselves) and return an ordered list of remote references to BucketReaders.
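To make that concrete, here is a minimal sketch (in Python, but not actual Tahoe-LAFS code): the function name list_share_candidates, the filename regexp, and the choice to return plain file paths are all made up for illustration; the real server would wrap each file in a BucketReader and return remote references to those over Foolscap.

    import os
    import re

    # Enumerate the candidate share files in one bucket directory
    # (storage/shares/$START/$STORAGEINDEX/), where each candidate is
    # named "$SHARENUM_$CANDIDATENUM", and return a dict mapping
    # sharenum -> list of file paths in upload (candidate-number) order.
    CANDIDATE_RE = re.compile(r"^(\d+)_(\d+)$")

    def list_share_candidates(bucketdir):
        candidates = {}  # sharenum -> list of (candidatenum, path)
        for name in os.listdir(bucketdir):
            m = CANDIDATE_RE.match(name)
            if not m:
                continue  # skip anything that isn't a candidate share file
            sharenum, candidatenum = int(m.group(1)), int(m.group(2))
            candidates.setdefault(sharenum, []).append(
                (candidatenum, os.path.join(bucketdir, name)))
        # sort each share's candidates into the order they were uploaded
        return dict((sharenum, [path for (_n, path) in sorted(entries)])
                    for (sharenum, entries) in candidates.items())

With something like that in place, get_buckets() would return {sharenum: [BucketReader, BucketReader, ...]} instead of {sharenum: BucketReader}, and the legitimate uploader's share is simply one more entry in the list.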
Now, if the server's reply to get_buckets() contained only the sharenum and the remote references to the BucketReaders, then the downloader would have to wait for at least one more network round trip to fetch the K1 and V values for each BucketReader before it could decide which candidate share matched its read-cap. On the other hand, if the server's initial reply contained the K1 and V values for each candidate, then the server would have to do at least one extra disk seek per candidate to read those values out of each share. We know from experience that a disk seek in the storage server is typically one of the slowest operations in the whole grid.

So we could add a file to the layout which contains just the K1 and V values, in order, for all of the candidates, making it possible for the storage server to return all the K1 and V values without doing a lot of seeks (see the P.S. at the end of this note for a sketch):

  storage/shares/$START/$STORAGEINDEX/$SHARENUM_K1s_and_Vs
  storage/shares/$START/$STORAGEINDEX/$SHARENUM_$CANDIDATENUM

Okay, that's it. Would this be a win? Well, I think that depends on the exact sizes of the caps. I'm working on a table cataloguing Everything That Can Go Wrong, including brute-force attacks on various elements of the crypto structure and failure of the various crypto primitives that we use. The Everything That Can Go Wrong table includes the numbers for how much a brute-force attack would cost, which will let me understand exactly what sizes of crypto values I would be comfortable with. Once I have that table, I'll reconsider whether in Elk Point town we should build a Roadblock-To-Speedbump Converter, or whether we should just set such a high tax on roadblocks that nobody can afford to build one (by including enough t bits).

Regards,

Zooko

[1] http://jacaranda.org/tahoe/mutable-addonly-elkpoint-2.svg
[2] http://allmydata.org/trac/tahoe/browser/src/allmydata/interfaces.py?rev=4045#L94
[3] http://allmydata.org/trac/tahoe/browser/src/allmydata/interfaces.py?rev=4045#L160
[4] http://allmydata.org/trac/tahoe/browser/src/allmydata/storage/server.py?rev=3871
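P.S. To make the "$SHARENUM_K1s_and_Vs" idea concrete, here is a rough sketch in Python. The record widths K1_SIZE and V_SIZE and the helper names are invented for illustration (the real widths would come from the Elk Point cap design, and how the downloader derives the expected K1 and V from its read-cap is likewise left to that design); none of this is existing Tahoe-LAFS code.

    import struct

    # A flat index file of fixed-size (K1, V) records, one per candidate,
    # appended in upload order. Reading it back is one sequential read of
    # a small file instead of one seek into each candidate share.
    K1_SIZE = 16   # assumed width of K1 in bytes (placeholder)
    V_SIZE = 32    # assumed width of V in bytes (placeholder)
    RECORD = struct.Struct(">%ds%ds" % (K1_SIZE, V_SIZE))

    def append_candidate(indexfile, k1, v):
        # server side: called once per arriving candidate share
        f = open(indexfile, "ab")
        try:
            f.write(RECORD.pack(k1, v))
        finally:
            f.close()

    def read_candidates(indexfile):
        # server side: returns every candidate's (K1, V) pair, in upload order
        f = open(indexfile, "rb")
        try:
            data = f.read()
        finally:
            f.close()
        return [RECORD.unpack_from(data, off)
                for off in range(0, len(data), RECORD.size)]

    def pick_matching_candidate(k1s_and_vs, expected_k1, expected_v):
        # downloader side: given the server's ordered list of (K1, V) pairs,
        # return the index of the candidate matching the read-cap, or None
        for candidatenum, (k1, v) in enumerate(k1s_and_vs):
            if k1 == expected_k1 and v == expected_v:
                return candidatenum
        return None

The server would then include the output of read_candidates() in its get_buckets() reply, and the downloader would only open the one BucketReader whose index pick_matching_candidate() returns.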
