Dear David-Sarah and Brian:

Hello, I am slowly catching up on the burst of crypto cap creativity
that you two posted over the last few days.

On Monday, 2009-09-07, at 1:48, Brian Warner wrote:

> * we can't determine the storage-index until after we've encoded the
>   entire file (which generally means after we've uploaded it). So we
>   need a new uploader protocol that lets us upload to an as-yet-unnamed
>   slot, and then provide the slot's storage-index at the very end of
>   the process. This is more work, but it isn't a huge deal.

Remember that I really, really want this anyway, because it is
necessary in order to have "one-pass" == "on-line" upload. Imagine that
you are a tiny embedded machine with little RAM and little or no disk.
Your client opens an HTTP connection to you and starts uploading the
plaintext of a huge file, expecting you to store it on a Tahoe-LAFS
grid. You need to (a) pick a random encryption key, (b) perform
encryption, erasure-coding, and computation of the verification data,
and (c) send the resulting encrypted shares and verification data to
storage servers. You have to do all of this in an "on-line" way, i.e.
you can't store a lot of intermediate data somewhere while waiting to
see the end of the plaintext. Then you need to (d) return the resulting
read-cap to the client as quickly as possible after the client finishes
sending you the plaintext. This is ticket #320.

> * we wouldn't be able to directly use our permuted-list Tahoe2
>   peer-selection protocol, since we won't know the storage-index (and
>   thus the permuted list) until after we've uploaded all the shares. I
>   think we'd have to go with the "server-selection-index" idea: a much
>   shorter string (since it only needs to provide load-balancing, not
>   collision resistance), either randomly generated or derived from a
>   salted CHK hash (and thus computable before encoding/upload), used to
>   permute the peerlist. This string must be included in the readcap,
>   increasing its length, but we could probably get away with maybe 20
>   bits or so.

Argh! You are right! Another few bits needed in the readcap! Boo hoo.
:-(

> So, while I like the one-cryptovalue trick, I'm unsatisfied with both
> the lack of server-side validation and offline readcap-to-verifycap
> attenuation, and the separate SSI value makes me slightly nervous.

Re: server-side validation, what do you think of my proposal in [1]? It
lets the server fully validate the verify-cap, and readers carry around
just enough of the verify-cap to give themselves a massive advantage (a
million to one) over DoS'ers.

Re: offline diminishing readcap-to-verifycap, I liked your and
David-Sarah's comments about storing the verifycap with the readcap
sometimes. In general, each kind of cap could have a base part -- the
minimal information which is necessary and sufficient to be a cap
(assuming full access to servers) -- plus an "extended" part -- pieces
that you can always get from the servers if you have the base part, but
which save round-trips if you already hold them. For read-caps, the
base part could be the crypto value, the server-selection-index (boo
hoo) and a 20-bit prefix of the verifycap. The extended part could be
the full verify-cap and the k_enc. Or maybe the extended part could be
the full public key and the read key! Then it would be up to the user
of the cap to decide whether to use the smallest possible cap or to use
the extended cap in order to save round-trips when dereferencing or
diminishing it.

Re: separate SSI (server-selection-index) value, what makes you nervous
about it? Personally, I like the idea of separating the data (crypto)
layer from the network (server-selection) layer. Some grids might have
a server-selection policy of always querying the servers in increasing
order of network round-trip time, regardless of which cap you are
looking for. Those grids wouldn't need a server-selection-index at all.
Others might accompany each of their caps with a description of which
servers each share was last seen on. That would be in a sense a very
large, optional SSI.
(Hm, and it would act a bit like a slow, persistent BitTorrent tracker.
:-))

Is the fact that people might eventually use such crazy
server-selection policies (that we haven't yet vetted) one of the
things that makes you nervous about separating out the SSI? :-)

Regards,

Zooko

[1] http://allmydata.org/pipermail/tahoe-dev/2009-September/002829.html

tickets mentioned in this letter:

http://allmydata.org/trac/tahoe/ticket/320 # add streaming (on-line) upload to HTTP interface
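
P.S. A rough sketch of the server-selection-index idea discussed above,
in case it helps to see it concretely. This is only an illustration
under assumptions, not Tahoe-LAFS code: the function names `new_ssi`
and `permuted_servers`, the example server ids, and the choice of
SHA-256 over (SSI + server id) as the permutation are all hypothetical.
The only figure taken from the thread is the "maybe 20 bits or so".

```python
import hashlib
import secrets

SSI_BITS = 20  # "maybe 20 bits or so", per the quoted text


def new_ssi() -> int:
    """Pick a random SSI up front, before any encoding or upload happens."""
    return secrets.randbits(SSI_BITS)


def permuted_servers(ssi: int, server_ids: list[bytes]) -> list[bytes]:
    """Order servers by a hash of (SSI, server id).

    Assumption: any uploader or reader holding the same SSI and the same
    server list derives the same order, so the SSI can stand in for the
    storage-index during peer selection.
    """
    ssi_bytes = ssi.to_bytes(3, "big")  # 20 bits fit in 3 bytes
    return sorted(
        server_ids,
        key=lambda sid: hashlib.sha256(ssi_bytes + sid).digest(),
    )


# Hypothetical server ids, for illustration only.
servers = [b"server-a", b"server-b", b"server-c", b"server-d"]
ssi = new_ssi()
order = permuted_servers(ssi, servers)
```

The point of the sketch is that `order` depends only on the SSI and the
server list, so it is known before the first share is produced, and the
SSI only has to spread load, not resist collisions.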
