On Sun, Aug 09, 2009 at 02:25:34PM -0600, Zooko Wilcox-O'Hearn wrote:
> On Thursday, 2009-08-06, at 19:58, Sam Mason wrote:
> > My only initial concern is the apparent lack of timeouts when
> > creating/uploading things.
>
> Nope, this is a known issue. It happens a lot on Test Grid, where
> there are nodes which are offering storage service but which
> disconnect abruptly without saying goodbye or which take ages
> (minutes) to respond to your requests. I encounter it frequently
> because my blog is stored on Test Grid. It doesn't happen very often
> on grids with higher-quality storage servers. Here are some
> probably-relevant tickets: #193, #253, #287, #436, #521, #573.

The fixes to those look as though they'd be scattered across the code
somewhat. Just to dip into the code (so to speak): if I were to fix
only my immediate problem, what would be a good fix? The other tickets
are mainly about download or the initial selection of servers, so they
seem to be a different, though related, problem.

There seem to be a couple of ways of fixing this. The easiest is to
tell the user the file has been uploaded once some number of shares
(somewhere between N and K) have been successfully sent to other
servers. The remaining shares would continue to be sent in the
background, with the normal repair mechanisms coming to the rescue if
the failing servers never made it back into the network.

A better fix would seem to be to send the failing shares off to other
servers, but if I interpreted the protocol correctly, each server knows
which other servers contain shares, so you'd need some way of telling
them that things have moved.

Comments?

-- 
  Sam  http://samason.me.uk/
_______________________________________________
tahoe-dev mailing list
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
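
To make the first option (report success once enough shares have landed) a bit more concrete, here is a rough sketch. It is plain Python asyncio, not Tahoe's actual Twisted-based upload code, and every name in it (upload_with_threshold, send_share, threshold, timeout, the fake servers) is made up for illustration. It only shows the control flow: bound each share upload with a timeout, return as soon as the threshold is met, and hand the still-running uploads back to the caller so they can finish in the background.

```python
import asyncio


async def upload_with_threshold(send_share, servers, threshold, timeout):
    """Return as soon as `threshold` shares are placed.

    send_share(server) is a hypothetical coroutine that uploads one share.
    Returns (placed, pending): the servers that confirmed their share, and
    the set of upload tasks that are still running in the background.
    """
    async def attempt(server):
        # Bound each upload so one slow server cannot stall the whole call.
        await asyncio.wait_for(send_share(server), timeout)
        return server

    pending = {asyncio.create_task(attempt(s)) for s in servers}
    placed = []
    while pending and len(placed) < threshold:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            # Failed or timed-out uploads are skipped, not counted.
            if task.exception() is None:
                placed.append(task.result())

    if len(placed) < threshold:
        raise RuntimeError(
            "only %d of %d required shares were placed" % (len(placed), threshold)
        )
    return placed, pending


async def demo():
    # Assumed toy grid: three responsive servers and one that takes "ages".
    delays = {"a": 0.01, "b": 0.02, "c": 0.03, "slow": 5.0}

    async def fake_send(server):
        await asyncio.sleep(delays[server])

    placed, pending = await upload_with_threshold(
        fake_send, list(delays), threshold=3, timeout=0.5
    )
    # The user could be told "uploaded" here; the rest finishes (or times
    # out) afterwards, and repair would cover any share that never lands.
    leftovers = await asyncio.gather(*pending, return_exceptions=True)
    return placed, leftovers
```

Running asyncio.run(demo()) returns once the three quick servers have answered, while the slow one is left to hit its timeout. The second option (re-sending failed shares to other servers) would slot in where failed tasks are currently skipped, but that is exactly where the "who holds which share" question comes up.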
