On Tuesday 08 April 2008 12:36, Matthew Toseland wrote:
> On Tuesday 08 April 2008 00:36, Ian Clarke wrote:
> > http://video.google.com/videoplay?docid=-2372664863607209585
> >
> > He mentions Freenet's use of this technique about 10-15 minutes in,
> > they also use erasure codes, so it seems they are using a few
> > techniques that we also use (unclear about whether we were the direct
> > source of this inspiration).
>
> They use 500% redundancy in their RS codes. Right now we use 150% (including
> the original 100%). Maybe we should increase this? A slight increase to say
> 200% or 250% may give significantly better performance, despite the increased
> overhead...
In fact, I think I can justify a figure of 200% (the original plus 100%, so
128 -> 255 blocks, which fits within the 8-bit fast encoding limit). On
average, in the long term, a block will be stored on 3 nodes. Obviously a lot
of popular data will be stored on more than 3 nodes, but as far as the
datastore is concerned, 3 is the approximate figure: on an average node with
a 1GB datastore, the 512MB cache has a lifetime of less than a day, data
lasts a lot longer in the store, and by design it ends up in the store on
roughly 3 nodes. Multiply that by two from splitfile redundancy and we get a
total redundancy of 6.

Wuala works well with a factor of 5 redundancy... but that's entirely due to
FEC. They simulated ordinary replication and needed a factor of 24 to be
reliable, but only a factor of 5 with FEC. So maybe what we need is less
network-level redundancy and more FEC-level redundancy - that is, redundancy
in the data itself rather than in how many nodes store each block? (There is
a back-of-the-envelope sketch of the numbers at the end of this mail.) IMHO
we can't reduce the network-level redundancy much below the current
store-on-3-nodes, because we use Freenet for things other than splitfiles -
Frost posts, the top level block, ... The top level block is a special case:
it will usually be fetchable, because anyone trying to fetch the splitfile
will fetch it, even if they give up afterwards, and even if they just
followed a link in fproxy, got a size warning and changed their mind...

Wuala's simulations assume 25% uptime, and they don't give nodes any extra
storage unless they have at least 17% uptime. Can we implement something
similar? We would have to ignore low-uptime nodes when determining whether
we are a sink for a key (a rough sketch of this is also below); the problem
is that we'd have to reliably tell whether a node has low uptime... On
opennet there is enough connection churn that we're unlikely to have known a
node for the many days needed to measure this. We could reduce the
connection churn, but that would come at the cost of reduced connectivity -
when a node disconnects, we give it a few minutes to reconnect, and then we
move on. A full-blown reputation system like Wuala's would be a lot of work
and a lot of debugging...

>
> Also, they discourage low uptime nodes by not giving them any extra storage.
> I'm not sure exactly what we can do about this, but it's a problem we need to
> deal with.
>
> We should also think about randomising locations less frequently. It can take
> a while to recover, and the current code randomizes roughly every 13 to 22
> hours. It may be useful to increase this significantly? Unfortunately this
> parameter is very dependent on the network size and so on, it's not really
> something we can get a good value for from simulations... I suggest we
> increase it by say a factor of 4, and if we get major location distribution
> issues, we can reduce it again. This may be important.
>
> >
> > Ian.
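
Here is the back-of-the-envelope model I mentioned above. It is only a
sketch: it assumes every stored copy of a block is reachable independently
with probability equal to node uptime, and the 25% uptime, factor-5 and
factor-24 figures are Wuala's, not anything we have measured. With those
caveats it compares plain replication against pure FEC, and then our own
case of 128-of-192 (current 150%) versus 128-of-255 (proposed 200%) with
each block stored on ~3 nodes.

import java.util.Locale;

// Availability sketch only, NOT node code. Assumes each stored copy of a
// block is up independently with probability p; ignores caching, healing,
// correlated churn, and the real uptime distribution.
public class RedundancySketch {

    // P(at least one of r full replicas is reachable)
    static double replication(double p, int r) {
        return 1.0 - Math.pow(1.0 - p, r);
    }

    // P(at least k of n FEC blocks are reachable), each up independently
    // with probability q. Binomial tail, summed in log space so n=640 works.
    static double fecSegment(double q, int n, int k) {
        double total = 0.0;
        for (int i = k; i <= n; i++)
            total += Math.exp(logChoose(n, i)
                              + i * Math.log(q)
                              + (n - i) * Math.log(1.0 - q));
        return total;
    }

    // log of the binomial coefficient C(n, i)
    static double logChoose(int n, int i) {
        double r = 0.0;
        for (int j = 1; j <= i; j++)
            r += Math.log(n - i + j) - Math.log(j);
        return r;
    }

    public static void main(String[] args) {
        double p = 0.25; // Wuala's assumed node uptime

        // Wuala's comparison: plain replication vs pure FEC, single copies.
        System.out.printf(Locale.US, "replication x5:       %.4f%n",
                replication(p, 5));
        System.out.printf(Locale.US, "replication x24:      %.4f%n",
                replication(p, 24));
        System.out.printf(Locale.US, "FEC, 128 of 640:      %.4f%n",
                fecSegment(p, 640, 128));

        // Our case: each block is also stored on ~3 nodes, so per-block
        // availability is 1-(1-p)^3 rather than p.
        double perBlock = replication(p, 3);
        System.out.printf(Locale.US, "128 of 192, 3 nodes:  %.4f%n",
                fecSegment(perBlock, 192, 128));
        System.out.printf(Locale.US, "128 of 255, 3 nodes:  %.4f%n",
                fecSegment(perBlock, 255, 128));
    }
}

Under these assumptions a factor of 24 in plain replication and a factor of
5 in FEC both come out above 99% at 25% uptime, which fits what Wuala claim,
while a factor of 5 in plain replication does not. For our numbers,
128-of-255 on top of 3 stores per block also stays above 99%, whereas the
current 128-of-192 collapses at uptimes that low. Don't read too much into
the exact figures - in practice requests and caching put popular blocks on
far more than 3 nodes, and the independence assumption flatters everything -
but it does support going to 200% rather than trying to store each block on
more nodes.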
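
And here is the sink-check idea in code, purely hypothetical: PeerInfo,
distance() and estimatedUptime are stand-ins for whatever we actually track,
not real classes in the tree, and the 17% threshold is simply Wuala's
figure. The point is only that the change itself is small *if* we can get a
trustworthy per-peer uptime estimate, which as I said above is the hard part
on opennet.

import java.util.List;

// Hypothetical sketch: when deciding whether we are a "sink" for a key
// (closer to its location than any of our peers), ignore peers whose
// measured uptime is below a threshold, so that low-uptime nodes cannot
// stop the data from being stored somewhere more stable.
public class SinkCheckSketch {

    // Stand-in for whatever per-peer state the node keeps.
    static class PeerInfo {
        final double location;        // position in the keyspace, [0, 1)
        final double estimatedUptime; // fraction of time seen online, [0, 1]
        PeerInfo(double location, double estimatedUptime) {
            this.location = location;
            this.estimatedUptime = estimatedUptime;
        }
    }

    // Wuala's cutoff, used here purely for the sake of argument.
    static final double MIN_UPTIME_FOR_SINK = 0.17;

    // Circular keyspace distance between two locations in [0, 1).
    static double distance(double a, double b) {
        double d = Math.abs(a - b);
        return Math.min(d, 1.0 - d);
    }

    // True if we are closer to the key than every peer we would trust to
    // actually hold the data.
    static boolean shouldStore(double keyLocation, double myLocation,
                               List<PeerInfo> peers) {
        double myDistance = distance(myLocation, keyLocation);
        for (PeerInfo peer : peers) {
            if (peer.estimatedUptime < MIN_UPTIME_FOR_SINK)
                continue; // pretend low-uptime peers are not there
            if (distance(peer.location, keyLocation) < myDistance)
                return false; // a reliable peer is closer; let it be the sink
        }
        return true;
    }
}

Note that this only changes who counts as a sink; it does not stop us
routing to or caching on low-uptime nodes, so it should not cost us
connectivity the way reducing connection churn would.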
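
On the location randomisation interval in the quoted paragraph: the change
proposed there is nothing more than scaling the existing bounds, something
like the following (the class and method names are made up; only the
current 13-22 hour range is from the earlier mail).

import java.util.Random;

// Hypothetical: pick the delay until the next location randomisation.
// Currently this is roughly uniform in [13h, 22h]; the proposal is simply
// to scale both bounds by the same factor (4 here) and watch whether the
// location distribution degrades.
public class SwapIntervalSketch {

    static final long HOUR_MS = 60L * 60L * 1000L;
    static final double SCALE = 4.0; // proposed increase

    static long nextRandomizationDelayMs(Random rng) {
        double hours = 13.0 + rng.nextDouble() * (22.0 - 13.0);
        return (long) (hours * SCALE * HOUR_MS);
    }
}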