After a long conversation with p0s, I am fairly sure that our decision at last year's summit to use non-convergent encryption for splitfiles (i.e. a different set of blocks each time), made in order to largely solve our security problems, will make filesharing on Freenet much less convenient.
Other points:
- The problem with persistence of data is frequently the top block. This can be solved by duplicating it. This is easy for SSKs, but for CHKs it would require quite long URIs. This may be acceptable given the likely data persistence gains.
- We must deal with the short-segments problem.
- Node references should be tolerant of losing newlines and gaining new ones.

I had hoped that we could release a 0.8.1 with solutions to the data persistence problems and non-convergent encryption, but it looks like we will need to wait for 0.9 with tunnels and bloom filter sharing. :(

[16:22:39] <p0s> toad_: hi. yesterday i told you about the bad user experience of one of my friends ("the node is slow in general") and that he had updated to trunk then... i visited him today and he said that it isn't sluggish anymore and the pages which did not work with 1208 work now with trunk
[16:22:55] <p0s> toad_: so it's maybe time to consider releasing 1209
[16:23:13] <toad_> yeah that's my thinking
[16:23:16] <toad_> but we need more testing
[16:23:27] <p0s> any specific testing i can help with?
[16:24:14] <toad_> dunno
[16:26:02] <p0s> well i'll just update to trunk and see if it announces etc
[16:26:19] <toad_> yeah, bootstrap tests are always a good idea before releasing
[16:26:23] <p0s> but it still feels like downloads stall at 0%
[16:26:25] * toad_ needs to wade through the 270 commits on cvs@
[16:26:41] <toad_> downloads frequently stall at 0%, the reason is that the top block can't be found
[16:26:44] <toad_> afaik
[16:26:47] <p0s> one has started this morning at few blocks, below 30 or so... after not starting for several days... now it's at 38% !
[16:28:08] <p0s> i just wonder why it does not find the top block for almost a week at 3 downloads at once
[16:28:49] <toad_> because the non-top blocks have tons of redundancy, whereas with the top block it's a matter of whether the 3 or 4 nodes that have it in their stores are online at the time, not backed off, etc
[16:29:13] <toad_> the solution is to duplicate the top block, this is hard for CHKs but very easy for SSKs
[16:30:58] <toad_> the other thing to deal with is segment size
[16:31:16] <p0s> unfortunately SSKs are not the large files which usually get screwed by missing top blocks...
[16:31:17] <toad_> put those together we should improve data reliability quite a bit
[16:31:22] <toad_> p0s: well, they should be !
[16:31:36] <p0s> all my testing is with CHK
[16:31:54] <toad_> they will be if we make it clear that inserting as CHK is deprecated and in any case will yield a different key each time because of non-convergent encryption (a security feature)
[16:31:56] <p0s> SSKs work well IMHO
[16:32:06] <p0s> CHK is deprecated? wtf?
[16:32:20] <toad_> it isn't yet, but imho it will be
[16:32:22] <p0s> the point of CHK is to allow anyone to re-insert if the original inserter is gone
[16:32:31] <toad_> in favour of SSK,3@
[16:32:34] <p0s> non-convergent encryption? why?
[16:32:36] <toad_> p0s: well we are abolishing that
[16:32:50] <toad_> because predictable keys are *REALLY* bad for security
[16:33:20] <p0s> for opennet attacks or also if you insert/fetch as darknet only?
[16:33:39] <toad_> either way, predictable keys make mobile attacker tracing of an inserter much easier
[16:33:59] <toad_> on opennet getting connections to target nodes is of course much easier than on darknet
[16:34:06] <p0s> but it will screw data-availability even more because people cannot re-insert!
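[For readers following along: under convergent encryption the key material is derived from the content itself, so the same plaintext always produces the same blocks and the same (predictable) CHK; non-convergent encryption mixes in fresh randomness, so every insert diverges. A minimal sketch of just that determinism property — the hashes below stand in for a real cipher and are not Freenet's actual construction:]

```python
import hashlib
import os

def convergent_blocks(content: bytes) -> tuple[bytes, bytes]:
    # Key derived from the plaintext hash: anyone holding the same file
    # derives the same key and produces identical ciphertext blocks.
    key = hashlib.sha256(content).digest()
    ciphertext = hashlib.sha256(key + content).digest()  # stand-in for a real cipher
    routing_key = hashlib.sha256(ciphertext).digest()    # CHK-style: hash of ciphertext
    return routing_key, key

def randomized_blocks(content: bytes) -> tuple[bytes, bytes]:
    # Fresh random key per insert: the same file yields different blocks
    # (and a different URI) every single time.
    key = os.urandom(32)
    ciphertext = hashlib.sha256(key + content).digest()
    routing_key = hashlib.sha256(ciphertext).digest()
    return routing_key, key

data = b"the same file inserted twice"
assert convergent_blocks(data)[0] == convergent_blocks(data)[0]  # re-insert converges
assert randomized_blocks(data)[0] != randomized_blocks(data)[0]  # re-insert diverges
```

[The second property is exactly what p0s objects to below: with randomized keys a manual re-insert of the same file no longer heals the original URI.]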
[16:34:50] <p0s> for SSKs it does not matter because freetalk messages and freesites will probably be 99% of the SSKs and those will not be re-inserted anyway
[16:34:53] <toad_> they can reinsert they just have to announce the new keys!
[16:34:56] <p0s> but for filesharing this SUCKS
[16:35:04] <p0s> it will screw the availability all over
[16:35:19] <p0s> do you think that anyone cares to announce new keys? i really doubt it
[16:35:21] <toad_> no, like you said, the main problem is you can never fetch the top block
[16:35:35] <p0s> yes and that can be solved by re-inserting
[16:35:50] <p0s> but no chance that people will transfer the new keys over and over again if they re-insert, that would be a major annoyance
[16:36:07] <p0s> full PITA
[16:36:11] <toad_> no it can't be solved by reinserting
[16:36:26] <p0s> why?
[16:36:32] <toad_> the current situation is after 2 weeks stuff is no longer in cache, it's only in store
[16:36:40] <toad_> so it takes weeks and months to fetch the top block
[16:36:45] <toad_> but once we have the top block, it's not so bad
[16:36:52] <toad_> apart from the over-short last segment
[16:37:01] <p0s> but the top block will also be re-inserted, won't it?
[16:37:14] <toad_> yes but only if the file is reinserted
[16:37:24] <p0s> i am talking about manual re-inserts here
[16:37:28] <p0s> manual re-inserts should work
[16:37:33] <toad_> you really want a filesharing system where you have to reinsert everything every 2 weeks?
[16:37:37] <p0s> i am against making manual re-inserts produce a new URI
[16:37:47] <toad_> what's your view about huge URIs?
[16:38:07] <toad_> it is possible to duplicate the top block for CHKs, but it means adding at least one more routing key per extra key
[16:38:34] <p0s> improving availability of files to be longer than 2 weeks does not require non-convergent encryption, does it?
[16:39:09] <p0s> all i'm currently trying to say is that forcing people to publish a new CHK uri when they re-insert something manually SUCKS
[16:39:19] <toad_> improving availability requires duplicating the top block
[16:39:31] <toad_> duplicating the top block can be done with SSKs very easily
[16:39:40] <toad_> but with CHKs it requires much longer URIs
[16:39:43] <toad_> is that a problem?
[16:39:59] <p0s> why not just insert the top block twice? once at the start of the insert and once at the end. it will have a high probability of ending up on different nodes, won't it?
[16:40:04] <p0s> how much longer?
[16:40:10] <toad_> CHK@<routing key>,<decrypt key>,<extra> -> CHK@<routing key 1>,<routing key 2>,<routing key 3>,<decrypt key>,<extra>
[16:40:19] <CIA-41> saces * r27215 /trunk/freenet/src/freenet/client/async/BaseManifestPutter.java: try to fix map creation, should be rigth now.
[16:40:29] <toad_> i.e. at least twice as long
[16:40:34] <p0s> sucks.
[16:40:45] <p0s> okay so why not just insert the top block twice?
[16:40:54] <p0s> i.e. run the insert of it twice
[16:40:56] <toad_> if the insert doesn't take long there's no point
[16:41:09] <toad_> if the insert takes long, well, it might help a little, but imho not in the long run
[16:41:41] <p0s> well now that we have our shiny new object database we could just queue the second insert of the top block to be run 1 day later
[16:42:04] <toad_> even so, i really don't think it will help that much
[16:42:05] <p0s> or N days
[16:42:21] <toad_> if it's the same block it will be sent to the same part of the network, and very possibly to the same nodes
[16:42:21] <p0s> well then test it :)
[16:42:44] <toad_> anyway, WE ARE GOING TO KILL CONVERGENT ENCRYPTION
[16:42:50] <p0s> the BAAAAD user experience of disallowing manual re-inserts is TOTALLY worth the effort of testing whether what i just suggested helps
[16:42:56] <toad_> because there is no way to secure freenet without doing so
[16:43:09] <p0s> hmm
[16:43:21] <p0s> thats the death of file sharing
[16:43:26] <toad_> apart from tunnels, which are a big project and will have a *significant* performance impact
[16:43:46] <toad_> like 7 hops relaying encrypted data before the insert even starts
[16:43:46] <p0s> it's not like it has been proven that file sharing systems DO NOT WORK if they don't use MULTIPLE peers for the same file
[16:43:54] <p0s> napster, winmx, gnutella, all dead
[16:43:59] <toad_> p0s: freenet does use multiple peers for the same file already
[16:44:19] <p0s> multiple peers also means allowing different persons to manually re-insert without changing the URI
[16:44:30] <p0s> it also means that SAME files will have the SAME key...
[16:44:55] <p0s> because having different keys and storage for the same files is a waste of bandwidth and disk space
[16:45:28] <p0s> really thats where p2p apps started to become interesting, when they started to recognize which files had the same hash ...
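[To make toad_'s "at least twice as long" concrete: a rough illustration of the multi-routing-key CHK format he sketches above, with invented field contents and base64 lengths typical of 32-byte keys. This is not a real Freenet URI builder, just length arithmetic:]

```python
import base64
import os

def make_long_chk(routing_keys: list[bytes], decrypt_key: bytes, extra: bytes) -> str:
    # Hypothetical extended CHK: several routing keys, one decrypt key and an
    # extra field, comma-separated as in toad_'s example above.
    b64 = lambda b: base64.urlsafe_b64encode(b).rstrip(b"=").decode()
    fields = [b64(k) for k in routing_keys] + [b64(decrypt_key), b64(extra)]
    return "CHK@" + ",".join(fields)

single = make_long_chk([os.urandom(32)], os.urandom(32), os.urandom(5))
triple = make_long_chk([os.urandom(32) for _ in range(3)], os.urandom(32), os.urandom(5))
# Two extra 32-byte routing keys add ~88 base64 characters to every pasted URI,
# nearly doubling its length.
print(len(single), len(triple))
```

[That extra length is the whole cost of top-block duplication for CHKs; for SSKs the duplicate copies can live under the same key, which is why toad_ calls the SSK case easy.]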
[16:45:41] <p0s> and now i hear that this will be abolished :((
[16:45:47] <toad_> but how much is convergent encryption actually used in practice?
[16:46:08] <p0s> i've seen "please reinsert" on fms often
[16:46:47] <p0s> re-inserting is one thing and often two or three clicks... publishing a new URI is not
[16:47:14] <p0s> publishing a new uri involves updating freesites, fms posts, freetalk posts, blah
[16:47:21] <p0s> annoying as hell
[16:47:35] <toad_> well okay, the alternative is what? inserting the top block many times over a period after insertion (which is bad for security in itself); getting clients to reinsert the top block when they download (okay this is a good one); reinserting with tunnels which make the insert take at least 50% longer than it does now and also have a knock-on effect on requests; CHK URIs that are twice as long as current ones
[16:47:49] <p0s> => getting clients to reinsert the top block when they download (okay this is a good one);
[16:47:49] <toad_> p0s: did the reinserts actually happen?
[16:47:56] <p0s> yes
[16:48:21] <p0s> re-insert the top block upon download sounds great
[16:48:28] <p0s> easy to implement and fair
[16:48:34] <toad_> sure
[16:48:42] <toad_> but will it be enough?
[16:48:43] <p0s> fair because a file's top block only gets spread more if people actually download it
[16:48:46] <p0s> do it, try it
[16:48:53] <toad_> imho inserting multiple copies of the top block will SOLVE the problem
[16:49:02] <toad_> but it requires either SSKs or very long CHKs
[16:49:28] <p0s> well look... it's a few days of work to test whether reinsert-upon-download helps
[16:49:39] <toad_> a few days of work that i don't have until post-beta
[16:49:41] <p0s> but NOT doing that will ANNOY THOUSANDS OF USERS
[16:49:49] <p0s> it is *worth* trying it
[16:50:03] <toad_> no, it's something that would take weeks to test properly
[16:50:10] <toad_> but yes we could try it
[16:50:17] <p0s> we really shouldn't choose the ANNOY decision until we have tried everything else
[16:50:28] <toad_> but we do need a solution to the convergent encryption problem
[16:50:30] <p0s> for example we still don't have an automatic data availability test i guess?
[16:50:39] <toad_> at last year's meeting we agreed to ditch convergent encryption
[16:50:45] <toad_> p0s: that is impossible
[16:50:54] <toad_> p0s: well ... it depends what you mean
[16:51:04] <toad_> "is this data available, without downloading it" is impossible
[16:51:07] <p0s> insert(); sleep(1 week) ; fetch();
[16:51:21] <toad_> because it makes it very easy to find the source and attack it without further propagation
[16:51:31] <toad_> however, from the inserter's point of view, yes it's possible
[16:51:41] <toad_> all you need is a request that goes for lots of random hops before starting
[16:51:50] <toad_> so it starts at a random point on the network
[16:51:52] <p0s> i'm talking about us inserting random files and seeing how long they are fetchable
[16:52:05] <p0s> pretty simple yet usable test
[16:52:06] <toad_> yes, we do that now, but there is no delay between insert and fetch
[16:52:20] <p0s> well then why not add one? :)
[16:52:23] <toad_> we will implement a longer term test, but not yet, because we need to get a release out with current features
[16:53:08] <p0s> anyway, apart from all arguments about security, i think it is crucial to understand that site maintainers will be pissed off very hard if they have to update URIs on every re-insert
[16:53:09] <toad_> p0s: in the medium term, we do need to deal with the problem of convergent encryption vs attacks
[16:53:19] <p0s> if someone hosts 10 000 files do you really expect him to always update the URIs?
[16:53:36] <toad_> well, the filesharing system could help with that
[16:53:45] <p0s> the filesharing system does not exist.
[16:53:45] <toad_> distributed searching system infinity0 is building
[16:53:54] <toad_> sites with 10000 files exist?
[16:54:06] <p0s> i can insert project gutenberg right now =p
[16:54:18] <toad_> do index sites with 10000 files exist now?
[16:54:24] <toad_> *before* you insert it? :)
[16:54:36] <p0s> there are index sites with ~ 1000 files i guess
[16:54:50] <p0s> i would be really pissed if it was my job to always update the URIs
[16:55:02] <sdiz> toad_: wanna edition 17 index have ~4000 files, and it was broken.
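[p0s's proposed availability test really is just insert(); sleep(1 week); fetch() run over many random keys. Against the live network that is a small wrapper around any FCP client; as a self-contained sketch, here is the shape of the probe loop with the network stubbed out. The two-week cache lifetime and 30% store-hit rate below are invented toy numbers echoing toad_'s description, not measurements:]

```python
import random

WEEK = 7  # days

class ToyNetwork:
    """Stand-in for the real network: blocks fall out of cache after ~2 weeks
    and are then only fetchable when one of the few store-holders is online."""
    def __init__(self):
        self.inserted_at = {}

    def insert(self, key: str, day: int):
        self.inserted_at[key] = day

    def fetch(self, key: str, day: int) -> bool:
        age = day - self.inserted_at[key]
        if age <= 2 * WEEK:
            return True                   # still in cache
        return random.random() < 0.3      # only a few store nodes still hold it

def probe(net: ToyNetwork, delays_days: list[int]) -> dict[int, float]:
    """Insert many random keys, wait `delay` days, count fetch successes."""
    results = {}
    for delay in delays_days:
        trials, ok = 200, 0
        for i in range(trials):
            key = f"probe-{delay}-{i}"
            net.insert(key, day=0)
            ok += net.fetch(key, day=delay)
        results[delay] = ok / trials
    return results

print(probe(ToyNetwork(), [WEEK, 2 * WEEK, 6 * WEEK]))
```

[The point of the delay parameter is exactly p0s's complaint: probing immediately after insert, as the current test does, measures the cache, not long-term retrievability.]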
[16:55:05] <toad_> p0s: there are convoluted solutions involving reinserting to a random key, announcing the key, and getting the people asking the data to reinsert it in non-convergent form
[16:55:12] <sdiz> the metadata size limit
[16:55:23] <toad_> sdiz: i don't follow
[16:55:47] <sdiz> hmm, the bug is fixed
[16:56:05] <sdiz> but thousands of files is not that uncommon
[16:56:17] <p0s> toad_: i can tell you what will happen if non convergent encryption is required
[16:56:20] <toad_> p0s: this busts the people asking for the data rather than the inserter, and it means we have to try both sets of keys, we still have to announce the new key, and the first requesters need to do the reinsert
[16:56:24] <p0s> toad_: people will insert to SSK so the URI does not change
[16:56:36] <p0s> toad_: then they will publish the insert URI of the SSK so anybody can re-insert the file
[16:56:44] <toad_> sdiz: I mean thousands of *files*, not thousands of pages on a freesite
[16:56:48] <toad_> links to files in an index
[16:56:51] <p0s> which voids the security of non-convergent encryption i guess
[16:57:03] <toad_> p0s: no, it just means the reinsert will break
[16:57:34] <toad_> p0s: because the first layer will be a multi-level metadata splitfile, and that will be encrypted differently to last time, so it will not converge with the SSK
[16:58:16] <p0s> so manual re-inserts will ALWAYS produce a different URI and different encrypted blocks?
[16:58:19] <p0s> sucks.
[16:58:26] <toad_> that is the current plan yes
[16:58:31] <p0s> sucks sucks sucks sucks and i won't use it
[16:58:43] <p0s> :(
[16:58:51] <toad_> the alternative is as I mentioned to tunnel the inserts
[16:59:05] <p0s> tunneling ~= onion routing?
[16:59:31] <toad_> on average an insert probably goes to 20 nodes, so adding approx 7 at the beginning should be a 30% slowdown, I guess this is acceptable...
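[toad_'s "30%" is simple hop arithmetic: if an average insert already visits about 20 nodes, prepending a ~7-hop tunnel adds 7/20 = 35% more hops, which he rounds down to roughly 30%. As a one-liner, assuming hop count is a fair proxy for insert time:]

```python
def tunnel_overhead(insert_hops: int, tunnel_hops: int) -> float:
    # Fraction of extra work from prepending a tunnel to an insert.
    return tunnel_hops / insert_hops

print(f"{tunnel_overhead(20, 7):.0%}")  # 35%, i.e. "approx a 30% slowdown"
```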
[16:59:53] <toad_> but really, tunneling with randomised encryption is *way* more secure than tunneling without randomised encryption
[17:00:09] <p0s> it does not help if nobody uses freenet anymore
[17:00:23] <p0s> to me this sounds the same as the "darknet only" thing which freenet had before opennet
[17:00:26] <p0s> too idealistic
[17:00:46] <toad_> well when i brought it up before, nobody seemed to have a big problem with it
[17:01:14] <p0s> did you bring it up in a developer or in a user discussion?
[17:01:29] <toad_> on the mailing lists, and at the mini-summit last year
[17:01:29] <p0s> you should ask the users whether they will use it
[17:01:34] <p0s> not the developers whether it is perfect
[17:01:48] <p0s> mailing lists have too high traffic for users
[17:01:52] <toad_> would the users accept a 30% performance loss?
[17:01:59] <toad_> on inserts?
[17:02:15] <p0s> everyone would do that
[17:02:20] <p0s> because you do not WAIT for inserts
[17:02:26] <p0s> if you download something you are WAITING for it to finish
[17:02:32] <toad_> okay...
[17:02:33] <p0s> inserting is just "click and forget till its finished"
[17:02:39] <toad_> and capacity would not be a problem?
[17:02:47] <toad_> what about long CHKs?
[17:02:58] <p0s> insert delays have already doubled for me with the new 7zip/etc stuff because it tries to compress multiple times
[17:03:00] <p0s> i do not give a shit
[17:03:11] <p0s> capacity?
[17:03:26] <toad_> well it means you could insert 30% less in a month
[17:03:39] <toad_> this won't be a problem generally?
[17:03:46] <p0s> then i'd just get more bandwidth
[17:03:56] <toad_> heh yeah if life was so easy...
[17:04:11] <toad_> it's the sort of figure that is acceptable for security anyway
[17:04:21] <toad_> anyway
[17:05:12] <toad_> well, as far as the attacks we are discussing go, what would the consequences be? the attacker can predict the keys, so he can trace each of the tunnel endpoints relatively easily ... hence using lots of tunnel endpoints would be bad, but it is needed for good performance ...
[17:05:13] <p0s> the really important point is that it does not matter how slow inserts are, because people have patience with them: IF they want to insert something then they ARE WILLING to SPEND some of their time without GETTING ANYTHING for it
[17:05:19] <p0s> they have already decided to be patient!
[17:05:34] <toad_> tunnels are not only extra hops, they have limited capacity
[17:05:56] <toad_> the longer they are the less info the attacker gains from tracing them, but the greater the chance he is on the route and therefore gets a predecessor sample...
[17:06:14] <toad_> I dunno, it *might* be possible to achieve acceptable insert performance with tunnels
[17:06:19] <p0s> tunnels = onion routing?
[17:06:23] <toad_> but it's definitely 0.9 stuff, not 0.8.1 stuff
[17:06:46] <toad_> something like that, we send out a bunch of random routed anchors and when they rendezvous create a tunnel through the shortest route
[17:06:53] <toad_> using secret sharing
[17:06:57] <p0s> well that is something we need ANYWAY
[17:07:06] <toad_> it's not classical onion routing, which requires you know the network
[17:07:12] <toad_> which isn't practical for freenet
[17:07:13] <p0s> if it also gives the benefit of allowing us to keep convergent encryption thats GREAT :)
[17:07:14] <p0s> win win!
[17:07:34] <p0s> two problems solved at once!
[17:07:49] <toad_> well the other benefit is that it allows us to do bloom filter sharing
[17:07:58] <p0s> wow so another one
[17:08:06] <p0s> so there is really no reason to disallow convergent encryption =P
[17:08:25] <toad_> well, tunneling will cost significantly in performance; bloom filter sharing will gain some of that back for requests
[17:08:45] <p0s> anyway i might quit maintaining some of my freesites if i have to replace the old uris with new ones after maintaining.
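[Bloom filter sharing means each node gives its peers a compact, probabilistic summary of which keys it holds, so a request can skip peers that certainly do not have the block. A minimal Bloom filter sketch (the 8192-bit/4-hash parameters are invented for illustration; the real design also has to survive churn, which is toad_'s worry above):]

```python
import hashlib

class BloomFilter:
    def __init__(self, bits: int = 8192, hashes: int = 4):
        self.bits, self.hashes = bits, hashes
        self.field = bytearray(bits // 8)

    def _positions(self, key: bytes):
        # Derive `hashes` independent bit positions from the key.
        for i in range(self.hashes):
            h = hashlib.sha256(i.to_bytes(1, "big") + key).digest()
            yield int.from_bytes(h[:4], "big") % self.bits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.field[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: bytes) -> bool:
        # False => definitely absent; True => probably present.
        return all(self.field[p // 8] & (1 << (p % 8)) for p in self._positions(key))

# A node summarises its datastore and ships the bit field to each peer.
store = BloomFilter()
store.add(b"CHK@some-routing-key")
assert store.might_contain(b"CHK@some-routing-key")  # no false negatives, ever
```

[Peers whose filter answers False can be skipped outright, which is how filter sharing claws back some of the performance the tunnels cost; the price is keeping the filters fresh as stores change.]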
[17:08:56] <p0s> i'm one maintainer, maybe you should ask more of them via freemail or whatever
[17:09:06] <toad_> we really don't know whether requests will be faster or slower afterwards ... but i'm fairly sure they will be more reliable
[17:09:32] <p0s> after maintaining = after reinserting
[17:09:58] <p0s> right now i just have to click a few upload buttons
[17:10:00] <p0s> thats acceptable
[17:10:10] <p0s> reinsert & forget
[17:10:13] <toad_> anyway, your point is simply that convergent encryption is vital, that randomised splitfile encryption is not a usable quick-fix for our (currently severe) security problems with inserts of predictable data
[17:10:29] <p0s> yes.
[17:10:57] <toad_> so we'll have to wobble on with our current crappy security until we have full blown tunneling and filter sharing
[17:11:13] <toad_> the other worry with filter sharing is it may not work well on opennet
[17:11:17] <toad_> maybe even not at all
[17:11:18] <toad_> but we'll see
[17:11:20] <p0s> the ability to reinsert something without having a new uri is GREAT... its usually one of the questions i receive from newbies: "what happens if the file falls off the network?"..... right now i can just answer "then you re-insert it and people can download it again"
[17:11:32] <toad_> if all requests are tunneled and most nodes don't share bloom filters then requests will be way slower
[17:11:53] <toad_> otoh you can turn it off for requests and just do it for inserts
[17:11:55] <p0s> most nodes will share bloom filters if it is the default setting
[17:12:00] <toad_> requesters are expendable :)
[17:12:05] <p0s> our users trust freenet more than we do
[17:12:19] <toad_> p0s: yeah, it's a question of whether the level of churn will be so high that it isn't economic on bandwidth to keep updating the filters
[17:12:31] <p0s> they are not paranoid about ultra clever attackers who spend weeks writing tracing software
[17:12:48] <p0s> FOAF is also enabled by most i guess
[17:12:52] <toad_> p0s: I could definitely do it in that timeframe, and I charge ?12 an hour
[17:13:10] <p0s> which timeframe?
[17:13:19] <toad_> p0s: so for well under a grand I could build you a good tracing tool so you can go hunt paedophiles
[17:13:31] <p0s> ah
[17:13:38] <p0s> also if they run darknet only?
[17:13:39] <toad_> current security is *bad*
[17:13:53] <toad_> it's a lot harder if they run pure darknet because it's a lot harder to get connections
[17:14:05] <p0s> harder? impossible.
[17:14:16] <toad_> nah, it involves social engineering
[17:14:30] <toad_> but compared to opennet that's vastly more difficult
[17:14:46] <p0s> well we teach users to connect to people whom they know in real life
[17:15:06] <toad_> imho "in real life" is excessive and will limit growth overly
[17:15:10] <p0s> so social engineering is not possible because you cannot pretend to be 10 different real-life persons .)
[17:15:12] <toad_> anyway, what's your view on long CHKs?
[17:15:32] <p0s> and we tell people that if they use opennet they are screwed anyway.
[17:15:41] <p0s> so darknet is secure enough right now, great :)
[17:15:52] <toad_> depends what you need
[17:15:52] <p0s> IMHO 3 routing keys is very very long
[17:16:01] <toad_> yes, is this a problem?
[17:16:03] <p0s> yet it wouldn't prevent them from being copy&pasteable
[17:16:12] <p0s> most people copy & paste anyway
[17:16:16] <toad_> well, noderefs are generally regarded as not copy & pasteable
[17:16:22] <p0s> if you need a short uri you can insert a KSK link to the CHK
[17:16:34] <p0s> thats because noderefs contain linebreaks and ICQ screws them
[17:16:53] <toad_> not because of their length? then we need to expend some effort to fix that
[17:17:01] <p0s> for me it was always the linebreaks
[17:17:05] <p0s> icq just eats them
[17:17:21] <toad_> so you don't think long CHKs are a big problem in general?
[17:17:38] <p0s> yes, they are DEFINITELY better than forcing people to re-publish uris after re-insert
[17:17:39] <p0s> don
[17:17:41] <p0s> don
[17:17:45] <p0s> damn^^
[17:17:49] <toad_> hmmm?
[17:17:54] <p0s> don't forget that re-publishing also involves security risks
[17:18:04] <toad_> not compared to the alternative
[17:18:15] <toad_> one sample against how many tens of thousands?
[17:18:23] <p0s> okay well yea
[17:18:38] <p0s> still i think that forcing people to re-publish uris should be the LAST resort
[17:18:43] <p0s> really the very last one
[17:18:56] <toad_> do you mind if I post this conversation?
[17:19:04] <p0s> i urge you to post it
[17:19:06] <p0s> :)
[17:19:08] <toad_> is it a problem for you to be known to maintain sites linking to keys?
[17:19:36] <p0s> no, there are many sites with absolutely legal content and i only maintain those
[17:19:41] <toad_> ok
[17:19:54] <p0s> http://www.gutenberg.org
[17:20:07] <p0s> that should be on freenet as a whole :)