[freenet-dev] 0.7 routing explanation?
Is there an overview of the 0.7 routing architecture online somewhere? I'm curious as to how it compares to previous routing schemes Freenet has used. Thanks! Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
[freenet-dev] Re: [Tech] Bandwidth usage too low? Insert slowness...
Does Freenet report its bandwidth usage? I show 10-20KB/s usage, with a bandwidth setting of 100K. It would be interesting to know the bandwidth settings of the nodes I'm connected to... Evan On 4/6/06, Matthew Toseland [EMAIL PROTECTED] wrote: Is your 0.7 node consistently using less bandwidth than it should? Much less? If this is the case then it may show why inserts average 1kB/sec at the moment... Does anyone have any other ideas why inserts are so slow? -- Matthew J Toseland - [EMAIL PROTECTED] Freenet Project Official Codemonkey - http://freenetproject.org/ ICTHUS - Nothing is impossible. Our Boss says so. ___ Tech mailing list Tech@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/tech ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] 3 or more
On 4/18/06, Volodya [EMAIL PROTECTED] wrote: Is it possible to ask for any explanation for the 3 link limit that you have imposed upon 0.7? I really do not understand how i am supposed to switch to that network now, my original plan was to contact all my friends who i trust and tell them that if they ever wanted to achieve anonymity, my node is open to them, then i'd connect to them and we'd start our little network, slowly the network would continue to grow, as they would invite their friends, etc, and everybody would be happy. With the limit i can only contact people who already know at least 2 other people willing to try freenet out, or i have to ask my friends to compromise their anonymity to each other (many of my friends don't know each other), or i will have to run 3 nodes on my computer to start with (my current plan) which will allow for us to go around the limit, until such time when people will start finding other links themselves. Free your mind and seek the truth. - Volodya With fewer than 3 links you are either a leaf node (1 link) or on a chain (2 links). In either case, no real routing is occurring. I would suggest that you consider a) running with too few links, which will work ok on small networks (a guess ;) ) and b) finding more friends to ask. Are there people you've met online and trust enough? Here I would put trust as you believe they're not EvilCorp (tm) or Big Brother, you believe they are at least somewhat paranoid about their identity, and you believe they won't directly attack your privacy. Personally I contacted a few friends and then got on IRC to get more connections. I think there is still a large security improvement over 0.5 (mostly from no harvesting; an attacker would have to do real work to find people). If I were worried about my anonymity in a more than passing way I'd do what I suggested above. Good luck! Evan Daniel PS send me a noderef and I'll connect with you! You can trust me, I promise ;) ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Freenet Name server
On 6/18/06, Colin Davis [EMAIL PROTECTED] wrote: Juiceman wrote: What if 2 or more lists have the same user-friendly name but have them pointing to different keys? How would this be handled? Keep in mind, unless it's from your list, you don't access Key.. You access Username\Key, where username is the name chosen on the USK page. That way, you can have both E1ven\Coolness and Juiceman\Coolness, and both work. As for two versions of a single list.. Ie, Getting Bob's list from Bob, versus Bob's list from Sally, I suppose you'd want to add a version number. Aum? What if I have 2 lists, one from someone I call Alice and one from someone I call Bob. Then Alice adds a friend she calls Bob (who's not the same as the one I call Bob). What now? Or alternately both Alice and Bob add different Charlies... Evan ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Freenet Name server
On 6/18/06, Colin Davis [EMAIL PROTECTED] wrote: Hrmm.. For one, keep in mind that these names are for the User's benefit, not the node's... In reality, Charles's nodelist is a unique USK, at [EMAIL PROTECTED] So that means that the User might get confused, but the node knows the difference between them. Because the node knows the difference between them (the node knows the difference between the charles that Alice added, and the charles Bob added, since the File includes the original USK), you can prioritize- You can tell the name resolver to draw from the lists in order of their place in the file.. So if you have Alice [EMAIL PROTECTED] Charles [EMAIL PROTECTED] Charles [EMAIL PROTECTED] Bob [EMAIL PROTECTED] Your name resolver will trust the top one more than the second, and so on. Doing it that way automatically passes the order on to the people subscribed to your list.. They inherit your trust relationship, and your priority by default. I think in practice, link pages would end up being uniquely named. Do you have a better suggestion, without giving each node a user-facing generated number? Nope. Propagating prioritizations was the best I could come up with on short notice. Or expose directory structures, like Alice/Charlie/somesite and Bob/Charlie/somesite, but that seems worse. Prioritization does mean that different people will have Charlie/somesite resolving to different locations, though. Which means I can't necessarily assume I can give it to someone on a business card... In the extreme case, this enables a redirection attack if people high up in the list decide they don't like someone. And most users would never notice it in all likelihood, so it's hard to police. I don't see a good answer to these problems, unfortunately. Keep thinking, though, it's well worth pursuing. (Actually I do have one idea on it, I'll give it some thought and post if I still like it). Evan ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Semi-opennet?
On 6/27/06, Lars Juel Nielsen [EMAIL PROTECTED] wrote: On 6/27/06, Ruud Javi [EMAIL PROTECTED] wrote: Do we want semi-opennet support? This would be a way to connect, with mutual advance consent, to peers of our direct peers? (There would be measures taken to ensure that we don't connect to peers of their direct peers). Well, I am not sure but I am not a fan of it. Semi-opennet to me sounds like the worst of two worlds. The idea of darknet is that you need to trust your neighbors, but you are pretty safe from everyone else. I think a semi-opennet would give a less safe network, because people are connecting to people they have not added themselves. My guess is that people would turn it on because it would make Freenet faster, while it would also make it less safe for them imho. Further, you would still need to add some connections, so this would not bring in the big user group that is looking for an opennet version of Freenet 0.7 at all. If you have some special reasons / arguments for this semi-opennet, please post. If you want we could discuss whether there should be an opennet in Freenet 0.7, and what it should look like. I have some other ideas to get people to Freenet 0.7 that want an opennet. Unfortunately I am already seeing a few weak points, so other ideas might be better :) greetings, Ruud It sounded really nice but I think you're right, this would be a bad idea. Well, in my mind it's not really a question of whether it's a good idea or a bad one in the absolute sense, but of whether it's better or worse than people exchanging noderefs with random people on IRC, or some other hackish way of putting together an opennet. It's a lot easier to find 1-2 friends and connect to them and a couple neighbors than it is to find 5-6 independent links. Also, I believe this would provide better topology than the current mess of people connecting to random nodes, even if it only meant that each person connected to a couple random nodes and a couple of their immediate neighbors, no? Anyway, the goal is to improve things, not to hold out for perfection, and this seems likely to improve on the status quo. I would also advocate highly visible security warnings about it. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Darknet and opennet: semi-separate networks?
On 8/15/06, Matthew Toseland [EMAIL PROTECTED] wrote: Because in many cases the network we provide it with is not a single small world network (which is what it is designed for), but two loosely connected small world networks of different parameters. It seems likely to me that interest in content will closely match connectedness of the networks -- content created on the chinese network will be of interest on the western network to a degree approximately proportional to the interconnectedness of those networks. So bottlenecks in the topology are present only in places where they aren't a problem. Obviously I have no proof of this, but it seems at least as intuitive to me as the assumption that there will be a pair of loosely connected networks in such a way as to create a bottleneck. I think it is inappropriate to spend time or effort worrying about this problem until we have both a method to simulate the network in question and a set of load balancing / routing algorithms that work on a single network that we can test on a split network. The only counter argument to this that I can see is if there is obvious reason to believe that decisions made without worrying about this possibility will be actively problematic later in the development process, and that seems unlikely in the extreme to me. And lastly, why shouldn't the split network be small-world? By small world I assume you mean the triangle property holds, ie if a and b are connected, and b and c are too, then there is a significantly increased probability of a and c being connected. Is there some reason to believe that this property fails as soon as national / cultural borders get in the way? I can see there being bottlenecks, but I don't see how that precludes the small-world nature of the network. Evan ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Darknet and opennet: semi-separate networks?
On 8/17/06, Ian Clarke [EMAIL PROTECTED] wrote: On 17 Aug 2006, at 09:58, Matthew Toseland wrote: On Thu, Aug 17, 2006 at 09:37:02AM -0700, Ian Clarke wrote: I don't believe that the darknet and opennet will be weakly connected as you suggest, but neither of us can know for sure until we see it. We can know for near certain that darknets operating in hostile environments will be weakly connected to the opennet, and probably to other darknets too, for the simple reason that they CANNOT use opennet. No, but they can be connected to peers outside the hostile environment that can be promiscuous. Can they? If the outside peer is promiscuous, then it can be harvested (with some greater amount of effort than for 0.5, right?). So can't a hostile gov't harvest external promiscuous nodes and block all traffic to / from them? Then you'd need a user behind the firewall to connect to a darknet-only node outside the firewall, which would then connect to promiscuous nodes via darknet connections. That might be a problem... And it's definitely a way in which having an opennet hurts the darknet (though I do agree that we have a de facto opennet right now). Evan ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Insertion Verification
On 8/25/06, Michael Rogers [EMAIL PROTECTED] wrote: * Can reputations be negative as well as positive? * If reputations can be negative, how do you prevent nodes from generating new identities to escape bad reputations (whitewashing)? * If reputations can only be positive, do new nodes start with a zero reputation or a slightly positive reputation? * If new nodes start with a zero reputation, why should anyone trust them? * If new nodes start with a slightly positive reputation, how do you prevent nodes from generating new identities to return to a slightly positive reputation (whitewashing again)? Starting with a neutral reputation and the ability to have both positive and negative reputations is exactly equivalent to starting with a slightly positive reputation and the ability to have only positive reputations. I would argue that since the only absolute reference point is a new node, it ought to have reputation 0 by definition. And there's no reason not to allow negative reps, but we should assume that any node that has a negative rep will ditch its old identity, and therefore negative reps are useful only if they can be tied to something hard to change, like one of our darknet links. Evan ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] SHA-1 broken at the Crypto 2006
On 9/5/06, Michael Rogers [EMAIL PROTECTED] wrote: Matthew Toseland wrote: We will be using STS, at least initially. Which means checking a signature. Cool, IANAC but I think we should be OK. As long as we're signing the data, not its hash; in normal use, one signs the hash of the data for compute cost reasons (and IIRC there are security reasons too, but I don't have Applied Cryptography in front of me right now). That is secure as long as there is second preimage resistance, but the hash function *is* security critical. Evan ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
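To make that concrete, here is a minimal Java sketch (illustration only, not Freenet code): the standard JCA signature algorithms hash the message first and sign only the digest, which is why the hash function stays security critical even when the API looks like it is signing the raw data.

    import java.nio.charset.StandardCharsets;
    import java.security.KeyPair;
    import java.security.KeyPairGenerator;
    import java.security.Signature;

    public class SignatureSketch {
        public static void main(String[] args) throws Exception {
            byte[] data = "handshake transcript".getBytes(StandardCharsets.UTF_8);
            KeyPair kp = KeyPairGenerator.getInstance("DSA").generateKeyPair();

            // "SHA1withDSA" hashes the message with SHA-1, then signs the digest;
            // the signature is only as strong as SHA-1's (second) preimage resistance.
            Signature signer = Signature.getInstance("SHA1withDSA");
            signer.initSign(kp.getPrivate());
            signer.update(data);
            byte[] sig = signer.sign();

            Signature verifier = Signature.getInstance("SHA1withDSA");
            verifier.initVerify(kp.getPublic());
            verifier.update(data);
            System.out.println("signature valid: " + verifier.verify(sig));
        }
    }

If the hash loses second preimage resistance, an attacker who finds a second message with the same digest gets a valid signature on it for free.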
Re: [freenet-dev] Alpha, Darknet routing, et al.
On Feb 1, 2008 12:00 PM, Robert Hailey [EMAIL PROTECTED] wrote: On Jan 31, 2008, at 6:48 PM, Evan Daniel wrote: On Jan 30, 2008 5:49 PM, Matthew Toseland [EMAIL PROTECTED] wrote: You also need an escape-route mechanism - a way to find an entrance into another network once regular routing has exhausted the local network. Doesn't this allow an attacker to selectively DoS the bottleneck points by sending out requests for non-existent data? Evan Daniel If we allow the requestor to specify which network they are trying to get to, then maybe (but the node still can rejectoverload like any other). I think it would work better to the negative; specify which networks *not* to route to, this would not only help on a reject of a network-gateway node, but it also lets nodes w/o a good routing table use the same mechanism. Even if the requestor can't specify a target network, I think it works. If the model is that the request is first routed within the network, and if that fails it tries to find an escape route -- then that escape route is a bottleneck (by definition). The nodes using rejectoverload is insufficient, I think -- they'll reject the attacker's requests and real requests with similar probability, and so performance for real requests will degrade substantially. Now the attacker only needs resources comparable to the bottlenecks; they don't even have to know where those bottlenecks are in order to seriously degrade the network topology. I'm not familiar enough with the details of the proposed ULPRs and how USKs and Frost and the like check for new updates / messages, but it seems possible that simple legitimate checks for new content would have a similar effect. Of course, failure tables would help a lot with that case, but they wouldn't help against a malicious attacker. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Alpha, Darknet routing, et al.
On Feb 1, 2008 12:57 PM, Matthew Toseland [EMAIL PROTECTED] wrote: Even if the requestor can't specify a target network, I think it works. If the model is that the request is first routed within the network, and if that fails it tries to find an escape route -- then that escape route is a bottleneck (by definition). The nodes using rejectoverload is insufficient, I think -- they'll reject the attacker's requests and real requests with similar probability, and so performance for real requests will degrade substantially. Now the attacker only needs resources comparable to the bottlenecks; they don't even have to know where those bottlenecks are in order to seriously degrade the network topology. I'm not familiar enough with the details of the proposed ULPRs and how USKs and Frost and the like check for new updates / messages, but it seems possible that simple legitimate checks for new content would have a similar effect. Of course, failure tables would help a lot with that case, but they wouldn't help against a malicious attacker. Could ULPRs help to resolve it? Would it be possible to estimate the demand for a key (in a way which doesn't favour single nodes that constantly rerequest, and is biased by links so that an attacker could only attack proportionately to the number of connections he has), in order to decide which requests to let through? I think ULPRs will do a good job of preventing legitimate traffic from creating such an effect. A malicious attacker, however, would have no reason to repeat keys, so any technique that simply tries to make re-requests more efficient would have no effect. Biasing on popularity is probably a good thing, and if it can be done in a relatively attack-proof manner, might be the solution. Do we have any understanding of how well network clusters will correlate with content clusters? That is, if there are effectively two networks, especially if they result from cultural and language barriers, to what extent will the two sides be uninterested in communicating with each other? I think having a ballpark answer to that question will go a long way in determining how big a problem this really is, and also what sort of solutions might be appropriate. Of course, it sounds hard to answer :) Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Packet size proposal
On Sat, Mar 8, 2008 at 7:47 AM, Michael Rogers [EMAIL PROTECTED] wrote: Matthew Toseland wrote: RFC 2861 is rather late in the day for TCP. If we had set out to copy TCP we would likely not have seen it. So what's your point - that because TCP has bugs, anything that's not TCP won't have bugs? We're the only people looking for bugs in Freenet. Lots of people are looking at TCP. Not possible. Well, maybe with transport plugins we'd have it as well as UDP, but our primary transport will be based on UDP for the foreseeable future, because of NATs. TCP NAT traversal is nearly as reliable as UDP now, and likely to get better as new NATs implement the BEHAVE standards. Plus we have UPnP (yes, it's ugly and unreliable, but it's better than nothing) and NAT-PMP. Again, a lot of other people are working on this. At least for the near term future, and probably longer, we need an answer other than TCP because of ugliness like Comcast's Sandvine hardware. Forged TCP reset packets are non-trivial to deal with, but the equivalent problem doesn't even exist for UDP. Also, most consumer-level NATs are probably old devices that won't be upgraded any time soon. Remember, we want to handle an average user's NAT well, even if they can't / won't change the settings when Freenet asks them to. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Packet size proposal
On Sat, Mar 8, 2008 at 9:30 AM, Michael Rogers [EMAIL PROTECTED] wrote: Evan Daniel wrote: At least for the near term future, and probably longer, we need an answer other than TCP because of ugliness like Comcast's Sandvine hardware. Forged TCP reset packets are non-trivial to deal with, but the equivalent problem doesn't even exist for UDP. True, UDP is more robust than TCP against this particular attack, but that just means the next logical step in the P2P vs ISP arms race is for all the P2P apps to move to UDP, and then the ISPs will just start throttling UDP instead of forging RSTs. Ultimately if your ISP doesn't want to carry your traffic, they won't carry it. Sure, the arms race will continue. Hence near-term future. For the near-term future, we want to be on the winning side of it, rather than assuming we can switch to the way that *currently doesn't work*. You're also ignoring the reason they're forging resets rather than throttling -- they don't need to modify the main routing hardware to inject packets, they do need to modify it to drop or delay them. Thus it's cheaper to forge TCP resets than to throttle UDP or TCP. They certainly *can* throttle things properly, but the point remains that they aren't, and likely will continue doing exactly what they're doing for the near term future. I don't know what the state of legacy NATs is. I had been of the impression UDP worked better, but I could easily be mistaken. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Post 0.7 idea: off-grid darknet!
On Sat, May 10, 2008 at 12:33 PM, Ian Clarke [EMAIL PROTECTED] wrote: Ian is of the view that this should be a separate application based on similar principles to Freenet. I'm not. We agree that there are some significant issues to deal with. I am of the view that these networks are mutually complementary and therefore should talk to each other. I think the use-cases are too different for these to be part of the same application. IMHO, there's another interesting use-case. If I have a friend or two I see daily at work or similar, and we swap 8GB memory cards, that represents more bw than my cable modem uplink! (And the cost of a memory card is lower than 1 month's subscription, provided it gets swapped most days.) There's an interesting hybrid option here -- for large queued downloads, requests go over the network link, but responses go over sneakernet. I think flood routing inserts opportunistically is a good idea -- there's no point in sending out a memory card less than full, and routed requests / inserts may well not be enough to fill it. One interesting case is Cuba -- there's an operational sneakernet there already: http://www.nytimes.com/2008/03/06/world/americas/06cuba.html?ex=1362546000en=eff6155b2c2d280dei=5124partner=permalinkexprod=permalink Currently it's basically manually flood routed, but I imagine there would be significant demand for proper freenet routing to distribute entertainment; everyone wants to see the latest news media, but perhaps not the entertainment stuff depending on how much there is. There may also be significant numbers of local wifi hops available that aren't broadly connected (pure speculation on my part), so switching back and forth between regular Freenet links and sneakernet links could be useful. Also, in small communities where there's strong motivation and short geographical distances, you may well find the motivation sufficient to produce latencies of a couple hours, not a day or so, at least in some cases. I have visions of Neo from The Matrix, sitting in a darkened apartment and acting as clandestine data broker... Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Post 0.7 idea: off-grid darknet!
On Mon, May 12, 2008 at 6:48 PM, Michael Rogers [EMAIL PROTECTED] wrote: Evan Daniel wrote: I think flood routing inserts opportunistically is a good idea -- there's no point in sending out a memory card less than full, and routed requests / inserts may well not be enough to fill it. My knee-jerk reaction was flooding doesn't scale, but it's actually worked alright for Usenet - with a couple of tweaks. First, break down the traffic into channels and allow each node to decide which channels to carry. Second, flood the message IDs rather than the messages, and only request the messages you haven't seen. What I mean is: you're preparing an 8GB memory card to send to a neighbor. You've been able to find 3GB worth of data to fulfill some of his outstanding requests. Rather than leave the other 5GB empty, you should fill it with inserts you've seen recently, even if they might not be normally worth directing to him (because of wrong location, etc). Unlike other sorts of bandwidth, it can't be retasked for transmission to a different node. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Post 0.7 idea: off-grid darknet!
On Mon, May 12, 2008 at 6:56 PM, Ian Clarke [EMAIL PROTECTED] wrote: On Mon, May 12, 2008 at 9:52 AM, Matthew Toseland 2. Most or all Freenet apps assume a few seconds latency on requests (Frost, Fproxy, etc), yet the latency with the sneakernet would be measured in days. Freenet's existing apps would be useless here. Not true IMHO. A lot of existing Freenet apps deal with long term requests, which would work very nicely with sneakernet. Such as? FMS is pretty slow even with multi-second requests, do you really think it would be useful with multi-day requests? I can't think of a single Freenet app that would be useful over a transport with multi-day latencies, it would be insane. I'm pretty sure FMS is slow because it has a list of a few hundred identities to poll for messages, and it only polls 10-20 at a time. On a sneakernet you'd send all the poll requests at once. There's no reason the delay on receiving a message couldn't be roughly the one-way latency of the path. Downloading any sort of large media file can take days on Freenet *right now*. People still do it. What do I care whether the 4 day download delay is routing delay or bandwidth limit? The major change needed would be a way to request not the specific SSK block, but the SSK, whatever CHK it happens to redirect to, and any CHK blocks needed to decode the result -- plus a way to prevent that being a DoS attack (tit-for-tat?). Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
[freenet-dev] plugins/build.xml
Fix to plugins/build.xml to correctly specify paths, assuming normal svn checkout of both plugins/ and freenet/. Evan Daniel

Index: build.xml
===
--- build.xml (revision 19885)
+++ build.xml (working copy)
@@ -2,8 +2,8 @@
 <!-- ant build file for Freenet -->
 <project name="Freenet" default="dist" basedir=".">
-  <property name="freenet-cvs-snapshot.location" location="/home/nextgens/src/freenet/src/freenet/lib/freenet-cvs-snapshot.jar"/>
-  <property name="freenet-ext.location" location="/home/nextgens/src/freenet/src/freenet/lib/freenet-ext.jar"/>
+  <property name="freenet-cvs-snapshot.location" location="../freenet/lib/freenet-cvs-snapshot.jar"/>
+  <property name="freenet-ext.location" location="../freenet/lib/freenet-ext.jar"/>
   <property name="source-version" value="1.4"/>
   <property name="build" location="build/"/>
   <property name="dist" location="dist/"/>

___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] [freenet-cvs] r19912 - trunk/freenet/src/freenet/crypt/ciphers
On Fri, May 16, 2008 at 8:05 AM, Matthew Toseland [EMAIL PROTECTED] wrote: On Friday 16 May 2008 00:52, Daniel Cheng wrote: On Fri, May 16, 2008 at 1:13 AM, Matthew Toseland [EMAIL PROTECTED] wrote: On Thursday 15 May 2008 17:01, Daniel Cheng wrote: On Thu, May 15, 2008 at 10:30 PM, Matthew Toseland [EMAIL PROTECTED] wrote: On Tuesday 13 May 2008 17:10, [EMAIL PROTECTED] wrote: Author: j16sdiz Date: 2008-05-13 16:10:32 + (Tue, 13 May 2008) New Revision: 19912 Modified: trunk/freenet/src/freenet/crypt/ciphers/Rijndael.java Log: No Monte Carlo test for Rijndael Huh? The test outputs the Monte Carlo test result; it is supposed to be compared with ecb_e_m.txt in the FIPS standard. Our implementation is the original Rijndael (not the one in the FIPS standard), so the output does not match ecb_e_m.txt. Is that bad? Presumably changes during the standardisation process were to improve security? Just like what NIST did to other ciphers, this remains a mystery -- no one outside NIST knows why. This can be good or bad, depending on the conspiracy level. FYI, NIST once fixed a DES vulnerability before anybody else suspected there was a weakness. The standard AES is not compatible with our Rijndael implementation I guess it's not worth breaking the backward compatibility in 0.7.1. It might be if it's more secure...? Unless I'm mistaken, the difference between Rijndael and AES relates to things like specified block sizes and not the core crypto: http://en.wikipedia.org/wiki/Rijndael#Description_of_the_cipher Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
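One plausible reading of that difference, as a hedged sketch using BouncyCastle's lightweight API (not what freenet.crypt actually does): AES is Rijndael pinned to a 128-bit block, while the original Rijndael also allows 192- and 256-bit blocks, so an implementation exercising the larger block size cannot match the FIPS ecb_e_m.txt vectors.

    import org.bouncycastle.crypto.BlockCipher;
    import org.bouncycastle.crypto.engines.AESEngine;
    import org.bouncycastle.crypto.engines.RijndaelEngine;
    import org.bouncycastle.crypto.params.KeyParameter;

    public class RijndaelVsAes {
        public static void main(String[] args) {
            byte[] key = new byte[32]; // 256-bit key, all zeros, illustration only

            // AES: Rijndael restricted to a 128-bit block.
            BlockCipher aes = new AESEngine();
            aes.init(true, new KeyParameter(key));
            byte[] out128 = new byte[aes.getBlockSize()]; // 16 bytes
            aes.processBlock(new byte[16], 0, out128, 0);

            // Original Rijndael with a 256-bit block -- same core cipher, wider state.
            BlockCipher rijndael256 = new RijndaelEngine(256);
            rijndael256.init(true, new KeyParameter(key));
            byte[] out256 = new byte[rijndael256.getBlockSize()]; // 32 bytes
            rijndael256.processBlock(new byte[32], 0, out256, 0);
        }
    }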
Re: [freenet-dev] Moving to java 1.5
On Sat, May 17, 2008 at 10:08 PM, Juiceman [EMAIL PROTECTED] wrote: On Sat, May 17, 2008 at 4:58 PM, Matthew Toseland [EMAIL PROTECTED] wrote: GCC 4.3 shipped in March, including the new ECJ frontend. It has full support for all the new 1.5 language features. IMHO this means that there is no longer any reason to stick to java 1.4. Are the opensource jvm's up to 1.5? If so, I say go for it. :) IIRC, there are no JVM changes, only compiler changes. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Freenet uninstallation survey results so-far
On Mon, Sep 1, 2008 at 4:43 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: Searching: Lots of users want searching, and are disappointed when they don't get it. Integrating XMLLibrarian into freesites and into the homepage would help; making XMLSpider easier to use might help too; in any case, we need somebody to regularly insert an index. At present the spider doesn't work. I'm happy to run it for testing, but probably won't run it for a public index (I'd want to run it anonymously). See https://bugs.freenetproject.org/view.php?id=2350 for details. Short version: after some time, the spider stalls with a queue full of queries that neither complete nor fail. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Freenet uninstallation survey results so-far
On Tue, Sep 2, 2008 at 2:43 AM, Florent Daignière [EMAIL PROTECTED] wrote: * Evan Daniel [EMAIL PROTECTED] [2008-09-01 22:35:07]: On Mon, Sep 1, 2008 at 4:43 PM, Matthew Toseland [EMAIL PROTECTED] wrote: Searching: Lots of users want searching, and are disappointed when they don't get it. Integrating XMLLibrarian into freesites and into the homepage would help; making XMLSpider easier to use might help too; in any case, we need somebody to regularly insert an index. At present the spider doesn't work. I'm happy to run it for testing, but probably won't run it for a public index (I'd want to run it anonymously). See https://bugs.freenetproject.org/view.php?id=2350 for details. Short version: after some time, the spider stalls with a queue full of queries that neither complete nor fail. I have changed how callbacks are handled in previous stable; can you still reproduce it with builds post 1158? The behavior is unchanged with build 1160. The quickest way to verify is to recompile the spider with maxParallelRequests = 1. After a little while it should get stuck with a request in the queue that never completes or fails. (It took a few minutes to stall here, with 800 completed requests and 50 failed.) Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
[freenet-dev] Wiki and documentation
I think there are a number of problems that could be solved by better, up-to-date documentation. Many of these could better be solved by major fixes to the UI, but those have yet to appear. I think the simplest and easiest way to get better documentation is to have a more active wiki. There's a decent amount of good material on the wiki, but I don't think it sees much use. I propose that the wiki be given a more prominent place in the documentation: - Currently, the link to the wiki from the freenetproject.org main page is buried as the last link under the documentation subheading. Move it to the second link (after What is Freenet?) of the main menu. - Add a link on fproxy (with a redirect warning that the wiki is not anonymous). - Make an effort to point users to the relevant wiki pages if their question / problem has a known answer. Raising user awareness of the wiki is important. Thoughts? Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Freenet progress update
On Thu, Feb 26, 2009 at 4:19 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: - Possibly increase the number of nodes faster nodes can connect to. [...] TOP FIVE USERVOICE SUGGESTIONS: 1. Release the 20 nodes barrier. This is marked as under review, it may happen in 0.9. It requires some alchemy/tweaking. :| Alternately (or in addition), you could decrease the node limit on slow nodes. I've been running with a lower bandwidth limit lately (15KiB/s out), and I see a slightly better payload fraction with a 15-node limit than with 20 -- about 71-72%, vs about 68% with 20KiB/s. Bandwidth limiting still hits its target effectively (currently showing 14.3KiB/s average on 18h uptime). Subjectively, I can't see a difference running 20 vs 15 connections. As I understand it, the problem is that per-connection speed is limited by the slowest connection (approximately). If slow nodes had fewer connections, those connections would be faster, just as if the faster node had more connections. So from a bandwidth usage standpoint, the two approaches should be similar. I do see two advantages to not increasing the connection limit, though. With a small network of only a few thousand nodes, the diameter of the network is very small. Eventually, when Freenet has a large network, routing needs to work over a larger diameter. If you increase the connection limit now, you'll learn less about how Freenet scales in practice in the near future. Since reducing the connection limit on low bw nodes seems to increase the payload fraction, that means their bw is being used more efficiently. My recollection is that reducing the connection limit didn't change payload fraction at higher bw limits. Efficiency improvements are nice even if they're small and only on some of the network. I think it would be inappropriate to reduce the connection limit without further testing. Has anyone else with a low bw limit tried this? Does it cause any problems? If it doesn't cause any problems, I would suggest making the change be a small one initially. Rather than a flat 20 connections, something like 1 connection per 2KiB/s of outbound bandwidth, with a minimum of 15 and a max of 20. I'll perform some testing with 15 connections, 30KiB/s limit and report back on that. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
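A rough sketch of that rule in Java (the constants 2KiB/s, 15 and 20 are just the numbers suggested above, nothing implemented):

    public class PeerLimit {
        /** Hypothetical helper: peers allowed for a given output bandwidth limit. */
        static int maxPeers(int outputBytesPerSecond) {
            int byBandwidth = outputBytesPerSecond / 2048; // ~1 connection per 2KiB/s out
            return Math.max(15, Math.min(20, byBandwidth));
        }

        public static void main(String[] args) {
            System.out.println(maxPeers(15 * 1024));  // 15 KiB/s  -> 15 peers
            System.out.println(maxPeers(32 * 1024));  // 32 KiB/s  -> 16 peers
            System.out.println(maxPeers(100 * 1024)); // 100 KiB/s -> capped at 20 peers
        }
    }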
Re: [freenet-dev] Freenet progress update
On Mon, Mar 2, 2009 at 3:11 PM, Florent Daignière nextg...@freenetproject.org wrote: * Evan Daniel eva...@gmail.com [2009-02-27 10:58:19]: I think it would be inappropriate to reduce the connection limit without further testing. [...] Tweaking that code based on one's experience is just plain silly. Then it seems we're in agreement. Tweaking an emergent system based on hunches is silly. Gathering data and tweaking based on that data isn't. Individual anecdotes like my node's performance prove nothing, but can suggest routes for further investigation. Right now, all I think we know is that the current system works, and that there is reason to believe improvement is possible (ie unused available bandwidth). Do you disagree with that assessment? Is there a reason not to investigate this? I'm not wedded to any particular solution or testing method, and I can think of plenty of flaws in mine. If you have an improved proposal, by all means say so. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Freenet progress update
On Tue, Mar 10, 2009 at 7:16 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Monday 02 March 2009 20:55:59 Florent Daignière wrote: * Evan Daniel eva...@gmail.com [2009-03-02 15:41:59]: On Mon, Mar 2, 2009 at 3:11 PM, Florent Daignière nextg...@freenetproject.org wrote: * Evan Daniel eva...@gmail.com [2009-02-27 10:58:19]: I think it would be inappropriate to reduce the connection limit without further testing. [...] Tweaking that code based on one's experience is just plain silly. Then it seems we're in agreement. Tweaking an emergent system based on hunches is silly. Gathering data and tweaking based on that data isn't. Individual anecdotes like my node's performance prove nothing, but can suggest routes for further investigation. Right now, all I think we know is that the current system works, and that there is reason to believe improvement is possible (ie unused available bandwidth). Do you disagree with that assessment? Is there a reason not to investigate this? I'm not wedded to any particular solution or testing method, and I can think of plenty of flaws in mine. If you have an improved proposal, by all means say so. Yes, there are *good* reasons why we should keep the number of peers constant across nodes. - makes traffic analysis harder (CBR is good; there is even an argument saying we should pad and send garbage if we have to) How is this related to the number of peers being constant across very fast and very slow nodes? On a node with a very low transfer limit, we will have different padding behaviour than on a node with a very high transfer limit: a fast node has more opportunities for padding because we have a fixed period of time for coalescing. Agreed. This actually sounds like an argument in favor of variable connection count (at least relative to the current system). If each connection is CBR, then how many there are doesn't tell an observer anything. If all nodes have the same number of connections, then either there is bw going unused or some connections are higher bw than others -- which means there is significant routing based on capacity rather than location. If a fast node simply has more connections at the same (constant) bit rate instead of a mix of fast and slow links, this is mitigated. - we don't want to go back to the route and misroute according to load approach Agreed. - dynamic systems are often easier to abuse than static ones It has been discussed numerous times already; As far as I am concerned, nothing has changed... We have to accept that we will always have to deal with slow nodes and those are going to determine the speed of the whole network. The only parameter we should change is the height of the entry fence: how much is the minimal configuration needed to access freenet. Obviously that's some form of elitism... but that's much better than the alternative (creating a dynamic, fragile system which will work well only for *some* people). I don't see why it would be more fragile... however we would have to deploy it and try to see if we can tell whether it's an improvement, which may be tricky... NextGen$ ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Current design of Freenet, routing/storage heuristics and arbitrary constants was Re: Article contd.
2009/3/4 Matthew Toseland t...@amphibian.dyndns.org: STORAGE: There are 6 stores: a store and a cache for each of SSK, CHK and public key (for SSKs; we can avoid sending this over the network in many cases). Every block fetched is stored in the relevant cache. When an insert completes, it will be stored only if there is no node with a location closer to the key's location than this node. But this calculation ignores peers with a lower reported uptime than 40% (this is simply an estimate computed by the node and sent to us on connection), and ignores backed off nodes unless all nodes are backed off. It does not take into account recently failed status etc. Arguably we should send uptime messages more often than on connection: this disadvantages high uptime nodes that connected when they were first installed. In practice the store fills up *much* slower than the cache. There is a technique that would make the store fill more quickly than it currently does without any drawbacks (aside from a small amount of development time ;) ). Right now, there are two hash tables with one slot per location. One hash table is the store, one the cache. (Obviously I'm only considering a single one of CHK/SSK/pubkey.) This is equivalent to a single hash table with two slots per location, with rules that decide which slot to use rather than which hash table. Currently, the rule is that inserts go in the store and fetches in the cache. The change is this: when storing a key in the cache, if the cache slot is already occupied but the store slot is empty, put it in the store instead (and vice versa). Even without the bloom filter, this doesn't add any disk reads -- by treating it as one hash table with two slots per location, you put those two slots adjacent on disk and simply make one larger read to retrieve both keys. This is a technique I first saw in hash tables for chess programs. Evaluation results are cached, with two distinct slots per location. One slot stores the most recent evaluation result, the other stores the most expensive to recompute. There is a noticeable performance improvement in the cache if you are willing to store a result in the wrong slot when only one of the two is full already. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
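A minimal sketch of the two-slot rule, with made-up names (the real salted-hash store is far more involved, but the slot-selection logic is the whole idea):

    public class TwoSlotStore {
        /** One record: the routing key and the data block stored under it. */
        static final class Slot { byte[] key; byte[] data; }

        // Two adjacent slots per hash-table location: index 0 = "store", 1 = "cache".
        private final Slot[][] slots;

        TwoSlotStore(int locations) {
            slots = new Slot[locations][2];
        }

        /** Inserted blocks prefer the store slot, fetched blocks prefer the cache slot. */
        void put(int location, byte[] key, byte[] data, boolean isInsert) {
            int preferred = isInsert ? 0 : 1;
            int other = 1 - preferred;
            int target;
            if (slots[location][preferred] == null) {
                target = preferred;   // preferred slot is free
            } else if (slots[location][other] == null) {
                target = other;       // preferred full, other empty: use it instead of evicting
            } else {
                target = preferred;   // both full: evict from the preferred slot as before
            }
            Slot s = new Slot();
            s.key = key;
            s.data = data;
            slots[location][target] = s;
        }

        /** Both slots sit adjacent on disk, so one slightly larger read checks them both. */
        byte[] get(int location, byte[] key) {
            for (Slot s : slots[location]) {
                if (s != null && java.util.Arrays.equals(s.key, key)) return s.data;
            }
            return null;
        }
    }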
Re: [freenet-dev] Current design of Freenet, routing/storage heuristics and arbitrary constants was Re: Article contd.
On Wed, Mar 11, 2009 at 1:22 PM, Oskar Sandberg os...@sandbergs.org wrote: On Wed, Mar 11, 2009 at 5:03 PM, Evan Daniel eva...@gmail.com wrote: There is a technique that would make the store fill more quickly than it currently does without any drawbacks (aside from a small amount of development time ;) ). Right now, there are two hash tables with one slot per location. One hash table is the store, one the cache. (Obviously I'm only considering a single one of CHK/SSK/pubkey.) This is equivalent to a single hash table with two slots per location, with rules that decide which slot to use rather than which hash table. Currently, the rule is that inserts go in the store and fetches in the cache. The change is this: when storing a key in the cache, if the cache slot is already occupied but the store slot is empty, put it in the store instead (and vice versa). Even without the bloom filter, this doesn't add any disk reads -- by treating it as one hash table with two slots per location, you put those two slots adjacent on disk and simply make one larger read to retrieve both keys. This is a technique I first saw in hash tables for chess programs. Evaluation results are cached, with two distinct slots per location. One slot stores the most recent evaluation result, the other stores the most expensive to recompute. There is a noticeable performance improvement in the cache if you are willing to store a result in the wrong slot when only one of the two is full already. Isn't this just a more complicated way of saying: put anything which you cache into the store if the store isn't full yet? Basically. And an observation that there doesn't have to be a performance penalty in doing so. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Current uservoice top 5
On Mon, Apr 20, 2009 at 10:34 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: So duplicating the top block is fairly important. Another weakness is that the last segment in a splitfile may have much less redundancy than the rest; this can be fixed by making the last 2 segments the same size. Is there any reason to make the first n-2 segments full size and the last 2 balanced and potentially only (slightly over) half size, rather than make all n segments the same size? Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
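To make the question concrete, a small sketch comparing the three layouts for a hypothetical 300-block file with 128-block segments (128 is the usual Freenet segment size; the rest is just arithmetic, hard-coded for this 3-segment example):

    public class SegmentLayouts {
        static void print(String label, int[] segs) {
            System.out.println(label + " " + java.util.Arrays.toString(segs));
        }

        public static void main(String[] args) {
            int blocks = 300, max = 128;
            int n = (blocks + max - 1) / max; // number of segments = 3

            // Current: full segments plus a small remainder with the least redundancy.
            print("current:      ", new int[] { 128, 128, blocks - 2 * 128 });           // [128, 128, 44]

            // Proposed: full segments except the last two, which are balanced.
            int lastTwo = blocks - (n - 2) * max;
            print("last-2 equal: ", new int[] { 128, lastTwo - lastTwo / 2, lastTwo / 2 }); // [128, 86, 86]

            // Alternative raised above: all n segments (nearly) the same size.
            print("all equal:    ", new int[] { blocks - 2 * (blocks / n), blocks / n, blocks / n }); // [100, 100, 100]
        }
    }

The all-equal layout gives every segment the same redundancy ratio; the last-2-balanced layout keeps the earlier segments at full size while still removing the tiny, fragile tail segment.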
Re: [freenet-dev] Easy top block duplication: Content Multiplication Keys
On Thu, Apr 23, 2009 at 4:01 PM, Robert Hailey rob...@freenetproject.org wrote: On Apr 23, 2009, at 12:34 PM, Florent Daigniere wrote: Robert Hailey wrote: Perhaps there is an easier solution? How about extending the chk logic into an alternate-chk-key (ACK?); that simply adds 0.25 to the expected location (for routing and storage). So when you insert the top block, put it in as a chk and an ack (no extra uri's necessary). When you go to fetch it, if the chk does not work, try the ack variant of the same key. At the moment each node on the path can verify that the data sent by previous hop corresponds to what it ought to; How would that work with your proposed solution? NextGen$ Sorta like this...

    package freenet.keys;

    public class ASKKey extends NodeCHK {
        public double toNormalizedDouble() {
            return (super.toNormalizedDouble() + 0.25) % 1.0;
        }
    }

The only difference is where any node would look for it. This would not be exposed to the client. My idea is that any chk could be converted to an alternate-location-finding-key just by type (which surely would mean a different fetch-command, e.g. fetchCHK/fetchACK...). There would be no difference in handling, the only difference would be how the target-routing-location is identified from the key (the same as CHK plus a constant [mod 1.0]). After all, the mapping from the key to the small-world location is open to interpretation... -- Robert Hailey I suggested the obvious extension of this on IRC. Instead of simply searching at location + 0.25, you search at location + n/N, where n is which copy of the block you're looking for, and N is the number of copies inserted. Toad didn't like this because it makes top blocks identifiable to everyone on the routing path, and involves network-level changes. The other approaches can be implemented at a higher level as a translation before handing a normal CHK request to the network. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
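As a hedged sketch, the generalisation amounts to nothing more than this (the class and method are hypothetical, not anything in the tree):

    public class MultiLocationKey {
        /**
         * Routing location for the n-th of N copies of a block: the base CHK
         * location shifted by n/N around the circular keyspace.
         */
        public static double copyLocation(double chkLocation, int n, int totalCopies) {
            return (chkLocation + (double) n / totalCopies) % 1.0;
        }
    }

A fetcher would try copy 0 first and fall back to copies 1..N-1; that fallback pattern is exactly what makes the top-block copies recognisable to nodes along the routing path.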
Re: [freenet-dev] Can we implement Bloom filter sharing quickly???
On Fri, May 1, 2009 at 3:37 PM, Robert Hailey rob...@freenetproject.org wrote: On May 1, 2009, at 9:35 AM, Matthew Toseland wrote: IMPLEMENTING IT: Main tasks: - Converting our datastore indexes into a compact, slightly more lossy, 1-bit filter. (Easy) - Creating a snapshot once an hour, and keeping for some period (1 week?). (Easy) - Creating an efficient binary diff format for hour-to-hour diffs, and keeping them. (Moderate) This might actually be quite hard as an efficient bloom filter will scatter (even a few) updates all over the filter. Actually, it's not hard at all. The naive version is to observe that a tiny fraction of the bits in the filter changed, and just record the location of each bit that changed. At 0.24 writes per second, you'll get on average 0.24*3600*2 keys changed per hour (each write represents 1 add and 1 delete), or 0.24*3600*2*16 counters changed per hour. Unless I'm mistaken, changing a counter in the counting bloom filter has ~ 50% probability of changing the bit in the compressed non-counting version. So that means 0.24*3600*2*16*0.5=13824 bits changed per hour. The address of a bit can be represented in 32 bits trivially (for bloom filters up to 512MiB in size), so the 1-hour diff should consume 13824*32/8=55296 bytes. That represents 15.36 bytes/s of traffic for each peer, or 307.2B/s across 20 peers. That encoding isn't terribly efficient. More efficient is to sort the addresses and compute the deltas. (So if bits 19, 7, and 34 changed, I send the numbers 7, 12, 15.) Those deltas should follow a geometric distribution with parameter (number of bits changed) / (size of filter). It's easy to build an arithmetic coding for that data that will achieve near-perfect compression (see http://en.wikipedia.org/wiki/Arithmetic_coding for example). My BOTE estimate is that toad's 84MiB filter would compress at 14.5 bits per address (instead of the 30 or 32 you'd get with no compression; gzip or lzw should be somewhere in between). Evan ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
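A sketch of the sorted-delta encoding described above, using a plain 7-bit varint per gap rather than a true arithmetic coder (so it lands somewhere between the naive 32 bits and the 14.5-bit estimate):

    import java.io.ByteArrayOutputStream;
    import java.util.Arrays;

    public class BloomDiff {
        /** Encode the positions of changed bits as sorted gaps, each gap as a 7-bit varint. */
        static byte[] encode(long[] changedBits) {
            long[] sorted = changedBits.clone();
            Arrays.sort(sorted);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            long prev = 0;
            for (long pos : sorted) {
                long gap = pos - prev; // gaps are small and roughly geometric, so they pack well
                prev = pos;
                while (gap >= 0x80) {
                    out.write((int) (gap & 0x7F) | 0x80);
                    gap >>>= 7;
                }
                out.write((int) gap);
            }
            return out.toByteArray();
        }

        public static void main(String[] args) {
            // Example from the post: bits 19, 7 and 34 changed -> gaps 7, 12, 15.
            byte[] diff = encode(new long[] { 19, 7, 34 });
            System.out.println(diff.length + " bytes"); // 3 bytes for this toy case
        }
    }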
Re: [freenet-dev] Can we implement Bloom filter sharing quickly???
On Fri, May 1, 2009 at 4:32 PM, Robert Hailey rob...@freenetproject.org wrote: On May 1, 2009, at 3:02 PM, Matthew Toseland wrote: [20:37:16] evanbd You standardize on a size for the bloom filters; say 1MiB. Then, if your store has 100GB of data, and needs 18MiB of bloom filters, you partition the keyspace into 18 chunks of roughly equal population. Each segment of the keyspace then gets its own bloom filter. [20:38:24] evanbd Then, if my node has 100MiB of memory to spend on my peers' bloom filters, and 20 peers, I just ask each peer for up to 5 bloom filters. [20:40:11] toad_ evanbd: it's good to have a partial filter for each node yes [20:40:24] toad_ evanbd: however, you end up checking more filters which increases false positives [20:40:52] evanbd No, the fp rate stays the same. [20:41:18] evanbd Suppose your node has 18 filters, each with a 1E-5 fp rate [20:41:36] evanbd When I get a request, I compare it to your node's filter set. [20:41:57] evanbd But only *one* of those filters gets checked, since each one covers a different portion of the keyspace If requested to send 5 partial filters, which do you send? I presume those closest to the originator's present location. Looks like you actually may end up checking less filters overall (if the blooms from large peers are out-of-range of the request). Yes, that's a question worth considering. There are both performance and security issues involved, I think. Note that the partition could be a set of contiguous regions (allowing performance optimization around which piece of the keyspace you send info about), but it could just as easily be determined by a hash function instead. You still check the same number of filters overall -- one per peer. The difference is that for some peers you may have a partial filter set, and therefore sometimes check their filters, instead of deciding you don't have the memory for that peer's filter and never checking it. Nice idea, Evan! Thanks! Evan ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
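A sketch of the lookup side of that (all names hypothetical): the key's hash selects exactly one sub-filter per peer, so holding only some of a peer's sub-filters just means we occasionally have no information about that peer, never extra false positives.

    import java.util.BitSet;

    public class PartitionedFilters {
        /** Sub-filters held for one peer; null entries are segments we chose not to fetch. */
        static final class PeerFilters {
            BitSet[] segments;   // e.g. up to 18 x 1MiB segments for a 100GB store
            int bitsPerSegment;
            int hashesPerKey;
        }

        /** True if the peer *might* have the key; false only if we can rule it out. */
        static boolean mightHave(PeerFilters peer, long keyHash) {
            // Exactly one segment is relevant to any given key.
            int seg = (int) Math.floorMod(keyHash, (long) peer.segments.length);
            BitSet filter = peer.segments[seg];
            if (filter == null) return true; // segment not held: no information
            for (int i = 0; i < peer.hashesPerKey; i++) {
                long h = keyHash * 31 + i;   // stand-in for the real per-filter hash family
                if (!filter.get((int) Math.floorMod(h, (long) peer.bitsPerSegment))) return false;
            }
            return true; // all probed bits set: possible hit, worth forwarding
        }
    }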
Re: [freenet-dev] Can we implement Bloom filter sharing quickly???
On Fri, May 1, 2009 at 7:01 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Friday 01 May 2009 22:43:50 Robert Hailey wrote: On May 1, 2009, at 3:46 PM, Evan Daniel wrote: Yes, that's a question worth considering. There are both performance and security issues involved, I think. Note that the partition could be a set of contiguous regions (allowing performance optimization around which piece of the keyspace you send info about), but it could just as easily be determined by a hash function instead. You still check the same number of filters overall -- one per peer. The difference is that for some peers you may have a partial filter set, and therefore sometimes check their filters, instead of deciding you don't have the memory for that peer's filter and never checking it. Maybe if we partition it we can also get a free datastore histogram on the stats page. No, we cannot divide by actual keyspace, the keys must be hashed first, or the middle bloom filter will be far too big. Well, you could partition by actual keyspace as long as the partitions are (approximately) equal in population rather than fraction of the keyspace they cover. Doing that is only mildly nontrivial, and gives you a histogram with variable-width bars, but still an accurate histogram. Each bar would cover the same area; tall and skinny near the node's location, wide and short away from it. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
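A sketch of the equal-population split (keyLocations would be the locations of the keys actually in the store; the histogram bars built on these boundaries then all have equal area):

    import java.util.Arrays;

    public class EqualPopulationPartition {
        /**
         * Returns numPartitions-1 boundary locations in [0,1) such that each
         * partition covers (approximately) the same number of stored keys.
         */
        static double[] boundaries(double[] keyLocations, int numPartitions) {
            double[] sorted = keyLocations.clone();
            Arrays.sort(sorted);
            double[] bounds = new double[numPartitions - 1];
            for (int i = 1; i < numPartitions; i++) {
                // The i-th boundary sits after i/numPartitions of the population.
                bounds[i - 1] = sorted[(int) ((long) i * sorted.length / numPartitions)];
            }
            return bounds;
        }
    }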
Re: [freenet-dev] Can we implement Bloom filter sharing quickly???
On Fri, May 1, 2009 at 7:59 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Saturday 02 May 2009 00:53:27 Evan Daniel wrote: On Fri, May 1, 2009 at 7:01 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Friday 01 May 2009 22:43:50 Robert Hailey wrote: On May 1, 2009, at 3:46 PM, Evan Daniel wrote: Yes, that's a question worth considering. There are both performance and security issues involved, I think. Note that the partition could be a set of contiguous regions (allowing performance optimization around which piece of the keyspace you send info about), but it could just as easily be determined by a hash function instead. You still check the same number of filters overall -- one per peer. The difference is that for some peers you may have a partial filter set, and therefore sometimes check their filters, instead of deciding you don't have the memory for that peer's filter and never checking it. Maybe if we partition it we can also get a free datastore histogram on the stats page. No, we cannot divide by actual keyspace, the keys must be hashed first, or the middle bloom filter will be far too big. Well, you could partition by actual keyspace as long as the partitions are (approximately) equal in population rather than fraction of the keyspace they cover. Doing that is only mildly nontrivial, and gives you a histogram with variable-width bars, but still an accurate histogram. Each bar would cover the same area; tall and skinny near the node's location, wide and short away from it. And if the distribution was ever to change ... it's not mildly nontrivial IMHO. Anyway, the first version should simply use the existing filters, to make things easy. You have to recompute the bloom filter occasionally regardless, otherwise the counters eventually all saturate. If the distribution changes enough to be problematic, you rebalance the filters. If you're willing to be slightly inefficient, you can avoid recomputing all the filters. When one filter starts getting full, you split its portion of the keyspace in half and create a pair to take its place. On average your filters will only be 3/4 full or so, but you reduce the computational load (though probably not the disk load? hmm...) Or you could just partition the keyspace by hashing and trust that equal size partitions will have equal populations. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Current uservoice top 5
On Mon, May 4, 2009 at 11:33 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: 1. Release the 20 nodes barrier (206 votes) As I have mentioned IMHO this is a straightforward plea for more performance. I'll reiterate a point I've made before. While this represents a simple plea for performance, I don't think it's an irrational one -- that is, I think the overall network performance is hampered by having all nodes have the same number of connections. Because all connections use similar amounts of bandwidth, the network speed is limited by the slower nodes. This is true regardless of the absolute number of connections; raising the maximum for fast nodes should have a very similar effect to lowering it for slow nodes. What matters is that slow nodes have fewer connections than fast nodes. For example, the max allowed connections (and default setting) could be 1 connection per 2KiB/s output bandwidth, but never more than 20 or less than 15. Those numbers are based on some (very limited) testing I've done -- if I reduce the allowed bw, that is the approximate number of connections required to make full use of it. Reducing the number of connections for slow nodes has some additional benefits. First, my limited testing shows a slight increase in payload % at low bw limits as a result of reducing the connection count (there is some per-connection network overhead). Second, bloom filter sharing represents a per-connection overhead (mostly in the initial transfer -- updates are low bw, as discussed). If (when?) implemented, it will represent a smaller total overhead with fewer connections than with more. Presumably, the greatest impact is on slower nodes. On the other hand, too few connections may make various attacks easier. I have no idea how strong an effect this is. However, a node that has too many connections (ie insufficient bw to use them all fully) may show burstier behavior and thus be more susceptible to traffic analysis. In addition, fewer connections means a larger network diameter on average, which may have an impact on routing. Lower degree also means that the node has fewer neighbor bloom filters to check, which means that a request is compared against fewer stores during its traversal of the network. I'm intentionally suggesting a small change -- it's less likely to cause major problems. By keeping the ratio between slow nodes (15 connections) and fast nodes (20 connections) modest, the potential for reliance on ubernodes is kept minimal. (Similarly, if you want to raise the 20 connections limit instead of lower it, I think it should only be increased slightly.) And finally: I have done some testing on this proposed change. At first glance, it looks like it doesn't hurt and may help. However, I have not done enough testing to be able to say anything with confidence. I'm not suggesting to implement this change immediately; rather, I'm saying that *any* change like this should see some real-world testing before implementation, and that reducing the defaults for slow nodes is as worthy of consideration and testing as raising it for fast nodes. Also: do we have any idea what the distribution of available node bandwidth looks like? Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Current uservoice top 5
On Mon, May 4, 2009 at 6:15 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Monday 04 May 2009 17:29:51 Evan Daniel wrote: On Mon, May 4, 2009 at 11:33 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: 1. Release the 20 nodes barrier (206 votes) As I have mentioned IMHO this is a straightforward plea for more performance. I'll reiterate a point I've made before. While this represents a simple plea for performance, I don't think it's an irrational one -- that is, I think the overall network performance is hampered by having all nodes have the same number of connections. Because all connections use similar amounts of bandwidth, the network speed is limited by the slower nodes. This is true regardless of the absolute number of connections; raising the maximum for fast nodes should have a very similar effect to lowering it for slow nodes. What matters is that slow nodes have fewer connections than fast nodes. For example, the max allowed connections (and default setting) could be 1 connection per 2KiB/s output bandwidth, but never more than 20 or less than 15. What would the point be? Don't we need a significant range for it to make much difference? If the network is in fact limited by the per-connection speed of the slower nodes, and they are in fact a minority of the network, increasing the per-connection bandwidth of the slower nodes by 33% should result in a throughput increase for most of the rest of the network of a similar magnitude. A performance improvement of 10-30% should be easily measurable, and (at the high end of that) noticeable enough to be appreciated by most users. Really, though, the idea would be to use it as a network-wide test. Small tests by a few users are helpful, but not nearly as informative as a network-wide test. Assuming the change produced measurable improvement, it would make sense to explore further changes. For example, changing the range to 15-30, or increasing the per-connection bandwidth requirement, or making the per-connection requirement nonlinear, or some other option. However, security concerns (especially ubernodes) are bigger with more dramatic changes. Those numbers are based on some (very limited) testing I've done -- if I reduce the allowed bw, that is the approximate number of connections required to make full use of it. Reducing the number of connections for slow nodes has some additional benefits. First, my limited testing shows a slight increase in payload % at low bw limits as a result of reducing the connection count (there is some per-connection network overhead). True. To be specific, my anecdotal evidence is that it improves the payload fraction by roughly 3-8%. Second, bloom filter sharing represents a per-connection overhead (mostly in the initial transfer -- updates are low bw, as discussed). If (when?) implemented, it will represent a smaller total overhead with fewer connections than with more. Presumably, the greatest impact is on slower nodes. Really it's determined by churn, isn't it? Or by any heuristic artificial limits we impose... My assumption is that connection duration is well modeled by a per-connection half-life, that is largely independent of the number of connections. The bandwidth used on such filters is proportional to the total churn, so fewer connections means less churn in absolute sense but the same connection half-life. (That is, bloom filter bandwidth usage is proportional to # of connections * per-connection churn rate.) I don't have any evidence for that assumption, though. 
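Stated as a toy formula (a sketch of the stated assumption, not a measured model; the half-life figure would have to come from real churn data):

    // If each connection is lost and replaced at a rate set by a fixed
    // per-connection half-life, the bandwidth spent resending full filters
    // to new peers scales linearly with the connection count.
    static double filterBandwidthBytesPerHour(int connections, double halfLifeHours,
                                              double filterSizeBytes) {
        double replacementsPerConnectionPerHour = Math.log(2) / halfLifeHours;
        return connections * replacementsPerConnectionPerHour * filterSizeBytes;
    }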
On the other hand, too few connections may make various attacks easier. I have no idea how strong an effect this is. However, a node that has too many connections (ie insufficient bw to use them all fully) may show burstier behavior and thus be more susceptible to traffic analysis. Yes, definitely true with our current padding algorithms. In addition, fewer connections means a larger network diameter on average, which may have an impact on routing. Lower degree also means that the node has fewer neighbor bloom filters to check, which means that a request is compared against fewer stores during its traversal of the network. True. Do you know how big a problem this would cause? My assumption is that it would be a fairly small effect even on the nodes with fewer connections, and that they would be in the minority. I'm intentionally suggesting a small change -- it's less likely to cause major problems. By keeping the ratio between slow nodes (15 connections) and fast nodes (20 connections) modest, the potential for reliance on ubernodes is kept minimal. (Similarly, if you want to raise the 20 connections limit instead of lower it, I think it should only be increased slightly.) Why? I don't see the point unless the upper bound is significantly higher than the lower bound: any improvement won't be measurable. As above, I would hope
Re: [freenet-dev] Question about an important design decision of the WoT plugin
I don't have any specific ideas for how to choose whether to ignore identities, but I think you're making the problem much harder than it needs to be. The problem is that you need to prevent spam, but at the same time prevent malicious non-spammers from censoring identities who aren't spammers. Fortunately, there is a well documented algorithm for doing this: the Advogato trust metric. The WoT documentation claims it is based upon the Advogato trust metric. (Brief discussion: http://www.advogato.org/trust-metric.html Full paper: http://www.levien.com/thesis/compact.pdf ) I think this is wonderful, as I think there is much to recommend the Advogato metric (and I pushed for it early on in the WoT discussions). However, my understanding of the paper and what is actually implemented is that the WoT code does not actually implement it. Before I go into detail, I should point out that I haven't read the WoT code and am not fully up to date on the documentation and discussions; if I'm way off base here, I apologize. The Advogato metric is designed from the ground up to have strong spam-resistance properties. In fact, it has a mathematical proof of how strong they are: the amount of spam that gets through is limited by the number of confused nodes, that is nodes who are not spammers (or simple shills of spammers), but who have mistakenly marked spammers as trustworthy. The existence of this proof is, to me, so compelling an argument in favor of using the metric that I believe any changes to the algorithm that do not come with an updated version of the proof should be looked upon with extreme suspicion. I'll leave the precise descriptions of the two algorithms to those who are actually writing the code for now. (Though I have read the Advogato paper and feel I understand it fairly well -- it's rather dense, though, and I'd be happy to try to offer a clearer or more detailed explanation of the paper if that would be helpful.) However, one of the properties of the Advogato metric (which the WoT algorithm, AIUI, does not have) is worth discussing, as I think it is particularly relevant to issues around censorship that are frequently discussed wrt WoT and Freenet. Specifically, Advogato does not use negative trust ratings, whereas both WoT and FMS do. The concept of negative trust ratings has absolutely nothing to do with the arbitrary numbers one person assigns to another in their published trust list. Those can be on any scale you like, whether it's 0-100, 1-5, or -100 to +100. A system can have or not have negative trust properties on any of those scales. Instead, negative trust is a property based on how the trust rating computed for an identity behaves as other identities *change* their trust ratings. Let's suppose that Alice trusts Bob, and is trying to compute a trust rating for Carol (whom she does not have a direct rating for). Alice has trust ratings for people not named, some of whom have ratings for Carol published. If the trust computation is such that there exists a rating Bob can assign to Carol such that Alice's rating of Carol is worse than if Bob had not rated her at all, then the system exhibits negative trust behaviors. This is, broadly, equivalent to the ability to censor a poster of FMS or WoT by marking them untrusted. There has been much debate over the question of censoring posters never, only for spamming, for spamming plus certain objectionable speech, what should be objectionable, whether you should censor someone who publishes a trust list that censors non-spammers, etc. 
In my opinion, all of that discussion is very silly to be having in the first place, since the answer is so well documented: simply don't use a trust metric with negative trust behaviors! The problem of introductions, etc. is not magically solved by the Advogato algorithm. However, I don't think it is made any harder by it. The dual benefits of provable spam resistance and lack of censorship are, in my opinion, rather compelling. Evan Daniel On Wed, May 6, 2009 at 5:00 PM, xor x...@gmx.li wrote: Hello, I am currently refactoring the WoT plugin to allow per-context trust values. Let's first explain how WoT currently works so you can understand what I mean: - There is a set of Identities. An identity has an SSK URI, a nickname, a set of contexts (and a set of properties). An own identity is an identity of the user of the plugin; he owns the SSK insert URI so he can insert the identity. - Each identity can offer a Set<String> of contexts. A context is a client application; currently there are: Introduction (the given identity publishes captchas to allow other identities to get known by the web of trust by solving a captcha - if you solve one, you get on the publisher's trust list) and Freetalk, which is the messaging system based on WoT (comparable to FMS) which I am implementing. - Identities currently can give each user a trust value from -100 to +100. Each trust relationship
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Thu, May 7, 2009 at 4:00 AM, xor x...@gmx.li wrote: On Thursday 07 May 2009 00:02:11 Evan Daniel wrote: The WoT documentation claims it is based upon the Advogato trust metric. (Brief discussion: http://www.advogato.org/trust-metric.html Full paper: http://www.levien.com/thesis/compact.pdf ) I think this is wonderful, as I think there is much to recommend the Advogato metric (and I pushed for it early on in the WoT discussions). However, my understanding of the paper and what is actually implemented is that the WoT code does not actually implement it. I must admit that I do not know whether its claim that it implements Advogato is right or not. I have refactored the code but I have not modified the trust calculation logic and have not checked whether it is Advogato or not. Someone should probably do that. I don't have any specific ideas for how to choose whether to ignore identities, but I think you're making the problem much harder than it needs to be. Why exactly? Your post is nice but I do not see how it answers my question. The general problem my post is about: New identities are obtained by taking them from trust lists of known identities. An attacker therefore could put 100 identities in his trust list to fill up your database and slow down WoT. Therefore, an decision has to be made when to NOT import new identities from someone's trust list. In the current implementation, it is when he has a negative score. As I've pointed out, in the future there will be MULTIPLE webs of trust, for different contexts - Freetalk, Filesharing, Identity-Introduction (you can get a trust value from someone in that context when you solve a captcha he has published), so the question now is: Which context(s) shall be used to decide when to NOT import new identity's from someones trust list anymore? I have not examined the WoT code. However, the Advogato metric has two attributes that I don't think the current WoT method has: no negative trust behavior (if there is a trust rating Bob can assign to Carol such that Alice will trust Carol less than if Bob had not assigned a rating, that's a negative trust behavior), and a mathematical proof as to the upper limit on the quantity of spammer nodes that get trusted. The Advogato metric is *specifically* designed to handle the case of the attacker creating millions of accounts. In that case, his success is bounded (linear with modest constant) by the number of confused nodes -- that is, legitimate nodes that have (incorrectly) marked his accounts as legitimate. If you look at the flow computation, it follows that for nodes for which the computed trust value is zero, you don't have to bother downloading their trust lists, so the number of such lists you download is similarly well controlled. As for contexts, why should the same identity be treated differently in different contexts? If the person is (believed to be) a spammer in one context, is there any reason to trust them in some other context? I suppose I don't really understand the purpose of having different contexts if your goal is only to filter out spammers. Wasn't part of the point of the modular approach of WoT that different applications could share trust lists, thus preventing users from having to mark trust values for the same identities several times? Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
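For reference, a minimal Java sketch of the capacity-assignment step of the Advogato metric referred to above (the class name and the capacity schedule are illustrative assumptions; this is not the WoT plugin's code):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class AdvogatoCapacitySketch {
        // Capacity by trust-list distance from the user's own ("seed") identity;
        // the paper derives such a schedule from the desired target set size.
        static final int[] CAPACITY_BY_DISTANCE = {800, 200, 50, 12, 4, 2, 1};

        // Breadth-first pass assigning each reachable identity a capacity. In the
        // full metric each identity is then split into an in-node and an out-node
        // joined by an edge of this capacity, every out-node gets a unit edge to a
        // supersink, and a single max flow is computed from the seed; identities
        // that receive flow are accepted, everyone else stays at zero and their
        // trust lists never need to be fetched.
        static Map<String, Integer> assignCapacities(String seed,
                Map<String, List<String>> trustLists) {
            Map<String, Integer> capacity = new HashMap<String, Integer>();
            Map<String, Integer> distance = new HashMap<String, Integer>();
            Deque<String> queue = new ArrayDeque<String>();
            distance.put(seed, 0);
            queue.add(seed);
            while (!queue.isEmpty()) {
                String id = queue.remove();
                int d = distance.get(id);
                capacity.put(id, CAPACITY_BY_DISTANCE[Math.min(d, CAPACITY_BY_DISTANCE.length - 1)]);
                List<String> trusted = trustLists.get(id);
                if (trusted == null) continue;
                for (String t : trusted) {
                    if (!distance.containsKey(t)) {
                        distance.put(t, d + 1);
                        queue.add(t);
                    }
                }
            }
            return capacity;
        }
    }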
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: I don't have any specific ideas for how to choose whether to ignore identities, but I think you're making the problem much harder than it needs to be. The problem is that you need to prevent spam, but at the same time prevent malicious non-spammers from censoring identities who aren't spammers. Fortunately, there is a well documented algorithm for doing this: the Advogato trust metric. The WoT documentation claims it is based upon the Advogato trust metric. (Brief discussion: http://www.advogato.org/trust-metric.html Full paper: http://www.levien.com/thesis/compact.pdf ) I think this is wonderful, as I think there is much to recommend the Advogato metric (and I pushed for it early on in the WoT discussions). However, my understanding of the paper and what is actually implemented is that the WoT code does not actually implement it. Before I go into detail, I should point out that I haven't read the WoT code and am not fully up to date on the documentation and discussions; if I'm way off base here, I apologize. I think, you are: The advogato idea may be nice (i did not read it myself), if you have exactly 1 trustlist for everything. But xor wants to implement 1 trustlist for every app as people may act differently e.g. on firesharing than on forums or while publishing freesites. You basicly dont want to censor someone just because he tries to disturb filesharing while he may be tries to bring in good arguments at forum discussions about it. And i dont think that advogato will help here, right? There are two questions here. The first question is given a set of identities and their trust lists, how do you compute the trust for an identity the user has not rated? The second question is, how do you determine what trust lists to use in which contexts? The two questions are basically orthogonal. I'm not certain about the contexts issue; Toad raised some good points, and while I don't fully agree with him, it's more complicated than I first thought. I may have more to say on that subject later. Within a context, however, the computation algorithm matters. The Advogato idea is very nice, and imho much better than the current WoT or FMS answers. You should really read their simple explanation page. It's really not that complicated; the only reasons I'm not fully explaining it here is that it's hard to do without diagrams, and they already do a good job of it. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Thu, May 7, 2009 at 4:43 PM, xor x...@gmx.li wrote: On Thursday 07 May 2009 11:23:51 Evan Daniel wrote: On Thu, May 7, 2009 at 4:00 AM, xor x...@gmx.li wrote: On Thursday 07 May 2009 00:02:11 Evan Daniel wrote: The WoT documentation claims it is based upon the Advogato trust metric. (Brief discussion: http://www.advogato.org/trust-metric.html Full paper: http://www.levien.com/thesis/compact.pdf ) I think this is wonderful, as I think there is much to recommend the Advogato metric (and I pushed for it early on in the WoT discussions). However, my understanding of the paper and what is actually implemented is that the WoT code does not actually implement it. I must admit that I do not know whether its claim that it implements Advogato is right or not. I have refactored the code but I have not modified the trust calculation logic and have not checked whether it is Advogato or not. Someone should probably do that. I don't have any specific ideas for how to choose whether to ignore identities, but I think you're making the problem much harder than it needs to be. Why exactly? Your post is nice but I do not see how it answers my question. The general problem my post is about: New identities are obtained by taking them from trust lists of known identities. An attacker therefore could put 100 identities in his trust list to fill up your database and slow down WoT. Therefore, an decision has to be made when to NOT import new identities from someone's trust list. In the current implementation, it is when he has a negative score. As I've pointed out, in the future there will be MULTIPLE webs of trust, for different contexts - Freetalk, Filesharing, Identity-Introduction (you can get a trust value from someone in that context when you solve a captcha he has published), so the question now is: Which context(s) shall be used to decide when to NOT import new identity's from someones trust list anymore? I have not examined the WoT code. However, the Advogato metric has two attributes that I don't think the current WoT method has: no negative trust behavior (if there is a trust rating Bob can assign to Carol such that Alice will trust Carol less than if Bob had not assigned a rating, that's a negative trust behavior), and a mathematical proof as to the upper limit on the quantity of spammer nodes that get trusted. The Advogato metric is *specifically* designed to handle the case of the attacker creating millions of accounts. In that case, his success is bounded (linear with modest constant) by the number of confused nodes -- that is, legitimate nodes that have (incorrectly) marked his accounts as legitimate. If you look at the flow computation, it follows that for nodes for which the computed trust value is zero, you don't have to bother downloading their trust lists, so the number of such lists you download is similarly well controlled. Well I'm no mathematician, I cannot comment on that. I think toads argument sounds reasonable though: That there must be a way to distrust someone if the original person who trusted him disappears. I do not plan to change the trust logic on my own, I consider myself more as a programmer who can implement things than a designer of algorithms etc. All the more reason to use Advogato (or some other metric with useful provable properties) :) The current WoT is entirely black magic alchemy. Maybe it works, maybe it doesn't, but us non-mathematicians have trouble saying anything conclusive. 
Alchemy is to be avoided; if you have the ability to show why it works, it ceases to be alchemy. Advogato is certainly not perfect, but its limits are well defined (spammer identities are linearly bounded by the trust granted to confused legitimate identities) and imho acceptable -- if you've forced the spammer to do manual work linearly proportional (with a sane constant) to the amount of spam he wants to send, you've won. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Thu, May 7, 2009 at 6:33 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Thursday 07 May 2009 21:32:42 Evan Daniel wrote: On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: I don't have any specific ideas for how to choose whether to ignore identities, but I think you're making the problem much harder than it needs to be. The problem is that you need to prevent spam, but at the same time prevent malicious non-spammers from censoring identities who aren't spammers. Fortunately, there is a well documented algorithm for doing this: the Advogato trust metric. The WoT documentation claims it is based upon the Advogato trust metric. (Brief discussion: http://www.advogato.org/trust-metric.html Full paper: http://www.levien.com/thesis/compact.pdf ) I think this is wonderful, as I think there is much to recommend the Advogato metric (and I pushed for it early on in the WoT discussions). However, my understanding of the paper and what is actually implemented is that the WoT code does not actually implement it. Before I go into detail, I should point out that I haven't read the WoT code and am not fully up to date on the documentation and discussions; if I'm way off base here, I apologize. I think, you are: The advogato idea may be nice (i did not read it myself), if you have exactly 1 trustlist for everything. But xor wants to implement 1 trustlist for every app as people may act differently e.g. on firesharing than on forums or while publishing freesites. You basicly dont want to censor someone just because he tries to disturb filesharing while he may be tries to bring in good arguments at forum discussions about it. And i dont think that advogato will help here, right? There are two questions here. The first question is given a set of identities and their trust lists, how do you compute the trust for an identity the user has not rated? The second question is, how do you determine what trust lists to use in which contexts? The two questions are basically orthogonal. I'm not certain about the contexts issue; Toad raised some good points, and while I don't fully agree with him, it's more complicated than I first thought. I may have more to say on that subject later. Within a context, however, the computation algorithm matters. The Advogato idea is very nice, and imho much better than the current WoT or FMS answers. You should really read their simple explanation page. It's really not that complicated; the only reasons I'm not fully explaining it here is that it's hard to do without diagrams, and they already do a good job of it. It's nice, but it doesn't work. Because the only realistic way for positive trust to be assigned is on the basis of posted messages, in a purely casual way, and without the sort of permanent, universal commitment that any pure-positive-trust scheme requires: If he spams on any board, if I ever gave him trust and haven't changed that, then *I AM GUILTY* and *I LOSE TRUST* as the only way to block the spam. How is that different than the current situation? Either the fact that he spams and you trust him means you lose trust because you're allowing the spam through, or somehow the spam gets stopped despite your trust -- which implies either that a lot of people have to update their trust lists before anything happens, and therefore the spam takes forever to stop, or it doesn't take that many people to censor an objectionable but non-spamming poster. I agree, this is a bad thing. 
I'm just not seeing that the WoT system is *that* much better. It may be somewhat better, but the improvement comes at a cost of trading spam resistance vs censorship ability, which I think is fundamentally unavoidable. There's another reason I don't see this as a problem: I'm working from the assumption that if you can force a spammer to perform manual effort on par with the amount of spam he can send, then the problem *has been solved*. The reason email spam and Frost spam is a problem is not that there are lots of spammers; there aren't. It's that the spammers can send colossal amounts of spam. The solution, imho, is mundane: if the occasional trusted identity starts a spam campaign, I mark them as a spammer. This is optionally published, but can be ignored by others to maintain the positive trust aspects of the behavior. Locally, it functions as a slightly stronger killfile: their messages get ignored, and their identity's trust capacity is forced to zero. In the context of the routing and data store algorithms, Freenet has a strong prejudice against alchemy and in favor of algorithms with properties that are both useful and provable from reasonable assumptions, even though they are not provably perfect. Like routing, the generalized trust problem is non-trivial. Advogato has such properties; the current WoT and FMS algorithms do not: they are alchemical
Re: [freenet-dev] Recent progress on Interdex
On Tue, May 12, 2009 at 4:26 PM, Ximin Luo xl...@cam.ac.uk wrote: (one way of storing it which would allow token-deflate would be having each indexnode as a CHK, then you'd only have to INS an updated node and all its parents up to the root, but i chose not to do this as CHKs have a higher limit for being turned into a splitfile. was this the right decision?) My impression is that most of the time to download a key is the routing time to find it, not the time to transfer the data once found. So a 32KiB CHK is only somewhat slower to download than a 1KiB SSK. (Though I haven't seen hard numbers on this in ages, so I could be completely wrong.) My instinct is that the high latency for a single-key lookup that is the norm for Freenet means that if using CHKs instead results in an appreciably shallower tree, that will yield a performance improvement. The other effect to consider is how likely the additional data fetched is to be useful to some later request. Answering that is probably trickier, since it requires reasonable assumptions about index size and usage. It would be nice if there was a way to get some splitfile-type redundancy in these indexes; otherwise uncommonly searched terms won't be retrievable. However, there's obviously a tradeoff with common search term latency. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
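A toy depth comparison under assumed sizes, roughly 16 index entries per 1 KiB SSK node versus roughly 512 per 32 KiB CHK node (illustrative fanouts, not Interdex's actual layout):

    // Lookups needed to reach a leaf in a B-tree-like index with the given fanout.
    static int lookupsPerQuery(long terms, int fanout) {
        return (int) Math.ceil(Math.log(terms) / Math.log(fanout));
    }
    // lookupsPerQuery(10000000L, 16)  == 6   (SSK nodes)
    // lookupsPerQuery(10000000L, 512) == 3   (CHK nodes)
    // If per-lookup routing latency dominates, the shallower CHK tree roughly
    // halves query time despite the larger transfers.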
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Wed, May 13, 2009 at 9:03 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Friday 08 May 2009 02:12:21 Evan Daniel wrote: On Thu, May 7, 2009 at 6:33 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Thursday 07 May 2009 21:32:42 Evan Daniel wrote: On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: I don't have any specific ideas for how to choose whether to ignore identities, but I think you're making the problem much harder than it needs to be. The problem is that you need to prevent spam, but at the same time prevent malicious non-spammers from censoring identities who aren't spammers. Fortunately, there is a well documented algorithm for doing this: the Advogato trust metric. The WoT documentation claims it is based upon the Advogato trust metric. (Brief discussion: http://www.advogato.org/trust-metric.html Full paper: http://www.levien.com/thesis/compact.pdf ) I think this is wonderful, as I think there is much to recommend the Advogato metric (and I pushed for it early on in the WoT discussions). However, my understanding of the paper and what is actually implemented is that the WoT code does not actually implement it. Before I go into detail, I should point out that I haven't read the WoT code and am not fully up to date on the documentation and discussions; if I'm way off base here, I apologize. I think, you are: The advogato idea may be nice (i did not read it myself), if you have exactly 1 trustlist for everything. But xor wants to implement 1 trustlist for every app as people may act differently e.g. on firesharing than on forums or while publishing freesites. You basicly dont want to censor someone just because he tries to disturb filesharing while he may be tries to bring in good arguments at forum discussions about it. And i dont think that advogato will help here, right? There are two questions here. The first question is given a set of identities and their trust lists, how do you compute the trust for an identity the user has not rated? The second question is, how do you determine what trust lists to use in which contexts? The two questions are basically orthogonal. I'm not certain about the contexts issue; Toad raised some good points, and while I don't fully agree with him, it's more complicated than I first thought. I may have more to say on that subject later. Within a context, however, the computation algorithm matters. The Advogato idea is very nice, and imho much better than the current WoT or FMS answers. You should really read their simple explanation page. It's really not that complicated; the only reasons I'm not fully explaining it here is that it's hard to do without diagrams, and they already do a good job of it. It's nice, but it doesn't work. Because the only realistic way for positive trust to be assigned is on the basis of posted messages, in a purely casual way, and without the sort of permanent, universal commitment that any pure-positive-trust scheme requires: If he spams on any board, if I ever gave him trust and haven't changed that, then *I AM GUILTY* and *I LOSE TRUST* as the only way to block the spam. How is that different than the current situation? 
Either the fact that he spams and you trust him means you lose trust because you're allowing the spam through, or somehow the spam gets stopped despite your trust -- which implies either that a lot of people have to update their trust lists before anything happens, and therefore the spam takes forever to stop, or it doesn't take that many people to censor an objectionable but non-spamming poster. I agree, this is a bad thing. I'm just not seeing that the WoT system is *that* much better. It may be somewhat better, but the improvement comes at a cost of trading spam resistance vs censorship ability, which I think is fundamentally unavoidable. So how do you solve the contexts problem? The only plausible way to add trust is to do it on the basis of valid messages posted to the forum that the user reads. If he posts nonsense to other forums, or even introduces identities that spam other forums, the user adding trust probably does not know about this, so it is problematic to hold him responsible for that. In a positive trust only system this is unsolvable afaics? Perhaps some form of feedback/ultimatum system? Users who are affected by spam from an identity can send proof that the identity is a spammer to the users they trust who trust that identity. If the proof is valid, those who trust the identity can downgrade him within a reasonable period; if they don't do this they get downgraded themselves? I don't have an easy solution for the contexts issue. As I see it, there are several related but distinct issues: -- Given a set of trust ratings, what is the algorithm
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Wed, May 13, 2009 at 12:58 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Wednesday 13 May 2009 15:47:24 Evan Daniel wrote: On Wed, May 13, 2009 at 9:03 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Friday 08 May 2009 02:12:21 Evan Daniel wrote: On Thu, May 7, 2009 at 6:33 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Thursday 07 May 2009 21:32:42 Evan Daniel wrote: On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: I don't have any specific ideas for how to choose whether to ignore identities, but I think you're making the problem much harder than it needs to be. The problem is that you need to prevent spam, but at the same time prevent malicious non-spammers from censoring identities who aren't spammers. Fortunately, there is a well documented algorithm for doing this: the Advogato trust metric. The WoT documentation claims it is based upon the Advogato trust metric. (Brief discussion: http://www.advogato.org/trust-metric.html Full paper: http://www.levien.com/thesis/compact.pdf ) I think this is wonderful, as I think there is much to recommend the Advogato metric (and I pushed for it early on in the WoT discussions). However, my understanding of the paper and what is actually implemented is that the WoT code does not actually implement it. Before I go into detail, I should point out that I haven't read the WoT code and am not fully up to date on the documentation and discussions; if I'm way off base here, I apologize. I think, you are: The advogato idea may be nice (i did not read it myself), if you have exactly 1 trustlist for everything. But xor wants to implement 1 trustlist for every app as people may act differently e.g. on firesharing than on forums or while publishing freesites. You basicly dont want to censor someone just because he tries to disturb filesharing while he may be tries to bring in good arguments at forum discussions about it. And i dont think that advogato will help here, right? There are two questions here. The first question is given a set of identities and their trust lists, how do you compute the trust for an identity the user has not rated? The second question is, how do you determine what trust lists to use in which contexts? The two questions are basically orthogonal. I'm not certain about the contexts issue; Toad raised some good points, and while I don't fully agree with him, it's more complicated than I first thought. I may have more to say on that subject later. Within a context, however, the computation algorithm matters. The Advogato idea is very nice, and imho much better than the current WoT or FMS answers. You should really read their simple explanation page. It's really not that complicated; the only reasons I'm not fully explaining it here is that it's hard to do without diagrams, and they already do a good job of it. It's nice, but it doesn't work. Because the only realistic way for positive trust to be assigned is on the basis of posted messages, in a purely casual way, and without the sort of permanent, universal commitment that any pure-positive-trust scheme requires: If he spams on any board, if I ever gave him trust and haven't changed that, then *I AM GUILTY* and *I LOSE TRUST* as the only way to block the spam. How is that different than the current situation? 
Either the fact that he spams and you trust him means you lose trust because you're allowing the spam through, or somehow the spam gets stopped despite your trust -- which implies either that a lot of people have to update their trust lists before anything happens, and therefore the spam takes forever to stop, or it doesn't take that many people to censor an objectionable but non-spamming poster. I agree, this is a bad thing. I'm just not seeing that the WoT system is *that* much better. It may be somewhat better, but the improvement comes at a cost of trading spam resistance vs censorship ability, which I think is fundamentally unavoidable. So how do you solve the contexts problem? The only plausible way to add trust is to do it on the basis of valid messages posted to the forum that the user reads. If he posts nonsense to other forums, or even introduces identities that spam other forums, the user adding trust probably does not know about this, so it is problematic to hold him responsible for that. In a positive trust only system this is unsolvable afaics? Perhaps some form of feedback/ultimatum system? Users who are affected by spam from an identity can send proof that the identity is a spammer to the users they trust who trust that identity. If the proof is valid, those who trust the identity can downgrade him within
Re: [freenet-dev] a social problem with Wot (was: Hashcash introduction, was: Question about WoT )
On Wed, May 13, 2009 at 4:28 PM, xor x...@gmx.li wrote: On Wednesday 13 May 2009 10:01:31 Luke771 wrote: Thomas Sachau wrote: Luke771 schrieb: I can't comment on the technical part because I wouldnt know what im talking about. However, I do like the 'social' part (being able to see an identity even if the censors mark it down it right away as it's created) The censors? There is no central authority to censor people. Censors can only censor the web-of-trust for those people that trust them and which want to see a censored net. You cant and should not prevent them from this, if they want it. This have been discussed a lot. the fact that censoship isnt done by a central authority but by a mob rule is irrelevant. Censorship in this contest is blocking users based on the content of their messages The whole point is basically this: A tool created to block flood attacks is being used to discriminate against a group of users. Now, it is true that they can't really censor anything because users can decide what trust lists to use, but it is also true that this abuse of the wot does creates problems. They are social problems and not technical ones, but still 'freenet problems'. If we see the experience with FMS as a test for the Web of Trust, the result of that test is in my opinion something in between a miserable failure and a catastrophe. The WoT never got to prove itself against a real flood attack, we have no idea what would happen if someone decided to attack FMS, not even if the WoT would stop the attempted attack at all, leave alone finding out how fast and/or how well it would do it. In other words, for what we know, the WoT may very well be completely ineffective against a DoS attack. All we know about it is that the WoT can be used to discriminate against people, we know that it WILL be used in that way, and we know that because of a proven fact: it's being used to discriminate against people right now, on FMS That's all we know. We know that some people will abuse WoT, but we dont really know if it would be effective at stopping DoS attacks. Yes, it should work, but we don't 'know'. The WoT has never been tested t actually do the job it's designed to do, yet the Freenet 'decision makers' are acting as if the WoT had proven its validity beyond any reasonable doubt, and at the same time they decide to ignore the only one proven fact that we have. This whole situation is ridiculous, I don't know if it's more funny or sad... it's grotesque. It reminds me of our beloved politicians, always knowing what's the right thing to do, except that it never works as expected. No, it is not ridiculous, you are just having a point of view which is not abstract enough: If there is a shared medium (= Freenet, Freetalk, etc.) which is writable by EVERYONE, it is absolutely IMPOSSIBLE to *automatically* (as in by writing an intelligent software) distinguish spam from useful uploads, because EVERYONE can be evil. EITHER you manually view every single piece of information which is uploaded and decide yourself whether you consider it as spam or not OR you adopt the ratings of other people so each person only has to rate a small subset of the uploaded data. There are no other options. And what the web of trust does is exactly the second option: it load balances the content rating equally between all users. 
While your statement is trivially true (assuming we ignore some fairly potent techniques like Bayesian classifiers, which rely neither on additional work by the user nor on the opinions of others...), it misses the real point: the fact that WoT spreads the work around does not mean it does so efficiently or effectively, or that the choices it makes wrt various design tradeoffs are actually the choices that we, as its users, would make if we considered those choices carefully. A web of trust is a complex system, the entire purpose of which is to create useful emergent behaviors. Too much focus on the micro-level behavior of the parts of such a system, instead of the emergent properties of the system as a whole, means that you won't get the emergent properties you wanted. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] a social problem with Wot (was: Hashcash introduction, was: Question about WoT )
On Thu, May 14, 2009 at 4:22 AM, xor x...@gmx.li wrote: On Wednesday 13 May 2009 22:48:53 Evan Daniel wrote: On Wed, May 13, 2009 at 4:28 PM, xor x...@gmx.li wrote: On Wednesday 13 May 2009 10:01:31 Luke771 wrote: Thomas Sachau wrote: Luke771 schrieb: I can't comment on the technical part because I wouldnt know what im talking about. However, I do like the 'social' part (being able to see an identity even if the censors mark it down it right away as it's created) The censors? There is no central authority to censor people. Censors can only censor the web-of-trust for those people that trust them and which want to see a censored net. You cant and should not prevent them from this, if they want it. This have been discussed a lot. the fact that censoship isnt done by a central authority but by a mob rule is irrelevant. Censorship in this contest is blocking users based on the content of their messages The whole point is basically this: A tool created to block flood attacks is being used to discriminate against a group of users. Now, it is true that they can't really censor anything because users can decide what trust lists to use, but it is also true that this abuse of the wot does creates problems. They are social problems and not technical ones, but still 'freenet problems'. If we see the experience with FMS as a test for the Web of Trust, the result of that test is in my opinion something in between a miserable failure and a catastrophe. The WoT never got to prove itself against a real flood attack, we have no idea what would happen if someone decided to attack FMS, not even if the WoT would stop the attempted attack at all, leave alone finding out how fast and/or how well it would do it. In other words, for what we know, the WoT may very well be completely ineffective against a DoS attack. All we know about it is that the WoT can be used to discriminate against people, we know that it WILL be used in that way, and we know that because of a proven fact: it's being used to discriminate against people right now, on FMS That's all we know. We know that some people will abuse WoT, but we dont really know if it would be effective at stopping DoS attacks. Yes, it should work, but we don't 'know'. The WoT has never been tested t actually do the job it's designed to do, yet the Freenet 'decision makers' are acting as if the WoT had proven its validity beyond any reasonable doubt, and at the same time they decide to ignore the only one proven fact that we have. This whole situation is ridiculous, I don't know if it's more funny or sad... it's grotesque. It reminds me of our beloved politicians, always knowing what's the right thing to do, except that it never works as expected. No, it is not ridiculous, you are just having a point of view which is not abstract enough: If there is a shared medium (= Freenet, Freetalk, etc.) which is writable by EVERYONE, it is absolutely IMPOSSIBLE to *automatically* (as in by writing an intelligent software) distinguish spam from useful uploads, because EVERYONE can be evil. EITHER you manually view every single piece of information which is uploaded and decide yourself whether you consider it as spam or not OR you adopt the ratings of other people so each person only has to rate a small subset of the uploaded data. There are no other options. And what the web of trust does is exactly the second option: it load balances the content rating equally between all users. 
While your statement is trivially true (assuming we ignore some fairly potent techniques like bayesian classifiers that rely on neither additional work by the user or reliance on the opinions of others...), Bayesian filters DO need input: You need to give them old spam and non-spam messages so that they can decide about new input. But they cannot help Freetalk because they cannot prevent identity spam, i.e. the creation of very large amounts of identities. They do not require input from *other people*. it misses the real point: the fact that WoT spreads the work around does not mean it does so efficiently or effectively, or that the choices it makes wrt various design tradeoffs are actually the choices that we, as its users, would make if we considered those choices carefully. A web of trust is a complex system, the entire purpose of which is to create useful emergent behaviors. Too much focus on the micro-level behavior of the parts of such a system, instead of the emergent properties of the system as a whole, means that you won't get the emergent properties you wanted. Yes, the current web of trust implementation might not be perfect. But it is one of the only solutions to the spam problem, if not the only. So the question is not whether to use a WoT but rather how to program the WoT to fit our purposes. Well anyway
Re: [freenet-dev] Question about an important design decision of the WoT plugin
, but manageable. Second, I think that if the amount of spam they can send is limited to that level, they (generally) won't bother in the first place, and so in practice you will only rarely see even that level of spam. Constraining trust list changes is definitely required. I would start with a system that says that if Alice is calculating her trust web, and Bob has recently removed Sam from his trust list, then when Alice is propagating trust through Bob's node, she starts by requiring one unit of flow go to Sam before anyone else on the list, but that that unit of flow has no effect on Alice's computation of Sam's trustworthiness. Or, equivalently, Bob's connection to the supersink is sized as (1 + number of recently un-trusted identities) rather than the normal constant 1. That allows Bob to manage his trust list in a prompt fashion, but if he removes people from it then he is prevented from adding new people to replace them too rapidly. The definition of recent could be tweaked as well, possibly something like only 1 identity gets removed from the recent list per time period, rather than a fixed window during which any removed id counts as recently removed. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
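A minimal sketch of that rate limit (the seven-day window and the base capacity of one are assumptions for illustration; no such API exists in the WoT plugin):

    import java.util.List;

    class TrustListChangeLimiter {
        static final long RECENT_MILLIS = 7L * 24 * 60 * 60 * 1000; // assumed "recent" window

        // Bob's edge to the supersink widens by one unit per recently removed
        // identity, soaking up flow that would otherwise reach identities he has
        // just added, so removals only free up trust slots slowly.
        static int supersinkCapacity(List<Long> removalTimestamps, long now) {
            int recent = 0;
            for (long t : removalTimestamps) {
                if (now - t < RECENT_MILLIS) recent++;
            }
            return 1 + recent;
        }
    }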
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Thu, May 14, 2009 at 6:14 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Thursday 14 May 2009 17:33:29 Evan Daniel wrote: On Thu, May 14, 2009 at 11:32 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: IMHO these are not solutions to the contexts problem -- it merely shifts the balance between allowing spam and allowing censorship. In one case, the attacker can build trust in one context and use it to spam a different context. In the other case, he can build trust in one context and use it to censor in another. Right now, the only good answer I see to contexts is to make them fully independent. Perhaps I missed it, but I don't recall a discussion of how any other option would work in any detail -- the alternative under consideration appears to be to treat everything as one unified context. I'm not necessarily against that, but the logical conclusion is that you're responsible for paying attention to everything someone you've trusted does in all contexts in which you trust them -- which, for a unified context, means everywhere. Having to bootstrap on each forum would be _bad_. Totally impractical. What about ultimatums? these above refers to WoT with negative trust, right? Ultimatums: I mark somebody as a spammer, I demand that my peers mark him as a spammer, they evaluate the situation, if they don't mark the spammer as spammer then I mark them as spammer. Right. So all the forums go in a single context. I don't see how you can usefully define two different contexts such that trust is common to them but responsibility is not. I think the right solution (at least for now) is one context per application. So you have to boostrap into the forums app, and into the filesharing app, and into the mail app, but not per-forum. Otherwise I have to be able to evaluate possible spam in an application I may not have installed. Ultimatums sound like a reasonable approach. Though if Alice sends Bob an ultimatum about Bob's trust for Sam, and Bob does not act, I'm inclined to think that Alice's client should continue downloading Bob's messages, but cease publishing a trust rating for Bob. After all, Bob might just be lazy, in which case his trust list is worthless but his messages aren't. Agreed, I have no problem with not reducing message trust in this case. Also, I don't see how this attack is specific to the Advogato metric. It works equally well in WoT / FMS. The only thing stopping it there is users manually examining each other's trust lists to look for such things. If you assume equally vigilant users with Advogato the attack is irrelevant. It is solvable with positive trust, because the spammer will gain trust from posting messages, and lose it by spamming. The second party will likely be the stronger in most cases, hence we get a zero or worse outcome. Which second party? The group of users affected by the spam. The first party is the group of users who are not affected by the spam but appreciate the spammer's messages to a forum and therefore give him trust. Ah. You meant solvable with *negative* trust then? Yes, sorry. There's a potential problem here (in the negative trust version): if you post good stuff in a popular forum, and spam in a smaller one, the fact that the influence of any one person is bounded means that you might keep your overall trust rating positive. XKCD describes the problem well: http://xkcd.com/325/ I continue to think that the contexts problem is nontrivial, though different systems will have different tradeoffs. 
Fundamentally, I think that if trust and responsibility apply to different regions, there are potential problems. OK. I think you really mean Pure positive only works *perfectly* if every user... Hmm, maybe. We don't need a perfect system that stops all spam and nothing else. Any system will have some failings. Minimizing those failings should be a design goal, but knowing where we expect those failings to be, and placing them where we want them, is also an important goal. Or, looked at another way: We have ample evidence that people will abuse the new identity creation process to post spam. That is a problem worth expending significant effort to solve. Do we have evidence that spammers will actually exert per-identity manual effort in order to send problematic amounts of spam? I don't see why it would be per-identity. Per fake identity that will be sending spam. If they can spend manual effort to create a trusted id, and then create unlimited fake ids bootstrapped from that one to spam with, that's a problem. If the amount of effort they have to spend is linear with the number of ids sending spam, that's not a problem, regardless of whether the effort is spent on the many spamming ids or the single bootstrap id. Because there is a limit on churn, and one spamming identity
Re: [freenet-dev] Why WoTs won't work....
It's not all that interesting. It has been discussed to death many times. The Advogato algorithm (or something like it) solves this problem (not perfectly, but far, far better than the current FMS / WoT alchemy), as I have explained in great detail. Evan Daniel On Sat, May 9, 2009 at 12:57 PM, gu...@gmx.org wrote: Interesting discussion from Frost, especially the last post at the bottom. Its about WoTs in general and why they won't work. - hahaha...@yle3zhs5lkiwe3fdjyqlcf5+rka - 2009.04.05 - 02:28:11GMT - I had to forward this one here. --- jezreel℺X~GLTTHo9aaYtIpGT6OOyBMMFwl3b8LwFu6TUw9Q82E sent via FMS on 2009-04-05 at 01:31:54GMT --- Probably not an amazing subject, but FMS is so dead lately so what the hell. Falafel, why do some of the folks posting to Frost hate you so much? In particular Luke771 and VolodyA. You generally seem like a nice identity so I'm just curious. -- jezr...@pbxwgrdegrigwuteoz3tc6cfla2xu3trmi2tgr2enfrvi4bxkzlectshfvyw2wcxkrffslccijku4z2om5gtoojrpzdfcolkovtdawjuifigkrcqjbevsvrnla2xotcdpfvw6lkmkzzsyqkrifbucqkf.freemail ;)) - VolodyA! V a...@r0pa7z7ja1haf2xttt7aklre+yw - 2009.04.11 - 12:11:26GMT - I should point out i do not hate falafel, i do not know him. I do hate what he is doing to Freenet, however. If he was to stop supporting intruduction of censorship on Freenet i would not say anything bad about him. In fact i tend to agree with much of what he has to say on other subjects. -- May all the sentient beings benefit from our conversation. - denmin...@dlkkaikia79j4ovpbgfk4znh25y - 2009.04.14 - 16:13:07GMT - Falafel is doing nothing. If a single guy can make the protocol not work, then the protocol is shitty from the start and we need to write a new one. - Anonymous - 2009.04.14 - 20:46:59GMT - The only shit around here is spewing from the mouths of those who don't understand how it works. No one can stop you from seeing what you want to see. Anyone who tells you otherwise is spreading misinformation. - luke...@ch4jcmdc27eeqm9cw_wvju+coim - 2009.05.04 - 01:01:46GMT - No one can really censor FMS alright, BUT there IS a problem with those 'censored trust lists' anyway. The existance of censored trust lists forces users to actively maintain their own trust lists, the WoT wont work 'on its own' as it would if everyone used it the way it's supposed to. Let me try to explain: if everyone used wot to block flood attacks and nothing else, new users wouldnt need to try and find out which trust lists are 'good', they wouldnt need to work on thir trust lists for hours every day, try to spot censors or guys who wont block pedos, they could simply use FMS and occasianlly set a high trust for someone they actually trust, or lower the trust for someone they caught spamming But the current situation makes FMS a pain in the ass. Users have to work on your trust lists regularly, and new users risk (and probably do) to have some of the content blocked by some censor because the guy posted one message on a board that the censor found 'immoral'. It may take time until the new user figures out which trust lists to use, and there's a very real risk that he would think that it isnt worth the hassle and give up on FMS completely. I did that, others did that, and more will. THAT is the real problem with the Little Brothers, not their non-existent ability to censor content. they cant censor anything and they know it. But they can and do make FMS a pain in the ass to use. 
Another problem is that, assuming the FMS community survives (which I don't think it will), it may end up split into a number of closed sub-communities that refuse to talk to each other. But this is only a guess, so far. We'll have to see how it turns out. In the meantime, making FMS into a PITA has been done already; that is why FMS is as good as dead, and that's why I think that investing developers' time and effort into WoT and Freetalk is a huge waste: FMS failed because of human stupidity and arrogance, and so will Freetalk/WoT, and I really can't understand why the devs can't see the obvious (or refuse to admit it). BTW, I don't hate Falafel. Hate costs energy. A lot of it. -- FAFS - the Freenet Applications FreeSite u...@ugb~uuscsidmi-ze8laze~o3buib3s50i25riwdh99m,9T20t3xoG-dQfMO94LGOl9AxRTkaz~TykFY-voqaTQI,AQACAAE/FAFS/47/ -Don't think out of the box: destroy it!- ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Why WoTs won't work....
: stop trusting spammers or we'll stop trusting you. This would have to be answered in a reasonable time, hence is a problem for those not constantly at their nodes. evanbd has argued that the latter two measures are unnecessary, and that the limited number of spam identities that any one identity can introduce will make the problem manageable. An attacker who just introduces via a CAPTCHA will presumably only get short-lived trust, and if he only posts spam he won't get any positive trust. An attacker who contributes to boards to gain trust to create spamming sub-identities with has to do manual work to gain and maintain reputation among some sub-community. A newbie will not see old captcha-based spammers, only new ones, and those spam identities that the attacker's main, positive identity links to. He will have to manually block each such identity, because somebody is bound to have positive trust for the spammer parent identity. Well... I argue that they *may* be unnecessary. Specifically, I think we can defer implementation until there are problems that warrant it. In terms of UI, if evanbd is right, all we need is a button to mark the poster of a message as a spammer (and get rid of all messages from them), and a small amount of automatic trust when answering a message (part of the UI so it can be disabled). Only those users who know how, and care enough, would actually change the trust for the spammer-parent, and in any case doing so would only affect them and contribute nothing to the community. But if he is wrong, or if an attacker is sufficiently determined, we will also need some way to detect spam-parents, and send them ultimatums. I'm not certain that's the right way to grant manual trust. (Or perhaps we need more than one level of it.) You don't want a spammer to be able to get manual trust by posting a message to the test board consisting only of Hi, can anyone see this? -- they can do that automatically. I think there should be a pair of buttons, mark spammer and mark non-spammer. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Why WoTs won't work....
. The result is that his one main identity can get a large quantity of spam through, even though it can only mark a limited number of child identities trusted and each of them can only send a limited amount of spam. Also, what do you mean by review of identities added from others? Surely you don't mean that I should have to manually review every poster? Isn't the whole point of using a WoT in the first place that I can get good trust estimates of people I've never seen before? It probably also requires: - Some indication of which trusted identities trust a spammer when you mark an identity as a spammer. In FMS, you can simply watch the list of trusts of the spammer identity to get this information. - Sending an ultimatum to the trusted identity that trusts more than one spammer: stop trusting spammers or we'll stop trusting you. This would have to be answered in a reasonable time, hence is a problem for those not constantly at their nodes. You may notify him about it, if you want (either publicly, or via private message if implemented), but basically, why this warning? Does it help him in any way if we trust him, or does it harm him if we don't trust him any more? At least in FMS it does not change his visibility, but it may change the trust list trust that others get for him and so may or may not include his trusts. Having a well-connected graph is useful, regardless of the algorithm. If the reason the person trusted a spammer was that they made an honest mistake (or got scammed by a bait-and-switch, or...) then you may want to continue using their trust list but inform them of the problem. If they don't want to fix the problem, you probably don't want to continue using their trust list. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Why WoTs won't work....
On Fri, May 22, 2009 at 12:39 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: On Fri, May 22, 2009 at 10:48 AM, Thomas Sachau m...@tommyserver.de wrote: Matthew Toseland schrieb: On Friday 22 May 2009 08:17:55 bbac...@googlemail.com wrote: Is'nt his point that the users just won't maintain the trust lists? I thought that is the problem that he meant how can Advogato help us here? Advogato with only positive trust introduces a different tradeoff, which is still a major PITA to maintain, but maybe less of one: - Spammers only disappear when YOU mark them as spammers, or ALL the people you trust do. Right now they disappear when the majority, from the point of view of your position on the WoT, mark them as spammers (more or less). So this is a disadvantage of avogato against current FMS implementation. With the current FMS implementation, only a majority of trusted identities need to mark him down, with avogato, either all original trusters need to mark him down or you need to do it yourself (either mark him down or everyone, who trusts him, so FMS 1:0 avogato As I've said repeatedly, I believe there is a fundamental tradeoff between spam resistance and censorship resistance, in the limiting case. (It's obviously possible to have an algorithm that does poorly at both.) Advogato *might* let more spam through than FMS. There is no proof provided for how much spam FMS lets through; with Advogato it is limited in a provable manner. Alchemy is a bad thing. FMS definitely makes censorship by the mob easier. By my count, that's a win for Advogato on both. I dont think you can divide between spam resistance and censorship resistance for a simple reason: Who defines what sort of action or text is spam? Many people may mostly aggree about some sort of action or content to be spam, but others could claim the reduced visibility censorship. And i dont see any alchemy with the current trust system of FMS, if something is alchemy and not clear, please point it out, but the exact point please. And FMS does not make censorship by a mob easier. Simply because you should select the people you trust yourself. Like you should select your friends and darknet peers yourself. If you let others do it for you, dont argue about what follows (like a censored view on FMS). Yes, the spam and censorship problems are closely related. That's why I say there is something of a tradeoff between them. The problem with FMS should be obvious: if some small group actively tries to censor things I consider non-spam, then it requires a significant amount of effort by me to stop that. I have to look at trust lists that mostly contain valid markings, and belong to real people posting real messages, and somehow determine that some of the entries on them are invalid, and then decide not to trust their trust list. Furthermore, I have to do this without actually examining each entry on their trust list -- I'm trying to look at *less* spam here, not more. The result is a balkanization of trust lists based on differing policies. Any mistakes I make will go unnoticed, since I won't see the erroneously rejected messages. In FMS, a group with permissive policies (spam filtering only) and a group that filtered content they found objectionable can't make effective use of each other's trust lists. However, the former group would like to trust the not-spammer ratings made by the latter group, and the latter group would like to trust the spammer ratings made by the former. 
AIUI, the balkanization of FMS trust lists largely prevents this. Advogato would allow the permissive group to make use of the less permissive group's ratings, without allowing them to act as censors. IMHO, the Advogato case is better for two reasons: first, favoring those who only want to stop spam over those who want to filter objectionable content is more consistent with the philosophy behind Freenet. Second, spam filters of any sort should be biased towards type II errors, since they're less problematic and easier to correct. Essentially, I think that FMS goes overboard in its attempts to reduce spam. It is my firm belief that limiting the amount of spam that can be sent to a modest linear function of the amount of *manual* effort a spammer exerts is sufficient. Spam is a problem in both Frost and email because spammers can simply run bots. The cost of FMS, both in worry over mob censorship and work required to maintain trust lists, is very high. I think that the total effort spent by the community would be reduced by the use of an algorithm that took more effort to stop spammers, and less effort to enable normal communications. We need to be aware of what we optimize for, and make sure it's really what we want. I've explained why FMS is alchemy before, but it's an important point, so I don't mind repeating it. FMS has some goals, and it performs an algorithm. There is no proof
Re: [freenet-dev] Why WoTs won't work....
On Fri, May 22, 2009 at 4:16 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Friday 22 May 2009 15:39:06 Evan Daniel wrote: On Fri, May 22, 2009 at 8:17 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Friday 22 May 2009 08:17:55 bbac...@googlemail.com wrote: Is'nt his point that the users just won't maintain the trust lists? I thought that is the problem that he meant how can Advogato help us here? Advogato with only positive trust introduces a different tradeoff, which is still a major PITA to maintain, but maybe less of one: - Spammers only disappear when YOU mark them as spammers, or ALL the people you trust do. Right now they disappear when the majority, from the point of view of your position on the WoT, mark them as spammers (more or less). When they *fail to mark them as trusted*. It's an important distinction, as it means that in order for the spammer to do anything they first have to *manually* build trust. If an identity suddenly starts spamming, only people that originally marked it as trusted have to change their trust lists in order to stop them. - If you mark a spammer as positive because he posts useful content on one board, and you don't read the boards he spams you are likely to get marked as a spammer yourself. Depends how militant people are. I suspect in practice people won't do this unless you trust a lot of spammers... in which case they have a point. (This is also a case for distinguishing message trust from trust list trust; while Advogato doesn't do this, the security proof extends to cover it without trouble.) You can take an in-between step: if Alice marks both Bob and Carol as trusted, and Bob marks Carol a spammer, Alice's software notices and alerts Alice, and offers to show Alice recent messages from Carol from other boards. (Algorithmically, publishing Sam is a spammer is no different from not publishing anything about Sam, but it makes some nice things possible from a UI standpoint.) This may well get most of the benefit of ultimatums with lower complexity. Right, this is something I keep forgetting to mention. When marking a user as a spammer, the UI should ask the user about people who trust that spammer and other spammers. However, it does encourage militancy, doesn't it? It certainly doesn't solve the problem the way that ultimatums do... I don't know how much militancy the software should encourage. I'm inclined to think it should start low, and then change it if that doesn't work. - If a spammer doesn't spam himself, but gains trust through posting useful content on various boards and then spends this trust by trusting spam identities, it will be necessary to give him zero message list trust. Again this has serious issues with collateral damage, depending on how trigger-happy people are and how much of a problem it is for newbies to see spam. Technologically, this requires: - Changing WoT to only support positive trust. This is more or less a one line change. If all you want is positive trust only, yes. If you want the security proof, it requires using the network flow algorithm as specified in the paper, which is a bit more complex. IMHO, fussing with the algorithm in ways that don't let you apply the security proof is just trading one set of alchemy for another -- it might help, but I don't think it would be wise. I was under the impression that WoT already used Advogato, apart from supporting negative trust values and therefore negative trust. 
The documentation mentions Advogato, and there are some diagrams that relate to it, but none of the detailed description of the algorithm is at all related. Advogato is based on network flow computation. WoT as described on the freesite is not -- an identity with 40 trust points is permitted to give up to 40 points *each* to any number of child identities, with the actual number given determined by a trust rating. In contrast, Advogato has multiple levels of trust, and each identity either trusts or does not trust each other identity at a given level. The number of trust points an identity gets is based on capacity and the optimal flow path. It does not speak to how trustworthy that identity is; at a given trust level, the algorithm either accepts or does not accept a given identity. Multiple trust levels (eg, a level for captcha solving and a level for manual trust) implies running the algorithm multiple times on different (though related) graphs; when running at a given level, connections at that level and all higher levels are used. This implies running Ford-Fulkerson or similar; it's more complicated than the current system, though not drastically so. http://en.wikipedia.org/wiki/Ford-Fulkerson_algorithm - Making sure that my local ratings always override those given by others, so I can mark an identity as spam and never see it again. Dunno if this is currently implemented
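To make that concrete, here is a minimal, self-contained sketch of the Advogato-style evaluation (capacities by distance, node splitting, Edmonds-Karp max flow). Everything in it -- the toy graph, the capacity table, the class name -- is made up for illustration; it is not the WoT plugin's code or API.

import java.util.*;

// Toy Advogato-style trust evaluation: node splitting + Edmonds-Karp max flow.
// The graph, capacities and names are illustrative only, not WoT plugin code.
public class AdvogatoSketch {
    // Who certifies whom; identity 0 is the seed (the local user).
    static final int[][] CERTS = { {1, 2}, {3}, {3, 4}, {}, {} };
    // Capacity as a function of distance from the seed (made-up numbers).
    static final int[] CAP_BY_DEPTH = { 8, 4, 2, 1 };

    public static void main(String[] args) {
        int n = CERTS.length;
        // 1. Assign capacities by BFS distance from the seed.
        int[] dist = new int[n];
        Arrays.fill(dist, -1);
        dist[0] = 0;
        Deque<Integer> q = new ArrayDeque<>();
        q.add(0);
        int[] cap = new int[n];
        while (!q.isEmpty()) {
            int u = q.poll();
            cap[u] = CAP_BY_DEPTH[Math.min(dist[u], CAP_BY_DEPTH.length - 1)];
            for (int v : CERTS[u]) if (dist[v] < 0) { dist[v] = dist[u] + 1; q.add(v); }
        }
        // 2. Node splitting: i = "in" node, i+n = "out" node, 2n = supersink.
        int sink = 2 * n, nodes = 2 * n + 1, inf = 1 << 20;
        int[][] c = new int[nodes][nodes]; // capacity matrix
        for (int i = 0; i < n; i++) {
            if (dist[i] < 0) continue;             // unreachable identities stay rejected
            c[i][i + n] = Math.max(cap[i] - 1, 0); // internal edge: capacity minus one
            c[i][sink] = 1;                        // one unit reserved for "accept me"
            for (int v : CERTS[i]) c[i + n][v] = inf; // certificate edges
        }
        // 3. Edmonds-Karp max flow from the seed's in-node to the supersink.
        int[][] f = new int[nodes][nodes];
        while (true) {
            int[] prev = new int[nodes];
            Arrays.fill(prev, -1);
            prev[0] = 0;
            Deque<Integer> bfs = new ArrayDeque<>();
            bfs.add(0);
            while (!bfs.isEmpty() && prev[sink] < 0) {
                int u = bfs.poll();
                for (int v = 0; v < nodes; v++)
                    if (prev[v] < 0 && c[u][v] - f[u][v] > 0) { prev[v] = u; bfs.add(v); }
            }
            if (prev[sink] < 0) break; // no augmenting path left
            int push = Integer.MAX_VALUE;
            for (int v = sink; v != 0; v = prev[v]) push = Math.min(push, c[prev[v]][v] - f[prev[v]][v]);
            for (int v = sink; v != 0; v = prev[v]) { f[prev[v]][v] += push; f[v][prev[v]] -= push; }
        }
        // 4. An identity is accepted iff its unit edge to the supersink carries flow.
        for (int i = 0; i < n; i++)
            System.out.println("identity " + i + ": " + (f[i][sink] > 0 ? "accepted" : "rejected"));
    }
}

Multiple trust levels (captcha vs manual) would simply mean running this twice over different edge sets, as described above.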
Re: [freenet-dev] Why WoTs won't work....
). We can try to be polite about this using ultimatums, since it's likely that they didn't deliberately choose to trust the spam-parent knowing he is a spam-parent - but if they don't respond in some period by removing him from their trust list, we will have to reduce our trust in them. This will cause collateral damage and may be abused for censorship which might be even more dangerous than the current problems on FMS. However, if there is a LOT of spam, or if we want the network to be fairly spam-free for newbies, the first two options are insufficient. :| I'm not certain you're correct about this. The first two methods are, imho, sufficient to limit spam to levels that are annoying, but where the network is still usable. Even if they download a bunch of messages, a new user only has to click the spam button once per spamming identity, and those are limited in a well defined manner (linear with modest coefficient with the number of dummy identities the spammer is willing to maintain). My suspicion is that if all they can aspire to be is a nuisance, the spammers won't be nearly as interested. There is much more appeal to being able to DoS a board or the whole network than being able to mildly annoy the users. So if we limit the amount of damage they can do to a sane level, the actual amount of damage done will be noticeably less than that limit. There is another possible optimization we could do (I've just thought of it, and I'm not entirely certain that it works or that I like it). Suppose that Alice trusts Bob trusts Carol (legitimate but confused) trusts Sam (a spammer), and Alice is busy computing her trust list. Bob has (correctly) marked Sam as a spammer. In the basic implementation, Alice will accept Sam. Bob may think that Carol is normally correct (and not malicious), and be unwilling to zero out his trust list trust for her. However, since this is a flow computation, we can place an added restriction: when Alice calculates trust, flow passing through Bob may not arrive at Sam even if there are intermediate nodes. If Alice can find an alternate route for flow to go from Alice to Carol or Sam, she will accept Sam. This modification is in some ways a negative trust feature, since Bob's marking of Sam as a spammer is different from silence. However, it doesn't let Bob censor anyone he couldn't censor by removing Carol from his trust list. Under no circumstances will Alice using Bob's trust list result in fewer people being accepted than not using Bob's trust list. It does mean that Bob, as a member of the evil cabal of default trust list members for newbies, can (with the unanimous help of the cabal) censor identities in a more subtle fashion than simply not trusting anyone. The caveats: this is a big enough change that it needs a close re-examination of the security proof (I'm pretty sure it's still valid, but I'm not certain). If it sounds like an interesting idea, I can do that. Also, I don't think it's compatible with Ford-Fulkerson or the other simple flow capacity algorithms. The changes required might be non-trivial, possibly to the point of changing the running time. Again, I could look at this in detail if it's interesting enough to warrant it. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Why WoTs won't work....
result in fewer people being accepted than not using Bob's trust list. It does mean that Bob, as a member of the evil cabal of default trust list members for newbies, can (with the unanimous help of the cabal) censor identities in a more subtle fashion than simply not trusting anyone. The caveats: this is a big enough change that it needs a close re-examination of the security proof (I'm pretty sure it's still valid, but I'm not certain). If it sounds like an interesting idea, I can do that. Also, I don't think it's compatible with Ford-Fulkerson or the other simple flow capacity algorithms. The changes required might be non-trivial, possibly to the point of changing the running time. Again, I could look at this in detail if it's interesting enough to warrant it. Worth investigating IMHO. OK, I'll examine it further. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Why WoTs won't work....
On Sat, May 23, 2009 at 10:06 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Saturday 23 May 2009 10:43:09 Arne Babenhauserheide wrote: On Friday, 22. May 2009 23:10:42 Mike Bush wrote: I have been watching this debate and I was wondering whether it could help to have 2 sets of trust values for each identity in a trust list; this could mean you could mark an identity as spamming or that I don't want to see these posts again as I find them objectionable. This is what Credence did in the end for spam detection on Gnutella, so it might fit the human psyche :) People got the option to say that's bad quality or misleading, I don't like it or that's spam. For messages that could be * that ID posts spam * that ID posts crap The first can easily be reviewed, the second is subjective. That would give a soft group censorship option, but give the useful spam detection to everyone. Best wishes, Arne PS: Yes, I mostly just tried to clarify Mike's post for me. I hope the mail's useful to you nonetheless. People will game the system, no? If they think paedophiles are scum who should not be allowed to speak, and they realise that clicking This is spam is more effective than This is crap, they will click the former, no? I would assume that's the normal case. OTOH, there isn't much harm in implementing it, and if some people use it, that would help somewhat... Perhaps implement, but not required for initial release? Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Policy on removing people from mailing list archives?
On Mon, May 25, 2009 at 8:01 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Tuesday 26 May 2009 00:56:22 Matthew Toseland wrote: On one prior occasion (this year), we have authorised a mailing list archive site to remove messages posted by somebody. I have now had another mail asking for us to remove somebody's name from two archives which we don't run - which generally requires him asking them and getting authorisation from us - and from our own archives. If this is to be a regular occurrence, we need to formulate some policy, and IMHO the best way to do this is to discuss it here. Does anyone have an opinion on this? I doubt very much that we have any legal obligation to remove somebody's posts, especially as at least one of the other archive sites will only remove messages with our say so, but I guess we could get legal advice on it... Any opinions on the principle? IMHO rewriting history to make yourself look good to employers is dubious, but at the same time we clearly don't want to pick fights and unnecessarily annoy people. Suggested solution: Authorise removal from the external sites, and obscure the name on our archives. ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl I concur. IMHO other sites should operate as they choose... if they're willing to remove people, then I think we should authorize it. I think it is important to retain all messages, but for archives the name is less important than the content. I would recommend obscuring it as [removed name #n] or similar, so that it's obvious whether it's the same removed name as some other message. Given Freenet's pro-anonymity stance, I think if someone has a desire to be made more anonymous, especially as regards potentially illegal software usage, that we should support them. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Tue, May 26, 2009 at 4:02 PM, xor x...@gmx.li wrote: On Thursday 07 May 2009 11:23:51 Evan Daniel wrote: Why exactly? Your post is nice but I do not see how it answers my question. The general problem my post is about: New identities are obtained by taking them from trust lists of known identities. An attacker therefore could put 100 identities in his trust list to fill up your database and slow down WoT. Therefore, an decision has to be made when to NOT import new identities from someone's trust list. In the current implementation, it is when he has a negative score. [...] I have not examined the WoT code. However, the Advogato metric has two attributes that I don't think the current WoT method has: no negative trust behavior (if there is a trust rating Bob can assign to Carol such that Alice will trust Carol less than if Bob had not assigned a rating, that's a negative trust behavior), and a mathematical proof as to the upper limit on the quantity of spammer nodes that get trusted. The Advogato metric is *specifically* designed to handle the case of the attacker creating millions of accounts. In that case, his success is bounded (linear with modest constant) by the number of confused nodes -- that is, legitimate nodes that have (incorrectly) marked his accounts as legitimate. If you look at the flow computation, it follows that for nodes for which the computed trust value is zero, you don't have to bother downloading their trust lists, so the number of such lists you download is similarly well controlled. I have read your messages again and all your new messages and you are so convinced about advogato that I'd like to ask you more questions about how it would work, I don't want you to feel like everyone is ignoring you :) (- I am more of a programmer right now than a designer of algorithms, I concentrate on spending most available time on *implementing* WoT/FT because nobody else is doing it and it needs to get done... so I have not talked much in this discussion) Well... to be fair, I'm not actually completely certain it will work. I do, however, think that it has a lot of potential. I don't know any way to get the answer short of running the experiment, and I'm very optimistic about the results. I firmly expect them to be good, but not perfect. Your questions are certainly welcome :) Consider the following case, using advogato and not the current FMS/WoT alchemy: 1. Identity X is an occasional and trustworthy poster. X has received many positive trust values from hundreds of identities because it has posted hundreds of messages over the months, so it has a high score and capacity to give trust values, and all newbies will know about the identity and it's high score because it is well-integrated into the trust graph. Careful: Advogato doesn't assign trust scores in the same sense that FMS and WoT do. Because X is trusted by many identities, many identities can reach it, and therefore accept it. That is a purely binary consideration -- it does not matter directly that it is reachable by many paths. Because many identities link to X, X is only a short distance away from many identities. When A calculates his trust graph, X is likely to be nearby. However, even if X is poorly connected, this will be true for some identities; the connectivity changes how likely it is. Capacity of a node is determined (in the base algorithm; there are tweaks worth considering) only by distance, nothing else. Whether that capacity actually limits anything or not depends on a variety of factors. 
If there aren't enough downstream nodes, then it isn't needed. If the upstream nodes spend their capacity elsewhere, there might not be enough available to fill it -- here is the other place that X being well connected matters. 2. Now a spammer gets a single identity Y onto the trust list of X by solving a captcha, his score is very low because he has only solved a captcha but the score is there. Therefore, any newbie will see Y because X is well-integrated into the WoT Correct. 3. X is gone for quite some time due to inactivity, during that time Y creates 500 spam identities on his trust list and starts to spam all boards. X will not remove Y from his trust list because he is *away* for weeks. Several points. First, one of the optimizations worth considering is tightly limiting the capacity of any identity that only has captcha level trust. This means that newbies have to solve captchas from identities that have received manual trust, which is easy enough to determine. It also means that though our spammer lists 500 fake ids, other people will only accept a very small number of them -- possibly as low as zero, if the captcha trust only nodes are limited to capacity 1. So most of those ids are worthless, and spam is contained. This is one of the weaknesses of the simplest implementation (no limits on captcha-only ids, that is use
Re: [freenet-dev] Question about an important design decision of the WoT plugin
2009/5/26 xor x...@gmx.li: On Tuesday 26 May 2009 22:02:37 xor wrote: On Thursday 07 May 2009 11:23:51 Evan Daniel wrote: Why exactly? Your post is nice but I do not see how it answers my question. The general problem my post is about: New identities are obtained by taking them from trust lists of known identities. An attacker therefore could put 100 identities in his trust list to fill up your database and slow down WoT. Therefore, an decision has to be made when to NOT import new identities from someone's trust list. In the current implementation, it is when he has a negative score. [...] I have not examined the WoT code. However, the Advogato metric has two attributes that I don't think the current WoT method has: no negative trust behavior (if there is a trust rating Bob can assign to Carol such that Alice will trust Carol less than if Bob had not assigned a rating, that's a negative trust behavior), and a mathematical proof as to the upper limit on the quantity of spammer nodes that get trusted. The Advogato metric is *specifically* designed to handle the case of the attacker creating millions of accounts. In that case, his success is bounded (linear with modest constant) by the number of confused nodes -- that is, legitimate nodes that have (incorrectly) marked his accounts as legitimate. If you look at the flow computation, it follows that for nodes for which the computed trust value is zero, you don't have to bother downloading their trust lists, so the number of such lists you download is similarly well controlled. I have read your messages again and all your new messages and you are so convinced about advogato that I'd like to ask you more questions about how it would work, I don't want you to feel like everyone is ignoring you :) (- I am more of a programmer right now than a designer of algorithms, I concentrate on spending most available time on *implementing* WoT/FT because nobody else is doing it and it needs to get done... so I have not talked much in this discussion) Consider the following case, using advogato and not the current FMS/WoT alchemy: 1. Identity X is an occasional and trustworthy poster. X has received many positive trust values from hundreds of identities because it has posted hundreds of messages over the months, so it has a high score and capacity to give trust values, and all newbies will know about the identity and it's high score because it is well-integrated into the trust graph. 2. Now a spammer gets a single identity Y onto the trust list of X by solving a captcha, his score is very low because he has only solved a captcha but the score is there. Therefore, any newbie will see Y because X is well-integrated into the WoT 3. X is gone for quite some time due to inactivity, during that time Y creates 500 spam identities on his trust list and starts to spam all boards. X will not remove Y from his trust list because he is *away* for weeks. Also consider the case that instead of 500 new identities he just posts 500 messages with his single identity Y. How do we get rid of Y? First, you rate limit messages. I'm having trouble coming up with a case where I ever want my node downloading that many messages from one identity. Second, after I read a few, I'll mark some as spam and the rest will go away. From a practical standpoint, I don't really care about the difference between 5 messages, 500, or 50 -- I'll read one, or a few, and then mark Y as a spammer. I'll never see the rest. 
Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
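As an aside on the rate limiting mentioned above: a simple per-identity cap on how many messages the node will fetch per day is enough to bound what a single identity can do. A sketch (the limit, class and method names are made up, not actual Freetalk/WoT code):

import java.util.HashMap;
import java.util.Map;

// Sketch: cap how many messages we bother fetching per identity per day.
// The limit and names are illustrative, not actual Freetalk/WoT code.
public class MessageRateLimiter {
    private static final int MAX_MESSAGES_PER_DAY = 50;
    private final Map<String, Integer> fetchedToday = new HashMap<>();

    // Call before fetching a message; returns false once the identity hits its cap.
    public synchronized boolean mayFetch(String identityId) {
        int count = fetchedToday.getOrDefault(identityId, 0);
        if (count >= MAX_MESSAGES_PER_DAY) return false;
        fetchedToday.put(identityId, count + 1);
        return true;
    }

    // Reset once a day (e.g. from a scheduled task).
    public synchronized void newDay() {
        fetchedToday.clear();
    }
}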
Re: [freenet-dev] Why WoTs won't work....
On Tue, May 26, 2009 at 4:45 PM, xor x...@gmx.li wrote: On Friday 22 May 2009 16:39:06 Evan Daniel wrote: On Fri, May 22, 2009 at 8:17 AM, Matthew Toseland t...@amphibian.dyndns.org wrote: On Friday 22 May 2009 08:17:55 bbac...@googlemail.com wrote: Is'nt his point that the users just won't maintain the trust lists? I thought that is the problem that he meant how can Advogato help us here? Advogato with only positive trust introduces a different tradeoff, which is still a major PITA to maintain, but maybe less of one: - Spammers only disappear when YOU mark them as spammers, or ALL the people you trust do. Right now they disappear when the majority, from the point of view of your position on the WoT, mark them as spammers (more or less). When they *fail to mark them as trusted*. It's an important distinction, as it means that in order for the spammer to do anything they first have to *manually* build trust. If an identity suddenly starts spamming, only people that originally marked it as trusted have to change their trust lists in order to stop them. - If you mark a spammer as positive because he posts useful content on one board, and you don't read the boards he spams you are likely to get marked as a spammer yourself. Depends how militant people are. I suspect in practice people won't do this unless you trust a lot of spammers... in which case they have a point. (This is also a case for distinguishing message trust from trust list trust; while Advogato doesn't do this, the security proof extends to cover it without trouble.) You can take an in-between step: if Alice marks both Bob and Carol as trusted, and Bob marks Carol a spammer, Alice's software notices and alerts Alice, and offers to show Alice recent messages from Carol from other boards. (Algorithmically, publishing Sam is a spammer is no different from not publishing anything about Sam, but it makes some nice things possible from a UI standpoint.) This may well get most of the benefit of ultimatums with lower complexity. - If a spammer doesn't spam himself, but gains trust through posting useful content on various boards and then spends this trust by trusting spam identities, it will be necessary to give him zero message list trust. Again this has serious issues with collateral damage, depending on how trigger-happy people are and how much of a problem it is for newbies to see spam. Technologically, this requires: - Changing WoT to only support positive trust. This is more or less a one line change. If all you want is positive trust only, yes. If you want the security proof, it requires using the network flow algorithm as specified in the paper, which is a bit more complex. IMHO, fussing with the algorithm in ways that don't let you apply the security proof is just trading one set of alchemy for another -- it might help, but I don't think it would be wise. - Making sure that my local ratings always override those given by others, so I can mark an identity as spam and never see it again. Dunno if this is currently implemented. - Making CAPTCHA announcement provide some form of short-lived trust, so if the newly introduced identity doesn't get some trust it goes away. This may also be implemented. My proposal: there are two levels of trust (implementation starts exactly as per Advogato levels). The lower level is CAPTCHA trust; the higher is manually set only. (This extends to multiple manual levels without loss of generality.) First, the algorithm is run normally on the manual trust level. 
Then, the algorithm is re-run on the CAPTCHA trust level, with modification: identities that received no manual trust have severely limited capacity (perhaps as low as 1), and the general set of capacity vs distance from root is changed to not go as deep. The first part means that the spammer can't chain identities *at all* before getting the top one manually trusted. The second means that identities that only solved a CAPTCHA will only be seen by a small number of people -- ie they can't spam everyone. The exact numbers for flow vs depth would need some tuning for both trust levels, obviously. You want enough people to see new identities that they will receive manual trust. It is absolutely INACCEPTABLE for a discussion system to only display messages of newbies to some people due to the nature of discussion: - The *value* of a single post from a new identity which has posted a single message can be ANYTHING... it can be absolute crap... but it can also be a highly valuable secret document which reveals stuff which is interesting for millions of people. In other words: The fact that someone is a newbie does not say ANYTHING about the worth of his posts. In more other words: NO individual has the right to increase the worth of his posts - as in the amount of people reading them - by speaking very much on Freetalk
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Tue, May 26, 2009 at 5:38 PM, xor x...@gmx.li wrote: On Tuesday 26 May 2009 23:19:53 Evan Daniel wrote: 2009/5/26 xor x...@gmx.li: On Tuesday 26 May 2009 22:02:37 xor wrote: On Thursday 07 May 2009 11:23:51 Evan Daniel wrote: Why exactly? Your post is nice but I do not see how it answers my question. The general problem my post is about: New identities are obtained by taking them from trust lists of known identities. An attacker therefore could put 100 identities in his trust list to fill up your database and slow down WoT. Therefore, an decision has to be made when to NOT import new identities from someone's trust list. In the current implementation, it is when he has a negative score. [...] I have not examined the WoT code. However, the Advogato metric has two attributes that I don't think the current WoT method has: no negative trust behavior (if there is a trust rating Bob can assign to Carol such that Alice will trust Carol less than if Bob had not assigned a rating, that's a negative trust behavior), and a mathematical proof as to the upper limit on the quantity of spammer nodes that get trusted. The Advogato metric is *specifically* designed to handle the case of the attacker creating millions of accounts. In that case, his success is bounded (linear with modest constant) by the number of confused nodes -- that is, legitimate nodes that have (incorrectly) marked his accounts as legitimate. If you look at the flow computation, it follows that for nodes for which the computed trust value is zero, you don't have to bother downloading their trust lists, so the number of such lists you download is similarly well controlled. I have read your messages again and all your new messages and you are so convinced about advogato that I'd like to ask you more questions about how it would work, I don't want you to feel like everyone is ignoring you :) (- I am more of a programmer right now than a designer of algorithms, I concentrate on spending most available time on *implementing* WoT/FT because nobody else is doing it and it needs to get done... so I have not talked much in this discussion) Consider the following case, using advogato and not the current FMS/WoT alchemy: 1. Identity X is an occasional and trustworthy poster. X has received many positive trust values from hundreds of identities because it has posted hundreds of messages over the months, so it has a high score and capacity to give trust values, and all newbies will know about the identity and it's high score because it is well-integrated into the trust graph. 2. Now a spammer gets a single identity Y onto the trust list of X by solving a captcha, his score is very low because he has only solved a captcha but the score is there. Therefore, any newbie will see Y because X is well-integrated into the WoT 3. X is gone for quite some time due to inactivity, during that time Y creates 500 spam identities on his trust list and starts to spam all boards. X will not remove Y from his trust list because he is *away* for weeks. Also consider the case that instead of 500 new identities he just posts 500 messages with his single identity Y. How do we get rid of Y? First, you rate limit messages. I'm having trouble coming up with a case where I ever want my node downloading that many messages from one identity. And how to find a practical rate limit? Consider SVN/GIT/etc. log-bots: They post a single message for each commit to the repository. FMS, WoT, and Advogato all mark identities, not messages. 
Why is this scenario relevant to the question at hand -- that is, which algorithm to run on the trust graph? If you want to discuss intelligent rate limiting, and how to make that usable and useful to users, that is basically a UI problem. I have ideas and suggestions, and would be happy to discuss them. However, that would be completely unrelated to the current subject, so I suggest starting a new thread. Second, after I read a few, I'll mark some as spam and the rest will go away. From a practical standpoint, I don't really care about the difference between 5 messages, 500, or 50 -- I'll read one, or a few, and then mark Y as a spammer. I'll never see the rest. Can a messaging system survive which will appear as full of spam to every newbie? Isn't it the core goal of the WoT to prevent *newbies* from seeing spam, to let the community design a set of ratings which prevents EVERYONE from having to manually mark spam/non-spam, letting only a subset of the community do the work while others benefit from it? I think that's what any algorithm needs to be able to do: provide a nice first usage experience. First usage = empty trust list. So this also applies to people who are too lazy to mark everything as spam which is spam. Which probably applies to 50
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Wed, May 27, 2009 at 1:18 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: That is fundamentally a hard problem. - Advogato is not perfect. I am certain there will be some amount of spam getting through; hopefully it will be a small amount. - With Advogato, the amount of spam possible is well defined. With FMS and WoT it is not. Neither of them have an upper bound on the amount of spam. How do you define spam? Please clarify the question. Do you mean me, personally? The Freenet community as a whole? Or in the context of the proof? One limit per identity is the amount of messages which are accepted per day. And if you trust some active indentities, which think the same as you, you will get nearly no spam at all because they already marked it as spam. FMS/WoT depends on the trust relationship between people and them telling each other about third partys. - Being too good at solving the spam problem means we are too good at mob censorship. Both are problems. In practice, the goal should be to strike an appropriate balance between the two, not simply to eliminate spam. Since you cannot say what is spam and what not, this is relative. In FMS, you can choose to trust those that think the same as you and you will get their spam markings. Can you get the same with avogato? I have only *very* rarely had any difficulty determining whether a message was spam or not. Why would this be any different? Of course Advogato gives you the same ability, that is the entire point. The precise algorithm is different, but the problem it tries to solve is the same. The one difference is that Advogato is not about determining that person X is a spammer, it's about determining that person X *isn't* a spammer. From a user's standpoint, the two questions are precisely identical, but at an algorithm level they're not. - I believe that Advogato is capable of limiting spam to levels where the system is usable, even in the case of reasonably determined spammers. If the most they can aspire to is being a nuisance, I don't think the spammers will be as interested. If spamming takes work and doesn't do all that much, they'll give up. The actual amount of spam seen in practice should be well below the worst possible case -- if and only if the worst case isn't catastrophic. How much noice will it allow? The alice bot spam in frost was also just annoying, but i do think that many new users where annoyed and left frost and freenet. So a default system should not only make it usable, but also relative spamfree from the view of the majority. It will accept a number of spam identities at most equal to the sum of the excess capacity of the set of confused identities. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
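To put a rough number on that bound (figures invented purely for illustration): if four confused identities each have capacity 4 and each already spends 1 unit of it on legitimate certificates, the excess capacity is 4 * (4 - 1) = 12, so at most 12 spammer identities can ever be accepted through them, no matter how many millions of identities the spammer actually creates.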
Re: [freenet-dev] Why WoTs won't work....
-based trust has tighter limitations than manual trust, he has to solve a captcha for each fake identity. This proposal is not a required part of Advogato, it is my own suggestion. It could be applied to WoT as well, I believe. If you assume that people will not maintain trust lists, then it doesn't matter what algorithm you run on the trust graph. There won't be one. FMS, WoT, and Advogato all fail completely under that assumption. Fundamentally, it's a question of whether you believe CAPTCHAs work. I don't. If you start with an assumption that CAPTCHAs are a minor hindrance at most, then if you require that everyone sees messages sent by identities that have only solved CAPTCHAs and not gained manual trust, then you've made it a design criterion to permit unlimited amounts of spam. (That's bad.) If you believe CAPTCHAs work, then things are a bit easier... but I think the balance of the evidence is against that belief. Captchas may not be the ultimate solution. But they are one way to let people in while proving that they are human. And you will need this limit (proof of being human), so you will always need some sort of captcha or a real-friends trust network. Captchas do not prove someone is human. They prove that someone solved a problem. If your captchas are good, that means they are more likely to be human. I work from an assumption that captchas are marginally effective at best. If you think I am mistaken in that, please explain why. From that assumption, I conclude that we need a system that is reasonably effective against a spammer who can solve significant numbers of captchas, but still is capable of making use of the information that solving a captcha does provide. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Question about an important design decision of the WoT plugin
On Wed, May 27, 2009 at 2:44 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: On Wed, May 27, 2009 at 1:18 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: That is fundamentally a hard problem. - Advogato is not perfect. I am certain there will be some amount of spam getting through; hopefully it will be a small amount. - With Advogato, the amount of spam possible is well defined. With FMS and WoT it is not. Neither of them have an upper bound on the amount of spam. How do you define spam? Please clarify the question. Do you mean me, personally? The Freenet community as a whole? Or in the context of the proof? The question should point out the problem about spam. One may say that only messages with random letters are spam. Others may add many messages, which are all the same or similar. Others may add messages with different languages than their own. Others may add logbots. Another one may want to add everyone who argues for avogato or FMS. Since there is no objective spam definition, you can neither say that the amount of spam is well defined nor that there is no upper bound on the amount of spam. Have you read the proof? - Being too good at solving the spam problem means we are too good at mob censorship. Both are problems. In practice, the goal should be to strike an appropriate balance between the two, not simply to eliminate spam. Since you cannot say what is spam and what not, this is relative. In FMS, you can choose to trust those that think the same as you and you will get their spam markings. Can you get the same with avogato? I have only *very* rarely had any difficulty determining whether a message was spam or not. Why would this be any different? You yourself had no problems. But are you sure others share your view on it? How is this remotely relevant to choice of algorithm? - I believe that Advogato is capable of limiting spam to levels where the system is usable, even in the case of reasonably determined spammers. If the most they can aspire to is being a nuisance, I don't think the spammers will be as interested. If spamming takes work and doesn't do all that much, they'll give up. The actual amount of spam seen in practice should be well below the worst possible case -- if and only if the worst case isn't catastrophic. How much noice will it allow? The alice bot spam in frost was also just annoying, but i do think that many new users where annoyed and left frost and freenet. So a default system should not only make it usable, but also relative spamfree from the view of the majority. It will accept a number of spam identities at most equal to the sum of the excess capacity of the set of confused identities. The question is this: Will it prevent enough, so almost all spam or will the amount of spam force new (and old) users to leave like it happened and happens with frost and the alice bot? That is one question, but not the only one. Another one is, is the provable upper bound better or worse than the provable upper bound for a specific alternative proposal? As to the former, I cannot say with certainty until we try it out, and even then we only have indications, not proof. As to the latter, I have yet to hear anyone propose an alternative to Advogato that has an upper bound, let alone one that's better. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Why WoTs won't work....
On Wed, May 27, 2009 at 3:11 PM, Thomas Sachau m...@tommyserver.de wrote: Evan Daniel schrieb: On Wed, May 27, 2009 at 1:29 PM, Thomas Sachau m...@tommyserver.de wrote: A small number could still be rather large. Having thousands see it ought to suffice. For the current network, I see no reason not to have the (default) limits such that basically everyone sees it. If your small number is that big, you should add that because for me, small is not around thousends. Additionally, if you allow them to reach thousends (will a freenet based message system ever reach more people?), is there any value in restricting this anyway? Currently, the total number of people using Freenet is small. Hopefully that will not always be the case. Designing a new system that assumes it will always be the case seems like a rather bad idea to me. In this context, I would say small means sublinear growth with the size of the entire network. Having the new-identity spam reach thousands of recipients is far better than having it reach tens of thousands or millions. Why not let the WoT solve the problem? In practise, not all of those will pull the spam at the same time. So some will get it first, see it is spam and mark it as such. Later ones will then see the spammer mark and not even fetch the message. On the other hand, if it is no spam, it will get fetched. If WoT can solve it, fine. If it can't, that's fine too. Neither case has any bearing on Advogato's abilities, merely the standard of comparison. If the post is really that valuable, some people will mark the poster as trusted. Then everyone will see it. Why should they? People are lazy, so most, if not all will just read it, maybe answer it, but who thinks about rating someone because of a single post? People are and will always be lazy. If the post is only somewhat valuable, it might take a few posts. If it's a provocative photo that escaped from an oppressive regime, I suspect it wouldn't. A few? I do sometimes check some FMS trustlists. And those i did check did not set some trust value for many people. Additionally remember that FMS is used by people who are willing to do something. So i would expect much less from the default WoT inside freenet. With your suggestion, someone will have to wait, until someone uncensors him. Imho, noone should be censored by default, so it should be exactly the other way round. See below on captchas. Granting trust automatically on replies is an idea that has been discussed before. It has a great deal of merit. I'm in favor of it. I just don't think that should be the highest level of trust. It may be an additional option, but this would only make those well-trusted, which do write many posts, while others with less posts get less trust. Would be another place, where a spammer could do something to make his attacks more powerfull. It is my firm belief that if the system makes the spammer perform manual work per identity they wish to spam with, the problem is solved. Do you have evidence or sound reasoning to the contrary? All systems I know of -- such as email and Frost -- have spam problems because the spammer can automate all the steps. You may think that everyone should be equal; I don't. If newbies are posting stuff that isn't spam (be it one message or many), I'm willing to believe someone my web can reach will mark them trusted. You obviously aren't; that's fine too. Fortunately, there is no requirement we use the same capacity limiting functions -- that should be configurable for each user. 
If you want to make the default function fairly permissive, that's fine. I think you'd be making the wrong choice, but personally I wouldn't care that much because I'd just change it away from the default if new-identity spam was a problem. So you want the default to be more censoring. And you trust people to not be lazy. I oppose both. First, if you really want to implement such censorship, make the default open, with thousends of trusted users, it wont be a difference anyway. Second, why should people mark new identities as trusted? I use FMS and i dont change the trust of every identity i see there. And i do somehow manage a trustlist there. If someone is lazy (and the majority is), they will do nothing. If one of your design requirements is that new identities can post and be seen by everyone, you have made the spam problem unsolvable BY DEFINITION. That is bad. Wrong. The initial barrier is the proove to solve a problem. Which should be done with a problem hard for computers and easy for humans. But this just prevents automated computerbased identity creation. Please cite evidence that such a problem exists in a form that is user-friendly enough we can use it. Unless I am greatly mistaken, Freenet's goal as a project is not to solve the captcha problem when no one else has. Taking on oppressive governments
Re: [freenet-dev] Freenet doesn't work with java 1.6.0.14??? was Fwd: Re: [freenet-support] freenet
On Thu, Jun 4, 2009 at 3:35 PM, Matthew Toseland t...@amphibian.dyndns.org wrote: -- Forwarded message -- From: Philip Bych pot...@googlemail.com To: Matthew Toseland t...@amphibian.dyndns.org Date: Thu, 4 Jun 2009 19:46:46 +0100 Subject: Re: [freenet-support] freenet thanks but i have sorted the problem it was the java update i updated java runtime to the latest 1.6.0.14 and it seems that freenet does not work with this so i have reinstalled java 1.6.0.13 and uninstalled then reinstalled freenet all works fine as long as i do not update java runtime. cheers 2009/6/4 Matthew Toseland t...@amphibian.dyndns.org On Tuesday 02 June 2009 20:54:09 goat wrote: updated java to 1.6.0.14 and updated freenet now nothing works keep getting no start up script have disabled norton and use a different browser and all other security still no joy what other options are there Hi. To help us solve this problem, please: - Find the directory Freenet is installed in, find a file called wrapper.log, and send me it. - Open a terminal (run cmd.exe), cd to where Freenet is installed, type start.exe (or start.cmd if you have an old installation). What happens? Send any output. ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl Works for me (TM). On Debian, using Sun Java (package sun-java6-jdk, etc). I just installed the version out of unstable. $ java -version java version 1.6.0_14 Java(TM) SE Runtime Environment (build 1.6.0_14-b08) Java HotSpot(TM) Server VM (build 14.0-b16, mixed mode) Freenet seems to be functioning normally. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
[freenet-dev] Should the spider ignore common words?
On my (incomplete) spider index, the index file for the word the (it indexes no other words) is 17MB. This seems rather large. It might make sense to have the spider not even bother creating an index on a handful of very common words (the, be, to, of, and, a, in, I, etc). Of course, this presents the occasional difficulty: http://bash.org/?514353 I think I'm in favor of not indexing common words even so. Also, on a related note, the index splitting policy should be a bit more sophisticated: in an attempt to fit within the max index size as configured, it split all the way down to index_8fc42.xml. As a result, the file index_8fc4b.xml sits all by itself at 3KiB. It contains the two words vergessene and txjmnsm. I suspect it would have reliability issues should anyone actually want to search either of those. It would make more sense to have all of index_8fc4 in one file, since it would be only trivially larger. (I have a patch that I thought did that, but it has a bug; I'll test once my indexwriter is finished writing, since I don't want to interrupt it by reloading the plugin.) Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
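For reference, the stopword check itself could be as small as the sketch below (the word list, class and method names are illustrative, not the existing Spider or XMLLibrarian code):

import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Sketch of a stopword filter for the spider; word list and names are illustrative.
public class Stopwords {
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "the", "be", "to", "of", "and", "a", "in", "that", "have", "i", "it", "for"));

    // Returns true if the word is worth adding to the index.
    public static boolean shouldIndex(String word) {
        return !STOPWORDS.contains(word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String w : new String[] { "the", "freenet", "of", "routing" })
            System.out.println(w + " -> " + (shouldIndex(w) ? "index" : "skip"));
    }
}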
Re: [freenet-dev] Should the spider ignore common words?
On Wed, Jun 10, 2009 at 1:54 AM, Daniel Chengj16sdiz+free...@gmail.com wrote: On Wed, Jun 10, 2009 at 12:02 PM, Evan Danieleva...@gmail.com wrote: On my (incomplete) spider index, the index file for the word the (it indexes no other words) is 17MB. This seems rather large. It might make sense to have the spider not even bother creating an index on a handful of very common words (the, be, to, of, and, a, in, I, etc). Of course, this presents the occasional difficulty: http://bash.org/?514353 I think I'm in favor of not indexing common words even so. Yes, it should ignore common words. This is called stopword in search engine termology. Also, on a related note, the index splitting policy should be a bit more sophisticated: in an attempt to fit within the max index size as configured, it split all the way down to index_8fc42.xml. As a result, the file index_8fc4b.xml sits all by itself at 3KiB. It contains the two words vergessene and txjmnsm. I suspect it would have reliability issues should anyone actually want to search either of those. It would make more sense to have all of index_8fc4 in one file, since it would be only trivially larger. (I have a patch that I thought did that, but it has a bug; I'll test once my indexwriter is finished writing, since I don't want to interrupt it by reloading the plugin.) trivially larger ... ugh... how trivial is trivial? the xmllibrarian can handle index_8fc42.xml on its own but all other 8fc4 on index_8fc4.xml. however, as i have stated in irc, that make index generation even slower. 8fc42 is 17382 KiB. All other 8fc4 are 79 KiB combined. Also, it would make index generation faster. The spider first does all the work of creating 8fc4, then discards it to recreate the sub-indexes. The vast majority of this work is in 8fc42, which gets created twice. Not splitting the index would nearly halve the time to create the 8fc4 set of indexes. Of course, a more efficient algorithm for creating the indexes in the first place would both make it far faster and make the two take approximately the same time. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
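For what it's worth, the splitting decision could be expressed along these lines (a sketch of the policy only, not the actual IndexWriter code nor the patch mentioned above; the names and the slack figure are made up):

// Sketch of a less aggressive sub-index split policy (illustrative, not the real IndexWriter).
public class SplitPolicy {
    // Split a prefix into longer prefixes only if that would actually shrink the
    // biggest resulting file; when one term dominates (like "the" in index_8fc42),
    // splitting further just strands a handful of tiny siblings.
    public static boolean shouldSplit(long prefixSizeEstimate, long largestChildEstimate, long maxIndexSize) {
        boolean overLimit = prefixSizeEstimate > maxIndexSize;
        boolean splittingHelps = prefixSizeEstimate - largestChildEstimate > maxIndexSize / 10;
        return overLimit && splittingHelps;
    }

    public static void main(String[] args) {
        // index_8fc4: ~17461 KiB total, largest child ~17382 KiB, 4 MiB limit -> prints false (keep whole)
        System.out.println(shouldSplit(17382L * 1024 + 79 * 1024, 17382L * 1024, 4L << 20));
    }
}

With the numbers above (index_8fc4 at roughly 17461 KiB, of which 17382 KiB is one child), splitting further gains nothing, so the whole prefix would stay in one file.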
Re: [freenet-dev] Should the spider ignore common words?
On Wed, Jun 10, 2009 at 2:56 AM, Daniel Chengj16sdiz+free...@gmail.com wrote: On Wed, Jun 10, 2009 at 2:06 PM, Evan Danieleva...@gmail.com wrote: On Wed, Jun 10, 2009 at 1:54 AM, Daniel Chengj16sdiz+free...@gmail.com wrote: On Wed, Jun 10, 2009 at 12:02 PM, Evan Danieleva...@gmail.com wrote: On my (incomplete) spider index, the index file for the word the (it indexes no other words) is 17MB. This seems rather large. It might make sense to have the spider not even bother creating an index on a handful of very common words (the, be, to, of, and, a, in, I, etc). Of course, this presents the occasional difficulty: http://bash.org/?514353 I think I'm in favor of not indexing common words even so. Yes, it should ignore common words. This is called stopword in search engine termology. Also, on a related note, the index splitting policy should be a bit more sophisticated: in an attempt to fit within the max index size as configured, it split all the way down to index_8fc42.xml. As a result, the file index_8fc4b.xml sits all by itself at 3KiB. It contains the two words vergessene and txjmnsm. I suspect it would have reliability issues should anyone actually want to search either of those. It would make more sense to have all of index_8fc4 in one file, since it would be only trivially larger. (I have a patch that I thought did that, but it has a bug; I'll test once my indexwriter is finished writing, since I don't want to interrupt it by reloading the plugin.) trivially larger ... ugh... how trivial is trivial? the xmllibrarian can handle index_8fc42.xml on its own but all other 8fc4 on index_8fc4.xml. however, as i have stated in irc, that make index generation even slower. 8fc42 is 17382 KiB. All other 8fc4 are 79 KiB combined. Also, it would make index generation faster. The spider first does all the work of creating 8fc4, then discards it to recreate the sub-indexes. The vast majority of this work is in 8fc42, which gets created twice. Not splitting the index would nearly halve the time to It doesn't get created twice; it shortcuts early. See the estimateSize variable in IndexWriter. Unless I'm mistaken, the slow part of the index creation is the term.getPages() call. That call is where all the disk io hides, no? The shortcut doesn't occur until after that call returns. As discussed above, 'the' accounts for about 99.5% of the whole index, and therefore (I'm assuming) 99.5% of the disk io. And that 99.5% happens twice. The shortcut only functions properly when the largest term accounts for a modest fraction of the total work, which is exactly what isn't happening here. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Should the spider ignore common words?
On Wed, Jun 10, 2009 at 3:49 AM, Daniel Chengj16sdiz+free...@gmail.com wrote: On Wed, Jun 10, 2009 at 3:18 PM, Evan Danieleva...@gmail.com wrote: On Wed, Jun 10, 2009 at 2:56 AM, Daniel Chengj16sdiz+free...@gmail.com wrote: On Wed, Jun 10, 2009 at 2:06 PM, Evan Danieleva...@gmail.com wrote: On Wed, Jun 10, 2009 at 1:54 AM, Daniel Chengj16sdiz+free...@gmail.com wrote: On Wed, Jun 10, 2009 at 12:02 PM, Evan Danieleva...@gmail.com wrote: On my (incomplete) spider index, the index file for the word "the" (it indexes no other words) is 17MB. This seems rather large. It might make sense to have the spider not even bother creating an index on a handful of very common words (the, be, to, of, and, a, in, I, etc). Of course, this presents the occasional difficulty: http://bash.org/?514353 I think I'm in favor of not indexing common words even so. Yes, it should ignore common words. These are called stopwords in search engine terminology. Also, on a related note, the index splitting policy should be a bit more sophisticated: in an attempt to fit within the max index size as configured, it split all the way down to index_8fc42.xml. As a result, the file index_8fc4b.xml sits all by itself at 3KiB. It contains the two words "vergessene" and "txjmnsm". I suspect it would have reliability issues should anyone actually want to search either of those. It would make more sense to have all of index_8fc4 in one file, since it would be only trivially larger. (I have a patch that I thought did that, but it has a bug; I'll test once my indexwriter is finished writing, since I don't want to interrupt it by reloading the plugin.) trivially larger ... ugh... how trivial is trivial? The XMLLibrarian can handle index_8fc42.xml on its own, with all the other 8fc4 entries in index_8fc4.xml. However, as I have stated on IRC, that makes index generation even slower. 8fc42 is 17382 KiB. All other 8fc4 are 79 KiB combined. Also, it would make index generation faster. The spider first does all the work of creating 8fc4, then discards it to recreate the sub-indexes. The vast majority of this work is in 8fc42, which gets created twice. Not splitting the index would nearly halve the time to It doesn't get created twice; it shortcuts early. See the estimateSize variable in IndexWriter. Unless I'm mistaken, the slow part of the index creation is the term.getPages() call. That call is where all the disk I/O hides, no? No :) getPages() returns an IPersistentSet (ScalableSet), which is lazily evaluated. Internally, it is a linkedset when small, a btree when large. The .size() method is always cached. In this case, I don't think it helps. 13 bytes is a gross underestimate of the size that adding a page adds to the file. estimateSize isn't checked again until all the pages have been added. Furthermore, that leaves the timing unexplained. It takes as long to generate b70 as all the rest of b7* combined. This is fairly consistent across the whole set of files (obviously some variation is present).
2009-06-10 02:59 index_b6e.xml
2009-06-10 03:00 index_b6f.xml
2009-06-10 03:16 index_b70.xml
2009-06-10 03:17 index_b71.xml
2009-06-10 03:18 index_b72.xml
2009-06-10 03:19 index_b73.xml
2009-06-10 03:20 index_b74.xml
2009-06-10 03:21 index_b75.xml
2009-06-10 03:21 index_b76.xml
2009-06-10 03:22 index_b77.xml
2009-06-10 03:24 index_b78.xml
2009-06-10 03:24 index_b79.xml
2009-06-10 03:25 index_b7a.xml
2009-06-10 03:27 index_b7b.xml
2009-06-10 03:28 index_b7c.xml
2009-06-10 03:28 index_b7d.xml
2009-06-10 03:29 index_b7e.xml
2009-06-10 03:30 index_b7f.xml
2009-06-10 03:45 index_b80.xml
2009-06-10 03:47 index_b81.xml
Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Should the spider ignore common words?
On Wed, Jun 10, 2009 at 6:49 AM, Mike Bushmpb...@gmail.com wrote: XMLLibrarian doesn't currently support searching for phrases or rating relevance of results based on proximity, so I don't think common words could be of any use in searches now. Also, I'm not sure but I think the current index doesn't include words under 4 letters at all. If you read my previous mails, you'll see that the spider is in fact indexing the word "the". Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
[freenet-dev] Reducing Bloom filter memory usage
Currently, the Bloom filters use 46 bits of RAM per key (23 buckets, with a 2-bit counter per bucket), using 16 hash functions. This gives a false positive rate of 1.5E-5. Oddly enough, a simple hash table has lower memory usage. Create an array in memory of n-bit values, one value per slot in the salted-hash store. The value stored in the array is an n-bit hash of the key in the corresponding store location. On a given lookup, the odds of a false positive are 2^-n. Because the store does quadratic probing with 4 slots, there are 4 lookups per key request. The false positive rate is then 2^-(n-2). For n=18, we get a false positive rate of 1.5E-5. n=16 would align on byte boundaries and save a tiny bit of memory at a cost of a false positive rate of 6.1E-5. The reason this works better than the Bloom filter is that a given key can only go in a limited set of locations in the salted-hash store. The Bloom filter works equally well whether or not that restriction is present. With this method, as the number of possible locations grows, so does the false positive rate. For low associativity, the simple hash table wins. This also makes updates and removal trivial. However, we probably can't share it with our neighbors without giving away our salt value. For that, we probably want to continue planning to use Bloom filters. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
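A rough sketch of the per-slot hash table described above, using n=16 so each fingerprint fits in a short. The class and method names are made up, the key hashing is a placeholder rather than the store's real salted hash, and handling of empty slots is ignored for brevity.

// Sketch of the per-slot fingerprint table described above (n=16). Slot
// selection and key hashing are placeholders; the real salted-hash store
// derives its probe sequence and hashes differently.
public class StoreKeyFilter {
    private final short[] fingerprints;   // one 16-bit fingerprint per store slot

    public StoreKeyFilter(int slots) {
        this.fingerprints = new short[slots];
    }

    // Called when a key is written to a given store slot.
    public void onStore(int slot, byte[] routingKey) {
        fingerprints[slot] = fingerprint(routingKey);
    }

    // With 4 probe slots per request, false positives occur at roughly 4 * 2^-16.
    public boolean mightContain(byte[] routingKey, int[] candidateSlots) {
        short fp = fingerprint(routingKey);
        for (int slot : candidateSlots) {
            if (fingerprints[slot] == fp) return true;
        }
        return false;
    }

    private static short fingerprint(byte[] routingKey) {
        // Placeholder: take 16 bits of the routing key. A real implementation
        // would mix in the store's salt so the fingerprint is unpredictable.
        return (short) ((routingKey[0] & 0xff) << 8 | (routingKey[1] & 0xff));
    }
}

This uses 16 bits of RAM per store slot, versus the 46 bits per key quoted above for the counting Bloom filter, which is where the memory saving comes from.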
Re: [freenet-dev] Variable opennet connections: moving forward
On Sat, Jun 13, 2009 at 1:08 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: Now that 0.7.5 has shipped, we can start making disruptive changes again in a few days. The number one item on freenet.uservoice.com has been for some time to allow more opennet peers for fast nodes. We have discussed this in the past; the conclusions, which I and some others agree with, are:
- This is feasible.
- It will not seriously break routing.
- Reducing the number of connections on slow nodes may actually be a gain in security, by increasing opportunities for coalescing. It will improve payload percentages, improve average transfer rates, let slow nodes accept more requests from each connection, and should improve overall performance.
- The network should be less impacted by the speed of the slower nodes.
- But we have tested using fewer connections on slow nodes in the past and had anecdotal evidence that it is slower. We need to evaluate it more rigorously somehow.
- Increasing the number of peers allowed for fast opennet nodes, within reason, should not have a severe security impact. It should improve routing (by a smaller network diameter). It will of course allow fast nodes to contribute more to the network. We do need to be careful to avoid overreliance on ubernodes (hence an upper limit of maybe 50 peers).
- Routing security: FOAF routing allows you to capture most of the traffic from a node already; the only thing stopping this is the 30%-to-one-peer limit.
- Coalescing security: Increasing the number of peers without increasing the bandwidth usage does increase vulnerability to traffic analysis by doing less coalescing. On the other hand, this is not a problem if the bandwidth usage scales with the number of nodes.
How can we move forward? We need some reliable test results on whether a 10KB/sec node is better off with 10 peers or with 20 peers. I think it's a fair assumption for faster nodes. Suggestions? I haven't tested at numbers that low. At 15KiB/s, the stats page suggests you're slightly better off with 12-15 peers than 20. I saw no subjective difference in browsing speed either way. I'm happy to do some testing here, if you tell me what data you want me to collect. More testers would obviously be good. We also need to set some arbitrary parameters. There is an argument for linearity, to avoid penalising nodes with different bandwidth levels, but nodes with more peers and the same amount of bandwidth per peer are likely to be favoured by opennet anyway... Non-linearity, in the sense of having a lower threshold and an upper threshold and linearly adding peers between them, but not necessarily consistently with the lower threshold, would mean fewer nodes with lots of peers, and might achieve better results? E.g.
10 peers at 10KB/sec ... 20 peers at 20KB/sec (1 per KB/sec)
20 peers at 20KB/sec ... 50 peers at 80KB/sec (1 per 3KB/sec)
I wouldn't go as low as 10 peers, simply because I haven't tested it. Other than that, those seem perfectly sensible to me. We should also watch for excessive CPU usage. If there's lots of bandwidth available, we'd want to have just enough connections to not quite limit on available CPU power. Of course, I don't really know how many connections / how much bandwidth it is before that becomes a concern. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
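A small sketch of the piecewise-linear scaling proposed above. The break points (10 peers at 10KB/sec, 20 at 20KB/sec, 50 at 80KB/sec) are just the example figures from this thread, not settled values, and the slope of the upper segment is written as 1 peer per 2KB/sec, since the quoted "1 per 3KB/sec" does not match its own end points (as pointed out later in the thread).

// Example-only peer scaling; the break points are the proposal's examples.
public class OpennetPeerScaling {
    public static int peersForBandwidth(int outputKBps) {
        if (outputKBps <= 10) return 10;              // lower clamp
        if (outputKBps <= 20) return outputKBps;      // 1 peer per KB/sec up to 20
        if (outputKBps >= 80) return 50;              // upper clamp
        // 20..80 KB/sec adds 30 peers over 60 KB/sec, i.e. 1 peer per 2 KB/sec.
        return 20 + (outputKBps - 20) / 2;
    }
}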
Re: [freenet-dev] About the website
On Sat, Jun 13, 2009 at 2:18 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: Probably worth moving forward on this? Submenus are important, we have a lot of content and it's not well organised. And arguably the theme is better, and arguably even if it isn't better it's a change, and is no worse... On Monday 01 June 2009 01:24:24 Clément wrote: Hello all, about three weeks ago I had an HCI web project to do. The subject was: improve an existing website (well, I'm not 100% sure that it was the subject, but that's what we've done). I convinced the three other people who worked with me to work on the freenet website. It was a small project though (3 hours with a teacher in the room, + 3 hours max of personal time), so we didn't go far. But maybe some of what we've done could be useful for the project. Here is the copy/paste of what we've done: I like the submenus. I think that is fairly universal. New layout is fine. I am not convinced about the way the site has been split up however. More comments below... --
Objective of the new website:
- To improve the existing navigation controls of freenetproject.org
- To improve its structural presentation of information on the home page
Aim of the web-site:
- to present the software product and provide support
- documentation and tools to users and developers to allow them to use and contribute to the software.
Problems: The problems of the current website http://freenetproject.org :
- irrelevant information for homepage: mainly financial status, we don't know what freenet is
- too many items in left navigation menu and not really well structured
- documentation section where subsections do not have direct hyperlinks
- it's confusing
Solutions we proposed:
- simpler horizontal navigation bar with restructured tree
- new menu tree proposition:
Home -- what is freenet a bit modified page
This page looks good IMHO.
About freenet:
  what is freenet
  philosophy
  contributors
Downloads:
  freenet
  tools
Tools are unofficial and unsupported. Maybe download should be under home?
Contribute:
  papers -- research and stuff
  developer
I'm not convinced papers belong under Contribute.
Donations
  donate
  sponsors
Ok. But shouldn't both be under Contribute?
Support
  feedback
  help --documentation and stuff
  faq --move out from help section
  mailing lists
  suggestions
What about the wiki? Shouldn't it be on the same level as the uservoice tracker? How about:
Home
- Home
- What is Freenet?
- Download Freenet
About:
- Philosophy
- Papers
- People
Contribute:
- Developer page
- Donate
- Sponsors
Help:
- Docs
- FAQ
- Mailing lists
- Suggestions
- Wiki
Too many links? Depends on the theme I guess... Actually, I'm not convinced we want to keep the documentation pages:
- Install: only applies to the Java installer, needs some typo fixes and a new final screenshot, and some guidance on the post-install wizard.
- Connect: needs updating but is basically acceptable.
- Content: dunno, isn't this more About? but I'm not sure we want to move it there...
- Understand: maybe keep
- Freemail: probably keep, is sort of official
- Frost: dunno, we don't ship it, and we don't review it, but at the moment we recommend it ...
Papers Philosophy People I'm happy to volunteer to work on the wiki, but only if it is going to be made prominent enough that new users are likely to see it. Buried under a submenu as it presently is, I feel that effort spent improving it would be wasted because no one who needs the info would ever see it. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Variable opennet connections: moving forward
On Sat, Jun 13, 2009 at 2:54 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: On Saturday 13 June 2009 19:05:36 Evan Daniel wrote: On Sat, Jun 13, 2009 at 1:08 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: Now that 0.7.5 has shipped, we can start making disruptive changes again in a few days. The number one item on freenet.uservoice.com has been for some time to allow more opennet peers for fast nodes. We have discussed this in the past, and the conclusions which I agree with and some others do: - This is feasible. - It will not seriously break routing. - Reducing the number of connections on slow nodes may actually be a gain in security, by increasing opportunities for coalescing. It will improve payload percentages, improve average transfer rates, let slow nodes accept more requests from each connection, and should improve overall performance. - The network should be less impacted by the speed of the slower nodes. - But we have tested using fewer connections on slow nodes in the past and had anecdotal evidence that it is slower. We need to evaluate it more rigorously somehow. - Increasing the number of peers allowed for fast opennet nodes, within reason, should not have a severe security impact. It should improve routing (by a smaller network diameter). It will of course allow fast nodes to contribute more to the network. We do need to be careful to avoid overreliance on ubernodes (hence an upper limit of maybe 50 peers). - Routing security: FOAF routing allows you to capture most of the traffic from a node already, the only thing stopping this is the 30%-to-one-peer limit. - Coalescing security: Increasing the number of peers without increasing the bandwidth usage does increase vulnerability to traffic analysis by doing less coalescing. On the other hand, this is not a problem if the bandwidth usage scales with the number of nodes. How can we move forward? We need some reliable test results on whether a 10KB/sec node is better off with 10 peers or with 20 peers. I think it's a fair assumption for faster nodes. Suggestions? I haven't tested at numbers that low. At 15KiB/s, the stats page suggests your slightly better off with 12-15 peers than 20. I saw no subjective difference in browsing speed either way. Which stats are you comparing? Output bandwidth (average), payload %, and nodeAveragePingTime. I'd be happy to track others as well. I'm happy to do some testing here, if you tell me what data you want me to collect. More testers would obviously be good. That would be a good start. It would be useful to compare: - 12KB/sec with 10, 12, 20 peers. - 8KB/sec with 8, 10, 20 peers. - 20KB/sec with 10, 15, 20 peers. 10 peers on each setting (proposed minimum), 20 peers (current setting), and 1 peer per KiB/s... What's the rationale behind 20KiB/s with 15 peers? The huge variable is what sort of load I put on the node. Nothing? A few queued downloads? Run the spider? Some test files inserted for the purpose by someone else? Other ideas? We also need to set some arbitrary parameters. There is an argument for linearity, to avoid penalising nodes with different bandwidth levels, but nodes with more peers and the same amount of bandwidth per peer are likely to be favoured by opennet anyway... Non-linearity, in the sense of having a lower threshold and an upper threshold and linearly add peers between them but not necessarily consistently with the lower threshold, would mean fewer nodes with lots of peers, and might achieve better results? E.g. 10 peers at 10KB/sec ... 
20 peers at 20KB/sec (1 per KB/sec) 20 peers at 20KB/sec ... 50 peers at 80KB/sec (1 per 3KB/sec) I wouldn't go as low as 10 peers, simply because I haven't tested it. Well, maybe the lower bound should be different. Testing should help. It might very well be that there is a minimum number of opennet connections below which it just doesn't work well. I suspect that is the case. I have no idea where that limit is, though. I suspect having the 30% limit become relevant just due to normal routing policy would be bad. Also, your math above is off: 20 KiB/s to 80 KiB/s is a 60 KiB/s jump; adding 30 peers is 1 peer per 2 KiB/s. Other than that, those seem perfectly sensible to me. We should also watch for excessive cpu usage. If there's lots of bw available, we'd want to have just enough connections to not quite limit on available cpu power. Of course, I don't really know how many connections / how much bw it is before that becomes a concern. Maybe... just routing requests isn't necessarily a big part of our overall CPU usage, the client layer stuff tends to be pretty heavy ... IMHO if people have CPU problems they can just reduce their bandwidth limits. To some degree ping time will keep it in check, but that's a crude measure in that it can't do much until the situation
Re: [freenet-dev] About the website
On Sat, Jun 13, 2009 at 3:15 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: On Saturday 13 June 2009 20:01:18 Evan Daniel wrote: On Sat, Jun 13, 2009 at 2:18 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: Probably worth moving forward on this? Submenus are important, we have a lot of content and it's not well organised. And arguably the theme is better, and arguably even if it isn't better it's a change, and is no worse... On Monday 01 June 2009 01:24:24 Clément wrote: Hello all, about three weeks ago I had a HCI webproject to do. The subject was : improve an existant website (well, I'm not 100% sure that it was the subject, but that what we've done) I convinced the three other people who worked with me to work on the freenet website. It was a small project though (3 hours with a teacher in the room, + 3 hours max of personal time), so we didn't go far. But maybe some of what we've done could be usefull for the project. Here is the copy/paste of what we've done : I like the submenus. I think that is fairly universal. New layout is fine. I am not convinced about the way the site has been split up however. More comment below... -- Objective of the new website: - To improve the existing navigation controls of freenetproject.org - To improve it's structural presentation of information on home page Aim of the web-site: - to present the software product and provide support - documentation and tools to users and developers to allow them to use and contribute to the software. Problems: The problems of current website http://freenetproject.org : - irrelevant information for homepage: mainly financial status, we don't know what freenet is - too many items in left navigation menu and not really well structured - documentation section where subsections do not have direct hyperlinks - its confusing Solutions we proposed: - simpler horizontal navigation bar with restructured tree - new menu tree proposition: Home -- what is freenet a bit modified page This page looks good IMHO. About freenet: what is freenet philosophy contributors Downloads: freenet tools Tools are unofficial and unsupported. Maybe download should be under home? Contribute: papers -- research and stuff developer I'm not convinced papers belong under Contribute. Donations donate sponsors Ok. But shouldn't both be under Contribute? Support feedback help --documentation and stuff faq --move out from help section mailing lists suggestions What about the wiki? Shouldn't it be on the same level as the uservoice tracker? How about: Home - Home - What is Freenet? - Download Freenet About: - Philosophy - Papers - People Contribute: - Developer page - Donate - Sponsors Help: - Docs - FAQ - Mailing lists - Suggestions - Wiki Too many links? Depends on the theme I guess... Actually, I'm not convinced we want to keep the documentation pages: - Install only applies to the java installer, needs some typo fixes and a new final screenshot, and some guidance on the post-install wizard. - Connect: needs updating but is basically acceptable. - Content: dunno, isn't this more About? but i'm not sure we want to move it there... - Understand: maybe keep - Freemail: probably keep, is sort of official - Frost: dunno, we don't ship it, and we don't review it, but at the moment we recommend it ... 
- jSite: keep - Thaw: see Frost - FAQ: should be at a higher level - Wiki: should be at a higher level ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl IMHO the wiki should be made more prominent, with a top level link. Is there any reason the following shouldn't be wiki pages? Current docs FAQ What is Freenet? Papers Philosophy People They'd have to be locked, or they'd get vandalised during a release. And vandalism here could be very nasty. Also, there might be performance issues, although if we have a third party hosting our wiki it might not be a problem. Locked might be overkill. Allowing edits by non-new accounts would probably work. OTOH, docs and FAQ would definitely make sense as wiki pages ... they would still need menu items on the main site... If we use mediawiki, can we get notifications by email when a page is changed? On wikipedia i think you have to login to see such notices? I don't know. I don't see
Re: [freenet-dev] About the website
On Mon, Jun 15, 2009 at 7:14 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: I have done the first phase of deploying this, after discussions with Ian. We use the new background and the new logo, but we waste a lot of space on the top line with the banner, and we don't use the horizontal menu yet as we need to implement the sub-menus. Also I have rewritten the What is Freenet? page with some input from Ian. Looking at the new version, it feels like it's targeted at an academic who is interested in the theory of anonymous networks. IMHO, it should be targeted at a potential new Freenet user. What they want to know is what they can do with it. The first sentence is a great introduction; it says that Freenet does something to let them communicate anonymously and without censorship. At that point, I think the obvious question for a potential user isn't "How does it manage that?" but "What sorts of communication?" In the current version, a new user has to get to the fourth paragraph before they get any hint about what they can do with it, rather than how it works. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] About the website
On Tue, Jun 16, 2009 at 1:52 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: On Tuesday 16 June 2009 03:18:47 Evan Daniel wrote: On Mon, Jun 15, 2009 at 7:14 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: I have done the first phase of deploying this, after discussions with Ian. We use the new background and the new logo, but we waste a lot of space on the top line with the banner, and we don't use the horizontal menu yet as we need to implement the sub-menus. Also I have rewritten the What is Freenet? page with some input from Ian. Looking at the new version, it feels like it's targetted to an academic who is interested in the theory of anonymous networks. IMHO, it should be targeted at a potential new Freenet user. What they want to know is what they can do with it. The first sentence is a great introduction; it says that Freenet does something to let them communicate anonymously and without censorship. At that point, I think the obvious question for a potential user isn't How does it manage that? but What sorts of communication? In the current version, a new user has to get to the fourth paragraph before they get any hint about what they can do with it, rather than how it works. Okay. The homepage now says: ' Freenet is free software which lets you anonymously share files, browse and publish web sites, and chat on forums, without fear of censorship. Users are anonymous, and Freenet is entirely decentralised. Without anonymity there can never be true freedom of speech, and without decentralisation the network would be vulnerable to attack. Learn more!' The What is Freenet? page now says: ' Freenet is free software which lets you anonymously share files, browse and publish web sites (freesites), and chat on forums, without fear of censorship. Users are anonymous, and Freenet is entirely decentralised. Without anonymity there can never be true freedom of speech, and without decentralisation the network would be vulnerable to attack. Communications by Freenet nodes are encrypted and are routed through other nodes to make it extremely difficult to determine who is requesting the information and what its content is. Users contribute to the network by giving bandwidth and a portion of their hard drive (called the data store) for storing files. Files are automatically kept or deleted depending on how popular they are, with the least popular being discarded to make way for newer or more popular content. Files are encrypted, so generally the user cannot easily discover what is in his datastore, and hopefully can't be held accountable for it. Chat forums, websites, and search functionality, are all built on top of this distributed data store. Freenet has been downloaded by over 2 million users since the project started, and used for the distribution of censored information all over the world including countries such as China and the Middle East. Ideas and concepts pioneered in Freenet have had a significant impact in the academic world. Our 2000 paper Freenet: A Distributed Anonymous Information Storage and Retrieval System was the most cited computer science paper of 2000 according to Citeseer, and Freenet has also inspired papers in the worlds of law and philosophy. Ian Clarke, Freenet's creator and project coordinator, was selected as one of the top 100 innovators of 2003 by MIT's Technology Review magazine. 
An important recent development, which very few other networks have, is the darknet: By only connecting to people they trust, users can greatly reduce their vulnerability, and yet still connect to a global network through their friends' friends' friends and so on. This enables people to use Freenet even in places where Freenet may be illegal, makes it very difficult for governments to block it, and does not rely on tunneling to the free world. Sounds good? Try it!' I think that's better. I might change "browse and publish web sites (freesites)" to something like "browse and publish freesites" (freesites are like web sites, but hosted on Freenet). The current version sounds somewhat like you could browse the normal web with Freenet. Given that some new users are familiar with Tor, they may be expecting something like it; best not to feed those assumptions. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] usability testing
On Tue, Jun 16, 2009 at 2:42 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: e) On the Download page: No idea what a node reference is. (Could be rephrased or explained better) That's why it's in quotes, and the Add a friend page does explain it. Do you have any suggestion as to how to improve the wording? Leave it the same, but make "node reference" a link to an explanation (perhaps even a wiki page...) instead of putting it in quotes? Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Good screenshots needed
On Tue, Jun 16, 2009 at 7:26 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: We need some (3?) screenshots. These must be legal, look reasonably good both in full and when thumbnailed to a reasonable size so we can put them on the homepage. Alternatively, please explain why it would be bad to replace the news on the homepage with a few screenshots. I would vote for a front page that had the title and (possibly) the first paragraph of the most recent news item, with a read more link. It should be small enough to not take up too much of the above-the-fold section. Screenshots below that would be good. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Variable opennet connections: moving forward
On Thu, Jun 18, 2009 at 8:00 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: Are you doing more testing? On Saturday 13 June 2009 19:05:36 Evan Daniel wrote: On Sat, Jun 13, 2009 at 1:08 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: Now that 0.7.5 has shipped, we can start making disruptive changes again in a few days. The number one item on freenet.uservoice.com has been for some time to allow more opennet peers for fast nodes. We have discussed this in the past, and the conclusions which I agree with and some others do: - This is feasible. - It will not seriously break routing. - Reducing the number of connections on slow nodes may actually be a gain in security, by increasing opportunities for coalescing. It will improve payload percentages, improve average transfer rates, let slow nodes accept more requests from each connection, and should improve overall performance. - The network should be less impacted by the speed of the slower nodes. - But we have tested using fewer connections on slow nodes in the past and had anecdotal evidence that it is slower. We need to evaluate it more rigorously somehow. - Increasing the number of peers allowed for fast opennet nodes, within reason, should not have a severe security impact. It should improve routing (by a smaller network diameter). It will of course allow fast nodes to contribute more to the network. We do need to be careful to avoid overreliance on ubernodes (hence an upper limit of maybe 50 peers). - Routing security: FOAF routing allows you to capture most of the traffic from a node already, the only thing stopping this is the 30%-to-one-peer limit. - Coalescing security: Increasing the number of peers without increasing the bandwidth usage does increase vulnerability to traffic analysis by doing less coalescing. On the other hand, this is not a problem if the bandwidth usage scales with the number of nodes. How can we move forward? We need some reliable test results on whether a 10KB/sec node is better off with 10 peers or with 20 peers. I think it's a fair assumption for faster nodes. Suggestions? I haven't tested at numbers that low. At 15KiB/s, the stats page suggests your slightly better off with 12-15 peers than 20. I saw no subjective difference in browsing speed either way. I'm happy to do some testing here, if you tell me what data you want me to collect. More testers would obviously be good. We also need to set some arbitrary parameters. There is an argument for linearity, to avoid penalising nodes with different bandwidth levels, but nodes with more peers and the same amount of bandwidth per peer are likely to be favoured by opennet anyway... Non-linearity, in the sense of having a lower threshold and an upper threshold and linearly add peers between them but not necessarily consistently with the lower threshold, would mean fewer nodes with lots of peers, and might achieve better results? E.g. 10 peers at 10KB/sec ... 20 peers at 20KB/sec (1 per KB/sec) 20 peers at 20KB/sec ... 50 peers at 80KB/sec (1 per 3KB/sec) I wouldn't go as low as 10 peers, simply because I haven't tested it. Other than that, those seem perfectly sensible to me. We should also watch for excessive cpu usage. If there's lots of bw available, we'd want to have just enough connections to not quite limit on available cpu power. Of course, I don't really know how many connections / how much bw it is before that becomes a concern. 
Evan Daniel
I'd been running the Spider, and trying to get a complete run out of it in order to provide a full set of bug reports. Unfortunately, after spidering over 100k keys (representing over a week of runtime), the .dbs file became unrecoverably corrupted, and it won't write index files. I had started rerunning it; I've since paused that and started taking data on connections. I've got a little data so far at 12KiB/s limit, 10 and 12 peers. Basically, I don't see a difference between 10 and 12 peers. Both produce reasonable performance numbers. My node has 2 darknet peers, remainder opennet. I'm not using the node much during these tests; it has a few MiB of downloads queued that aren't making progress (old files that have probably dropped off). Evan Daniel

12 peers, 12 KiB/s limit
# bwlimitDelayTime: 91ms
# nodeAveragePingTime: 408ms
# darknetSizeEstimateSession: 0 nodes
# opennetSizeEstimateSession: 63 nodes
# nodeUptime: 1h37m
# Connected: 10
# Backed off: 2
# Input Rate: 2.54 KiB/s (of 60.0 KiB/s)
# Output Rate: 12.9 KiB/s (of 12.0 KiB/s)
# Total Input: 31.3 MiB (5.5 KiB/s average)
# Total Output: 47.5 MiB (8.34 KiB/s average)
# Payload Output: 32.6 MiB (5.73 KiB/sec)(68%)
1469 Output bandwidth liability
18 SUB_MAX_PING_TIME

Success rates
Group           P(success)   Count
All requests    3.329%       10,633
CHKs            9.654%       3,377
SSKs            0.386%       7,256
Local requests  2.022
[freenet-dev] FCP questions
I'm trying to test out SSK inserts with FCP, and running into problems. First, the Disconnect command does not work:

Disconnect
EndMessage

ProtocolError
ExtraDescription=Unknown message name Disconnect
Fatal=false
CodeDescription=Don't know what to do with message
Code=7
EndMessage

Second, DDA isn't working:

TestDDARequest
Directory=/data/freenet/
WantReadDirectory=true
WantWriteDirectory=false
EndMessage

TestDDAReply
ReadFilename=/data/freenet/DDACheck-2801570529831049194.tmp
Directory=/data/freenet
EndMessage

TestDDAResponse
Directory=/data/freenet/
ReadContent=[stuff read from file indicated above]
EndMessage

TestDDAComplete
ReadDirectoryAllowed=false
Directory=/data/freenet
EndMessage

Third, if I didn't want to use DDA to insert my data, how would I do that? The documentation on ClientPut doesn't say how to actually send data in the direct mode. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] FCP questions
On Fri, Jun 19, 2009 at 3:50 AM, bo-lebo...@web.de wrote: On Friday, 19 June 2009 03:08:13, Evan Daniel wrote: I'm trying to test out SSK inserts with FCP, and running into problems. First, the Disconnect command does not work:

Disconnect
EndMessage

Where did you get this info? I have never seen this. Just close the socket. The node will detect it ;) From http://wiki.freenetproject.org/FCP2p0Disconnect Is there more accurate / up to date FCP documentation somewhere? I'd been testing with telnet, and I'd rather type Disconnect than fuss with escape sequences, but it's not exactly a huge problem.

ProtocolError
ExtraDescription=Unknown message name Disconnect
Fatal=false
CodeDescription=Don't know what to do with message
Code=7
EndMessage

Second, DDA isn't working:

TestDDARequest
Directory=/data/freenet/
WantReadDirectory=true
WantWriteDirectory=false
EndMessage

TestDDAReply
ReadFilename=/data/freenet/DDACheck-2801570529831049194.tmp
Directory=/data/freenet
EndMessage

TestDDAResponse
Directory=/data/freenet/
ReadContent=[stuff read from file indicated above]
EndMessage

TestDDAComplete
ReadDirectoryAllowed=false
Directory=/data/freenet
EndMessage

Third, if I didn't want to use DDA to insert my data, how would I do that? The documentation on ClientPut doesn't say how to actually send data in the direct mode. sample command:

ClientPut
Persistence=connection
Verbosity=-1 //be my wife err be max verbose.
UploadFrom=direct
PriorityClass=1
DataLength=42
Identifier=jHyperocha1245397350876
TargetFilename= // missing this field means auto, an empty name means off
Global=false
uri=...@blah,blub,AQECAAE/singlechunktest
Data
42 bytes of data

Thanks, that works! Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
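For what it's worth, a rough sketch of doing the same direct-mode insert from code rather than telnet, mirroring the sample message above. The client name, identifier, and insert URI (CHK@, i.e. let the node compute the key) are placeholders based on my reading of FCP 2.0, error handling is omitted, and a real client would read and parse the node's replies (NodeHello, URIGenerated, PutSuccessful, ...) from the input stream.

import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Sketch only: minimal direct-mode ClientPut over FCP.
public class FcpDirectPut {
    public static void main(String[] args) throws Exception {
        byte[] data = "Hello from FCP".getBytes(StandardCharsets.UTF_8);
        try (Socket sock = new Socket("127.0.0.1", 9481);   // default FCP port
             OutputStream out = sock.getOutputStream()) {
            // Every FCP connection starts with ClientHello.
            write(out, "ClientHello\nName=fcp-put-example\nExpectedVersion=2.0\nEndMessage\n");
            // Direct-mode insert: DataLength bytes of payload follow the Data line.
            write(out, "ClientPut\nURI=CHK@\nIdentifier=example-1\nVerbosity=0\n"
                    + "UploadFrom=direct\nDataLength=" + data.length + "\nData\n");
            out.write(data);
            out.flush();
            // A real client would now parse replies from sock.getInputStream().
        }
    }

    private static void write(OutputStream out, String msg) throws Exception {
        out.write(msg.getBytes(StandardCharsets.UTF_8));
    }
}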
Re: [freenet-dev] Can we implement Bloom filter sharing quickly???
On Fri, Jun 19, 2009 at 1:36 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: On Friday 19 June 2009 17:37:00 Robert Hailey wrote: On Jun 18, 2009, at 9:07 PM, Matthew Toseland wrote: CHUNK SIZE PROBLEM: === Current plans call for splitting the keyspace up for each datastore, and assigning keys to a manageably sized Bloom filter for a section of the (hashed) keyspace. These can then be transferred separately, are useful immediately after being transferred, can be juggled in memory, etc. However, we cannot guarantee that the population of such a chunk will be small enough for the filter to be effective. Solving this requires either moving the boundaries dynamically (possibly continually), or making the chunks significantly larger than they would need to be in an ideal world. Right... but I believe we prescribed that the split would be based on a hashed value of the key and not by logical keyspace location to avoid disproportionate chunks. That is to say... ideally a node is going to get a disproportionate amount of cache/store data about its network location.

STORE_SIZE/MAX_BLOOM_CHUNK -> N_CHUNKS
H(key, N_CHUNKS) = n (0 <= n < N)
CHUNK[n].add(key)

Wouldn't the problem be reduced to finding a well-scattering hash function then? Yes and no. How much variation will we have even if we divide by hashed keyspace? Hence how much bigger than the ideal splitting size do we need each chunk to be to maintain approximately the right false positive ratio? If the hash spreads things well, the number of keys in a single Bloom filter should be normally distributed with mean ((total keys) / (number of filters)) and standard deviation sqrt((total keys) * (1 / (number of filters)) * (1 - 1 / (number of filters))). (It's a binomial distribution; any given key has a 1/(number of filters) chance of landing in a specific filter.) For a 100GiB total store, we have 3E6 keys between store and cache (for CHKs, same for SSKs). That implies 17MiB of Bloom filters. If we size the filters at 1MiB each, with 17 filters, we have 176k keys per filter on average. From the preceding formula, the standard deviation is 408 keys. Size variation is only a serious concern if the hash function is not distributing the keys at random. To be safe, we could slightly underfill the filters. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
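A quick numeric check of the binomial estimate above, using the same figures from the mail (3E6 keys spread over 17 filters):

// Reproduces the ~176k mean and ~408 standard deviation quoted above.
public class FilterBalance {
    public static void main(String[] args) {
        double totalKeys = 3e6;   // ~3E6 CHKs in store+cache for a 100GiB store
        double filters = 17;
        double p = 1.0 / filters;
        double mean = totalKeys * p;                      // ~176k keys per filter
        double sd = Math.sqrt(totalKeys * p * (1 - p));   // ~408 keys
        System.out.printf("mean=%.0f keys/filter, sd=%.0f keys%n", mean, sd);
    }
}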
Re: [freenet-dev] Can we implement Bloom filter sharing quickly???
On Fri, Jun 19, 2009 at 9:15 PM, Juicemanjuicema...@gmail.com wrote: On Fri, Jun 19, 2009 at 1:50 PM, Evan Danieleva...@gmail.com wrote: On Fri, Jun 19, 2009 at 1:36 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: On Friday 19 June 2009 17:37:00 Robert Hailey wrote: On Jun 18, 2009, at 9:07 PM, Matthew Toseland wrote: CHUNK SIZE PROBLEM: === Current plans call to split the keyspace up for each datastore, and assign keys to a manageable sized bloom filter for a section of the (hashed) keyspace. These can then be transferred separately, are useful immediately after being transferred, can be juggled in memory, etc. However, we cannot guarantee that the populaion of such a chunk will be small enough for the filter to be effective. Solving this requires either moving the boundaries dynamically (possibly continually), or making the chunks significantly larger than they would need to be in an ideal world. Right... but I believe we prescribed that the split would be based on a hashed value of the key and not by logical keyspace location to avoid disproportionate chunks. That is to say... ideally a node is going to get a disporportionate amount of cache/store data about it's network location. STORE_SIZE/MAX_BLOOM_CHUNK - N_CHUNKS H(key, N_CHUNKS) = n (0 n N) CHUNK[n].add(key) Wouldn't the problem be reduced to finding a well-scattering hash function then? Yes and no. How much variation will we have even if we divide by hashed keyspace? Hence how much bigger than the ideal splitting size do we need each chunk to be to maintain approximately the right false positives ratio? If the hash spreads things well, number of keys in a single bloom filter should be normally distributed with mean ((total keys) / (number of filters)) and standard deviation sqrt((total keys) * (1 / (number of filters)) * (1 - 1 / (number of filters))). (It's a binomial distribution; any given key has a 1/ number of filters chance of landing in a specific filter.) For a 100GiB total store, we have 3E6 keys between store and cache (for CHKs, same for SSKs). That implies 17MiB of bloom filters. If we size the filters at 1MiB each, with 17 filters, we have 176k keys per filter on average. From the preceding formula, standard deviation is 408 keys. Size variation is only a serious concern if the hash function is not distributing the keys at random. To be safe, we could slightly underfill the filters. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl Perhaps this is what you are discussing, but if (generally speaking) requests are supposed to route to the neighboring node with the closest location to hopefully get closer each time, would it make sense to limit the bloom filter we share to a section of the keyspace we are supposed to specialize in? For example, say my node's location is 0.5, I have neighbors expecting me to have data close to 0.5. Do they care if I have a key for the 0.9 area of the keyspace? They should be asking checking the bloom filter of their top n choices of neighbors for that keyspace for a direct hit. If not, chuck it towards toward the closest one and move on. The next node in the chain will do the same hopefully closer each time... Might we instead send the bloom filter of x portion of our datastore, x being either a fixed number of keys near our specialization or a percent of our datastore in each direction. 
Even if we just send the closest 25%-50%, it should provide an improvement in speed, at a cost/benefit/risk tradeoff that would be easier to accept than full Bloom filter sharing... This would reduce the amount of Bloom filter data needing to be shared while limiting how much we expose about the inserts we have received. Several thoughts in no particular order... I see two basic ways to do it. Split fairly with a hash function, or split on actual keyspace. If splitting on actual keyspace, the obvious way to do it is to add keys to a filter until it gets overly full, then split it in half and rescan the datastore. That results in filters that are (on average) 3/4 full; the hash approach results in filters that are almost completely full. The memory savings aren't huge, but the disk I/O difference might be. Splitting on normal keyspace means we can tell our neighbors about selected portions of the keyspace. That may improve normal routing. Telling them about the wrong portion of the keyspace might make old keys in the wrong spot, which would otherwise seem to have fallen out of the network, more available. Which is better is not clear to me, though I suspect sending info about the keyspace near the node is. (Though should we send info about the keyspace near us, or near the node we're sending to? They're usually but not always similar.) Clearly, the ideal case is that we send all of the bloom filters
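A sketch of the hash-based assignment being contrasted above: each key goes to one of N filters based on a hash, so the filters fill evenly no matter where in the keyspace the node's store specialises. SHA-256 is used purely as an example of a well-scattering hash; the real code might well use something cheaper or salted differently.

import java.security.MessageDigest;

// Illustrative hash-based chunk assignment: chunk index = H(key) mod N.
public class ChunkAssignment {
    public static int chunkFor(byte[] routingKey, int nChunks) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] h = md.digest(routingKey);
        // Use the first 4 bytes of the digest as an unsigned value mod nChunks.
        long v = ((h[0] & 0xffL) << 24) | ((h[1] & 0xffL) << 16)
               | ((h[2] & 0xffL) << 8) | (h[3] & 0xffL);
        return (int) (v % nChunks);
    }
}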
Re: [freenet-dev] New website design
On Thu, Jul 16, 2009 at 7:02 AM, Luke771luke771.li...@gmail.com wrote: Colin Davis wrote: * If the user is on Windows (detectable using http://www.quirksmode.org/js/detect.html), we should link the Download button directly to http://freenet.googlecode.com/files/FreenetInstaller-1222.exe No. A lot of users have a Windows desktop and a *nix server, and would run Freenet on the unix server. I say make one download page for everyone, or possibly make different download pages according to the detected OS, but all of them including all the versions, changing the order (detected OS at the top). Users who want a download that doesn't match OS detection tend to know what they're doing. I suggest a large button that links directly to the autodetected download, and says which OS it is for. Then, below that, a link to the full download page. For example: http://www.mozilla.com/en-US/firefox/firefox.html Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Installer file name
On Wed, Jul 22, 2009 at 10:41 AM, bren...@artvote.com wrote: Message: 4 Date: Tue, 21 Jul 2009 12:57:28 -0500 From: Ian Clarke i...@locut.us Subject: Re: [freenet-dev] Installer file name To: Discussion of development issues devl@freenetproject.org Message-ID: 823242bd0907211057s4e91e920m35ba1df107081...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 Brendan, The 4 digit number would, in most products, be an internal build number. The problem with Freenet is that we do extremely frequent releases, and to distinguish between them requires a rather large number :-) We do also do infrequent major releases, like 0.7.5 - however in practice there are many many intermediate releases. I'm certainly wide open to ideas about an alternative nomenclature. Ian. -- Just curious, whom exactly does the 4 digit number benefit? Do users care about this number? And if so why? (Sorry if these are dumb questions. Just trying to wrap my head around the issue:) -Brendan It's the build number. Basically a different style of version number. There are frequent updates that increment the build number, and occasional updates that change the version number (eg 0.7.5 - 0.8.0). Version numbers can be thought of as just a name for a specific build. Users care because Freenet does mandatory updates relatively frequently -- nodes won't talk to any node older than the most recent mandatory update. So if you install an outdated build, it won't be able to connect properly (the update over mandatory code should let it update itself and then start working, but it's less than ideal). Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
[freenet-dev] Experiences writing a plugin
I've been slowly working on writing a plugin for Freenet over the past couple weeks. (Details aren't particularly relevant to what follows; for the curious, it's a microblogging / chat application aiming at keeping latencies as low as possible. RFC available at u...@cf9ctasza8w2jafeqmiln49tfrpdz2q5m68m1m5r9w0,NQiPGX7tNcaXVRXljGJnFlKhnf0eozNQsb~NwmBAJ4k,AQACAAE/Fritter-site/1/ ) Having not written much actual Freenet code before, I'm learning a lot about how Freenet works in the process -- which is harder than it has any reason to be. Why? NOTHING IS DOCUMENTED. For example, after retrieving some text from Freenet, my plugin would like to display it on a web page, including filtering it so that it doesn't break anything, even in the case of malicious input. The method HTMLEncoder.encode() sounds like it ought to do that. Let's take a look at the Javadoc: encode public static java.lang.String encode(java.lang.String s) That's it. Nothing about what the method is supposed to do. So, after hunting through several source files (when reading the Javadoc should have sufficed), I have a fairly good sense of what the method does. I'm pretty sure it doesn't do precisely what I'm looking for (though the HTML specs are complicated enough, and I don't know them well enough, that I'm not certain either way). Neither the Javadoc nor inline comments reference any standard that it is trying to conform to. If there was a contract specified, I could fairly easily determine whether that contract matched what I needed -- and therefore whether I should be writing my own function or submitting a patch (or whether I'm misreading the relevant specs, for that matter). If this were an isolated incident, it wouldn't matter much. It isn't. It is the norm for Freenet. For a platform whose primary impediment to wider adoption (IMO, of course) is a lack of things to do with it, rather than a lack of underlying functionality, this is a problem. I haven't tracked it, but I wouldn't be surprised if I've spent nearly as much time trying to figure out how the plugin API works (or even which classes it consists of) as I have actually writing code. In case I haven't made my point yet, here are a few questions I've had. Can anyone point me to documentation that answers them (Javadoc or wiki)? I've spent some time looking, and I haven't found it. Most (but not all) I've answered for myself by reading lots of Freenet code -- a vastly slower process. Some of them I believe represent bugs. How do I make a request keep retrying forever? How do I determine whether a ClientGetter represents an active request? Are there any circumstances under which ClientGetCallback.onSuccess() will be called more than once, and do I need to handle them? Why doesn't ClientGetCallback.onFetchable() get called (more than a trivial time) before onSuccess()? How do I guarantee that an insert won't generate a redirect? What, if anything, should I be doing with this ObjectContainer container that gets passed around everywhere? Under what circumstances will I see a FetchException or InsertException thrown when using the HighLevelSimpleClient? Why does the Hello World plugin break if I turn it into a FredPluginHTTP? None of these seem like overly complex or unexpected questions for someone trying to write a plugin for Freenet. They should all be answerable by reading documentation. 
On a closely related note, here are a few related issues -- imho they need fixes in the code, but proper documentation of the actual behavior would have left me far less confused when debugging: FreenetURI.setDocName(String) doesn't. Creating a ClientSSK from a FreenetURI and then attempting to insert a file to it throws an exception about wrong extra bytes even though the extra bytes are unchanged from the FreenetURI the node generated for me. At this point, I think I have a much better understanding of why Freenet has so little software that makes use of it, despite the fact that Freenet itself seems to work fairly well. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Experiences writing a plugin
On Thu, Jul 23, 2009 at 2:09 AM, David ‘Bombe’ Rodenbo...@pterodactylus.net wrote: On Thursday 23 July 2009 01:12:15 Evan Daniel wrote: The method HTMLEncoder.encode() sounds like it ought to do that. Let's take a look at the Javadoc: encode public static java.lang.String encode(java.lang.String s) I guess I'm the only one having to take the blame for that. HTMLNode and consorts have been committed by me. And though in a local version all of the HTML* classes do have javadoc comments, those seem to have been created after I committed them to Freenet. Since the time I wrote those classes I have gone over to javadoc commenting everything I write as I write it. As you have already noticed, documentation is not a strong side of developers, so even if I had committed those files with documentation, that documentation would now be largely out of date and might even be wrong. That would be about as helpful as having no documentation at all. :) I'm aware that the HTMLEncoder class was merely an example that you used to demonstrate what's wrong with the code base in general. Unfortunately there's not really a way to force developers to write (good!) documentation for the stuff that they do. I have configured my Eclipse to perform javadoc validation on everything, including checking for malformed comments and the like. And I've grown opposed to files that show any warning in Eclipse, so at least the code I'm committing in the future will have meaningful javadoc comments. I'm urging everyone to do the same, but, as I already said, you can't enforce it. Well, if you're sufficiently motivated, you can enforce it -- simply don't apply patches that don't include sufficient documentation. That's harsh enough that I'm not sure it's worth it, though. And yes, the HTMLEncoder was merely the most recent example I'd been frustrated with; it's no better or worse than most of the rest of the code I've looked at. If you'd commit your comments on HTMLEncoder, I'd appreciate it. Specifically, I'm trying to figure out whether it's supposed to take well-formed, non-malicious strings and format them so they display properly in HTML, or whether it should also be filtering malicious strings so that they don't screw up the page. Not being an HTML expert, I'm uncertain whether it does the latter. It worries me that it doesn't filter the ASCII control characters. The best citation I can find on that says that aside from tab, CR, and LF, they shouldn't appear in HTML documents: http://www.w3.org/MarkUp/html3/specialchars.html However, that's for HTML 3. HTML 4 specifies everything in reference to character encodings, and I'm having trouble answering the question for it. Perhaps someone who knows HTML better than I do could chime in? My personal habit in documenting involves more comments than most of Freenet has. It tends towards big blocks of text rather than Javadoc since I rarely run the javadoc engine against my own code -- my projects tend to be small enough that I usually have all the relevant files open in gvim anyway. I think credit for that goes to my Assembly professor, who wouldn't even bother grading things that didn't have sufficient comments. Of course, I'm not particularly good about keeping the comments up to date, which is sometimes better and occasionally worse than no comments. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
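To make the contract being asked about concrete, here is a sketch of the behaviour described above: escape the HTML-significant characters and drop ASCII control characters other than tab, CR and LF. This is not what HTMLEncoder currently promises, just an illustration, and the class name is made up.

// Sketch of the desired contract, not the actual HTMLEncoder.
public class SafeHtmlText {
    public static String encode(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Drop control characters that have no place in HTML text.
            if ((c < 0x20 && c != '\t' && c != '\r' && c != '\n') || c == 0x7f) continue;
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }
}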
Re: [freenet-dev] Experiences writing a plugin
On Sat, Jul 25, 2009 at 12:27 PM, Zero3ze...@zerosplayground.dk wrote: Evan Daniel wrote: Having not written much actual Freenet code before, I'm learning a lot about how Freenet works in the process -- which is harder than it has any reason to be. Why? NOTHING IS DOCUMENTED. [snip] If this were an isolated incident, it wouldn't matter much. It isn't. It is the norm for Freenet. For a platform whose primary impediment to wider adoption (IMO, of course) is a lack of things to do with it, rather than a lack of underlying functionality, this is a problem. I haven't tracked it, but I wouldn't be surprised if I've spent nearly as much time trying to figure out how the plugin API works (or even which classes it consists of) as I have actually writing code. [snip] At this point, I think I have a much better understanding of why Freenet has so little software that makes use of it, despite the fact that Freenet itself seems to work fairly well. I completely agree. I've been pulling my hair out over similar issues before as well. The closest thing to documentation for me was the Wiki and/or simply asking toad about what I needed to know. That has been my strategy as well. I tend to think it's a bad use of toad's time to answer questions that could be answered by documentation, and a bad use of my time to wait until he's available to get answers. I guess the fact that the Freenet core is ever-changing has a lot to do with it. If the documentation were merely out of date, I would agree. However, it's not out of date, it's nonexistent. Also, the main APIs have been stable enough for long enough that I don't think this is an excuse any longer, especially for parts like plugins and FCP that are expected to be used by outside programs (as opposed to FNP, etc). Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Experiences writing a plugin
On Sat, Jul 25, 2009 at 3:53 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: The basic answer is that plugins should be written to a defined, fixed, sandboxed, external API. But there is no such API. We will need to implement one eventually. But this takes considerable time, which could be spent on things which give a more direct benefit to end users in terms of usability, performance or security. My general complaint is not that the plugin API isn't well defined and perfect, though that would be nice. It's that what does exist isn't documented. My current take on coding for Freenet is that the API isn't perfect, but that I could work with it if I knew what it did. However, figuring out what it does requires reading significant quantities of Freenet source code as often as not. On Thursday 23 July 2009 00:12:15 Evan Daniel wrote: I've been slowly working on writing a plugin for Freenet over the past couple of weeks. (Details aren't particularly relevant to what follows; for the curious, it's a microblogging / chat application aiming at keeping latencies as low as possible. RFC available at u...@cf9ctasza8w2jafeqmiln49tfrpdz2q5m68m1m5r9w0,NQiPGX7tNcaXVRXljGJnFlKhnf0eozNQsb~NwmBAJ4k,AQACAAE/Fritter-site/1/ ) Having not written much actual Freenet code before, I'm learning a lot about how Freenet works in the process -- which is harder than it has any reason to be. Why? NOTHING IS DOCUMENTED. For example, after retrieving some text from Freenet, my plugin would like to display it on a web page, including filtering it so that it doesn't break anything, even in the case of malicious input. The method HTMLEncoder.encode() sounds like it ought to do that. Let's take a look at the Javadoc: encode public static java.lang.String encode(java.lang.String s) That's it. Nothing about what the method is supposed to do. So, after hunting through several source files (when reading the Javadoc should have sufficed), I have a fairly good sense of what the method does. I'm pretty sure it doesn't do precisely what I'm looking for (though the HTML specs are complicated enough, and I don't know them well enough, that I'm not certain either way). Neither the Javadoc nor inline comments reference any standard that it is trying to conform to. If there were a contract specified, I could fairly easily determine whether that contract matched what I needed -- and therefore whether I should be writing my own function or submitting a patch (or whether I'm misreading the relevant specs, for that matter). HTMLEncoder is supposed to encode text so that it can be safely inserted into HTML, as its name suggests. It is used widely and hopefully it is sufficient. See https://bugs.freenetproject.org/view.php?id=3335 If this were an isolated incident, it wouldn't matter much. It isn't. It is the norm for Freenet. For a platform whose primary impediment to wider adoption (IMO, of course) is a lack of things to do with it, rather than a lack of underlying functionality, this is a problem. I haven't tracked it, but I wouldn't be surprised if I've spent nearly as much time trying to figure out how the plugin API works (or even which classes it consists of) as I have actually writing code. That is an interesting theory. I guess since we have a load of money from Google it would be possible to spend some time on the plugin API??? I think it would be *very* wise to spend time on the Javadoc for what currently exists, and on making assorted small improvements. (For example: InsertContext should be cloneable, like FetchContext.) 
As far as writing a better API, or sandboxing, I'm not as certain. The current one is probably workable, and there are lots of important things that need doing. In case I haven't made my point yet, here are a few questions I've had. Can anyone point me to documentation that answers them (Javadoc or wiki)? I've spent some time looking, and I haven't found it. Most (but not all) I've answered for myself by reading lots of Freenet code -- a vastly slower process. Some of them I believe represent bugs. How do I make a request keep retrying forever? Set maxRetries to -1 Yeah, I figured that out eventually -- by asking in IRC, I think. But it isn't documented. It would take all of one sentence in the Javadoc to make it easy for me or any other plugin author to find, and that's a lot easier than answering the question every time someone has it. How do I determine whether a ClientGetter represents an active request? Dunno. https://bugs.freenetproject.org/view.php?id=3336 Are there any circumstances under which ClientGetCallback.onSuccess() will be called more than once, and do I need to handle them? Shouldn't be, if there are it's a bug. It happens on every fetch, afaict. https://bugs.freenetproject.org/view.php?id=3331 Why doesn't ClientGetCallback.onFetchable() get called (more than a trivial time) before onSuccess
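For the archive, the retry-forever answer above translates into something like the following. This is a sketch from memory of the fred client API circa 2009; the class, method, and field names (HighLevelSimpleClient.getFetchContext(), FetchContext.maxNonSplitfileRetries, FetchContext.maxSplitfileBlockRetries) are assumptions that should be checked against the source, since the Javadoc won't tell you.

    import freenet.client.FetchContext;
    import freenet.client.HighLevelSimpleClient;

    // Sketch only: configure fetches to retry forever (-1, as mentioned above).
    // Names are from memory of the fred tree and may differ from the real API.
    public class RetryForeverExample {
        public static FetchContext retryForever(HighLevelSimpleClient hlsc) {
            FetchContext fctx = hlsc.getFetchContext();
            fctx.maxNonSplitfileRetries = -1;   // -1 == retry forever for top-level blocks
            fctx.maxSplitfileBlockRetries = -1; // -1 == retry forever for splitfile blocks
            return fctx;
        }
    }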
Re: [freenet-dev] NedaNet evaluation of Iran circumvention technologies on Freenet
On Mon, Jul 27, 2009 at 9:18 AM, Matthew Toselandt...@amphibian.dyndns.org wrote:
- Easy insert of freesites. We need a good freesite insertion wizard, a plugin as part of the base install. Maybe another attempt at an easy blogging wizard too. [...]
- Better, integrated, attack-resistant, easy-to-use, fast chat. Negative trust issues may or may not be important, depending on the deployment, likely not a problem in the short term anyway. Embedded chat forums would probably work well...
- Microblogging.
(I'll reply to the rest in detail later.) For these three things, I think the biggest thing you could do would be documentation of what currently exists of the plugins API. Obviously I only speak for myself, and of the possible such applications mine is the least developed, but that's what I'd find helpful. I've considered taking what I've learned and turning it into a piece of example code that's very simple, but demonstrates a bit more than the HelloWorld plugin. For example, it would demonstrate basic plugin loading, and a little of how to use the HLSC. If you think that would be useful, I can probably put such a thing together today (depending on how much real life interferes). The second thing would be to address https://bugs.freenetproject.org/view.php?id=3338 Right now I see latencies in testing of 15-40 seconds, with 30 seconds being typical. I suspect it could be much better. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
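Along the lines of the example code mentioned above, a bare-bones plugin that loads and does one HLSC fetch might look roughly like this. It is an untested sketch: the interfaces and method names (FredPlugin, PluginRespirator.getHLSimpleClient(), HighLevelSimpleClient.fetch(), FetchResult.asByteArray()) are from memory of the 2009-era fred tree and should be verified against the source before relying on them.

    import freenet.client.FetchException;
    import freenet.client.FetchResult;
    import freenet.client.HighLevelSimpleClient;
    import freenet.keys.FreenetURI;
    import freenet.pluginmanager.FredPlugin;
    import freenet.pluginmanager.PluginRespirator;

    // Sketch: a minimal plugin that fetches one key synchronously when it starts.
    // A real plugin would do the fetch on its own thread and handle shutdown cleanly.
    public class TinyFetchPlugin implements FredPlugin {
        private volatile boolean stopping = false;

        public void runPlugin(PluginRespirator pr) {
            HighLevelSimpleClient hlsc = pr.getHLSimpleClient();
            try {
                FreenetURI uri = new FreenetURI("KSK@gpl.txt"); // any key; a KSK keeps the example short
                FetchResult result = hlsc.fetch(uri);           // blocks until success or failure
                byte[] data = result.asByteArray();
                System.err.println("TinyFetchPlugin: fetched " + data.length + " bytes");
            } catch (FetchException e) {
                System.err.println("TinyFetchPlugin: fetch failed: " + e);
            } catch (Exception e) {
                System.err.println("TinyFetchPlugin: unexpected error: " + e);
            }
        }

        public void terminate() {
            stopping = true; // nothing long-running to stop in this toy example
        }
    }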
[freenet-dev] RSKs (was Re: NedaNet evaluation of Iran circumvention technologies on Freenet)
in a correctly-formatted manner. - The ability to verify a binary blob and read the data it contains. This would require also knowing the key it represents (AIUI, the blob normally contains the routing key but not the decryption key, just like the datastore). If Alice intends to revoke Bob's site, she probably wants to first check that the blob she is about to insert will do that (inserting the revoke for the wrong site would be bad). When she receives the revoke blob from Bob, she probably also wants to check that the revoke message says what Bob said it did. This verification step needs to not actually insert the blob. - And, obviously, the ability to actually insert a blob that the user has. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Variable opennet connections: moving forward
On Thu, Jun 18, 2009 at 8:50 PM, Evan Danieleva...@gmail.com wrote: On Thu, Jun 18, 2009 at 8:00 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: Are you doing more testing? On Saturday 13 June 2009 19:05:36 Evan Daniel wrote: On Sat, Jun 13, 2009 at 1:08 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: Now that 0.7.5 has shipped, we can start making disruptive changes again in a few days. The number one item on freenet.uservoice.com has been for some time to allow more opennet peers for fast nodes. We have discussed this in the past, and the conclusions (which I agree with, as do some others) are:
- This is feasible.
- It will not seriously break routing.
- Reducing the number of connections on slow nodes may actually be a gain in security, by increasing opportunities for coalescing. It will improve payload percentages, improve average transfer rates, let slow nodes accept more requests from each connection, and should improve overall performance.
- The network should be less impacted by the speed of the slower nodes.
- But we have tested using fewer connections on slow nodes in the past and had anecdotal evidence that it is slower. We need to evaluate it more rigorously somehow.
- Increasing the number of peers allowed for fast opennet nodes, within reason, should not have a severe security impact. It should improve routing (by a smaller network diameter). It will of course allow fast nodes to contribute more to the network. We do need to be careful to avoid overreliance on ubernodes (hence an upper limit of maybe 50 peers).
- Routing security: FOAF routing allows you to capture most of the traffic from a node already; the only thing stopping this is the 30%-to-one-peer limit.
- Coalescing security: Increasing the number of peers without increasing the bandwidth usage does increase vulnerability to traffic analysis by doing less coalescing. On the other hand, this is not a problem if the bandwidth usage scales with the number of nodes.
How can we move forward? We need some reliable test results on whether a 10KB/sec node is better off with 10 peers or with 20 peers. I think it's a fair assumption for faster nodes. Suggestions? I haven't tested at numbers that low. At 15KiB/s, the stats page suggests you're slightly better off with 12-15 peers than 20. I saw no subjective difference in browsing speed either way. I'm happy to do some testing here, if you tell me what data you want me to collect. More testers would obviously be good. We also need to set some arbitrary parameters. There is an argument for linearity, to avoid penalising nodes with different bandwidth levels, but nodes with more peers and the same amount of bandwidth per peer are likely to be favoured by opennet anyway... Non-linearity, in the sense of having a lower threshold and an upper threshold and linearly adding peers between them (but not necessarily consistently with the lower threshold), would mean fewer nodes with lots of peers, and might achieve better results? E.g. 10 peers at 10KB/sec ... 20 peers at 20KB/sec (1 per KB/sec); 20 peers at 20KB/sec ... 50 peers at 80KB/sec (1 per 3KB/sec) I wouldn't go as low as 10 peers, simply because I haven't tested it. Other than that, those seem perfectly sensible to me. We should also watch for excessive CPU usage. If there's lots of bandwidth available, we'd want just enough connections to not quite be limited by the available CPU power. Of course, I don't really know how many connections / how much bandwidth it is before that becomes a concern. 
Evan Daniel

I'd been running the Spider, and trying to get a complete run out of it in order to provide a full set of bug reports. Unfortunately, after spidering over 100k keys (representing over a week of runtime), the .dbs file became unrecoverably corrupted, and it won't write index files. I had started rerunning it; I've since paused that and started taking data on connections. I've got a little data so far at 12KiB/s limit, 10 and 12 peers. Basically, I don't see a difference between 10 and 12 peers. Both produce reasonable performance numbers. My node has 2 darknet peers, remainder opennet. I'm not using the node much during these tests; it has a few MiB of downloads queued that aren't making progress (old files that have probably dropped off).

Evan Daniel

12 peers, 12 KiB/s limit
# bwlimitDelayTime: 91ms
# nodeAveragePingTime: 408ms
# darknetSizeEstimateSession: 0 nodes
# opennetSizeEstimateSession: 63 nodes
# nodeUptime: 1h37m
# Connected: 10
# Backed off: 2
# Input Rate: 2.54 KiB/s (of 60.0 KiB/s)
# Output Rate: 12.9 KiB/s (of 12.0 KiB/s)
# Total Input: 31.3 MiB (5.5 KiB/s average)
# Total Output: 47.5 MiB (8.34 KiB/s average)
# Payload Output: 32.6 MiB (5.73 KiB/sec)(68%)
1469 Output bandwidth liability
18 SUB_MAX_PING_TIME
Success rates
Group P(success
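To make the scaling toad proposes above concrete, the piecewise-linear mapping from output bandwidth to peer count (10 peers at 10 KB/s, 1 peer per KB/s up to 20 peers at 20 KB/s, then up to 50 peers at 80 KB/s) works out to something like the sketch below. This is just an illustration of the numbers in this thread, not code from fred; note the two stated endpoints of the upper segment imply 1 peer per 2 KB/s rather than the "1 per 3KB/sec" quoted.

    // Illustration of the peer-count scaling discussed above; not code from fred.
    // Below 10 KB/s: 10 peers. 10-20 KB/s: 1 peer per KB/s. 20-80 KB/s: interpolate
    // linearly up to 50 peers (the endpoints work out to 1 peer per 2 KB/s, though
    // the message above says "1 per 3KB/sec"). Above 80 KB/s: capped at 50 peers.
    public final class PeerScalingSketch {
        public static int peersForBandwidth(int outputKBps) {
            if (outputKBps <= 10) return 10;
            if (outputKBps <= 20) return outputKBps;          // 1 peer per KB/s
            if (outputKBps >= 80) return 50;                  // upper cap against ubernodes
            return 20 + (int) Math.round((outputKBps - 20) * 30.0 / 60.0);
        }

        public static void main(String[] args) {
            int[] samples = { 8, 10, 15, 20, 32, 50, 80, 120 };
            for (int kbps : samples)
                System.out.println(kbps + " KB/s -> " + peersForBandwidth(kbps) + " peers");
        }
    }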
Re: [freenet-dev] status
On Tue, Jul 28, 2009 at 3:26 PM, Florent Daignièrenextg...@freenetproject.org wrote: * Matthew Toseland t...@amphibian.dyndns.org [2009-07-28 20:03:42]: So we need to deal with the bug tracker. But it will take several days' work (= approximately the cost to both FPI and Ian of emu for 1.5 months). Is this really a high priority? IMHO losing our existing bugs database will cost significant work in the medium to long term, hence the need to migrate data... Last time people talked about it, only you and xor were objecting to trashing the bug database altogether and switching to another bug tracker (possibly hosted by someone else). As someone who has submitted a number of both feature requests and bugs to the database, I would be right annoyed if they were simply dropped without any plan to keep track of and address them. That said, I have no particular attachment to the current software, and no objection to changing to something else if it would improve things. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Freenet blog plugin?
On Thu, Jul 30, 2009 at 1:08 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: On Thursday 30 July 2009 14:52:49 Zero3 wrote: Matthew Toseland wrote: It has been pointed out that a minimal blog engine can be written in approx 22KB of PHP - around 800 lines of code at most. I suspect that given a template I could probably put together a blog plugin in a few days. This would integrate with Freetalk for comments and for announcing the site initially. It should make it easier to contribute content to Freenet, eliminating the need to get Thingamablog working, etc. Thoughts? IMHO it would be a better idea to have someone less experienced with Freenet development work on and maintain such a plugin (with your mentoring). A project like this seems like a great opportunity to get a new developer into working with Freenet. Given that such a person is available, of course. The problem is we had at least one try at that in the past and it failed... and there may be significant benefits to having such functionality sooner rather than later. My personal opinion is that your time would be better invested in documenting the plugins interface. I think it would be far easier to find another developer willing and able to write such a plugin if the interface were actually documented. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Serious near-practical AES attack, consequences for Freenet
On Fri, Jul 31, 2009 at 2:07 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: http://www.schneier.com/blog/archives/2009/07/another_new_aes.html Practical related-key/related-subkey attacks on AES with a 256-bit key with 9, 10 and 11 rounds. The official standard uses 14 rounds, so there is precious little safety margin - attacks always get better. We use AES/256 (technically we mostly use Rijndael with a 256-bit key and a 256-bit block size, which isn't strictly AES, although for store encryption we use a 128-bit block size, which is). Such attacks rely on related-key weaknesses in the protocol (as in WEP, where the IV was too small). In theory we shouldn't have any, although I am not entirely sure how to determine this. We shouldn't have known ciphertext, because we have an unforgeable authenticator on all packets, but I'm not sure exactly what the definition of a related-key weakness is. Nonetheless, it would seem prudent to increase the number of rounds as Schneier outlines (28 rounds for a 256-bit key). We have the infrastructure to do this without too much trouble, with key subtypes and negotiation types. Moving to AES/128 would be considerably more work. I think it would be worth trying to get someone who is a qualified cryptographer to look in detail at how Freenet uses cryptography. Freenet does a *lot* of crypto, mixed together in ways that aren't necessarily common. It's also a very interesting project from a cryptographic standpoint; it seems possible that someone could be talked into doing it on a volunteer basis. Even if it weren't done as volunteer work, it might be worth seeing how much a proper review would cost. Cryptographic review seems appropriate for a program which relies so strongly on the strength of its cryptography. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
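To make the Rijndael-vs-AES point concrete: the JCE's "AES" is Rijndael fixed at a 128-bit block (with a round count you cannot change from application code), so a 256-bit block needs a separate implementation -- as I understand it, fred carries its own Rijndael class for exactly this reason. The sketch below uses BouncyCastle's lightweight API purely as an illustration of the 256-bit-block variant; it is not what Freenet uses, and it too hard-codes the standard round count rather than the 28 rounds discussed above.

    import org.bouncycastle.crypto.engines.RijndaelEngine;
    import org.bouncycastle.crypto.params.KeyParameter;

    // Illustration only (BouncyCastle lightweight API, not Freenet's own Rijndael):
    // Rijndael with a 256-bit block and 256-bit key -- a combination javax.crypto's
    // "AES" cannot do, since AES fixes the block size at 128 bits. The round count
    // is baked into the engine; neither API lets application code raise it.
    public class Rijndael256Demo {
        public static void main(String[] args) {
            RijndaelEngine engine = new RijndaelEngine(256);      // 256-bit block size
            byte[] key = new byte[32];                            // 256-bit key (all zeros: demo only)
            engine.init(true, new KeyParameter(key));             // true == encrypt
            byte[] plaintext = new byte[engine.getBlockSize()];   // one 32-byte block
            byte[] ciphertext = new byte[engine.getBlockSize()];
            engine.processBlock(plaintext, 0, ciphertext, 0);
            System.out.println("Block size: " + engine.getBlockSize() + " bytes");
        }
    }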
Re: [freenet-dev] please please please can we get a better screenshot?
On Mon, Aug 3, 2009 at 2:04 PM, Matthew Toselandt...@amphibian.dyndns.org wrote: On Monday 03 August 2009 16:17:12 Ian Clarke wrote: Right now the most visible thing on our website is a screenshot that really doesn't look good. Can someone create a screenshot that looks more appealing? This could be a screenshot of JSite, or Frost -- anything that shows Freenet doing its thing but actually looks appealing. IMHO a screenshot of the default theme would make more sense than a screenshot of the minimalblue theme. We do not bundle either jSite or Frost (or Thingamablog, or FMS). Thingamablog is the best tool for creating a blog, jSite is the best tool for uploading existing content; both have been reviewed by core devs, but Thingamablog is a bit large... But maybe we should bundle them anyway? Either that or replace them in the near future with web-based plugins doing the same job. How would a new user find out about such software? It doesn't look obvious to me from the front page of the site. Frost and FMS have links from the discussion tab on the node page, but jSite and Thingamablog don't. There's some info on the documentation page of freenetproject.org, but that's not where I would think to look for Freenet-related applications. Is Thingamablog maintained? Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] F2F web proxy???
On Thu, Aug 6, 2009 at 8:09 AM, Matthew Toselandt...@amphibian.dyndns.org wrote: I propose that as a darknet value-add, and as an additional tool for those in hostile regimes who have friends on the outside, we implement a web-proxy-over-your-darknet-peers option. Your Friends would announce whether they are willing to proxy for you, and you could choose which friends to use, or allow it to use all of them (assuming people on the inside don't offer). You could then configure your browser to use Freenet as a proxy. This would not provide any anonymity but it would get you past network obstacles and/or out of Bad Place and into Happy Place. It's not a long term solution, but:
- We have expended considerable effort on making darknet viable: IP detection, ARKs etc.
- It could take advantage of future transport plugins, but even before that, FNP 0.7 is quite hard to block.
- Many people are in this situation.
- It is easy to implement. HTTP is complex but cache-less proxies can be very simple.
- It could be combined with longer term measures (growing the internal darknet), and just work for as long as it works. Most likely it would be throttled rather than blocked outright to start with, hopefully allowing for a smooth-ish migration of users to more robust mechanisms...
- We could allow recursive proxying to some depth - maybe friend of a friend. This would provide a further incentive to grow the internal darknet, which is what we want.
- The classic problem with proxies is that they are rare, so hundreds of people connect to them, and the government finds out and blocks them. This does not apply here.
I like it. Darknet features are a very good thing. This probably also needs some care wrt bandwidth management (related to 3334 -- similar considerations probably apply). However, as I mentioned on IRC, there are several things I think should be higher priority. Of course, I'm not the one implementing any of this, but here's my opinion anyway ;) In no particular order:
- Documentation! Both the plugins API and making sure that the FCP docs on the wiki are current and correct.
- Bloom filter sharing. (Probably? I have no idea how much work is required for either of these two.)
- Freetalk and a blogging app of some sort (though these are probably mostly for someone other than toad?).
- A few specific bugs: 3295 (percent encoding is horribly, embarrassingly broken -- in at least 5 different ways), 2931 (split blocks evenly between splitfile segments -- should help dramatically with availability), fixing top block healing on splitfiles (discussed in 3358).
- Low-latency insert flag as per 3338. (I know, most people probably don't care all that much, but I'd really like to see whether Freenet can hit near-real-time latencies for the messaging app I'm working on.)
Also, it's worth considering other ways to make darknet connections more useful (in addition to this; whether before or after, I don't have a strong opinion). Enabling direct transfer of large files would be good (at a bare minimum, this shouldn't fail silently like it does right now). Improving messaging would be good; I should be able to see recently sent / received messages (including timestamps), queue a message to be sent when a peer comes online, and tell whether a message I've sent arrived successfully. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
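On the "cache-less proxies can be very simple" point, the core of one looks roughly like the toy sketch below: plain-HTTP absolute-URI GET requests only, no CONNECT/HTTPS, no real error handling, and obviously none of the wiring into darknet links -- it is only meant to illustrate the scale of the problem, not to be used.

    import java.io.BufferedReader;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.net.URL;

    // Toy cache-less HTTP proxy: handles only absolute-URI GET requests (what a
    // browser sends to a configured proxy for plain http:// URLs). No CONNECT,
    // no keep-alive, and only minimal hop-by-hop header handling.
    public class ToyProxy {
        public static void main(String[] args) throws Exception {
            ServerSocket listener = new ServerSocket(8888);
            while (true) {
                final Socket client = listener.accept();
                new Thread(new Runnable() { public void run() { handle(client); } }).start();
            }
        }

        static void handle(Socket client) {
            try {
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(client.getInputStream(), "ISO-8859-1"));
                String requestLine = in.readLine();          // e.g. "GET http://host/x HTTP/1.1"
                if (requestLine == null) { client.close(); return; }
                String[] parts = requestLine.split(" ");
                URL url = new URL(parts[1]);
                int port = url.getPort() == -1 ? 80 : url.getPort();
                String file = url.getFile().length() == 0 ? "/" : url.getFile();

                Socket origin = new Socket(url.getHost(), port);
                Writer out = new OutputStreamWriter(origin.getOutputStream(), "ISO-8859-1");
                out.write("GET " + file + " HTTP/1.1\r\n");
                out.write("Host: " + url.getHost() + "\r\n");
                out.write("Connection: close\r\n");
                String line;
                while ((line = in.readLine()) != null && line.length() > 0) {
                    String lower = line.toLowerCase();
                    // Drop headers we rewrite or that are hop-by-hop; forward the rest.
                    if (lower.startsWith("host:") || lower.startsWith("connection:")
                            || lower.startsWith("proxy-connection:")) continue;
                    out.write(line + "\r\n");
                }
                out.write("\r\n");
                out.flush();

                InputStream from = origin.getInputStream();  // stream the response back verbatim
                OutputStream to = client.getOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = from.read(buf)) != -1) to.write(buf, 0, n);
                to.flush();
                origin.close();
                client.close();
            } catch (Exception e) {
                try { client.close(); } catch (Exception ignored) {}
            }
        }
    }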
Re: [freenet-dev] How to start a new translation?
On Thu, Aug 6, 2009 at 1:20 PM, Alex Pyattaevalex.pyatt...@gmail.com wrote: Hi there! I'd like to make a Russian translation for the Freenet program (it seems it will be very popular here some time soon). I just want to know how to organize my work so that it is not done in vain. Thanks. Glad to hear it! It should be fairly easy. First, go to configuration - core settings, and set your language to 'unlisted'. Then, go to configuration - translation and start translating. When you're done (some or all of the strings; you don't have to do it all at once), you can download a translation file from the translations page and email it to this list. I'm not certain how you specify the language name / country code in this process (I haven't actually done a translation myself). Just be sure to say what language it is in your email. Evan Daniel ___ Devl mailing list Devl@freenetproject.org http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl