[freenet-dev] 0.7 routing explanation?

2005-10-26 Thread Evan Daniel
Is there an overview of the 0.7 routing architecture online somewhere?
 I'm curious as to how it compares to previous routing schemes Freenet
has used.

Thanks!

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


[freenet-dev] Re: [Tech] Bandwidth usage too low? Insert slowness...

2006-04-06 Thread Evan Daniel
Does Freenet report its bandwidth usage?

I show 10-20KB/s usage, with a bandwidth setting of 100K.

It would be interesting to know the bandwidth settings of the nodes
I'm connected to...

Evan

On 4/6/06, Matthew Toseland [EMAIL PROTECTED] wrote:
 Is your 0.7 node consistently using less bandwidth than it should? Much
 less? If this is the case then it may show why inserts average 1kB/sec
 at the moment...

 Does anyone have any other ideas why inserts are so slow?
 --
 Matthew J Toseland - [EMAIL PROTECTED]
 Freenet Project Official Codemonkey - http://freenetproject.org/
 ICTHUS - Nothing is impossible. Our Boss says so.


 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.1 (GNU/Linux)

 iD8DBQFENW3hHzsuOmVUoi0RAujfAJ46TmYq7Qjte4cHO6dOM9UxkwFncACgk25b
 6WcPJtVdj88JSOpwqOfzNRk=
 =+xdL
 -END PGP SIGNATURE-


 ___
 Tech mailing list
 Tech@freenetproject.org
 http://emu.freenetproject.org/cgi-bin/mailman/listinfo/tech


___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] 3 or more

2006-04-18 Thread Evan Daniel
On 4/18/06, Volodya [EMAIL PROTECTED] wrote:
 Is it possible to ask for any explanation for the 3 link limit that you have 
 imposed upon
 0.7? I really do not understand how I am supposed to switch to that network 
 now, my
 original plan was to contact all my friends who i trust and tell them that if 
 they ever
 wanted to achieve anonymity, my node is open to them, then i'd connect to 
 them and we'd
 start our little network, slowly the network would continue to grow, as they 
 would invite
 their friends, etc, and everybody would be happy.

 With the limit i can only contact people who already know at least 2 other 
 people willing
 to try freenet out, or i have to ask my friends to compromise their anonymity 
 to each
 other (many of my friends don't know each other), or i will have to run 3 
 nodes on my
 computer to start with (my current plan) which will allow for us to go around 
 the limit,
 until such time when people will start finding other links themselves.

 Free your mind and seek the truth.
   - Volodya

With < 3 links you are either a leaf node (1 link) or on a chain (2
links).  In either case, no real routing is occurring.

I would suggest that you consider a) running with too few links, which
will work OK on small networks (a guess ;) ), and b) finding more
friends to ask.  Are there people you've met online and trust enough?
Here I would define "trust" as: you believe they're not EvilCorp (tm) or
Big Brother, you believe they are at least somewhat paranoid about
their identity, and you believe they won't directly attack your
privacy.

Personally I contacted a few friends and then got on IRC to get more
connections.  I think there is still a large security improvement over
0.5 (mostly from no harvesting; an attacker would have to do real work
to find people).  If I were worried about my anonymity in more than a
passing way I'd do what I suggested above.

Good luck!

Evan Daniel

PS send me a noderef and I'll connect with you!  You can trust me, I promise ;)
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Freenet Name server

2006-06-18 Thread Evan Daniel

On 6/18/06, Colin Davis [EMAIL PROTECTED] wrote:

Juiceman wrote:
 What if 2 or more lists have the same user-friendly name but have them
 pointing to different keys?  How would this be handled?

Keep in mind, unless it's from your list, you don't access Key.. You
access Username\Key, where username is the name chosen on the USK page.

That way, you can have both E1ven\Coolness and Juiceman\Coolness, and
both work.

As for two versions of a single list.. Ie, Getting Bob's list from Bob,
versus Bob's list from Sally, I suppose you'd want to add a version
number. Aum?


What if I have 2 lists, one from someone I call Alice and one from
someone I call Bob.  Then Alice adds a friend she calls Bob (who's not
the same as the one I call Bob).  What now?

Or alternately both Alice and Bob add different Charlies...

Evan
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Freenet Name server

2006-06-18 Thread Evan Daniel

On 6/18/06, Colin Davis [EMAIL PROTECTED] wrote:

Hrmm..

For one, keep in mind that these names are for the User's benefit, not
the node's... In reality, Charles's nodelist is a unique USK, at
[EMAIL PROTECTED]

So that means that the User might get confused, but the node knows the
difference between them.

Because the node knows the difference between them (the node knows the
difference between the Charles that Alice added and the Charles Bob
added, since the file includes the original USK), you can prioritize:
you can tell the name resolver to draw from the lists in order of
their place in the file.  So if you have


Alice [EMAIL PROTECTED]
Charles [EMAIL PROTECTED]
Charles [EMAIL PROTECTED]
Bob [EMAIL PROTECTED]


Your name resolver will trust the top one more than the second, and so on.

Doing it that way automatically passes the order on to the people
subscribed to your list.. They inherit your trust relationship, and your
priority by default.


I think in practice, link pages would end up being uniquely named.
Do you have a better suggestion, without giving each node a user-facing
generated number?


Nope.  Propagating prioritizations was the best I could come up with
on short notice.  Or expose directory structures, like
Alice/Charlie/somesite and Bob/Charlie/somesite, but that seems worse.
Prioritization does mean that different people will have
Charlie/somesite resolving to different locations, though.  Which
means I can't necessarily assume I can give it to someone on a
business card...
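
A minimal sketch of the priority-ordered resolution being discussed
(names and structure are illustrative only, not the actual plugin):

import java.util.LinkedHashMap;
import java.util.Map;

class NameResolver {
    // Insertion order = trust order: our own entries first, then inherited lists.
    private final LinkedHashMap<String, Map<String, String>> listsByOwner =
        new LinkedHashMap<String, Map<String, String>>();

    void addList(String owner, Map<String, String> nameToUsk) {
        listsByOwner.put(owner, nameToUsk);
    }

    // First (highest-priority) match wins, so two subscribers with different
    // list orders can legitimately resolve the same name to different USKs.
    String resolve(String name) {
        for (Map<String, String> list : listsByOwner.values()) {
            String usk = list.get(name);
            if (usk != null) return usk;
        }
        return null;
    }
}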

In the extreme case, this enables a redirection attack if people high
up in the list decide they don't like someone.  And most users would
never notice it in all likelihood, so it's hard to police.

I don't see a good answer to these problems, unfortunately.  Keep
thinking, though, it's well worth pursuing.

(Actually I do have one idea on it, I'll give it some thought and post
if I still like it).

Evan
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Semi-opennet?

2006-06-27 Thread Evan Daniel

On 6/27/06, Lars Juel Nielsen [EMAIL PROTECTED] wrote:

On 6/27/06, Ruud Javi [EMAIL PROTECTED] wrote:
 Do we want semi-opennet support? This would be a way to connect, with
 mutual advance consent, to peers of our direct peers? (There would be
 measures taken to ensure that we don't connect to peers of their direct
 peers).

 Well, I am not sure but I am not a fan of it.

 Semi-opennet to me sounds like the worst of two worlds. The idea of darknet
 is that you need to trust your neighbors, but you are pretty safe to
 everyone else. I think a semi-opennet would give a less safe network,
 because people are connecting to people they have not added themselves. My
 guess is that people would turn it on because it would make Freenet faster,
 while it would also make it less safe for them imho.

 Further, you would still need to add some connections, so this would not
 bring in the big user group that is looking for an opennet-version of
 Freenet 0.7 at all.

 If you have some special reasons/arguments for this semi-opennet, please
 post. If you want, we could discuss whether there should be an opennet in
 Freenet 0.7, and what it should look like. I have some other ideas to get
 people who want an opennet onto Freenet 0.7. Unfortunately I am already
 seeing a few weak points, so other ideas might be better :)

 greetings,
 Ruud


It sounded really nice but I think you're right, this would be a bad idea.


Well, in my mind it's not really a question of whether it's a good
idea or a bad one in the absolute sense, but of whether it's better or
worse than people exchanging noderefs with random people on IRC, or
some other hackish way of putting together an opennet.

It's a lot easier to find 1-2 friends and connect to them and a couple
neighbors than it is to find 5-6 independent links.

Also, I believe this would provide better topology than the current
mess of people connecting to random nodes, even if it only meant that
each person connected to a couple random nodes and a couple of their
immediate neighbors, no?

Anyway, the goal is to improve things, not to hold out for perfection,
and this seems likely to improve on the status quo.

I would also advocate highly visible security warnings about it.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Darknet and opennet: semi-separate networks?

2006-08-15 Thread Evan Daniel

On 8/15/06, Matthew Toseland [EMAIL PROTECTED] wrote:


Because in many cases the network we provide it with is not a single
small world network (which is what it is designed for), but two loosely
connected small world networks of different parameters.


It seems likely to me that interest in content will closely match
connectedness of the networks -- content created on the Chinese
network will be of interest on the Western network to a degree
approximately proportional to the interconnectedness of those
networks.  So bottlenecks in the topology are present only in places
where they aren't a problem.

Obviously I have no proof of this, but it seems at least as intuitive
to me as the assumption that there will be a pair of loosely connected
networks in such a way as to create a bottleneck.

I think it is inappropriate to spend time or effort worrying about
this problem until we have both a method to simulate the network in
question and a set of load balancing / routing algorithms that work on
a single network that we can test on a split network.  The only
counter argument to this that I can see is if there is obvious reason
to believe that decisions made without worrying about this possibility
will be actively problematic later in the development process, and
that seems unlikely in the extreme to me.

And lastly, why shouldn't the split network be small-world?  By
"small world" I assume you mean the triangle property holds, i.e. if a and
b are connected, and b and c are too, then there is a significantly
increased probability of a and c being connected.  Is there some
reason to believe that this property fails as soon as national /
cultural borders get in the way?  I can see there being bottlenecks,
but I don't see how that precludes the small-world nature of the
network.
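
For concreteness, here is a toy measure of that property (a local
clustering coefficient); purely illustrative, not anything from the
Freenet simulators:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Clustering {
    // Fraction of a node's neighbour pairs that are themselves linked; values
    // well above the random-graph baseline are what "small world" implies here.
    static double localClustering(Map<Integer, Set<Integer>> adj, int node) {
        Set<Integer> neighbours = adj.get(node);
        if (neighbours == null || neighbours.size() < 2) return 0.0;
        List<Integer> nbrs = new ArrayList<Integer>(neighbours);
        int k = nbrs.size();
        int closed = 0;
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                Set<Integer> ni = adj.get(nbrs.get(i));
                if (ni != null && ni.contains(nbrs.get(j))) closed++;
            }
        }
        return closed / (k * (k - 1) / 2.0);
    }
}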

Evan
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Darknet and opennet: semi-separate networks?

2006-08-17 Thread Evan Daniel

On 8/17/06, Ian Clarke [EMAIL PROTECTED] wrote:

On 17 Aug 2006, at 09:58, Matthew Toseland wrote:

On Thu, Aug 17, 2006 at 09:37:02AM -0700, Ian Clarke wrote:

I don't believe that the darknet and opennet will be weakly connected
as you suggest, but neither of us can know for sure until we see it.


We can know for near certain that darknets operating in hostile
environments will be weakly connected to the opennet, and probably to
other darknets too, for the simple reason that they CANNOT use opennet.


No, but they can be connected to peers outside the hostile environment that can 
be promiscuous.



Can they?  If the outside peer is promiscuous, then it can be
harvested (with some greater amount of effort than for 0.5, right?).
So can't a hostile gov't harvest external promiscuous nodes and block
all traffic to / from them?  Then you'd need a user behind the
firewall to connect to a darknet-only node outside the firewall, which
would then connect to promiscuous nodes via darknet connections.

That might be a problem...  And it's definitely a way in which having
an open-net hurts the darknet (though I do agree that we have a
de facto open-net right now).

Evan
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Insertion Verification

2006-08-25 Thread Evan Daniel

On 8/25/06, Michael Rogers [EMAIL PROTECTED] wrote:


* Can reputations be negative as well as positive?

* If reputations can be negative, how do you prevent nodes from
generating new identities to escape bad reputations (whitewashing)?

* If reputations can only be positive, do new nodes start with a zero
reputation or a slightly positive reputation?

* If new nodes start with a zero reputation, why should anyone trust them?

* If new nodes start with a slightly positive reputation, how do you
prevent nodes from generating new identities to return to a slightly
positive reputation (whitewashing again)?


Starting with a neutral reputation and the ability to have both
positive and negative reputations is exactly equivalent to starting
with a slightly positive reputation and the ability to have only
positive reputations.

I would argue that since the only absolute reference point is a new
node, it ought to have reputation 0 by definition.  And there's no
reason not to allow negative reps, but we should assume that any node
that has a negative rep will ditch its old identity, and therefore
negative reps are useful only if they can be tied to something hard to
change, like one of our darknet links.

Evan
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] SHA-1 broken at the Crypto 2006

2006-09-05 Thread Evan Daniel

On 9/5/06, Michael Rogers [EMAIL PROTECTED] wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Matthew Toseland wrote:
 We will be using STS, at least initially. Which means checking a
 signature.

Cool, IANAC but I think we should be OK.


As long as we're signing the data, not its hash; in normal use, one
signs the hash of the data for compute cost reasons (and IIRC there
are security reasons too, but I don't have Applied Cryptography in
front of me right now).  That is secure as long as there is second
preimage resistance, but the hash function *is* security critical.
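
To illustrate, a generic JCA hash-then-sign sketch (assumed RSA key pair
and names of my choosing -- not Freenet's actual STS code): the signature
engine hashes the message internally, so the scheme is only as strong as
that hash.

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignDemo {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair();

        byte[] data = "handshake bytes".getBytes("UTF-8");

        // "SHA256withRSA" digests the message internally before signing, so a
        // second-preimage break of the hash breaks the signature too.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(kp.getPrivate());
        signer.update(data);
        byte[] sig = signer.sign();

        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(kp.getPublic());
        verifier.update(data);
        System.out.println("signature valid: " + verifier.verify(sig));
    }
}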

Evan
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Alpha, Darknet routing, et al.

2008-02-01 Thread Evan Daniel
On Feb 1, 2008 12:00 PM, Robert Hailey [EMAIL PROTECTED] wrote:


 On Jan 31, 2008, at 6:48 PM, Evan Daniel wrote:

  On Jan 30, 2008 5:49 PM, Matthew Toseland
  [EMAIL PROTECTED] wrote:
 
  You also need an escape-route mechanism - a way to find an entrance
  into
  another network once regular routing has exhausted the local network.
 
  Doesn't this allow an attacker to selectively DOS the bottleneck
  points by sending out requests for non-existent data?
 
  Evan Daniel

 If we allow the requestor to specify which network they are trying to
 get to, then maybe (but the node still can rejectoverload like any
 other). I think it would work better to the negative; specify which
 networks *not* to route to; this would not only help on a reject of a
 network-gateway node, but it also lets nodes w/o a good routing table
 use the same mechanism.

Even if the requestor can't specify a target network, I think it
works.  If the model is that the request is first routed within the
network, and if that fails it tries to find an escape route -- then
that escape route is a bottleneck (by definition).

Having the nodes use rejectoverload is insufficient, I think -- they'll
reject the attacker's requests and real requests with similar
probability, and so performance for real requests will degrade
substantially.  Now the attacker only needs resources comparable to
the bottlenecks; they don't even have to know where those bottlenecks
are in order to seriously degrade the network topology.

I'm not familiar enough with the details of the proposed ULPRs and how
USKs and Frost and the like check for new updates / messages, but it
seems possible that simple legitimate checks for new content would
have a similar effect.  Of course, failure tables would help a lot
with that case, but they wouldn't help against a malicious attacker.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Alpha, Darknet routing, et al.

2008-02-01 Thread Evan Daniel
On Feb 1, 2008 12:57 PM, Matthew Toseland [EMAIL PROTECTED] wrote:
  Even if the requestor can't specify a target network, I think it
  works.  If the model is that the request is first routed within the
  network, and if that fails it tries to find an escape route -- then
  that escape route is a bottleneck (by definition).
 
   Having the nodes use rejectoverload is insufficient, I think -- they'll
  reject the attacker's requests and real requests with similar
  probability, and so performance for real requests will degrade
  substantially.  Now the attacker only needs resources comparable to
  the bottlenecks; they don't even have to know where those bottlenecks
  are in order to seriously degrade the network topology.
 
  I'm not familiar enough with the details of the proposed ULPRs and how
  USKs and Frost and the like check for new updates / messages, but it
  seems possible that simple legitimate checks for new content would
  have a similar effect.  Of course, failure tables would help a lot
  with that case, but they wouldn't help against a malicious attacker.

 Could ULPRs help to resolve it? Would it be possible to estimate the demand
 for a key (in a way which doesn't favour single nodes that constantly
 rerequest, and is biased by links so that an attacker could only attack
 proportionately to the number of connections he has), in order to decide
 which requests to let through?

I think ULPRs will do a good job of preventing legitimate traffic from
creating such an effect.  A malicious attacker, however, would have no
reason to repeat keys, so any technique that simply tries to make
re-requests more efficient would have no effect.  Biasing on
popularity is probably a good thing, and if it can be done in a
relatively attack-proof manner, might be the solution.

Do we have any understanding of how well network clusters will
correlate with content clusters?  That is, if there are effectively
two networks, especially if they result from cultural and language
barriers, to what extent will the two sides be uninterested in
communicating with each other?  I think having a ballpark answer to
that question will go a long way in determining how big a problem this
really is, and also what sort of solutions might be appropriate.  Of
course, it sounds hard to answer :)

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Packet size proposal

2008-03-08 Thread Evan Daniel
On Sat, Mar 8, 2008 at 7:47 AM, Michael Rogers [EMAIL PROTECTED] wrote:
 Matthew Toseland wrote:
   RFC 2861 is rather late in the day for TCP. If we had set out to copy TCP 
 we
   would likely not have seen it.

  So what's your point - that because TCP has bugs, anything that's not
  TCP won't have bugs? We're the only people looking for bugs in Freenet.
  Lots of people are looking at TCP.


   Not possible. Well, maybe with transport plugins we'd have it as well as 
 UDP,
   but our primary transport will be based on UDP for the foreseeable future,
   because of NATs.

  TCP NAT traversal is nearly as reliable as UDP now, and likely to get
  better as new NATs implement the BEHAVE standards. Plus we have UPnP
  (yes, it's ugly and unreliable, but it's better than nothing) and
  NAT-PMP. Again, a lot of other people are working on this.

At least for the near term future, and probably longer, we need an
answer other than TCP because of ugliness like Comcast's Sandvine
hardware.  Forged TCP reset packets are non-trivial to deal with, but
the equivalent problem doesn't even exist for UDP.

Also, most consumer-level NATs are probably old devices that won't be
upgraded any time soon.  Remember, we want to handle an average user's
NAT well, even if they can't / won't change the settings when Freenet
asks them to.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Packet size proposal

2008-03-08 Thread Evan Daniel
On Sat, Mar 8, 2008 at 9:30 AM, Michael Rogers [EMAIL PROTECTED] wrote:
 Evan Daniel wrote:
   At least for the near term future, and probably longer, we need an
   answer other than TCP because of ugliness like Comcast's Sandvine
   hardware.  Forged TCP reset packets are non-trivial to deal with, but
   the equivalent problem doesn't even exist for UDP.

  True, UDP is more robust than TCP against this particular attack, but
  that just means the next logical step in the P2P vs ISP arms race is for
  all the P2P apps to move to UDP, and then the ISPs will just start
  throttling UDP instead of forging RSTs. Ultimately if your ISP doesn't
  want to carry your traffic, they won't carry it.

Sure, the arms race will continue -- hence "near-term future."  For the
near-term future, we want to be on the winning side of it, rather than
assuming we can switch to the way that *currently doesn't work*.

You're also ignoring the reason they're forging resets rather than
throttling -- they don't need to modify the main routing hardware to
inject packets, they do need to modify it to drop or delay them.  Thus
it's cheaper to forge TCP resets than to throttle UDP or TCP.  They
certainly *can* throttle things properly, but the point remains that
they aren't, and likely will continue doing exactly what they're doing
for the near term future.

I don't know what the state of legacy NATs is.  I had been of the
impression UDP worked better, but I could easily be mistaken.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Post 0.7 idea: off-grid darknet!

2008-05-10 Thread Evan Daniel
On Sat, May 10, 2008 at 12:33 PM, Ian Clarke [EMAIL PROTECTED] wrote:
 Ian is of the view that this should be a separate application based on 
 similar
 principles to Freenet. I'm not. We agree that there are some significant
 issues to deal with. I am of the view that these networks are mutually
 complementary and therefore should talk to each other

 I think the use-cases are too different for these to be part of the
 same application.

IMHO, there's another interesting use-case.  If I have a friend or two
I see daily at work or similar, and we swap 8GB memory cards, that
represents more bw than my cable modem uplink!  (And the cost of a
memory card is lower than 1 month's subscription, provided it gets
swapped most days.)  There's an interesting hybrid option here -- for
large queued downloads, requests go over the network link, but
responses go over sneakernet.

I think flood routing inserts opportunistically is a good idea --
there's no point in sending out a memory card less than full, and
routed requests / inserts may well not be enough to fill it.

One interesting case is Cuba -- there's an operational sneakernet
there already:
http://www.nytimes.com/2008/03/06/world/americas/06cuba.html?ex=1362546000&en=eff6155b2c2d280d&ei=5124&partner=permalink&exprod=permalink
Currently it's basically manually flood routed, but I imagine there
would be significant demand for proper Freenet routing to distribute
content; everyone wants to see the latest news media, but
perhaps not all the entertainment stuff, depending on how much there is.
There may also be significant numbers of local wifi hops available
that aren't broadly connected (pure speculation on my part), so
switching back and forth between regular Freenet links and sneakernet
links could be useful.  Also, in small communities with strong
motivation and short geographical distances, you may well see
latencies of a couple of hours rather than a day or so, at least in
some cases.

I have visions of Neo from The Matrix, sitting in a darkened apartment
and acting as clandestine data broker...

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Post 0.7 idea: off-grid darknet!

2008-05-12 Thread Evan Daniel
On Mon, May 12, 2008 at 6:48 PM, Michael Rogers [EMAIL PROTECTED] wrote:
 Evan Daniel wrote:

  I think flood routing inserts opportunistically is a good idea --
  there's no point in sending out a memory card less than full, and
  routed requests / inserts may well not be enough to fill it.
 

  My knee-jerk reaction was "flooding doesn't scale", but it's actually
 worked alright for Usenet - with a couple of tweaks. First, break down the
 traffic into channels and allow each node to decide which channels to carry.
 Second, flood the message IDs rather than the messages, and only request the
 messages you haven't seen.

What I mean is: you're preparing an 8GB memory card to send to a
neighbor.  You've been able to find 3GB worth of data to fulfill some
of his outstanding requests.  Rather than leave the other 5GB empty,
you should fill it with inserts you've seen recently, even if they
might not be normally worth directing to him (because of wrong
location, etc).  Unlike other sorts of bandwidth, it can't be retasked
for transmission to a different node.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Post 0.7 idea: off-grid darknet!

2008-05-12 Thread Evan Daniel
On Mon, May 12, 2008 at 6:56 PM, Ian Clarke [EMAIL PROTECTED] wrote:
 On Mon, May 12, 2008 at 9:52 AM, Matthew Toseland
   2. Most or all Freenet apps assume a few seconds latency on requests
   (Frost, Fproxy, etc), yet the latency with the sneakernet would be
   measured in days.  Freenet's existing apps would be useless here.
  
   Not true IMHO. A lot of existing Freenet apps deal with long term requests,
   which would work very nicely with sneakernet.

  Such as?  FMS is pretty slow even with multi-second requests, do you
  really think it would be useful with multi-day requests?  I can't
  think of a single Freenet app that would be useful over a transport
  with multi-day latencies, it would be insane.

I'm pretty sure FMS is slow because it has a list of a few hundred
identities to poll for messages, and it only polls 10-20 at a time.
On a sneakernet you'd send all the poll requests at once.  There's no
reason the delay on receiving a message couldn't be roughly the
one-way latency of the path.

Downloading any sort of large media file can take days on Freenet
*right now*.  People still do it.  What do I care whether the 4 day
download delay is routing delay or bandwidth limit?

The major change needed would be a way to request not the specific SSK
block, but the SSK, whatever CHK it happens to redirect to, and any
CHK blocks needed to decode the result -- plus a way to prevent that
being a DoS attack (tit-for-tat?).

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


[freenet-dev] plugins/build.xml

2008-05-14 Thread Evan Daniel
Fix to plugins/build.xml to correctly specify paths, assuming normal
svn checkout of both plugins/ and freenet/

Evan Daniel

Index: build.xml
===================================================================
--- build.xml   (revision 19885)
+++ build.xml   (working copy)
@@ -2,8 +2,8 @@
 <!-- ant build file for Freenet -->
 
 <project name="Freenet" default="dist" basedir=".">
-	<property name="freenet-cvs-snapshot.location" location="/home/nextgens/src/freenet/src/freenet/lib/freenet-cvs-snapshot.jar"/>
-	<property name="freenet-ext.location" location="/home/nextgens/src/freenet/src/freenet/lib/freenet-ext.jar"/>
+	<property name="freenet-cvs-snapshot.location" location="../freenet/lib/freenet-cvs-snapshot.jar"/>
+	<property name="freenet-ext.location" location="../freenet/lib/freenet-ext.jar"/>
 	<property name="source-version" value="1.4"/>
 	<property name="build" location="build/"/>
 	<property name="dist" location="dist/"/>
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] [freenet-cvs] r19912 - trunk/freenet/src/freenet/crypt/ciphers

2008-05-16 Thread Evan Daniel
On Fri, May 16, 2008 at 8:05 AM, Matthew Toseland
[EMAIL PROTECTED] wrote:
 On Friday 16 May 2008 00:52, Daniel Cheng wrote:
 On Fri, May 16, 2008 at 1:13 AM, Matthew Toseland
 [EMAIL PROTECTED] wrote:
  On Thursday 15 May 2008 17:01, Daniel Cheng wrote:
  On Thu, May 15, 2008 at 10:30 PM, Matthew Toseland
  [EMAIL PROTECTED] wrote:
   On Tuesday 13 May 2008 17:10, [EMAIL PROTECTED] wrote:
   Author: j16sdiz
   Date: 2008-05-13 16:10:32 + (Tue, 13 May 2008)
   New Revision: 19912
  
   Modified:
  trunk/freenet/src/freenet/crypt/ciphers/Rijndael.java
   Log:
   No Monte Carlo test for Rijndael
  
   Huh?
 
  The test output the monte carlo test result, it is supposed to be
 compared
  with ecb_e_m.txt in the FIPS standard.
 
  Our implementation is the original Rijndael (not the one in FIPS
 standard),
  the output does not match ecb_e_m.txt.
 
  Is that bad? Presumably changes during the standardisation process were to
  improve security?
 

 Just like what NIST did to other ciphers, this remains a mystery -- no
 one outside NIST knows why. This can be good or bad, depending on the
 conspiracy level.

 FYI, NIST once fixed a DES vulnerability before anybody else suspected
 there was a weakness.

 The standard AES is not compatible with our Rijndael implementation.
 I guess it's not worth breaking the backward compatibility in 0.7.1.

 It might be if it's more secure...?

Unless I'm mistaken, the difference between Rijndael and AES relates
to things like specified block sizes and not the core crypto:

http://en.wikipedia.org/wiki/Rijndael#Description_of_the_cipher

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Moving to java 1.5

2008-05-17 Thread Evan Daniel
On Sat, May 17, 2008 at 10:08 PM, Juiceman [EMAIL PROTECTED] wrote:
 On Sat, May 17, 2008 at 4:58 PM, Matthew Toseland
 [EMAIL PROTECTED] wrote:
 GCC 4.3 shipped in March, including the new ECJ frontend. It has full support
 for all the new 1.5 language features. IMHO this means that there is no
 longer any reason to stick to java 1.4.


 Are the opensource jvm's up to 1.5?  If so, I say go for it.  :)

IIRC, there are no JVM changes, only compiler changes.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Freenet uninstallation survey results so-far

2008-09-01 Thread Evan Daniel
On Mon, Sep 1, 2008 at 4:43 PM, Matthew Toseland
[EMAIL PROTECTED] wrote:

 Searching: Lots of users want searching, and are disappointed when they don't
 get it. Integrating XMLLibrarian into freesites and into the homepage would
 help; making XMLSpider easier to use might help too; in any case, we need
 somebody to regularly insert an index.

At present the spider doesn't work.  I'm happy to run it for testing,
but probably won't run it for a public index (I'd want to run it
anonymously).  See https://bugs.freenetproject.org/view.php?id=2350
for details.  Short version: after some time, the spider stalls with a
queue full of queries that neither complete nor fail.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Freenet uninstallation survey results so-far

2008-09-02 Thread Evan Daniel
On Tue, Sep 2, 2008 at 2:43 AM, Florent Daignière
[EMAIL PROTECTED] wrote:
 * Evan Daniel [EMAIL PROTECTED] [2008-09-01 22:35:07]:

 On Mon, Sep 1, 2008 at 4:43 PM, Matthew Toseland
 [EMAIL PROTECTED] wrote:

  Searching: Lots of users want searching, and are disappointed when they 
  don't
  get it. Integrating XMLLibrarian into freesites and into the homepage would
  help; making XMLSpider easier to use might help too; in any case, we need
  somebody to regularly insert an index.

 At present the spider doesn't work.  I'm happy to run it for testing,
 but probably won't run it for a public index (I'd want to run it
 anonymously).  See https://bugs.freenetproject.org/view.php?id=2350
 for details.  Short version: after some time, the spider stalls with a
 queue full of queries that neither complete nor fail.


 I have changed how callbacks are handled in previous stable; can you
 still reproduce it with builds post 1158?

The behavior is unchanged with build 1160.  The quickest way to verify
is to recompile the spider with maxParallelRequests = 1.  After a
little while it should get stuck with a request in the queue that
never completes or fails.  (It took a few minutes to stall here,
with 800 completed requests and 50 failed.)

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


[freenet-dev] Wiki and documentation

2008-12-11 Thread Evan Daniel
I think there are a number of problems that could be solved by better,
up-to-date documentation.  Many of them might be better solved by major
fixes to the UI, but those have yet to appear.

I think the simplest and easiest way to get better documentation is to
have a more active wiki.  There's a decent amount of good material on
the wiki, but I don't think it sees much use.  I propose that the wiki
be given a more prominent place in the documentation:

- Currently, the link to the wiki from the freenetproject.org main
page is buried as the last link under the documentation subheading.
Move it to the second link of the main menu (after What is Freenet?).
- Add a link on fproxy (with a redirect warning that the wiki is not anonymous).
- Make an effort to point users to the relevant wiki pages if their
question / problem has a known answer.  Raising user awareness of the
wiki is important.

Thoughts?

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Freenet progress update

2009-02-27 Thread Evan Daniel
On Thu, Feb 26, 2009 at 4:19 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:

 - Possibly increase the number of nodes faster nodes can connect to.
[...]
 TOP FIVE USERVOICE SUGGESTIONS:
 1. Release the 20 nodes barrier.
 This is marked as under review, it may happen in 0.9. It requires some
 alchemy/tweaking. :|

Alternately (or in addition), you could decrease the node limit on
slow nodes.  I've been running with a lower bandwidth limit lately
(15KiB/s out), and I see a slightly better payload fraction with a 15
node limit than with 20 -- about 71-72%, vs about 68% with 20KiB/s.
Bandwidth limiting still hits its target effectively (currently
showing 14.3KiB/s average on 18h uptime).  Subjectively, I can't see a
difference running 20 vs 15 connections.

As I understand it, the problem is that per-connection speed is
limited by the slowest connection (approximately).  If slow nodes had
fewer connections, those connections would be faster, just as if the
faster node had more connections.  So from a bandwidth usage
standpoint, the two approaches should be similar.

I do see two advantages to not increasing the connection limit,
though.  With a small network of only a few thousand nodes, the
diameter of the network is very small.  Eventually, when Freenet has a
large network, routing needs to work over a larger diameter.  If you
increase the connection limit now, you'll learn less about how Freenet
scales in practice in the near future.

Since reducing the connection limit on low bw nodes seems to increase
the payload fraction, that means their bw is being used more
efficiently.  My recollection is that reducing the connection limit
didn't change payload fraction at higher bw limits.  Efficiency
improvements are nice even if they're small and only on some of the
network.

I think it would be inappropriate to reduce the connection limit
without further testing.  Has anyone else with a low bw limit tried
this?  Does it cause any problems?  If it doesn't cause any problems,
I would suggest making the change be a small one initially.  Rather
than a flat 20 connections, something like 1 connection per 2KiB/s of
outbound bandwidth, with a minimum of 15 and a max of 20.  I'll
perform some testing with 15 connections, 30KiB/s limit and report
back on that.
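
Something like this hypothetical helper (constants as suggested above,
names mine, not Freenet's actual configuration code):

// One connection per 2KiB/s of output bandwidth, clamped to [15, 20].
static int suggestedPeerLimit(int outputBytesPerSecond) {
    int byBandwidth = outputBytesPerSecond / 2048;
    return Math.max(15, Math.min(20, byBandwidth));
}

With those numbers a 15KiB/s node gets the floor of 15 peers, and anything
at or above 40KiB/s keeps the current 20.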

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Freenet progress update

2009-03-02 Thread Evan Daniel
On Mon, Mar 2, 2009 at 3:11 PM, Florent Daignière
nextg...@freenetproject.org wrote:
 * Evan Daniel eva...@gmail.com [2009-02-27 10:58:19]:
 I think it would be inappropriate to reduce the connection limit
 without further testing.
[...]
 Tweaking that code based on one's experience is just plain silly.

Then it seems we're in agreement.

Tweaking an emergent system based on hunches is silly.  Gathering data
and tweaking based on that data isn't.  Individual anecdotes like my
node's performance prove nothing, but can suggest routes for further
investigation.  Right now, all I think we know is that the current
system works, and that there is reason to believe improvement is
possible (ie unused available bandwidth).  Do you disagree with that
assessment?

Is there a reason not to investigate this?  I'm not wedded to any
particular solution or testing method, and I can think of plenty of
flaws in mine.  If you have an improved proposal, by all means say so.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Freenet progress update

2009-03-11 Thread Evan Daniel
On Tue, Mar 10, 2009 at 7:16 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Monday 02 March 2009 20:55:59 Florent Daignière wrote:
 * Evan Daniel eva...@gmail.com [2009-03-02 15:41:59]:

  On Mon, Mar 2, 2009 at 3:11 PM, Florent Daignière
  nextg...@freenetproject.org wrote:
   * Evan Daniel eva...@gmail.com [2009-02-27 10:58:19]:
   I think it would be inappropriate to reduce the connection limit
   without further testing.
  [...]
   Tweaking that code based on one's experience is just plain silly.
 
  Then it seems we're in agreement.
 
  Tweaking an emergent system based on hunches is silly.  Gathering data
  and tweaking based on that data isn't.  Individual anecdotes like my
  node's performance prove nothing, but can suggest routes for further
  investigation.  Right now, all I think we know is that the current
  system works, and that there is reason to believe improvement is
  possible (ie unused available bandwidth).  Do you disagree with that
  assessment?
 
  Is there a reason not to investigate this?  I'm not wedded to any
  particular solution or testing method, and I can think of plenty of
  flaws in mine.  If you have an improved proposal, by all means say so.
 

 Yes, there are *good* reasons why we should keep the number of peers
 constant across nodes.

  - makes traffic analysis harder (CBR is good; there is even an argument
    saying we should pad and send garbage if we have to)

 How is this related to the number of peers being constant across very fast and
 very slow nodes? On a node with a very low transfer limit, we will have
 different padding behaviour than on a node with a very high transfer limit: a
 fast node has more opportunities for padding because we have a fixed period
 of time for coalescing.

Agreed.  This actually sounds like an argument in favor of variable
connection count (at least relative to the current system).  If each
connection is CBR, then how many there are doesn't tell an observer
anything.  If all nodes have the same number of connections, then
either there is bw going unused or some connections are higher bw than
others -- which means there is significant routing based on capacity
rather than location.  If a fast node simply has more connections at
the same (constant) bit rate instead of a mix of fast and slow links,
this is mitigated.


  - we don't want to go back to the route and missroute according to load
 approach

 Agreed.

  - dynamic systems are often easier to abuse than static ones

 It has been discussed numerous times already; as far as I am concerned,
 nothing has changed... We have to accept that we will always have to
 deal with slow nodes, and those are going to determine the speed of the
 whole network. The only parameter we should change is the height of the
 entry fence: what minimal configuration is needed to access
 freenet.

 Obviously that's some form of elitism... but that's much better than the
 alternative (creating a dynamic, fragile system which will work well only
  for *some* people).

 I don't see why it would be more fragile... however we would have to deploy it
 and try to see if we can tell whether it's an improvement, which may be
 tricky...

 NextGen$

 ___
 Devl mailing list
 Devl@freenetproject.org
 http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Current design of Freenet, routing/storage heuristics and arbitrary constants was Re: Article contd.

2009-03-11 Thread Evan Daniel
2009/3/4 Matthew Toseland t...@amphibian.dyndns.org:

 STORAGE:
 There are 6 stores: a store and a cache for each of SSK, CHK and public key
 (for SSKs; we can avoid sending this over the network in many cases). Every
 block fetched is stored in the relevant cache. When an insert completes, it
 will be stored only if there is no node with a location closer to the key's
 location than this node. But this calculation ignores peers with a lower
 reported uptime than 40% (this is simply an estimate computed by the node and
 sent to us on connection), and ignores backed off nodes unless all nodes are
 backed off. It does not take into account recently failed status etc.
 Arguably we should send uptime messages more often than on connection: this
 disadvantages high uptime nodes that connected when they were first
 installed. In practice the store fills up *much* slower than the cache.

There is a technique that would make the store fill more quickly than
it currently does without any drawbacks (aside from a small amount of
development time ;) ).  Right now, there are two hash tables with one
slot per location.  One hash table is the store, one the cache.
(Obviously I'm only considering a single one of CHK/SSK/pubkey.)  This
is equivalent to a single hash table with two slots per location, with
rules that decide which slot to use rather than which hash table.

Currently, the rule is that inserts go in the store and fetches in the
cache.  The change is this: when storing a key in the cache, if the
cache slot is already occupied but the store slot is empty, put it in
the store instead (and vice versa).  Even without the bloom filter,
this doesn't add any disk reads -- by treating it as one hash table
with two slots per location, you put those two slots adjacent on disk
and simply make one larger read to retrieve both keys.

This is a technique I first saw in hash tables for chess programs.
Evaluation results are cached, with two distinct slots per location.
One slot stores the most recent evaluation result, the other stores
the most expensive to recompute.  There is a noticeable performance
improvement in the cache if you are willing to store a result in the
wrong slot when only one of the two is full already.
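
A minimal sketch of that rule (illustrative names; in-memory arrays stand
in for the on-disk slots, and a real store would also verify that the
stored key matches the requested one):

class TwoSlotStore {
    private final byte[][] storeSlot; // slot normally used for inserted blocks
    private final byte[][] cacheSlot; // slot normally used for fetched blocks
    private final int size;

    TwoSlotStore(int size) {
        this.size = size;
        storeSlot = new byte[size][];
        cacheSlot = new byte[size][];
    }

    private int index(long keyHash) {
        return (int) Math.floorMod(keyHash, (long) size);
    }

    // Fetched blocks prefer the cache slot, but spill into an empty store slot
    // rather than evicting; inserted blocks do the reverse.
    void putFetched(long keyHash, byte[] block) {
        int i = index(keyHash);
        if (cacheSlot[i] == null || storeSlot[i] != null) cacheSlot[i] = block;
        else storeSlot[i] = block;
    }

    void putInserted(long keyHash, byte[] block) {
        int i = index(keyHash);
        if (storeSlot[i] == null || cacheSlot[i] != null) storeSlot[i] = block;
        else cacheSlot[i] = block;
    }

    // Both slots for a location are adjacent, so one slightly larger read
    // covers them.
    byte[] get(long keyHash) {
        int i = index(keyHash);
        return storeSlot[i] != null ? storeSlot[i] : cacheSlot[i];
    }
}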

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Current design of Freenet, routing/storage heuristics and arbitrary constants was Re: Article contd.

2009-03-11 Thread Evan Daniel
On Wed, Mar 11, 2009 at 1:22 PM, Oskar Sandberg os...@sandbergs.org wrote:
 On Wed, Mar 11, 2009 at 5:03 PM, Evan Daniel eva...@gmail.com wrote:

 There is a technique that would make the store fill more quickly than
 it currently does without any drawbacks (aside from a small amount of
 development time ;) ).  Right now, there are two hash tables with one
 slot per location.  One hash table is the store, one the cache.
 (Obviously I'm only considering a single one of CHK/SSK/pubkey.)  This
 is equivalent to a single hash table with two slots per location, with
 rules that decide which slot to use rather than which hash table.

 Currently, the rule is that inserts go in the store and fetches in the
 cache.  The change is this: when storing a key in the cache, if the
 cache slot is already occupied but the store slot is empty, put it in
 the store instead (and vice versa).  Even without the bloom filter,
 this doesn't add any disk reads -- by treating it as one hash table
 with two slots per location, you put those two slots adjacent on disk
 and simply make one larger read to retrieve both keys.

 This is a technique I first saw in hash tables for chess programs.
 Evaluation results are cached, with two distinct slots per location.
 One slot stores the most recent evaluation result, the other stores
 the most expensive to recompute.  There is a noticeable performance
 improvement in the cache if you are willing to store a result in the
 wrong slot when only one of the two is full already.

 Isn't this just a more complicated way of saying: put anything which you
 cache into the store if the store isn't full yet?

Basically.  And an observation that there doesn't have to be a
performance penalty in doing so.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Current uservoice top 5

2009-04-22 Thread Evan Daniel
On Mon, Apr 20, 2009 at 10:34 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:

 So duplicating the top block is fairly important. Another weakness is that
 the last segment in a splitfile may have much less redundancy than the rest;
 this can be fixed by making the last 2 segments the same size.

Is there any reason to make the first n-2 segments full size and the
last 2 balanced and potentially only (slightly over) half size, rather
than make all n segments the same size?
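
For illustration, the alternative I'm asking about would just be
(hypothetical helper, not the actual splitfile code):

// Split dataBlocks into the same number of segments as today, but sized
// as evenly as possible (differing by at most one block).
static int[] equalSegmentSizes(int dataBlocks, int maxSegmentSize) {
    int segments = (dataBlocks + maxSegmentSize - 1) / maxSegmentSize; // ceil
    int[] sizes = new int[segments];
    for (int i = 0; i < segments; i++) {
        sizes[i] = dataBlocks / segments + (i < dataBlocks % segments ? 1 : 0);
    }
    return sizes;
}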

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Easy top block duplication: Content Multiplication Keys

2009-04-23 Thread Evan Daniel
On Thu, Apr 23, 2009 at 4:01 PM, Robert Hailey
rob...@freenetproject.org wrote:

 On Apr 23, 2009, at 12:34 PM, Florent Daigniere wrote:

 Robert Hailey wrote:

 Perhaps there is an easier solution?

 How about extending the chk logic into an alternate-chk-key (ACK?);
 that simply adds 0.25 to the expected location (for routing and
 storage).

 So when you insert the top block, put it in as a chk and an ack (no
 extra uri's necessary). When you go to fetch it, if the chk does not
 work, try the ack variant of the same key.

 At the moment each node on the path can verify that the data sent by
 previous hop corresponds to what it ought to; How would that work with
 your proposed solution?

 NextGen$

 Sorta like this...

 package freenet.keys;
 public class ASKKey extends NodeCHK {
     public double toNormalizedDouble() {
         return (super.toNormalizedDouble() + 0.25) % 1.0;
     }
 }

 The only difference is where any node would look for it. This would not be
 exposed to the client. My idea is that any chk could be converted to an
 alternate-location-finding-key just by type (which surely would mean a
 different fetch-command, e.g. fetchCHK/fetchACK...).
 There would be no difference in handling, the only difference would be how
 the target-routing-location is identified from the key (the same as CHK plus
 a constant [mod 1.0]). After all, the mapping from the key to the
 small-world location is open to interpretation...
 --
 Robert Hailey

I suggested the obvious extension of this on IRC.  Instead of simply
searching at location + 0.25, you search at location + n/N, where n is
which copy of the block you're looking for, and N is the number of
copies inserted.
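
In code the whole scheme is tiny (hypothetical helper, mirroring the
quoted ASKKey sketch rather than any real Freenet key type):

// The nth of N inserted copies routes to (base + n/N) mod 1.0.
static double copyLocation(double baseLocation, int copyIndex, int totalCopies) {
    return (baseLocation + (double) copyIndex / totalCopies) % 1.0;
}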

Toad didn't like this because it makes top blocks identifiable to
everyone on the routing path, and involves network-level changes.  The
other approaches can be implemented at a higher level as a translation
before handing a normal CHK request to the network.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Can we implement Bloom filter sharing quickly???

2009-05-01 Thread Evan Daniel
On Fri, May 1, 2009 at 3:37 PM, Robert Hailey rob...@freenetproject.org wrote:

 On May 1, 2009, at 9:35 AM, Matthew Toseland wrote:

 IMPLEMENTING IT:
 Main tasks:
 - Converting our datastore indexes into a compact, slightly more lossy,
 1-bit
 filter. (Easy)
 - Creating a snapshot once an hour, and keeping for some period (1 week?).
 (Easy)
 - Creating an efficient binary diff format for hour-to-hour diffs, and
 keeping
 them. (Moderate)

 This might actually be quite hard as an efficient bloom filter will scatter
 (even a few) updates all over the filter.

Actually, it's not hard at all.

The naive version is to observe that a tiny fraction of the bits in
the filter changed, and just record the location of each bit that
changed.  At 0.24 writes per second, you'll get on average 0.24*3600*2
keys changed per hour (each write represents 1 add and 1 delete), or
0.24*3600*2*16 counters changed per hour.  Unless I'm mistaken,
changing a counter in the counting bloom filter has ~ 50% probability
of changing the bit in the compressed non-counting version.  So that
means 0.24*3600*2*16*0.5=13824 bits changed per hour.

The address of a bit can be represented in 32 bits trivially (for
bloom filters < 512MiB in size), so the 1-hour diff should consume
13824*32/8=55296 bytes.  That represents 15.36 bytes/s of traffic for
each peer, or 307.2B/s across 20 peers.

That encoding isn't terribly efficient.  More efficient is to sort the
addresses and compute the deltas.  (So if bits 19, 7, and 34 changed,
I send the numbers 7, 12, 15.)  Those deltas should follow a geometric
distribution with mean (size of filter) / (number of bits changed).
It's easy to build an arithmetic coding for that data that will
achieve near-perfect compression (see
http://en.wikipedia.org/wiki/Arithmetic_coding for example).  My BOTE
estimate using toad's 84MiB filter is that it would compress to about 14.5 bits per
address (instead of the 30 or 32 you'd get with no compression; gzip
or lzw should be somewhere in between).
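
A minimal sketch of producing those deltas (illustrative only; the
entropy coder itself is left out):

import java.util.BitSet;

class BloomDiff {
    // XOR the old and new 1-bit filter snapshots to find the changed bit
    // positions, then record the gaps between successive positions; the gaps
    // are what get fed to the arithmetic/Golomb coder.
    static int[] changedBitGaps(BitSet oldFilter, BitSet newFilter) {
        BitSet diff = (BitSet) oldFilter.clone();
        diff.xor(newFilter);
        int[] gaps = new int[diff.cardinality()];
        int prev = -1, n = 0;
        for (int i = diff.nextSetBit(0); i >= 0; i = diff.nextSetBit(i + 1)) {
            gaps[n++] = i - prev;
            prev = i;
        }
        return gaps;
    }
}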

Evan
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Can we implement Bloom filter sharing quickly???

2009-05-01 Thread Evan Daniel
On Fri, May 1, 2009 at 4:32 PM, Robert Hailey rob...@freenetproject.org wrote:

 On May 1, 2009, at 3:02 PM, Matthew Toseland wrote:

 [20:37:16] evanbd You standardize on a size for the bloom filters; say
 1MiB.
 Then, if your store has 100GB of data, and needs 18MiB of bloom filters,
 you
 partition the keyspace into 18 chunks of roughly equal population.  Each
 segment of the keyspace then gets its own bloom filter.
 [20:38:24] evanbd Then, if my node has 100MiB of memory to spend on my
 peers' bloom filters, and 20 peers, I just ask each peer for up to 5 bloom
 filters.
 [20:40:11] toad_ evanbd: it's good to have a partial filter for each node
 yes
 [20:40:24] toad_ evanbd: however, you end up checking more filters which
 increases false positives
 [20:40:52] evanbd No, the fp rate stays the same.
 [20:41:18] evanbd Suppose your node has 18 filters, each with a 1E-5 fp
 rate
 [20:41:36] evanbd When I get a request, I compare it to your node's
 filter
 set.
 [20:41:57] evanbd But only *one* of those filters gets checked, since
 each
 one covers a different portion of the keyspace

 If requested to send 5 partial filters, which do you send? I presume those
 closest to the originator's present location.
 Looks like you actually may end up checking less filters overall (if the
 blooms from large peers are out-of-range of the request).

Yes, that's a question worth considering.  There are both performance
and security issues involved, I think.  Note that the partition could
be a set of contiguous regions (allowing performance optimization
around which piece of the keyspace you send info about), but it could
just as easily be determined by a hash function instead.

You still check the same number of filters overall -- one per peer.
The difference is that for some peers you may have a partial filter
set, and therefore sometimes check their filters, instead of deciding
you don't have the memory for that peer's filter and never checking
it.
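
Roughly like this (a sketch under assumed names; contiguous segments
shown, but a hash-based partition works the same way):

import java.util.BitSet;
import java.util.Map;

class PeerFilterSet {
    final int segments;              // e.g. 18 x 1MiB filters for a 100GB store
    final int bitsPerSegment;
    final Map<Integer, BitSet> held; // the subset of this peer's segments we keep

    PeerFilterSet(int segments, int bitsPerSegment, Map<Integer, BitSet> held) {
        this.segments = segments;
        this.bitsPerSegment = bitsPerSegment;
        this.held = held;
    }

    // keyLocation in [0,1).  Returns null if we don't hold the relevant
    // segment, false if the peer definitely lacks the key, true if it might
    // have it.  Only ONE segment is ever consulted per peer, so the
    // false-positive rate is unchanged.
    Boolean mightHave(double keyLocation, byte[] routingKey) {
        int segment = (int) (keyLocation * segments);
        BitSet filter = held.get(segment);
        if (filter == null) return null;
        for (int probe = 0; probe < 4; probe++) { // 4 hash probes, illustrative
            int bit = Math.floorMod(hash(routingKey, probe), bitsPerSegment);
            if (!filter.get(bit)) return false;
        }
        return true;
    }

    private static int hash(byte[] key, int salt) {
        int h = salt * 0x9E3779B9;
        for (byte b : key) h = h * 31 + b;
        return h;
    }
}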

 Nice idea, Evan!

Thanks!

Evan
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Can we implement Bloom filter sharing quickly???

2009-05-01 Thread Evan Daniel
On Fri, May 1, 2009 at 7:01 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Friday 01 May 2009 22:43:50 Robert Hailey wrote:

 On May 1, 2009, at 3:46 PM, Evan Daniel wrote:

  Yes, that's a question worth considering.  There are both performance
  and security issues involved, I think.  Note that the partition could
  be a set of contiguous regions (allowing performance optimization
  around which piece of the keyspace you send info about), but it could
  just as easily be determined by a hash function instead.
 
  You still check the same number of filters overall -- one per peer.
  The difference is that for some peers you may have a partial filter
  set, and therefore sometimes check their filters, instead of deciding
  you don't have the memory for that peer's filter and never checking
  it.

 Maybe if we partition it we can also get a free datastore histogram on
 the stats page.

 No, we cannot divide by actual keyspace, the keys must be hashed first, or the
 middle bloom filter will be far too big.

Well, you could partition by actual keyspace as long as the partitions
are (approximately) equal in population rather than fraction of the
keyspace they cover.  Doing that is only mildly nontrivial, and gives
you a histogram with variable-width bars, but still an accurate
histogram.  Each bar would cover the same area; tall and skinny near
the node's location, wide and short away from it.
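
Sketch of the boundary computation (illustrative; assumes a non-empty
list of stored key locations):

import java.util.Arrays;

class EqualPopulationBuckets {
    // Sort the stored key locations and take quantiles, so every bucket holds
    // about the same number of keys: tall-and-narrow bars near the node's
    // location, wide-and-short ones far away.
    static double[] boundaries(double[] storedLocations, int buckets) {
        double[] sorted = storedLocations.clone();
        Arrays.sort(sorted);
        double[] bounds = new double[buckets + 1];
        bounds[0] = 0.0;
        bounds[buckets] = 1.0;
        for (int i = 1; i < buckets; i++) {
            bounds[i] = sorted[(int) ((long) i * sorted.length / buckets)];
        }
        return bounds; // bucket i covers [bounds[i], bounds[i+1])
    }
}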

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Can we implement Bloom filter sharing quickly???

2009-05-01 Thread Evan Daniel
On Fri, May 1, 2009 at 7:59 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Saturday 02 May 2009 00:53:27 Evan Daniel wrote:
 On Fri, May 1, 2009 at 7:01 PM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  On Friday 01 May 2009 22:43:50 Robert Hailey wrote:
 
  On May 1, 2009, at 3:46 PM, Evan Daniel wrote:
 
   Yes, that's a question worth considering.  There are both performance
   and security issues involved, I think.  Note that the partition could
   be a set of contiguous regions (allowing performance optimization
   around which piece of the keyspace you send info about), but it could
   just as easily be determined by a hash function instead.
  
   You still check the same number of filters overall -- one per peer.
   The difference is that for some peers you may have a partial filter
   set, and therefore sometimes check their filters, instead of deciding
   you don't have the memory for that peer's filter and never checking
   it.
 
  Maybe if we partition it we can also get a free datastore histogram on
  the stats page.
 
  No, we cannot divide by actual keyspace, the keys must be hashed first, or
 the
  middle bloom filter will be far too big.

 Well, you could partition by actual keyspace as long as the partitions
 are (approximately) equal in population rather than fraction of the
 keyspace they cover.  Doing that is only mildly nontrivial, and gives
 you a histogram with variable-width bars, but still an accurate
 histogram.  Each bar would cover the same area; tall and skinny near
 the node's location, wide and short away from it.

 And if the distribution was ever to change ... it's not mildly nontrivial
 IMHO.

 Anyway, the first version should simply use the existing filters, to make
 things easy.


You have to recompute the bloom filter occasionally regardless,
otherwise the counters eventually all saturate.  If the distribution
changes enough to be problematic, you rebalance the filters.

If you're willing to be slightly inefficient, you can avoid
recomputing all the filters.  When one filter starts getting full, you
split its portion of the keyspace in half and create a pair to take
its place.  On average your filters will only be 3/4 full or so, but
you reduce the computational load (though probably not the disk load?
hmm...)

Or you could just partition the keyspace by hashing and trust that
equal size partitions will have equal populations.
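
For the hash-partitioned variant, a toy sketch (made-up segment count,
filter sizes and index derivation; not the node's actual Bloom filter
code) would look roughly like this: the segment is picked from the key's
hash, so equal-sized partitions should see roughly equal populations,
and a lookup only has to touch one small per-segment filter.

import java.security.MessageDigest;
import java.util.BitSet;

/**
 * Tiny sketch (not Freenet code) of partitioning the keyspace into
 * segments by hashing, with one small Bloom filter per segment.
 */
public class SegmentedBloomSketch {
    private static final int SEGMENTS = 16;            // keyspace partitions
    private static final int BITS_PER_SEGMENT = 1 << 12;
    private final BitSet[] segments = new BitSet[SEGMENTS];

    public SegmentedBloomSketch() {
        for (int i = 0; i < SEGMENTS; i++) segments[i] = new BitSet(BITS_PER_SEGMENT);
    }

    private static byte[] sha256(byte[] key) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(key);
    }

    // Segment chosen from the hash, so partitions fill roughly evenly.
    private static int segmentOf(byte[] h) {
        return (h[0] & 0xff) % SEGMENTS;
    }

    public void add(byte[] key) throws Exception {
        byte[] h = sha256(key);
        BitSet s = segments[segmentOf(h)];
        // two index functions derived from different bytes of the digest
        s.set(((h[1] & 0xff) << 8 | (h[2] & 0xff)) % BITS_PER_SEGMENT);
        s.set(((h[3] & 0xff) << 8 | (h[4] & 0xff)) % BITS_PER_SEGMENT);
    }

    public boolean mightContain(byte[] key) throws Exception {
        byte[] h = sha256(key);
        BitSet s = segments[segmentOf(h)];
        return s.get(((h[1] & 0xff) << 8 | (h[2] & 0xff)) % BITS_PER_SEGMENT)
            && s.get(((h[3] & 0xff) << 8 | (h[4] & 0xff)) % BITS_PER_SEGMENT);
    }

    public static void main(String[] args) throws Exception {
        SegmentedBloomSketch sketch = new SegmentedBloomSketch();
        sketch.add("CHK@example-key".getBytes("UTF-8"));
        System.out.println(sketch.mightContain("CHK@example-key".getBytes("UTF-8"))); // true
        System.out.println(sketch.mightContain("CHK@other-key".getBytes("UTF-8")));   // almost certainly false
    }
}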

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Current uservoice top 5

2009-05-04 Thread Evan Daniel
On Mon, May 4, 2009 at 11:33 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 1. Release the 20 nodes barrier (206 votes)

 As I have mentioned IMHO this is a straightforward plea for more performance.

I'll reiterate a point I've made before.

While this represents a simple plea for performance, I don't think
it's an irrational one -- that is, I think the overall network
performance is hampered by having all nodes have the same number of
connections.

Because all connections use similar amounts of bandwidth, the network
speed is limited by the slower nodes.  This is true regardless of the
absolute number of connections; raising the maximum for fast nodes
should have a very similar effect to lowering it for slow nodes.  What
matters is that slow nodes have fewer connections than fast nodes.

For example, the max allowed connections (and default setting) could
be 1 connection per 2KiB/s output bandwidth, but never more than 20 or
less than 15.  Those numbers are based on some (very limited) testing
I've done -- if I reduce the allowed bw, that is the approximate
number of connections required to make full use of it.
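
To pin the suggestion down (these numbers are my proposal above, not the
current defaults), the rule is nothing more than:

/**
 * Sketch of the suggested scaling rule: one connection per 2 KiB/s of
 * output bandwidth, clamped to the range [15, 20].
 */
public class ConnectionLimitSketch {
    static int maxConnections(int outputBytesPerSecond) {
        int byBandwidth = outputBytesPerSecond / 2048;   // 1 conn per 2 KiB/s
        return Math.max(15, Math.min(20, byBandwidth));
    }

    public static void main(String[] args) {
        System.out.println(maxConnections(16 * 1024));   // 15 (slow node)
        System.out.println(maxConnections(36 * 1024));   // 18
        System.out.println(maxConnections(100 * 1024));  // 20 (fast node)
    }
}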

Reducing the number of connections for slow nodes has some additional
benefits.  First, my limited testing shows a slight increase in
payload % at low bw limits as a result of reducing the connection
count (there is some per-connection network overhead).  Second, bloom
filter sharing represents a per-connection overhead (mostly in the
initial transfer -- updates are low bw, as discussed).  If (when?)
implemented, it will represent a smaller total overhead with fewer
connections than with more.  Presumably, the greatest impact is on
slower nodes.

On the other hand, too few connections may make various attacks
easier.  I have no idea how strong an effect this is.  However, a node
that has too many connections (ie insufficient bw to use them all
fully) may show burstier behavior and thus be more susceptible to
traffic analysis.  In addition, fewer connections means a larger
network diameter on average, which may have an impact on routing.
Lower degree also means that the node has fewer neighbor bloom filters
to check, which means that a request is compared against fewer stores
during its traversal of the network.

I'm intentionally suggesting a small change -- it's less likely to
cause major problems.  By keeping the ratio between slow nodes (15
connections) and fast nodes (20 connections) modest, the potential for
reliance on ubernodes is kept minimal.  (Similarly, if you want to
raise the 20 connections limit instead of lower it, I think it should
only be increased slightly.)

And finally: I have done some testing on this proposed change.  At
first glance, it looks like it doesn't hurt and may help.  However, I
have not done enough testing to be able to say anything with
confidence.  I'm not suggesting to implement this change immediately;
rather, I'm saying that *any* change like this should see some
real-world testing before implementation, and that reducing the
defaults for slow nodes is as worthy of consideration and testing as
raising it for fast nodes.

Also: do we have any idea what the distribution of available node
bandwidth looks like?

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Current uservoice top 5

2009-05-04 Thread Evan Daniel
On Mon, May 4, 2009 at 6:15 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Monday 04 May 2009 17:29:51 Evan Daniel wrote:
 On Mon, May 4, 2009 at 11:33 AM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  1. Release the 20 nodes barrier (206 votes)
 
  As I have mentioned IMHO this is a straightforward plea for more
 performance.

 I'll reiterate a point I've made before.

 While this represents a simple plea for performance, I don't think
 it's an irrational one -- that is, I think the overall network
 performance is hampered by having all nodes have the same number of
 connections.

 Because all connections use similar amounts of bandwidth, the network
 speed is limited by the slower nodes.  This is true regardless of the
 absolute number of connections; raising the maximum for fast nodes
 should have a very similar effect to lowering it for slow nodes.  What
 matters is that slow nodes have fewer connections than fast nodes.

 For example, the max allowed connections (and default setting) could
 be 1 connection per 2KiB/s output bandwidth, but never more than 20 or
 less than 15.

 What would the point be? Don't we need a significant range for it to make much
 difference?

If the network is in fact limited by the per-connection speed of the
slower nodes, and they are in fact a minority of the network,
increasing the per-connection bandwidth of the slower nodes by 33%
(a slow node dropping from 20 connections to 15 gives each remaining
connection a third more bandwidth) should result in a throughput
increase of a similar magnitude for most of the rest of the network.
A performance improvement of 10-30% should be easily measurable, and
(at the high end of that) noticeable enough to be appreciated by most
users.

Really, though, the idea would be to use it as a network-wide test.
Small tests by a few users are helpful, but not nearly as informative
as a network-wide test.  Assuming the change produced measurable
improvement, it would make sense to explore further changes.  For
example, changing the range to 15-30, or increasing the per-connection
bandwidth requirement, or making the per-connection requirement
nonlinear, or some other option.  However, security concerns
(especially ubernodes) are bigger with more dramatic changes.


 Those numbers are based on some (very limited) testing
 I've done -- if I reduce the allowed bw, that is the approximate
 number of connections required to make full use of it.

 Reducing the number of connections for slow nodes has some additional
 benefits.  First, my limited testing shows a slight increase in
 payload % at low bw limits as a result of reducing the connection
 count (there is some per-connection network overhead).

 True.

To be specific, my anecdotal evidence is that it improves the payload
fraction by roughly 3-8%.


 Second, bloom
 filter sharing represents a per-connection overhead (mostly in the
 initial transfer -- updates are low bw, as discussed).  If (when?)
 implemented, it will represent a smaller total overhead with fewer
 connections than with more.  Presumably, the greatest impact is on
 slower nodes.

 Really it's determined by churn, isn't it? Or by any heuristic artificial
 limits we impose...

My assumption is that connection duration is well modeled by a
per-connection half-life that is largely independent of the number of
connections.  The bandwidth used on such filters is proportional to
the total churn, so fewer connections means less churn in an absolute
sense but the same connection half-life.  (That is, bloom filter
bandwidth usage is proportional to # of connections * per-connection
churn rate.)  I don't have any evidence for that assumption, though.


 On the other hand, too few connections may make various attacks
 easier.  I have no idea how strong an effect this is.  However, a node
 that has too many connections (ie insufficient bw to use them all
 fully) may show burstier behavior and thus be more susceptible to
 traffic analysis.

 Yes, definitely true with our current padding algorithms.

 In addition, fewer connections means a larger
 network diameter on average, which may have an impact on routing.
 Lower degree also means that the node has fewer neighbor bloom filters
 to check, which means that a request is compared against fewer stores
 during its traversal of the network.

 True.

Do you know how big a problem this would cause?  My assumption is that
it would be a fairly small effect even on the nodes with fewer
connections, and that they would be in the minority.


 I'm intentionally suggesting a small change -- it's less likely to
 cause major problems.  By keeping the ratio between slow nodes (15
 connections) and fast nodes (20 connections) modest, the potential for
 reliance on ubernodes is kept minimal.  (Similarly, if you want to
 raise the 20 connections limit instead of lower it, I think it should
 only be increased slightly.)

 Why? I don't see the point unless the upper bound is significantly higher than
 the lower bound: any improvement won't be measurable.

As above, I would hope

Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-06 Thread Evan Daniel
I don't have any specific ideas for how to choose whether to ignore
identities, but I think you're making the problem much harder than it
needs to be.  The problem is that you need to prevent spam, but at the
same time prevent malicious non-spammers from censoring identities who
aren't spammers.  Fortunately, there is a well documented algorithm
for doing this: the Advogato trust metric.

The WoT documentation claims it is based upon the Advogato trust
metric.  (Brief discussion: http://www.advogato.org/trust-metric.html
Full paper: http://www.levien.com/thesis/compact.pdf )  I think this
is wonderful, as I think there is much to recommend the Advogato
metric (and I pushed for it early on in the WoT discussions).
However, my understanding of the paper and what is actually
implemented is that the WoT code does not actually implement it.
Before I go into detail, I should point out that I haven't read the
WoT code and am not fully up to date on the documentation and
discussions; if I'm way off base here, I apologize.

The Advogato metric is designed from the ground up to have strong
spam-resistance properties.  In fact, it has a mathematical proof of
how strong they are: the amount of spam that gets through is limited
by the number of confused nodes, that is nodes who are not spammers
(or simple shills of spammers), but who have mistakenly marked
spammers as trustworthy.  The existence of this proof is, to me, so
compelling an argument in favor of using the metric that I believe any
changes to the algorithm that do not come with an updated version of
the proof should be looked upon with extreme suspicion.

I'll leave the precise descriptions of the two algorithms to those who
are actually writing the code for now.  (Though I have read the
Advogato paper and feel I understand it fairly well -- it's rather
dense, though, and I'd be happy to try to offer a clearer or more
detailed explanation of the paper if that would be helpful.)  However,
one of the properties of the Advogato metric (which the WoT algorithm,
AIUI, does not have) is worth discussing, as I think it is
particularly relevant to issues around censorship that are frequently
discussed wrt WoT and Freenet.  Specifically, Advogato does not use
negative trust ratings, whereas both WoT and FMS do.

The concept of negative trust ratings has absolutely nothing to do
with the arbitrary numbers one person assigns to another in their
published trust list.  Those can be on any scale you like, whether
it's 0-100, 1-5, or -100 to +100.  A system can have or not have
negative trust properties on any of those scales.  Instead, negative
trust is a property based on how the trust rating computed for an
identity behaves as other identities *change* their trust ratings.
Let's suppose that Alice trusts Bob, and is trying to compute a trust
rating for Carol (whom she does not have a direct rating for).  Alice
has trust ratings for people not named, some of whom have ratings for
Carol published.  If the trust computation is such that there exists a
rating Bob can assign to Carol such that Alice's rating of Carol is
worse than if Bob had not rated her at all, then the system exhibits
negative trust behaviors.
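
A toy example with made-up numbers (not WoT code) makes the property
concrete: under a simple averaging rule, Bob's -100 rating drags Alice's
computed rating for Carol below what it would have been had Bob said
nothing at all, so averaging exhibits negative trust behavior.  A pure
positive-capacity flow rule cannot do this, since adding a certification
edge can only increase the achievable flow.

/**
 * Toy illustration (invented ratings, not WoT code) of "negative trust
 * behaviour" under a naive averaging rule.
 */
public class NegativeTrustExample {
    public static void main(String[] args) {
        // Ratings for Carol published by identities Alice already trusts.
        int[] withoutBob = { 60, 70 };        // Alice's view if Bob stays silent
        int[] withBob    = { 60, 70, -100 };  // Bob rates Carol at -100

        System.out.println("averaging, without Bob: " + average(withoutBob)); // 65.0
        System.out.println("averaging, with Bob:    " + average(withBob));    // 10.0
        // 10 < 65: Bob's rating left Carol worse off than silence would have,
        // so the averaging rule exhibits negative trust behaviour.
    }

    static double average(int[] ratings) {
        double sum = 0;
        for (int r : ratings) sum += r;
        return sum / ratings.length;
    }
}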

This is, broadly, equivalent to the ability to censor a poster on FMS
or WoT by marking them untrusted.  There has been much debate over
whether to censor posters never, only for spamming, for spamming plus
certain objectionable speech, over what should count as objectionable,
over whether you should censor someone who publishes a trust list that
censors non-spammers, etc.  In my opinion, all of that discussion is
very silly to be having in the first place, since the answer is so
well documented: simply don't use a trust metric with negative trust
behaviors!

The problem of introductions, etc is not magically solved by the
Advogato algorithm.  However, I don't think it is made any harder by
it.  The dual benefits of provable spam resistance and lack of
censorship are, in my opinion, rather compelling.

Evan Daniel

On Wed, May 6, 2009 at 5:00 PM, xor x...@gmx.li wrote:
 Hello,

 I am currently refactoring the WoT plugin to allow per-context trust values.

 Lets first explain how WoT currently works so you can understand what I mean:
 - There is a set of Identities. An identity has a SSK URI, a nick name, a set
 of contexts (and a set of properties). An own identity is an identity of the
 user of the plugin, he owns the SSK insert URI so he can insert the identity.

 - Each identity can offer a Set<String> of contexts. A context is a client
 application, currently there are: Introduction (the
 given identity publishes captchas to allow other identities to get known by
 the web of trust by solving a captcha - if you solve one, you get on the
 publisher's trust list) and Freetalk, which is the messaging system based on
 WoT (comparable to FMS) which I am implementing.

 - Identities currently can give each other a trust value from -100 to +100.
 Each trust relationship

Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-07 Thread Evan Daniel
On Thu, May 7, 2009 at 4:00 AM, xor x...@gmx.li wrote:
 On Thursday 07 May 2009 00:02:11 Evan Daniel wrote:
 The WoT documentation claims it is based upon the Advogato trust
 metric.  (Brief discussion: http://www.advogato.org/trust-metric.html
 Full paper: http://www.levien.com/thesis/compact.pdf )  I think this
 is wonderful, as I think there is much to recommend the Advogato
 metric (and I pushed for it early on in the WoT discussions).
 However, my understanding of the paper and what is actually
 implemented is that the WoT code does not actually implement it.

 I must admit that I do not know whether its claim that it implements Advogato
 is right or not. I have refactored the code but I have not modified the trust
 calculation logic and have not checked whether it is Advogato or not. Someone
 should probably do that.

 I don't have any specific ideas for how to choose whether to ignore
 identities, but I think you're making the problem much harder than it
 needs to be.

 Why exactly? Your post is nice but I do not see how it answers my question.
 The general problem my post is about: New identities are obtained by taking
 them from trust lists of known identities. An attacker therefore could put
 100 identities in his trust list to fill up your database and slow down
 WoT. Therefore, an decision has to be made when to NOT import new identities
 from someone's trust list. In the current implementation, it is when he has a
 negative score.

 As I've pointed out, in the future there will be MULTIPLE webs of trust, for
 different contexts - Freetalk, Filesharing, Identity-Introduction (you can get
 a trust value from someone in that context when you solve a captcha he has
 published), so the question now is: Which context(s) shall be used to decide
 when to NOT import new identity's from someones trust list anymore?

I have not examined the WoT code.  However, the Advogato metric has
two attributes that I don't think the current WoT method has: no
negative trust behavior (if there is a trust rating Bob can assign to
Carol such that Alice will trust Carol less than if Bob had not
assigned a rating, that's a negative trust behavior), and a
mathematical proof as to the upper limit on the quantity of spammer
nodes that get trusted.

The Advogato metric is *specifically* designed to handle the case of
the attacker creating millions of accounts.  In that case, his success
is bounded (linear with modest constant) by the number of confused
nodes -- that is, legitimate nodes that have (incorrectly) marked his
accounts as legitimate.  If you look at the flow computation, it
follows that for nodes for which the computed trust value is zero, you
don't have to bother downloading their trust lists, so the number of
such lists you download is similarly well controlled.
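
To make the flow computation concrete, here is a minimal sketch of the
node-splitting construction from the paper, using ordinary Edmonds-Karp
max flow over a three-identity toy graph.  The capacities and trust
edges are invented for illustration, and this is emphatically not the
WoT plugin's code.

import java.util.ArrayDeque;
import java.util.Arrays;

/**
 * Sketch of an Advogato-style trust computation: node splitting plus
 * Edmonds-Karp max flow over a tiny invented graph.
 */
public class AdvogatoSketch {

    private final int nodes;
    private final int[][] capacity;   // capacity[u][v] = residual capacity of u -> v

    AdvogatoSketch(int nodes) {
        this.nodes = nodes;
        this.capacity = new int[nodes][nodes];
    }

    void addEdge(int from, int to, int cap) {
        capacity[from][to] += cap;
    }

    /** Standard Edmonds-Karp max flow from source to sink. */
    int maxFlow(int source, int sink) {
        int flow = 0;
        while (true) {
            int[] parent = new int[nodes];
            Arrays.fill(parent, -1);
            parent[source] = source;
            ArrayDeque<Integer> queue = new ArrayDeque<>();
            queue.add(source);
            while (!queue.isEmpty() && parent[sink] == -1) {
                int u = queue.poll();
                for (int v = 0; v < nodes; v++) {
                    if (parent[v] == -1 && capacity[u][v] > 0) {
                        parent[v] = u;
                        queue.add(v);
                    }
                }
            }
            if (parent[sink] == -1) return flow;   // no augmenting path left
            int bottleneck = Integer.MAX_VALUE;
            for (int v = sink; v != source; v = parent[v])
                bottleneck = Math.min(bottleneck, capacity[parent[v]][v]);
            for (int v = sink; v != source; v = parent[v]) {
                capacity[parent[v]][v] -= bottleneck;
                capacity[v][parent[v]] += bottleneck;
            }
            flow += bottleneck;
        }
    }

    public static void main(String[] args) {
        // Identities: 0 = Alice (the evaluating user), 1 = Bob, 2 = Carol.
        // Identity i is split into an "in" node (2*i) and an "out" node (2*i+1);
        // node 6 is the supersink.
        final int SINK = 6;
        AdvogatoSketch g = new AdvogatoSketch(7);
        int[] cap = { 3, 2, 1 };                      // per-identity capacities (assumed)
        for (int i = 0; i < 3; i++) {
            g.addEdge(2 * i, 2 * i + 1, cap[i] - 1);  // in -> out
            g.addEdge(2 * i, SINK, 1);                // one unit per accepted identity
        }
        // Trust (certification) edges run from the truster's "out" node to the
        // trusted identity's "in" node, with effectively unlimited capacity.
        g.addEdge(1, 2, Integer.MAX_VALUE / 2);       // Alice trusts Bob
        g.addEdge(3, 4, Integer.MAX_VALUE / 2);       // Bob trusts Carol
        // Flow is injected at Alice's "in" node; the max flow equals the number
        // of unit edges to the supersink that carry flow, i.e. the number of
        // accepted (non-spam) identities.
        System.out.println("accepted identities: " + g.maxFlow(0, SINK));  // prints 3
    }
}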

As for contexts, why should the same identity be treated differently
in different contexts?  If the person is (believed to be) a spammer in
one context, is there any reason to trust them in some other context?
I suppose I don't really understand the purpose of having different
contexts if your goal is only to filter out spammers.  Wasn't part of
the point of the modular approach of WoT that different applications
could share trust lists, thus preventing users from having to mark
trust values for the same identities several times?

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-07 Thread Evan Daniel
On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau m...@tommyserver.de wrote:
 Evan Daniel schrieb:
 I don't have any specific ideas for how to choose whether to ignore
 identities, but I think you're making the problem much harder than it
 needs to be.  The problem is that you need to prevent spam, but at the
 same time prevent malicious non-spammers from censoring identities who
 aren't spammers.  Fortunately, there is a well documented algorithm
 for doing this: the Advogato trust metric.

 The WoT documentation claims it is based upon the Advogato trust
 metric.  (Brief discussion: http://www.advogato.org/trust-metric.html
 Full paper: http://www.levien.com/thesis/compact.pdf )  I think this
 is wonderful, as I think there is much to recommend the Advogato
 metric (and I pushed for it early on in the WoT discussions).
 However, my understanding of the paper and what is actually
 implemented is that the WoT code does not actually implement it.
 Before I go into detail, I should point out that I haven't read the
 WoT code and am not fully up to date on the documentation and
 discussions; if I'm way off base here, I apologize.

 I think, you are:

 The advogato idea may be nice (I did not read it myself) if you have
 exactly 1 trustlist for everything.  But xor wants to implement 1
 trustlist for every app, as people may act differently, e.g. on
 filesharing than on forums or while publishing freesites.  You
 basically don't want to censor someone just because he tries to
 disturb filesharing while he maybe tries to bring in good arguments at
 forum discussions about it.
 And I don't think that advogato will help here, right?

There are two questions here.  The first question is: given a set of
identities and their trust lists, how do you compute the trust for an
identity the user has not rated?  The second question is: how do you
determine which trust lists to use in which contexts?  The two
questions are basically orthogonal.

I'm not certain about the contexts issue; Toad raised some good
points, and while I don't fully agree with him, it's more complicated
than I first thought.  I may have more to say on that subject later.

Within a context, however, the computation algorithm matters.  The
Advogato idea is very nice, and imho much better than the current WoT
or FMS answers.  You should really read their simple explanation page.
 It's really not that complicated; the only reason I'm not fully
explaining it here is that it's hard to do without diagrams, and they
already do a good job of it.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-07 Thread Evan Daniel
On Thu, May 7, 2009 at 4:43 PM, xor x...@gmx.li wrote:
 On Thursday 07 May 2009 11:23:51 Evan Daniel wrote:
 On Thu, May 7, 2009 at 4:00 AM, xor x...@gmx.li wrote:
  On Thursday 07 May 2009 00:02:11 Evan Daniel wrote:
  The WoT documentation claims it is based upon the Advogato trust
  metric.  (Brief discussion: http://www.advogato.org/trust-metric.html
  Full paper: http://www.levien.com/thesis/compact.pdf )  I think this
  is wonderful, as I think there is much to recommend the Advogato
  metric (and I pushed for it early on in the WoT discussions).
  However, my understanding of the paper and what is actually
  implemented is that the WoT code does not actually implement it.
 
  I must admit that I do not know whether its claim that it implements
  Advogato is right or not. I have refactored the code but I have not
  modified the trust calculation logic and have not checked whether it is
  Advogato or not. Someone should probably do that.
 
  I don't have any specific ideas for how to choose whether to ignore
  identities, but I think you're making the problem much harder than it
  needs to be.
 
  Why exactly? Your post is nice but I do not see how it answers my
  question. The general problem my post is about: New identities are
  obtained by taking them from trust lists of known identities. An attacker
  therefore could put 100 identities in his trust list to fill up your
  database and slow down WoT. Therefore, an decision has to be made when to
  NOT import new identities from someone's trust list. In the current
  implementation, it is when he has a negative score.
 
  As I've pointed out, in the future there will be MULTIPLE webs of trust,
  for different contexts - Freetalk, Filesharing, Identity-Introduction
  (you can get a trust value from someone in that context when you solve a
  captcha he has published), so the question now is: Which context(s) shall
  be used to decide when to NOT import new identity's from someones trust
  list anymore?

 I have not examined the WoT code.  However, the Advogato metric has
 two attributes that I don't think the current WoT method has: no
 negative trust behavior (if there is a trust rating Bob can assign to
 Carol such that Alice will trust Carol less than if Bob had not
 assigned a rating, that's a negative trust behavior), and a
 mathematical proof as to the upper limit on the quantity of spammer
 nodes that get trusted.

 The Advogato metric is *specifically* designed to handle the case of
 the attacker creating millions of accounts.  In that case, his success
 is bounded (linear with modest constant) by the number of confused
 nodes -- that is, legitimate nodes that have (incorrectly) marked his
 accounts as legitimate.  If you look at the flow computation, it
 follows that for nodes for which the computed trust value is zero, you
 don't have to bother downloading their trust lists, so the number of
 such lists you download is similarly well controlled.


 Well I'm no mathematician, I cannot comment on that. I think toads argument
 sounds reasonable though: That there must be a way to distrust someone if the
 original person who trusted him disappears.

 I do not plan to change the trust logic on my own, I consider myself more as a
 programmer who can implement things than a designer of algorithms etc.

All the more reason to use Advogato (or some other metric with useful
provable properties) :)

The current WoT is entirely black magic alchemy.  Maybe it works,
maybe it doesn't, but we non-mathematicians have trouble saying
anything conclusive.  Alchemy is to be avoided; if you have the
ability to show why it works, it ceases to be alchemy.  Advogato is
certainly not perfect, but its limits are well defined (spammer
identities are linearly bounded by the trust granted to confused
legitimate identities) and imho acceptable -- if you've forced the
spammer to do manual work linearly proportional (with a sane constant)
to the amount of spam he wants to send, you've won.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-07 Thread Evan Daniel
On Thu, May 7, 2009 at 6:33 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Thursday 07 May 2009 21:32:42 Evan Daniel wrote:
 On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau m...@tommyserver.de wrote:
  Evan Daniel schrieb:
  I don't have any specific ideas for how to choose whether to ignore
  identities, but I think you're making the problem much harder than it
  needs to be.  The problem is that you need to prevent spam, but at the
  same time prevent malicious non-spammers from censoring identities who
  aren't spammers.  Fortunately, there is a well documented algorithm
  for doing this: the Advogato trust metric.
 
  The WoT documentation claims it is based upon the Advogato trust
  metric.  (Brief discussion: http://www.advogato.org/trust-metric.html
  Full paper: http://www.levien.com/thesis/compact.pdf )  I think this
  is wonderful, as I think there is much to recommend the Advogato
  metric (and I pushed for it early on in the WoT discussions).
  However, my understanding of the paper and what is actually
  implemented is that the WoT code does not actually implement it.
  Before I go into detail, I should point out that I haven't read the
  WoT code and am not fully up to date on the documentation and
  discussions; if I'm way off base here, I apologize.
 
  I think, you are:
 
  The advogato idea may be nice (i did not read it myself), if you have
 exactly 1 trustlist for
  everything. But xor wants to implement 1 trustlist for every app as people
 may act differently e.g.
  on firesharing than on forums or while publishing freesites. You basicly
 dont want to censor someone
  just because he tries to disturb filesharing while he may be tries to
 bring in good arguments at
  forum discussions about it.
  And i dont think that advogato will help here, right?

 There are two questions here.  The first question is given a set of
 identities and their trust lists, how do you compute the trust for an
 identity the user has not rated?  The second question is, how do you
 determine what trust lists to use in which contexts?  The two
 questions are basically orthogonal.

 I'm not certain about the contexts issue; Toad raised some good
 points, and while I don't fully agree with him, it's more complicated
 than I first thought.  I may have more to say on that subject later.

 Within a context, however, the computation algorithm matters.  The
 Advogato idea is very nice, and imho much better than the current WoT
 or FMS answers.  You should really read their simple explanation page.
  It's really not that complicated; the only reasons I'm not fully
 explaining it here is that it's hard to do without diagrams, and they
 already do a good job of it.

 It's nice, but it doesn't work. Because the only realistic way for positive
 trust to be assigned is on the basis of posted messages, in a purely casual
 way, and without the sort of permanent, universal commitment that any
 pure-positive-trust scheme requires: If he spams on any board, if I ever gave
 him trust and haven't changed that, then *I AM GUILTY* and *I LOSE TRUST* as
 the only way to block the spam.


How is that different from the current situation?  Either the fact
that he spams and you trust him means you lose trust because you're
allowing the spam through, or somehow the spam gets stopped despite
your trust -- which implies either that a lot of people have to update
their trust lists before anything happens, and therefore the spam
takes forever to stop, or it doesn't take that many people to censor
an objectionable but non-spamming poster.

I agree, this is a bad thing.  I'm just not seeing that the WoT system
is *that* much better.  It may be somewhat better, but the improvement
comes at a cost of trading spam resistance vs censorship ability,
which I think is fundamentally unavoidable.

There's another reason I don't see this as a problem: I'm working from
the assumption that if you can force a spammer to perform manual
effort on par with the amount of spam he can send, then the problem
*has been solved*.  The reason email spam and Frost spam is a problem
is not that there are lots of spammers; there aren't.  It's that the
spammers can send colossal amounts of spam.

The solution, imho, is mundane: if the occasional trusted identity
starts a spam campaign, I mark them as a spammer.  This is optionally
published, but can be ignored by others to maintain the positive trust
aspects of the behavior.  Locally, it functions as a slightly stronger
killfile: their messages get ignored, and their identity's trust
capacity is forced to zero.

In the context of the routing and data store algorithms, Freenet has a
strong prejudice against alchemy and in favor of algorithms with
properties that are both useful and provable from reasonable
assumptions, even though they are not provably perfect.  Like routing,
the generalized trust problem is non-trivial.  Advogato has such
properties; the current WoT and FMS algorithms do not: they are
alchemical

Re: [freenet-dev] Recent progress on Interdex

2009-05-12 Thread Evan Daniel
On Tue, May 12, 2009 at 4:26 PM, Ximin Luo xl...@cam.ac.uk wrote:

 (one way of storing it which would allow token-deflate would be having each
 indexnode as a CHK, then you'd only have to INS an updated node and all its
 parents up to the root, but i chose not to do this as CHKs have a higher limit
 for being turned into a splitfile. was this the right decision?)

My impression is that most of the time to download a key is the
routing time to find it, not the time to transfer the data once found.
 So a 32KiB CHK is only somewhat slower to download than a 1KiB SSK.
(Though I haven't seen hard numbers on this in ages, so I could be
completely wrong.)

My instinct is that the high latency for a single-key lookup that is
the norm for Freenet means that if using CHKs instead results in an
appreciably shallower tree, that will yield a performance improvement.
 The other effect to consider is how likely the additional data
fetched is to be useful to some later request.  Answering that is
probably trickier, since it requires reasonable assumptions about
index size and usage.
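
A quick back-of-envelope sketch (every size below is a guess, not
Interdex's actual layout) shows how much shallower the tree gets with
the larger container, which is why the container size matters so much
when each level costs a full routed lookup:

/**
 * Back-of-envelope sketch: depth of a B-tree style index when each node is
 * limited to a 1 KiB SSK versus a 32 KiB CHK, for an assumed entry size.
 */
public class IndexDepthSketch {
    static int depth(long entries, int nodeBytes, int bytesPerEntry) {
        int fanout = Math.max(2, nodeBytes / bytesPerEntry);
        int levels = 1;
        long reachable = fanout;
        while (reachable < entries) {
            reachable *= fanout;
            levels++;
        }
        return levels;   // = number of sequential fetches for one lookup
    }

    public static void main(String[] args) {
        long entries = 10_000_000L;      // hypothetical index size
        int bytesPerEntry = 64;          // hypothetical per-entry overhead
        System.out.println("1 KiB SSK nodes:  depth " + depth(entries, 1024, bytesPerEntry));      // 6
        System.out.println("32 KiB CHK nodes: depth " + depth(entries, 32 * 1024, bytesPerEntry)); // 3
    }
}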

It would be nice if there was a way to get some splitfile-type
redundancy in these indexes; otherwise uncommonly searched terms won't
be retrievable.  However, there's obviously a tradeoff with common
search term latency.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-13 Thread Evan Daniel
On Wed, May 13, 2009 at 9:03 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Friday 08 May 2009 02:12:21 Evan Daniel wrote:
 On Thu, May 7, 2009 at 6:33 PM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  On Thursday 07 May 2009 21:32:42 Evan Daniel wrote:
  On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau m...@tommyserver.de
 wrote:
   Evan Daniel schrieb:
   I don't have any specific ideas for how to choose whether to ignore
   identities, but I think you're making the problem much harder than it
   needs to be.  The problem is that you need to prevent spam, but at the
   same time prevent malicious non-spammers from censoring identities who
   aren't spammers.  Fortunately, there is a well documented algorithm
   for doing this: the Advogato trust metric.
  
   The WoT documentation claims it is based upon the Advogato trust
   metric.  (Brief discussion: http://www.advogato.org/trust-metric.html
   Full paper: http://www.levien.com/thesis/compact.pdf )  I think this
   is wonderful, as I think there is much to recommend the Advogato
   metric (and I pushed for it early on in the WoT discussions).
   However, my understanding of the paper and what is actually
   implemented is that the WoT code does not actually implement it.
   Before I go into detail, I should point out that I haven't read the
   WoT code and am not fully up to date on the documentation and
   discussions; if I'm way off base here, I apologize.
  
   I think, you are:
  
   The advogato idea may be nice (i did not read it myself), if you have
  exactly 1 trustlist for
   everything. But xor wants to implement 1 trustlist for every app as
 people
  may act differently e.g.
   on firesharing than on forums or while publishing freesites. You
 basicly
  dont want to censor someone
   just because he tries to disturb filesharing while he may be tries to
  bring in good arguments at
   forum discussions about it.
   And i dont think that advogato will help here, right?
 
  There are two questions here.  The first question is given a set of
  identities and their trust lists, how do you compute the trust for an
  identity the user has not rated?  The second question is, how do you
  determine what trust lists to use in which contexts?  The two
  questions are basically orthogonal.
 
  I'm not certain about the contexts issue; Toad raised some good
  points, and while I don't fully agree with him, it's more complicated
  than I first thought.  I may have more to say on that subject later.
 
  Within a context, however, the computation algorithm matters.  The
  Advogato idea is very nice, and imho much better than the current WoT
  or FMS answers.  You should really read their simple explanation page.
   It's really not that complicated; the only reasons I'm not fully
  explaining it here is that it's hard to do without diagrams, and they
  already do a good job of it.
 
  It's nice, but it doesn't work. Because the only realistic way for
 positive
  trust to be assigned is on the basis of posted messages, in a purely
 casual
  way, and without the sort of permanent, universal commitment that any
  pure-positive-trust scheme requires: If he spams on any board, if I ever
 gave
  him trust and haven't changed that, then *I AM GUILTY* and *I LOSE TRUST*
 as
  the only way to block the spam.

 How is that different than the current situation?  Either the fact
 that he spams and you trust him means you lose trust because you're
 allowing the spam through, or somehow the spam gets stopped despite
 your trust -- which implies either that a lot of people have to update
 their trust lists before anything happens, and therefore the spam
 takes forever to stop, or it doesn't take that many people to censor
 an objectionable but non-spamming poster.

 I agree, this is a bad thing.  I'm just not seeing that the WoT system
 is *that* much better.  It may be somewhat better, but the improvement
 comes at a cost of trading spam resistance vs censorship ability,
 which I think is fundamentally unavoidable.

 So how do you solve the contexts problem? The only plausible way to add trust
 is to do it on the basis of valid messages posted to the forum that the user
 reads. If he posts nonsense to other forums, or even introduces identities
 that spam other forums, the user adding trust probably does not know about
 this, so it is problematic to hold him responsible for that. In a positive
 trust only system this is unsolvable afaics?

 Perhaps some form of feedback/ultimatum system? Users who are affected by spam
 from an identity can send proof that the identity is a spammer to the users
 they trust who trust that identity. If the proof is valid, those who trust
 the identity can downgrade him within a reasonable period; if they don't do
 this they get downgraded themselves?

I don't have an easy solution for the contexts issue.  As I see it,
there are several related but distinct issues:
-- Given a set of trust ratings, what is the algorithm

Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-13 Thread Evan Daniel
On Wed, May 13, 2009 at 12:58 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Wednesday 13 May 2009 15:47:24 Evan Daniel wrote:
 On Wed, May 13, 2009 at 9:03 AM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  On Friday 08 May 2009 02:12:21 Evan Daniel wrote:
  On Thu, May 7, 2009 at 6:33 PM, Matthew Toseland
  t...@amphibian.dyndns.org wrote:
   On Thursday 07 May 2009 21:32:42 Evan Daniel wrote:
   On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau m...@tommyserver.de
  wrote:
Evan Daniel schrieb:
I don't have any specific ideas for how to choose whether to ignore
identities, but I think you're making the problem much harder than
 it
needs to be.  The problem is that you need to prevent spam, but at
 the
same time prevent malicious non-spammers from censoring identities
 who
aren't spammers.  Fortunately, there is a well documented algorithm
for doing this: the Advogato trust metric.
   
The WoT documentation claims it is based upon the Advogato trust
metric.  (Brief discussion:
 http://www.advogato.org/trust-metric.html
Full paper: http://www.levien.com/thesis/compact.pdf )  I think
 this
is wonderful, as I think there is much to recommend the Advogato
metric (and I pushed for it early on in the WoT discussions).
However, my understanding of the paper and what is actually
implemented is that the WoT code does not actually implement it.
Before I go into detail, I should point out that I haven't read the
WoT code and am not fully up to date on the documentation and
discussions; if I'm way off base here, I apologize.
   
I think, you are:
   
The advogato idea may be nice (i did not read it myself), if you
 have
   exactly 1 trustlist for
everything. But xor wants to implement 1 trustlist for every app as
  people
   may act differently e.g.
on firesharing than on forums or while publishing freesites. You
  basicly
   dont want to censor someone
just because he tries to disturb filesharing while he may be tries
 to
   bring in good arguments at
forum discussions about it.
And i dont think that advogato will help here, right?
  
   There are two questions here.  The first question is given a set of
   identities and their trust lists, how do you compute the trust for an
   identity the user has not rated?  The second question is, how do you
   determine what trust lists to use in which contexts?  The two
   questions are basically orthogonal.
  
   I'm not certain about the contexts issue; Toad raised some good
   points, and while I don't fully agree with him, it's more complicated
   than I first thought.  I may have more to say on that subject later.
  
   Within a context, however, the computation algorithm matters.  The
   Advogato idea is very nice, and imho much better than the current WoT
   or FMS answers.  You should really read their simple explanation page.
    It's really not that complicated; the only reasons I'm not fully
   explaining it here is that it's hard to do without diagrams, and they
   already do a good job of it.
  
   It's nice, but it doesn't work. Because the only realistic way for
  positive
   trust to be assigned is on the basis of posted messages, in a purely
  casual
   way, and without the sort of permanent, universal commitment that any
   pure-positive-trust scheme requires: If he spams on any board, if I
 ever
  gave
   him trust and haven't changed that, then *I AM GUILTY* and *I LOSE
 TRUST*
  as
   the only way to block the spam.
 
  How is that different than the current situation?  Either the fact
  that he spams and you trust him means you lose trust because you're
  allowing the spam through, or somehow the spam gets stopped despite
  your trust -- which implies either that a lot of people have to update
  their trust lists before anything happens, and therefore the spam
  takes forever to stop, or it doesn't take that many people to censor
  an objectionable but non-spamming poster.
 
  I agree, this is a bad thing.  I'm just not seeing that the WoT system
  is *that* much better.  It may be somewhat better, but the improvement
  comes at a cost of trading spam resistance vs censorship ability,
  which I think is fundamentally unavoidable.
 
  So how do you solve the contexts problem? The only plausible way to add
 trust
  is to do it on the basis of valid messages posted to the forum that the
 user
  reads. If he posts nonsense to other forums, or even introduces identities
  that spam other forums, the user adding trust probably does not know about
  this, so it is problematic to hold him responsible for that. In a positive
  trust only system this is unsolvable afaics?
 
  Perhaps some form of feedback/ultimatum system? Users who are affected by
 spam
  from an identity can send proof that the identity is a spammer to the
 users
  they trust who trust that identity. If the proof is valid, those who trust
  the identity can downgrade him within

Re: [freenet-dev] a social problem with Wot (was: Hashcash introduction, was: Question about WoT )

2009-05-13 Thread Evan Daniel
On Wed, May 13, 2009 at 4:28 PM, xor x...@gmx.li wrote:
 On Wednesday 13 May 2009 10:01:31 Luke771 wrote:
 Thomas Sachau wrote:
  Luke771 schrieb:
  I can't comment on the technical part because I wouldnt know what im
  talking about.
  However, I do like the 'social' part (being able to see an identity even
  if the censors mark it down right away as it's created)
 
  The censors? There is no central authority to censor people. Censors
  can only censor the web-of-trust for those people that trust them and
  which want to see a censored net. You cant and should not prevent them
  from this, if they want it.

 This has been discussed a lot.
 The fact that censorship isn't done by a central authority but by mob
 rule is irrelevant.
 Censorship in this context is blocking users based on the content of
 their messages.

  The whole point  is basically this: A tool created to block flood
 attacks  is being used to discriminate against a group of users.

 Now, it is true that they can't really censor anything because users can
 decide what trust lists to use, but it is also true that this abuse of
 the wot does creates problems. They are social problems and not
 technical ones, but still 'freenet problems'.

 If we see the experience with FMS as a test for the Web of Trust, the
 result of that test is in my opinion something in between a miserable
 failure and a catastrophe.

 The WoT never got to prove itself against a real flood attack, we have
 no idea what would happen if someone decided to attack FMS, not even if
 the WoT would stop the attempted attack at all, leave alone finding out
 how fast and/or how well it would do it.

 In other words, for what we know, the WoT may very well be completely
 ineffective against a DoS attack.
 All we know about it is that the WoT can be used to discriminate against
 people, we know that it WILL be used in that way, and we know that
 because of a proven fact: it's being used to discriminate against people
 right now, on FMS

 That's all we know.
 We know that some people will abuse WoT, but we dont really know if it
 would be effective at stopping DoS attacks.
 Yes, it should work, but we don't 'know'.

  The WoT has never been tested to actually do the job it's designed to do,
 yet the Freenet 'decision makers' are acting as if the WoT had proven
 its validity beyond any reasonable doubt, and at the same time they
 decide to ignore the only one proven fact that we have.

 This whole situation is ridiculous,  I don't know if it's more funny or
 sad...  it's grotesque. It reminds me of our beloved politicians, always
 knowing what's the right thing to do, except that it never works as
 expected.


 No, it is not ridiculous, you are just having a point of view which is not
 abstract enough:

 If there is a shared medium (= Freenet, Freetalk, etc.) which is writable by
 EVERYONE, it is absolutely IMPOSSIBLE to *automatically* (as in by writing an
 intelligent software) distinguish spam from useful uploads, because
 EVERYONE can be evil.

 EITHER you manually view every single piece of information which is uploaded
 and decide yourself whether you consider it as spam or not OR you adopt the
 ratings of other people so each person only has to rate a small subset of the
 uploaded data. There are no other options.

 And what the web of trust does is exactly the second option: it load
 balances the content rating equally between all users.

While your statement is trivially true (assuming we ignore some fairly
potent techniques like Bayesian classifiers that rely neither on
additional work by the user nor on the opinions of others...),
it misses the real point: the fact that WoT spreads the work around
does not mean it does so efficiently or effectively, or that the
choices it makes wrt various design tradeoffs are actually the choices
that we, as its users, would make if we considered those choices
carefully.

A web of trust is a complex system, the entire purpose of which is to
create useful emergent behaviors.  Too much focus on the micro-level
behavior of the parts of such a system, instead of the emergent
properties of the system as a whole, means that you won't get the
emergent properties you wanted.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] a social problem with Wot (was: Hashcash introduction, was: Question about WoT )

2009-05-14 Thread Evan Daniel
On Thu, May 14, 2009 at 4:22 AM, xor x...@gmx.li wrote:
 On Wednesday 13 May 2009 22:48:53 Evan Daniel wrote:
 On Wed, May 13, 2009 at 4:28 PM, xor x...@gmx.li wrote:
  On Wednesday 13 May 2009 10:01:31 Luke771 wrote:
  Thomas Sachau wrote:
   Luke771 schrieb:
   I can't comment on the technical part because I wouldnt know what im
   talking about.
   However, I do like the 'social' part (being able to see an identity
   even if the censors mark it down it right away as it's created)
  
   The censors? There is no central authority to censor people.
   Censors can only censor the web-of-trust for those people that trust
   them and which want to see a censored net. You cant and should not
   prevent them from this, if they want it.
 
  This have been discussed  a lot.
  the fact that censoship isnt done by a central authority but by a mob
  rule is irrelevant.
  Censorship in this contest is blocking users based on the content of
  their messages
 
   The whole point  is basically this: A tool created to block flood
  attacks  is being used to discriminate against a group of users.
 
  Now, it is true that they can't really censor anything because users can
  decide what trust lists to use, but it is also true that this abuse of
  the wot does creates problems. They are social problems and not
  technical ones, but still 'freenet problems'.
 
  If we see the experience with FMS as a test for the Web of Trust, the
  result of that test is in my opinion something in between a miserable
  failure and a catastrophe.
 
  The WoT never got to prove itself against a real flood attack, we have
  no idea what would happen if someone decided to attack FMS, not even if
  the WoT would stop the attempted attack at all, leave alone finding out
  how fast and/or how well it would do it.
 
  In other words, for what we know, the WoT may very well be completely
  ineffective against a DoS attack.
  All we know about it is that the WoT can be used to discriminate against
  people, we know that it WILL be used in that way, and we know that
  because of a proven fact: it's being used to discriminate against people
  right now, on FMS
 
  That's all we know.
  We know that some people will abuse WoT, but we dont really know if it
  would be effective at stopping DoS attacks.
  Yes, it should work, but we don't 'know'.
 
  The WoT has never been tested t actually do the job it's designed to do,
  yet the Freenet 'decision makers' are acting as if the WoT had proven
  its validity beyond any reasonable doubt, and at the same time they
  decide to ignore the only one proven fact that we have.
 
  This whole situation is ridiculous,  I don't know if it's more funny or
  sad...  it's grotesque. It reminds me of our beloved politicians, always
  knowing what's the right thing to do, except that it never works as
  expected.
 
  No, it is not ridiculous, you are just having a point of view which is
  not abstract enough:
 
  If there is a shared medium (= Freenet, Freetalk, etc.) which is writable
  by EVERYONE, it is absolutely IMPOSSIBLE to *automatically* (as in by
  writing an intelligent software) distinguish spam from useful uploads,
  because EVERYONE can be evil.
 
  EITHER you manually view every single piece of information which is
  uploaded and decide yourself whether you consider it as spam or not OR
  you adopt the ratings of other people so each person only has to rate a
  small subset of the uploaded data. There are no other options.
 
  And what the web of trust does is exactly the second option: it load
  balances the content rating equally between all users.

 While your statement is trivially true (assuming we ignore some fairly
 potent techniques like bayesian classifiers that rely on neither
 additional work by the user or reliance on the opinions of others...),

 Bayesian filters DO need input: You need to give them old spam and non-spam
 messages so that they can decide about new input.

 But they cannot help Freetalk because they cannot prevent identity spam,
 i.e. the creation of very large amounts of identities.

They do not require input from *other people*.


 it misses the real point: the fact that WoT spreads the work around
 does not mean it does so efficiently or effectively, or that the
 choices it makes wrt various design tradeoffs are actually the choices
 that we, as its users, would make if we considered those choices
 carefully.

 A web of trust is a complex system, the entire purpose of which is to
 create useful emergent behaviors.  Too much focus on the micro-level
 behavior of the parts of such a system, instead of the emergent
 properties of the system as a whole, means that you won't get the
 emergent properties you wanted.


 Yes, the current web of trust implementation might not be perfect. But it is
 one of the only solutions to the spam problem, if not the only.

 So the question is not whether to use a WoT but rather how to program the WoT
 to fit our purposes.

 Well anyway

Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-14 Thread Evan Daniel
, but manageable.  Second,
I think that if the amount of spam they can send is limited to that
level, they (generally) won't bother in the first place, and so in
practice you will only rarely see even that level of spam.

Constraining trust list changes is definitely required.  I would start
with a system that says that if Alice is calculating her trust web,
and Bob has recently removed Sam from his trust list, then when Alice
is propagating trust through Bob's node, she starts by requiring one
unit of flow go to Sam before anyone else on the list, but that that
unit of flow has no effect on Alice's computation of Sam's
trustworthiness.  Or, equivalently, Bob's connection to the supersink
is sized as (1 + number of recently un-trusted identities) rather than
the normal constant 1.  That allows Bob to manage his trust list in a
prompt fashion, but if he removes people from it then he is prevented
from adding new people to replace them too rapidly.  The definition of
recent could be tweaked as well, possibly something like only 1
identity gets removed from the recent list per time period, rather
than a fixed window during which any removed id counts as recently
removed.
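
In code terms the rule is tiny (hypothetical numbers, just to pin down
the arithmetic): identities Bob has recently dropped still consume one
unit of his capacity each, so dropping people does not immediately free
up flow for brand-new replacements.

/**
 * Sketch of the suggested churn rule: the supersink edge is sized as
 * (1 + recently removed identities), so the flow left to propagate to the
 * current trust list shrinks accordingly.  Not WoT code.
 */
public class ChurnLimitSketch {
    static int flowLeftForTrustList(int capacity, int recentlyRemoved) {
        int reservedForSink = 1 + recentlyRemoved;   // enlarged supersink edge
        return Math.max(0, capacity - reservedForSink);
    }

    public static void main(String[] args) {
        System.out.println(flowLeftForTrustList(10, 0)); // 9: normal case
        System.out.println(flowLeftForTrustList(10, 3)); // 6: Bob dropped 3 ids recently
    }
}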

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-14 Thread Evan Daniel
On Thu, May 14, 2009 at 6:14 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Thursday 14 May 2009 17:33:29 Evan Daniel wrote:
 On Thu, May 14, 2009 at 11:32 AM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  IMHO these are not solutions to the contexts problem -- it merely
  shifts the balance between allowing spam and allowing censorship.  In
  one case, the attacker can build trust in one context and use it to
  spam a different context.  In the other case, he can build trust in
  one context and use it to censor in another.
 
  Right now, the only good answer I see to contexts is to make them
  fully independent.  Perhaps I missed it, but I don't recall a
  discussion of how any other option would work in any detail -- the
  alternative under consideration appears to be to treat everything as
  one unified context.  I'm not necessarily against that, but the
  logical conclusion is that you're responsible for paying attention to
  everything someone you've trusted does in all contexts in which you
  trust them -- which, for a unified context, means everywhere.
 
  Having to bootstrap on each forum would be _bad_. Totally impractical.
 
  What about ultimatums? these above refers to WoT with negative trust,
 right?
  Ultimatums: I mark somebody as a spammer, I demand that my peers mark him
 as
  a spammer, they evaluate the situation, if they don't mark the spammer as
  spammer then I mark them as spammer.

 Right.  So all the forums go in a single context.  I don't see how you
 can usefully define two different contexts such that trust is common
 to them but responsibility is not.  I think the right solution (at
 least for now) is one context per application.  So you have to
 bootstrap into the forums app, and into the filesharing app, and into
 the mail app, but not per-forum.  Otherwise I have to be able to
 evaluate possible spam in an application I may not have installed.

 Ultimatums sound like a reasonable approach.  Though if Alice sends
 Bob an ultimatum about Bob's trust for Sam, and Bob does not act, I'm
 inclined to think that Alice's client should continue downloading
 Bob's messages, but cease publishing a trust rating for Bob.  After
 all, Bob might just be lazy, in which case his trust list is worthless
 but his messages aren't.

 Agreed, I have no problem with not reducing message trust in this case.

   Also, I don't see how this attack is specific to the Advogato metric.
   It works equally well in WoT / FMS.  The only thing stopping it there
   is users manually examining each other's trust lists to look for such
   things.  If you assume equally vigilant users with Advogato the attack
   is irrelevant.
  
   It is solvable with positive trust, because the spammer will gain trust
  from
   posting messages, and lose it by spamming. The second party will likely
 be
   the stronger in most cases, hence we get a zero or worse outcome.
 
  Which second party?
 
  The group of users affected by the spam. The first party is the group of
 users
  who are not affected by the spam but appreciate the spammer's messages to
 a
  forum and therefore give him trust.

 Ah.  You meant solvable with *negative* trust then?

 Yes, sorry.

There's a potential problem here (in the negative trust version): if
you post good stuff in a popular forum, and spam in a smaller one, the
fact that the influence of any one person is bounded means that you
might keep your overall trust rating positive.  XKCD describes the
problem well:  http://xkcd.com/325/

I continue to think that the contexts problem is nontrivial, though
different systems will have different tradeoffs.  Fundamentally, I
think that if trust and responsibility apply to different regions,
there are potential problems.


  OK.
 
  I think you really mean Pure positive only works *perfectly* if every
  user...
 
  Hmm, maybe.
 
  We don't need a perfect system that stops all spam and
  nothing else.  Any system will have some failings.  Minimizing those
  failings should be a design goal, but knowing where we expect those
  failings to be, and placing them where we want them, is also an
  important goal.
 
  Or, looked at another way:  We have ample evidence that people will
  abuse the new identity creation process to post spam.  That is a
  problem worth expending significant effort to solve.  Do we have
  evidence that spammers will actually exert per-identity manual effort
  in order to send problematic amounts of spam?
 
  I don't see why it would be per-identity.

 Per fake identity that will be sending spam.  If they can spend manual
 effort to create a trusted id, and then create unlimited fake ids
 bootstrapped from that one to spam with, that's a problem.  If the
 amount of effort they have to spend is linear with the number of ids
 sending spam, that's not a problem, regardless of whether the effort
 is spent on the many spamming ids or the single bootstrap id.

 Because there is a limit on churn, and one spamming identity

Re: [freenet-dev] Why WoTs won't work....

2009-05-21 Thread Evan Daniel
It's not all that interesting.  It has been discussed to death many
times.  The Advogato algorithm (or something like it) solves this
problem (not perfectly, but far, far better than the current FMS / WoT
alchemy), as I have explained in great detail.

Evan Daniel

On Sat, May 9, 2009 at 12:57 PM,  gu...@gmx.org wrote:
 Interesting discussion from Frost, especially the last post at the bottom. 
 It's about WoTs in general and why they won't work.



 - hahaha...@yle3zhs5lkiwe3fdjyqlcf5+rka - 2009.04.05 - 02:28:11GMT 
 -

 I had to forward this one here.

 --- jezreel@X~GLTTHo9aaYtIpGT6OOyBMMFwl3b8LwFu6TUw9Q82E sent via FMS on 
 2009-04-05 at 01:31:54GMT ---

 Probably not an amazing subject, but FMS is so dead lately so what the
 hell.  Falafel, why do some of the folks posting to Frost hate you so
 much?  In particular Luke771 and VolodyA.  You generally seem like a
 nice identity so I'm just curious.
 --
 jezr...@pbxwgrdegrigwuteoz3tc6cfla2xu3trmi2tgr2enfrvi4bxkzlectshfvyw2wcxkrffslccijku4z2om5gtoojrpzdfcolkovtdawjuifigkrcqjbevsvrnla2xotcdpfvw6lkmkzzsyqkrifbucqkf.freemail

 ;))

 - VolodyA! V a...@r0pa7z7ja1haf2xttt7aklre+yw - 2009.04.11 - 
 12:11:26GMT -

 I should point out i do not hate falafel, i do not know him. I do hate what 
 he is doing to Freenet, however. If he were to stop supporting the introduction of 
 censorship on Freenet i would not say anything bad about him.

 In fact i tend to agree with much of what he has to say on other subjects.

 --
 May all the sentient beings benefit from our conversation.

 - denmin...@dlkkaikia79j4ovpbgfk4znh25y - 2009.04.14 - 16:13:07GMT 
 -

 Falafel is doing nothing. If a single guy can make the protocol not work, 
 then the protocol is shitty from the start and we need to write a new one.

 - Anonymous - 2009.04.14 - 20:46:59GMT -

 The only shit around here is spewing from the mouths of those who don't 
 understand how it works.  No one can stop you from seeing what you want to 
 see.  Anyone who tells you otherwise is spreading misinformation.

 - luke...@ch4jcmdc27eeqm9cw_wvju+coim - 2009.05.04 - 01:01:46GMT -


 No one can really censor FMS alright, BUT there IS a problem with those 
 'censored trust lists' anyway.
 The existence of censored trust lists forces users to actively maintain their 
 own trust lists; the WoT won't work 'on its own' as it would if everyone used 
 it the way it's supposed to.

 Let me try to explain: if everyone used WoT to block flood attacks and 
 nothing else, new users wouldn't need to try and find out which trust lists 
 are 'good', they wouldn't need to work on their trust lists for hours every 
 day, try to spot censors or guys who won't block pedos; they could simply 
 use FMS and occasionally set a high trust for someone they actually trust, or 
 lower the trust for someone they caught spamming.

 But the current situation makes FMS a pain in the ass. Users have to work on 
 their trust lists regularly, and new users risk having (and probably do have) some 
 of their content blocked by some censor because the guy posted one message on a 
 board that the censor found 'immoral'.

 It may take time until the new user figures out which trust lists to use, and 
 there's a very real risk that he would think that it isn't worth the hassle 
 and give up on FMS completely.
 I did that, others did that, and more will.

 THAT is the real problem with the Little Brothers, not their non-existent 
 ability to censor content. They can't censor anything and they know it. But 
 they can and do make FMS a pain in the ass to use.

 Another problem is that, assuming that the FMS community will survive (which 
 i don't think it will), it may end up split into a number of closed 
 sub-communities that refuse to talk to each other. But this is only a guess, 
 so far. We'll have to see how it turns out.

 In the meantime, making FMS into a PITA has been done already; that is why 
 FMS is as good as dead, and that's why I think that investing developers' 
 time and effort into WoT and Freetalk is a huge waste: FMS failed because of 
 human stupidity and arrogance, and so will Freetalk/WoT, and I really can't 
 understand why the devs can't see the obvious (or refuse to admit it).

 BTW, I don't hate Falafel. Hate costs energy. A lot of it.
 --
 FAFS - the Freenet Applications FreeSite
 u...@ugb~uuscsidmi-ze8laze~o3buib3s50i25riwdh99m,9T20t3xoG-dQfMO94LGOl9AxRTkaz~TykFY-voqaTQI,AQACAAE/FAFS/47/

 -Don't think out of the box: destroy it!-
 ___
 Devl mailing list
 Devl@freenetproject.org
 http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Why WoTs won't work....

2009-05-22 Thread Evan Daniel
: stop trusting spammers or we'll stop trusting you. This would have
 to be answered in a reasonable time, hence is a problem for those not
 constantly at their nodes.

 evanbd has argued that the latter two measures are unnecessary, and that the
 limited number of spam identities that any one identity can introduce will
 make the problem manageable. An attacker who just introduces via a CAPTCHA
 will presumably only get short-lived trust, and if he only posts spam he
 won't get any positive trust. An attacker who contributes to boards to gain
 trust to create spamming sub-identities with has to do manual work to gain
 and maintain reputation among some sub-community. A newbie will not see old
 captcha-based spammers, only new ones, and those spam identities that the
 attacker's main, positive identity links to. He will have to manually block
 each such identity, because somebody is bound to have positive trust for the
 spammer parent identity.

Well...  I argue that they *may* be unnecessary.  Specifically, I
think we can defer implementation until there are problems that
warrant it.


 In terms of UI, if evanbd is right, all we need is a button to mark the poster
 of a message as a spammer (and get rid of all messages from them), and a
 small amount of automatic trust when answering a message (part of the UI so
 it can be disabled). Only those users who know how, and care enough, would
 actually change the trust for the spammer-parent, and in any case doing so
 would only affect them and contribute nothing to the community.

 But if he is wrong, or if an attacker is sufficiently determined, we will also
 need some way to detect spam-parents, and send them ultimatums.

I'm not certain that's the right way to grant manual trust.  (Or
perhaps we need more than one level of it.)  You don't want a spammer
to be able to get manual trust by posting a message to the test board
consisting only of "Hi, can anyone see this?" -- they can do that
automatically.  I think there should be a pair of buttons, "mark
spammer" and "mark non-spammer".

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Why WoTs won't work....

2009-05-22 Thread Evan Daniel
.  The result is that his one main
identity can get a large quantity of spam through, even though it can
only mark a limited number of child identities trusted and each of
them can only send a limited amount of spam.

Also, what do you mean by "review of identities added from others"?
Surely you don't mean that I should have to manually review every
poster?  Isn't the whole point of using a wot in the first place that
I can get good trust estimates of people I've never seen before?


 It probably also requires:
 - Some indication of which trusted identities trust a spammer when you mark 
 an
 identity as a spammer.

 In FMS, you can simply watch the list of trusts of the spammer identity to 
 get this information.

 - Sending an ultimatum to the trusted identity that trusts more than one
 spammer: stop trusting spammers or we'll stop trusting you. This would have
 to be answered in a reasonable time, hence is a problem for those not
 constantly at their nodes.

 You may notify him about it, if you want (either publicly, or via private 
 message if implemented), but basically, why this warning? Does it help him 
 in any way if we trust him, or does it harm him if we no longer trust him? 
 At least in FMS it does not change his visibility, but it may change the 
 trust-list trust that others get for him and so may or may not include his 
 trusts.

Having a well-connected graph is useful, regardless of the algorithm.
If the reason the person trusted a spammer was that they made an
honest mistake (or got scammed by a bait-and-switch, or...) then you
may want to continue using their trust list but inform them of the
problem.  If they don't want to fix the problem, you probably don't
want to continue using their trust list.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Why WoTs won't work....

2009-05-22 Thread Evan Daniel
On Fri, May 22, 2009 at 12:39 PM, Thomas Sachau m...@tommyserver.de wrote:
 Evan Daniel schrieb:
 On Fri, May 22, 2009 at 10:48 AM, Thomas Sachau m...@tommyserver.de wrote:
 Matthew Toseland schrieb:
 On Friday 22 May 2009 08:17:55 bbac...@googlemail.com wrote:
  Isn't his point that the users just won't maintain the trust lists?
  I thought that is the problem he meant; how can Advogato help us
  here?
 Advogato with only positive trust introduces a different tradeoff, which is
 still a major PITA to maintain, but maybe less of one:
 - Spammers only disappear when YOU mark them as spammers, or ALL the people
 you trust do. Right now they disappear when the majority, from the point of
 view of your position on the WoT, mark them as spammers (more or less).
  So this is a disadvantage of Advogato against the current FMS implementation. 
  With the current FMS implementation, only a majority of trusted identities 
  need to mark him down; with Advogato, either all original trusters need to 
  mark him down or you need to do it yourself (either mark him down, or mark 
  down everyone who trusts him), so
  FMS 1:0 Advogato

 As I've said repeatedly, I believe there is a fundamental tradeoff
 between spam resistance and censorship resistance, in the limiting
 case.  (It's obviously possible to have an algorithm that does poorly
 at both.)  Advogato *might* let more spam through than FMS.  There is
 no proof provided for how much spam FMS lets through; with Advogato it
 is limited in a provable manner.  Alchemy is a bad thing.  FMS
 definitely makes censorship by the mob easier.  By my count, that's a
 win for Advogato on both.

 I don't think you can divide between spam resistance and censorship 
 resistance for a simple reason: who defines what sort of action or text is 
 spam? Many people may mostly agree about some sort of action or content 
 being spam, but others could claim the reduced visibility is censorship.
 And I don't see any alchemy with the current trust system of FMS; if 
 something is alchemy and not clear, please point it out, and the exact 
 point please.
 And FMS does not make censorship by a mob easier, simply because you should 
 select the people you trust yourself, like you should select your friends 
 and darknet peers yourself. If you let others do it for you, don't argue 
 about what follows (like a censored view on FMS).

Yes, the spam and censorship problems are closely related.  That's why
I say there is something of a tradeoff between them.  The problem with
FMS should be obvious: if some small group actively tries to censor
things I consider non-spam, then it requires a significant amount of
effort by me to stop that.  I have to look at trust lists that mostly
contain valid markings, and belong to real people posting real
messages, and somehow determine that some of the entries on them are
invalid, and then decide not to trust their trust list.  Furthermore,
I have to do this without actually examining each entry on their trust
list -- I'm trying to look at *less* spam here, not more.  The result
is a balkanization of trust lists based on differing policies.  Any
mistakes I make will go unnoticed, since I won't see the erroneously
rejected messages.

In FMS, a group with permissive policies (spam filtering only) and a
group that filtered content they found objectionable can't make
effective use of each other's trust lists.  However, the former group
would like to trust the not-spammer ratings made by the latter group,
and the latter group would like to trust the spammer ratings made by
the former.  AIUI, the balkanization of FMS trust lists largely
prevents this.  Advogato would allow the permissive group to make use
of the less permissive group's ratings, without allowing them to act
as censors.

IMHO, the Advogato case is better for two reasons: first, favoring
those who only want to stop spam over those who want to filter
objectionable content is more consistent with the philosophy behind
Freenet.  Second, spam filters of any sort should be biased towards
type II errors, since they're less problematic and easier to correct.

Essentially, I think that FMS goes overboard in its attempts to reduce
spam.  It is my firm belief that limiting the amount of spam that can
be sent to a modest linear function of the amount of *manual* effort a
spammer exerts is sufficient.  Spam is a problem in both Frost and
email because spammers can simply run bots.  The cost of FMS, both in
worry over mob censorship and work required to maintain trust lists,
is very high.  I think that the total effort spent by the community
would be reduced by the use of an algorithm that took more effort to
stop spammers, and less effort to enable normal communications.  We
need to be aware of what we optimize for, and make sure it's really
what we want.

I've explained why FMS is alchemy before, but it's an important point,
so I don't mind repeating it.  FMS has some goals, and it performs an
algorithm.  There is no proof

Re: [freenet-dev] Why WoTs won't work....

2009-05-22 Thread Evan Daniel
On Fri, May 22, 2009 at 4:16 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Friday 22 May 2009 15:39:06 Evan Daniel wrote:
 On Fri, May 22, 2009 at 8:17 AM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:
  On Friday 22 May 2009 08:17:55 bbac...@googlemail.com wrote:
  Isn't his point that the users just won't maintain the trust lists?
  I thought that is the problem he meant; how can Advogato help us
  here?
 
  Advogato with only positive trust introduces a different tradeoff, which is
  still a major PITA to maintain, but maybe less of one:
  - Spammers only disappear when YOU mark them as spammers, or ALL the people
  you trust do. Right now they disappear when the majority, from the point of
  view of your position on the WoT, mark them as spammers (more or less).

 When they *fail to mark them as trusted*.  It's an important
 distinction, as it means that in order for the spammer to do anything
 they first have to *manually* build trust.  If an identity suddenly
 starts spamming, only people that originally marked it as trusted have
 to change their trust lists in order to stop them.

  - If you mark a spammer as positive because he posts useful content on one
  board, and you don't read the boards he spams you are likely to get marked 
  as
  a spammer yourself.

 Depends how militant people are.  I suspect in practice people won't
 do this unless you trust a lot of spammers... in which case they have
 a point.  (This is also a case for distinguishing message trust from
 trust list trust; while Advogato doesn't do this, the security proof
 extends to cover it without trouble.)  You can take an in-between
 step: if Alice marks both Bob and Carol as trusted, and Bob marks
 Carol a spammer, Alice's software notices and alerts Alice, and offers
 to show Alice recent messages from Carol from other boards.
 (Algorithmically, publishing Sam is a spammer is no different from
 not publishing anything about Sam, but it makes some nice things
 possible from a UI standpoint.)  This may well get most of the benefit
 of ultimatums with lower complexity.

 Right, this is something I keep forgetting to mention. When marking a user as 
 a spammer, the UI should ask the user about people who trust that spammer and 
 other spammers. However, it does encourage militancy, doesn't it? It 
 certainly doesn't solve the problem the way that ultimatums do...

I don't know how much militancy the software should encourage.  I'm
inclined to think it should start low, and then change it if that
doesn't work.


  - If a spammer doesn't spam himself, but gains trust through posting useful
  content on various boards and then spends this trust by trusting spam
  identities, it will be necessary to give him zero message list trust. Again
  this has serious issues with collateral damage, depending on how
  trigger-happy people are and how much of a problem it is for newbies to see
  spam.
 
  Technologically, this requires:
  - Changing WoT to only support positive trust. This is more or less a one 
  line
  change.

 If all you want is positive trust only, yes.  If you want the security
 proof, it requires using the network flow algorithm as specified in
 the paper, which is a bit more complex.  IMHO, fussing with the
 algorithm in ways that don't let you apply the security proof is just
 trading one set of alchemy for another -- it might help, but I don't
 think it would be wise.

 I was under the impression that WoT already used Advogato, apart from 
 supporting negative trust values and therefore negative trust.

The documentation mentions Advogato, and there are some diagrams that
relate to it, but none of the detailed description of the algorithm is
at all related.  Advogato is based on network flow computation.  WoT
as described on the freesite is not -- an identity with 40 trust
points is permitted to give up to 40 points *each* to any number of
child identities, with the actual number given determined by a trust
rating.

In contrast, Advogato has multiple levels of trust, and each identity
either trusts or does not trust each other identity at a given level.
The number of trust points an identity gets is based on capacity and
the optimal flow path.  It does not speak to how trustworthy that
identity is; at a given trust level, the algorithm either accepts or
does not accept a given identity.  Multiple trust levels (eg, a level
for captcha solving and a level for manual trust) implies running the
algorithm multiple times on different (though related) graphs; when
running at a given level, connections at that level and all higher
levels are used.

This implies running Ford-Fulkerson or similar; it's more complicated
than the current system, though not drastically so.
http://en.wikipedia.org/wiki/Ford-Fulkerson_algorithm
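To make the above concrete, here is a rough sketch of that flow computation
in Python (illustrative only: the data structures, function names, and
capacity numbers are invented for this sketch and are not taken from the WoT
plugin or from Advogato's own code).  Each identity is split into an in/out
pair so node capacities can be enforced, Edmonds-Karp (BFS-based
Ford-Fulkerson) pushes flow from the trust root to a supersink, and an
identity is accepted iff its unit edge to the sink carries flow:

    from collections import deque, defaultdict

    def advogato_accept(certs, root, capacities):
        # certs: dict identity -> identities it certifies at this trust level
        #        (or any higher level).
        # root:  the trust root (e.g. the local user's own identity).
        # capacities: function(distance_from_root) -> node capacity (an int).
        # Returns the set of accepted identities.

        # 1. Breadth-first distances from the root fix each node's capacity.
        dist = {root: 0}
        queue = deque([root])
        while queue:
            x = queue.popleft()
            for y in certs.get(x, ()):
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)

        # 2. Build the flow network.  Each identity n becomes an internal
        #    edge (n,'in') -> (n,'out') of capacity cap(n)-1, plus a unit
        #    edge (n,'in') -> SINK meaning "accept n"; certificates become
        #    effectively unlimited edges (n,'out') -> (m,'in').
        SINK = ('sink', None)
        INF = float('inf')
        cap = defaultdict(int)
        for n, d in dist.items():
            cap[((n, 'in'), (n, 'out'))] += max(capacities(d) - 1, 0)
            cap[((n, 'in'), SINK)] += 1
            for m in certs.get(n, ()):
                if m in dist:
                    cap[((n, 'out'), (m, 'in'))] = INF

        adj = defaultdict(set)
        for u, v in list(cap):
            adj[u].add(v)
            adj[v].add(u)      # residual (reverse) direction

        # 3. Edmonds-Karp: repeatedly push one unit of flow along the
        #    shortest augmenting path (every path ends in a unit edge).
        flow = defaultdict(int)
        source = (root, 'in')
        while True:
            parent = {source: None}
            queue = deque([source])
            while queue and SINK not in parent:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in parent and cap[(u, v)] - flow[(u, v)] > 0:
                        parent[v] = u
                        queue.append(v)
            if SINK not in parent:
                break
            v = SINK
            while parent[v] is not None:
                u = parent[v]
                flow[(u, v)] += 1
                flow[(v, u)] -= 1
                v = u

        # 4. An identity is accepted iff its sink edge is saturated.
        return {n for n in dist if flow[((n, 'in'), SINK)] >= 1}

Running it with, say, capacities = lambda d: [100, 40, 16, 6, 2, 1][min(d, 5)]
(made-up numbers) gives the set of identities accepted at one trust level;
the multi-level behaviour described above corresponds to running it again on
the graph restricted to the higher level's certificates.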


  - Making sure that my local ratings always override those given by others, 
  so
  I can mark an identity as spam and never see it again. Dunno if this is
  currently implemented

Re: [freenet-dev] Why WoTs won't work....

2009-05-22 Thread Evan Daniel
). We can try to be polite about this using 
 ultimatums, since it's likely that they didn't deliberately choose to trust 
 the spam-parent knowing he is a spam-parent - but if they don't respond in 
 some period by removing him from their trust list, we will have to reduce our 
 trust in them. This will cause collateral damage and may be abused for 
 censorship which might be even more dangerous than the current problems on 
 FMS. However, if there is a LOT of spam, or if we want the network to be 
 fairly spam-free for newbies, the first two options are insufficient. :|

I'm not certain you're correct about this.  The first two methods are,
imho, sufficient to limit spam to levels that are annoying, but where
the network is still usable.  Even if they download a bunch of
messages, a new user only has to click the spam button once per
spamming identity, and those are limited in a well defined manner
(linear, with a modest coefficient, in the number of dummy identities
the spammer is willing to maintain).

My suspicion is that if all they can aspire to be is a nuisance, the
spammers won't be nearly as interested.  There is much more appeal to
being able to DoS a board or the whole network than being able to
mildly annoy the users.  So if we limit the amount of damage they can
do to a sane level, the actual amount of damage done will be
noticeably less than that limit.

There is another possible optimization we could do (I've just thought
of it, and I'm not entirely certain that it works or that I like it).
Suppose that Alice trusts Bob trusts Carol (legitimate but confused)
trusts Sam (a spammer), and Alice is busy computing her trust list.
Bob has (correctly) marked Sam as a spammer.  In the basic
implementation, Alice will accept Sam.  Bob may think that Carol is
normally correct (and not malicious), and be unwilling to zero out his
trust list trust for her.  However, since this is a flow computation,
we can place an added restriction: when Alice calculates trust, flow
passing through Bob may not arrive at Sam even if there are
intermediate nodes.  If Alice can find an alternate route for flow to
go from Alice to Carol or Sam, she will accept Sam.

This modification is in some ways a negative trust feature, since
Bob's marking of Sam as a spammer is different from silence.  However,
it doesn't let Bob censor anyone he couldn't censor by removing Carol
from his trust list.  Under no circumstances will Alice using Bob's
trust list result in fewer people being accepted than not using Bob's
trust list.  It does mean that Bob, as a member of the evil cabal of
default trust list members for newbies, can (with the unanimous help
of the cabal) censor identities in a more subtle fashion than simply
not trusting anyone.

The caveats: this is a big enough change that it needs a close
re-examination of the security proof (I'm pretty sure it's still
valid, but I'm not certain).  If it sounds like an interesting idea, I
can do that.  Also, I don't think it's compatible with Ford-Fulkerson
or the other simple flow capacity algorithms.  The changes required
might be non-trivial, possibly to the point of changing the running
time.  Again, I could look at this in detail if it's interesting
enough to warrant it.
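For concreteness, one naive way to realise those semantics (illustrative
only, reusing the advogato_accept() sketch posted earlier in these threads):
re-run the flow computation once per contested identity on a graph where
everyone who marked that identity as a spammer passes nothing on.  This is
slightly stricter than the description above, and it is certainly not
efficient, which matches the caveat about running time:

    def accept_with_exclusions(certs, root, capacities, spam_marks):
        # spam_marks: dict identity -> identities it has marked "spammer".
        accepted = advogato_accept(certs, root, capacities)
        # Only marks published by identities we already accept matter.
        contested = {s for a in accepted
                       for s in spam_marks.get(a, ()) if s in accepted}
        for sam in contested:
            blockers = {b for b, marked in spam_marks.items() if sam in marked}
            # Drop the blockers' outgoing certificates: during this check,
            # no trust may pass through anyone who called Sam a spammer.
            pruned = {x: targets for x, targets in certs.items()
                      if x not in blockers}
            if sam not in advogato_accept(pruned, root, capacities):
                accepted.discard(sam)
        return accepted

Note that if the root itself marked Sam, the pruned graph keeps nothing of
the root's certificates for that check, so Sam is always rejected locally.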

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Why WoTs won't work....

2009-05-23 Thread Evan Daniel
 result in fewer people being accepted than not using Bob's
 trust list.  It does mean that Bob, as a member of the evil cabal of
 default trust list members for newbies, can (with the unanimous help
 of the cabal) censor identities in a more subtle fashion than simply
 not trusting anyone.

 The caveats: this is a big enough change that it needs a close
 re-examination of the security proof (I'm pretty sure it's still
 valid, but I'm not certain).  If it sounds like an interesting idea, I
 can do that.  Also, I don't think it's compatible with Ford-Fulkerson
 or the other simple flow capacity algorithms.  The changes required
 might be non-trivial, possibly to the point of changing the running
 time.  Again, I could look at this in detail if it's interesting
 enough to warrant it.

 Worth investigating IMHO.

OK, I'll examine it further.

Evan Daniel

___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Why WoTs won't work....

2009-05-23 Thread Evan Daniel
On Sat, May 23, 2009 at 10:06 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Saturday 23 May 2009 10:43:09 Arne Babenhauserheide wrote:
 On Friday, 22. May 2009 23:10:42 Mike Bush wrote:
  I have been watching this debate and I was wondering whether it could
  help to have 2 sets of trust values for each identity in a trust list;
  this could mean you could mark an identity as spamming, or that I don't
  want to see these posts again as I find them objectionable.

 This is what Credence did in the end for spam detection on Gnutella, so it
 might fit the human psyche :)

 People got the option to say "that's bad quality or misleading, I don't 
 like it" or "that's spam".

 For messages that could be

 * that ID posts spam
 * that ID posts crap

 The first can easily be reviewed, the second is subjective. That would give a
 soft group censorship option, but give the useful spam detection to everyone.

 Best wishes,
 Arne

 PS: Yes, I mostly just tried to clarify Mike's post for me. I hope the mail's
 useful to you nonetheless.

 People will game the system, no? If they think paedophiles are scum who 
 should not be allowed to speak, and they realise that clicking "This is spam" 
 is more effective than "This is crap", they will click the former, no?

I would assume that's the normal case.  OTOH, there isn't much harm in
implementing it, and if some people use it, that would help
somewhat...  Perhaps implement, but not required for initial release?

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Policy on removing people from mailing list archives?

2009-05-25 Thread Evan Daniel
On Mon, May 25, 2009 at 8:01 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Tuesday 26 May 2009 00:56:22 Matthew Toseland wrote:
 On one prior occasion (this year), we have authorised a mailing list archive 
 site to remove messages posted by somebody. I have now had another mail 
 asking for us to remove somebody's name from two archives which we don't run 
 - which generally requires him asking them and getting authorisation from us 
 - and from our own archives.

 If this is to be a regular occurrence, we need to formulate some policy, and 
 IMHO the best way to do this is to discuss it here. Does anyone have an 
 opinion on this? I doubt very much that we have any legal obligation to 
 remove somebody's posts, especially as at least one of the other archive 
 sites will only remove messages with our say so, but I guess we could get 
 legal advice on it... Any opinions on the principle? IMHO rewriting history 
 to make yourself look good to employers is dubious, but at the same time we 
 clearly don't want to pick fights and unnecessarily annoy people.


 Suggested solution: Authorise removal from the external sites, and obscure 
 the name on our archives.

 ___
 Devl mailing list
 Devl@freenetproject.org
 http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


I concur.

IMHO other sites should operate as they choose...  if they're willing
to remove people, then I think we should authorize it.  I think it is
important to retain all messages, but for archives the name is less
important than the content.  I would recommend obscuring it as
[removed name #n] or similar, so that it's obvious whether it's the
same removed name as some other message.

Given Freenet's pro-anonymity stance, I think if someone has a desire
to be made more anonymous, especially as regards potentially illegal
software usage, that we should support them.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-26 Thread Evan Daniel
On Tue, May 26, 2009 at 4:02 PM, xor x...@gmx.li wrote:
 On Thursday 07 May 2009 11:23:51 Evan Daniel wrote:

 
  Why exactly? Your post is nice but I do not see how it answers my
  question. The general problem my post is about: New identities are
  obtained by taking them from trust lists of known identities. An attacker
  therefore could put 100 identities in his trust list to fill up your
  database and slow down WoT. Therefore, an decision has to be made when to
  NOT import new identities from someone's trust list. In the current
  implementation, it is when he has a negative score.
 
 [...]

 I have not examined the WoT code.  However, the Advogato metric has
 two attributes that I don't think the current WoT method has: no
 negative trust behavior (if there is a trust rating Bob can assign to
 Carol such that Alice will trust Carol less than if Bob had not
 assigned a rating, that's a negative trust behavior), and a
 mathematical proof as to the upper limit on the quantity of spammer
 nodes that get trusted.

 The Advogato metric is *specifically* designed to handle the case of
 the attacker creating millions of accounts.  In that case, his success
 is bounded (linear with modest constant) by the number of confused
 nodes -- that is, legitimate nodes that have (incorrectly) marked his
 accounts as legitimate.  If you look at the flow computation, it
 follows that for nodes for which the computed trust value is zero, you
 don't have to bother downloading their trust lists, so the number of
 such lists you download is similarly well controlled.

 I have read your messages again and all your new messages and you are so
 convinced about advogato that I'd like to ask you more questions about how it
 would work, I don't want you to feel like everyone is ignoring you :)
 (- I am more of a programmer right now than a designer of algorithms, I
 concentrate on spending most available time on *implementing* WoT/FT because
 nobody else is doing it and it needs to get done... so I have not talked much
 in this discussion)

Well...  to be fair, I'm not actually completely certain it will work.
 I do, however, think that it has a lot of potential.  I don't know
any way to get the answer short of running the experiment, and I'm
very optimistic about the results.  I firmly expect them to be good,
but not perfect.

Your questions are certainly welcome :)


 Consider the following case, using advogato and not the current FMS/WoT
 alchemy:

 1. Identity X is an occasional and trustworthy poster. X has received many
 positive trust values from hundreds of identities because it has posted
 hundreds of messages over the months, so it has a high score and capacity to
 give trust values, and all newbies will know about the identity and it's high
 score because it is well-integrated into the trust graph.

Careful: Advogato doesn't assign trust scores in the same sense that
FMS and WoT do.

Because X is trusted by many identities, many identities can reach it,
and therefore accept it.  That is a purely binary consideration -- it
does not matter directly that it is reachable by many paths.  Because
many identities link to X, X is only a short distance away from many
identities.  When A calculates his trust graph, X is likely to be
nearby.  However, even if X is poorly connected, this will be true for
some identities; the connectivity changes how likely it is.  Capacity
of a node is determined (in the base algorithm; there are tweaks worth
considering) only by distance, nothing else.  Whether that capacity
actually limits anything or not depends on a variety of factors.  If
there aren't enough downstream nodes, then it isn't needed.  If the
upstream nodes spend their capacity elsewhere, there might not be
enough available to fill it -- here is the other place that X being
well connected matters.


 2. Now a spammer gets a single identity Y onto the trust list of X by solving
 a captcha, his score is very low because he has only solved a captcha but the
 score is there. Therefore, any newbie will see Y because X is well-integrated
 into the WoT

Correct.


 3. X is gone for quite some time due to inactivity, during that time Y creates
 500 spam identities on his trust list and starts to spam all boards. X will
 not remove Y from his trust list because he is *away* for weeks.

Several points.  First, one of the optimizations worth considering is
tightly limiting the capacity of any identity that only has captcha
level trust.  This means that newbies have to solve captchas from
identities that have received manual trust, which is easy enough to
determine.  It also means that though our spammer lists 500 fake ids,
other people will only accept a very small number of them -- possibly
as low as zero, if the captcha trust only nodes are limited to
capacity 1.  So most of those ids are worthless, and spam is
contained.
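As a concrete illustration of the kind of capacity function this implies
(the table and the clamp value below are invented, not Advogato's or WoT's
actual numbers): base capacity depends only on distance from the trust root,
and an identity that has nothing but captcha-level trust is clamped to
capacity 1, so it can be accepted itself but cannot vouch for any further
identities:

    CAPACITY_BY_DISTANCE = [100, 40, 16, 6, 2, 1]   # made-up values

    def capacity(identity, distance, has_manual_trust):
        # Captcha-only identities keep just enough capacity to be accepted
        # themselves; they pass nothing on to child identities.
        if not has_manual_trust(identity):
            return 1
        return CAPACITY_BY_DISTANCE[min(distance, len(CAPACITY_BY_DISTANCE) - 1)]

A flow sketch like the one earlier in these threads would then need to pass
the identity as well as its distance to the capacity function.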

This is one of the weaknesses of the simplest implementation (no
limits on captcha-only ids, that is use

Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-26 Thread Evan Daniel
2009/5/26 xor x...@gmx.li:
 On Tuesday 26 May 2009 22:02:37 xor wrote:
 On Thursday 07 May 2009 11:23:51 Evan Daniel wrote:
   Why exactly? Your post is nice but I do not see how it answers my
   question. The general problem my post is about: New identities are
   obtained by taking them from trust lists of known identities. An
   attacker therefore could put 100 identities in his trust list to
   fill up your database and slow down WoT. Therefore, a decision has to
   be made when to NOT import new identities from someone's trust list. In
   the current implementation, it is when he has a negative score.

 [...]

  I have not examined the WoT code.  However, the Advogato metric has
  two attributes that I don't think the current WoT method has: no
  negative trust behavior (if there is a trust rating Bob can assign to
  Carol such that Alice will trust Carol less than if Bob had not
  assigned a rating, that's a negative trust behavior), and a
  mathematical proof as to the upper limit on the quantity of spammer
  nodes that get trusted.
 
  The Advogato metric is *specifically* designed to handle the case of
  the attacker creating millions of accounts.  In that case, his success
  is bounded (linear with modest constant) by the number of confused
  nodes -- that is, legitimate nodes that have (incorrectly) marked his
  accounts as legitimate.  If you look at the flow computation, it
  follows that for nodes for which the computed trust value is zero, you
  don't have to bother downloading their trust lists, so the number of
  such lists you download is similarly well controlled.

 I have read your messages again and all your new messages and you are so
 convinced about advogato that I'd like to ask you more questions about how
 it would work, I don't want you to feel like everyone is ignoring you :) (-
 I am more of a programmer right now than a designer of algorithms, I
 concentrate on spending most available time on *implementing* WoT/FT
 because nobody else is doing it and it needs to get done... so I have not
 talked much in this discussion)

 Consider the following case, using advogato and not the current FMS/WoT
 alchemy:

 1. Identity X is an occasional and trustworthy poster. X has received many
 positive trust values from hundreds of identities because it has posted
 hundreds of messages over the months, so it has a high score and capacity
  to give trust values, and all newbies will know about the identity and its
 high score because it is well-integrated into the trust graph.

 2. Now a spammer gets a single identity Y onto the trust list of X by
 solving a captcha, his score is very low because he has only solved a
 captcha but the score is there. Therefore, any newbie will see Y because X
 is well-integrated into the WoT

 3. X is gone for quite some time due to inactivity, during that time Y
 creates 500 spam identities on his trust list and starts to spam all
 boards. X will not remove Y from his trust list because he is *away* for
 weeks.

 Also consider the case that instead of 500 new identities he just posts
 500 messages with his single identity Y. How do we get rid of Y?

First, you rate limit messages.  I'm having trouble coming up with a
case where I ever want my node downloading that many messages from one
identity.
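A minimal sketch of what such a per-identity limit might look like (the
constant and the names are invented; this is not what Freetalk or WoT
actually does):

    MAX_MESSAGES_PER_IDENTITY_PER_DAY = 50   # arbitrary example value

    def messages_to_fetch(announced, fetched_today):
        # announced: message references published by one identity, oldest
        # first; fetched_today: how many were already fetched today.
        budget = max(MAX_MESSAGES_PER_IDENTITY_PER_DAY - fetched_today, 0)
        return announced[:budget]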

Second, after I read a few, I'll mark some as spam and the rest will
go away.  From a practical standpoint, I don't really care about the
difference between 5 messages, 500, or 50 -- I'll read one, or a
few, and then mark Y as a spammer.  I'll never see the rest.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Why WoTs won't work....

2009-05-26 Thread Evan Daniel
On Tue, May 26, 2009 at 4:45 PM, xor x...@gmx.li wrote:
 On Friday 22 May 2009 16:39:06 Evan Daniel wrote:
 On Fri, May 22, 2009 at 8:17 AM, Matthew Toseland

 t...@amphibian.dyndns.org wrote:
  On Friday 22 May 2009 08:17:55 bbac...@googlemail.com wrote:
  Isn't his point that the users just won't maintain the trust lists?
  I thought that is the problem he meant; how can Advogato help us
 
  here?
 
  Advogato with only positive trust introduces a different tradeoff, which
  is still a major PITA to maintain, but maybe less of one:
  - Spammers only disappear when YOU mark them as spammers, or ALL the
  people you trust do. Right now they disappear when the majority, from the
  point of view of your position on the WoT, mark them as spammers (more or
  less).

 When they *fail to mark them as trusted*.  It's an important
 distinction, as it means that in order for the spammer to do anything
 they first have to *manually* build trust.  If an identity suddenly
 starts spamming, only people that originally marked it as trusted have
 to change their trust lists in order to stop them.

  - If you mark a spammer as positive because he posts useful content on
  one board, and you don't read the boards he spams you are likely to get
  marked as a spammer yourself.

 Depends how militant people are.  I suspect in practice people won't
 do this unless you trust a lot of spammers... in which case they have
 a point.  (This is also a case for distinguishing message trust from
 trust list trust; while Advogato doesn't do this, the security proof
 extends to cover it without trouble.)  You can take an in-between
 step: if Alice marks both Bob and Carol as trusted, and Bob marks
 Carol a spammer, Alice's software notices and alerts Alice, and offers
 to show Alice recent messages from Carol from other boards.
 (Algorithmically, publishing Sam is a spammer is no different from
 not publishing anything about Sam, but it makes some nice things
 possible from a UI standpoint.)  This may well get most of the benefit
 of ultimatums with lower complexity.

  - If a spammer doesn't spam himself, but gains trust through posting
  useful content on various boards and then spends this trust by trusting
  spam identities, it will be necessary to give him zero message list
  trust. Again this has serious issues with collateral damage, depending on
  how trigger-happy people are and how much of a problem it is for newbies
  to see spam.
 
  Technologically, this requires:
  - Changing WoT to only support positive trust. This is more or less a one
  line change.

 If all you want is positive trust only, yes.  If you want the security
 proof, it requires using the network flow algorithm as specified in
 the paper, which is a bit more complex.  IMHO, fussing with the
 algorithm in ways that don't let you apply the security proof is just
 trading one set of alchemy for another -- it might help, but I don't
 think it would be wise.

  - Making sure that my local ratings always override those given by
  others, so I can mark an identity as spam and never see it again. Dunno
  if this is currently implemented.
  - Making CAPTCHA announcement provide some form of short-lived trust, so
  if the newly introduced identity doesn't get some trust it goes away.
  This may also be implemented.

 My proposal: there are two levels of trust (implementation starts
 exactly as per Advogato levels).  The lower level is CAPTCHA trust;
 the higher is manually set only.  (This extends to multiple manual
 levels without loss of generality.)  First, the algorithm is run
 normally on the manual trust level.  Then, the algorithm is re-run on
 the CAPTCHA trust level, with modification: identities that received
 no manual trust have severely limited capacity (perhaps as low as 1),
 and the general set of capacity vs distance from root is changed to
 not go as deep.

 The first part means that the spammer can't chain identities *at all*
 before getting the top one manually trusted.  The second means that
 identities that only solved a CAPTCHA will only be seen by a small
 number of people -- ie they can't spam everyone.  The exact numbers
 for flow vs depth would need some tuning for both trust levels,
 obviously.  You want enough people to see new identities that they
 will receive manual trust.

 It is absolutely UNACCEPTABLE for a discussion system to only display messages
 of newbies to some people, due to the nature of discussion:
 - The *value* of a single post from a new identity which has posted a single
 message can be ANYTHING... it can be absolute crap... but it can also be a
 highly valuable secret document which reveals stuff that is interesting for
 millions of people. In other words: the fact that someone is a newbie does not
 say ANYTHING about the worth of his posts. Put another way: NO individual
 has the right to increase the worth of his posts - as in the number of
 people reading them - by speaking very much on Freetalk

Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-26 Thread Evan Daniel
On Tue, May 26, 2009 at 5:38 PM, xor x...@gmx.li wrote:
 On Tuesday 26 May 2009 23:19:53 Evan Daniel wrote:
 2009/5/26 xor x...@gmx.li:
  On Tuesday 26 May 2009 22:02:37 xor wrote:
  On Thursday 07 May 2009 11:23:51 Evan Daniel wrote:
Why exactly? Your post is nice but I do not see how it answers my
question. The general problem my post is about: New identities are
obtained by taking them from trust lists of known identities. An
attacker therefore could put 100 identities in his trust list to
fill up your database and slow down WoT. Therefore, a decision has
to be made when to NOT import new identities from someone's trust
list. In the current implementation, it is when he has a negative
score.
 
  [...]
 
   I have not examined the WoT code.  However, the Advogato metric has
   two attributes that I don't think the current WoT method has: no
   negative trust behavior (if there is a trust rating Bob can assign to
   Carol such that Alice will trust Carol less than if Bob had not
   assigned a rating, that's a negative trust behavior), and a
   mathematical proof as to the upper limit on the quantity of spammer
   nodes that get trusted.
  
   The Advogato metric is *specifically* designed to handle the case of
   the attacker creating millions of accounts.  In that case, his success
   is bounded (linear with modest constant) by the number of confused
   nodes -- that is, legitimate nodes that have (incorrectly) marked his
   accounts as legitimate.  If you look at the flow computation, it
   follows that for nodes for which the computed trust value is zero, you
   don't have to bother downloading their trust lists, so the number of
   such lists you download is similarly well controlled.
 
  I have read your messages again and all your new messages and you are so
  convinced about advogato that I'd like to ask you more questions about
  how it would work, I don't want you to feel like everyone is ignoring
  you :) (- I am more of a programmer right now than a designer of
  algorithms, I concentrate on spending most available time on
  *implementing* WoT/FT because nobody else is doing it and it needs to
  get done... so I have not talked much in this discussion)
 
  Consider the following case, using advogato and not the current FMS/WoT
  alchemy:
 
  1. Identity X is an occasional and trustworthy poster. X has received
  many positive trust values from hundreds of identities because it has
  posted hundreds of messages over the months, so it has a high score and
  capacity to give trust values, and all newbies will know about the
  identity and its high score because it is well-integrated into the
  trust graph.
 
  2. Now a spammer gets a single identity Y onto the trust list of X by
  solving a captcha, his score is very low because he has only solved a
  captcha but the score is there. Therefore, any newbie will see Y because
  X is well-integrated into the WoT
 
  3. X is gone for quite some time due to inactivity, during that time Y
  creates 500 spam identities on his trust list and starts to spam all
  boards. X will not remove Y from his trust list because he is *away* for
  weeks.
 
  Also consider the case that instead of 500 new identities he just posts
  500 messages with his single identity Y. How do we get rid of Y?

 First, you rate limit messages.  I'm having trouble coming up with a
 case where I ever want my node downloading that many messages from one
 identity.

 And how to find a practical rate limit?
 Consider SVN/GIT/etc. log-bots: They post a single message for each commit to
 the repository.

FMS, WoT, and Advogato all mark identities, not messages.  Why is this
scenario relevant to the question at hand -- that is, which algorithm
to run on the trust graph?  If you want to discuss intelligent rate
limiting, and how to make that usable and useful to users, that is
basically a UI problem.  I have ideas and suggestions, and would be
happy to discuss them.  However, that would be completely unrelated to
the current subject, so I suggest starting a new thread.



 Second, after I read a few, I'll mark some as spam and the rest will
 go away.  From a practical standpoint, I don't really care about the
 difference between 5 messages, 500, or 50 -- I'll read one, or a
 few, and then mark Y as a spammer.  I'll never see the rest.

 Can a messaging system survive which will appear as full of spam to every
 newbie?

 Isn't it the core goal of the WoT to prevent *newbies* from seeing spam, to
 let the community design a set of ratings which prevents EVERYONE from having
 to manually mark spam/non-spam, letting only a subset of the community do
 the work so that others can benefit from it?

 I think that's what any algorithm needs to be able to do: provide a nice first
 usage experience.

 First usage = empty trust list. So this also applies to people who are too lazy
 to mark everything as spam which is spam. Which probably applies to  50

Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-27 Thread Evan Daniel
On Wed, May 27, 2009 at 1:18 PM, Thomas Sachau m...@tommyserver.de wrote:
 Evan Daniel schrieb:
 That is fundamentally a hard problem.
 - Advogato is not perfect.  I am certain there will be some amount of
 spam getting through; hopefully it will be a small amount.
 - With Advogato, the amount of spam possible is well defined.  With
 FMS and WoT it is not.  Neither of them have an upper bound on the
 amount of spam.

 How do you define spam?

Please clarify the question.  Do you mean me, personally?  The Freenet
community as a whole?  Or in the context of the proof?

 One limit per identity is the number of messages which are accepted per day. 
 And if you trust some active identities which think the same as you, you 
 will get nearly no spam at all because they already marked it as spam. 
 FMS/WoT depends on the trust relationship between people and on them telling 
 each other about third parties.

 - Being too good at solving the spam problem means we are too good at
 mob censorship.  Both are problems.  In practice, the goal should be
 to strike an appropriate balance between the two, not simply to
 eliminate spam.

 Since you cannot say what is spam and what not, this is relative. In FMS, you 
 can choose to trust
 those that think the same as you and you will get their spam markings. Can 
 you get the same with
 Advogato?

I have only *very* rarely had any difficulty determining whether a
message was spam or not.  Why would this be any different?

Of course Advogato gives you the same ability, that is the entire
point.  The precise algorithm is different, but the problem it tries
to solve is the same.  The one difference is that Advogato is not
about determining that person X is a spammer, it's about determining
that person X *isn't* a spammer.  From a user's standpoint, the two
questions are precisely identical, but at an algorithm level they're
not.


 - I believe that Advogato is capable of limiting spam to levels where
 the system is usable, even in the case of reasonably determined
 spammers.  If the most they can aspire to is being a nuisance, I don't
 think the spammers will be as interested.  If spamming takes work and
 doesn't do all that much, they'll give up.  The actual amount of spam
 seen in practice should be well below the worst possible case -- if
 and only if the worst case isn't catastrophic.

 How much noise will it allow? The Alice bot spam in Frost was also just 
 annoying, but I do think that many new users were annoyed and left Frost 
 and Freenet. So a default system should not only make it usable, but also 
 relatively spam-free from the view of the majority.

It will accept a number of spam identities at most equal to the sum of
the excess capacity of the set of confused identities.
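(A worked example with invented figures: if three confused identities have
capacities 6, 2, and 2, and spend 3, 1, and 1 of those on legitimate
certificates, the excess is (6-3) + (2-1) + (2-1) = 5, so at most 5 spam
identities are accepted no matter how many the attacker creates.)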

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Why WoTs won't work....

2009-05-27 Thread Evan Daniel
-based trust has tighter limitations than
manual trust, he has to solve a captcha for each fake identity.  This
proposal is not a required part of Advogato, it is my own suggestion.
It could be applied to WoT as well, I believe.

If you assume that people will not maintain trust lists, then it
doesn't matter what algorithm you run on the trust graph.  There won't
be one.  FMS, WoT, and Advogato all fail completely under that
assumption.


 Fundamentally, it's a question of whether you believe CAPTCHAs work.
 I don't.  If you start with an assumption that CAPTCHAs are a minor
 hindrance at most, then if you require that everyone sees messages
 sent by identities that have only solved CAPTCHAs and not gained
 manual trust, then you've made it a design criteria to permit
 unlimited amounts of spam.  (That's bad.)  If you believe CAPTCHAs
 work, then things are a bit easier...  but I think the balance of the
 evidence is against that belief.

 Captchas may not be the ultimate solution. But they are one way to let 
 people in while proving to be humans. And you will need this limit (proof 
 of being human), so you will always need some sort of captcha or a 
 real-friends trust network.

Captchas do not prove someone is human.  They prove that someone
solved a problem.  If your captchas are good, that means they are more
likely to be human.  I work from an assumption that captchas are
marginally effective at best.  If you think I am mistaken in that,
please explain why.  From that assumption, I conclude that we need a
system that is reasonably effective against a spammer who can solve
significant numbers of captchas, but still is capable of making use of
the information that solving a captcha does provide.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Question about an important design decision of the WoT plugin

2009-05-27 Thread Evan Daniel
On Wed, May 27, 2009 at 2:44 PM, Thomas Sachau m...@tommyserver.de wrote:
 Evan Daniel schrieb:
 On Wed, May 27, 2009 at 1:18 PM, Thomas Sachau m...@tommyserver.de wrote:
 Evan Daniel schrieb:
 That is fundamentally a hard problem.
 - Advogato is not perfect.  I am certain there will be some amount of
 spam getting through; hopefully it will be a small amount.
 - With Advogato, the amount of spam possible is well defined.  With
 FMS and WoT it is not.  Neither of them have an upper bound on the
 amount of spam.
 How do you define spam?

 Please clarify the question.  Do you mean me, personally?  The Freenet
 community as a whole?  Or in the context of the proof?

 The question should point out the problem with spam. One may say that only 
 messages with random letters are spam. Others may add many messages which 
 are all the same or similar. Others may add messages in languages different 
 from their own. Others may add logbots. Another one may want to add everyone 
 who argues for Advogato or FMS. Since there is no objective spam definition, 
 you can neither say that the amount of spam is well defined nor that there 
 is no upper bound on the amount of spam.

Have you read the proof?


 - Being too good at solving the spam problem means we are too good at
 mob censorship.  Both are problems.  In practice, the goal should be
 to strike an appropriate balance between the two, not simply to
 eliminate spam.
 Since you cannot say what is spam and what not, this is relative. In FMS, 
 you can choose to trust
 those that think the same as you and you will get their spam markings. Can 
 you get the same with
 Advogato?

 I have only *very* rarely had any difficulty determining whether a
 message was spam or not.  Why would this be any different?

 You yourself had no problems. But are you sure others share your view on it?

How is this remotely relevant to choice of algorithm?


 - I believe that Advogato is capable of limiting spam to levels where
 the system is usable, even in the case of reasonably determined
 spammers.  If the most they can aspire to is being a nuisance, I don't
 think the spammers will be as interested.  If spamming takes work and
 doesn't do all that much, they'll give up.  The actual amount of spam
 seen in practice should be well below the worst possible case -- if
 and only if the worst case isn't catastrophic.
 How much noise will it allow? The Alice bot spam in Frost was also just 
 annoying, but I do think that many new users were annoyed and left Frost 
 and Freenet. So a default system should not only make it usable, but also 
 relatively spam-free from the view of the majority.

 It will accept a number of spam identities at most equal to the sum of
 the excess capacity of the set of confused identities.

 The question is this: will it prevent enough, i.e. almost all spam, or will 
 the amount of spam force new (and old) users to leave, like it happened and 
 happens with Frost and the Alice bot?

That is one question, but not the only one.  Another one is, is the
provable upper bound better or worse than the provable upper bound for
a specific alternative proposal?  As to the former, I cannot say with
certainty until we try it out, and even then we only have indications,
not proof.  As to the latter, I have yet to hear anyone propose an
alternative to Advogato that has an upper bound, let alone one that's
better.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Why WoTs won't work....

2009-05-27 Thread Evan Daniel
On Wed, May 27, 2009 at 3:11 PM, Thomas Sachau m...@tommyserver.de wrote:
 Evan Daniel schrieb:
 On Wed, May 27, 2009 at 1:29 PM, Thomas Sachau m...@tommyserver.de wrote:
 A small number could still be rather large.  Having thousands see it
 ought to suffice.  For the current network, I see no reason not to
 have the (default) limits such that basically everyone sees it.
  If your small number is that big, you should add that, because for me 
  small is not around thousands. Additionally, if you allow them to reach 
  thousands (will a Freenet-based message system ever reach more people?), 
  is there any value in restricting this anyway?

 Currently, the total number of people using Freenet is small.
 Hopefully that will not always be the case.  Designing a new system
 that assumes it will always be the case seems like a rather bad idea
 to me.

 In this context, I would say small means sublinear growth with the
 size of the entire network.  Having the new-identity spam reach
 thousands of recipients is far better than having it reach tens of
 thousands or millions.

 Why not let the WoT solve the problem? In practice, not all of those will
 pull the spam at the same time. So some will get it first, see it is spam and
 mark it as such. Later ones will then see the spammer mark and not even fetch
 the message. On the other hand, if it is not spam, it will get fetched.

If WoT can solve it, fine.  If it can't, that's fine too.  Neither
case has any bearing on Advogato's abilities, merely the standard of
comparison.



 If the post is really that valuable, some people will mark the poster
 as trusted.  Then everyone will see it.
 Why should they? People are lazy, so most, if not all will just read it, 
 maybe answer it, but who
 thinks about rating someone because of a single post? People are and will 
 always be lazy.

 If the post is only somewhat valuable, it might take a few posts.  If
 it's a provocative photo that escaped from an oppressive regime, I
 suspect it wouldn't.

 A few? I do sometimes check some FMS trustlists, and those I did check did
 not set a trust value for many people. Additionally, remember that FMS is
 used by people who are willing to do something, so I would expect much less
 from the default WoT inside Freenet.
 With your suggestion, someone will have to wait until someone uncensors him.
 IMHO, no one should be censored by default, so it should be exactly the other
 way round.

See below on captchas.



 Granting trust automatically on replies is an idea that has been
 discussed before.  It has a great deal of merit.  I'm in favor of it.
 I just don't think that should be the highest level of trust.

 It may be an additional option, but this would only make those who write
 many posts well-trusted, while others with fewer posts get less trust. It
 would be another place where a spammer could do something to make his attacks
 more powerful.

It is my firm belief that if the system makes the spammer perform
manual work per identity they wish to spam with, the problem is
solved.  Do you have evidence or sound reasoning to the contrary?  All
systems I know of -- such as email and Frost -- have spam problems
because the spammer can automate all the steps.



 You may think that everyone should be equal; I don't.  If newbies are
 posting stuff that isn't spam (be it one message or many), I'm willing
 to believe someone my web can reach will mark them trusted.  You
 obviously aren't; that's fine too.  Fortunately, there is no
 requirement we use the same capacity limiting functions -- that should
 be configurable for each user.  If you want to make the default
 function fairly permissive, that's fine.  I think you'd be making the
 wrong choice, but personally I wouldn't care that much because I'd
 just change it away from the default if new-identity spam was a
 problem.
 So you want the default to be more censorious. And you trust people not to
 be lazy. I oppose both. First, if you really want to implement such
 censorship, make the default open: with thousands of trusted users it won't
 make a difference anyway. Second, why should people mark new identities as
 trusted? I use FMS and I don't change the trust of every identity I see
 there, and I do somehow manage a trustlist there. If someone is lazy (and the
 majority is), they will do nothing.

 If one of your design requirements is that new identities can post and
 be seen by everyone, you have made the spam problem unsolvable BY
 DEFINITION.  That is bad.

 Wrong. The initial barrier is the proof of solving a problem, which should
 be a problem hard for computers and easy for humans. But this just prevents
 automated, computer-based identity creation.

Please cite evidence that such a problem exists in a form that is
user-friendly enough we can use it.  Unless I am greatly mistaken,
Freenet's goal as a project is not to solve the captcha problem when
no one else has.  Taking on oppressive governments

Re: [freenet-dev] Freenet doesn't work with java 1.6.0.14??? was Fwd: Re: [freenet-support] freenet

2009-06-04 Thread Evan Daniel
On Thu, Jun 4, 2009 at 3:35 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:



 -- Forwarded message --
 From: Philip Bych pot...@googlemail.com
 To: Matthew Toseland t...@amphibian.dyndns.org
 Date: Thu, 4 Jun 2009 19:46:46 +0100
 Subject: Re: [freenet-support] freenet
 Thanks, but I have sorted the problem: it was the Java update.
 I updated the Java runtime to the latest 1.6.0.14 and it seems that Freenet
 does not work with this, so I have reinstalled Java 1.6.0.13 and uninstalled
 then reinstalled Freenet.
 All works fine as long as I do not update the Java runtime.
 cheers

 2009/6/4 Matthew Toseland t...@amphibian.dyndns.org

 On Tuesday 02 June 2009 20:54:09 goat wrote:
  Updated Java to 1.6.0.14 and updated Freenet; now nothing works. I keep
  getting no start-up script. I have disabled Norton and use a different
  browser, and all other security; still no joy.
  What other options are there?

 Hi. To help us solve this problem, please:
 - Find the directory Freenet is installed in, find a file called 
 wrapper.log, and send me it.
 - Open a terminal (run cmd.exe), cd to where Freenet is installed, type 
 start.exe (or start.cmd if you have an old installation). What happens? Send 
 any output.


 ___
 Devl mailing list
 Devl@freenetproject.org
 http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Works for me (TM).

On Debian, using Sun Java (package sun-java6-jdk, etc).  I just
installed the version out of unstable.
$ java -version
java version 1.6.0_14
Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
Java HotSpot(TM) Server VM (build 14.0-b16, mixed mode)

Freenet seems to be functioning normally.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


[freenet-dev] Should the spider ignore common words?

2009-06-09 Thread Evan Daniel
On my (incomplete) spider index, the index file for the word "the" (it
indexes no other words) is 17MB.  This seems rather large.  It might
make sense to have the spider not even bother creating an index on a
handful of very common words (the, be, to, of, and, a, in, I, etc).
Of course, this presents the occasional difficulty:
http://bash.org/?514353  I think I'm in favor of not indexing common
words even so.
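
For illustration, a minimal sketch (hypothetical names, not the actual Spider
plugin API; the word list is an assumption) of skipping such stopwords before
they reach the index:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class StopwordFilter {
    // A small assumed stopword list; a real one would be longer and
    // probably configurable.
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "the", "be", "to", "of", "and", "a", "in", "i", "it", "that"));

    static boolean shouldIndex(String word) {
        return !STOPWORDS.contains(word.toLowerCase());
    }
}

The spider would call shouldIndex() on each extracted word and simply not add
the word to its term-to-pages map when it returns false.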

Also, on a related note, the index splitting policy should be a bit
more sophisticated: in an attempt to fit within the max index size as
configured, it split all the way down to index_8fc42.xml.  As a
result, the file index_8fc4b.xml sits all by itself at 3KiB.  It
contains the two words "vergessene" and "txjmnsm".  I suspect it would
have reliability issues should anyone actually want to search either
of those.  It would make more sense to have all of index_8fc4 in one
file, since it would be only trivially larger.  (I have a patch that I
thought did that, but it has a bug; I'll test once my indexwriter is
finished writing, since I don't want to interrupt it by reloading the
plugin.)
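
A minimal sketch (hypothetical names and limits, not the real IndexWriter) of
the splitting policy argued for above: keep a prefix whole when it fits, and
also when a single dominant term means splitting would only scatter tiny
sibling files.

import java.util.Map;
import java.util.TreeMap;

class IndexSplitter {
    static final long MAX_INDEX_BYTES = 4L * 1024 * 1024; // assumed limit
    static final long SLACK_BYTES = 256L * 1024;          // assumed "trivially larger"

    // term hash (hex) -> estimated on-disk bytes for that term's page list
    static void writePrefix(String prefix, Map<String, Long> termBytesByHash) {
        long total = termBytesByHash.values().stream().mapToLong(Long::longValue).sum();
        long largest = termBytesByHash.values().stream().mapToLong(Long::longValue).max().orElse(0L);
        // Keep the prefix whole if it fits, or if one term so dominates it that
        // splitting cannot help much (the index_8fc4 case described above).
        if (total <= MAX_INDEX_BYTES || total - largest <= SLACK_BYTES) {
            writeIndexFile(prefix, termBytesByHash);
            return;
        }
        for (char c : "0123456789abcdef".toCharArray()) {
            Map<String, Long> subset = new TreeMap<>();
            for (Map.Entry<String, Long> e : termBytesByHash.entrySet())
                if (e.getKey().startsWith(prefix + c)) subset.put(e.getKey(), e.getValue());
            if (!subset.isEmpty()) writePrefix(prefix + c, subset);
        }
    }

    static void writeIndexFile(String prefix, Map<String, Long> terms) {
        // stand-in for the real XML writer
        System.out.println("index_" + prefix + ".xml: " + terms.size() + " terms");
    }
}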

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Should the spider ignore common words?

2009-06-10 Thread Evan Daniel
On Wed, Jun 10, 2009 at 1:54 AM, Daniel Chengj16sdiz+free...@gmail.com wrote:
 On Wed, Jun 10, 2009 at 12:02 PM, Evan Danieleva...@gmail.com wrote:
 On my (incomplete) spider index, the index file for the word the (it
 indexes no other words) is 17MB.  This seems rather large.  It might
 make sense to have the spider not even bother creating an index on a
 handful of very common words (the, be, to, of, and, a, in, I, etc).
 Of course, this presents the occasional difficulty:
 http://bash.org/?514353  I think I'm in favor of not indexing common
 words even so.

 Yes, it should ignore common words.
 This is called a "stopword" in search engine terminology.


 Also, on a related note, the index splitting policy should be a bit
 more sophisticated: in an attempt to fit within the max index size as
 configured, it split all the way down to index_8fc42.xml.  As a
 result, the file index_8fc4b.xml sits all by itself at 3KiB.  It
 contains the two words vergessene and txjmnsm.  I suspect it would
 have reliability issues should anyone actually want to search either
 of those.  It would make more sense to have all of index_8fc4 in one
 file, since it would be only trivially larger.  (I have a patch that I
 thought did that, but it has a bug; I'll test once my indexwriter is
 finished writing, since I don't want to interrupt it by reloading the
 plugin.)

 trivially larger ...
 ugh... how trivial is trivial?

 the xmllibrarian can handle index_8fc42.xml on its own but all other
 8fc4 in index_8fc4.xml.
 However, as I have stated on IRC, that makes index generation even slower.

8fc42 is 17382 KiB.  All other 8fc4 are 79 KiB combined.

Also, it would make index generation faster.  The spider first does
all the work of creating 8fc4, then discards it to recreate the
sub-indexes.  The vast majority of this work is in 8fc42, which gets
created twice.  Not splitting the index would nearly halve the time to
create the 8fc4 set of indexes.

Of course, a more efficient algorithm for creating the indexes in the
first place would both make it far faster and make the two approaches take
approximately the same time.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Should the spider ignore common words?

2009-06-10 Thread Evan Daniel
On Wed, Jun 10, 2009 at 2:56 AM, Daniel Chengj16sdiz+free...@gmail.com wrote:
 On Wed, Jun 10, 2009 at 2:06 PM, Evan Danieleva...@gmail.com wrote:
 On Wed, Jun 10, 2009 at 1:54 AM, Daniel Chengj16sdiz+free...@gmail.com 
 wrote:
 On Wed, Jun 10, 2009 at 12:02 PM, Evan Danieleva...@gmail.com wrote:
 On my (incomplete) spider index, the index file for the word the (it
 indexes no other words) is 17MB.  This seems rather large.  It might
 make sense to have the spider not even bother creating an index on a
 handful of very common words (the, be, to, of, and, a, in, I, etc).
 Of course, this presents the occasional difficulty:
 http://bash.org/?514353  I think I'm in favor of not indexing common
 words even so.

 Yes, it should ignore common words.
 This is called stopword in search engine termology.


 Also, on a related note, the index splitting policy should be a bit
 more sophisticated: in an attempt to fit within the max index size as
 configured, it split all the way down to index_8fc42.xml.  As a
 result, the file index_8fc4b.xml sits all by itself at 3KiB.  It
 contains the two words vergessene and txjmnsm.  I suspect it would
 have reliability issues should anyone actually want to search either
 of those.  It would make more sense to have all of index_8fc4 in one
 file, since it would be only trivially larger.  (I have a patch that I
 thought did that, but it has a bug; I'll test once my indexwriter is
 finished writing, since I don't want to interrupt it by reloading the
 plugin.)

 trivially larger ...
 ugh... how trivial is trivial?

 the xmllibrarian can handle  index_8fc42.xml on its own but all other
 8fc4 on  index_8fc4.xml.
 however, as i have stated in irc, that make index generation even slower.

 8fc42 is 17382 KiB.  All other 8fc4 are 79 KiB combined.

 Also, it would make index generation faster.  The spider first does
 all the work of creating 8fc4, then discards it to recreate the
 sub-indexes.  The vast majority of this work is in 8fc42, which gets
 created twice.  Not splitting the index would nearly halve the time to

 It doesn't get created twice, it shortcuts early.
 See the estimateSize variable in IndexWriter.

Unless I'm mistaken, the slow part of the index creation is the
term.getPages() call.  That call is where all the disk io hides, no?
The shortcut doesn't occur until after that call returns.  As
discussed above, the "the" term accounts for about 99.5% of the whole index,
and therefore (I'm assuming) 99.5% of the disk io.  And that 99.5%
happens twice.

The shortcut only functions properly when the largest term accounts
for a modest fraction of the total work, which is exactly what isn't
happening here.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Should the spider ignore common words?

2009-06-10 Thread Evan Daniel
On Wed, Jun 10, 2009 at 3:49 AM, Daniel Chengj16sdiz+free...@gmail.com wrote:
 On Wed, Jun 10, 2009 at 3:18 PM, Evan Danieleva...@gmail.com wrote:
 On Wed, Jun 10, 2009 at 2:56 AM, Daniel Chengj16sdiz+free...@gmail.com 
 wrote:
 On Wed, Jun 10, 2009 at 2:06 PM, Evan Danieleva...@gmail.com wrote:
 On Wed, Jun 10, 2009 at 1:54 AM, Daniel Chengj16sdiz+free...@gmail.com 
 wrote:
 On Wed, Jun 10, 2009 at 12:02 PM, Evan Danieleva...@gmail.com wrote:
 On my (incomplete) spider index, the index file for the word the (it
 indexes no other words) is 17MB.  This seems rather large.  It might
 make sense to have the spider not even bother creating an index on a
 handful of very common words (the, be, to, of, and, a, in, I, etc).
 Of course, this presents the occasional difficulty:
 http://bash.org/?514353  I think I'm in favor of not indexing common
 words even so.

 Yes, it should ignore common words.
 This is called stopword in search engine termology.


 Also, on a related note, the index splitting policy should be a bit
 more sophisticated: in an attempt to fit within the max index size as
 configured, it split all the way down to index_8fc42.xml.  As a
 result, the file index_8fc4b.xml sits all by itself at 3KiB.  It
 contains the two words vergessene and txjmnsm.  I suspect it would
 have reliability issues should anyone actually want to search either
 of those.  It would make more sense to have all of index_8fc4 in one
 file, since it would be only trivially larger.  (I have a patch that I
 thought did that, but it has a bug; I'll test once my indexwriter is
 finished writing, since I don't want to interrupt it by reloading the
 plugin.)

 trivially larger ...
 ugh... how trivial is trivial?

 the xmllibrarian can handle  index_8fc42.xml on its own but all other
 8fc4 on  index_8fc4.xml.
 however, as i have stated in irc, that make index generation even slower.

 8fc42 is 17382 KiB.  All other 8fc4 are 79 KiB combined.

 Also, it would make index generation faster.  The spider first does
 all the work of creating 8fc4, then discards it to recreate the
 sub-indexes.  The vast majority of this work is in 8fc42, which gets
 created twice.  Not splitting the index would nearly halve the time to

 It don't get created twice, it shortcut early.
 see the estimateSize variable in IndexWriter.

 Unless I'm mistaken, the slow part of the index creation is the
 term.getPages() call.  That call is where all the disk io hides, no?

 no :)
 getPages() returns an IPersistentSet (ScalableSet) which is lazily evaluated.

 Internally, it is a linkedset when small, a btree when large.
 The .size() method is always cached.

In this case, I don't think it helps.  13 bytes is a gross
underestimate of how much each added page grows the file, and
estimateSize isn't checked again until all the pages have been added.

Furthermore, that leaves the timing unexplained.  It takes as long to
generate b70 as all the rest of b7* combined.  This is fairly
consistent across the whole set of files (obviously some variation is
present).

2009-06-10 02:59 index_b6e.xml
2009-06-10 03:00 index_b6f.xml
2009-06-10 03:16 index_b70.xml
2009-06-10 03:17 index_b71.xml
2009-06-10 03:18 index_b72.xml
2009-06-10 03:19 index_b73.xml
2009-06-10 03:20 index_b74.xml
2009-06-10 03:21 index_b75.xml
2009-06-10 03:21 index_b76.xml
2009-06-10 03:22 index_b77.xml
2009-06-10 03:24 index_b78.xml
2009-06-10 03:24 index_b79.xml
2009-06-10 03:25 index_b7a.xml
2009-06-10 03:27 index_b7b.xml
2009-06-10 03:28 index_b7c.xml
2009-06-10 03:28 index_b7d.xml
2009-06-10 03:29 index_b7e.xml
2009-06-10 03:30 index_b7f.xml
2009-06-10 03:45 index_b80.xml
2009-06-10 03:47 index_b81.xml

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Should the spider ignore common words?

2009-06-10 Thread Evan Daniel
On Wed, Jun 10, 2009 at 6:49 AM, Mike Bushmpb...@gmail.com wrote:
 XMLLibrarian doesn't currently support searching for phrases or rating
 relevance of results based on proximity so I don't think common words
 could be of any use in searches now.

 Also, I'm not sure but I think the current index doesn't include words
 under 4 letters at all.

If you read my previous mails, you'll see that the spider is in
fact indexing the word "the".

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


[freenet-dev] Reducing Bloom filter memory usage

2009-06-12 Thread Evan Daniel
Currently, the Bloom filters use 46 bits per key (23 buckets, 2-bit
counter per bucket) of RAM, using 16 hash functions.  This gives a
false positive rate of 1.5E-5.

Oddly enough, a simple hash table has lower memory usage.  Create an
array in memory of n-bit values, one value per slot in the salted-hash
store.  The value stored in the array is an n-bit hash of the key in
the corresponding store location.  On a given lookup, the odds of a
false positive are 2^-n.  Because the store does quadratic probing
with 4 slots, there are 4 lookups per key request.  The false positive
rate is then 2^-(n-2).  For n=18, we get a false positive rate of
1.5E-5.  n=16 would align on byte boundaries and save a tiny bit of
memory at a cost of a false positive rate of 6.1E-5.
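
A minimal sketch (assumed names and parameters, not the actual datastore
code) of that scheme: one n-bit hash per store slot, checked against each of
the key's four quadratic-probe candidate slots.  For clarity it uses an int[]
rather than a packed n-bit array, and it ignores empty slots, which a real
implementation would have to account for.

import java.util.Arrays;

class SlotHashFilter {
    private final int mask;          // keeps the low n bits, e.g. n = 18
    private final int[] slotHashes;  // one entry per slot of the salted-hash store

    SlotHashFilter(int slots, int bitsPerSlot) {
        this.mask = (1 << bitsPerSlot) - 1;
        this.slotHashes = new int[slots];
    }

    private int slotHash(byte[] routingKey) {
        // any fast hash of the (salted) key, truncated to n bits
        return Arrays.hashCode(routingKey) & mask;
    }

    // Call when a key is written into a particular store slot.
    void put(byte[] routingKey, int slot) {
        slotHashes[slot] = slotHash(routingKey);
    }

    // false => the key is definitely not in any candidate slot;
    // true  => probably present; with 4 candidate slots the false positive
    //          rate is roughly 4 * 2^-n, i.e. 2^-(n-2).
    boolean mightContain(byte[] routingKey, int[] candidateSlots) {
        int h = slotHash(routingKey);
        for (int slot : candidateSlots) {
            if (slotHashes[slot] == h) return true;
        }
        return false;
    }
}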

The reason this works better than the Bloom filter is because a given
key can only go in a limited set of locations in the salted hash
store.  The Bloom filter works equally well whether or not that
restriction is present.  With this method, as the number of possible
locations grows, so does the false positive rate.  For low
associativity, the simple hash table wins.

This also makes updates and removal trivial.  However, we probably
can't share it with our neighbors without giving away our salt value.
For that, we probably want to continue planning to use Bloom filters.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Variable opennet connections: moving forward

2009-06-13 Thread Evan Daniel
On Sat, Jun 13, 2009 at 1:08 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 Now that 0.7.5 has shipped, we can start making disruptive changes again in a 
 few days. The number one item on freenet.uservoice.com has been for some time 
 to allow more opennet peers for fast nodes. We have discussed this in the 
 past, and the conclusions which I agree with and some others do:
 - This is feasible.
 - It will not seriously break routing.
 - Reducing the number of connections on slow nodes may actually be a gain in 
 security, by increasing opportunities for coalescing. It will improve payload 
 percentages, improve average transfer rates, let slow nodes accept more 
 requests from each connection, and should improve overall performance.
 - The network should be less impacted by the speed of the slower nodes.
 - But we have tested using fewer connections on slow nodes in the past and 
 had anecdotal evidence that it is slower. We need to evaluate it more 
 rigorously somehow.
 - Increasing the number of peers allowed for fast opennet nodes, within 
 reason, should not have a severe security impact. It should improve routing 
 (by a smaller network diameter). It will of course allow fast nodes to 
 contribute more to the network. We do need to be careful to avoid 
 overreliance on ubernodes (hence an upper limit of maybe 50 peers).
 - Routing security: FOAF routing allows you to capture most of the traffic 
 from a node already, the only thing stopping this is the 30%-to-one-peer 
 limit.
 - Coalescing security: Increasing the number of peers without increasing the 
 bandwidth usage does increase vulnerability to traffic analysis by doing less 
 coalescing. On the other hand, this is not a problem if the bandwidth usage 
 scales with the number of nodes.

 How can we move forward? We need some reliable test results on whether a 
 10KB/sec node is better off with 10 peers or with 20 peers. I think it's a 
 fair assumption for faster nodes. Suggestions?

I haven't tested at numbers that low.  At 15KiB/s, the stats page
suggests you're slightly better off with 12-15 peers than 20.  I saw no
subjective difference in browsing speed either way.

I'm happy to do some testing here, if you tell me what data you want
me to collect.  More testers would obviously be good.


 We also need to set some arbitrary parameters. There is an argument for 
 linearity, to avoid penalising nodes with different bandwidth levels, but 
 nodes with more peers and the same amount of bandwidth per peer are likely to 
 be favoured by opennet anyway... Non-linearity, in the sense of having a 
 lower threshold and an upper threshold and linearly add peers between them 
 but not necessarily consistently with the lower threshold, would mean fewer 
 nodes with lots of peers, and might achieve better results? E.g.

 10 peers at 10KB/sec ... 20 peers at 20KB/sec (1 per KB/sec)
 20 peers at 20KB/sec ... 50 peers at 80KB/sec (1 per 3KB/sec)

I wouldn't go as low as 10 peers, simply because I haven't tested it.
Other than that, those seem perfectly sensible to me.

We should also watch for excessive cpu usage.  If there's lots of bw
available, we'd want just enough connections that we don't quite hit the
limit of available cpu power.  Of course, I don't really know how many
connections / how much bw it takes before that becomes a concern.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] About the website

2009-06-13 Thread Evan Daniel
On Sat, Jun 13, 2009 at 2:18 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 Probably worth moving forward on this? Submenus are important, we have a lot 
 of content and it's not well organised. And arguably the theme is better, and 
 arguably even if it isn't better it's a change, and is no worse...

 On Monday 01 June 2009 01:24:24 Clément wrote:
 Hello all,

 about three weeks ago I had a HCI webproject to do.
 The subject was: improve an existing website (well, I'm not 100% sure that
 it was the subject, but that's what we've done)

 I convinced the three other people who worked with me to work on the freenet
 website. It was a small project though (3 hours with a teacher in the room, +
 3 hours max of personal time), so we didn't go far.

 But maybe some of what we've done could be usefull for the project.

 Here is the copy/paste of what we've done :

 I like the submenus. I think that is fairly universal. New layout is fine. I 
 am not convinced about the way the site has been split up however. More 
 comment below...

 --
 Objective of the new website:
 - To improve the existing navigation controls of freenetproject.org
 - To improve it's structural presentation of information on home page


 Aim of the web-site:
 - to present the software product and provide support
 - documentation and tools to users and developers to allow them to use and
 contribute to the software.

 Problems:
 The problems of current website http://freenetproject.org :
 - irrelevant information for homepage: mainly financial status, we don't know
 what freenet is
 - too many items in left navigation menu and not really well structured
 - documentation section where subsections do not have direct hyperlinks - its
 confusing

 Solutions we proposed:
 - simpler horizontal navigation bar with restructured tree
 - new menu tree proposition:

         Home -- what is freenet a bit modified page

 This page looks good IMHO.

         About freenet:
                 what is freenet
                 philosophy
                 contributors

         Downloads:
                 freenet
                 tools

 Tools are unofficial and unsupported. Maybe download should be under home?

         Contribute:
                 papers -- research and stuff
                 developer

 I'm not convinced papers belong under Contribute.

         Donations
                 donate
                 sponsors

 Ok. But shouldn't both be under Contribute?

         Support  feedback
                 help --documentation and stuff
                 faq --move out from help section
                 mailing lists
                 suggestions

 What about the wiki? Shouldn't it be on the same level as the uservoice 
 tracker?

 How about:

 Home
 - Home
 - What is Freenet?
 - Download Freenet

 About:
 - Philosophy
 - Papers
 - People

 Contribute:
 - Developer page
 - Donate
 - Sponsors

 Help:
 - Docs
 - FAQ
 - Mailing lists
 - Suggestions
 - Wiki

 Too many links? Depends on the theme I guess...

 Actually, I'm not convinced we want to keep the documentation pages:
 - Install only applies to the java installer, needs some typo fixes and a new 
 final screenshot, and some guidance on the post-install wizard.
 - Connect: needs updating but is basically acceptable.
 - Content: dunno, isn't this more About? but i'm not sure we want to move it 
 there...
 - Understand: maybe keep
 - Freemail: probably keep, is sort of official
 - Frost: dunno, we don't ship it, and we don't review it, but at the moment 
 we recommend it ...
 - jSite: keep
 - Thaw: see Frost
 - FAQ: should be at a higher level
 - Wiki: should be at a higher level

 ___
 Devl mailing list
 Devl@freenetproject.org
 http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


IMHO the wiki should be made more prominent, with a top level link.

Is there any reason the following shouldn't be wiki pages?
Current docs
FAQ
What is Freenet?
Papers
Philosophy
People

I'm happy to volunteer to work on the wiki, but only if it is going to
be made prominent enough that new users are likely to see it.  Buried
under a submenu as it presently is, I feel that effort spent improving
it would be wasted because no one who needs the info would ever see
it.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Variable opennet connections: moving forward

2009-06-13 Thread Evan Daniel
On Sat, Jun 13, 2009 at 2:54 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 On Saturday 13 June 2009 19:05:36 Evan Daniel wrote:
 On Sat, Jun 13, 2009 at 1:08 PM, Matthew
 Toselandt...@amphibian.dyndns.org wrote:
  Now that 0.7.5 has shipped, we can start making disruptive changes again 
  in a few days. The number one item on freenet.uservoice.com has been for 
  some time to allow more opennet peers for fast nodes. We have discussed 
  this in the past, and the conclusions which I agree with and some others 
  do:
  - This is feasible.
  - It will not seriously break routing.
  - Reducing the number of connections on slow nodes may actually be a gain 
  in security, by increasing opportunities for coalescing. It will improve 
  payload percentages, improve average transfer rates, let slow nodes accept 
  more requests from each connection, and should improve overall performance.
  - The network should be less impacted by the speed of the slower nodes.
  - But we have tested using fewer connections on slow nodes in the past and 
  had anecdotal evidence that it is slower. We need to evaluate it more 
  rigorously somehow.
  - Increasing the number of peers allowed for fast opennet nodes, within 
  reason, should not have a severe security impact. It should improve 
  routing (by a smaller network diameter). It will of course allow fast 
  nodes to contribute more to the network. We do need to be careful to avoid 
  overreliance on ubernodes (hence an upper limit of maybe 50 peers).
  - Routing security: FOAF routing allows you to capture most of the traffic 
  from a node already, the only thing stopping this is the 30%-to-one-peer 
  limit.
  - Coalescing security: Increasing the number of peers without increasing 
  the bandwidth usage does increase vulnerability to traffic analysis by 
  doing less coalescing. On the other hand, this is not a problem if the 
  bandwidth usage scales with the number of nodes.
 
  How can we move forward? We need some reliable test results on whether a 
  10KB/sec node is better off with 10 peers or with 20 peers. I think it's a 
  fair assumption for faster nodes. Suggestions?

 I haven't tested at numbers that low.  At 15KiB/s, the stats page
 suggests you're slightly better off with 12-15 peers than 20.  I saw no
 subjective difference in browsing speed either way.

 Which stats are you comparing?

Output bandwidth (average), payload %, and nodeAveragePingTime.  I'd
be happy to track others as well.


 I'm happy to do some testing here, if you tell me what data you want
 me to collect.  More testers would obviously be good.

 That would be a good start. It would be useful to compare:
 - 12KB/sec with 10, 12, 20 peers.
 - 8KB/sec with 8, 10, 20 peers.
 - 20KB/sec with 10, 15, 20 peers.

10 peers on each setting (proposed minimum), 20 peers (current
setting), and 1 peer per KiB/s...  What's the rationale behind 20KiB/s
with 15 peers?

The huge variable is what sort of load I put on the node.  Nothing?  A
few queued downloads?  Run the spider?  Some test files inserted for
the purpose by someone else?  Other ideas?


  We also need to set some arbitrary parameters. There is an argument for 
  linearity, to avoid penalising nodes with different bandwidth levels, but 
  nodes with more peers and the same amount of bandwidth per peer are likely 
  to be favoured by opennet anyway... Non-linearity, in the sense of having 
  a lower threshold and an upper threshold and linearly add peers between 
  them but not necessarily consistently with the lower threshold, would mean 
  fewer nodes with lots of peers, and might achieve better results? E.g.
 
  10 peers at 10KB/sec ... 20 peers at 20KB/sec (1 per KB/sec)
  20 peers at 20KB/sec ... 50 peers at 80KB/sec (1 per 3KB/sec)

 I wouldn't go as low as 10 peers, simply because I haven't tested it.

 Well, maybe the lower bound should be different. Testing should help. It 
 might very well be that there is a minimum number of opennet connections 
 below which it just doesn't work well.

I suspect that is the case.  I have no idea where that limit is,
though.  I suspect having the 30% limit become relevant just due to
normal routing policy would be bad.

Also, your math above is off: 20 KiB/s to 80 KiB/s is a 60 KiB/s jump;
adding 30 peers is 1 peer per 2 KiB/s.
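
For what it's worth, a minimal sketch (illustrative only; the numbers come
from the proposal above, with the corrected 1-peer-per-2KB/s slope on the
upper segment) of the piecewise-linear mapping:

class OpennetPeerScaling {
    // output bandwidth limit in KB/s -> allowed opennet peer count
    static int peersFor(int outputKBps) {
        if (outputKBps <= 10) return 10;                         // lower clamp
        if (outputKBps <= 20) return outputKBps;                 // 1 peer per KB/s
        if (outputKBps <= 80) return 20 + (outputKBps - 20) / 2; // 1 peer per 2 KB/s
        return 50;                                               // upper clamp
    }
}

That gives 20 peers at 20KB/s and 50 peers at 80KB/s, matching the endpoints
above.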


 Other than that, those seem perfectly sensible to me.

 We should also watch for excessive cpu usage.  If there's lots of bw
 available, we'd want to have just enough connections to not quite
 limit on available cpu power.  Of course, I don't really know how many
 connections / how much bw it is before that becomes a concern.

 Maybe... just routing requests isn't necessarily a big part of our overall 
 CPU usage, the client layer stuff tends to be pretty heavy ... IMHO if people 
 have CPU problems they can just reduce their bandwidth limits. To some degree 
 ping time will keep it in check, but that's a crude measure in that it can't 
 do much until the situation

Re: [freenet-dev] About the website

2009-06-13 Thread Evan Daniel
On Sat, Jun 13, 2009 at 3:15 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 On Saturday 13 June 2009 20:01:18 Evan Daniel wrote:
 On Sat, Jun 13, 2009 at 2:18 PM, Matthew
 Toselandt...@amphibian.dyndns.org wrote:
  Probably worth moving forward on this? Submenus are important, we have a 
  lot of content and it's not well organised. And arguably the theme is 
  better, and arguably even if it isn't better it's a change, and is no 
  worse...
 
  On Monday 01 June 2009 01:24:24 Clément wrote:
  Hello all,
 
  about three weeks ago I had a HCI webproject to do.
  The subject was : improve an existant website (well, I'm not 100% sure 
  that it
  was the subject, but that what we've done)
 
  I convinced the three other people who worked with me to work on the 
  freenet
  website. It was a small project though (3 hours with a teacher in the 
  room, +
  3 hours max of personal time), so we didn't go far.
 
  But maybe some of what we've done could be usefull for the project.
 
  Here is the copy/paste of what we've done :
 
  I like the submenus. I think that is fairly universal. New layout is fine. 
  I am not convinced about the way the site has been split up however. More 
  comment below...
 
  --
  Objective of the new website:
  - To improve the existing navigation controls of freenetproject.org
  - To improve it's structural presentation of information on home page
 
 
  Aim of the web-site:
  - to present the software product and provide support
  - documentation and tools to users and developers to allow them to use and
  contribute to the software.
 
  Problems:
  The problems of current website http://freenetproject.org :
  - irrelevant information for homepage: mainly financial status, we don't 
  know
  what freenet is
  - too many items in left navigation menu and not really well structured
  - documentation section where subsections do not have direct hyperlinks - 
  its
  confusing
 
  Solutions we proposed:
  - simpler horizontal navigation bar with restructured tree
  - new menu tree proposition:
 
          Home -- what is freenet a bit modified page
 
  This page looks good IMHO.
 
          About freenet:
                  what is freenet
                  philosophy
                  contributors
 
          Downloads:
                  freenet
                  tools
 
  Tools are unofficial and unsupported. Maybe download should be under home?
 
          Contribute:
                  papers -- research and stuff
                  developer
 
  I'm not convinced papers belong under Contribute.
 
          Donations
                  donate
                  sponsors
 
  Ok. But shouldn't both be under Contribute?
 
          Support  feedback
                  help --documentation and stuff
                  faq --move out from help section
                  mailing lists
                  suggestions
 
  What about the wiki? Shouldn't it be on the same level as the uservoice 
  tracker?
 
  How about:
 
  Home
  - Home
  - What is Freenet?
  - Download Freenet
 
  About:
  - Philosophy
  - Papers
  - People
 
  Contribute:
  - Developer page
  - Donate
  - Sponsors
 
  Help:
  - Docs
  - FAQ
  - Mailing lists
  - Suggestions
  - Wiki
 
  Too many links? Depends on the theme I guess...
 
  Actually, I'm not convinced we want to keep the documentation pages:
  - Install only applies to the java installer, needs some typo fixes and a 
  new final screenshot, and some guidance on the post-install wizard.
  - Connect: needs updating but is basically acceptable.
  - Content: dunno, isn't this more About? but i'm not sure we want to move 
  it there...
  - Understand: maybe keep
  - Freemail: probably keep, is sort of official
  - Frost: dunno, we don't ship it, and we don't review it, but at the 
  moment we recommend it ...
  - jSite: keep
  - Thaw: see Frost
  - FAQ: should be at a higher level
  - Wiki: should be at a higher level
 
  ___
  Devl mailing list
  Devl@freenetproject.org
  http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
 

 IMHO the wiki should be made more prominent, with a top level link.

 Is there any reason the following shouldn't be wiki pages?
 Current docs
 FAQ
 What is Freenet?
 Papers
 Philosophy
 People

 They'd have to be locked, or they'd get vandalised during a release. And 
 vandalism here could be very nasty. Also, there might be performance issues, 
 although if we have a third party hosting our wiki it might not be a problem.

Locked might be overkill.  Allowing edits by non-new accounts would
probably work.


 OTOH, docs and FAQ would definitely make sense as wiki pages ... they would 
 still need menu items on the main site... If we use mediawiki, can we get 
 notifications by email when a page is changed? On wikipedia i think you have 
 to login to see such notices?

I don't know.  I don't see

Re: [freenet-dev] About the website

2009-06-15 Thread Evan Daniel
On Mon, Jun 15, 2009 at 7:14 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 I have done the first phase of deploying this, after discussions with Ian. We 
 use the new background and the new logo, but we waste a lot of space on the 
 top line with the banner, and we don't use the horizontal menu yet as we 
 need to implement the sub-menus. Also I have rewritten the What is Freenet? 
 page with some input from Ian.

Looking at the new version, it feels like it's targeted at an
academic who is interested in the theory of anonymous networks.  IMHO,
it should be targeted at a potential new Freenet user.  What they want
to know is what they can do with it.  The first sentence is a great
introduction; it says that Freenet does something to let them
communicate anonymously and without censorship.  At that point, I
think the obvious question for a potential user isn't How does it
manage that? but What sorts of communication?  In the current
version, a new user has to get to the fourth paragraph before they get
any hint about what they can do with it, rather than how it works.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] About the website

2009-06-16 Thread Evan Daniel
On Tue, Jun 16, 2009 at 1:52 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 On Tuesday 16 June 2009 03:18:47 Evan Daniel wrote:
 On Mon, Jun 15, 2009 at 7:14 PM, Matthew
 Toselandt...@amphibian.dyndns.org wrote:
  I have done the first phase of deploying this, after discussions with Ian. 
  We use the new background and the new logo, but we waste a lot of space on 
  the top line with the banner, and we don't use the horizontal menu yet 
  as we need to implement the sub-menus. Also I have rewritten the What is 
  Freenet? page with some input from Ian.

 Looking at the new version, it feels like it's targeted at an
 academic who is interested in the theory of anonymous networks.  IMHO,
 it should be targeted at a potential new Freenet user.  What they want
 to know is what they can do with it.  The first sentence is a great
 introduction; it says that Freenet does something to let them
 communicate anonymously and without censorship.  At that point, I
 think the obvious question for a potential user isn't How does it
 manage that? but What sorts of communication?  In the current
 version, a new user has to get to the fourth paragraph before they get
 any hint about what they can do with it, rather than how it works.

 Okay. The homepage now says:
 ' Freenet is free software which lets you anonymously share files, browse and 
 publish web sites, and chat on forums, without fear of censorship. Users are 
 anonymous, and Freenet is entirely decentralised. Without anonymity there can 
 never be true freedom of speech, and without decentralisation the network 
 would be vulnerable to attack. Learn more!'

 The What is Freenet? page now says:
 ' Freenet is free software which lets you anonymously share files, browse and 
 publish web sites (freesites), and chat on forums, without fear of 
 censorship. Users are anonymous, and Freenet is entirely decentralised. 
 Without anonymity there can never be true freedom of speech, and without 
 decentralisation the network would be vulnerable to attack.

 Communications by Freenet nodes are encrypted and are routed through other 
 nodes to make it extremely difficult to determine who is requesting the 
 information and what its content is.

 Users contribute to the network by giving bandwidth and a portion of their 
 hard drive (called the data store) for storing files. Files are 
 automatically kept or deleted depending on how popular they are, with the 
 least popular being discarded to make way for newer or more popular content. 
 Files are encrypted, so generally the user cannot easily discover what is in 
 his datastore, and hopefully can't be held accountable for it. Chat forums, 
 websites, and search functionality, are all built on top of this distributed 
 data store.

 Freenet has been downloaded by over 2 million users since the project 
 started, and used for the distribution of censored information all over the 
 world including countries such as China and the Middle East. Ideas and 
 concepts pioneered in Freenet have had a significant impact in the academic 
 world. Our 2000 paper Freenet: A Distributed Anonymous Information Storage 
 and Retrieval System was the most cited computer science paper of 2000 
 according to Citeseer, and Freenet has also inspired papers in the worlds of 
 law and philosophy. Ian Clarke, Freenet's creator and project coordinator, 
 was selected as one of the top 100 innovators of 2003 by MIT's Technology 
 Review magazine.

 An important recent development, which very few other networks have, is the 
 darknet: By only connecting to people they trust, users can greatly reduce 
 their vulnerability, and yet still connect to a global network through their 
 friends' friends' friends and so on. This enables people to use Freenet even 
 in places where Freenet may be illegal, makes it very difficult for 
 governments to block it, and does not rely on tunneling to the free world.

 Sounds good? Try it!'


I think that's better.

I might change "browse and publish web sites (freesites)" to
something like "browse and publish freesites (freesites are like web
sites, but hosted on Freenet)".  The current version sounds somewhat
like you could browse the normal web with Freenet.  Given that some
new users are familiar with TOR, they may be expecting something like
it; best not to feed those assumptions.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] usability testing

2009-06-16 Thread Evan Daniel
On Tue, Jun 16, 2009 at 2:42 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 e) On the Download page: No idea what a "node reference" is. (Could be
 rephrased or explained better)

 That's why it's in quotes, and the "Add a friend" page does explain it. Do
 you have any suggestion as to how to improve the wording?

Leave it the same, but make "node reference" a link to an explanation
(perhaps even a wiki page...) instead of in quotes?

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Good screenshots needed

2009-06-16 Thread Evan Daniel
On Tue, Jun 16, 2009 at 7:26 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 We need some (3?) screenshots. These must be legal, look reasonably good both 
 in full and when thumbnailed to a reasonable size so we can put them on the 
 homepage.

 Alternatively, please explain why it would be bad to replace the news on the 
 homepage with a few screenshots.

I would vote for a front page that had the title and (possibly) the
first paragraph of the most recent news item, with a "read more" link.
It should be small enough to not take up too much of the
above-the-fold section.  Screenshots below that would be good.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Variable opennet connections: moving forward

2009-06-18 Thread Evan Daniel
On Thu, Jun 18, 2009 at 8:00 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 Are you doing more testing?

 On Saturday 13 June 2009 19:05:36 Evan Daniel wrote:
 On Sat, Jun 13, 2009 at 1:08 PM, Matthew
 Toselandt...@amphibian.dyndns.org wrote:
  Now that 0.7.5 has shipped, we can start making disruptive changes again 
  in a few days. The number one item on freenet.uservoice.com has been for 
  some time to allow more opennet peers for fast nodes. We have discussed 
  this in the past, and the conclusions which I agree with and some others 
  do:
  - This is feasible.
  - It will not seriously break routing.
  - Reducing the number of connections on slow nodes may actually be a gain 
  in security, by increasing opportunities for coalescing. It will improve 
  payload percentages, improve average transfer rates, let slow nodes accept 
  more requests from each connection, and should improve overall performance.
  - The network should be less impacted by the speed of the slower nodes.
  - But we have tested using fewer connections on slow nodes in the past and 
  had anecdotal evidence that it is slower. We need to evaluate it more 
  rigorously somehow.
  - Increasing the number of peers allowed for fast opennet nodes, within 
  reason, should not have a severe security impact. It should improve 
  routing (by a smaller network diameter). It will of course allow fast 
  nodes to contribute more to the network. We do need to be careful to avoid 
  overreliance on ubernodes (hence an upper limit of maybe 50 peers).
  - Routing security: FOAF routing allows you to capture most of the traffic 
  from a node already, the only thing stopping this is the 30%-to-one-peer 
  limit.
  - Coalescing security: Increasing the number of peers without increasing 
  the bandwidth usage does increase vulnerability to traffic analysis by 
  doing less coalescing. On the other hand, this is not a problem if the 
  bandwidth usage scales with the number of nodes.
 
  How can we move forward? We need some reliable test results on whether a 
  10KB/sec node is better off with 10 peers or with 20 peers. I think it's a 
  fair assumption for faster nodes. Suggestions?

 I haven't tested at numbers that low.  At 15KiB/s, the stats page
 suggests you're slightly better off with 12-15 peers than 20.  I saw no
 subjective difference in browsing speed either way.

 I'm happy to do some testing here, if you tell me what data you want
 me to collect.  More testers would obviously be good.

 
  We also need to set some arbitrary parameters. There is an argument for 
  linearity, to avoid penalising nodes with different bandwidth levels, but 
  nodes with more peers and the same amount of bandwidth per peer are likely 
  to be favoured by opennet anyway... Non-linearity, in the sense of having 
  a lower threshold and an upper threshold and linearly add peers between 
  them but not necessarily consistently with the lower threshold, would mean 
  fewer nodes with lots of peers, and might achieve better results? E.g.
 
  10 peers at 10KB/sec ... 20 peers at 20KB/sec (1 per KB/sec)
  20 peers at 20KB/sec ... 50 peers at 80KB/sec (1 per 3KB/sec)

 I wouldn't go as low as 10 peers, simply because I haven't tested it.
 Other than that, those seem perfectly sensible to me.

 We should also watch for excessive cpu usage.  If there's lots of bw
 available, we'd want to have just enough connections to not quite
 limit on available cpu power.  Of course, I don't really know how many
 connections / how much bw it is before that becomes a concern.

 Evan Daniel


I'd been running the Spider, and trying to get a complete run out of
it in order to provide a full set of bug reports.  Unfortunately,
after spidering over 100k keys (representing over a week of runtime),
the .dbs file became unrecoverably corrupted, and it won't write index
files.  I had started rerunning it; I've since paused that and started
taking data on connections.

I've got a little data so far at 12KiB/s limit, 10 and 12 peers.
Basically, I don't see a difference between 10 and 12 peers.  Both
produce reasonable performance numbers.  My node has 2 darknet peers,
remainder opennet.  I'm not using the node much during these tests; it
has a few MiB of downloads queued that aren't making progress (old
files that have probably dropped off).

Evan Daniel


12 peers, 12 KiB/s limit

# bwlimitDelayTime: 91ms
# nodeAveragePingTime: 408ms
# darknetSizeEstimateSession: 0 nodes
# opennetSizeEstimateSession: 63 nodes
# nodeUptime: 1h37m

# Connected: 10
# Backed off: 2

# Input Rate: 2.54 KiB/s (of 60.0 KiB/s)
# Output Rate: 12.9 KiB/s (of 12.0 KiB/s)
# Total Input: 31.3 MiB (5.5 KiB/s average)
# Total Output: 47.5 MiB (8.34 KiB/s average)
# Payload Output: 32.6 MiB (5.73 KiB/sec)(68%)

1469    Output bandwidth liability
18      SUB_MAX_PING_TIME

Success rates
Group           P(success)  Count
All requests    3.329%      10,633
CHKs            9.654%      3,377
SSKs            0.386%      7,256
Local requests  2.022

[freenet-dev] FCP questions

2009-06-18 Thread Evan Daniel
I'm trying to test out SSK inserts with FCP, and running into problems.

First, the Disconnect command does not work:

Disconnect
EndMessage

ProtocolError
ExtraDescription=Unknown message name Disconnect
Fatal=false
CodeDescription=Don't know what to do with message
Code=7
EndMessage


Second, DDA isn't working:

TestDDARequest
Directory=/data/freenet/
WantReadDirectory=true
WantWriteDirectory=false
EndMessage

TestDDAReply
ReadFilename=/data/freenet/DDACheck-2801570529831049194.tmp
Directory=/data/freenet
EndMessage

TestDDAResponse
Directory=/data/freenet/
ReadContent=[stuff read from file indicated above]
EndMessage

TestDDAComplete
ReadDirectoryAllowed=false
Directory=/data/freenet
EndMessage


Third, if I didn't want to use DDA to insert my data, how would I do
that?  The documentation on ClientPut doesn't say how to actually send
data in the direct mode.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] FCP questions

2009-06-19 Thread Evan Daniel
On Fri, Jun 19, 2009 at 3:50 AM, bo-lebo...@web.de wrote:
 Am Freitag, 19. Juni 2009 03:08:13 schrieb Evan Daniel:
 I'm trying to test out SSK inserts with FCP, and running into problems.

 First, the Disconnect command does not work:

 Disconnect
 EndMessage

 Where did you get this info? I have never seen this.

 Just close the socket. The node will detect it ;)

From http://wiki.freenetproject.org/FCP2p0Disconnect

Is there more accurate / up to date FCP documentation somewhere?

I'd been testing with telnet, and I'd rather type Disconnect than fuss
with escape sequences, but it's not exactly a huge problem.



 ProtocolError
 ExtraDescription=Unknown message name Disconnect
 Fatal=false
 CodeDescription=Don't know what to do with message
 Code=7
 EndMessage


 Second, DDA isn't working:

 TestDDARequest
 Directory=/data/freenet/
 WantReadDirectory=true
 WantWriteDirectory=false
 EndMessage

 TestDDAReply
 ReadFilename=/data/freenet/DDACheck-2801570529831049194.tmp
 Directory=/data/freenet
 EndMessage

 TestDDAResponse
 Directory=/data/freenet/
 ReadContent=[stuff read from file indicated above]
 EndMessage

 TestDDAComplete
 ReadDirectoryAllowed=false
 Directory=/data/freenet
 EndMessage


 Third, if I didn't want to use DDA to insert my data, how would I do
 that?  The documentation on ClientPut doesn't say how to actually send
 data in the direct mode.


 sample command:

 ClientPut
 Persistence=connection
 Verbosity=-1                 //be my wife  err be max verbose.
 UploadFrom=direct
 PriorityClass=1
 DataLength=42
 Identifier=jHyperocha1245397350876
 TargetFilename=            // missing this field means auto, an empty name
 means off
 Global=false
 uri=...@blah,blub,AQECAAE/singlechunktest
 Data
 42 bytes of data

Thanks, that works!
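
For anyone else hitting the same question, here is a minimal sketch of a
direct-mode insert over a raw socket.  The identifiers and key are made up,
and the field names simply follow the sample above plus the usual ClientHello
handshake, so treat it as an illustration rather than a reference:

import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class FcpDirectPut {
    public static void main(String[] args) throws Exception {
        byte[] data = "Hello Freenet".getBytes(StandardCharsets.UTF_8);
        try (Socket s = new Socket("127.0.0.1", 9481);            // default FCP port
             OutputStream out = s.getOutputStream()) {
            String hello = "ClientHello\n"
                    + "Name=ExampleClient\n"
                    + "ExpectedVersion=2.0\n"
                    + "EndMessage\n";
            String put = "ClientPut\n"
                    + "URI=KSK@fcp-direct-test\n"                 // made-up test key
                    + "Identifier=example-1\n"
                    + "UploadFrom=direct\n"
                    + "DataLength=" + data.length + "\n"
                    + "Data\n";                                   // raw bytes follow
            out.write(hello.getBytes(StandardCharsets.UTF_8));
            out.write(put.getBytes(StandardCharsets.UTF_8));
            out.write(data);
            out.flush();
            // A real client would now read NodeHello, URIGenerated,
            // PutSuccessful etc. from the socket's input stream.
        }
    }
}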

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Can we implement Bloom filter sharing quickly???

2009-06-19 Thread Evan Daniel
On Fri, Jun 19, 2009 at 1:36 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 On Friday 19 June 2009 17:37:00 Robert Hailey wrote:

 On Jun 18, 2009, at 9:07 PM, Matthew Toseland wrote:

  CHUNK SIZE PROBLEM:
  ===
 
  Current plans call to split the keyspace up for each datastore, and
  assign keys to a manageable sized bloom filter for a section of the
  (hashed) keyspace. These can then be transferred separately, are
  useful immediately after being transferred, can be juggled in
  memory, etc. However, we cannot guarantee that the population of such
  a chunk will be small enough for the filter to be effective. Solving
  this requires either moving the boundaries dynamically (possibly
  continually), or making the chunks significantly larger than they
  would need to be in an ideal world.

 Right... but I believe we prescribed that the split would be based on
 a hashed value of the key and not by logical keyspace location to
 avoid disproportionate chunks.

 That is to say... ideally a node is going to get a disproportionate
 amount of cache/store data about its network location.

 STORE_SIZE/MAX_BLOOM_CHUNK -> N_CHUNKS
 H(key, N_CHUNKS) = n  (0 <= n < N)
 CHUNK[n].add(key)

 Wouldn't the problem be reduced to finding a well-scattering hash
 function then?

 Yes and no. How much variation will we have even if we divide by hashed 
 keyspace? Hence how much bigger than the ideal splitting size do we need each 
 chunk to be to maintain approximately the right false positives ratio?

If the hash spreads things well, number of keys in a single bloom
filter should be normally distributed with mean ((total keys) /
(number of filters)) and standard deviation sqrt((total keys) * (1 /
(number of filters)) * (1 - 1 / (number of filters))).  (It's a
binomial distribution; any given key has a 1/ number of filters chance
of landing in a specific filter.)  For a 100GiB total store, we have
3E6 keys between store and cache (for CHKs, same for SSKs).  That
implies 17MiB of bloom filters.  If we size the filters at 1MiB each,
with 17 filters, we have 176k keys per filter on average.  From the
preceding formula, standard deviation is 408 keys.

Size variation is only a serious concern if the hash function is not
distributing the keys at random.  To be safe, we could slightly
underfill the filters.
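
As a quick check of those figures (plain arithmetic, no Freenet code
involved): keys land in one of F filters uniformly at random, so the
per-filter count is Binomial(K, 1/F) with mean K/F and standard deviation
sqrt(K * (1/F) * (1 - 1/F)).

class FilterFill {
    public static void main(String[] args) {
        double keys = 3e6;     // ~100GiB store, CHKs in store + cache
        double filters = 17;   // 1MiB Bloom filters, ~17MiB total
        double mean = keys / filters;
        double sigma = Math.sqrt(keys * (1 / filters) * (1 - 1 / filters));
        System.out.printf("mean %.0f keys/filter, sigma %.0f keys%n", mean, sigma);
        // prints roughly: mean 176471 keys/filter, sigma 408 keys
    }
}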

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Can we implement Bloom filter sharing quickly???

2009-06-19 Thread Evan Daniel
On Fri, Jun 19, 2009 at 9:15 PM, Juicemanjuicema...@gmail.com wrote:
 On Fri, Jun 19, 2009 at 1:50 PM, Evan Danieleva...@gmail.com wrote:
 On Fri, Jun 19, 2009 at 1:36 PM, Matthew
 Toselandt...@amphibian.dyndns.org wrote:
 On Friday 19 June 2009 17:37:00 Robert Hailey wrote:

 On Jun 18, 2009, at 9:07 PM, Matthew Toseland wrote:

  CHUNK SIZE PROBLEM:
  ===
 
  Current plans call to split the keyspace up for each datastore, and
  assign keys to a manageable sized bloom filter for a section of the
  (hashed) keyspace. These can then be transferred separately, are
  useful immediately after being transferred, can be juggled in
  memory, etc. However, we cannot guarantee that the populaion of such
  a chunk will be small enough for the filter to be effective. Solving
  this requires either moving the boundaries dynamically (possibly
  continually), or making the chunks significantly larger than they
  would need to be in an ideal world.

 Right... but I believe we prescribed that the split would be based on
 a hashed value of the key and not by logical keyspace location to
 avoid disproportionate chunks.

 That is to say... ideally a node is going to get a disporportionate
 amount of cache/store data about it's network location.

 STORE_SIZE/MAX_BLOOM_CHUNK - N_CHUNKS
 H(key, N_CHUNKS) = n  (0  n  N)
 CHUNK[n].add(key)

 Wouldn't the problem be reduced to finding a well-scattering hash
 function then?

 Yes and no. How much variation will we have even if we divide by hashed 
 keyspace? Hence how much bigger than the ideal splitting size do we need 
 each chunk to be to maintain approximately the right false positives ratio?

 If the hash spreads things well, number of keys in a single bloom
 filter should be normally distributed with mean ((total keys) /
 (number of filters)) and standard deviation sqrt((total keys) * (1 /
 (number of filters)) * (1 - 1 / (number of filters))).  (It's a
 binomial distribution; any given key has a 1/ number of filters chance
 of landing in a specific filter.)  For a 100GiB total store, we have
 3E6 keys between store and cache (for CHKs, same for SSKs).  That
 implies 17MiB of bloom filters.  If we size the filters at 1MiB each,
 with 17 filters, we have 176k keys per filter on average.  From the
 preceding formula, standard deviation is 408 keys.

 Size variation is only a serious concern if the hash function is not
 distributing the keys at random.  To be safe, we could slightly
 underfill the filters.

 Evan Daniel
 ___
 Devl mailing list
 Devl@freenetproject.org
 http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


 Perhaps this is what you are discussing, but if (generally speaking)
 requests are supposed to route to the neighboring node with the
 closest location to hopefully get closer each time, would it make
 sense to limit the bloom filter we share to a section of the keyspace
 we are supposed to specialize in?  For example, say my node's location
 is 0.5, I have neighbors expecting me to have data close to 0.5.  Do
 they care if I have a key for the 0.9 area of the keyspace?  They
 should be checking the bloom filter of their top n choices of
 neighbors for that keyspace for a direct hit.  If not, chuck it
 toward the closest one and move on.  The next node in the
 chain will do the same, hopefully getting closer each time...

 Might we instead send the bloom filter of x portion of our
 datastore, x being either a fixed number of keys near our
 specialization or a percent of our datastore in each direction.  Even
 if we just send the closest 25%-50% it should provide an improvement
 in speed at a cost/benefit/risk ratio that would be easier than full
 bloom filter sharing...

 This would reduce the amount of bloom filter data needing to be shared
 while limiting exposing that we have received an insert.


Several thoughts in no particular order...

I see two basic ways to do it.  Split fairly with a hash function, or
split on actual keyspace.  If splitting on actual keyspace, the
obvious way to do it is to add keys to a filter until it gets overly
full, then split it in half and rescan the datastore.  That results in
filters that are (on average) 3/4 full; the hash approach results in
filters that are almost completely full.  The memory savings aren't
huge, but the disk io difference might be.
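
To make the comparison concrete, the hash-based assignment could look
something like this (hypothetical names, not the actual datastore code):

// Sketch of hash-based filter assignment; hypothetical names, not
// Freenet's actual datastore classes.
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FilterAssignment {
    // Hash-based split: every key is mapped to a chunk by hashing, so each
    // chunk fills at the same average rate regardless of the node's
    // location.  The keyspace-split alternative would instead compare the
    // key's routing location against per-chunk boundaries and split a
    // chunk in half (rescanning the store) whenever it overfills.
    static int chunkForKey(byte[] routingKey, int nChunks) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-256").digest(routingKey);
            int h = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                  | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
            return (h & 0x7fffffff) % nChunks;  // non-negative chunk index
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);      // SHA-256 is always present
        }
    }

    public static void main(String[] args) {
        byte[] key = "example routing key".getBytes();
        System.out.println("chunk = " + chunkForKey(key, 17));
    }
}

The 3/4-full figure above comes from the split-in-half behaviour: right
after a split each half is about half full, and just before the next split
it is full, so on average the filters sit around three quarters full.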

Splitting on normal keyspace means we can tell our neighbors about
selected portions of the keyspace.  That may improve normal routing.
Telling them about the wrong portion of the keyspace might make old
keys in the wrong spot that would otherwise seem to have fallen out of
the network more available.  Which is better is not clear to me,
though I suspect sending info about the keyspace near the node is.
(Though should we send info about the keyspace near us, or near the
node we're sending to?  They're usually but not always similar.)

Clearly, the ideal case is that we send all of the bloom filters

Re: [freenet-dev] New website design

2009-07-16 Thread Evan Daniel
On Thu, Jul 16, 2009 at 7:02 AM, Luke771luke771.li...@gmail.com wrote:
 Colin Davis wrote:

        * If the user is on windows (Detectable using
 http://www.quirksmode.org/js/detect.html), we should link the Download
 button directly to
 http://freenet.googlecode.com/files/FreenetInstaller-1222.exe

 No.
 A lot of users have a Windows desktop and a *nix server, and would run
 Freenet on the unix server.
 I say make one download page for everyone... or possibly make different
 download pages according to the detected OS, but all of them including
 all the versions, just changing the order (detected OS first)

Users who want a download that doesn't match OS detection tend to know
what they're doing.  I suggest a large button that links directly to
the autodetected download, and says which OS it is for.  Then, below
that, a link to the full download page.

For example:
http://www.mozilla.com/en-US/firefox/firefox.html

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Installer file name

2009-07-22 Thread Evan Daniel
On Wed, Jul 22, 2009 at 10:41 AM, bren...@artvote.com wrote:

 Message: 4
 Date: Tue, 21 Jul 2009 12:57:28 -0500
 From: Ian Clarke i...@locut.us
 Subject: Re: [freenet-dev] Installer file name
 To: Discussion of development issues devl@freenetproject.org
 Message-ID:
 823242bd0907211057s4e91e920m35ba1df107081...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 Brendan,

 The 4 digit number would, in most products, be an internal build
 number. The problem with Freenet is that we do extremely frequent
 releases, and to distinguish between them requires a rather large
 number :-)

 We do also do infrequent major releases, like 0.7.5 - however in
 practice there are many many intermediate releases.

 I'm certainly wide open to ideas about an alternative nomenclature.

 Ian.

 --
 Just curious, whom exactly does the 4 digit number benefit? Do users care
 about this number? And if so why? (Sorry if these are dumb questions. Just
 trying to wrap my head around the issue:)
 -Brendan

It's the build number.  Basically a different style of version number.
 There are frequent updates that increment the build number, and
occasional updates that change the version number (eg 0.7.5 - 0.8.0).
 Version numbers can be thought of as just a name for a specific
build.  Users care because Freenet does mandatory updates relatively
frequently -- nodes won't talk to any node older than the most recent
mandatory update.  So if you install an outdated build, it won't be
able to connect properly (the update-over-mandatory code should let it
update itself and then start working, but it's less than ideal).
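
As a toy illustration of that rule (made-up numbers and names, not the
real handshake code):

// Hypothetical illustration of the mandatory-build rule described above;
// not Freenet's actual handshake code.
public class MandatoryBuildCheck {
    public static void main(String[] args) {
        int peerBuild = 1221;           // build number the remote node reports
        int lastMandatoryBuild = 1222;  // oldest build current nodes will accept

        if (peerBuild < lastMandatoryBuild) {
            // Too old: the connection is refused until the peer updates
            // itself (ideally via update-over-mandatory) to a new enough build.
            System.out.println("Peer build " + peerBuild + " predates mandatory build "
                    + lastMandatoryBuild + "; refusing to connect.");
        }
    }
}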

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


[freenet-dev] Experiences writing a plugin

2009-07-22 Thread Evan Daniel
I've been slowly working on writing a plugin for Freenet over the past
couple weeks.  (Details aren't particularly relevant to what follows;
for the curious, it's a microblogging / chat application aiming at
keeping latencies as low as possible.  RFC available at
u...@cf9ctasza8w2jafeqmiln49tfrpdz2q5m68m1m5r9w0,NQiPGX7tNcaXVRXljGJnFlKhnf0eozNQsb~NwmBAJ4k,AQACAAE/Fritter-site/1/
)

Having not written much actual Freenet code before, I'm learning a lot
about how Freenet works in the process -- which is harder than it has
any reason to be.  Why?  NOTHING IS DOCUMENTED.  For example, after
retrieving some text from Freenet, my plugin would like to display it
on a web page, including filtering it so that it doesn't break
anything, even in the case of malicious input.  The method
HTMLEncoder.encode() sounds like it ought to do that.  Let's take a
look at the Javadoc:

encode

public static java.lang.String encode(java.lang.String s)

That's it.  Nothing about what the method is supposed to do.  So,
after hunting through several source files (when reading the Javadoc
should have sufficed), I have a fairly good sense of what the method
does.  I'm pretty sure it doesn't do precisely what I'm looking for
(though the HTML specs are complicated enough, and I don't know them
well enough, that I'm not certain either way).  Neither the Javadoc
nor inline comments reference any standard that it is trying to
conform to.  If there was a contract specified, I could fairly easily
determine whether that contract matched what I needed -- and therefore
whether I should be writing my own function or submitting a patch (or
whether I'm misreading the relevant specs, for that matter).

If this were an isolated incident, it wouldn't matter much.  It isn't.
 It is the norm for Freenet.  For a platform whose primary impediment
to wider adoption (IMO, of course) is a lack of things to do with it,
rather than a lack of underlying functionality, this is a problem.  I
haven't tracked it, but I wouldn't be surprised if I've spent nearly
as much time trying to figure out how the plugin API works (or even
which classes it consists of) as I have actually writing code.

In case I haven't made my point yet, here are a few questions I've
had.  Can anyone point me to documentation that answers them (Javadoc
or wiki)?  I've spent some time looking, and I haven't found it.  Most
(but not all) I've answered for myself by reading lots of Freenet code
-- a vastly slower process.  Some of them I believe represent bugs.

How do I make a request keep retrying forever?
How do I determine whether a ClientGetter represents an active request?
Are there any circumstances under which ClientGetCallback.onSuccess()
will be called more than once, and do I need to handle them?
Why doesn't ClientGetCallback.onFetchable() get called (more than a
trivial time) before onSuccess()?
How do I guarantee that an insert won't generate a redirect?
What, if anything, should I be doing with this ObjectContainer
container that gets passed around everywhere?
Under what circumstances will I see a FetchException or
InsertException thrown when using the HighLevelSimpleClient?
Why does the Hello World plugin break if I turn it into a FredPluginHTTP?

None of these seem like overly complex or unexpected questions for
someone trying to write a plugin for Freenet.  They should all be
answerable by reading documentation.

On a closely related note, here are a few related issues -- imho they
need fixes in the code, but proper documentation of the actual
behavior would have left me far less confused when debugging:
FreenetURI.setDocName(String) doesn't.
Creating a ClientSSK from a FreenetURI and then attempting to insert a
file to it throws an exception about wrong extra bytes even though
the extra bytes are unchanged from the FreenetURI the node generated
for me.

At this point, I think I have a much better understanding of why
Freenet has so little software that makes use of it, despite the fact
that Freenet itself seems to work fairly well.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Experiences writing a plugin

2009-07-23 Thread Evan Daniel
On Thu, Jul 23, 2009 at 2:09 AM, David ‘Bombe’
Rodenbo...@pterodactylus.net wrote:
 On Thursday 23 July 2009 01:12:15 Evan Daniel wrote:

 The method HTMLEncoder.encode() sounds like it ought to do that.  Let's take
 a look at the Javadoc:

 encode

 public static java.lang.String encode(java.lang.String s)

 I guess I’m the only one having to take the blame for that. HTMLNode and its
 companion classes were committed by me. And though in a local version all of the
 HTML* classes do have javadoc comments, those comments seem to have been created
 after I committed the classes to Freenet.

 Since I wrote those classes I have switched to javadoc-commenting everything I
 write as I write it. As you have already noticed, documentation is not a strong
 suit of developers; even if I had committed those files with documentation, that
 documentation would now be largely out of date and might even be wrong. That
 would be about as helpful as having no documentation at all. :)

 I’m aware that the HTMLEncoder class was merely an example that you used to
 demonstrate what’s wrong with the code base in general. Unfortunately there’s not
 really a way to force developers to write (good!) documentation for the stuff
 that they do.

 I have configured my Eclipse to perform javadoc validation on everything,
 including checking for malformed comments and the like. And I’ve grown opposed
 to files that show any warning in Eclipse so that at least the code I’m
 committing in the future will have meaningful javadoc comments. I’m urging
 everyone to do the same but—as I already said—you can’t enforce it.

Well, if you're sufficiently motivated, you can enforce it -- simply
don't apply patches that don't include sufficient documentation.
That's harsh enough that I'm not sure it's worth it, though.  And yes,
the HTMLEncoder was merely the most recent example I'd been frustrated
with; it's no better or worse than most of the rest of the code I've
looked at.

If you'd commit your comments on HTMLEncoder, I'd appreciate it.
Specifically, I'm trying to figure out whether it's supposed to take
well-formed, non-malicious strings and format them so they display
properly in HTML, or whether it should also be filtering malicious
strings so that they don't screw up the page.  Not being an HTML
expert, I'm uncertain whether it does the latter.  It worries me that
it doesn't filter the ascii control characters.  The best citation I
can find on that says that aside from tab, CR, and LF, they shouldn't
appear in HTML documents:
http://www.w3.org/MarkUp/html3/specialchars.html
However, that's for HTML 3.  HTML 4 specifies everything in reference
to character encodings, and I'm having trouble answering the question
for it.  Perhaps someone who knows HTML better than I do could chime
in?
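
To be clear about what I mean, here is a rough sketch of the contract I'm
hoping for -- this is not what HTMLEncoder currently does, and the
control-character handling follows my possibly wrong reading of the spec:

// Sketch of the contract I have in mind; NOT the current HTMLEncoder code,
// and the control-character rule is my interpretation of the spec.
public class SafeHtmlText {
    public static String encode(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '&') {
                out.append("&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else if (c == '"') {
                out.append("&quot;");
            } else if ((c < 0x20 && c != '\t' && c != '\r' && c != '\n') || c == 0x7f) {
                // Drop ASCII control characters other than tab, CR, LF.
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: &lt;b&gt;bold&lt;/b&gt; &amp; bell
        System.out.println(encode("<b>bold</b> & \u0007bell"));
    }
}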

My personal habit in documenting involves more comments than most of
Freenet has.  It tends towards big blocks of text rather than Javadoc
since I rarely run the javadoc engine against my own code -- my
projects tend to be small enough I usually have all the relevant files
open in gvim anyway.  I think credit for that goes to my Assembly
professor, who wouldn't even bother grading things that didn't have
sufficient comments.  Of course, I'm not particularly good about
keeping the comments up to date, which is sometimes better and
occasionally worse than no comments.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Experiences writing a plugin

2009-07-25 Thread Evan Daniel
On Sat, Jul 25, 2009 at 12:27 PM, Zero3ze...@zerosplayground.dk wrote:
 Evan Daniel wrote:
 Having not written much actual Freenet code before, I'm learning a lot
 about how Freenet works in the process -- which is harder than it has
 any reason to be.  Why?  NOTHING IS DOCUMENTED.

 [snip]

 If this were an isolated incident, it wouldn't matter much.  It isn't.
 It is the norm for Freenet.  For a platform whose primary impediment
 to wider adoption (IMO, of course) is a lack of things to do with it,
 rather than a lack of underlying functionality, this is a problem.  I
 haven't tracked it, but I wouldn't be surprised if I've spent nearly
 as much time trying to figure out how the plugin API works (or even
 which classes it consists of) as I have actually writing code.

 [snip]

 At this point, I think I have a much better understanding of why
 Freenet has so little software that makes use of it, despite the fact
 that Freenet itself seems to work fairly well.

 I completely agree. I've been pulling my hair out over similar issues before
 as well. For me, the closest thing to documentation was the Wiki and/or
 simply asking toad about what I needed to know.

That has been my strategy as well.  I tend to think it's a bad use of
toad's time to answer questions that could be answered by
documentation, and a bad use of my time to wait until he's available
to get answers.


 I guess the fact that the Freenet core is ever-changing has a lot to do
 with it.

If the documentation were merely out of date, I would agree.  However,
it's not out of date, it's nonexistent.  Also, the main APIs have been
stable enough for long enough that I don't think this is an excuse any
longer, especially for parts like plugins and FCP that are expected to
be used by outside programs (as opposed to FNP, etc).

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Experiences writing a plugin

2009-07-25 Thread Evan Daniel
On Sat, Jul 25, 2009 at 3:53 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 The basic answer is that plugins should be written to a defined, fixed, 
 sandboxed, external API. But there is no such API. We will need to implement 
 one eventually. But this takes considerable time, which could be spent on 
 things which give a more direct benefit to end users in terms of  usability, 
 performance or security.

My general complaint is not that the plugin api isn't well defined and
perfect, though that would be nice.  It's that what does exist isn't
documented.  My current take on coding for Freenet is that the API
isn't perfect, but that I could work with it if I knew what it did.
However, figuring out what it does requires reading significant
quantities of Freenet source code as often as not.


 On Thursday 23 July 2009 00:12:15 Evan Daniel wrote:
 I've been slowly working on writing a plugin for Freenet over the past
 couple weeks.  (Details aren't particularly relevant to what follows;
 for the curious, it's a microblogging / chat application aiming at
 keeping latencies as low as possible.  RFC available at
 u...@cf9ctasza8w2jafeqmiln49tfrpdz2q5m68m1m5r9w0,NQiPGX7tNcaXVRXljGJnFlKhnf0eozNQsb~NwmBAJ4k,AQACAAE/Fritter-site/1/
 )

 Having not written much actual Freenet code before, I'm learning a lot
 about how Freenet works in the process -- which is harder than it has
 any reason to be.  Why?  NOTHING IS DOCUMENTED.  For example, after
 retrieving some text from Freenet, my plugin would like to display it
 on a web page, including filtering it so that it doesn't break
 anything, even in the case of malicious input.  The method
 HTMLEncoder.encode() sounds like it ought to do that.  Let's take a
 look at the Javadoc:

 encode

 public static java.lang.String encode(java.lang.String s)

 That's it.  Nothing about what the method is supposed to do.  So,
 after hunting through several source files (when reading the Javadoc
 should have sufficed), I have a fairly good sense of what the method
 does.  I'm pretty sure it doesn't do precisely what I'm looking for
 (though the HTML specs are complicated enough, and I don't know them
 well enough, that I'm not certain either way).  Neither the Javadoc
 nor inline comments reference any standard that it is trying to
 conform to.  If there was a contract specified, I could fairly easily
 determine whether that contract matched what I needed -- and therefore
 whether I should be writing my own function or submitting a patch (or
 whether I'm misreading the relevant specs, for that matter).

 HTMLEncoder is supposed to encode text so that it can be safely inserted into 
 HTML, as its name suggests. It is used widely and hopefully it is sufficient.

See https://bugs.freenetproject.org/view.php?id=3335


 If this were an isolated incident, it wouldn't matter much.  It isn't.
  It is the norm for Freenet.  For a platform whose primary impediment
 to wider adoption (IMO, of course) is a lack of things to do with it,
 rather than a lack of underlying functionality, this is a problem.  I
 haven't tracked it, but I wouldn't be surprised if I've spent nearly
 as much time trying to figure out how the plugin API works (or even
 which classes it consists of) as I have actually writing code.

 That is an interesting theory. I guess since we have a load of money from 
 Google it would be possible to spend some time on the plugin API???

I think it would be *very* wise to spend time on the Javadoc for what
currently exists, and making assorted small improvements.  (For
example: InsertContext should be cloneable, like FetchContext.)  As
far as writing a better API, or sandboxing, I'm not as certain.  The
current one is probably workable, and there are lots of important
things that need doing.


 In case I haven't made my point yet, here are a few questions I've
 had.  Can anyone point me to documentation that answers them (Javadoc
 or wiki)?  I've spent some time looking, and I haven't found it.  Most
 (but not all) I've answered for myself by reading lots of Freenet code
 -- a vastly slower process.  Some of them I believe represent bugs.

 How do I make a request keep retrying forever?

 Set maxRetries to -1

Yeah, I figured that out eventually -- by asking in IRC, I think.  But
it isn't documented.  It would take all of one sentence in the Javadoc
to make it easy for me or any other plugin author to find, and that's
a lot easier than answering the question every time someone has it.
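
For the record, the incantation appears to be roughly the following -- the
field names are recalled from reading the source and unverified, which is
exactly the sort of thing one Javadoc sentence would settle:

// Assumed API usage: HighLevelSimpleClient and FetchContext are real
// classes, but the retry field names here are recalled from the source
// and unverified.
import freenet.client.FetchContext;
import freenet.client.HighLevelSimpleClient;

public class ForeverFetch {
    public static FetchContext retryForever(HighLevelSimpleClient client) {
        FetchContext ctx = client.getFetchContext();
        ctx.maxNonSplitfileRetries = -1;   // -1 appears to mean "retry forever"
        ctx.maxSplitfileBlockRetries = -1; // same for splitfile blocks
        return ctx;
    }
}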


 How do I determine whether a ClientGetter represents an active request?

 Dunno.

https://bugs.freenetproject.org/view.php?id=3336


 Are there any circumstances under which ClientGetCallback.onSuccess()
 will be called more than once, and do I need to handle them?

 Shouldn't be, if there are it's a bug.

It happens on every fetch, afaict.
https://bugs.freenetproject.org/view.php?id=3331


 Why doesn't ClientGetCallback.onFetchable() get called (more than a
 trivial time) before onSuccess

Re: [freenet-dev] NedaNet evaluation of Iran circumvention technologies on Freenet

2009-07-27 Thread Evan Daniel
On Mon, Jul 27, 2009 at 9:18 AM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 - Easy insert of freesites. We need a good freesite insertion wizard, a 
 plugin as part of the base install. Maybe another attempt at an easy blogging 
 wizard too.
[...]
 - Better, integrated, attack resistant, easy to use, fast, chat. Negative 
 trust issues may or may not be important, depending on the deployment, likely 
 not a problem in the short term anyway. Embedded chat forums would probably 
 work well...
 - Microblogging.

(I'll reply to the rest in detail later.)

For these three things, I think the biggest thing you could do would
be documentation of what currently exists of the plugins API.
Obviously I only speak for myself, and of possible such applications
mine is the least developed, but that's what I'd find helpful.  I've
considered taking what I've learned and turning it into a piece of
example code that's very simple, but demonstrates a bit more than the
HelloWorld plugin.  For example, it would demonstrate basic plugin
loading, and a little of how to use the HLSC.  If you think that would
be useful, I can probably put such a thing together today (depends how
much real life interferes).

The second thing would be to address
https://bugs.freenetproject.org/view.php?id=3338
Right now I see latencies in testing of typically 15-40 seconds, with
30 seconds as a typical number.  I suspect it could be much better.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


[freenet-dev] RSKs (was Re: NedaNet evaluation of Iran circumvention technologies on Freenet)

2009-07-27 Thread Evan Daniel
 in a correctly-formatted manner.
- The ability to verify a binary blob and read the data it contains.
This would require also knowing the key it represents (AIUI, the blob
normally contains the routing key but not the decryption key, just
like the datastore).  If Alice intends to revoke Bob's site, she
probably wants to first check that the blob she is about to insert
will do that (inserting the revoke for the wrong site would be bad).
When she receives the revoke blob from Bob, she probably also wants to
check that the revoke message says what Bob said it did.  This
verification step needs to not actually insert the blob.
- And, obviously, the ability to actually insert a blob that the user has.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Variable opennet connections: moving forward

2009-07-28 Thread Evan Daniel
On Thu, Jun 18, 2009 at 8:50 PM, Evan Danieleva...@gmail.com wrote:
 On Thu, Jun 18, 2009 at 8:00 PM, Matthew
 Toselandt...@amphibian.dyndns.org wrote:
 Are you doing more testing?

 On Saturday 13 June 2009 19:05:36 Evan Daniel wrote:
 On Sat, Jun 13, 2009 at 1:08 PM, Matthew
 Toselandt...@amphibian.dyndns.org wrote:
  Now that 0.7.5 has shipped, we can start making disruptive changes again 
  in a few days. The number one item on freenet.uservoice.com has been for 
  some time to allow more opennet peers for fast nodes. We have discussed 
  this in the past, and the conclusions which I agree with and some others 
  do:
  - This is feasible.
  - It will not seriously break routing.
  - Reducing the number of connections on slow nodes may actually be a gain 
  in security, by increasing opportunities for coalescing. It will improve 
  payload percentages, improve average transfer rates, let slow nodes 
  accept more requests from each connection, and should improve overall 
  performance.
  - The network should be less impacted by the speed of the slower nodes.
  - But we have tested using fewer connections on slow nodes in the past 
  and had anecdotal evidence that it is slower. We need to evaluate it more 
  rigorously somehow.
  - Increasing the number of peers allowed for fast opennet nodes, within 
  reason, should not have a severe security impact. It should improve 
  routing (by a smaller network diameter). It will of course allow fast 
  nodes to contribute more to the network. We do need to be careful to 
  avoid overreliance on ubernodes (hence an upper limit of maybe 50 peers).
  - Routing security: FOAF routing allows you to capture most of the 
  traffic from a node already, the only thing stopping this is the 
  30%-to-one-peer limit.
  - Coalescing security: Increasing the number of peers without increasing 
  the bandwidth usage does increase vulnerability to traffic analysis by 
  doing less coalescing. On the other hand, this is not a problem if the 
  bandwidth usage scales with the number of nodes.
 
  How can we move forward? We need some reliable test results on whether a 
  10KB/sec node is better off with 10 peers or with 20 peers. I think it's 
  a fair assumption for faster nodes. Suggestions?

 I haven't tested at numbers that low.  At 15KiB/s, the stats page
 suggests you're slightly better off with 12-15 peers than 20.  I saw no
 subjective difference in browsing speed either way.

 I'm happy to do some testing here, if you tell me what data you want
 me to collect.  More testers would obviously be good.

 
  We also need to set some arbitrary parameters. There is an argument for 
  linearity, to avoid penalising nodes with different bandwidth levels, but 
  nodes with more peers and the same amount of bandwidth per peer are 
  likely to be favoured by opennet anyway... Non-linearity, in the sense of 
  having a lower threshold and an upper threshold and linearly adding peers
  between them but not necessarily consistently with the lower threshold, 
  would mean fewer nodes with lots of peers, and might achieve better 
  results? E.g.
 
  10 peers at 10KB/sec ... 20 peers at 20KB/sec (1 per KB/sec)
  20 peers at 20KB/sec ... 50 peers at 80KB/sec (1 per 3KB/sec)

 I wouldn't go as low as 10 peers, simply because I haven't tested it.
 Other than that, those seem perfectly sensible to me.

 We should also watch for excessive cpu usage.  If there's lots of bw
 available, we'd want to have just enough connections to not quite
 limit on available cpu power.  Of course, I don't really know how many
 connections / how much bw it is before that becomes a concern.

 Evan Daniel


 I'd been running the Spider, and trying to get a complete run out of
 it in order to provide a full set of bug reports.  Unfortunately,
 after spidering over 100k keys (representing over a week of runtime),
 the .dbs file became unrecoverably corrupted, and it won't write index
 files.  I had started rerunning it; I've since paused that and started
 taking data on connections.

 I've got a little data so far at 12KiB/s limit, 10 and 12 peers.
 Basically, I don't see a difference between 10 and 12 peers.  Both
 produce reasonable performance numbers.  My node has 2 darknet peers,
 remainder opennet.  I'm not using the node much during these tests; it
 has a few MiB of downloads queued that aren't making progress (old
 files that have probably dropped off).

 Evan Daniel


 12 peers, 12 KiB/s limit

 # bwlimitDelayTime: 91ms
 # nodeAveragePingTime: 408ms
 # darknetSizeEstimateSession: 0 nodes
 # opennetSizeEstimateSession: 63 nodes
 # nodeUptime: 1h37m

 # Connected: 10
 # Backed off: 2

 # Input Rate: 2.54 KiB/s (of 60.0 KiB/s)
 # Output Rate: 12.9 KiB/s (of 12.0 KiB/s)
 # Total Input: 31.3 MiB (5.5 KiB/s average)
 # Total Output: 47.5 MiB (8.34 KiB/s average)
 # Payload Output: 32.6 MiB (5.73 KiB/sec)(68%)

 1469    Output bandwidth liability
 18      SUB_MAX_PING_TIME

 Success rates
 Group   P(success

Re: [freenet-dev] status

2009-07-28 Thread Evan Daniel
On Tue, Jul 28, 2009 at 3:26 PM, Florent
Daignièrenextg...@freenetproject.org wrote:
 * Matthew Toseland t...@amphibian.dyndns.org [2009-07-28 20:03:42]:
 So we need to deal with the bug tracker. But it will take several days work 
 (= approximately the cost to both FPI and Ian of emu for 1.5 months). Is 
 this really a high priority? IMHO losing our existing bugs database will 
 cost significant work in the medium to long term, hence the need to migrate 
 data...

 Last time people talked about it, only you and xor were objecting to
 trashing the bug database altogether and switching to another bug
 tracker (possibly hosted by someone else).

As someone who has submitted a number of both feature requests and
bugs to the database, I would be right annoyed if they were simply
dropped without any plan to keep track of and address them.  That
said, I have no particular attachment to the current software, and no
objection to changing to something else if it would improve things.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Freenet blog plugin?

2009-07-30 Thread Evan Daniel
On Thu, Jul 30, 2009 at 1:08 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 On Thursday 30 July 2009 14:52:49 Zero3 wrote:
 Matthew Toseland skrev:
  It has been pointed out that a minimal blog engine can be written in 
  approx 22KB of php - around 800 lines of code at most. I suspect that 
  given a template I could probably put together a blog plugin in a few 
  days. This would integrate with Freetalk for comments and for announcing 
  the site initially. It should make it easier to contribute content to 
  Freenet, eliminating the need to get Thingamablog working, etc. Thoughts?

 IMHO it would be a better idea to have someone less experienced with
 Freenet development work on and maintain such (with your mentoring). A
 project like this seems like a great opportunity to get a new developer
 into working with Freenet.

 Given that such person is available, of course.

 The problem is we had at least one try in the past and it failed... and there may
 be significant benefits to having such functionality sooner rather than later.

My personal opinion is that your time would be better invested in
documenting the plugins interface.  I think it would be far easier to
find another developer willing and able to write such a plugin if the
interface was actually documented.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Serious near-practical AES attack, consequences for Freenet

2009-07-31 Thread Evan Daniel
On Fri, Jul 31, 2009 at 2:07 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 http://www.schneier.com/blog/archives/2009/07/another_new_aes.html

 Practical related-key/related-subkey attacks on AES with a 256-bit key with 
 9, 10 and 11 rounds. The official standard uses 14 rounds, so there is 
 precious little safety margin - attacks always get better.

 We use AES/256 (technically we use Rijndael with 256 bit key and 256 bit 
 block size mostly, which isn't strictly AES, although we use 128 bit block 
 size, which is, for store encryption).

 Such attacks rely on related-key weaknesses in the protocol (as in WEP, where 
 the IV was too small). In theory we shouldn't have any, although I am not 
 entirely sure how to determine this. We shouldn't have known ciphertext, 
 because we have an unforgeable authenticator on all packets, but I'm not sure 
 exactly what the definition of a related-key weakness is.

 Nonetheless, it would seem prudent to increase the number of rounds as 
 Schneier outlines (28 rounds for a 256-bit key). We have the infrastructure 
 to do this without too much trouble, with key subtypes and negotiation types. 
 Moving to AES/128 would be considerably more work.

I think it would be worth trying to get someone who is a qualified
cryptographer to look in detail at how Freenet uses cryptography.
Freenet does a *lot* of crypto, mixed together in ways that aren't
necessarily common.  It's also a very interesting project from a
cryptographic standpoint; it seems possible that someone could be
talked into doing it on a volunteer basis.  Even if it wasn't
volunteer, it might be worth seeing how much a proper review would
cost.  Cryptographic review seems appropriate for a program which
relies so strongly on the strength of its cryptography.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] please please please can we get a better screenshot?

2009-08-03 Thread Evan Daniel
On Mon, Aug 3, 2009 at 2:04 PM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 On Monday 03 August 2009 16:17:12 Ian Clarke wrote:
 Right now the most visible thing on our website is a screenshot that really
 doesn't look good.  Can someone create a screenshot that looks more
 appealing?  This could be a screenshot of JSite, or Frost, anything that
 shows Freenet doing its thing but actually looks appealing?

 IMHO a screenshot of the default theme would make more sense than a 
 screenshot of the minimalblue theme.

 We do not bundle either jSite or Frost (or Thingamablog, or FMS). 
 Thingamablog is the best tool for creating a blog, jSite is the best tool for 
 uploading existing content, both have been reviewed by core devs but 
 Thingamablog is a bit large... But maybe we should bundle them anyway? Either 
 that or replace them in the near future with web-based plugins doing the same 
 job.

How would a new user find out about such software?  It doesn't look
obvious from the front page of the site to me.  Frost and FMS have
links from the discussion tab on the node page, but jSite and
Thingamablog don't.  There's some info on the documentation page of
freenetproject.org, but that's not where I would think to look for
Freenet-related applications.

Is Thingamablog maintained?

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] F2F web proxy???

2009-08-06 Thread Evan Daniel
On Thu, Aug 6, 2009 at 8:09 AM, Matthew
Toselandt...@amphibian.dyndns.org wrote:
 I propose that as a darknet value-add, and as an additional tool for those in 
 hostile regimes who have friends on the outside, we implement a 
 web-proxy-over-your-darknet-peers option. Your Friends would announce whether 
 they are willing to proxy for you, and you could choose which friends to use, 
 or allow it to use all of them (assuming people on the inside don't offer). 
 You could then configure your browser to use Freenet as a proxy. This would 
 not provide any anonymity but it would get you past network obstacles and/or 
 out of Bad Place and into Happy Place. It's not a long term solution, but:
 - We have expended considerable effort on making darknet viable: IP 
 detection, ARKs etc.
 - It could take advantage of future transport plugins, but even before that, 
 FNP 0.7 is quite hard to block.
 - Many people are in this situation.
 - It is easy to implement. HTTP is complex but cache-less proxies can be very 
 simple.
 - It could be combined with longer term measures (growing the internal 
 darknet), and just work for as long as it works. Most likely it would be 
 throttled rather than blocked outright to start with, hopefully allowing for 
 a smooth-ish migration of users to more robust mechanisms...
 - We could allow recursive proxying to some depth - maybe friend of a friend. 
 This would provide a further incentive to grow the internal darknet, which is 
 what we want.
 - The classic problem with proxies is that they are rare so hundreds of 
 people connect to them, and the government finds out and blocks them. This 
 does not apply here.

I like it.  Darknet features are a very good thing.  This probably
also needs some care wrt bandwidth management (related to 3334 --
similar considerations probably apply).

However, as I mentioned on IRC, there are several things I think
should be higher priority.  Of course, I'm not the one implementing
any of this, but here's my opinion anyway ;)  In no particular order:

- Documentation!  Both the plugins api and making sure that the FCP
docs on the wiki are current and correct.
- Bloom filter sharing.  (Probably? I have no idea what the relative
work required is for these two.)
- Freetalk and a blogging app of some sort (though these are probably
mostly for someone other than toad?).
- A few specific bugs: 3295 (percent encoding is horribly,
embarrassingly broken -- in at least 5 different ways), 2931 (split
blocks evenly between splitfile segments -- should help dramatically
with availability), fixing top block healing on splitfiles (discussed
in 3358).
- Low-latency inserts flag as per 3338.  (I know, most people probably
don't care all that much, but I'd really like to see whether Freenet
can hit near-real-time latencies for the messaging app I'm working
on.)

Also, it's worth considering other ways to make darknet connections
more useful (in addition to this, whether before or after I don't have
a strong opinion on).  Enabling direct transfer of large files would
be good (at a bare minimum, this shouldn't fail silently like it does
right now).  Improving messaging would be good; I should be able to
see recently sent / received messages (including timestamps), queue a
message to be sent when a peer comes online, and tell whether a
message I've sent arrived successfully.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] How to start a new translation?

2009-08-06 Thread Evan Daniel
On Thu, Aug 6, 2009 at 1:20 PM, Alex Pyattaevalex.pyatt...@gmail.com wrote:
 Hi there!
 I'd like to make a Russian translation for the Freenet program (it seems it will
 be very popular here some time soon).
 I just want to know how to organize my work so that it is not done in vain.
 Thanks.

Glad to hear it!  It should be fairly easy.  First, go to
configuration -> core settings, and set your language to 'unlisted'.
Then, go to configuration -> translation and start translating.  When
you're done (some or all of the strings; you don't have to do it all
at once), you can download a translation file from the translations
page and email it to this list.

I'm not certain how you specify the language name / country code in
this process (I haven't actually done a translation myself).  Just be
sure to say what language it is in your email.

Evan Daniel
___
Devl mailing list
Devl@freenetproject.org
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

