[freenet-dev] How to fix UID exploits properly?

2012-09-04 Thread Evan Daniel
her known opennet attacks, or combining it ... However, see Ian's solution.
>
> Opennet/Announcement:
> ===
>
> Section IIIA:
> - Perverse effects of the current replacement heuristics? The dominant 
> heuristic is that any of the neighbour categories has served at least 10 
> requests... This means if you do requests near the node's location you can 
> get your node accepted. Can we improve on this? I do think the principle of 
> limiting by the number of requests makes sense - we don't want the nodes to 
> get dropped immediately after their 5 minute grace period expires, by the 
> first request that happens and tries to path fold, or by announcements? One 
> thing we could do would be make sure the inter-accept interval is more than 
> the average request time?
>
> We could cripple announcement, to make it a bit slower to reach targeted 
> nodes. However this would reduce first time performance. Freenet 0.4/5 used a 
> collaborative random number generation algorithm and hoped the node would 
> migrate to where it should be, I don't think it worked well. We could reduce 
> the precision of announcement a bit perhaps, as a partial solution, but again 
> it would reduce first time performance *and* it would reduce performance on a 
> slashdot.
>
> Paper's solution
> ==
>
> The paper suggests we get rid of the various different failure modes. 
> Unfortunately we need most of them:
> - RejectedOverload: We need this to be distinct for load management.
> - RouteNotFound: This is non-fatal, and reduces the number of hops.
> - DataNotFound: This is fatal, and terminates the request.
>
> Combining RejectedLoop with RNF might be possible. I'm not sure whether or 
> not there would be enough information to figure out when it's a loop and when 
> it's a genuine RNF; although the attacker may know your peers, lots of things 
> can affect routing, e.g. per-node failure tables.
>
> We definitely need to reduce the number of distinct failure messages.
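A rough sketch of how those three modes differ on the requesting side (a
hypothetical handler, not fred's actual code; the HTL field is just for
illustration):

    enum RequestFailure { REJECTED_OVERLOAD, ROUTE_NOT_FOUND, DATA_NOT_FOUND }

    class RequestFailureHandling {
        int htl = 18;  // hypothetical hops-to-live budget for this request

        /** Returns true if the request should keep trying other routes. */
        boolean handle(RequestFailure f) {
            switch (f) {
                case REJECTED_OVERLOAD:
                    return true;   // distinct signal: feeds load management, retry elsewhere
                case ROUTE_NOT_FOUND:
                    htl--;         // non-fatal, but the request has fewer hops left
                    return htl > 0;
                case DATA_NOT_FOUND:
                default:
                    return false;  // fatal: the request terminates here
            }
        }
    }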
>
> Purely random termination (no HTL) might allow for major simplifications, as 
> well as reducing the information on the requests, but it would greatly 
> increase the variability of request completion times (making timeouts more 
> difficult), and might undermine some of the more important optimisations such 
> as per-node failure tables. (I'm not sure whether that's absolutely certain, 
> as it acts only after multiple requests...)
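As a rough illustration of the variability point: if each hop continues with
probability p instead of decrementing an HTL, the path length is geometric, so
its spread is comparable to its mean (toy numbers, not fred code):

    public class RandomTermination {
        public static void main(String[] args) {
            double p = 0.9;                               // assumed per-hop continue probability
            double meanHops = 1.0 / (1.0 - p);            // 10 hops on average
            double stddevHops = Math.sqrt(p) / (1.0 - p); // ~9.5 hops of spread
            System.out.printf("mean=%.1f hops, stddev=%.1f hops%n", meanHops, stddevHops);
            // A fixed HTL of 10 always stops after exactly 10 forwards, which is
            // why picking timeouts is much easier in that case.
        }
    }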
>
> Ian's solution
> 
>
> Get rid of RejectedLoop. Always accept, never route to the same peer as we've 
> already routed that UID to, and RNF if we can't find any more nodes to route 
> to.
>
> I am worried about what this could do to routing. I don't think we should 
> implement it without some theoretical/simulation analysis? I can see that it 
> might improve things, but we need more than that given it could be fairly 
> significant.
>
> However it is the most comprehensive way to get rid of these problems, and 
> might have the least performance impact.
>

I like this solution. It was my immediate reaction to the problem description.

It will make local minimums harder to escape. Basically, you prevent
duplicating an edge along a route, rather than a node. That's a much
less powerful approach to avoiding minimums. I suspect FOAF routing
helps a lot here, but that seems like it might be problematic from a
security perspective as well.

In general, making routing better (link length distribution, mainly)
will make this less of an issue; local minimums are a problem that
results when you have too few short links, which is the current
problem with the network.
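To make the edge-versus-node point concrete, here's a minimal sketch of the
rule Ian describes (class and method names are mine, not fred's): always
accept, never forward the same UID over the same edge twice, and return RNF
once every peer has been tried.

    import java.util.*;

    class UidRouter {
        // uid -> peers this node has already forwarded that uid to
        private final Map<Long, Set<String>> forwarded = new HashMap<>();

        /** Returns the next peer to try, or null meaning RouteNotFound. */
        String routeNext(long uid, double target, Map<String, Double> peerLocations) {
            Set<String> used = forwarded.computeIfAbsent(uid, k -> new HashSet<>());
            String best = null;
            double bestDist = Double.MAX_VALUE;
            for (Map.Entry<String, Double> e : peerLocations.entrySet()) {
                if (used.contains(e.getKey())) continue;  // never re-use an edge for this UID
                double d = circularDistance(e.getValue(), target);
                if (d < bestDist) { bestDist = d; best = e.getKey(); }
            }
            if (best != null) used.add(best);
            return best;  // null => no unused peers left => RNF
        }

        static double circularDistance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);  // locations live on a [0,1) circle
        }
    }

The current RejectedLoop behaviour effectively forbids a node appearing twice
on a route; the sketch only forbids re-using an edge, which is the weaker
constraint, hence the local-minimum concern above.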

> DARKNET
> ==
>
> Best solution is to make darknet easy!
>
> We also need to fix darknet. That means fixing Pitch Black.

Among other problems. Location stability interactions with datastore,
and opennet/darknet hybrid nodes, in particular.

It also means we need to focus on the user experience when setting up
darknet, which currently sucks.

Evan Daniel




[freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2012-08-02 Thread Evan Daniel
On Thu, Aug 2, 2012 at 7:18 AM, Matthew Toseland
 wrote:
> On Thursday 01 Sep 2011 15:29:35 Evan Daniel wrote:
>> On Thu, Sep 1, 2011 at 7:24 AM, Matthew Toseland
>>  wrote:
>> >> I like this proposal :)
>> >>
>> >> Is the documentation on the math of how to get the random routing to
>> >> behave well sufficient? Let me know if it isn't. The MHMC routing math
>> >> shouldn't be too complicated, but we want to be certain it's
>> >> implemented correctly so that the data is sound.
>> >
>> > Do you have a metric for how clustered vs uniform a node's peers are?
>>
>> Maybe. It's a tougher problem than it looks at first glance, unless we
>> have a somewhat reliable network size estimate available. I'll give it
>> some more thought.
>>
>> If you want a qualitative, visual estimate, just change the peer
>> location distribution graph on the stats page to have a logarithmic x
>> axis. Basically, a node should have similar numbers of peers at
>> distance 0.25 < d <= 0.5, and at 0.125 < d <= 0.25, etc. That is, bar
>> n should graph the number of nodes at distance 2^(-n-2) < d <
>> 2^(-n-1). That doesn't provide an answer you can use in code to
>> evaluate and make decisions, but it should give better anecdotal
>> evidence about problems.
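Something like this would compute those bins (a sketch of the idea, not the
stats-page code; bin n counts peers with 2^(-n-2) < d <= 2^(-n-1)):

    import java.util.Arrays;

    public class PeerDistanceHistogram {
        static double circularDistance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);  // max possible distance is 0.5
        }

        static int[] histogram(double myLocation, double[] peerLocations, int bins) {
            int[] counts = new int[bins];
            for (double loc : peerLocations) {
                double d = circularDistance(myLocation, loc);
                if (d <= 0) continue;     // skip exact co-location
                // bin 0: 0.25 < d <= 0.5, bin 1: 0.125 < d <= 0.25, ...
                int n = (int) Math.floor(-(Math.log(d) / Math.log(2)) - 1.0);
                if (n >= 0 && n < bins) counts[n]++;
            }
            return counts;
        }

        public static void main(String[] args) {
            double[] peers = {0.02, 0.10, 0.26, 0.49, 0.51, 0.95};
            System.out.println(Arrays.toString(histogram(0.0, peers, 8)));
        }
    }

On a well-routed node the counts should come out roughly equal across bins.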
>
> So you/operhiem1 plan to do something with this now we have probe requests?

Yes, though probably not immediately.

>>
>> > What's MHMC?
>>
>> Metropolis-Hastings Monte Carlo. We currently use it to get the right
>> probability distribution for location swaps.
>
> Do we? I thought the walks were purely random, and the probability of 
> swapping was based solely on a comparison between the distances before and 
> after?

Absolutely. It's the mathematical technique used to prove that both
the path folding algorithm of opennet and the location swapping
algorithm of darknet will produce the desired results. The fact that
the formulas used in those proofs don't appear in the Freenet source
code is irrelevant, except that it contributes to Freenet code being
difficult to understand.
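For reference, the before/after comparison amounts to a Metropolis-Hastings
acceptance rule, roughly like this sketch (names and structure are mine, not
the actual swapping code):

    import java.util.Random;

    public class SwapDecision {
        static double circularDistance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);
        }

        static double linkProduct(double loc, double[] neighbours) {
            double p = 1.0;
            for (double n : neighbours) p *= circularDistance(loc, n);
            return p;
        }

        /** Decide whether nodes A and B should exchange locations. */
        static boolean shouldSwap(double locA, double[] neighboursA,
                                  double locB, double[] neighboursB, Random rng) {
            double before = linkProduct(locA, neighboursA) * linkProduct(locB, neighboursB);
            double after  = linkProduct(locB, neighboursA) * linkProduct(locA, neighboursB);
            // Always swap if it shortens links overall; otherwise swap with
            // probability before/after. That ratio is the Metropolis-Hastings
            // step that gives the desired stationary distribution.
            return after <= before || rng.nextDouble() < before / after;
        }
    }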

Evan Daniel




[freenet-dev] Java version issues: ideas how to fix this gordian knot?

2012-07-30 Thread Evan Daniel
On Mon, Jul 30, 2012 at 8:15 AM, xor  wrote:
>> We should probably formally move to OpenJDK, but that seems to have
>> bug/stability problems too for some users?
>
> My node has been running on OpenJDK for years - without any issues.

Ditto.

Well, it might be more like "year" than "years" in my case. I don't
really remember. The changeover and operation since was unmemorable.

I run a higher memory limit than some will, which might be relevant.

I'm generally in favor of such a change.

Evan Daniel



[freenet-dev] Statistics Project Update #1

2012-05-09 Thread Evan Daniel
On Tue, May 1, 2012 at 7:48 AM, Zlatin Balevsky  wrote:
>> On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:
>>> In Gnutella we observed that long-lived nodes tend to be better
>>> connected and that they also cluster with other high-uptime nodes.
>>> If the same is true for Freenet it's a good idea to keep an eye for
>>> side effects as you tweak the behavior.
>>
>> Good to know - I'll look for that. Are there any particular effects
>> you had in mind? The Metropolis-Hastings correction in the new probes
>> should produce a fairly uniform distribution of endpoints despite
>> clustering and well-connected nodes, but explicitly simulating the
>> effects of high uptime could be helpful.
>
> There was a study that higher uptime correlated with the probability
> of further uptime so if you shift bias towards low-uptime nodes you
> could end up with lower overall reliability. It was done on a different
> network with different usage patterns but imho you should definitely
> treat node uptime as a parameter in any simulations.

MH should produce a good simple random sample from all nodes currently
online, provided that the walk is of sufficient length, regardless of
clustering effects. If there are partitioning effects, those will make
the required walk length to get good dispersion longer, in a way that
might be somewhat difficult to measure, but as long as the network is
not completely partitioned, a sufficient walk length will produce a
good sample. The fact that a large sample must be taken over an
extended period means that low-uptime nodes will have a somewhat
disproportionately lower chance of being in the sample (I think...
need to do the math here), but that isn't a huge problem.
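The degree correction that makes the walk's endpoint (approximately) uniform
is small; roughly (illustration only, not the probe implementation):

    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    public class MhWalkStep {
        /** One Metropolis-Hastings step: propose a random neighbour, accept with
         *  probability min(1, degree(current)/degree(candidate)); on reject, stay put.
         *  This removes the bias towards well-connected nodes that a plain random
         *  walk would have. Assumes every node has at least one neighbour. */
        static <N> N step(N current, Map<N, List<N>> graph, Random rng) {
            List<N> nbrs = graph.get(current);
            N candidate = nbrs.get(rng.nextInt(nbrs.size()));
            double accept = Math.min(1.0, (double) nbrs.size() / graph.get(candidate).size());
            return rng.nextDouble() < accept ? candidate : current;
        }
    }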

Evan Daniel





[freenet-dev] Coding standards

2012-04-02 Thread Evan Daniel
I'm in favor, and I suspect I'm the author of that line :)

I also don't actually care that much, so feel free to change it to 4.

On Apr 2, 2012 6:07 PM, "Matthew Toseland" 
wrote:

On Friday 30 Mar 2012 19:34:02 Juiceman wrote:
> On Mar 30, 2012 1:12 PM, "Matthew Toseland" <toad@a...> wrote:

Maybe the wiki is wrong? :) All in favour of 8 space tabs?




[freenet-dev] Logging subsystem rewrite

2012-03-28 Thread Evan Daniel
On Tue, Mar 27, 2012 at 2:35 PM, Marco Schulze
 wrote:
> On 27-03-2012 12:51, Martin Nyhus wrote:
>> I won't say much about the code since you say you aren't finished, but
>> please
>> follow the code style of the rest of the code base.
>
> Apart from the lack of braces, what violates the coding standards? I mean,
> compared to the rest of fred code, I use too many blank lines, 80 character
> lines, variables are declared at the top of the function and other minor
> details. I hope that that isn't a problem.

http://new-wiki.freenetproject.org/Coding_standards is what we would
like to have for new code.

Evan Daniel





[freenet-dev] Stuff to do next

2012-03-24 Thread Evan Daniel
On Sat, Mar 24, 2012 at 7:17 PM, Matthew Toseland
 wrote:
> I am de-emphasising a release because frankly Freenet isn't ready, and even 
> if it was, until the exams I'm only going to have one or two days a week 
> available, and I need to be around during a major release. That's just what 
> I'm thinking at the moment, not meaning to undermine Ian or anything.
>
> I'm also assuming Bombe handles most of the day to day build management, and 
> I hope that people will flag up important bugs to me - I may not be able to 
> keep up with FMS regularly for example. So how can I make the maximum 
> possible impact the most quickly in the limited time available?
>
> I'm assuming Bombe will release 1406 soon. I'm absolutely delighted if 
> somebody else takes on any of these tasks, but please let me know when you do!
>
> Here are some bug-someone issues that are mainly somebody else's area but 
> where I will be bugging people or might be able to help:
> - Freenet: Build 1406 (Bombe)
> - Freetalk: Auto-backups and turn off fsync on commit. (Massive improvement) 
> (p0s)
> - Freetalk: Only GC after receiving a full identity list. (Massive 
> improvement) (p0s)
>
> The key tasks (for me) are:
>
> Near term:
> - Get the SSL certificate replaced.
> - Update various plugins.
> - Tweak the peer scaling constant. (1407, soon after 1406).
> - Wizard UI tweaks: See devl post "fewer pages"
> - Review Sone and deploy as an unofficial plugin.
> - Get default manifest putter working for transient + make default. (Some 
> bugs e.g. on tracker/FMS, talk to Bombe)
>
> Bigger/harder/less urgent:
> - Implement auto-backup and turn off sync-on-commit in Freenet.
> - Get default manifest putter working for persistent + make default. (Need to 
> de-leak it)
> - Random routed probe requests, including is-this-reachable.
> - Basic implementation of "keep this available".
> - Build/tweak a standard test for insert-success-by-number-of-inserts.
> - Help digger3 to gather more data.
> - Try some of the proposed post-NLM tweaks.
> -- E.g. Fully separate CHK vs SSK load management.
> - Chase insert-success-by-number-of-inserts empirically.
> - Implement invites and related stuff (good for darknet but also for viral 
> spread)
> - Merge the ogg filters.
> - Make it easy to change the revocation/update keys.
>

I'd add one to that list. I think Steve Dougherty and I are making
good progress on the problems with the current link topology. Assuming
we come up with concrete results, tweaks to that may be very
important.

Monitoring the results from this might require the random routed probe
requests first, or there might be other ways.

Evan





[freenet-dev] Coding standards

2012-03-21 Thread Evan Daniel
And I see a checkstyle patch submitted. Is there an Eclipse patch?
(Speaking as someone who doesn't use Eclipse...)

Seems to me like both are worthy additions. Along with any other
similar tools, as long as they're maintained.

Evan Daniel

On Wed, Mar 21, 2012 at 8:33 PM, Ian Clarke  wrote:
> I know, and?
>
> On Mar 21, 2012 4:21 PM, "Marco Schulze"  wrote:
>>
>> Not everyone uses Eclipse.
>>
>> On 21-03-2012 13:45, Ian Clarke wrote:
>>
>> Or to commit the relevant Eclipse project to enforce these standards as a
>> "save action".
>>
>> Ian.
>>
>> On Tue, Mar 20, 2012 at 6:07 AM, Nicolas Hernandez
>>  wrote:
>>>
>>> Could it be possible to have a checkstyle file?
>>>
>>> - Nicolas Hernandez
>>> a-n - aleph-networks
>>> associé
>>> http://www.aleph-networks.com
>>>
>>>
>>>
>>>
>>> On Tue, Mar 20, 2012 at 11:47 AM, Matthew Toseland
>>>  wrote:
>>>>
>>>> On Monday 19 Mar 2012 23:12:15 Steve Dougherty wrote:
>>>> > I'm all for it. The coding standard is rather clear on indenting with
>>>> > tabs, so I guess all that would be required is a run with a
>>>> > re-indenting/code style conformance tool. That's something for a
>>>> > janitor tree, and would ideally be timed between releases and when all
>>>> > known pull requests have been merged or rejected to minimize
>>>> > whitespace-related disruption to existing work.
>>>>
>>>> Gigantic third party patches should come with some means to verify them.
>>>>
>>>> For example, converting all the spaces to tabs in a single commit is
>>>> fine because then you can just do diff -uw.
>>>>
>>>> However, automated bulk indenting doesn't always make things easier to
>>>> read - e.g. devs may not like the style it produces.
>>>> >
>>>> > On 03/19/2012 06:13 PM, Marco Schulze wrote:
>>>> > > May I add a vote to standardise indentation? This mess of spaces
>>>> > > with tabs really bugs me.
>>>>
>>
>>
>>
>>
>> --
>> Ian Clarke
>> Personal blog: http://blog.locut.us/
>>
>>
>
>



[freenet-dev] Coding standards

2012-03-19 Thread Evan Daniel
Exactly. We already had this discussion and came to an agreement. New
code should follow it. Patches to fix old code would be welcome :)

Evan

On Mon, Mar 19, 2012 at 7:12 PM, Steve Dougherty  wrote:
>
> I'm all for it. The coding standard is rather clear on indenting with
> tabs, so I guess all that would be required is a run with a
> re-indenting/code style conformance tool. That's something for a
> janitor tree, and would ideally be timed between releases and when all
> known pull requests have been merged or rejected to minimize
> whitespace-related disruption to existing work.
>
> On 03/19/2012 06:13 PM, Marco Schulze wrote:
>> May I add a vote to standardise indentation? This mess of spaces
>> with tabs really bugs me.
>>
>> On 19-03-2012 19:06, Matthew Toseland wrote:
>>> On Monday 19 Mar 2012 07:42:00 David 'Bombe' Roden wrote:
 On 18.03.2012, at 19:37, Steve Dougherty wrote:

> Is this what you're looking for?
>
> http://new-wiki.freenetproject.org/Coding_standards
 In light of 3ef15c7701d666f7661cd9b58b41ae525ef32569, does toad
 know about these?
>>> Fair point. :|
>>>
>>> if(blah) do_blah();
>>>
>>> Is too easy to screw up. I guess we should standardise on always
>>> using {} if it's multi-line. They're not necessary if it's single
>>> line though.
>>>
>>>
>>
>>
>>





[freenet-dev] (no subject)

2012-03-18 Thread Evan Daniel
On Sat, Mar 17, 2012 at 10:29 PM, Leah Hicks  wrote:
> Sorry about that last message; I must have hit a keyboard shortcut for the
> send button in gmail. Here's the correct message:
>
> Hello,
>
> I'm a freelance web designer and I'm interested in redesigning Freenet's
> home page. It looks kind of outdated compared to most modern sites, not to
> mention it's using the rather outdated HTML 4 Transitional doctype.
> Although not exactly supported yet by some browsers, I assume that most of
> your userbase uses Chrome, Firefox, or Opera, which have support for all the
> new HTML 5 tags and CSS3 selectors. Mainly I just want to tidy the site up.
>
> Here's my portfolio: http://kori-designs.com
>
> My primary suggestions are:
>
> - Update the site to HTML5 and CSS3
> - Change the top menu to be less cramped, possibly move certain items to a
> sidebar
> - Make new buttons for Download / Donate
> - General restyling of the site to look cleaner and easier to read.
>
> Any suggestions or feedback would be greatly appreciated, and again sorry
> for the last message! ^^

I like this idea.

If we update to HTML5/CSS3, will older browsers manage ok? I don't
have a problem with outdated insecure browsers producing a lower
quality result, but I'd like to be certain they won't break
completely.

In general, usability and design improvements are something we would
very much like help with.

Evan Daniel





[freenet-dev] Tweaking the peer limit scaling constant

2012-03-15 Thread Evan Daniel
On Thu, Mar 15, 2012 at 6:08 PM, Matthew Toseland
 wrote:
> On Thursday 15 Mar 2012 20:24:10 Evan Daniel wrote:
>> On Thu, Mar 15, 2012 at 10:22 AM, Matthew Toseland
>>  wrote:
>> > commit cb21f7d4bc5e1940b6acdb49004b912f8fb705e5
>> > Author: Matthew Toseland 
>> > Date: Thu Mar 15 14:18:10 2012 +
>> >
>> >    Tweak peers limit by bandwidth SCALING_CONSTANT: Lots of peers are on
>> > the high end of the peers limit. Reduce the multiplier, so it takes more 
>> > bandwidth to have 40 peers; this should improve the average bandwidth per 
>> > connection.
>> >
>> > This is actually from evanbd:
>> >
>> > [17:11:04] <toad_> evanbd: what was your new proposed formula for
>> > bandwidth/peers limit?
>> > [17:11:19] <toad_> evanbd: we should try that out in 1406 if there aren't
>> > any other major changes, and in 1407 if there are
>> > [17:11:43] <evanbd> Currently, it's peers = sqrt(12*limit), with limit in
>> > KiB/s.
>> > [17:11:47] <toad_> ok
>> > [17:11:54] <evanbd> I'm proposing changing the 12 to 6 or so.
>> > [17:12:07] <toad_> ahhh
>> > [17:12:26] <toad_> meaning high bandwidth nodes have the same number of
>> > peers but middle ones have less => higher bandwidth per peer
>> > [17:12:34] <evanbd> Exactly.
>> > [17:13:03] <toad_> okay, we should do that soon
>> >
>> > This should be in 1406 or 1407.
>>
>> Oops, had that in my branch, looks like I forgot to do a pull request.
>> Anyway, thanks!
>
> This should be merged soon after 1406 IMHO. We want to have one network level 
> change per build, and there are some minor things in 1406 (deadlock fixes and 
> the location thingy).

I like that plan.
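For a rough sense of what halving the constant does (toy numbers; the 40-peer
ceiling is the cap mentioned in the commit message above):

    public class PeerScaling {
        static int peerLimit(double limitKiBps, double k, int maxPeers) {
            // peers = sqrt(k * limit), limit in KiB/s, capped at maxPeers
            return (int) Math.min(maxPeers, Math.sqrt(k * limitKiBps));
        }

        public static void main(String[] args) {
            for (double limit : new double[] {16, 32, 64, 133, 266}) {
                System.out.printf("%6.0f KiB/s: k=12 -> %2d peers, k=6 -> %2d peers%n",
                        limit, peerLimit(limit, 12, 40), peerLimit(limit, 6, 40));
            }
        }
    }

High-bandwidth nodes stay pinned at the cap either way; mid-range nodes lose
peers, so each remaining connection gets more bandwidth.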

Evan



[freenet-dev] Tweaking the peer limit scaling constant

2012-03-15 Thread Evan Daniel
On Thu, Mar 15, 2012 at 10:22 AM, Matthew Toseland
 wrote:
> commit cb21f7d4bc5e1940b6acdb49004b912f8fb705e5
> Author: Matthew Toseland 
> Date: Thu Mar 15 14:18:10 2012 +
>
>    Tweak peers limit by bandwidth SCALING_CONSTANT: Lots of peers are on the
> high end of the peers limit. Reduce the multiplier, so it takes more 
> bandwidth to have 40 peers; this should improve the average bandwidth per 
> connection.
>
> This is actually from evanbd:
>
> [17:11:04] <toad_> evanbd: what was your new proposed formula for
> bandwidth/peers limit?
> [17:11:19] <toad_> evanbd: we should try that out in 1406 if there aren't any
> other major changes, and in 1407 if there are
> [17:11:43] <evanbd> Currently, it's peers = sqrt(12*limit), with limit in
> KiB/s.
> [17:11:47] <toad_> ok
> [17:11:54] <evanbd> I'm proposing changing the 12 to 6 or so.
> [17:12:07] <toad_> ahhh
> [17:12:26] <toad_> meaning high bandwidth nodes have the same number of peers
> but middle ones have less => higher bandwidth per peer
> [17:12:34] <evanbd> Exactly.
> [17:13:03] <toad_> okay, we should do that soon
>
> This should be in 1406 or 1407.
>

Oops, had that in my branch, looks like I forgot to do a pull request.
Anyway, thanks!

Evan





[freenet-dev] Should we switch the websites to httpS only?

2012-03-09 Thread Evan Daniel
On Fri, Mar 9, 2012 at 4:21 PM, Florent Daigniere
 wrote:
> Hi,
>
> I've been doing some sysadmin tonight:
>        - re-enabled ipv6 on all services
>        - updated the DNS records (SPF, ...)
>        - deployed a valid certificate on postfix
>
> Let me know if I broke something.
>
> I was wondering, do we have any good reason not to switch the various 
> websites to HTTPS only? (with a 301 redirect on HTTP)

Awesome, thanks!

I'm in favor of https only. The only real argument against it is
probably server CPU load. I assume that given our traffic levels,
that's not likely to be an issue?

Evan Daniel





[freenet-dev] GSOC 2012 Deadline Approaching

2012-03-04 Thread Evan Daniel
On Sun, Mar 4, 2012 at 11:31 AM, Matthew Toseland
 wrote:
> On Sunday 04 Mar 2012 16:09:41 Evan Daniel wrote:
>> On Sun, Mar 4, 2012 at 11:06 AM, Matthew Toseland
>>  wrote:
>> > On Friday 02 Mar 2012 07:36:22 Steve Dougherty wrote:
>> >> Matthew has said he's unable to be the administrator of the project for
>> >> this year's Google Summer of Code, and that in order to apply we'd need a
>> >> volunteer administrator in addition to mentors. Is anyone willing to
>> >> volunteer?
>> >
>> > Well, basically the problem is admin ends up being backup mentor if the 
>> > real mentor disappears. If anyone has a solution to this then we could do 
>> > it.
>>
>> Well, we could designate a backup mentor, assuming we have a volunteer
>> for that. I'm not sure that makes things any easier though :/
>
> Exactly. What happens is the mentor disappears, the admin panics, and 
> somebody (likely the admin to avoid further chasing) has to go over the 
> available evidence (mostly code) in a very short period. I don't have time 
> for that sort of madness at the moment.

I mean, designate one in advance, so that the searching can happen in
an orderly manner without need for panic :)

But, I suspect that's approximately as hard as finding a volunteer for
the admin role :/

Evan



[freenet-dev] GSOC 2012 Deadline Approaching

2012-03-04 Thread Evan Daniel
On Sun, Mar 4, 2012 at 11:06 AM, Matthew Toseland
 wrote:
> On Friday 02 Mar 2012 07:36:22 Steve Dougherty wrote:
>> Matthew has said he's unable to be the administrator of the project for
>> this year's Google Summer of Code, and that in order to apply we'd need a
>> volunteer administrator in addition to mentors. Is anyone willing to
>> volunteer?
>
> Well, basically the problem is admin ends up being backup mentor if the real 
> mentor disappears. If anyone has a solution to this then we could do it.

Well, we could designate a backup mentor, assuming we have a volunteer
for that. I'm not sure that makes things any easier though :/

Evan Daniel





[freenet-dev] Gun.io

2012-02-12 Thread Evan Daniel
On Sun, Feb 12, 2012 at 4:02 PM, Juiceman  wrote:
>> OK, I created a pair of simple Freenet gigs. If it works, I might post
>> more.
>> http://gun.io/open/30/freenet-node-diagnostics-page
>> http://gun.io/open/31/fix-freenet-network-usage-stats
>>
>> To answer my own questions:
>>
>> I don't see any way to search. Anyone can post a gig for any project.
>> Anyone can add more money to an existing gig.
>>
>> All in all, it looks simple and effective. Neat.
>
>
> I added a little money to one of your projects. It all adds up, right? :)
>
> Should we put a link to gun.io freenet projects from the main freenet
> website, perhaps under the "Donate" tab, and call it "Post a bounty!"?

Awesome, thanks!

And yes, I like that idea a lot! I bet we have some users willing to
vote with their wallets on our bug reports ;)

Evan Daniel





[freenet-dev] Gun.io

2012-02-11 Thread Evan Daniel
On Mon, Jan 16, 2012 at 6:20 PM, Evan Daniel  wrote:
> On Mon, Jan 16, 2012 at 11:46 AM, Ian Clarke  
> wrote:
>> On Sun, Jan 15, 2012 at 9:17 PM, Evan Daniel  wrote:
>>>
>>> Dumb question time: how would I go about searching for hypothetical
>>> freenet tasks on gun.io?
>>
>>
>> Right now there are only 10 open source gigs listed: http://gun.io/open/
>>
>> I think we'd probably need to attract attention to any gigs we list, perhaps
>> with an appropriate post on reddit.com/r/programming.
>>
>>>
>>> Can anyone post a job for Freenet, or only project admins? How is
>>> payment handled?
>>
>>
>> Perhaps Rich can answer this or provide a pointer to an explanation.
>>
>>>
>>> Also, if it works, I think this would be an excellent use of FPI money.
>>
>>
>> I agree, especially since Matthew isn't burning too many hours on Freenet
>> these days.
>
> I haven't actually tried posting a gig, but I'd be happy to set up a
> few issues as tests, including putting up a little cash for bounties.
> I'll do that tonight or tomorrow depending on time.
>
> Is there a way for additional people to add money to an open source
> gig? Eg, if I set up an issue with a $30 bounty, can someone else add
> another $30 to the same gig easily?
>
> Evan

OK, I created a pair of simple Freenet gigs. If it works, I might post more.
http://gun.io/open/30/freenet-node-diagnostics-page
http://gun.io/open/31/fix-freenet-network-usage-stats

To answer my own questions:

I don't see any way to search. Anyone can post a gig for any project.
Anyone can add more money to an existing gig.

All in all, it looks simple and effective. Neat.

Evan Daniel





[freenet-dev] Gun.io

2012-01-16 Thread Evan Daniel
On Mon, Jan 16, 2012 at 11:46 AM, Ian Clarke  wrote:
> On Sun, Jan 15, 2012 at 9:17 PM, Evan Daniel  wrote:
>>
>> Dumb question time: how would I go about searching for hypothetical
>> freenet tasks on gun.io?
>
>
> Right now there are only 10 open source gigs listed: http://gun.io/open/
>
> I think we'd probably need to attract attention to any gigs we list, perhaps
> with an appropriate post on reddit.com/r/programming.
>
>>
>> Can anyone post a job for Freenet, or only project admins? How is
>> payment handled?
>
>
> Perhaps Rich can answer this or provide a pointer to an explanation.
>
>>
>> Also, if it works, I think this would be an excellent use of FPI money.
>
>
> I agree, especially since Matthew isn't burning too many hours on Freenet
> these days.

I haven't actually tried posting a gig, but I'd be happy to set up a
few issues as tests, including putting up a little cash for bounties.
I'll do that tonight or tomorrow depending on time.

Is there a way for additional people to add money to an open source
gig? Eg, if I set up an issue with a $30 bounty, can someone else add
another $30 to the same gig easily?

Evan



[freenet-dev] Fwd: Call for Papers: IEEE P2P 2012

2012-01-16 Thread Evan Daniel
On Sun, Jan 15, 2012 at 10:39 PM, Michael Grube  
wrote:
> On Sun, Jan 15, 2012 at 10:09 PM, Evan Daniel  wrote:
>>
>> Submitting a response to the Pitch Black paper seems a bit premature,
>> given that in the real world we probably have network distribution
>> problems even without an active adversary.
>
>
> Could you be more specific? Are you talking about the clustering that occurs
> on its own?
>
> The paper I am writing simulates proposed solutions to the pitch black
> attack and measures their effectiveness. Assuming I can get that done in
> short order, I will begin looking for better approaches if they can be
> improved upon.

It's been too long since I looked at this to give as complete an
answer as I'd like. I can dive back into it at some point if there's
interest.

One big thing: there's anecdotal evidence that (opennet) link length
distributions show very different patterns for nodes doing a lot of
downloading than for nodes that aren't. Specifically, nodes doing
downloads have a more uniform distribution, not a 1/d distribution. I
don't think there's been any systematic investigation of whether this
occurs, or of how big a routing problem the resultant networks have
in simulation. If someone is seriously interested in this, I think my
periodic network stats scripts probably have enough information to
tackle it. Send me an email and I'll get you raw data to play with. (I
don't realistically have time anytime soon.)
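For illustration, an ideal 1/d link length distribution is just log-uniform
in distance; this toy sampler shows it next to a uniform one (numbers are
made up for illustration, not measurements):

    import java.util.Random;

    public class LinkLengthSampler {
        public static void main(String[] args) {
            Random rng = new Random(42);
            double dmin = 1e-4, dmax = 0.5;   // assumed minimum/maximum link distance
            for (int i = 0; i < 5; i++) {
                double u = rng.nextDouble();
                // Inverse-CDF sampling: density proportional to 1/d gives
                // d = dmin * (dmax/dmin)^u, i.e. log-uniform link lengths.
                double kleinberg = dmin * Math.pow(dmax / dmin, u);
                double uniform = dmin + u * (dmax - dmin);
                System.out.printf("1/d sample: %.5f   uniform sample: %.5f%n", kleinberg, uniform);
            }
        }
    }

A node dominated by uniform-length links has far fewer short links than the
1/d ideal, which is the pattern described above.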

(Yes, I realize that's an opennet problem and Pitch Black is a darknet
problem. My response is that the opennet problems should be a lot
easier to investigate, given that we have a large live network, and we
haven't even bothered with that much.)

Evan



[freenet-dev] Gun.io

2012-01-15 Thread Evan Daniel
On Thu, Jan 5, 2012 at 8:11 PM, Ian Clarke  wrote:
> I received the following email from Rich Jones, creator of Gun.io.  This
> could be a very interesting way for us to get specific tasks done...
>
> -- Forwarded message --
>
> ...snip...
>
> My name is Rich Jones, and I'm the lead developer and director of a project
> called Gun.io, a platform for open source project managers to raise funds
> and to hire open source freelancers for microtasks on their projects. I'm
> writing to you today to invite you to give it a try!
>
> The way it works is pretty simple: you post a task which needs to be done
> for your project and offer up an amount of money to pay for it. Other people
> can then contribute to this pool of money, or they can work on the task
> assigned. The first person to complete the task to your satisfaction will
> then be awarded all of the money in the pool.
>
> Gun.io is perfect for discrete tasks that your project needs to move
> forward, like fixing bugs, adding new features, and writing tests, examples
> and documentation. It's a great way to raise and spend funds, too, as your
> donors will know that their contributions are going directly to improving
> the project. Gun.io is how you turn a good project into a great project.
>
> Gun.io has already been used successfully by the Etherpad Foundation and Mozilla,
> the makers of Firefox. We are a fairly new project, but we already have
> thousands of registered developers who will be notified when your gig is
> posted.
>
> This is also completely free for open source projects! I developed Gun.io
> because I am an open source developer myself, and I wanted to hire
> assistance for my projects but was unsatisfied with offshore freelancers. I
> wanted to build a community-based solution which would have open source
> developers working for each other, so that's what we've made. I think that
> Freenet could really benefit from what we've built, so please give it a try!
>
> You can see our homepage here: http://gun.io
> you can browse our open source gigs here: http://gun.io/open/
> and you can post your own here: http://gun.io/open/new/
>
> If you've got any questions or comments (or if you just want to chat), feel
> free to email me any time at rich dot gun dot io.

Dumb question time: how would I go about searching for hypothetical
freenet tasks on gun.io?

Can anyone post a job for Freenet, or only project admins? How is
payment handled?

If I were to post some small bounties on existing bugs, would that
motivate anyone here to work on them? My budget more approximates the
"hey, thanks!" level than the "reasonably hourly wage" level.

Also, if it works, I think this would be an excellent use of FPI money.

Evan



Re: [freenet-dev] Fwd: Call for Papers: IEEE P2P 2012

2012-01-15 Thread Evan Daniel
Submitting a response to the Pitch Black paper seems a bit premature,
given that in the real world we probably have network distribution
problems even without an active adversary.

I continue to think that the biggest thing preventing routing layer
improvements is a fairly deep lack of understanding of what is
actually happening on the network.

And, if you're looking for papers, I think several interesting ones
could be written on that subject :)

Evan Daniel

On Sun, Jan 15, 2012 at 6:46 PM, Ian Clarke i...@locut.us wrote:
 Will you submit?  These guys have rejected our papers in the past
 specifically because we haven't yet formally responded to the Pitch Black
 paper, so submitting a response to it would be a pretty good thing IMHO.

 Have you had any contact with Theodore Hong?  If you ask him very nicely he
 may be willing to provide feedback on your paper.  Of course I will too, but
 Theo has a lot more experience with academic papers than I do.

 Ian.


 On Wed, Jan 11, 2012 at 9:29 PM, Michael Grube michael.gr...@gmail.com
 wrote:

 Thanks for the info!

 On Wed, Jan 11, 2012 at 10:12 PM, Ian Clarke i...@freenetproject.org
 wrote:

 fyi

 -- Forwarded message --
 From: David Hausheer haush...@kom.tu-darmstadt.de
 Date: Wed, Jan 11, 2012 at 5:27 PM
 Subject: Call for Papers: IEEE P2P 2012
 To: David Hausheer haush...@kom.tu-darmstadt.de


 #
                  IEEE P2P 2012
  12th International Conference in Peer-to-Peer Computing
                 CALL FOR PAPERS
 #

 September 3-5 2012, Tarragona (Spain)
 http://www.ieee-p2p.org

 ##
 # Papers Due: *** April, 13 2012 ***
 # Accepted papers will be published in the conference proceedings
 # by the IEEE Computer Society Press, which are indexed by EI.
 ##

 The P2P'12 conference solicits papers on all aspects of large-scale
 distributed computing. Of particular interest is research that
 furthers the state-of-the-art in the design and analysis of
 large-scale distributed applications and systems, or that investigates
 real, deployed, applications or systems. We seek high-quality and
 original contributions on this general theme along a range of topics
 including:

    * Information retrieval and query support
    * P2P for cloud computing
    * Large-scale infrastructure technology
    * Semantic overlay networks and semantic query routing
    * P2P for grids, clouds, and datacenters
    * Deployed (commercial) applications and systems
    * Security, trust, and reputation
    * Cooperation, incentives, and fairness
    * P2P economics
    * Social networks
    * Overlay architectures and topologies
    * Overlay interaction with underlying infrastructure
    * Overlay monitoring and management
    * Self-organization
    * P2P applications and systems over mobile networks
    * Measurements and modeling of P2P and cloud systems
    * Performance, robustness, and scalability



 
 Paper submission guidelines
 

 Papers can be submitted either as full papers or as short papers
 (following the IEEE single-spaced two-column format and a 10-point
 font size). Full papers should not exceed 10 pages and short papers
 should not exceed 5 pages. Short papers are expected to present work
 that is less mature but holds promise, articulate a high-level vision,
 describe challenging future directions or offer results that do not
 merit a full submission. Please note that short papers also need
 evaluation results or analysis to corroborate the claims of the paper
 and that a paper longer than 5 pages is treated as a full paper.

 Papers must be submitted electronically in PDF format through the EDAS
 paper-submission website linked from the conference website. IEEE
 templates for LaTeX and Microsoft Word, as well as related
 information, can be found at the IEEE Digital Toolbox webpage. The
 conference proceedings will be published by the IEEE Communications
 Society.

 All submissions will be evaluated using a double-blind review
 process. To ensure blind reviewing, papers should be anonymized by
 removing author names and affiliations, as well as by masking any
 information about projects and bibliographic references, etc. that
 might reveal the authors' identities. Papers that are not properly
 anonymized will be rejected without review. Submitted papers should
 describe original and previously unpublished research and are not
 allowed to be simultaneously submitted or under review elsewhere.

 *Note:* P2P 2012 will experiment with two changes to the traditional
 reviewing process: two-phase reviewing and open reviews. These
 changes are an effort to provide additional feedback to authors of
 submitted



[freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-09-01 Thread Evan Daniel
On Thu, Sep 1, 2011 at 7:24 AM, Matthew Toseland
 wrote:
>> I like this proposal :)
>>
>> Is the documentation on the math of how to get the random routing to
>> behave well sufficient? Let me know if it isn't. The MHMC routing math
>> shouldn't be too complicated, but we want to be certain it's
>> implemented correctly so that the data is sound.
>
> Do you have a metric for how clustered vs uniform a node's peers are?

Maybe. It's a tougher problem than it looks at first glance, unless we
have a somewhat reliable network size estimate available. I'll give it
some more thought.

If you want a qualitative, visual estimate, just change the peer
location distribution graph on the stats page to have a logarithmic x
axis. Basically, a node should have similar numbers of peers at
distance 0.25 < d <= 0.5, and at 0.125 < d <= 0.25, etc. That is, bar
n should graph the number of nodes at distance 2^(-n-2) < d <=
2^(-n-1). That doesn't provide an answer you can use in code to
evaluate and make decisions, but it should give better anecdotal
evidence about problems.
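
Concretely, that change is just binning peer distances by octave. A minimal
sketch in Python rather than the actual stats-page code, with made-up inputs:

    import math

    def circ_dist(a, b):
        # distance on the [0,1) location circle
        d = abs(a - b) % 1.0
        return min(d, 1.0 - d)

    def octave_bars(my_loc, peer_locs, n_bars=16):
        # bar n counts peers at distance 2^(-n-2) < d <= 2^(-n-1)
        bars = [0] * n_bars
        for p in peer_locs:
            d = circ_dist(my_loc, p)
            if d <= 0.0:
                continue
            n = min(int(-math.log2(d)) - 1, n_bars - 1)
            if n >= 0:
                bars[n] += 1
        return bars

Roughly equal bars is what a 1/d distribution looks like; everything piled
into the first bar or two means the peers are close to uniformly distributed,
and everything in the last few means they are tightly clustered around the
node's own location.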

> What's MHMC?

Metropolis-Hastings Monte Carlo. We currently use it to get the right
probability distribution for location swaps. We should also use it for
randomly routed requests. (For ones that need to be statistically
random, anyway. It's probably not worth the performance cost for eg
high-htl random routing of regular requests for security purposes.)
It's mentioned in the bug report:
https://bugs.freenetproject.org/view.php?id=3568
And there's a code snippet on my flog, that's used in my simulator:
freenet:USK@gjw6StjZOZ4OAG-pqOxIp5Nk11udQZOrozD4jld42Ac,BYyqgAtc9p0JGbJ~18XU6mtO9ChnBZdf~ttCn48FV7s,AQACAAE/flog/29/200909.xhtml
(Entry for 20090926)
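
The random-walk form of the correction is tiny. A sketch (Python, not the flog
snippet), assuming each node knows its own peer count and can learn its
neighbours', and that the goal is a stationary distribution that is uniform
over nodes rather than proportional to degree:

    import random

    def mh_next_hop(my_degree, peers):
        # peers: list of (peer_id, peer_degree) pairs. Propose a uniformly
        # random neighbour, accept with probability min(1, own degree /
        # neighbour degree); on rejection the walk stays put for this step
        # instead of being forwarded.
        peer_id, peer_degree = random.choice(peers)
        if random.random() < min(1.0, my_degree / peer_degree):
            return peer_id
        return None  # self-loop: burn the hop without moving

Without that acceptance test a plain random walk samples nodes roughly in
proportion to their degree, which is exactly the bias the probe data needs to
avoid.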

Evan Daniel



[freenet-dev] How to gather more data was Re: Beyond New Load Management: A proposal

2011-08-31 Thread Evan Daniel
> ...allowing us to separate problems with bootstrapping from problems faced by 
> the average node).
>
> IMHO some or all of the above are worth seriously considering before 
> deploying any new scheme. If we want to be empirical we need to measure the 
> effect of our changes on the real network, not only in simulation.
>
> _______
> Devl mailing list
> Devl at freenetproject.org
> http://freenetproject.org/cgi-bin/mailman/listinfo/devl
>

I like this proposal :)

Is the documentation on the math of how to get the random routing to
behave well sufficient? Let me know if it isn't. The MHMC routing math
shouldn't be too complicated, but we want to be certain it's
implemented correctly so that the data is sound.

Evan Daniel



[freenet-dev] Freenet 0.7.5 build 1401

2011-08-28 Thread Evan Daniel
On Fri, Aug 26, 2011 at 11:50 AM, Matthew Toseland
 wrote:
> Freenet 0.7.5 build 1401 is now available. Please upgrade, it will be 
> mandatory at midnight. This build turns off New Load Management, for the time 
> being. If performance continues to be poor we will know the problem is 
> elsewhere (it is possible that it is a problem with the asyncGet changes, 
> although I don't see how). There are also fixes related to dropping peers due 
> to the one IP per connection setting. You should not normally enable this 
> setting on darknet (core settings); it can cause your friends to be lost.
>
> Thanks, and sorry for all the problems lately.

Was it really necessary to have the update be mandatory on such short
notice? I'm not trying to be sarcastic, this is a serious question and
I'm curious about your opinion.

It seems to me that you get most of the impact thanks to the
auto-updates, which most of the network uses. Some of us, however, do
not. I can't use the auto-update without it regularly interfering with
the network size graphs I produce. (Yes, the scripts that run it are
brittle and sensitive to things like that. Yes, I'd like to throw them
out and write something better, but motivation to actually do that has
yet to strike.) So I have to manually perform updates at times when
they aren't running, and I missed the window on that one. The result
is wonky network size info that I'm pretty sure is entirely an
artifact of that, and has no relationship to what the recent
performance issues have done to network size, which would have been an
interesting question.

Anyway, a little more warning would have been nice, but obviously the
health of the network comes first.

Is there a policy on what the requirements are for a build to be made
mandatory at all? And what the warning period should be? If not, it
seems like something we should have.

Evan Daniel



[freenet-dev] Freenet 0.7.5 build 1385

2011-07-21 Thread Evan Daniel
On Mon, Jul 18, 2011 at 11:24 AM, Matthew Toseland
 wrote:
> Freenet 0.7.5 build 1385 is now available, please upgrade.
>
> The main change is merging the store-io branch, aka slot filters. This is a 
> replacement for the old datastore bloom filters. It keeps 4 bytes in memory 
> for each slot in the datastore, indicating whether they are full and the 
> first few bytes of the (hashed, salted) key. This is slightly smaller than 
> the old bloom filters, but is kept on the heap, not memory mapped, so it will 
> increase your memory limit in wrapper.conf slightly when you first run it. It 
> should greatly reduce disk I/O, in particular disk reads caused by writing a 
> block to the datastore. It will delete the old bloom filters and build the 
> new slotfilter files, which will take some time, during which the node will 
> be using the disk quite heavily, but after that it should be much reduced.
>
> Please let us know if there are any problems! There is also a new version of 
> FlogHelper (which didn't work with the last build), a new load management fix 
> and some minor stuff.
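
For a picture of what the slot filter described above amounts to, a rough
sketch of the idea only (Python; the layout and names are guesses, not the
store-io code):

    class SlotFilter:
        # one 32-bit word per store slot: 0 means the slot is empty; otherwise
        # the low bit marks it occupied and the upper bits hold a short prefix
        # of the salted, hashed key stored there
        def __init__(self, n_slots):
            self.words = [0] * n_slots

        def on_store(self, slot, salted_hash):
            self.words[slot] = (salted_hash & 0xFFFFFFFE) | 1

        def might_contain(self, slot, salted_hash):
            w = self.words[slot]
            return w != 0 and (w & 0xFFFFFFFE) == (salted_hash & 0xFFFFFFFE)

The saving is on the write path: the node only needs to read the slot from
disk when might_contain() says it could already hold that key, rather than on
every store attempt.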

Neat to see this feature going in :)

Upgraded earlier today. I'm showing no writes to store, cache, or
client cache since then (but yes to slashdot cache). This is true
across all three key types, even on the nearly empty pubkey stores.

Evan Daniel



[freenet-dev] Load management theory

2011-07-09 Thread Evan Daniel
 favor nodes which
minimize ((time since last request answered) / (FOAF locations
advertised)). (Count of FOAF locations is then being treated as a
stand-in for capacity, as per the suggestion above.)
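
A sketch of that selection rule (Python; the field names are invented):

    def pick_peer(peers, now):
        # peers: objects with .last_request_answered (a timestamp) and
        # .foaf_locations (how many FOAF locations they advertise).
        # Favor the peer with the lowest idle time per advertised FOAF
        # location, i.e. FOAF count standing in for capacity.
        def score(p):
            idle = now - p.last_request_answered
            return idle / max(p.foaf_locations, 1)
        return min(peers, key=score)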

Of course, all of these have the interesting problem that they distort
topology away from stuff we have careful theoretical models for.
That's bad. OTOH, I'm not entirely sure that those same models are
actually directly applicable to the current network, given the
combination of LRU peers selection, FOAF routing, and variable peer
counts with capacity. The models also don't tell us what to do about
the fact that some nodes generate a lot more requests than others.

Evan Daniel



[freenet-dev] Freenet 0.7.5 build 1258 (sorry for all the 1255 bugs!)

2010-07-06 Thread Evan Daniel
On Tue, Jul 6, 2010 at 5:05 PM, Marco A. Calamari  wrote:
> On Tue, 2010-07-06 at 14:33 +0100, Matthew Toseland wrote:
>> On Saturday 03 July 2010 23:50:00 Evan Daniel wrote:
>> > On Sat, Jul 3, 2010 at 1:42 PM, Marco A. Calamari 
>> wrote:
>
>> > Huh?  So don't upgrade until the insert finishes, or use persistent
>> > inserts.  The last one wasn't mandatory for a week, iirc.
>>
>> Agreed, that doesn't make any sense: As long as an insert is
>> persistent it should finish the insert even after a restart with a new
>> build.
>>
>> So please explain *EXACTLY WHAT HAPPENED*, instead of just vaguely
>> grumbling without giving us enough information to debug!
>
> Please do not become nervous and do not scream.
>
> I'm inserting a rather big site (over 2 parts) using
>  jSite, with an effective bandwidth of 0.3 KB/sec (calculated
>  by jSite. The site usually insert in 4-7 days.
> No bug at all, but if an update become mandatory in less days
>  that the site inserts, the jSite insert obviously fail.
>
> I do not know how insert sites in other ways; persistent insert
>  are only for files, isn't it?
>
> I'm not grumbling.  Peace.   Marco

Persistent inserts can be used for any kind of insert, including both
sites and single files.  I use persistent inserts via a small shell
script and FCP to insert my network size stats site, for example.  The
inserting program just needs to set the appropriate options.
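
The script is not much more than this sketch (Python; the port and field names
are what I believe FCPv2 uses, so check them against the FCP documentation
before depending on them):

    import socket

    def fcp_message(sock, name, **fields):
        # FCPv2 messages are plain text: name, Field=Value lines, EndMessage
        lines = [name] + ["%s=%s" % item for item in fields.items()] + ["EndMessage", ""]
        sock.sendall("\n".join(lines).encode("utf-8"))

    s = socket.create_connection(("127.0.0.1", 9481))   # default FCP port
    fcp_message(s, "ClientHello", Name="site-insert-script", ExpectedVersion="2.0")
    fcp_message(s, "ClientPut",
                URI="CHK@",                  # or the site's USK/SSK insert URI
                Identifier="my-site-insert",
                UploadFrom="disk",
                Filename="/path/to/site.tar",
                Persistence="forever",       # survives restarts, hence upgrades
                Global="true")               # visible on the global queue
    # ...then watch the socket for PutSuccessful / PutFailed replies.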

In general if you're inserting a site that large, you should at least
consider inserting the large files individually and then linking to
them from the main site.  I'm assuming this isn't all *new* data?  If
most of the insert remains unchanged, you could save yourself a lot of
time this way.

Anyway, support for persistent inserts in jSite would obviously be helpful here.

Evan Daniel



[freenet-dev] Freenet 0.7.5 build 1258 (sorry for all the 1255 bugs!)

2010-07-03 Thread Evan Daniel
On Sat, Jul 3, 2010 at 1:42 PM, Marco A. Calamari  wrote:
> On Sat, 2010-07-03 at 12:59 -0400, Evan Daniel wrote:
>> On Sat, Jul 3, 2010 at 12:38 PM, Marco A. Calamari 
>> wrote:
>> > On Sat, 2010-07-03 at 16:30 +0100, Matthew Toseland wrote:
>> >> Freenet 0.7.5 build 1258 is now available, please upgrade. This
>> build
>> >> fixes various bugs, mostly introduced in 1255, including the
>> Internal
>> >> error on some persistent
>> >
>> > No way to insert medium/big thing since long, long times ago .
>>
>> What do you mean by this? ?What did you do, and what happened?
>
> Too many incompatible/forced freenet updates make impossible to run long
>  inserts
>
> Just this

Huh?  So don't upgrade until the insert finishes, or use persistent
inserts.  The last one wasn't mandatory for a week, iirc.

Evan Daniel



[freenet-dev] Freenet 0.7.5 build 1258 (sorry for all the 1255 bugs!)

2010-07-03 Thread Evan Daniel
On Sat, Jul 3, 2010 at 12:38 PM, Marco A. Calamari  wrote:
> On Sat, 2010-07-03 at 16:30 +0100, Matthew Toseland wrote:
>> Freenet 0.7.5 build 1258 is now available, please upgrade. This build
>> fixes various bugs, mostly introduced in 1255, including the Internal
>> error on some persistent
>
> No way to insert medium/big thing since long, long times ago .

What do you mean by this?  What did you do, and what happened?

Evan Daniel



[freenet-dev] Improving insert persistence

2010-06-29 Thread Evan Daniel
On Tue, Jun 29, 2010 at 5:15 PM, Matthew Toseland
 wrote:
> On Tuesday 29 June 2010 16:54:42 Evan Daniel wrote:
>> On Tue, Jun 29, 2010 at 11:45 AM, Robert Hailey
>>  wrote:
>> >
>> > On Jun 29, 2010, at 8:26 AM, Matthew Toseland wrote:
>> >
>> >> Tests showed ages ago that triple inserting the same block gives 90%+
>> >> persistence after a week instead of 70%+. There is a (probably 
>> >> statistically
>> >> insignificant) improvement relative even to inserting 3 separate blocks.
>> >> I had thought it was due to not forking on cacheable, but we fork on
>> >> cacheable now and the numbers are the same.
>> >> [...]
>> >>
>> >> With backoff:
>> >>
>> >> IMHO rejections are more likely to be the culprit. There just isn't that
>> >> much backoff any more. However, we could allow an insert to be routed to a
>> >> backed off peer provided the backoff-time-remaining is under some 
>> >> arbitrary
>> >> threshold.
>> >>
>> >> Now, can we test these proposals? Yes.
>> >>
>> >> We need a new MHK tester to get more data, and determine whether triple
>> >> insertion still helps a lot. IMHO there is no obvious reason why it would
>> >> have degenerated. We need to insert and request a larger number of blocks
>> >> (rather than 3+1 per day), and we need to test with fork on cacheable vs
>> >> without it. We should probably also use a 2 week period rather than a 1 
>> >> week
>> >> period, to get more detailed numbers. However, we can add two more
>> >> per-insert flags which we could test:
>> >> - Ignore low backoff: If enabled, route inserts to nodes with backoff time
>> >> remaining under some threshold. This is easy to implement.
>> >> - Prefer inserts: If enabled, target a 1/1/3/3 ratio rather than a 1/1/1/1
>> >> ratio. To implement this using the current kludge, we would need to deduct
>> >> the space used by 2 inserts of each type from the space used, when we are
>> >> considering whether to accept an insert. However IMHO the current kludge
>> >> probably doesn't work very well. It would likely be better to change it as
>> >> above, then we could just have a different target ratio. But for testing
>> >> purposes we could reasonably just try the kludge.
>> >>
>> >> Of course, the real solution is probably to rework load management so we
>> >> don't misroute, or misroute much less (especially on inserts).
>> >
>> > About persistence... it logically must be confined to these areas.
>> >
>> > 1) insertion logic
>> > 2) network change over time
>> > 3) fetch logic
>> >
>> > If there is a major issue with 2 or 3, then beefing up 1 may not be a 
>> > "good"
>> > solution. Then again, I like your ideas more than just chalking it up to
>> > "bad network topology"...
>>
>> Bad topology is not confined to those areas.  The insert / fetch logic
>> can be locally correct, and the network static, and bad topology will
>> still produce poor performance.
>
> True but opennet should produce good topology shouldn't it? Generally the 
> stats page seems to suggest routing is working?
>

True in theory.  Stats page suggests routing basically works, and is
not inconsistent with good overall topology.  I have enough data from
probe requests to do serious topology analysis, but have not yet done
so.  At this point I would say that the topology is assumed to be
good, but that we aren't completely certain.

Evan Daniel



[freenet-dev] Improving insert persistence

2010-06-29 Thread Evan Daniel
On Tue, Jun 29, 2010 at 11:45 AM, Robert Hailey
 wrote:
>
> On Jun 29, 2010, at 8:26 AM, Matthew Toseland wrote:
>
>> Tests showed ages ago that triple inserting the same block gives 90%+
>> persistence after a week instead of 70%+. There is a (probably statistically
>> insignificant) improvement relative even to inserting 3 separate blocks.
>> I had thought it was due to not forking on cacheable, but we fork on
>> cacheable now and the numbers are the same.
>> [...]
>>
>> With backoff:
>>
>> IMHO rejections are more likely to be the culprit. There just isn't that
>> much backoff any more. However, we could allow an insert to be routed to a
>> backed off peer provided the backoff-time-remaining is under some arbitrary
>> threshold.
>>
>> Now, can we test these proposals? Yes.
>>
>> We need a new MHK tester to get more data, and determine whether triple
>> insertion still helps a lot. IMHO there is no obvious reason why it would
>> have degenerated. We need to insert and request a larger number of blocks
>> (rather than 3+1 per day), and we need to test with fork on cacheable vs
>> without it. We should probably also use a 2 week period rather than a 1 week
>> period, to get more detailed numbers. However, we can add two more
>> per-insert flags which we could test:
>> - Ignore low backoff: If enabled, route inserts to nodes with backoff time
>> remaining under some threshold. This is easy to implement.
>> - Prefer inserts: If enabled, target a 1/1/3/3 ratio rather than a 1/1/1/1
>> ratio. To implement this using the current kludge, we would need to deduct
>> the space used by 2 inserts of each type from the space used, when we are
>> considering whether to accept an insert. However IMHO the current kludge
>> probably doesn't work very well. It would likely be better to change it as
>> above, then we could just have a different target ratio. But for testing
>> purposes we could reasonably just try the kludge.
>>
>> Of course, the real solution is probably to rework load management so we
>> don't misroute, or misroute much less (especially on inserts).
>
> About persistence... it logically must be confined to these areas.
>
> 1) insertion logic
> 2) network change over time
> 3) fetch logic
>
> If there is a major issue with 2 or 3, then beefing up 1 may not be a "good"
> solution. Then again, I like your ideas more than just chalking it up to
> "bad network topology"...
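
A quick sanity check on the triple-insert figures quoted above, assuming the
three inserts of a block survive or die independently:

    p_single = 0.70                      # one insert fetchable after a week
    p_triple = 1 - (1 - p_single) ** 3   # = 0.973, i.e. the observed "90%+"

So plain redundancy already accounts for the headline numbers; the interesting
residue is the (statistically weak) hint that three inserts of the same key do
slightly better than three separate blocks.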

Bad topology is not confined to those areas.  The insert / fetch logic
can be locally correct, and the network static, and bad topology will
still produce poor performance.

Evan Daniel



[freenet-dev] Planned changes to keys and UI

2010-06-24 Thread Evan Daniel
On Wed, Jun 23, 2010 at 5:43 PM, Matthew Toseland
 wrote:
> On Wednesday 23 June 2010 20:33:50 Sich wrote:
>> Le 23/06/2010 21:01, Matthew Toseland a écrit :
>> > Insert a random, safe key
>> > This is much safer than the first option, but the key will be different 
>> > every time you or somebody else inserts the key. Use this if you are the 
>> > original source of some sensitive data.
>> >
>> >
>> Very interesting for filesharing if we split the file.
>> When some chunk are lost, you have only to reinsert those who are
>> lost... But then we use much datastore... But it's more secure...
>> Loosing datastore space is a big problem no ?
>
> If some people use the new key and some use the old then it's a problem. If 
> everyone uses one or the other it isn't. I guess this is another reason to 
> use par files etc (ugh).
>
> The next round of major changes (probably in 1255) will introduce 
> cross-segment redundancy which should improve the reliability of really big 
> files.
>
> Long term we may have selective reinsert support, but of course that would be 
> nearly as unsafe as reinserting the whole file to the same key ...
>
> If you're building a reinsert-on-demand based filesharing system let me know 
> if you need any specific functionality...

The obvious intermediate is to reinsert a small portion of a file.
The normal case is (and will continue to be) that when a file becomes
unretrievable, it's because one or more segments is only a couple
blocks short of being retrievable.  If you reinsert say 8 blocks out
of each segment (1/32 of the file), you'll be reinserting on average 4
unretrievable blocks from each segment.  That should be enough in a
lot of cases.  This is probably better than selective reinsert (the
attacker doesn't get to choose which blocks you reinsert as easily),
though it does mean reinserting more blocks (8 per segment when merely
reinserting the correct 3 blocks might suffice).
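
The arithmetic behind the "on average 4", taking as an example a 128+128
segment that is three blocks short of decodable (125 blocks fetchable, 131
not), and assuming the reinserted blocks are picked uniformly and then fetch
successfully afterwards:

    from math import comb

    SEG, MISSING, REINSERT, SHORTFALL = 256, 131, 8, 3

    # expected number of currently-unretrievable blocks among the 8 reinserted
    expected_hits = REINSERT * MISSING / SEG   # about 4.1

    # probability the 8 reinserted blocks cover at least the 3-block shortfall,
    # i.e. the segment becomes decodable again (hypergeometric tail)
    p_segment_recovers = sum(
        comb(MISSING, k) * comb(SEG - MISSING, REINSERT - k)
        for k in range(SHORTFALL, REINSERT + 1)) / comb(SEG, REINSERT)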

The simple defense against a mobile opennet attacker that has been
proposed before would be particularly well suited to partial
randomized reinserts.  The insert comes with a time (randomized per
block, to some time a bit before the reinsert started), and is only
routed along connections that were established before that time, until
it reaches some relatively low HTL (10?).  This prevents the attacker
from moving during the insert.  On a large file that takes a long time
to insert, this is problematic, because there aren't enough
connections that are old enough to route along.  For a partial
reinsert, this is less of a concern, simply because it doesn't take as
long.
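
A sketch of the routing-time check that defense needs (Python; the field and
parameter names are invented):

    def eligible_peers(peers, insert_timestamp, htl, htl_threshold=10):
        # While HTL is still above the threshold, only route along connections
        # that already existed at the (per-block randomized) timestamp carried
        # by the insert, so a node that connects mid-insert cannot attract it.
        if htl <= htl_threshold:
            return list(peers)
        return [p for p in peers if p.connected_since <= insert_timestamp]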

Evan Daniel



[freenet-dev] [freenet-support] Freenet 0.7.5 build 1253: UPGRADE NOW!!!

2010-06-14 Thread Evan Daniel
On Mon, Jun 14, 2010 at 4:56 PM, Matthew Toseland
 wrote:

> - Even segment splitting. Freenet divides large files up into "splitfiles": 
> because a CHK can only be 32KB, Freenet splits larger files up into many 
> blocks, and then divides those blocks into "segments" of no more than 256 
> blocks (128 data blocks, and 128 "check blocks" that are generated for 
> redundancy). Until 1251, on any splitfile of more than 128 blocks (4MB), all 
> but the last segment would be exactly 128 data blocks and 128 check blocks. 
> From this build onwards, all the segments will be roughly the same size. 
> Plus, we allow up to 131 data blocks (with 125 check blocks) to use fewer 
> segments (which is a net benefit for reliability), and we add an extra check 
> block to each segment of less than 128 data blocks.

You're only allowing > 128 data blocks on small files, right?  With >
520 total data blocks, it should be max 128 data blocks per segment
(ie at 5 segments or more).  And with less than that, the appropriate
max depends on file size.
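
(For reference, the even-splitting arithmetic I have in mind is something like
the sketch below.  It deliberately ignores the 131-data-block small-file
exception; it only shows how keeping the segment count at ceil(n/128) while
spreading data blocks evenly changes the layout, e.g. 600 data blocks become
five segments of 120 instead of four of 128 plus one of 88.)

    // Sketch of even segment splitting for the data blocks of a splitfile.
    public class EvenSegmentSplit {
        public static int[] segmentSizes(int dataBlocks) {
            int maxPerSegment = 128;
            int segments = (dataBlocks + maxPerSegment - 1) / maxPerSegment; // ceil
            int base = dataBlocks / segments;
            int remainder = dataBlocks % segments; // the first 'remainder' segments get one extra block
            int[] sizes = new int[segments];
            for (int i = 0; i < segments; i++) {
                sizes[i] = base + (i < remainder ? 1 : 0);
            }
            return sizes;
        }

        public static void main(String[] args) {
            for (int size : segmentSizes(600)) System.out.println(size); // five segments of 120
        }
    }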

Evan Daniel



Re: [freenet-dev] [freenet-support] Freenet 0.7.5 build 1253: UPGRADE NOW!!!

2010-06-14 Thread Evan Daniel
On Mon, Jun 14, 2010 at 4:56 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:

 - Even segment splitting. Freenet divides large files up into splitfiles: 
 because a CHK can only be 32KB, Freenet splits larger files up into many 
 blocks, and then divides those blocks into segments of no more than 256 
 blocks (128 data blocks, and 128 check blocks that are generated for 
 redundancy). Until 1251, on any splitfile of more than 128 blocks (4MB), all 
 but the last segment would be exactly 128 data blocks and 128 check blocks. 
 From this build onwards, all the segments will be roughly the same size. 
 Plus, we allow up to 131 data blocks (with 125 check blocks) to use fewer 
 segments (which is a net benefit for reliability), and we add an extra check 
 block to each segment of less than 128 data blocks.

You're only allowing > 128 data blocks on small files, right?  With >
520 total data blocks, it should be max 128 data blocks per segment
(ie at 5 segments or more).  And with less than that, the appropriate
max depends on file size.

Evan Daniel


[freenet-dev] How to figure out number of online Freenet users

2010-06-06 Thread Evan Daniel
Or, you can check my stats freesite, which has regular updates and graphs:
http://127.0.0.1:/USK@gjw6StjZOZ4OAG-pqOxIp5Nk11udQZOrozD4jld42Ac,BYyqgAtc9p0JGbJ~18XU6mtO9ChnBZdf~ttCn48FV7s,AQACAAE/graphs/459/

Typical numbers for users online vary from about 6300 to 8100,
depending mostly on time of day.  Number of total users is somewhere
in the 15k-20k range, and depends mostly on how you define "user".

The data is generated based on probe requests; see that site, my flog
(linked from it), and the mailing list archives for more details.

Evan Daniel

On Sat, Jun 5, 2010 at 10:59 PM, DJ Amireh  wrote:
> Ikram,
>
> AFAIK (which isn't that much) the only estimate you can get is that of
> opennet peers. All opennet peers are harvestable by nature (they are
> advertised publicly). On the Freenet Message System, there is a user who
> posts the opennet peers occasionally. Here is his latest post:
> http://pastebin.com/XD78EHmz
>
> - DJ Amireh
>
> On Sat, Jun 5, 2010 at 6:16 PM, Ikram M. Khan  wrote:
>>
>> Dear All,
>> How can one figure out the number of online FreeNet Users? How to estimate
>> total number of FreeNet users and what is the total number of Freenet
>> Users?
>> With best regards,
>> ikram



Re: [freenet-dev] How to figure out number of online Freenet users

2010-06-05 Thread Evan Daniel
Or, you can check my stats freesite, which has regular updates and graphs:
http://127.0.0.1:/USK@gjw6StjZOZ4OAG-pqOxIp5Nk11udQZOrozD4jld42Ac,BYyqgAtc9p0JGbJ~18XU6mtO9ChnBZdf~ttCn48FV7s,AQACAAE/graphs/459/

Typical numbers for users online vary from about 6300 to 8100,
depending mostly on time of day.  Number of total users is somewhere
in the 15k-20k range, and depends mostly on how you define user.

The data is generated based on probe requests; see that site, my flog
(linked from it), and the mailing list archives for more details.

Evan Daniel

On Sat, Jun 5, 2010 at 10:59 PM, DJ Amireh cactus...@gmail.com wrote:
 Ikram,

 AFAIK (which isn't that much) the only estimate you can get is that of
 opennet peers. All opennet peers are harvestable by nature (they are
 advertised publicly). On the Freenet Message System, there is a user who
 posts the opennet peers occasionally. Here is his latest post:
 http://pastebin.com/XD78EHmz

 - DJ Amireh

 On Sat, Jun 5, 2010 at 6:16 PM, Ikram M. Khan engr.ik...@gmail.com wrote:

 Dear All,
 How can one figure out the number of online FreeNet Users? How to estimate
 total number of FreeNet users and what is the total number of Freenet
 Users?
 With best regards,
 ikram


[freenet-dev] Attribute reordering in HTML filter

2010-05-09 Thread Evan Daniel
On Sun, May 9, 2010 at 7:36 AM, Florent Daigniere wrote:
>> >> > Depending how much cleaning of the HTML filtering system you want to
>> >> > do...  Has using something like JTidy ( http://jtidy.sourceforge.net/
>> >> > ) been discussed?  That way you wouldn't have to worry about what's
>> >> > valid or invalid HTML, merely the security aspects of valid HTML that
>> >> > are unique to Freenet.
>> >
>> > That might be nice... but wouldn't we have the same problem in that it 
>> > would
>> > be hard to diff the output of the filter against the input for debugging
>> > purposes? What do other people think about this? It would make life much
>> > easier...
>>
>> I don't see why it would be a problem.  I haven't used tidy much,
>> honestly.  I don't see how to make it stop changing line breaks and
>> such in my page.  However, I don't mind running it locally before
>> inserting, so that nothing changes when the filter runs it.  I don't
>> need the filter to never change anything; I just need to know what to
>> do so that I can get a diff that shows only the changes made by the
>> filter.  If I need to run tidy on the original, and then diff that vs
>> the filtered output, that's fine by me.
>>
>> And anything that makes the filtering more robust and less work is a
>> big win, imho.
>>
>> Evan Daniel
>
> No way. We have a filter which works (security-wise), why would we change?
>
> Auditing upstream changes is going to be more time-expensive than maintaining 
> our own
> because it implements only a subset of the features.

As I see it, there are three parts to the filter:
1) Parse the HTML / XHTML, and build a parse tree / DOM.  Handle
invalid markup, invalid character escapes, etc.
2) Remove any elements that present a security risk.
3) Turn the resulting DOM back into HTML output

The goal of using something like JTidy would be to make part 1 more
robust, and easier to maintain.  Part 2 would be the same filter we
have now.

At present, we allow a large amount of invalid markup through the
filter.  I don't like this for a variety of reasons, but the relevant
one is that browser behavior when presented with invalid markup is not
well defined, and therefore has a lot of potential for security risks.
 OTOH, we can't just ban invalid markup, because so many freesites use
it.  Using something like JTidy gets the best of both worlds: it
cleans up invalid markup and produces something that is valid and
likely to do what the freesite author wanted.  That means we can be
certain that the browser will interpret the document in the same
fashion our filter does, which is a win for security.
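
Roughly what I have in mind, as a sketch only: JTidy for step 1, a whitelist
pass over the standard org.w3c.dom tree for step 2, and JTidy's pretty-printer
for step 3.  The whitelist contents, the JTidy options, and the parseDOM/pprint
entry points are assumptions for illustration, not a proposal for the real
filter:

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.Set;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.w3c.tidy.Tidy;

    public class TidyBasedFilter {
        // Placeholder whitelists -- the real ones would come from the existing filter.
        private static final Set<String> ALLOWED_TAGS =
                Set.of("html", "head", "title", "body", "p", "a", "img");
        private static final Set<String> ALLOWED_ATTRS = Set.of("href", "src", "alt", "title");

        public void filter(InputStream in, OutputStream out) {
            Tidy tidy = new Tidy();
            tidy.setXHTML(true);   // step 1: normalise whatever we were given into a valid DOM
            tidy.setQuiet(true);
            Document doc = tidy.parseDOM(in, null);
            sanitize(doc.getDocumentElement());
            tidy.pprint(doc, out); // step 3: serialise the cleaned DOM
        }

        // Step 2: whitelist pass -- drop unknown elements and attributes.
        // (A real filter would probably unwrap some elements instead of dropping
        // them wholesale, but that detail doesn't matter for the sketch.)
        private void sanitize(Element element) {
            NamedNodeMap attrs = element.getAttributes();
            for (int i = attrs.getLength() - 1; i >= 0; i--) {
                String name = attrs.item(i).getNodeName();
                if (!ALLOWED_ATTRS.contains(name.toLowerCase())) element.removeAttribute(name);
            }
            NodeList children = element.getChildNodes();
            for (int i = children.getLength() - 1; i >= 0; i--) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) continue;
                Element e = (Element) child;
                if (ALLOWED_TAGS.contains(e.getTagName().toLowerCase())) {
                    sanitize(e);
                } else {
                    element.removeChild(e);
                }
            }
        }
    }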

Reasons to change:
- Our filter works security-wise, but is more restrictive than
required on content.  Loosening those restrictions will be less work
if we can assume that the filtering process starts with a completely
valid DOM.
- We don't have to maintain a parser, just a filter.
- Our current filter breaks valid markup; fixing this is probably
easier if we use something like JTidy to handle the DOM rather than
rolling our own HTML / XHTML parser.

Reasons not to change:
- Changing takes work, and has the potential to introduce new bugs
- We have to worry about upstream changes

I'm not overly convinced about the upstream changes piece.  The
upstream code in question is a port of the W3C reference parser.
Since we'd be using a whitelist filter on the DOM it produces, we
don't need to worry about new features and supported markup, only new
bugs.  How much auditing do we currently do on upstream code?

I'm not trying to advocate spending lots of time changing everything
over.  We have better things to work on.  I'm asking whether it's
easier to fix the current code, or to refactor it to use a more robust
parser.  (And which is easier to maintain long term -- though if
switching is a lot of work, but easier long term, then imho we should
keep the current code for now and revisit the question later, like
after 0.8 is out.)

Evan Daniel



[freenet-dev] Attribute reordering in HTML filter

2010-05-09 Thread Evan Daniel
On Sat, May 8, 2010 at 9:35 PM, Spencer Jackson wrote:
> On Sat, May 8, 2010 at 10:38 AM, Matthew Toseland wrote:
>>
>> On Saturday 08 May 2010 05:09:07 Evan Daniel wrote:
>> > On Fri, May 7, 2010 at 11:43 PM, Spencer Jackson wrote:
>> > > On Fri, 2010-05-07 at 12:40 +0100, Matthew Toseland wrote:
>> > >> On Thursday 06 May 2010 20:40:03 Spencer Jackson wrote:
>> > >> > Hi guys, just wanted to touch base. Anyway, I'm working on
>> > >> > resolving bug
>> > >> > number 3571( https://bugs.freenetproject.org/view.php?id=3571 ). To
>> > >> > summarize, the filter tends to reorder attributes at semirandom
>> > >> > when
>> > >> > they get parsed. While the structure which holds the parsed
>> > >> > attribute is
>> > >> > a LinkedHashMap, meaning we should be able to stuff in values and
>> > >> > pull
>> > >> > them out in the same order, the put functions are called in the
>> > >> > derived
>> > >> > verifier's overrided sanitizeHash methods. These methods extract an
>> > >> > attribute, sanitize it, then place it in the Map. The problem is,
>> > >> > they
>> > >> > are extracted out of the original order, meaning they get pulled
>> > >> > out of
>> > >> > the Map in the wrong order. To fix this, I created a callback
>> > >> > object
>> > >> > which the derived classes pass to the baseclass. The baseclass may
>> > >> > then
>> > >> > parse all of the attributes in order, invoking the callback to
>> > >> > sanitize.If an attribute's contents fails to be processed, an
>> > >> > exception
>> > >> > may be thrown, so that the attribute will not be included in the
>> > >> > final
>> > >> > tag.
>> > >>
>> > >> It is important that only attributes that are explicitly parsed and
>> > >> understood are passed on, and that it doesn't take extra per-sanitiser 
>> > >> work
>> > >> to achieve this. Will this be the case?
>> > >>
>> > >
>> > > Yeah, this should be the case.  Attributes which don't have a callback
>> > > stored simply aren't parsed. I am starting, however, to think this
>> > > approach might be overkill.  Here I have a different take:
>> > >
>> > > http://github.com/spencerjackson/fred-staging/tree/HTMLAttributeReorder
>> > > Instead of running a callback in the base class, I simply create the
>> > > attributes, in order, with null content. Then, in the overloaded
>> > > methods
>> > > on the child classes I repopulate them with the correct data. This
>> > > preserves the original order of the attributes, while minimizing the
>> > > amount of new code that needs to be written. What do you think? Which
>> > > solution do you think is preferable?
>> >
>> > Do attributes without content still get written?  Is that always
>> > valid?  Not writing them isn't always valid; see eg bug 4125: current
>> > code happily removes required attributes from <meta> tags, thus
>> > breaking valid pages.
>
>
> Odd. I'm looking at the code for MetaTagVerifier, and I can't see any code
> branches in which, if the 'content' attribute is defined, it is failed to be
> added to the LinkedHashMap unless nothing else is added either... I'm not on
> my home computer, so I'll have to test this tomorrow. Does it happen to all
>  tags? Oh. Do you mean, if there are no attributes, the tag will still
> exist, but be empty? I could alter MetaTagVerifier to return null if this is
> the case, and remove the tag from the final output. Would that fix this?

As mentioned in the other reply, the content filter alters my flog from
<meta http-equiv="Content-type" content="application/xhtml+xml;charset=UTF-8" />
to
<meta />

I haven't done a detailed analysis of why.

>
>>
>> >
>> > Depending how much cleaning of the HTML filtering system you want to
>> > do...  Has using something like JTidy ( http://jtidy.sourceforge.net/
>> > ) been discussed?  That way you wouldn't have to worry about what's
>> > valid or invalid HTML, merely the security aspects of valid HTML that
>> > are unique to Freenet.
>
> That might be nice... but wouldn't we have the same problem in that it would
> be hard to diff the output of the filter against the input for debugging
> purposes? What do other people think about this? It would make life much
> easier...

I don't see why it would be a problem.  I haven't used tidy much,
honestly.  I don't see how to make it stop changing line breaks and
such in my page.  However, I don't mind running it locally before
inserting, so that nothing changes when the filter runs it.  I don't
need the filter to never change anything; I just need to know what to
do so that I can get a diff that shows only the changes made by the
filter.  If I need to run tidy on the original, and then diff that vs
the filtered output, that's fine by me.

And anything that makes the filtering more robust and less work is a
big win, imho.

Evan Daniel



[freenet-dev] Attribute reordering in HTML filter

2010-05-09 Thread Evan Daniel
On Sat, May 8, 2010 at 11:38 AM, Matthew Toseland wrote:
> On Saturday 08 May 2010 05:09:07 Evan Daniel wrote:
>> On Fri, May 7, 2010 at 11:43 PM, Spencer Jackson wrote:
>> > On Fri, 2010-05-07 at 12:40 +0100, Matthew Toseland wrote:
>> >> On Thursday 06 May 2010 20:40:03 Spencer Jackson wrote:
>> >> > Hi guys, just wanted to touch base. Anyway, I'm working on resolving bug
>> >> > number 3571( https://bugs.freenetproject.org/view.php?id=3571 ). To
>> >> > summarize, the filter tends to reorder attributes at semirandom when
>> >> > they get parsed. While the structure which holds the parsed attribute is
>> >> > a LinkedHashMap, meaning we should be able to stuff in values and pull
>> >> > them out in the same order, the put functions are called in the derived
>> >> > verifier's overrided sanitizeHash methods. These methods extract an
>> >> > attribute, sanitize it, then place it in the Map. The problem is, they
>> >> > are extracted out of the original order, meaning they get pulled out of
>> >> > the Map in the wrong order. To fix this, I created a callback object
>> >> > which the derived classes pass to the baseclass. The baseclass may then
>> >> > parse all of the attributes in order, invoking the callback to
>> >> > sanitize.If an attribute's contents fails to be processed, an exception
>> >> > may be thrown, so that the attribute will not be included in the final
>> >> > tag.
>> >>
>> >> It is important that only attributes that are explicitly parsed and 
>> >> understood are passed on, and that it doesn't take extra per-sanitiser 
>> >> work to achieve this. Will this be the case?
>> >>
>> >
>> > Yeah, this should be the case.  Attributes which don't have a callback
>> > stored simply aren't parsed. I am starting, however, to think this
>> > approach might be overkill.  Here I have a different take:
>> > http://github.com/spencerjackson/fred-staging/tree/HTMLAttributeReorder
>> > Instead of running a callback in the base class, I simply create the
>> > attributes, in order, with null content. Then, in the overloaded methods
>> > on the child classes I repopulate them with the correct data. This
>> > preserves the original order of the attributes, while minimizing the
>> > amount of new code that needs to be written. What do you think? Which
>> > solution do you think is preferable?
>>
>> Do attributes without content still get written?  Is that always
>> valid?  Not writing them isn't always valid; see eg bug 4125: current
>> code happily removes required attributes from <meta> tags, thus
>> breaking valid pages.
>>
>> Depending how much cleaning of the HTML filtering system you want to
>> do...  Has using something like JTidy ( http://jtidy.sourceforge.net/
>> ) been discussed?  That way you wouldn't have to worry about what's
>> valid or invalid HTML, merely the security aspects of valid HTML that
>> are unique to Freenet.
>
> IMHO sajack's solution is acceptable, you will have to just use null to 
> indicate no attribute and "" to indicate an attribute with no value? Or is 
> there a difference between attributes with an empty value and attributes with 
> no value?
>

It sounds fine to me, provided it doesn't take validating html and
make it stop validating.  Or at least does so no more than the current
code.

I'm asking what will happen when the attribute has null content
because the filter couldn't find anything to fill it with; does that
get written as <tag attribute=""> or <tag> or something else?
Whichever it is, do we know that the result will be valid html?

The current filter turns eg
<meta http-equiv="Content-type" content="application/xhtml+xml;charset=UTF-8" />
into
<meta />

The first is valid xhtml, the second is not.  Run the w3c validator
against my flog, both filtered and unfiltered, for details.  So, how
will the new filter handle cases like this, where filter code hasn't
been completely written for all relevant aspects?

Evan Daniel



Re: [freenet-dev] Attribute reordering in HTML filter

2010-05-09 Thread Evan Daniel
On Sun, May 9, 2010 at 7:36 AM, Florent Daigniere
nextg...@freenetproject.org wrote:
   Depending how much cleaning of the HTML filtering system you want to
   do...  Has using something like JTidy ( http://jtidy.sourceforge.net/
   ) been discussed?  That way you wouldn't have to worry about what's
   valid or invalid HTML, merely the security aspects of valid HTML that
   are unique to Freenet.
 
  That might be nice... but wouldn't we have the same problem in that it 
  would
  be hard to diff the output of the filter against the input for debugging
  purposes? What do other people think about this? It would make life much
  easier...

 I don't see why it would be a problem.  I haven't used tidy much,
 honestly.  I don't see how to make it stop changing line breaks and
 such in my page.  However, I don't mind running it locally before
 inserting, so that nothing changes when the filter runs it.  I don't
 need the filter to never change anything; I just need to know what to
 do so that I can get a diff that shows only the changes made by the
 filter.  If I need to run tidy on the original, and then diff that vs
 the filtered output, that's fine by me.

 And anything that makes the filtering more robust and less work is a
 big win, imho.

 Evan Daniel

 No way. We have a filter which works (security-wise), why would we change?

 Auditing upstream changes is going to be more time-expensive than maintaining 
 our own
  because it implements only a subset of the features.

As I see it, there are three parts to the filter:
1) Parse the HTML / XHTML, and build a parse tree / DOM.  Handle
invalid markup, invalid character escapes, etc.
2) Remove any elements that present a security risk.
3) Turn the resulting DOM back into HTML output

The goal of using something like JTidy would be to make part 1 more
robust, and easier to maintain.  Part 2 would be the same filter we
have now.

At present, we allow a large amount of invalid markup through the
filter.  I don't like this for a variety of reasons, but the relevant
one is that browser behavior when presented with invalid markup is not
well defined, and therefore has a lot of potential for security risks.
 OTOH, we can't just ban invalid markup, because so many freesites use
it.  Using something like JTidy gets the best of both worlds: it
cleans up invalid markup and produces something that is valid and
likely to do what the freesite author wanted.  That means we can be
certain that the browser will interpret the document in the same
fashion our filter does, which is a win for security.

Reasons to change:
- Our filter works security-wise, but is more restrictive than
required on content.  Loosening those restrictions will be less work
if we can assume that the filtering process starts with a completely
valid DOM.
- We don't have to maintain a parser, just a filter.
- Our current filter breaks valid markup; fixing this is probably
easier if we use something like JTidy to handle the DOM rather than
rolling our own HTML / XHTML parser.

Reasons not to change:
- Changing takes work, and has the potential to introduce new bugs
- We have to worry about upstream changes

I'm not overly convinced about the upstream changes piece.  The
upstream code in question is a port of the W3C reference parser.
Since we'd be using a whitelist filter on the DOM it produces, we
don't need to worry about new features and supported markup, only new
bugs.  How much auditing do we currently do on upstream code?

I'm not trying to advocate spending lots of time changing everything
over.  We have better things to work on.  I'm asking whether it's
easier to fix the current code, or to refactor it to use a more robust
parser.  (And which is easier to maintain long term -- though if
switching is a lot of work, but easier long term, then imho we should
keep the current code for now and revisit the question later, like
after 0.8 is out.)

Evan Daniel


[freenet-dev] Attribute reordering in HTML filter

2010-05-08 Thread Evan Daniel
On Fri, May 7, 2010 at 11:43 PM, Spencer Jackson wrote:
> On Fri, 2010-05-07 at 12:40 +0100, Matthew Toseland wrote:
>> On Thursday 06 May 2010 20:40:03 Spencer Jackson wrote:
>> > Hi guys, just wanted to touch base. Anyway, I'm working on resolving bug
>> > number 3571( https://bugs.freenetproject.org/view.php?id=3571 ). To
>> > summarize, the filter tends to reorder attributes at semirandom when
>> > they get parsed. While the structure which holds the parsed attribute is
>> > a LinkedHashMap, meaning we should be able to stuff in values and pull
>> > them out in the same order, the put functions are called in the derived
>> > verifier's overrided sanitizeHash methods. These methods extract an
>> > attribute, sanitize it, then place it in the Map. The problem is, they
>> > are extracted out of the original order, meaning they get pulled out of
>> > the Map in the wrong order. To fix this, I created a callback object
>> > which the derived classes pass to the baseclass. The baseclass may then
>> > parse all of the attributes in order, invoking the callback to
>> > sanitize.If an attribute's contents fails to be processed, an exception
>> > may be thrown, so that the attribute will not be included in the final
>> > tag.
>>
>> It is important that only attributes that are explicitly parsed and 
>> understood are passed on, and that it doesn't take extra per-sanitiser work 
>> to achieve this. Will this be the case?
>>
>
> Yeah, this should be the case.  Attributes which don't have a callback
> stored simply aren't parsed. I am starting, however, to think this
> approach might be overkill.  Here I have a different take:
> http://github.com/spencerjackson/fred-staging/tree/HTMLAttributeReorder
> Instead of running a callback in the base class, I simply create the
> attributes, in order, with null content. Then, in the overloaded methods
> on the child classes I repopulate them with the correct data. This
> preserves the original order of the attributes, while minimizing the
> amount of new code that needs to be written. What do you think? Which
> solution do you think is preferable?

Do attributes without content still get written?  Is that always
valid?  Not writing them isn't always valid; see eg bug 4125: current
code happily removes required attributes from <meta> tags, thus
breaking valid pages.
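
(To make sure we're talking about the same thing, here's the shape of the
order-preserving callback approach as I understand it.  The names are
hypothetical, not the actual fred verifier classes; the point is just that the
base class walks attributes in document order, so the LinkedHashMap comes out
in the original order, and a failed sanitise drops only that one attribute.)

    import java.util.LinkedHashMap;
    import java.util.Map;

    class TagVerifierSketch {
        interface AttributeSanitizer {
            // Returns the sanitised value, or throws to drop the attribute.
            String sanitize(String name, String value) throws IllegalArgumentException;
        }

        // Registered per-attribute sanitisers; attributes with no entry are never passed on.
        private final Map<String, AttributeSanitizer> sanitizers = new LinkedHashMap<>();

        // 'original' is assumed to preserve source order (e.g. a LinkedHashMap from the parser).
        Map<String, String> sanitizeAttributes(Map<String, String> original) {
            Map<String, String> out = new LinkedHashMap<>();
            for (Map.Entry<String, String> attr : original.entrySet()) {
                AttributeSanitizer s = sanitizers.get(attr.getKey());
                if (s == null) continue; // unknown attribute: not written out
                try {
                    out.put(attr.getKey(), s.sanitize(attr.getKey(), attr.getValue()));
                } catch (IllegalArgumentException e) {
                    // failed to sanitise: leave this attribute out of the final tag
                }
            }
            return out; // iteration order == original document order
        }
    }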

Depending how much cleaning of the HTML filtering system you want to
do...  Has using something like JTidy ( http://jtidy.sourceforge.net/
) been discussed?  That way you wouldn't have to worry about what's
valid or invalid HTML, merely the security aspects of valid HTML that
are unique to Freenet.

Evan Daniel



Re: [freenet-dev] Attribute reordering in HTML filter

2010-05-08 Thread Evan Daniel
On Sat, May 8, 2010 at 11:38 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Saturday 08 May 2010 05:09:07 Evan Daniel wrote:
 On Fri, May 7, 2010 at 11:43 PM, Spencer Jackson
 spencerandrewjack...@gmail.com wrote:
  On Fri, 2010-05-07 at 12:40 +0100, Matthew Toseland wrote:
  On Thursday 06 May 2010 20:40:03 Spencer Jackson wrote:
   Hi guys, just wanted to touch base. Anyway, I'm working on resolving bug
   number 3571( https://bugs.freenetproject.org/view.php?id=3571 ). To
   summarize, the filter tends to reorder attributes at semirandom when
   they get parsed. While the structure which holds the parsed attribute is
   a LinkedHashMap, meaning we should be able to stuff in values and pull
   them out in the same order, the put functions are called in the derived
   verifier's overrided sanitizeHash methods. These methods extract an
   attribute, sanitize it, then place it in the Map. The problem is, they
   are extracted out of the original order, meaning they get pulled out of
   the Map in the wrong order. To fix this, I created a callback object
   which the derived classes pass to the baseclass. The baseclass may then
   parse all of the attributes in order, invoking the callback to
   sanitize.If an attribute's contents fails to be processed, an exception
   may be thrown, so that the attribute will not be included in the final
   tag.
 
  It is important that only attributes that are explicitly parsed and 
  understood are passed on, and that it doesn't take extra per-sanitiser 
  work to achieve this. Will this be the case?
 
 
  Yeah, this should be the case.  Attributes which don't have a callback
  stored simply aren't parsed. I am starting, however, to think this
  approach might be overkill.  Here I have a different take:
  http://github.com/spencerjackson/fred-staging/tree/HTMLAttributeReorder
  Instead of running a callback in the base class, I simply create the
  attributes, in order, with null content. Then, in the overloaded methods
  on the child classes I repopulate them with the correct data. This
  preserves the original order of the attributes, while minimizing the
  amount of new code that needs to be written. What do you think? Which
  solution do you think is preferable?

 Do attributes without content still get written?  Is that always
 valid?  Not writing them isn't always valid; see eg bug 4125: current
 code happily removes required attributes from meta tags, thus
 breaking valid pages.

 Depending how much cleaning of the HTML filtering system you want to
 do...  Has using something like JTidy ( http://jtidy.sourceforge.net/
 ) been discussed?  That way you wouldn't have to worry about what's
 valid or invalid HTML, merely the security aspects of valid HTML that
 are unique to Freenet.

 IMHO sajack's solution is acceptable, you will have to just use null to 
 indicate no attribute and "" to indicate an attribute with no value? Or is 
 there a difference between attributes with an empty value and attributes with 
 no value?


It sounds fine to me, provided it doesn't take validating html and
make it stop validating.  Or at least does so no more than the current
code.

I'm asking what will happen when the attribute has null content
because the filter couldn't find anything to fill it with; does that
get written as <tag attribute=""> or <tag> or something else?
Whichever it is, do we know that the result will be valid html?

The current filter turns eg
<meta http-equiv="Content-type" content="application/xhtml+xml;charset=UTF-8" />
into
<meta />

The first is valid xhtml, the second is not.  Run the w3c validator
against my flog, both filtered and unfiltered, for details.  So, how
will the new filter handle cases like this, where filter code hasn't
been completely written for all relevant aspects?

Evan Daniel


Re: [freenet-dev] Attribute reordering in HTML filter

2010-05-08 Thread Evan Daniel
On Sat, May 8, 2010 at 9:35 PM, Spencer Jackson
spencerandrewjack...@gmail.com wrote:
 On Sat, May 8, 2010 at 10:38 AM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:

 On Saturday 08 May 2010 05:09:07 Evan Daniel wrote:
  On Fri, May 7, 2010 at 11:43 PM, Spencer Jackson
  spencerandrewjack...@gmail.com wrote:
   On Fri, 2010-05-07 at 12:40 +0100, Matthew Toseland wrote:
   On Thursday 06 May 2010 20:40:03 Spencer Jackson wrote:
Hi guys, just wanted to touch base. Anyway, I'm working on
resolving bug
number 3571( https://bugs.freenetproject.org/view.php?id=3571 ). To
summarize, the filter tends to reorder attributes at semirandom
when
they get parsed. While the structure which holds the parsed
attribute is
a LinkedHashMap, meaning we should be able to stuff in values and
pull
them out in the same order, the put functions are called in the
derived
verifier's overrided sanitizeHash methods. These methods extract an
attribute, sanitize it, then place it in the Map. The problem is,
they
are extracted out of the original order, meaning they get pulled
out of
the Map in the wrong order. To fix this, I created a callback
object
which the derived classes pass to the baseclass. The baseclass may
then
parse all of the attributes in order, invoking the callback to
sanitize.If an attribute's contents fails to be processed, an
exception
may be thrown, so that the attribute will not be included in the
final
tag.
  
   It is important that only attributes that are explicitly parsed and
   understood are passed on, and that it doesn't take extra per-sanitiser 
   work
   to achieve this. Will this be the case?
  
  
   Yeah, this should be the case.  Attributes which don't have a callback
   stored simply aren't parsed. I am starting, however, to think this
   approach might be overkill.  Here I have a different take:
  
   http://github.com/spencerjackson/fred-staging/tree/HTMLAttributeReorder
   Instead of running a callback in the base class, I simply create the
   attributes, in order, with null content. Then, in the overloaded
   methods
   on the child classes I repopulate them with the correct data. This
   preserves the original order of the attributes, while minimizing the
   amount of new code that needs to be written. What do you think? Which
   solution do you think is preferable?
 
  Do attributes without content still get written?  Is that always
  valid?  Not writing them isn't always valid; see eg bug 4125: current
  code happily removes required attributes from meta tags, thus
  breaking valid pages.


 Odd. I'm looking at the code for MetaTagVerifier, and I can't see any code
 branches in which, if the 'content' attribute is defined, it is failed to be
 added to the LinkedHashMap unless nothing else is added either... I'm not on
 my home computer, so I'll have to test this tomorrow. Does it happen to all
 meta tags? Oh. Do you mean, if there are no attributes, the tag will still
 exist, but be empty? I could alter MetaTagVerifier to return null if this is
 the case, and remove the tag from the final output. Would that fix this?

As mentioned in the other reply, the content filter alters my flog from
<meta http-equiv="Content-type" content="application/xhtml+xml;charset=UTF-8" />
to
<meta />

I haven't done a detailed analysis of why.



 
  Depending how much cleaning of the HTML filtering system you want to
  do...  Has using something like JTidy ( http://jtidy.sourceforge.net/
  ) been discussed?  That way you wouldn't have to worry about what's
  valid or invalid HTML, merely the security aspects of valid HTML that
  are unique to Freenet.

 That might be nice... but wouldn't we have the same problem in that it would
 be hard to diff the output of the filter against the input for debugging
 purposes? What do other people think about this? It would make life much
 easier...

I don't see why it would be a problem.  I haven't used tidy much,
honestly.  I don't see how to make it stop changing line breaks and
such in my page.  However, I don't mind running it locally before
inserting, so that nothing changes when the filter runs it.  I don't
need the filter to never change anything; I just need to know what to
do so that I can get a diff that shows only the changes made by the
filter.  If I need to run tidy on the original, and then diff that vs
the filtered output, that's fine by me.

And anything that makes the filtering more robust and less work is a
big win, imho.

Evan Daniel


Re: [freenet-dev] Attribute reordering in HTML filter

2010-05-07 Thread Evan Daniel
On Fri, May 7, 2010 at 11:43 PM, Spencer Jackson
spencerandrewjack...@gmail.com wrote:
 On Fri, 2010-05-07 at 12:40 +0100, Matthew Toseland wrote:
 On Thursday 06 May 2010 20:40:03 Spencer Jackson wrote:
  Hi guys, just wanted to touch base. Anyway, I'm working on resolving bug
  number 3571( https://bugs.freenetproject.org/view.php?id=3571 ). To
  summarize, the filter tends to reorder attributes at semirandom when
  they get parsed. While the structure which holds the parsed attribute is
  a LinkedHashMap, meaning we should be able to stuff in values and pull
  them out in the same order, the put functions are called in the derived
  verifier's overrided sanitizeHash methods. These methods extract an
  attribute, sanitize it, then place it in the Map. The problem is, they
  are extracted out of the original order, meaning they get pulled out of
  the Map in the wrong order. To fix this, I created a callback object
  which the derived classes pass to the baseclass. The baseclass may then
  parse all of the attributes in order, invoking the callback to
  sanitize.If an attribute's contents fails to be processed, an exception
  may be thrown, so that the attribute will not be included in the final
  tag.

 It is important that only attributes that are explicitly parsed and 
 understood are passed on, and that it doesn't take extra per-sanitiser work 
 to achieve this. Will this be the case?


 Yeah, this should be the case.  Attributes which don't have a callback
 stored simply aren't parsed. I am starting, however, to think this
 approach might be overkill.  Here I have a different take:
 http://github.com/spencerjackson/fred-staging/tree/HTMLAttributeReorder
 Instead of running a callback in the base class, I simply create the
 attributes, in order, with null content. Then, in the overloaded methods
 on the child classes I repopulate them with the correct data. This
 preserves the original order of the attributes, while minimizing the
 amount of new code that needs to be written. What do you think? Which
 solution do you think is preferable?

Do attributes without content still get written?  Is that always
valid?  Not writing them isn't always valid; see eg bug 4125: current
code happily removes required attributes from meta tags, thus
breaking valid pages.

Depending how much cleaning of the HTML filtering system you want to
do...  Has using something like JTidy ( http://jtidy.sourceforge.net/
) been discussed?  That way you wouldn't have to worry about what's
valid or invalid HTML, merely the security aspects of valid HTML that
are unique to Freenet.

Evan Daniel


[freenet-dev] Node churn estimates

2010-05-04 Thread Evan Daniel
On Tue, May 4, 2010 at 9:31 AM, Matthew Toseland wrote:
> On Thursday 18 February 2010 20:44:57 Evan Daniel wrote:
>> I've followed up my previous crude estimates of node churn with some
>> more detailed numbers.  (See my mail in re: "data persistence again"
>> on 20100122 for previous version and more detailed explanation.)
>>
>> Again, some brief caveats: the following basically assumes that all
>> samples are independent.  This is quite incorrect, because of time of
>> day effects.  Nonetheless, I think it's useful.  Many of the obvious
>> uses for this data ("If an insert is stored on 3 nodes, how likely is
>> it one of them will be online later?") are strongly impacted by this.
>> Use appropriate caution in analysis.  Also, I have a few missing
>> samples; for each sample, I looked at the previous set of 24 samples
>> that I did have, whether or not those were contiguous.
>>
>> What I did: for each of the probe request samples, I computed how many
>> nodes appeared in n of the previous 24 samples (24 samples at 5 hour
>> intervals is a 5 day window).  I then averaged these counts across
>> samples.  If an average sample has N_i nodes appearing in i of the
>> previous 24 samples, then the average sample size over those 24 is
>> sum(N_i*(i/24)).  Over the 387 samples (ignoring the first 23 where
>> there aren't a "most recent 24 samples"), I have an average sample
>> size of 5757.1 nodes.  If we assume that each node is online with
>> probability i/24, and all nodes are independent (see previous caveat
>> about this assumption being incorrect), then the number of nodes that
>> are online in both of two different sampling intervals is
>> sum(N_i*(i/24)^2).  For this number, I get 3511.5 nodes.  That is, if
>> you select a random online node at some time t_1, the odds that it
>> will be online at some later time t_2 are about 0.610.
>>
>> I then repeated the above using the most recent 72 samples (15 days).
>> The distributions were roughly similar.  Average sample size was
>> 5824.1, expected nodes online in both of two samples is 3106.8, or a
>> probability of 0.533 that a randomly chosen node will be online later.
>>
>> Nodes online in 24 of 24 samples make up 21.9% of an average sample.
>> Nodes online in 70, 71, or 72 samples make up 13.6%.  Low-uptime nodes
>> (< 40% according to sink logic; here taken as <= 9 samples of 24 or <=
>> 27 of 72 (to make the 24/72 numbers directly comparable)) are 30.8% on
>> the 24-sample data, and 37.7% on the 72-sample data.  I believe both
>> of these discrepancies result from join/leave churn, whether permanent
>> or over medium time periods (ie users who use Freenet for a couple
>> hours or days every few weeks).
>>
>> Evan Daniel
>>
>> (If you want the full spreadsheet or raw data, ask.  The spreadsheet
>> was nearly 0.5 MiB, so I didn't attach it.  The averaged counts are
>> below; this is enough to reproduce my calculations assuming samples
>> are independent.)
>>
> Some more analysis on this:
>
> [14:24:50] <evanbd> toad_: 5757 nodes online in an average sample.  Taking 
> high uptime as 23 or 24 samples, low uptime as 1-9 samples, and medium as 
> 10-22...
> [14:25:52] <toad_> evanbd: the other question of course is how much 
> redundancy can we get away with before it starts to be a problem ... that 
> sort of depends on MHKs though
> [14:25:56] <evanbd> toad_: The high uptime group is 1505 nodes (1258 in 
> 24/24).  They have an average uptime of 99.3%.
> [14:26:23] <evanbd> toad_: The medium uptime group is 2478 nodes; they have 
> an average uptime of 65%.
> [14:26:25] <toad_> if we don't have MHKs, the top block will always be 
> grossly unreliable ...
> [14:26:38] <toad_> evanbd: this is by nodes typically online ?
> [14:26:47] <evanbd> toad_: And the low uptime group is 1774 nodes, with 
> average uptime 22.9%.
> [14:27:51] <toad_> evanbd: okay, and this is by nodes online at an instant?
> [14:28:09] <evanbd> toad_: This is: Choose a random sample; choose a random 
> node online in that sample.  It will be a medium-uptime node with probability 
> 2478/5757 (= 0.430).  On average, its uptime will be 65%.
> [14:28:17] <evanbd> toad_: (In other words, yes)
> [14:28:31] <toad_> this is much better than i had expected
> [14:28:47] <evanbd> Well, by definition their uptime is > 40% :)
> [14:28:59] <toad_> so 26% have 99% uptime, 43% have 65% uptime, and 31% have 
> 23% uptime
> [14:29:18] <toad_> right, but it means that nearly 70% of nodes online at any 
> given time have 65%+ uptime
> [14:29:32] <toad_> i.e. we are *not* swamped with low uptime nodes
> [14:30:01] <toad_> at least if we consider a week ... this doesn't answer the 
> question of try-it-and-leave

Re: [freenet-dev] Node churn estimates

2010-05-04 Thread Evan Daniel
On Tue, May 4, 2010 at 9:31 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Thursday 18 February 2010 20:44:57 Evan Daniel wrote:
 I've followed up my previous crude estimates of node churn with some
 more detailed numbers.  (See my mail in re: data persistence again
 on 20100122 for previous version and more detailed explanation.)

 Again, some brief caveats: the following basically assumes that all
 samples are independent.  This is quite incorrect, because of time of
 day effects.  Nonetheless, I think it's useful.  Many of the obvious
 uses for this data (If an insert is stored on 3 nodes, how likely is
 it one of them will be online later?) are strongly impacted by this.
 Use appropriate caution in analysis.  Also, I have a few missing
 samples; for each sample, I looked at the previous set of 24 samples
 that I did have, whether or not those were contiguous.

 What I did: for each of the probe request samples, I computed how many
 nodes appeared in n of the previous 24 samples (24 samples at 5 hour
 intervals is a 5 day window).  I then averaged these counts across
 samples.  If an average sample has N_i nodes appearing in i of the
 previous 24 samples, then the average sample size over those 24 is
 sum(N_i*(i/24)).  Over the 387 samples (ignoring the first 23 where
 there aren't a most recent 24 samples), I have an average sample
 size of 5757.1 nodes.  If we assume that each node is online with
 probability i/24, and all nodes are independent (see previous caveat
 about this assumption being incorrect), then the number of nodes that
 are online in both of two different sampling intervals is
 sum(N_i*(i/24)^2).  For this number, I get 3511.5 nodes.  That is, if
 you select a random online node at some time t_1, the odds that it
 will be online at some later time t_2 are about 0.610.

 I then repeated the above using the most recent 72 samples (15 days).
 The distributions were roughly similar.  Average sample size was
 5824.1, expected nodes online in both of two samples is 3106.8, or a
 probability of 0.533 that a randomly chosen node will be online later.

 Nodes online in 24 of 24 samples make up 21.9% of an average sample.
 Nodes online in 70, 71, or 72 samples make up 13.6%.  Low-uptime nodes
 (< 40% according to sink logic; here taken as <= 9 samples of 24 or <=
 27 of 72 (to make the 24/72 numbers directly comparable)) are 30.8% on
 the 24-sample data, and 37.7% on the 72-sample data.  I believe both
 of these discrepancies result from join/leave churn, whether permanent
 or over medium time periods (ie users who use Freenet for a couple
 hours or days every few weeks).

 Evan Daniel

 (If you want the full spreadsheet or raw data, ask.  The spreadsheet
 was nearly 0.5 MiB, so I didn't attach it.  The averaged counts are
 below; this is enough to reproduce my calculations assuming samples
 are independent.)

 Some more analysis on this:

 [14:24:50] evanbd toad_: 5757 nodes online in an average sample.  Taking 
 high uptime as 23 or 24 samples, low uptime as 1-9 samples, and medium as 
 10-22...
 [14:25:52] toad_ evanbd: the other question of course is how much 
 redundancy can we get away with before it starts to be a problem ... that 
 sort of depends on MHKs though
 [14:25:56] evanbd toad_: The high uptime group is 1505 nodes (1258 in 
 24/24).  They have an average uptime of 99.3%.
 [14:26:23] evanbd toad_: The medium uptime group is 2478 nodes; they have 
 an average uptime of 65%.
 [14:26:25] toad_ if we don't have MHKs, the top block will always be 
 grossly unreliable ...
 [14:26:38] toad_ evanbd: this is by nodes typically online ?
 [14:26:47] evanbd toad_: And the low uptime group is 1774 nodes, with 
 average uptime 22.9%.
 [14:27:51] toad_ evanbd: okay, and this is by nodes online at an instant?
 [14:28:09] evanbd toad_: This is: Choose a random sample; choose a random 
 node online in that sample.  It will be a medium-uptime node with probability 
 2478/5757 (= 0.430).  On average, its uptime will be 65%.
 [14:28:17] evanbd toad_: (In other words, yes)
 [14:28:31] toad_ this is much better than i had expected
 [14:28:47] evanbd Well, by definition their uptime is > 40% :)
 [14:28:59] toad_ so 26% have 99% uptime, 43% have 65% uptime, and 31% have 
 23% uptime
 [14:29:18] toad_ right, but it means that nearly 70% of nodes online at any 
 given time have 65%+ uptime
 [14:29:32] toad_ i.e. we are *not* swamped with low uptime nodes
 [14:30:01] toad_ at least if we consider a week ... this doesn't answer the 
 question of try-it-and-leave


(09:31:32 AM) evanbd: toad_: No...  44% have uptime over 70% :)
(09:31:34 AM) toad_: evanbd: i've posted what you just said to devl
(09:31:52 AM) toad_: evanbd: ah :
(09:32:09 AM) toad_: yeah, i see ...
(09:32:10 AM) evanbd: toad_: 26% have uptime between 40% and 70%
(09:32:32 AM) toad_: right, 70% have uptime >40% :|

Also, note that the above numbers are based on the same data set as
the original email: that is, they're not current.
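
(If anyone wants to reproduce the arithmetic from the quoted mail, it's just
two weighted sums over the N_i counts.  A sketch follows; the counts are
placeholders, not the real averaged data.)

    // Average sample size and repeat-online probability from the N_i counts.
    public class ChurnEstimate {
        public static void main(String[] args) {
            // counts[i] = average number of nodes seen in exactly i of the previous 24 samples.
            double[] counts = new double[25];
            counts[24] = 1258; // placeholder values only
            counts[12] = 900;
            counts[3] = 1500;

            double avgSampleSize = 0, onlineInBoth = 0;
            for (int i = 1; i <= 24; i++) {
                double p = i / 24.0;               // estimated uptime of that group
                avgSampleSize += counts[i] * p;    // sum(N_i*(i/24))
                onlineInBoth += counts[i] * p * p; // sum(N_i*(i/24)^2)
            }
            System.out.printf("average sample size: %.1f%n", avgSampleSize);
            System.out.printf("P(randomly chosen online node is online later): %.3f%n",
                    onlineInBoth / avgSampleSize);
        }
    }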

Evan

[freenet-dev] Bug tracker problems

2010-04-21 Thread Evan Daniel
On Wed, Apr 21, 2010 at 8:46 AM, Matthew Toseland wrote:
> I have to switch Mantis from packaged to directly installed:
> - Mantis 1.2.0 contains critical bug fixes including remote admin and 
> cross-site scripting vulnerabilities capable of capturing plaintext passwords.
> - Mantis 1.1 is officially unmaintained.
> - Mantis does not appear to ask for CVE's, so the issues are not taken 
> seriously by Debian and therefore by Ubuntu.
> - The package in Ubuntu is Mantis 1.1.8.
> - Ubuntu and Debian have not patched these issues. There are no bugs filed 
> for them either.
>
> Plus, Mantis is written in php, which has had many vulnerabilities and is 
> likely to continue having many vulnerabilities, at least in nextgens' view. 
> However half of the web is written in php and presumably the distributions do 
> deal with such vulnerabilities promptly.
>
> Last time I checked there were many options for third party hosting of 
> mantis, including upgrading it for us, unfortunately none of them (certainly 
> none of the free ones) would allow us to import our existing bugs.
>
> A related point is that only a relatively small proportion of users actually 
> report bugs on the bug tracker. However, closing it off would increase the 
> barrier to entry for new developers.
>
> As I see it our options are:
> - Keep Mantis, install it and upgrade it by hand.
> - Keep Mantis and restrict its use to registered developers.
> - Switch to something else.
>
> Most likely we will stick to the first option.

I think there is an important set of power users who like to look at
bug trackers and might occasionally report a bug there, and might one
day become devs (and are certainly useful to have around regardless!).
 So I think making the bug tracker inaccessible to them is very bad.

Switching to something else is a pain, and we haven't managed to agree
on what to switch to before.

So I like the first option, for now.  We can revisit as needed.

Evan Daniel



Re: [freenet-dev] Bug tracker problems

2010-04-21 Thread Evan Daniel
On Wed, Apr 21, 2010 at 8:46 AM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 I have to switch Mantis from packaged to directly installed:
 - Mantis 1.2.0 contains critical bug fixes including remote admin and 
 cross-site scripting vulnerabilities capable of capturing plaintext passwords.
 - Mantis 1.1 is officially unmaintained.
 - Mantis does not appear to ask for CVE's, so the issues are not taken 
 seriously by Debian and therefore by Ubuntu.
 - The package in Ubuntu is Mantis 1.1.8.
 - Ubuntu and Debian have not patched these issues. There are no bugs filed 
 for them either.

 Plus, Mantis is written in php, which has had many vulnerabilities and is 
 likely to continue having many vulnerabilities, at least in nextgens' view. 
 However half of the web is written in php and presumably the distributions do 
 deal with such vulnerabilities promptly.

 Last time I checked there were many options for third party hosting of 
 mantis, including upgrading it for us, unfortunately none of them (certainly 
 none of the free ones) would allow us to import our existing bugs.

 A related point is that only a relatively small proportion of users actually 
 report bugs on the bug tracker. However, closing it off would increase the 
 barrier to entry for new developers.

 As I see it our options are:
 - Keep Mantis, install it and upgrade it by hand.
 - Keep Mantis and restrict its use to registered developers.
 - Switch to something else.

 Most likely we will stick to the first option.

I think there is an important set of power users who like to look at
bug trackers and might occasionally report a bug there, and might one
day become devs (and are certainly useful to have around regardless!).
 So I think making the bug tracker inaccessible to them is very bad.

Switching to something else is a pain, and we haven't managed to agree
on what to switch to before.

So I like the first option, for now.  We can revisit as needed.

Evan Daniel


[freenet-dev] FEC and memory usage

2010-04-14 Thread Evan Daniel
I've been investigating potential improvements to our FEC encoding in
my spare time (in particular the use of LDPC codes and their
relatives, but the following is generally applicable).  I'd like to
ask for opinions on what assumptions I should be making about
acceptable levels of CPU time, memory usage, and disk usage.

We care both about how well our FEC codes work, and how fast they are.
 How well they work is a surprisingly nuanced question, but for this
we can assume it's completely described by what block-level loss rate
the code can recover from, for a specified file size and success rate.
 As I see it, there are four fundamental metrics: disk space usage,
disk io (in particular seeks), ram usage, and CPU usage.  Different
FEC schemes have different characteristics, and allow different
tradeoffs to be made.

Our current FEC (simple segments, with Reed-Solomon encoding of each
segment) does very well on the disk performance.  I haven't examined
what it actually does, but it could be made to alternately read and
write sequential 4 MiB blocks, making one pass over the whole file,
without needing any additional space; this is as good possible.  It
does fairly well on memory usage: it needs to hold a whole 4 MiB
segment in ram at a time, plus a small amount of overhead for lookup
tables and Vandermonde matrices and such.  CPU performance is poor:
decoding each block requires operations on the entire segment, and
those operations are table lookups rather than simple math.  (Decode
CPU cost is O(n^2) with segment size, and our segments are big enough
that this is relevant.)

Other schemes will likely make different tradeoffs.  A naive LDPC
implementation will use very little RAM and CPU, but do lots of disk
seeks and need space to store (potentially) all data and check blocks
of the file on disk during that time (that is, double the file size,
where the current RS codes only need space equal to the final file
size).  However, it also allows ways to trade off more memory usage
and CPU time for less disk io and (I think) less disk space usage.  An
interleaved segments code based on RS codes (like the CIRC code used
on CDs) would be worse than our current scheme (equivalent memory
usage, poor CPU performance, slightly more disk space required, a
moderate number of disk seeks required).  (Both LDPC and interleaved
segments are more effective than our current scheme for large files.)
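
(To illustrate why a naive LDPC decoder is cheap on CPU and RAM but hard on the
disk, here's the core of a peeling decoder as a sketch.  The block layout,
check equations, and in-memory byte arrays are placeholder assumptions; in a
disk-backed version every block read in the inner loop is potentially a seek.)

    import java.util.List;

    // Naive LDPC "peeling" decode: repeatedly find a check equation with exactly
    // one missing block and recover that block by XORing the others together.
    public class PeelingDecoderSketch {
        static final int BLOCK_SIZE = 32 * 1024;

        // blocks[i] == null means block i is missing; each check is a list of block
        // indices whose XOR is zero.  Both are placeholders for the real structures.
        public static void decode(byte[][] blocks, List<int[]> checks) {
            boolean progress = true;
            while (progress) {
                progress = false;
                for (int[] eq : checks) {
                    int missing = -1, missingCount = 0;
                    for (int idx : eq) {
                        if (blocks[idx] == null) { missing = idx; missingCount++; }
                    }
                    if (missingCount != 1) continue; // solvable only with exactly one unknown
                    byte[] recovered = new byte[BLOCK_SIZE];
                    for (int idx : eq) {
                        if (idx == missing) continue;
                        // On disk, each of these reads is a seek -- that's the cost
                        // the RAM/CPU savings trade against.
                        for (int b = 0; b < BLOCK_SIZE; b++) recovered[b] ^= blocks[idx][b];
                    }
                    blocks[missing] = recovered;
                    progress = true;
                }
            }
        }
    }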

So, given that the tradeoffs will be complex, and that the decoder is
likely to have some flexibility (eg more memory usage for fewer
seeks), what baseline assumptions about these should I be making?  Do
we care more about reducing the number of seeks, even if it has an
increased cost in memory usage or CPU time?  How much memory is it
safe to assume will always be available?  Is it ok to need disk space
beyond the file size?  What if avoiding that has a significant cost in
CPU time?

I realize these are fairly vague questions; vague and opinion-based
answers are certainly welcome.  Hopefully it won't be too long before I
can toss some example numbers into the discussion.

Evan Daniel



[freenet-dev] FEC and memory usage

2010-04-14 Thread Evan Daniel
I've been investigating potential improvements to our FEC encoding in
my spare time (in particular the use of LDPC codes and their
relatives, but the following is generally applicable).  I'd like to
ask for opinions on what assumptions I should be making about
acceptable levels of CPU time, memory usage, and disk usage.

We care both about how well our FEC codes work, and how fast they are.
 How well they work is a surprisingly nuanced question, but for this
we can assume it's completely described by what block-level loss rate
the code can recover from, for a specified file size and success rate.
 As I see it, there are four fundamental metrics: disk space usage,
disk io (in particular seeks), ram usage, and CPU usage.  Different
FEC schemes have different characteristics, and allow different
tradeoffs to be made.

Our current FEC (simple segments, with Reed-Solomon encoding of each
segment) does very well on the disk performance.  I haven't examined
what it actually does, but it could be made to alternately read and
write sequential 4 MiB blocks, making one pass over the whole file,
without needing any additional space; this is as good as possible.  It
does fairly well on memory usage: it needs to hold a whole 4 MiB
segment in ram at a time, plus a small amount of overhead for lookup
tables and Vandermonde matrices and such.  CPU performance is poor:
decoding each block requires operations on the entire segment, and
those operations are table lookups rather than simple math.  (Decode
CPU cost is O(n^2) with segment size, and our segments are big enough
that this is relevant.)

Other schemes will likely make different tradeoffs.  A naive LDPC
implementation will use very little RAM and CPU, but do lots of disk
seeks and need space to store (potentially) all data and check blocks
of the file on disk during that time (that is, double the file size,
where the current RS codes only need space equal to the final file
size).  However, it also allows ways to trade off more memory usage
and CPU time for less disk io and (I think) less disk space usage.  An
interleaved segments code based on RS codes (like the CIRC code used
on CDs) would be worse than our current scheme (equivalent memory
usage, poor CPU performance, slightly more disk space required, a
moderate number of disk seeks required).  (Both LDPC and interleaved
segments are more effective than our current scheme for large files.)

So, given that the tradeoffs will be complex, and that the decoder is
likely to have some flexibility (eg more memory usage for fewer
seeks), what baseline assumptions about these should I be making?  Do
we care more about reducing the number of seeks, even if it has an
increased cost in memory usage or CPU time?  How much memory is it
safe to assume will always be available?  Is it ok to need disk space
beyond the file size?  What if avoiding that has a significant cost in
CPU time?

I realize these are fairly vague questions; vague and opinion-based
answers are certainly welcome.  Hopefully it won't be too long before I
can toss some example numbers into the discussion.

Evan Daniel


[freenet-dev] [freenet-chat] Add frost page to Freenet Default Bookmarks

2010-04-09 Thread Evan Daniel
On Fri, Apr 9, 2010 at 12:33 PM, Matthew Toseland wrote:
> On Friday 09 April 2010 09:08:18 artur wrote:
>> Hi,
>>
>> Am 08.04.2010 17:42, schrieb Evan Daniel:
>> >> On Thu, Apr 8, 2010 at 11:27 AM, Matthew Toseland wrote:
>> >> On Tuesday 06 April 2010 18:17:33 artur wrote:
>> ...
>> >>
>> >> On the other hand, Frost is broken by design, Freetalk will be integrated 
>> >> in the node soon (how soon nobody knows), and if we put it back on the 
>> >> homepage the spammer may come out of the woodwork.
>> >>
>> >> Anyone else have an opinion?
>>
>> Ok, Frost is spamable (like nearly every other communication system in
>> the internet). So, I would not call this "broken by design", but I know
>> which problems the spammer caused for frost.
>
> In Freenet terms, spammable is broken by design. This is not people 
> advertising black market pharmaceuticals. This is a deliberate and effective 
> attempt to make the system completely unusable, at least on target boards. 
> And it can be done anonymously, so the classic countermeasures of 
> blacklisting IP addresses etc don't work.
>>
>> > I think if we link to it, we should support it, at least to a point.
>> > I'd rather we weren't. ?But, we seem to be doing that regardless,
>> > so... ?OTOH, I think we should have a messaging system of some sort,
>> > and that isn't yet Freetalk. ?And I don't know whether it's better to
>> > link to a messaging system that's so spammed it's unusable, or link to
>> > nothing.
>>
>> Do you support Freemail? Freesite? FMS?
>> Are you in one way or another connected to the content published on the
>> various index pages, linked in the default bookmarks?
>> I think the freenet authors do not want to associate them self with what
>> is on that pages. The way a search engine does not account them self
>> responsible for their search results...
>
> We are talking about software here. And no, we don't link directly to 
> questionable material - we link to index sites e.g. that make it easy for 
> users to find what they want to find.

IMHO, if we link to the software directly, we will have users coming
to IRC asking for support.  I think that is independent of whether we
are "officially responsible" or what disclaimers we attach.  When a
user shows up with a problem, I don't want to tell them I won't help
them.  I also don't want to support Frost.  (But, as I said, I also
don't want to not have any messaging system at all.  I'm conflicted on
the matter :/ )

>>
>> Frost is a good tool for Freenet. Without Frost, Freenet would have had
>> a lot of less active users, an so it would have today.
>> It has been the main communication tool within Freenet for years.
>>
>> Today there is a strong alternative with FMS, but I could argue that FMS
>> is brogen by design as well. When Freenet is all about anti censorship,
>> FMS is the tool to bring it back. I don't want to say it is bad, but it
>> has its own disadvantages.
>
> Freetalk and FMS both use a distributed reputation system supporting 
> "negative trust". This makes it possible to block spam very effectively, 
> because a new user who posts spam can be blocked by a few people who see it 
> and then nobody will see the new user any more. There are alternatives that 
> may be more acceptable, and implementing these will not be difficult - it's 
> just not a priority for anyone actually working on this stuff at the moment. 
> The main alternative is to have a "positive trust" system. This would mean 
> that new users don't show up at all until the user has gained some trust from 
> others, so we would need either new users to show up to everyone (meaning if 
> a spammer is creating new identities to spam they have to be blocked one at a 
> time by *each user*), or that they would show up to some subset of everyone - 
> e.g. maybe the people whose captchas they solve and those who trust them.
>>
>> Frost is also a download manager.
>
> I was under the impression that the Frost download system was DoS'ed at the 
> moment, i.e. out of action due to exploitation of the fact that it is 
> fundamentally broken.

Frost has three main uses: a forums / message board system (spammable,
and therefore sometimes completely unusable); a filesharing / search
system (also spammable and sometimes unusable); and a nice UI to the
download / upload queue, that can be used entirely by entering keys to
fetch without either of the other two systems active.  So it's useful
even when the main features are completely spammed, *provided* the
user understands the software a

Re: [freenet-dev] [freenet-chat] Add frost page to Freenet Default Bookmarks

2010-04-09 Thread Evan Daniel
On Fri, Apr 9, 2010 at 12:33 PM, Matthew Toseland
t...@amphibian.dyndns.org wrote:
 On Friday 09 April 2010 09:08:18 artur wrote:
 Hi,

 Am 08.04.2010 17:42, schrieb Evan Daniel:
  On Thu, Apr 8, 2010 at 11:27 AM, Matthew Toseland
  t...@amphibian.dyndns.org  wrote:
  On Tuesday 06 April 2010 18:17:33 artur wrote:
 ...
 
  On the other hand, Frost is broken by design, Freetalk will be integrated 
  in the node soon (how soon nobody knows), and if we put it back on the 
  homepage the spammer may come out of the woodwork.
 
  Anyone else have an opinion?

 Ok, Frost is spamable (like nearly every other communication system in
 the internet). So, I would not call this broken by design, but I know
 which problems the spammer caused for frost.

 In Freenet terms, spammable is broken by design. This is not people 
 advertising black market pharmaceuticals. This is a deliberate and effective 
 attempt to make the system completely unusable, at least on target boards. 
 And it can be done anonymously, so the classic countermeasures of 
 blacklisting IP addresses etc don't work.

  I think if we link to it, we should support it, at least to a point.
  I'd rather we weren't.  But, we seem to be doing that regardless,
  so...  OTOH, I think we should have a messaging system of some sort,
  and that isn't yet Freetalk.  And I don't know whether it's better to
  link to a messaging system that's so spammed it's unusable, or link to
  nothing.

 Do you support Freemail? Freesite? FMS?
 Are you in one way or another connected to the content published on the
 various index pages, linked in the default bookmarks?
 I think the freenet authors do not want to associate themselves with what
 is on those pages. The way a search engine does not hold itself
 responsible for its search results...

 We are talking about software here. And no, we don't link directly to 
 questionable material - we link to index sites e.g. that make it easy for 
 users to find what they want to find.

IMHO, if we link to the software directly, we will have users coming
to IRC asking for support.  I think that is independent of whether we
are officially responsible or what disclaimers we attach.  When a
user shows up with a problem, I don't want to tell them I won't help
them.  I also don't want to support Frost.  (But, as I said, I also
don't want to not have any messaging system at all.  I'm conflicted on
the matter :/ )


 Frost is a good tool for Freenet. Without Frost, Freenet would have had
 a lot fewer active users, and so it would today.
 It has been the main communication tool within Freenet for years.

 Today there is a strong alternative with FMS, but I could argue that FMS
 is broken by design as well. When Freenet is all about anti-censorship,
 FMS is the tool to bring it back. I don't want to say it is bad, but it
 has its own disadvantages.

 Freetalk and FMS both use a distributed reputation system supporting 
 negative trust. This makes it possible to block spam very effectively, 
 because a new user who posts spam can be blocked by a few people who see it 
 and then nobody will see the new user any more. There are alternatives that 
 may be more acceptable, and implementing these will not be difficult - it's 
 just not a priority for anyone actually working on this stuff at the moment. 
 The main alternative is to have a positive trust system. This would mean 
 that new users don't show up at all until the user has gained some trust from 
 others, so we would need either new users to show up to everyone (meaning if 
 a spammer is creating new identities to spam they have to be blocked one at a 
 time by *each user*), or that they would show up to some subset of everyone - 
 e.g. maybe the people whose captchas they solve and those who trust them.

 Frost is also a download manager.

 I was under the impression that the Frost download system was DoS'ed at the 
 moment, i.e. out of action due to exploitation of the fact that it is 
 fundamentally broken.

Frost has three main uses: a forums / message board system (spammable,
and therefore sometimes completely unusable); a filesharing / search
system (also spammable and sometimes unusable); and a nice UI to the
download / upload queue, that can be used entirely by entering keys to
fetch without either of the other two systems active.  So it's useful
even when the main features are completely spammed, *provided* the
user understands the software and the problems fairly well.


 Fuqid might do a better job as a stand
 alone tool, but it is not cross platform, has never been really Freenet
 0.7 compatible and its development has been abandoned.

 So use Thaw. It's a perfectly good download manager, even if you don't find 
 the indexes easy to deal with.

 (Just in comparison: Frost has had 17 Commits last month.)
 And a good download tool is wanted by the community (1)

 There is no such thing at the moment, sadly. Frost certainly isn't it. Thaw 
 isn't it. Maybe we will have

[freenet-dev] [freenet-chat] Add frost page to Freenet Default Bookmarks

2010-04-08 Thread Evan Daniel
On Thu, Apr 8, 2010 at 11:27 AM, Matthew Toseland
 wrote:
> On Tuesday 06 April 2010 18:17:33 artur wrote:
>> Hi,
>>
>> I have noticed that Frost is no longer in the default bookmark list of
>> Freenet.
>>
>> I think it should be because:
>> - Many people are using Frost to communicate.
>> - It is much easier and faster to set up than FMS. Freenet has never
>> been the fastest system, but it takes a long time to setup FMS and get
>> announced. While you can just fire up frost and get started.
>> - Frost has been a part of Freenet for a very long time now, it is
>> widely spread and tested. But new users do not know all the alternatives
>> of communication in Freenet. They have a short look at what is there, try
>> it, and if it does not work most of them will leave again. If there is
>> an alternative, they might have a second try...
>
> On the other hand, Frost is broken by design, Freetalk will be integrated in 
> the node soon (how soon nobody knows), and if we put it back on the homepage 
> the spammer may come out of the woodwork.
>
> Anyone else have an opinion?

I think if we link to it, we should support it, at least to a point.
I'd rather we weren't.  But, we seem to be doing that regardless,
so...  OTOH, I think we should have a messaging system of some sort,
and that isn't yet Freetalk.  And I don't know whether it's better to
link to a messaging system that's so spammed it's unusable, or link to
nothing.

I guess I don't actually have a strong opinion on the matter.  Slight
vote for not having it on the list.

Evan Daniel



[freenet-dev] [GSoC 2010] Improving Content Filters

2010-04-08 Thread Evan Daniel
Ogg container
> format. This is technically interesting, as it encapsulates other types of
> data. A generic Ogg parser will be written, which will need to validate the
> Ogg container, identify the bitstreams it contains, identify the codec used
> inside these bitstreams, and process the streams using a second (or nth,
> really, depending on how many bitstreams are in the container) codec
> specific filter. It should be possible to use this filter to either filter
> just the beginning of the file, or the whole thing. This will make it
> possible to preview a partially downloaded file, at some point in the
> future. Some things which will need to be taken into consideration are the
> possibility of Ogg pages being concealed inside of other Ogg pages. This
> will be checked for, and a fatal error will be raised if it occurs.
>
> The Ogg codecs which I will initially add support for are, in order, Vorbis,
> Theora, and FLAC.
>
>
> More content filters
>
> The more filters the better. In the time remaining, I will implement as many
> different possible content filters. While this step is very important, these
> codecs individually are of a lower priority than previous steps. I will
> implement ATOM/RSS, mp3, and the rudiments of pdf.
>
>
>
> Milestones
> Here are clear milestones which may be used to evaluate my performance. The
> following are a list of these goals which should be met to signify
> completion, along with very rough estimates as to how long each step should
> take:
>
> *Stream based filters (3 days)
> *Filters are moved to the client layer, with (disableable) support for
> filtering files going to the hard drive, and inserts (9 days)
> *Filters can be tested on data, without inserting it into the network (3
> days)
> *Compressors can be interacted with through streams (4 days)
> *An Ogg content filter is implemented, supporting the following codecs: (3
> days)
>  - The Vorbis codec (2 days)
>  - The Theora codec (2 days)
>  - The FLAC codec (2 days)
> *Content filters for ATOM/RSS are implemented (5 days)
> *A content filter for MP3 is implemented (6 days)
> *A basic content filter for pdf is implemented (Remaining time)
>
>
>
> Biography
> I initially became interested in Freenet because I am something of a
> cypherpunk, in that I believe the ability to hold pseudonymous discourse to
> be a major cornerstone of free speech and the free flow of information. I've
> skulked around Freenet occasionally, even helping pre-alpha test version
> 0.7. But I'd like to do more. I want to put my time and energy where my
> mouth is and spend my summer making the world, in some small way, safer for
> freedom.
> Starry-eyed idealism aside, I am an 18 year old American high school senior,
> who will be studying Computer Science after I graduate. While C/C++ is my
> 'first language', so to speak, I am also fluent in Java and Python. Last
> year, I personally rewrote my high school's web page in Python and Django.
> This year, I've been working on an editor for Model United Nations
> resolutions, as time permits. This project is licensed under the GPLv3, and
> is available on GitHub, at http://github.com/spencerjackson/resolute. It's
> written in C++, and uses GTKmm for the GUI.
>
>
> ___
> Devl mailing list
> Devl at freenetproject.org
> http://osprey.vm.bytemark.co.uk/cgi-bin/mailman/listinfo/devl
>

IMHO this looks good.

My one concern is that your suggested timeline looks aggressive.  It
looks to me more like a timeline for writing the code, as opposed to a
timeline for writing the code, documenting it, writing unit tests, and
debugging it.  I know that writing copious documentation and unit
tests as we go isn't how Freenet normally does things, but it would be
nice to improve on that standard :)  I think adding a day's worth of
documentation and unit tests after each of your listed steps would
make a meaningful improvement to the resultant body of work.  Of
course, others might disagree, and it's not a big concern.  Like I
said, this looks good.
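
As a rough illustration of the per-page validation the Ogg container
filter will need to do (field layout per the Ogg page header spec;
hypothetical code, not part of the proposal):

    class OggPageCheck {
        // Validate one Ogg page header in buf starting at off and return the total
        // page length (27-byte header + lacing table + body), or -1 if invalid.
        static int checkOggPage(byte[] buf, int off) {
            if (buf.length - off < 27) return -1;                 // need the fixed header
            if (buf[off] != 'O' || buf[off + 1] != 'g'
                    || buf[off + 2] != 'g' || buf[off + 3] != 'S') return -1;  // capture pattern
            if (buf[off + 4] != 0) return -1;                     // stream structure version must be 0
            if ((buf[off + 5] & 0xf8) != 0) return -1;            // only continuation/BOS/EOS flags allowed
            int segments = buf[off + 26] & 0xff;                  // number of lacing values
            if (buf.length - off < 27 + segments) return -1;
            int bodyLength = 0;
            for (int i = 0; i < segments; i++) {
                bodyLength += buf[off + 27 + i] & 0xff;           // lacing values give the body size
            }
            return 27 + segments + bodyLength;                    // offset of the next page
        }
    }

A real filter would also verify the page CRC and check that serial and
sequence numbers stay consistent from page to page.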

Evan Daniel



[freenet-dev] License on wiki documentation

2010-04-06 Thread Evan Daniel
On Tue, Apr 6, 2010 at 9:21 AM, Ximin Luo  wrote:
> atm the wiki content is licensed with GFDL. Do we want to relicense it as
> CC-BY-SA (attribution+sharealike) instead? If so, we should do this while the
> wiki is still young.

Yes, we do.

Evan Daniel



[freenet-dev] Priority of fixing bulk filesharing performance/disk issues? was Re: Another way Freenet sucks for filesharing was Re: [freenet-support] major problems - stuck at 100%, nonresponsive

2010-04-03 Thread Evan Daniel
On Sat, Apr 3, 2010 at 6:01 PM, Matthew Toseland
 wrote:
> On Friday 02 April 2010 17:43:25 Evan Daniel wrote:
>> On Fri, Apr 2, 2010 at 12:39 PM, Matthew Toseland
>>  wrote:
>> > On Friday 02 April 2010 17:31:13 Matthew Toseland wrote:
>> >> On Tuesday 09 March 2010 04:27:24 Evan Daniel wrote:
>> >> > You should really send these to the support list; that's what it's for.
>> >> >
>> >> > You can change the physical security level setting independently of
>> >> > the network seclevels -- see configuration -> security levels.
>> >> >
>> >> > I'm not sure what else to suggest at this point.  You could try
>> >> > increasing the amount of ram for temp buckets (configuration -> core
>> >> > settings), but that's mostly a stab in the dark.
>> >> >
>> >> > I suspect you need to reduce the amount of stuff in your queue.
>> >>
>> >> Thanks Evan for helping Daniel. In theory it ought to be possible to have 
>> >> a nearly unlimited number of downloads in the queue: That is precisely 
>> >> why we decided to use a database to store the progress of downloads. 
>> >> Unfortunately, in practice, disks are slow, and the more stuff is queued, 
>> >> the less of it will be cached in RAM i.e. the more reliant we are on slow 
>> >> disks.
>> >>
>> >> There are many options for optimising the code so that it uses the disk 
>> >> less. But unfortunately they are all a significant amount of work.
>> >>
>> >> See https://bugs.freenetproject.org/view.php?id=4031 and the bugs it is 
>> >> marked as related to.
>> >
>> > So I guess the real question here is, how important is it that we be able 
>> > to queue 60 downloads and still have acceptable performance? How many 
>> > users use Freenet filesharing in that sort of way?
>>
>> All of them, I suspect.  If a file is mostly downloaded, but not
>> complete, the natural response seems to be to leave it there in hopes
>> it will complete, and add other files in the mean time.  Combined with
>> unretrievable files due to missing blocks, this will produce very
>> large download queues.
>
> So this bug should be fairly high priority, despite its potentially being 
> quite a lot of work?:
>
> https://bugs.freenetproject.org/view.php?id=4031

I think so.  I believe I've been saying client layer should be high
priority for a while :)

Evan Daniel



[freenet-dev] Another way Freenet sucks for filesharing was Re: [freenet-support] major problems - stuck at 100%, nonresponsive

2010-04-02 Thread Evan Daniel
On Fri, Apr 2, 2010 at 12:39 PM, Matthew Toseland
 wrote:
> On Friday 02 April 2010 17:31:13 Matthew Toseland wrote:
>> On Tuesday 09 March 2010 04:27:24 Evan Daniel wrote:
>> > You should really send these to the support list; that's what it's for.
>> >
>> > You can change the physical security level setting independently of
>> > the network seclevels -- see configuration -> security levels.
>> >
>> > I'm not sure what else to suggest at this point.  You could try
>> > increasing the amount of ram for temp buckets (configuration -> core
>> > settings), but that's mostly a stab in the dark.
>> >
>> > I suspect you need to reduce the amount of stuff in your queue.
>>
>> Thanks Evan for helping Daniel. In theory it ought to be possible to have a 
>> nearly unlimited number of downloads in the queue: That is precisely why we 
>> decided to use a database to store the progress of downloads. Unfortunately, 
>> in practice, disks are slow, and the more stuff is queued, the less of it 
>> will be cached in RAM i.e. the more reliant we are on slow disks.
>>
>> There are many options for optimising the code so that it uses the disk 
>> less. But unfortunately they are all a significant amount of work.
>>
>> See https://bugs.freenetproject.org/view.php?id=4031 and the bugs it is 
>> marked as related to.
>
> So I guess the real question here is, how important is it that we be able to 
> queue 60 downloads and still have acceptable performance? How many users use 
> Freenet filesharing in that sort of way?

All of them, I suspect.  If a file is mostly downloaded, but not
complete, the natural response seems to be to leave it there in hopes
it will complete, and add other files in the mean time.  Combined with
unretrievable files due to missing blocks, this will produce very
large download queues.

Evan Daniel



[freenet-dev] Opennet connection replacement policies

2010-03-31 Thread Evan Daniel
On Wed, Mar 31, 2010 at 6:22 PM, Matthew Toseland
 wrote:
> On Wednesday 17 March 2010 18:33:01 Evan Daniel wrote:
>> Currently, when we need to drop an opennet connection, we use the LRU
>> (least recently used) policy: we drop the connection that has least
>> recently served a successful request.  Unfortunately, I can't find a
>> link to the paper that describes why LRU was chosen, though the basic
>> idea is obvious: if the node has too many peers around a certain
>> location, then those peers will get fewer requests, and therefore one
>> of them is more likely to be the peer dropped.
>>
>> I'm curious whether other algorithms were explored, and if so how they
>> performed in comparison.  The obvious candidates are LFU (least
>> frequently used), LFU-Window (LFU within a finite time window), and
>> LFU-aging (LFU but with periodic reduction in the frequency counts, so
>> that formerly useful nodes that are no longer useful eventually age
>> out).
>>
>> LFU-aging is normally described as having discrete aging steps at
>> (somewhat) large intervals, because it is normally discussed in the
>> context of a large cache where the O(n) time to perform the aging
>> computation is problematic.  However, we could do continuous
>> exponential aging at every request without any problems, because the
>> number of connections is small.  That is, every time a request
>> succeeds, we first multiply the score for each connection by some
>> constant alpha (0 < alpha <= 1), and then increment the score for the
>> connection that had the success.  (For 0 < alpha <= 0.5 this is
>> precisely equivalent to LRU; for alpha = 1, it is precisely equivalent
>> to (non-window) LFU.)
>>
>> LFU-aging with decay on every request seems (intuitively, and we know
>> what I think of that...) like it should be a good match.  If the
>> difference in usefulness between different nodes is moderate, and the
>> requests are presumed to be an essentially memoryless random process
>> (a very good assumption, imho), the simple LRU is overly likely to
>> drop one of the more useful connections just because it has been
>> unlucky recently.  By using LFU with aging (or LFU-window) we get a
>> sort of running average behavior: a connection that is useful, but has
>> been unlucky (and not received any requests recently), will score
>> better than one that is mostly useless but happened to get a good
>> request very recently.
>>
>> Has this idea (or one like it) been investigated before?
>>
>> Thoughts?  Is this worth study?
>>
>> (Submitted to the bug tracker as 3985.)
>
> Oskar's theoretical work on opennet used LRU and got very good results, which 
> were *almost* backed by provable math, and well established in simulation. 
> We'd have to talk to our deep theorists if we wanted to change this.
>
> Obviously when I say LRU, I mean least-recently-succeeded-CHK-request, and 
> subject to various heuristics (such as which peers are in grace periods etc).
>

As I recall the paper (which paper was it?  I can't find it, but I
probably just overlooked it or something), there was a strong
theoretical basis (proof? I don't recall) for believing that LRU
converged on an optimal distribution.

However, I don't believe that addresses rate of convergence, or
join-leave churn.  It's entirely possible that LFU or something would
also converge, and converge faster -- this would not be in conflict
with a statement that LRU converges on optimal.  Rate of convergence
is closely related to steady-state behavior on a network with a
constant (average) churn rate; we expect such a network to always be
close to optimal, but never optimal (because the recently joined nodes
haven't optimized, and other nodes haven't yet accommodated the recent
departures).  How close to optimal will depend on the churn rate, and
it's possible that something other than LRU produces a better
steady-state behavior.
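
For concreteness, the exponential-aging rule described above amounts to
bookkeeping like this (an illustrative sketch only; fred's actual peer
structures, grace periods, and other heuristics are not modelled):

    import java.util.HashMap;
    import java.util.Map;

    class AgedScores<P> {
        private final double alpha;                        // decay constant, 0 < alpha <= 1
        private final Map<P, Double> scores = new HashMap<P, Double>();

        AgedScores(double alpha) { this.alpha = alpha; }

        void addPeer(P peer) { scores.put(peer, 0.0); }
        void removePeer(P peer) { scores.remove(peer); }

        // Called on every successful CHK request: decay everyone, credit the peer
        // that served the request.
        void onSuccess(P peer) {
            for (Map.Entry<P, Double> e : scores.entrySet()) {
                e.setValue(e.getValue() * alpha);
            }
            Double current = scores.get(peer);
            scores.put(peer, (current == null ? 0.0 : current) + 1.0);
        }

        // The peer we would drop when a slot is needed: lowest score.
        P dropCandidate() {
            P worst = null;
            double worstScore = Double.POSITIVE_INFINITY;
            for (Map.Entry<P, Double> e : scores.entrySet()) {
                if (e.getValue() < worstScore) {
                    worstScore = e.getValue();
                    worst = e.getKey();
                }
            }
            return worst;
        }
    }

As the quoted text notes, alpha <= 0.5 makes the resulting ordering
identical to LRU and alpha = 1 is plain (non-window) LFU, so a single
parameter spans the whole family.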

Evan Daniel



[freenet-dev] responses to WoT/search questions

2010-03-31 Thread Evan Daniel
On Wed, Mar 31, 2010 at 5:42 AM, xor  wrote:
> On Wednesday 31 March 2010 06:32:58 am Evan Daniel wrote:
>> On Wed, Mar 31, 2010 at 12:22 AM, Ximin Luo  wrote:
>> > have you joined the freenet-dev mailing list? in future i'd like to have
>> > these discussions there so that other people can see it too.
>> >
>> > (03:53:26) lusha: hi, can i ask a question about WOT?
>> > (03:55:33) evanbd: No need to ask permission :)
>> > (03:56:12) lusha: is there any document for this?
>> > (03:56:32) lusha: i dont quite understand how they evaluate trust
>> >
>> > (i think) WoT uses a flow-based metric similar to advogato
>> > (www.advogato.org) - see the source code (plugin-WoT-staging), or ask p0s
>> > on IRC (xor on the mailing list) for specific details. atm the
>> > implementation requires retrieving trust scores for everyone on the
>> > network, which won't scale in the long run.
>>
>> No.  The current WoT code is neither flow-based nor particularly
>> related to the Advogato algorithm.  It's purely alchemical, having
>> neither a proper specification as to the problem being solved nor any
>> sort of theoretical basis to believe it solves that unspecified
>> problem.
>
> That is true. I should finally do this and answer to your mail w.r.t. your
> proposed alternative algorithm. I'm sorry, I'm just trying to make 
> everyone
> happy. People want a release of FT/WoT soon so as long as I didn't have much
> time/day I was trying to spend it on writing code.
>
> BUT we should also state that the algorithm itself fortunately is only a small
> part of WoT. Most of the work which is required for a working WoT was writing
> the class architecture, the captcha stuff, the FCP stuff, adding proper
> synchronization and general glue code. Those are all done and they work. So
> now we have a proper "nest" for embedding any proper trust/score-based
> algorithm in.

Please don't misunderstand: right now I think usability and any
internal changes you need to do to get WoT / FT ready for release are
far higher priority.  I'm greatly appreciative for the work you've
been doing, and think you should keep doing it.  There's time enough
for WoT algorithms after that.

>
>> Retrieving trust lists for large numbers of nodes should scale fairly
>> well, as long as the updates can be slow.  IMHO the only real problem
>> presented is startup for a new user (downloading a few tens of MiB of
>> scores might take a little while).  Specifically, I don't think the
>> scaling problem is any different or worse than the scaling problem
>> inherent in trying to retrieve messages from that many users.  And,
>> whether it's a problem or not, there are *vastly* more important
>> things to worry about than what to do once we have 1M users -- like
>> how to get that many users in the first place.  That sort of scaling
>> problem gets put in the "nice problems to have" category in my book.
>
> It can also easily be cut down to logarithmic complexity by only downloading
> trust lists from identities which are directly trusted and from their
> trustees... I'll do that as soon as Freetalk is deployed and the core features
> are working.

Hmm?  You mean limiting it to 2 degrees of separation?  Doesn't that
mean a lot of the network isn't visible (especially if you're assuming
lots of new users)?  I already have a lot of known rank 3 identities.

(Also, I don't see how that's O(log(n)) -- it sounds more like O(1) to me.)
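
To make the complexity point concrete: if a typical trust list names d
identities, then fetching your own list, your direct trustees' lists,
and their trustees' lists is at most 1 + d + d^2 fetches, independent of
the network size n -- i.e. O(1) in n rather than O(log n). A trivial
sketch (d is an assumption about typical list size):

    class TrustListBound {
        // Upper bound on trust lists fetched when limited to two degrees of
        // separation; independent of the total number of identities on the network.
        static long maxTrustListFetches(int outDegree) {
            return 1L + outDegree + (long) outDegree * outDegree;
        }
        // e.g. maxTrustListFetches(50) == 2551, whether the network has 10k or 1M users.
    }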

Evan Daniel



[freenet-dev] responses to WoT/search questions

2010-03-31 Thread Evan Daniel
On Wed, Mar 31, 2010 at 12:22 AM, Ximin Luo  wrote:
> have you joined the freenet-dev mailing list? in future i'd like to have these
> discussions there so that other people can see it too.
>
> (03:53:26) lusha: hi, can i ask a question about WOT?
> (03:55:33) evanbd: No need to ask permission :)
> (03:56:12) lusha: is there any document for this?
> (03:56:32) lusha: i dont quite understand how they evaluate trust
>
> (i think) WoT uses a flow-based metric similar to advogato (www.advogato.org) 
> -
> see the source code (plugin-WoT-staging), or ask p0s on IRC (xor on the 
> mailing
> list) for specific details. atm the implementation requires retrieving trust
> scores for everyone on the network, which won't scale in the long run.

No.  The current WoT code is neither flow-based nor particularly
related to the Advogato algorithm.  It's purely alchemical, having
neither a proper specification as to the problem being solved nor any
sort of theoretical basis to believe it solves that unspecified
problem.

Retrieving trust lists for large numbers of nodes should scale fairly
well, as long as the updates can be slow.  IMHO the only real problem
presented is startup for a new user (downloading a few tens of MiB of
scores might take a little while).  Specifically, I don't think the
scaling problem is any different or worse than the scaling problem
inherent in trying to retrieve messages from that many users.  And,
whether it's a problem or not, there are *vastly* more important
things to worry about than what to do once we have 1M users -- like
how to get that many users in the first place.  That sort of scaling
problem gets put in the "nice problems to have" category in my book.

Evan Daniel



Re: [freenet-dev] A microblogging and/or real-time chat system of GSOC

2010-03-30 Thread Evan Daniel
I suggest you think of it more like Twitter than IRC.  I believe that
Twitter is a more natural match to the sort of protocol you need to
use over Freenet.  So, when you say something, anyone who is following
you can see it.  So can anyone who searches for any hashtags you used.

You seem to have some misconceptions about how Freenet works.  Have
you installed Freenet and started using it yet?  I also highly
recommend you read some of the papers about Freenet and look around
the wiki a bit.  And ask questions if it's not clear!  In short, there
are two operations that the network supports: insert and fetch.
Insert takes some data, and a key that will refer to it, and sends the
data out onto the network.  It is passed from node to node, and routed
toward nodes with a location dependent on the key.  Nodes along the
route keep a copy of it.  Fetching is similar: the fetch request takes
a key, and the request is routed toward nodes with nearby locations,
until a node is found that has a copy of the data (because it was also
part of the insert path, or part of a previous successful fetch path).
 The data is then sent back to the requester, and nodes along the path
keep a copy.  The data isn't stored on the inserter's computer; it's
stored on other computers on the network.

For a microblogging app, each post will be inserted into the network,
and anyone following you will fetch your posts.  If you go offline, it
doesn't matter: your posts are already inserted into the network, and
therefore available for anyone to fetch.  If your followers go
offline, then they can just fetch your posts when they return.
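
A very rough sketch of what that looks like from the application's side
(the client interface and key scheme here are hypothetical placeholders,
not the actual fred plugin or FCP API):

    // Hypothetical minimal client interface: the two operations the network supports.
    interface FreenetClient {
        void insert(String key, byte[] data) throws Exception;   // push data out onto the network
        byte[] fetch(String key) throws Exception;                // pull data back by key
    }

    class Microblog {
        private final FreenetClient client;
        private final String authorKey;   // derived from the author's identity (assumption)
        private long nextPostNumber = 0;

        Microblog(FreenetClient client, String authorKey) {
            this.client = client;
            this.authorKey = authorKey;
        }

        // Publishing: insert the post under a predictable per-author, per-index key.
        void publish(String text) throws Exception {
            client.insert(authorKey + "/post-" + nextPostNumber, text.getBytes("UTF-8"));
            nextPostNumber++;
        }

        // Following: poll for the next post of someone you follow; returns null until
        // it is fetchable.  The data comes from the network, not the author's machine.
        String fetchPost(String theirKey, long index) throws Exception {
            byte[] data = client.fetch(theirKey + "/post-" + index);
            return data == null ? null : new String(data, "UTF-8");
        }
    }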

And you're right that p2p makes messaging somewhat awkward.  But, if
there's a central server, then you're dependent on them.  If they
decide to censor you, there's nothing you can do.  So, I don't see an
alternative that would meet Freenet's goals.  However, I don't think
it's as awkward as seem to think: the insert and fetch operations
provided by Freenet, while somewhat awkward to use, are actually very
powerful.  Having the data persist in the cloud (mostly) independent
of which computers are online is quite powerful.

I'm not terribly familiar with UI technologies, so I can't say what's
a good or bad idea for that, really.  You should plan to have your
program be a plugin for fred; I suggest you take a look at other
plugins to get an idea for what that means.

Evan Daniel

On Tue, Mar 30, 2010 at 10:14 AM, 陈天一 chentiany...@gmail.com wrote:
 Hi,

 Thank you for telling me so many things about the microblogging. I am not
 really sure whether I have understood all the things. I think what you mean
 is that the system is similar as chat room. If I input something, everyone
 who is my friend should see the comment and I can also chat with some one I
 want to. If I understand it correct, I think use JAVA RMI or Web Service may
 be a easy way to complete it. But I have three questions. First, if I need
 to embed this system into Freenet, I think it is very different from a web
 application. I should use Swing instead of JSP, JSTL and CSS to complete the
 UI. Second, as it is a whole P2P system, it will work well when everyone is
 on line. But if I am off line, others cannot receive my previous comments.
 Third, if my friend off line, do I need to store my comments in my computer
 and send it when they on line? All the question is around the P2P method
 because I have no server to store the messages. I think P2P is not a good
 way to do message publish and menegement. Can you provide some hint about
 how to deal with these problem.

 Thank you for helping me.

 Sincerely,

 Tianyi Chen





 On Fri, Mar 26, 2010 at 2:28 PM, 陈天一 chentiany...@gmail.com wrote:
 Hi,

 I am interested in A microblogging and/or real-time chat system of GSOC
 and
 the description said that a fair amount of work on how to efficiently
 implement microblogging over Freenet has been done. I am wondering whether,
 if
 I want to join this project, I should build an independent application using any
 technology
 and framework I want such as Struts + Spring + Hibernate or I should
 embed the microblogging into the Freenet code as a whole system. And
 I
 think Freenet have had strong ability in file distribute so that add
 real-time chat system is really easy for you because you only need to
 connect others by knowing their IP address. Could give me a short
 description about what is the objective of this project and what kind of
 technology you really want? I am now building my own SNS website by using
 Struts + Spring + Hibernate and Ajax and have some experience about file
 download software with multithreading control. Could you tell me what is
 the
 most different between this project and my provious projects?

 By the way, do I need to design the entity, relationship and attribute for
 database by meself? What database you use? (I have MySQL and Oracle 10g
 experience)?

 I'll let someone else speak to the gsoc issues; here's a brief summary
