(Copied from my latest flog post, since toad asked me to send a summary here. If you want to read more about Freenet stats, I suggest you read my other posts. freenet:u...@gjw6stjzoz4oag-pqoxip5nk11udqzorozd4jld42ac,BYyqgAtc9p0JGbJ~18XU6mtO9ChnBZdf~ttCn48FV7s,AQACAAE/flog/13/ )
After much effort on statistical sampling techniques, it occurred to me there was a far simpler way to estimate the number of users on Freenet, and how many of them use the network regularly vs occasionally. Instead of sending probe requests and looking at where they end up, I simply take the probe request results and record peer UIDs from the nodes the probe request passed through. With a moderate number of probe requests (I'm continuing to do 600 requests over 1 hour), each probe hits 10+ nodes, each with a moderate number of peers. This gives me a fairly good lower bound on the network size -- assuming that I got a record of every node on the network is unrealistic, but not grossly so.

So far I'm taking data manually; I'll upload a script soon. I have full data from 20091113 (two different samples), 20091114, 20091116, and 20091117. Ignoring the second sample from the 13th, that gives four days of data. Those four samples gave network sizes of 3587, 3873, 3401, and 3550 respectively. (Time of collection varied, so there are some time-of-day effects here.)

Among the four samples, there are a total of 6487 unique nodes. Of those, 1509 appear in all four samples, 909 in 3 samples, 1579 in 2 samples, and 2490 in only one sample. That says to me that approximately 38% of users (those seen in only one sample) are "occasional" users, who either only run their node some of the time, or install, run briefly, and then uninstall. A further 23% (seen in all four samples) are dedicated users -- they have their nodes on all the time. The remaining 38% (in 2 or 3 samples) I'll call "regular" users -- they frequently have their node running, but not always.

Obviously, these classifications are very rough. I'd say a 1:2:2 ratio of dedicated to regular to occasional users is probably a reasonable guess, but it could still be rather far off. I need to take data more regularly and for a longer period of time before any serious conclusions can be drawn.
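
The bookkeeping described above can be sketched in Python roughly as follows. This is not the script mentioned above, only a minimal illustration: it assumes each probe result has already been parsed into a list of peer UID strings, and the function names and data layout are hypothetical. The last part just re-derives the percentages from the counts reported in this post.

```python
from collections import Counter

def unique_uids(probe_results):
    """Collect the distinct peer UIDs seen in one sample.

    probe_results: iterable of iterables of UID strings, one per probe
    (assumed format; parsing real probe output is not shown).
    The size of the returned set is a lower bound on network size.
    """
    seen = set()
    for uids in probe_results:
        seen.update(uids)
    return seen

def appearance_histogram(samples):
    """Map k -> number of nodes seen in exactly k of the samples.

    samples: list of sets of UIDs, one set per sample.
    """
    per_node = Counter()
    for sample in samples:
        per_node.update(sample)  # a set, so each UID counts once per sample
    return Counter(per_node.values())

# Toy data: a is in 3 samples, b in 2, c and d in 1 each.
toy_samples = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}]
toy_hist = appearance_histogram(toy_samples)

# Counts reported in this post (number of samples -> number of nodes).
reported = {4: 1509, 3: 909, 2: 1579, 1: 2490}
total = sum(reported.values())

dedicated = reported[4] / total                   # seen in all four samples
regular = (reported[2] + reported[3]) / total     # seen in 2 or 3 samples
occasional = reported[1] / total                  # seen in only one sample

print(total)
print(f"dedicated {dedicated:.0%}, regular {regular:.0%}, occasional {occasional:.0%}")
```

Using sets per sample means a node that shows up in many probes on the same day is still counted once for that day, which is what the "appears in k samples" classification needs.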

However, I am comfortable saying the following: Freenet has at least 4000 semi-regular or regular users (probably meaningfully more). Freenet probably has between 8000 and 12000 total users (the upper bound I'm far less certain of -- if a lot of people only run Freenet for an hour or two per day, it could be far higher). At most, about a third of users run their node 24/7; the actual number is probably well under that.

I think this has several practical implications. First, we need to be working on data retention more, with a focus on retention despite low-uptime nodes. (See bugs 3495, 3514 for a start on that. 2933 should also help. 3637/3639 and the like address more general routing issues; that should help as well.) Second, we need to figure out how to get these low-uptime nodes back onto the network, *and connected usefully*, so that the data they have can be found (and to improve the performance for such users). (See 3583 and related bugs for one approach.) And, finally, we have the general problem of getting (and keeping!) more users.

Evan Daniel
_______________________________________________
Devl mailing list
[email protected]
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
