[freenet-dev] Early thoughts on HTL success rate data

Evan Daniel Mon, 21 Sep 2009 15:36:18 -0700

Builds 1234-1236 included several changes to routing that we hope
improved things.  Specifically:
- The "loop fix": when node A is choosing how to route a request it
received from node B and is considering node C's FOAF locations, it
should ignore locations among C's peers that exactly match B's
location.  That is, A shouldn't try to route a request back where it
came from by an alternate route.  Similarly, A should ignore A's own
location, and the locations of any nodes A has already tried to route
to and failed (rejectedLoop, rejectedOverload, etc).
- FOAF tie breaking: When A is routing a request, if both B and C have
the same minimum FOAF distance from the request location (presumably
because they have a peer in common), A should tie break on which has
the better immediate location.
- A change to ULPR propagation that shouldn't have much direct impact,
but may have had an indirect impact.
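
To make the first two changes concrete, here is a minimal sketch in
Python. It is not the actual Freenet code. It assumes locations are
numbers on the circular [0, 1) keyspace, that a peer's own location
counts as a candidate alongside its FOAF locations, and that the
caller has already left B and any failed nodes out of the peer table.
The names circular_distance, choose_peer and the peers layout are made
up for illustration.

```python
def circular_distance(a, b):
    """Distance between two locations on the circular [0, 1) keyspace."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def choose_peer(target, own_loc, source_loc, peers, failed_locs=()):
    """Pick the peer to route to.

    peers maps a peer name to (peer_location, [foaf_locations]).
    Returns the chosen peer name, or None if there are no peers.
    """
    # Locations that must not attract the request (hypothetical layout).
    ignored = {own_loc, source_loc, *failed_locs}
    best_key, best_name = None, None
    for name, (loc, foaf_locs) in peers.items():
        # Loop fix: drop FOAF locations that exactly match ourselves,
        # the node the request came from, or nodes that already failed.
        usable = [f for f in foaf_locs if f not in ignored]
        candidates = [loc] + usable
        foaf_dist = min(circular_distance(target, c) for c in candidates)
        # Tie break: on equal FOAF distance, prefer the peer whose own
        # location is closer to the target.
        key = (foaf_dist, circular_distance(target, loc))
        if best_key is None or key < best_key:
            best_key, best_name = key, name
    return best_name
```

Comparing (FOAF distance, immediate distance) tuples keeps the tie
break in one place: the second element only matters when the first is
exactly equal.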

This is also the first time we've made network-level changes since I
started collecting the hourly HTL success rate data; hopefully, this
gives me something to analyze, so that I can determine in detail
whether the changes made an improvement.

First, the caveats: 1236 is not yet self-mandatory, so there may
still be some upgrade disruption happening. This is only data from my
node; edt has been sending me his hourly stats data as well, but I
don't yet have any post-1236 data from him, so I haven't included any
of it. There are known local changes at my node (fewer persistent
requests queued) that coincide with the upgrade, because my node lost
its node.db4o with the 1236 upgrade (though I do attempt to model
local request count impact on success rates). I don't yet have enough
data to know if there are any weekly effects (though I have attempted
to account for time-of-day effects). And all I've done is regression
fitting; I haven't yet done the various non-parametric statistical
tests to produce an actual p-value on whether the change made a
difference. I
strongly suspect it did, but there's too much structure to my
residuals and too much non-normality to the data to be able to say
that from what I've done so far. Network load is meaningfully lower
on the new data as well, and I don't know how to explain that.
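
As an illustration of the kind of non-parametric test mentioned above,
here is a sketch of a two-sided permutation test on the difference in
mean residual between pre-1236 and post-1236 samples. It is one
possible choice, not the test that was or will be used. The function
name and parameters are made up. Note that a permutation test assumes
the samples are exchangeable, which structured (for example, hourly
autocorrelated) residuals would violate, so the p-value would need to
be read with care.

```python
import random


def permutation_p_value(old, new, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in mean residuals."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    observed = sum(new) / len(new) - sum(old) / len(old)
    pooled = list(old) + list(new)
    n_new = len(new)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the "old"/"new" labels.
        rng.shuffle(pooled)
        perm_new = pooled[:n_new]
        perm_old = pooled[n_new:]
        diff = sum(perm_new) / len(perm_new) - sum(perm_old) / len(perm_old)
        if abs(diff) >= abs(observed):
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small return value means that a difference as large as the observed
one rarely arises from relabeling alone.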

To summarize: these results are highly preliminary. Do *not* draw
conclusions from them. They are, however, interesting and promising.
I'll have better data soon, from which we might be able to begin
drawing tentative conclusions. But for now, I think the data are
intriguing enough that people would enjoy seeing them. I would be
*very* appreciative of comments or suggestions.

Now, the request: I'd like more people to send me their success rate
data! So far I only have one; that's rather disappointing, given how
many people want a faster Freenet. It's not hard; just turn on
loglevel normal, make sure you have plenty of space allocated to logs
so it doesn't drop old ones, and then run "zgrep HourlyStats
logs/freenet-*.gz > output.txt" and send me the results by email.
There's nothing non-anonymous in there. You don't have to remove
duplicates; I can handle that easily. There's no particular need for
you to worry about use of your node, internet connection, uptime, etc;
I'm trying to model that, and I should be able to basically average it
out. I'd like high-uptime nodes, but it's not required.
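
For the curious, removing duplicates from the zgrep output could look
roughly like the sketch below. It assumes that zgrep, when given
several files, prefixes each matching line with the file name ending
in ".gz:", and that two lines are duplicates when the rest of the line
is identical. It makes no other assumption about the HourlyStats
format. The function name is made up.

```python
def dedupe_hourly_stats(lines):
    """Return unique log records in first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        record = line.rstrip("\n")
        # Assumption: a leading "path.gz:" is the file name added by zgrep.
        prefix, sep, rest = record.partition(".gz:")
        if sep:
            record = rest
        # Skip blank lines and anything already seen.
        if record and record not in seen:
            seen.add(record)
            unique.append(record)
    return unique
```

It could be run as dedupe_hourly_stats(open("output.txt")), since a
file object yields its lines one at a time.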

So, the preliminary results. All data is based on the total observed
CHK success rate (both local and remote; the % listed in the success
rate by htl box on the stats page). First, the raw success rate vs
HTL graph:
freenet:c...@u3evtva0k6wvfmcji-ww0-ybdmlkmyef~hyxd6hbf-o,mphkKgatjdGtpHjf70t7APtBy82eW-GHwPjWWnNQFW4,AAIC--8/rate_both.png
The success rates are rather noisy, but it looks at first glance like
the new data (magenta) shows higher success rates than the older data
(blue).

Next, I created a linear regression model that attempted to explain
success rates based on other factors. I included HTL, using a 4th-order
polynomial fit, since it showed a fair bit of curving; an exponential
or power model is probably better, but nonlinear regression is more
complicated. The high order polynomial should match such curves
reasonably well. I included a linear fit on local CHK blocks fetched,
on the assumption that local load probably has an impact. I included
a quadratic fit on total incoming CHK requests, as an indicator of
local / global network load. And lastly, I included a sinusoidal fit
(amplitude and phase, but not frequency) to time of day, along with
its second and third harmonics. Overall R^2 was 0.84, with high
significance values; all parameters were significant at the 1% level
(most of them dramatically better than that), with the exception of
the third harmonic of time of day.
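
The regressors described above can be sketched as one row of a design
matrix for ordinary least squares. This is an illustration only: the
function and argument names are made up, and the actual fitting tool
is not stated here. A sinusoid with free amplitude and phase but fixed
frequency is linear in a sin coefficient and a cos coefficient, which
is why it fits into a linear regression.

```python
import math


def design_row(htl, local_chk_blocks, incoming_chk_requests, hour_utc):
    """One row of regressors for an ordinary least-squares fit."""
    row = [1.0]                                        # intercept
    row += [float(htl) ** k for k in range(1, 5)]      # 4th-order polynomial in HTL
    row.append(float(local_chk_blocks))                # linear local-load term
    row += [float(incoming_chk_requests) ** k for k in (1, 2)]  # quadratic network load
    for harmonic in (1, 2, 3):                         # 24 h period plus two harmonics
        angle = 2.0 * math.pi * harmonic * hour_utc / 24.0
        row += [math.sin(angle), math.cos(angle)]
    return row


def amplitude_and_phase(sin_coef, cos_coef):
    """Convert a*sin(x) + b*cos(x) into R*sin(x + phi); returns (R, phi)."""
    return math.hypot(sin_coef, cos_coef), math.atan2(cos_coef, sin_coef)
```

Each row has 14 entries: the intercept, four HTL powers, one local
load term, two network load terms, and a sin/cos pair for each of the
three time-of-day frequencies. Once the sin and cos coefficients are
fitted, amplitude_and_phase recovers the amplitude and phase of each
harmonic.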

(The model was built against the unified set of data; we'll look for
changes resulting from the new build in the residuals, rather than in
model coefficients.)

Having explained away a large fraction of the variation in the data,
we look at the model residuals:
freenet:c...@9unnkexj6-wqkqdc2pwmxqreyjykdnxq-ej46c41qac,Gsi0xqKxA2hC~3w9iwZm9S8IwcBhH4ljXH7jOAmtSpU,AAIC--8/resid_vs_htl.png
What we see here is the portion of the success rate not explained by
the model (the residuals). Positive numbers mean that the observed
success rate was higher than predicted by the model, negative means
lower. The lines shown are average residual by HTL; they present a
slightly clearer picture of what's going on than the clouds of data
points. Roughly, we see improved success rates at HTL 11-17, no
change at 6-10 and 18, and a slight decrease at 1-5. This is not
inconsistent with the hypothesis I had posed before releasing the
builds, which was that improved routing would improve the success
rates at all HTLs, but that there would be a secondary effect of
improved routing meaning that more "easy" requests succeed at high
HTL, and that therefore the remaining low-HTL requests would be
"harder", reducing the success rate. (I predicted we would see
improvement at high HTL, and a slight gain, no change, or slight drop
at low HTL.)
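
The "average residual by HTL" lines amount to a grouped mean, computed
separately for the old and new data. A minimal sketch, assuming each
sample is an (htl, is_new_build, residual) tuple (a made-up layout,
not the format of the collected stats):

```python
from collections import defaultdict


def mean_residual_by_htl(samples):
    """samples: iterable of (htl, is_new_build, residual) tuples.

    Returns {is_new_build: {htl: mean residual}}.
    """
    # Running [sum, count] for each (build group, HTL) pair.
    sums = defaultdict(lambda: [0.0, 0])
    for htl, is_new_build, residual in samples:
        acc = sums[(bool(is_new_build), htl)]
        acc[0] += residual
        acc[1] += 1
    means = {False: {}, True: {}}
    for (is_new_build, htl), (total, count) in sums.items():
        means[is_new_build][htl] = total / count
    return means
```

The difference means[True][h] - means[False][h] at a given HTL h is
then the pre/post shift at that HTL, for any h present in both groups.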

Lastly, we look at a plot of model residuals vs observed success rate:
freenet:c...@kkiyz3enchahlen8d-bq8v0jfpjc818yol3vkxb-phc,JgzKJzaTsqV2iNiz-5OTl~Gt4AuKR0X-NH6NBHDCzD8,AAIC--8/residuals.png
For a multivariate regression, plotting residuals vs observed result
is a good way to get a visual picture of whether the model is doing a
good job. If the model is explaining all non-noise factors, the
residuals should show no vertical patterning (horizontal patterning
represents patterns in the collected data). We see significant
vertical patterning, of a similar structure on both sets of data. The
structure suggests that there are unexplained factors that influence
the success rate. Candidates include things like whether I was
running messaging apps (and therefore making those popular keys easy
to find), and how long my node had been up (and thus the population of
the recent requests cache, which I have set to a longer than default
lifetime). The similarity in residual structure suggests that this
model deficiency is probably not having a huge impact on whether our
conclusions are valid.

Looking at the model coefficients is mildly interesting. For example,
the time of day factor shows a maximum impact on predicted success
rate of +/- 1.6%. Success rates are good in the 2200-0200 UTC range,
and less good in the 1200-1700 range. The magnitude of this effect is
swamped by the individual data noise, but it is present. Other
effects modeled (local / network load) show similar magnitude impact.
We see a strong impact from the network load parameter (incoming CHK
request count); it accounts for a change of +/- 9%. This is something
else that shows strong pre/post 1236 changes: the post-1236 data has
less network load. I don't know why this would be, but we can clearly
see it on this graph of residuals vs remote requests:
freenet:c...@vwlfek98dqwlyji6edcdivcvy7rw6y3dkijrmzratbm,cvdKwGJjuaZjYtI8gGSC~o6s6Ul1PZ1Y9p9y0buL2EE,AAIC--8/resid_vs_remote.png

This network load discrepancy is worth examining in more detail.
Lower global load might well account for the entire observed effect;
there might be no change from the new build. On the other hand, we
expect improved routing to reduce global load (though I would be
shocked if it reduced it as much as is observed). I don't know how to
explain this; clearly, more modeling work is required.

Evan Daniel
