Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-18 Thread Josip Rodin
On Mon, Dec 17, 2007 at 01:15:10PM -0500, Noah Meyerhans wrote:
>>> If it were possible, (temporarily) adding a security.d.o mirror in the
>>> 0.0.0.0 - 127.255.255.255 range would be helpful [...]
>>> Obviously finding a host that can deal with 13.53 MB/s of sustained
>>> traffic with a useful IP address to temporarily test this behaviour
>>> might be difficult. :)
>>
>> Quite.
>
> We might actually be able to help with this at MIT, where steffani is
> already hosted.  MIT controls 18.0.0.0/8 and, while the lab where
> steffani is hosted doesn't usually use net 18 address space, we do have
> some available.  I'd have to double check with the network admin, but it
> should be trivial to allocate an address for this purpose.  Would it
> work to simply bring up a separate interface on steffani?

Did you mention this to DSA? They'd have to configure it at this end.
File an RT ticket?

-- 
 2. That which causes joy or happiness.



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-18 Thread Josip Rodin
On Tue, Dec 18, 2007 at 03:31:13PM +0100, Martin Schulze wrote:
>>>>> If it were possible, (temporarily) adding a security.d.o mirror in the
>>>>> 0.0.0.0 - 127.255.255.255 range would be helpful [...]
>>>>> Obviously finding a host that can deal with 13.53 MB/s of sustained
>>>>> traffic with a useful IP address to temporarily test this behaviour
>>>>> might be difficult. :)
>>>>
>>>> Quite.
>>>
>>> We might actually be able to help with this
>>
>> Did you mention this to DSA? They'd have to configure it at this end.
>
> I may be dumb, but what would another IP address for an existing
> security mirror that is already the preferred one buy us?  I would
> expect that a second machine in the 128.* zone would be able to
> spread the load better.

See above - just temporary for purposes of demonstration.

I think that the case is already clear, but to eliminate any doubt
whatsoever...

-- 
 2. That which causes joy or happiness.






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-18 Thread Josip Rodin
On Mon, Dec 17, 2007 at 07:51:18PM +0100, Martin Schulze wrote:
>> I've asked DSA for server-status already, and mentioned the logs too,
>> we'll see (they haven't replied yet).
>
> Server status is configured on localhost.

OK, so I started measuring that too, and the rates for the last half a day
or so are:
* villa: 20.4 rps, 6.18 Mbps
* lobos: 18.9 rps, 6.23 Mbps
* steffani: 40.0 rps, 15.92 Mbps

The ratios for both parameters match the general bandwidth ratios,
so the measurements should be correct.

-- 
 2. That which causes joy or happiness.






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-18 Thread Martin Schulze
Josip Rodin wrote:
> On Mon, Dec 17, 2007 at 01:15:10PM -0500, Noah Meyerhans wrote:
>>>> If it were possible, (temporarily) adding a security.d.o mirror in the
>>>> 0.0.0.0 - 127.255.255.255 range would be helpful [...]
>>>> Obviously finding a host that can deal with 13.53 MB/s of sustained
>>>> traffic with a useful IP address to temporarily test this behaviour
>>>> might be difficult. :)
>>>
>>> Quite.
>>
>> We might actually be able to help with this at MIT, where steffani is
>> already hosted.  MIT controls 18.0.0.0/8 and, while the lab where
>> steffani is hosted doesn't usually use net 18 address space, we do have
>> some available.  I'd have to double check with the network admin, but it
>> should be trivial to allocate an address for this purpose.  Would it
>> work to simply bring up a separate interface on steffani?
>
> Did you mention this to DSA? They'd have to configure it at this end.

I may be dumb, but what would another IP address for an existing
security mirror that is already the preferred one buy us?  I would
expect that a second machine in the 128.* zone would be able to
spread the load better.

Regards,

Joey

-- 
We all know Linux is great... it does infinite loops in 5 seconds.
-- Linus Torvalds

Please always Cc to me when replying to me on the lists.






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-18 Thread Martin Schulze
Josip Rodin wrote:
> On Tue, Dec 18, 2007 at 03:31:13PM +0100, Martin Schulze wrote:
>>>>>> If it were possible, (temporarily) adding a security.d.o mirror in the
>>>>>> 0.0.0.0 - 127.255.255.255 range would be helpful [...]
>>>>>> Obviously finding a host that can deal with 13.53 MB/s of sustained
>>>>>> traffic with a useful IP address to temporarily test this behaviour
>>>>>> might be difficult. :)
>>>>>
>>>>> Quite.
>>>>
>>>> We might actually be able to help with this
>>>
>>> Did you mention this to DSA? They'd have to configure it at this end.
>>
>> I may be dumb, but what would another IP address for an existing
>> security mirror that is already the preferred one buy us?  I would
>> expect that a second machine in the 128.* zone would be able to
>> spread the load better.
>
> See above - just temporary for purposes of demonstration.
>
> I think that the case is already clear, but to eliminate any doubt
> whatsoever...

Ok, no objection.  Noah, please allocate such an IP, configure an alias
eth, keep it for a while and remove it again after max. 1 month (hope that
gives Joy enough time).

In the long term, it may be helpful to acquire another US (uni) based
security mirror.  Joy, if you know of a site that has good bandwidth and is
willing to sponsor host + bandwidth, please let me know.

Regards,

Joey

-- 
We all know Linux is great... it does infinite loops in 5 seconds.
-- Linus Torvalds

Please always Cc to me when replying to me on the lists.






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-18 Thread Noah Meyerhans
On Tue, Dec 18, 2007 at 04:04:00PM +0100, Martin Schulze wrote:
> Ok, no objection.  Noah, please allocate such an IP, configure an alias
> eth, keep it for a while and remove it again after max. 1 month (hope that
> gives Joy enough time).

This is done.  Steffani now has interface eth0.64 with address 18.24.0.11

noah





Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-17 Thread Martin Schulze
Josip Rodin wrote:
 
> (Please Cc: any responses.)
>
> On Mon, Dec 17, 2007 at 03:10:24PM +1000, Anthony Towns wrote:
>> Interesting that it got somewhat more balanced.
>
> It looks like an effect of the weekend ending - more machines in the
> respective netblocks waking up? I checked again a few moments ago, and
> the last day's statistics show that steffani is getting some 55% of traffic.
>
>> It'd be really helpful if we could get some logs from the above hosts on
>> what IPs are accessing each host. Just the first byte of the IP address,
>> and a number of connections (or bandwidth usage) would be enough.
>
> I've asked DSA for server-status already, and mentioned the logs too,
> we'll see (they haven't replied yet).

Server status is configured on localhost.

Regards,

Joey

-- 
Ten years and still binary compatible.  -- XFree86






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-16 Thread Anthony Towns
On Sun, Dec 16, 2007 at 03:45:37AM +0100, Josip Rodin wrote:
> After around 11 hours, we've had:
> * villa 4.29 MB/s
> * lobos 3.91 MB/s
> * steffani 14.86 MB/s

The rule9 prediction was:

A: 000.000.000.000-127.255.255.255: steffani, villa, lobos
B: 128.000.000.000-191.255.255.255: steffani
C: 192.000.000.000-212.211.131.255: villa, lobos
D: 212.211.132.000-212.211.132.127: villa
E: 212.211.132.128-212.211.132.255: lobos
F: 212.211.133.000-255.255.255.255: villa, lobos
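
For anyone who wants to reproduce the table above, here is a minimal sketch of the longest-matching-prefix comparison behind RFC 3484 rule 9. It assumes the three security.d.o addresses that appear elsewhere in this thread (128.31.0.36 for steffani; 212.211.132.32 for villa and 212.211.132.250 for lobos, inferred from ranges D and E). The function names are made up for illustration, and the sketch ignores every other RFC 3484 rule and does not claim to mirror the actual resolver code.

```python
import ipaddress

# Assumed name-to-address mapping, inferred from the addresses in this thread.
MIRRORS = {
    "steffani": "128.31.0.36",
    "villa": "212.211.132.32",
    "lobos": "212.211.132.250",
}


def common_prefix_len(a, b):
    """Number of leading bits that two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def preferred_mirrors(source):
    """Mirrors sharing the longest prefix with the source address.

    When several mirrors tie, the choice among them is left to round-robin.
    """
    scores = {name: common_prefix_len(source, addr) for name, addr in MIRRORS.items()}
    best = max(scores.values())
    return sorted(name for name, score in scores.items() if score == best)


# One sample source address per range A to F (sample addresses are arbitrary).
for src in ["10.0.0.1", "130.1.2.3", "192.168.1.1",
            "212.211.132.5", "212.211.132.200", "212.211.133.1"]:
    print(src, preferred_mirrors(src))
```

A source in range A ties across all three mirrors, while a source in range B shares more leading bits with 128.31.0.36 than with either 212.211.132.x address, which is where the imbalance comes from.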

Class A is pure round-robin, so we can ignore rule9 and assign 1/3 of its
traffic to each host.

The difference between villa and lobos is aiui due not only to the difference
between the D and E IP ranges, but also because the round-robin ordering of
the hosts is [lobos, steffani, villa], which means that since rule9 happens
after round-robin, you get orderings:

lobos, steffani, villa -> [lobos, villa], steffani
steffani, villa, lobos -> [villa, lobos], steffani
villa, lobos, steffani -> [villa, lobos], steffani

A/3 + B               = 14.86 MB/s (steffani)
A/3 + 2C/3 + 2F/3 + D =  4.29 MB/s (villa)
A/3 + C/3 + F/3 + E   =  3.91 MB/s (lobos)

If you assume the 212.211.132.0/24 traffic is negligible, and thus D = E = 0,
then subtracting lobos from villa gives:

   C/3 + F/3 = 4.29 MB/s - 3.91 MB/s = 0.38 MB/s

And thus filling in for lobos, we get A/3 = 3.91 MB/s - 0.38 MB/s =
3.53 MB/s.

Going back to steffani, that gives B = 14.86 MB/s - 3.53 MB/s = 11.33 MB/s,
and we thus have:

A     = 10.59 MB/s
B     = 11.33 MB/s
C + F =  1.14 MB/s
D     =  0 MB/s (by assumption)
E     =  0 MB/s (by assumption)

Which gives us 23.06 MB/s which was the total of what we started with, yay.
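
The arithmetic above can be replayed with a short script. This is only a sketch of the same three equations under the same assumption (D = E = 0); the variable names are chosen here for illustration.

```python
# Measured averages in MB/s, from the 11-hour reading quoted above.
steffani, villa, lobos = 14.86, 4.29, 3.91

# Assumption from the text: D = E = 0 (212.211.132.0/24 traffic is negligible).
cf_third = villa - lobos       # villa - lobos = (C + F) / 3
a_third = lobos - cf_third     # lobos = A/3 + (C + F)/3
b = steffani - a_third         # steffani = A/3 + B

a = 3 * a_third
cf = 3 * cf_third
total = a + b + cf

print(f"A = {a:.2f} MB/s, B = {b:.2f} MB/s, C + F = {cf:.2f} MB/s")
print(f"total = {total:.2f} MB/s, C + F share = {cf / total:.1%}")
```

Swapping in a later set of readings only requires changing the first line of numbers.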

Note that 192.168.*.* addresses are in class C, so they can only possibly
make up just under 5% of our traffic (C + F is 1.14 of 23.06 MB/s), which
seems pretty negligible. 10.*.*.* addresses are in class A, and 172.16...
addresses are in class B. I would've thought there wouldn't be significantly
more of those than of the 192.168.*.* private addresses though.

Anyway, hope that's of some use.

Cheers,
aj





Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-16 Thread Josip Rodin
On Sun, Dec 16, 2007 at 06:28:39PM +1000, Anthony Towns wrote:
> On Sun, Dec 16, 2007 at 03:45:37AM +0100, Josip Rodin wrote:
>> After around 11 hours, we've had:
>> * villa 4.29 MB/s
>> * lobos 3.91 MB/s
>> * steffani 14.86 MB/s
>
> The rule9 prediction was:
> [...]
> Anyway, hope that's of some use.

Thanks for doing that.

FWIW, the last reading is:
* villa 5.33 MB/s
* lobos 4.92 MB/s
* steffani 14.58 MB/s

Anyway, in light of all this, please comment again on those old conclusions:

> Which leaves as conclusions:
>   - there's no available evidence of a problem from Debian server logs

This should be fixed now, for security.d.o at least.

I can go ask people maintaining servers in the other rotations for data
if you think it's necessary, but it'll take some time.

>   - the understanding of the issue we've got so far implies that this
>     would only cause fairly minor load balancing problems for the current
>     Debian hosts

This disparity doesn't qualify as a minor load balancing problem when we
see one third of a rotation doing more than twice as much as either of the
other two thirds.

It has been hard enough to get people to volunteer their sites into popular
round-robins when we would promise they'd get a fair share of traffic...

>   - ftp.us, http.us and security.d.o all seem to still be functioning
>     from a user's perspective

They are functioning now, but the higher the probability that we'll burden
some sites with excess traffic, the higher the probability that the quality
of service will suffer, and the higher the probability that those sites will
drop out of the rotation. Then others can start getting unexpectedly
large amounts of traffic (after redistribution), then they might drop out,
and then rinse & repeat...

-- 
 2. That which causes joy or happiness.






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-15 Thread Josip Rodin
On Sat, Dec 15, 2007 at 03:38:01PM +0100, Josip Rodin wrote:
> Steve pointed me to http://lists.debian.org/debian-ctte/2007/12/msg00033.html

BTW, if anyone reading has some time to do the math again (hi aj :) it
would be good if it were recalculated for ftp.us.debian.org, which
we've changed in the meantime. That round-robin is now composed of:

% host ftp.us.debian.org   
ftp.us.debian.org   A   35.9.37.225
ftp.us.debian.org   A   64.50.236.52
ftp.us.debian.org   A   128.30.2.36
ftp.us.debian.org   A   204.152.191.39

At the same time, I should just get off my ass and go find the resolving
code in the applications (apt-get, rsync, ...), and then just run it to see
how it works...

-- 
 2. That which causes joy or happiness.






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-15 Thread Josip Rodin
On Thu, Nov 29, 2007 at 10:39:13PM +0100, Josip Rodin wrote:
> I've noticed that security.debian.org, which is composed of three hosts,
> appears to be resolved by apts so that only one of them, steffani, gets
> picked. I can't substantiate this with exact log evidence yet (there's an
> outstanding RT ticket for that), but the system load on that machine is
> consistently high and network speed low, whereas the other two machines
> are practically idling in comparison.

Steve Langasek happened to ask me today to find some hard facts regarding
the round-robin functionality, so I stopped procrastinating :) and finally
set up tracking of the security.d.o machines' /proc/net/dev
(I had previously asked DSA to share their own stats, but they chose to
install rrdtool for me to use instead).

The RRDs are now being generated every minute (in my home dir on those
machines), and I have prepared a set of rrd.cgi instances that graph them
(but not on those machines, because of various intricate prerequisites).
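
To illustrate the kind of measurement involved, here is a rough sketch of turning two /proc/net/dev samples into an average transmit rate. It is not the rrdtool setup described above. It assumes the usual Linux layout (receive bytes is the first counter after the interface name, transmit bytes the ninth), decimal megabytes, and no counter wraparound; the function names and the sample numbers are invented for the example.

```python
def read_counters(text, iface="eth0"):
    """Return (rx_bytes, tx_bytes) for iface from /proc/net/dev contents."""
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == iface:
            fields = rest.split()
            return int(fields[0]), int(fields[8])
    raise KeyError(iface)


def tx_rate_mb_per_s(before, after, seconds):
    """Average transmit rate in MB/s between two samples (no wraparound handling)."""
    return (after[1] - before[1]) / seconds / 1e6


# Two made-up samples taken 60 seconds apart, for illustration only.
HEADER = (
    "Inter-|   Receive                            |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast"
    "|bytes packets errs drop fifo colls carrier compressed\n"
)
sample1 = HEADER + "  eth0: 1000 10 0 0 0 0 0 0 5000000 50 0 0 0 0 0 0\n"
sample2 = HEADER + "  eth0: 2000 20 0 0 0 0 0 0 305000000 90 0 0 0 0 0 0\n"

rate = tx_rate_mb_per_s(read_counters(sample1), read_counters(sample2), 60)
print(f"{rate:.2f} MB/s")
```

On a real host the text would come from reading /proc/net/dev once a minute rather than from the hard-coded samples.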

Right now, with a small sample of just some 110 minutes, the machine eth0s
have been averaging:
* villa: 4.69 MB/s
* lobos: 4.08 MB/s
* steffani: 15.14 MB/s

Steve pointed me to http://lists.debian.org/debian-ctte/2007/12/msg00033.html
And this is starting to match, although not as precisely. But, again, this
is too small a sample for a variety of reasons, so let's give it some time.

I'd appreciate it if someone would send a reminder in a day or two so that
I send over the data then.

-- 
 2. That which causes joy or happiness.






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-12-15 Thread Josip Rodin
On Sat, Dec 15, 2007 at 03:43:22PM +0100, Josip Rodin wrote:
> At the same time, I should just get off my ass and go find the resolving
> code in the applications (apt-get, rsync, ...), and then just run it to see
> how it works...

I edited apt's methods/connect.cc:Connect() function after the line
saying struct addrinfo *CurHost = LastHostAddr; to do:

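   // Debugging hack: report the address about to be used, then bail out.
   char hbuf[NI_MAXHOST], sbuf[NI_MAXSERV];
   getnameinfo( CurHost->ai_addr, CurHost->ai_addrlen,
                hbuf, sizeof(hbuf),
                sbuf, sizeof(sbuf),
                NI_NUMERICHOST );
   // msg needs real storage; snprintf() bounds the write.
   char msg[NI_MAXHOST + NI_MAXSERV + 64];
   snprintf(msg, sizeof(msg), "\nUsing IP address: %s, with service %s\n", hbuf, sbuf);
   return _error->Error( "%s", msg );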

Then I ran it with a minimal sources.list file including only security.d.o:

% for i in $(seq 1 30); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
  8  212.211.132.32
 10  212.211.132.250
 12  128.31.0.36
% for i in $(seq 1 30); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
  7  128.31.0.36
 11  212.211.132.250
 12  212.211.132.32
% for i in $(seq 1 30); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
  6  212.211.132.32
 10  212.211.132.250
 14  128.31.0.36
% for i in $(seq 1 30); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
  7  128.31.0.36
 11  212.211.132.32
 12  212.211.132.250
% for i in $(seq 1 30); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
  8  212.211.132.250
 11  128.31.0.36
 11  212.211.132.32

Screwy... let's try a larger sample:

% for i in $(seq 1 99); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
 30  212.211.132.250
 33  212.211.132.32
 36  128.31.0.36
% for i in $(seq 1 99); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
 27  212.211.132.250
 34  212.211.132.32
 38  128.31.0.36
% for i in $(seq 1 99); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
 32  212.211.132.250
 33  128.31.0.36
 34  212.211.132.32
% for i in $(seq 1 99); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
 30  212.211.132.250
 32  212.211.132.32
 37  128.31.0.36
% for i in $(seq 1 99); do LD_LIBRARY_PATH=bin bin/apt-get \
    -o Dir::Etc::SourceList=$PWD/sources.list \
    -o Dir::State::Lists=$PWD/foo \
    -o Dir::Bin::Methods=$PWD/bin/methods update 2>/dev/null | grep Using | sort -u;
  done | cut -d: -f2 | cut -d, -f1 | sort | uniq -c | sort -n
 31  212.211.132.32
 32  212.211.132.250
 36  128.31.0.36

That's a bit less screwy, although it's a bit more consistently tilting in
the direction of steffani.
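
As a rough check on how much of that tilt could be sampling noise, here is a sketch of a chi-square goodness-of-fit test against an even three-way split, using the totals of the five 99-run batches above. It assumes each apt-get run reports exactly one address and that the runs are independent; the closed-form p-value exp(-x/2) used here holds only for two degrees of freedom.

```python
import math

# Per-address counts from the five 99-run batches above.
runs = {
    "128.31.0.36":     [36, 38, 33, 37, 36],
    "212.211.132.32":  [33, 34, 34, 32, 31],
    "212.211.132.250": [30, 27, 32, 30, 32],
}
totals = {ip: sum(counts) for ip, counts in runs.items()}
n = sum(totals.values())
expected = n / len(totals)  # even split under pure round-robin

chi2 = sum((obs - expected) ** 2 / expected for obs in totals.values())
p_value = math.exp(-chi2 / 2)  # chi-square survival function for 2 degrees of freedom

for ip, obs in totals.items():
    print(f"{ip}: {obs} of {n} ({obs / n:.1%})")
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")
```

With these counts the statistic stays below 5.99, the usual 5% critical value for two degrees of freedom, so this sample from a single client does not by itself separate the tilt from chance. The server-side traffic figures are the stronger evidence.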

On that note, let me update that other bit:

> Right now, with a small sample of just some 110 minutes, the machine eth0s
> have been averaging:
> * villa: 4.69 MB/s
> * lobos: 4.08 MB/s
> * steffani: 15.14 MB/s

After around 11 hours, we've had:
* villa 4.29 MB/s
* lobos 3.91 MB/s
* steffani 14.86 MB/s

That would be 64% for steffani... :/
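
For reference, the share works out as follows; this is just the division spelled out, with an even three-way split shown for comparison.

```python
# 11-hour averages in MB/s, as listed above.
rates = {"villa": 4.29, "lobos": 3.91, "steffani": 14.86}

total = sum(rates.values())
shares = {host: rate / total for host, rate in rates.items()}
fair_share = 1 / len(rates)

for host, share in shares.items():
    print(f"{host}: {share:.0%} (even split would be {fair_share:.0%})")
```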

-- 
 2. That which causes joy or happiness.






Bug#438179: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical wish

2007-11-29 Thread Josip Rodin
On Thu, Nov 29, 2007 at 11:37:27PM +1000, Anthony Towns wrote:
> severity 438179 wishlist
> thanks
>
> On Tue, Nov 27, 2007 at 07:09:04PM +, Debian Bug Tracking System wrote:
>> Processing commands for [EMAIL PROTECTED]:
>>> severity 438179 serious
>> Bug#438179: Please provide a way to override RFC3484
>> Severity set to `serious' from `wishlist'
>
> Josip, there's been absolutely no evidence that our mirrors are
> fucking up. If you've got some, please share it instead of taking your
> frustration out by playing with BTS severities.

Sorry, I was actually re-reading the bug log without noticing that it was
no longer assigned to a package, rather it's now at the pseudo-package,
where the severity doesn't make all that much sense.

I've noticed that security.debian.org, which is composed of three hosts,
appears to be resolved by apts so that only one of them, steffani, gets
picked. I can't substantiate this with exact log evidence yet (there's an
outstanding RT ticket for that), but the system load on that machine is
consistently high and network speed low, whereas the other two machines
are practically idling in comparison.

I've also previously noticed that ftp.us.debian.org traffic seems to
concentrate too much on one host, too, ike.egr.msu.edu, but I've got even
less evidence there (that and other machines aren't under our control).

-- 
 2. That which causes joy or happiness.


