Re: Ditching our mirror system for an inferior solution? (was: Re: About Testing the SourceForce Mirror of AOO 3.4)

Eberhard Moenkeberg Sat, 14 Apr 2012 11:49:50 -0700

Hi,

On Fri, 13 Apr 2012, Peter Pöml wrote:

On 03.04.2012 at 18:17, Roberto Galoppini wrote:

We at SourceForge have worked the last ten days to line-up dedicated
infrastructure (including CDN services) to support the upcoming AOO
download serving test.


I can hardly believe reading this!


Me too. What ignorance of proven and readily available mechanisms.

What's going on? We have an existing (and well working) mirror network, that handles any required load just fine. It's proven and time-tested. It has survived all releases with ease. By all calculations, and by practical experience, the combined upload capacity of the mirrors is sufficient to satisfy the peak download demand as well as the sustained demand. By the way, the "peak download demand" doesn't really differ a lot from the day-to-day download demand, contrary to public belief. The mirrors are numerous and spread around the world, and the chance of a client being sent to a close and fast mirror is good - better than with a handful of mirrors as is the case with the Sourceforge mirror network. Sourceforge specializes in something different - providing a myriad of small files by a set of specialized mirrors. "Normal", plain simple mirrors can't take part in this network as far as I can tell.


Yes. I had tried to help with ftp5.gwdg.de - impossible "unconditionally".

Even though the network was considerably extended a few years ago, from 10 (under 10?) to >20 mirrors, this is still a small number of mirrors. (Even though these are power-mirrors, but those are part of our existing mirror network just as well.)
With our mirror network, mirrors can mirror partial content, so they can provide what's important in their region, like certain language packs only. This greatly increases the likelihood of finding mirrors in remote areas that don't have hundreds of gigabytes to spare. It's also unnecessary that mirrors carry old releases that are infrequently downloaded. Mirrors can run whatever HTTP software they prefer, not only Apache httpd, or even FTP servers. Mirrors can decide to offer mirroring only in their network/autonomous system/country to limit the share of requests they get, and from where they get it. Many mirrors don't have good international connectivity, but can be used well with us nevertheless. We provide cryptohashes, Metalinks, even P2P links, all fully automatically. That's very important for these unusually large files. Downloading without error correction is not fun. We select mirrors by GeoIP, but also by geographical distance as well as network topology, whatever gives a close match, and we already support IPv6.
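
To illustrate why per-piece hashes matter for large files, here is a minimal sketch in Python. It is not MirrorBrain's code; the function names, the piece size and the sample data are all made up for the example. The idea is only that a client holding a published list of piece hashes can find which pieces of a download are damaged and fetch just those again, instead of restarting a 200MB download.

```python
import hashlib


def piece_hashes(data: bytes, piece_size: int) -> list[str]:
    """Return the SHA-256 hex digest of each fixed-size piece of data."""
    return [
        hashlib.sha256(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


def damaged_pieces(data: bytes, expected: list[str], piece_size: int) -> list[int]:
    """Return the indexes of pieces whose hash differs from the published one.

    Assumes the download has the same length as the original file.
    """
    actual = piece_hashes(data, piece_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


# Hypothetical example: a 10-byte "file" split into 4-byte pieces.
original = b"0123456789"
published = piece_hashes(original, 4)   # what the server would publish
corrupted = b"0123X56789"               # one byte damaged in transit
bad = damaged_pieces(corrupted, published, 4)
print(bad)  # only these pieces need to be fetched again
```

A real client would read the file in chunks from disk and take the hash list from the Metalink; the sketch keeps everything in memory for brevity.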
It has taken some years to build all this, and a lot of the features were triggered directly by the work on the OpenOffice.org redirector. Built for OpenOffice.org.
The software is one kind of work that went into it; finding and collecting mirrors is the other, building trust and lasting relationships. A mirror network isn't built overnight.
I think there is a danger that the Apache mirror network is equated with the OOo mirror network. This is a mistake in my view. The large files that we have are a totally different challenge. It's a huge difference to download 6MB tarballs and 200MB files, both from the user's perspective ("why does my file not work, that I waited so long for!?") and from the mirror's perspective ("what are these 200 connections from Chinese IPs on my mirror server!?"). It is important to be able to give mirrors different weight, because they differ vastly in their capabilities, which can range from 4 Gbit bandwidth down to a brittle 50 Mbit somewhere else. Even inside an "Internet country" like Germany you'll have differences of 100 Mbit to multiple Gbit, and you want to utilize the bandwidth well. We have this working well!
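
As a rough illustration of what "different weight" means, the following Python sketch picks a mirror at random in proportion to a per-mirror weight. This is a generic weighted random choice, not necessarily how MirrorBrain does it; the mirror names and the weights (loosely modelled on bandwidth in Mbit) are invented for the example.

```python
import random


def pick_mirror(mirrors: dict[str, int], rng: random.Random) -> str:
    """Pick one mirror name with probability proportional to its weight."""
    names = list(mirrors)
    weights = [mirrors[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical mirrors; weight roughly follows available bandwidth in Mbit.
mirrors = {"big.example.org": 4000, "small.example.net": 50}

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = {name: 0 for name in mirrors}
for _ in range(10_000):
    counts[pick_mirror(mirrors, rng)] += 1
print(counts)  # the big mirror receives the vast majority of requests
```

The small mirror still gets a trickle of requests that it can handle, instead of the same share as a mirror with eighty times its bandwidth.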

I can confirm this, I have watched the growing "intelligence" of MirrorBrain from the beginning.

OpenOffice.org used a software called "Bouncer" before switching to MirrorBrain, which was one of the simpler solutions. I think everybody (who has been in the project a few years) will agree that we don't want to go back.

Surely. The OpenOffice step from Bouncer to MirrorBrain was universally agreed to be a step up in performance and quality.

BTW, dear Apache people, I am the one who helped StarOffice Hamburg to publish their first open source release - maintainer of ftp.gwdg.de for 20 years.

So I see that Sourceforge wants to beef up their network by renting a Content Delivery Network (CDN). Is that needed? Yes, because they don't have enough bandwidth in mirrors. Is that a good idea? I don't think so, but I'm biased, because 1) I don't like advertisements and 2) I'm strongly rooted in the mirror community with both legs.

Didn't MirrorBrain recently help Novell to save a lot of money they had regularly spent on Akamai before? I guess it was this way.

In the mirror community, there is a kind of self-esteem among the more ambitious mirror admins: they believe that commercial CDNs do not need to step in to handle even the peak download demand of the most popular Open Source software. And they work hard for it.

Yes, we do. All mirror admins love to see their lines full. That is the temporary excitement we strive for. MirrorBrain can give us this picture at the peak moments without frustrating a single user.

Together, we have proven that the help of commercial CDNs is *not* needed, both with OpenOffice.org and with OpenSUSE.org. Mirrors have served > 20 GByte per second together. The bandwidth is there! (In the past, Akamai was used during release peaks with OpenSUSE.org, so I have been there, and also got interesting insight and numbers there.)
I tried the currently configured download from http://www.openoffice.org/download today (from a real crappy end user box ;). It was slow and didn't start downloading immediately, but showed a page full of advertisement that didn't have any relation to OpenOffice.org, wanted to open a popup (MS IE said that and blocked it)

Hey, Peter, you and MS IE - what's going on? Are you letting others drive you crazy?

and when the download started, it came from the Swiss mirror, but I'm in Germany! What's that? Thrown 3 years back in time? Sub-optimal. (I can guess who pays for the CDN that is rented to help out: advertising.)
Do you really want to ditch what we have built? Ditching the system that improved downloading OpenOffice.org in the farthest corners of the world? Exchanging it for a handful of Sourceforge mirrors, and 250 Apache mirrors, many of which lack the capability? Some are big, but many will be far from having the bandwidth to deliver large files.
Something that Apache's mirror system also can't do is sending me to my local mirror (my own ISP in my city runs a mirror, and my home IP is in their netblock). The Apache mirror system sends me to *any* mirror in my country, while our current solution recognizes the network topology and lets me download from the local mirror. Especially with large files, that's very nice both for the ISP and for me as user. Sourceforge can theoretically do this (because they use a part of MirrorBrain for that purpose!) but doesn't have enough mirrors to play this out. This is not only useful with single ISPs, if they have a mirror; it's also useful with autonomous systems (AS) of networks that share a backbone, like most German universities in AS680 here in Germany.
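
The preference order described here (same netblock first, then same autonomous system, then same country, then anything else) can be sketched in a few lines of Python. This is an illustration under assumptions, not MirrorBrain's implementation: the `Mirror` class, the `rank_mirrors` function, the mirror names, the documentation-range prefixes and the private-use AS numbers are all made up, and the client's AS number and country are passed in directly, where a real redirector would look them up from routing and GeoIP data.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Mirror:
    name: str
    prefix: str   # netblock the mirror serves locally (hypothetical values)
    asn: int      # autonomous system number of the mirror
    country: str  # two-letter country code


def rank_mirrors(client_ip: str, client_asn: int, client_country: str,
                 mirrors: list[Mirror]) -> list[Mirror]:
    """Order mirrors from closest to farthest in network terms."""
    ip = ipaddress.ip_address(client_ip)

    def score(m: Mirror) -> int:
        if ip in ipaddress.ip_network(m.prefix):
            return 0  # client sits in the mirror's own netblock
        if m.asn == client_asn:
            return 1  # same autonomous system (shared backbone)
        if m.country == client_country:
            return 2  # same country only
        return 3      # everything else

    # sorted() is stable, so mirrors with equal scores keep their input order.
    return sorted(mirrors, key=score)


# Hypothetical data using documentation prefixes and private-use AS numbers.
mirrors = [
    Mirror("far", "203.0.113.128/25", 64514, "us"),
    Mirror("same-country", "203.0.113.0/25", 64513, "de"),
    Mirror("same-as", "198.51.100.0/24", 64512, "de"),
    Mirror("local-isp", "192.0.2.0/24", 64512, "de"),
]
ranked = rank_mirrors("192.0.2.17", 64512, "de", mirrors)
print([m.name for m in ranked])
```

A country-only redirector would stop at the third rule and treat the last three German mirrors as equal, which is exactly the difference the paragraph above describes.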


The German university network (DFN-Verein, some members are already "producing" 10 Gbit) was the base infrastructure for spreading OpenOffice (and StarOffice before it, and now already LibreOffice too).

Please don't neglect this chance for the Apache Foundation. It is clearly offered (and - regarding ftp.gwdg.de and many more - has been practiced since the beginning of Apache).

So we will have a *technically inferior* solution in the future? That's not the Apache way, is it?
I have been told more than once, on this list, that "it will be the Apache mirror system and nothing else". I didn't understand the reasons (except for policy, no special treatment for individual projects), but it won't work that way IMO.
Now it seems to me that the Apache mirror system sought the help of Sourceforge.net. If that means that some doubts crept up, then I share those doubts. But I don't see Sourceforge.net as the solution either, as explained above. They have their merits, and I like their dedication and the specialized system they've built (with features that I'm envious of!), but I think our existing solution is better suited. And not only that, IMO it is a very important prerequisite of being successful. No well-working downloads, no luck with distributing FOSS that consists of large files.


Dear Apache Foundation, please listen to Peter's words and use his work.

It will be a win for you - incredible that you did not already realize that by yourselves. You are a "community product", and so you should help to show that "the community" is autonomous.



Best regards
Eberhard Moenkeberg ([email protected], [email protected])

--
Eberhard Moenkeberg
Arbeitsgruppe IT-Infrastruktur
E-Mail: [email protected]      Tel.: +49 (0)551 201-1551
-------------------------------------------------------------------------
Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen (GWDG)
Am Fassberg 11, 37077 Goettingen
URL:    http://www.gwdg.de             E-Mail: [email protected]
Tel.:   +49 (0)551 201-1510            Fax:    +49 (0)551 201-2150
Geschaeftsfuehrer:         Prof. Dr. Ramin Yahyapour
Aufsichtsratsvorsitzender: Prof. Dr. Christian Griesinger
Sitz der Gesellschaft:     Goettingen
Registergericht:           Goettingen  Handelsregister-Nr. B 598
-------------------------------------------------------------------------