On Tue, Jun 7, 2011 at 6:35 PM, Christian Grobmeier <grobme...@gmail.com> wrote:
>> 300000 downloads per day or per month?
>>
>> 52TB per month is still a lot...
>
> per day.
> Look at this chart:
> http://marketing.openoffice.org/marketing_bouncer.html

TL;DR: these bandwidth numbers are not actually that scary or
important -- the OOo infrastructure does not serve up all that
traffic, the many mirrors do, and traffic from the OOo infrastructure
to the mirrors stays modest since new releases are pushed out
incrementally (with rsync). The big work in mirroring is setting up
the process and shepherding the mirroring community.

OOo has 118 mirrors. Apache has 294 mirrors. There is significant
overlap -- a lot of sites that mirror apache downloads already mirror
OOo downloads and vice versa.

  http://download.services.openoffice.org/mirrors/all.html
  http://www.apache.org/mirrors/

Note that many of the mirrors of OOo and of apache also mirror large
amounts of other open source software.

OOo uses mirrorbrain [1]. Apache uses some CGI scripts to accomplish a
lot, but not all, of the same functionality.

The set of data ASF pushes to its mirrors is somewhere under 50GB [4].
The set of data OOo pushes is somewhere under 300GB [2]. It doesn't
seem like a good idea to push an additional 300GB to all existing
apache mirrors: that isn't quite what they signed up for.
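
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It uses only the figures quoted in this mail (294 apache
mirrors, under 50GB for the ASF set, under 300GB for the OOo set). Both
sizes are upper bounds, so the results are ceilings rather than
measurements, and the function names are just made up for illustration.

```python
# Figures quoted in this mail; both sizes are upper bounds, not measurements.
APACHE_MIRRORS = 294
ASF_MIRROR_GB = 50
OOO_MIRROR_GB = 300


def extra_transfer_tb(mirrors: int, extra_gb: int) -> float:
    """One-time transfer, in TB (1 TB = 1000 GB), to seed every mirror."""
    return mirrors * extra_gb / 1000


def growth_factor(current_gb: int, extra_gb: int) -> float:
    """How many times larger each mirror's footprint would become."""
    return (current_gb + extra_gb) / current_gb


if __name__ == "__main__":
    seed = extra_transfer_tb(APACHE_MIRRORS, OOO_MIRROR_GB)
    factor = growth_factor(ASF_MIRROR_GB, OOO_MIRROR_GB)
    print(f"one-time seeding transfer: up to {seed:.1f} TB")
    print(f"per-mirror footprint: up to {factor:.0f}x the current size")
```

So every existing apache mirror would be asked to hold up to seven
times what it holds today, which is the real objection here, more than
the one-off transfer.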

As I understand it Oracle wants to transition out of maintaining the
OOo infrastructure eventually. So if OOo gets accepted into the
incubator, it seems like the smart approach would be to duplicate the
existing OOo mirrorbrain installation onto some apache hardware, with
the people that look after the OOo and apache mirrors figuring out the
specific details.

It will be a fair bit of work, I imagine, so it definitely needs
volunteers to step up and all that, but nothing particularly scary, I
think, assuming the existing OOo mirror maintainers [3] help out.
Without their help it will be much harder to figure out how to do
things! If most of the people that worked on mirroring at OOo are now
at TDF (and looking at mail archives it seems that might be true [7]),
we'd better be extra nice to the TDF folks :-)

Long term, if there are people to do the work, one could imagine
updating the custom ASF mirror infrastructure to use mirrorbrain,
which seems like a cool tool... but that is a _lot_ of work because
the existing CGI scripts integrate into the download pages of every
apache project.


cheers,


Leo

PS: LibreOffice also uses mirrorbrain [5] and has about 65 mirrors.
They require only 15GB for a mirror though [6], not the OOo footprint
of 200GB. Sounds much more reasonable...

[1] http://mirrorbrain.org/
[2] http://wiki.services.openoffice.org/wiki/New_OOo_Mirror_Structure
http://distribution.openoffice.org/files.html
http://wiki.services.openoffice.org/wiki/Mirrors_Project
[3] http://openoffice.org/projects/distribution/lists
[4] http://www.apache.org/info/how-to-mirror.html
[5] http://download.documentfoundation.org/mirrors/all.html
http://wiki.documentfoundation.org/Mirrors
[6] http://download.documentfoundation.org/mirroring.html
[7] http://blog.gmane.org/gmane.comp.documentfoundation.mirrors

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
