We have finally assessed the capacity and capabilities needed to serve the surge of Apache OpenOffice 3.4 release-time traffic. Before we could commit to delivering the full download volume, we wanted to produce a vetted plan, including a clear timeline and backing technical implementation plans.

First, let me quickly recap my understanding of the problems we are trying to solve:

- Apache OpenOffice 3.4 will be released in mid April, and we want to ensure capacity to handle that traffic in terms of both bandwidth and simultaneous connections.
- The Apache OpenOffice project would benefit from being able to promote the release heavily without worrying about capacity.

Given those needs, and the fact that the Apache Infrastructure team said they’d welcome our assistance, we at SourceForge think we can help and that there would be mutual benefit.

What we are proposing is an elaboration of Joe’s ‘hybrid’ approach:

- Both AOO and SF.net mirror networks would be used to provide download capacity for the 3.4 release.
- SourceForge.net would be the “recommended default download” on the website.
- The Apache mirror network would be an alternate download option.
- The Apache OpenOffice team and Infrastructure team will maintain control of the auto-update URLs and possibly follow Rob’s suggestion to stagger automatic updates.

SourceForge.net will manage the full burst capacity for web-based downloads through our global network of OSS mirrors, global CDN network(s) and cloud file server providers. Using these resources, we anticipate our capacity is well above the expected delivery requirements for the upcoming release.

In addition to basic download capacity, SourceForge will provide detailed download statistics, which will support future product, infrastructure and marketing plans. We commit to making stats available on the SourceForge.net website and to providing stats delivery APIs. We are able to capture initiated downloads, not just page views, and will provide them split by geography and operating system. We’re also willing to consider additional stats needs.

Proposed Timeline:

- Immediately: SourceForge sets up the Apache Infra team with credentials on an AOO mirror project in sf.net.
- First week: SourceForge updates contracts with CDN and other providers to handle full AOO peak release traffic.
- Second week: AOO Infra team works with the sf.net operations team to ramp traffic to sf.net in a controlled way in order to gather statistical data, verify assumptions, and give the Apache infrastructure team time to verify our capacity.
- 1-2 days post test: SF.net analyzes traffic data and confirms that our assumptions about geographic mix, and interactive vs automated download mix, are valid and that we can do this in a fiscally responsible way.
- 1-2 days post test: AOO infrastructure team analyzes traffic data, lets the sf.net team know of any additional data needs, and validates that the system will work for them.

Once everything is tested and vetted on both sides, we will need to make a CDN bandwidth commit, and we would like the AOO team to commit to notifying us 30 days before shutting down the flow of traffic, so that we can update our contracts and avoid penalties.

We believe that the combination of SF.net mirrors and CDN-based burst capacity will provide a fast and stable download experience for AOO users, and *will allow the AOO team to publicize the release in an aggressive manner.*

On Wed, Mar 21, 2012 at 10:55 AM, Mark Ramm <m...@geek.net> wrote:

>>> And finally: would you have any objection to us using a mix of fixed
>>> mirrors, elastic file delivery services (like s3), and commercial CDN
>>> service to handle spikes in download gracefully and assure that global
>>> users get good download performance when local mirrors are overloaded
>>> or not available?
>>
>> No, we may even be willing to budget some amount for this purpose.
>> Cost estimates would be appreciated as our budget numbers for FY2012
>> need to be finalized next week.
>
> Sorry that it's taken a bit to get back to you.
> We are working on getting pricing from a variety of providers, and my
> personal goal is to find a way for us to fund the CDN and S3 costs, and
> to provide this to the community as a free (as in beer) service.
>
> Thanks to everybody who provided anecdotal information on historical
> traffic peaks, and particularly for the steady state run rate
> information. That has been invaluable as we talk with vendors about the
> supplemental capacity we need to acquire to handle peak loads.
>
> There's one key input to figuring out if I can pay for all of this out
> of ad revenue, which is what percentage of the daily downloads are
> expected to come from auto-updater software or other non-browser
> scripts? Would that traffic still be pointed primarily at AOO owned
> domains and mirrors, or would we be handling some of that from the
> sf.net service?
>
> And finally, I'd also be interested in finding out if you know what
> percentage of traffic is from North America vs the rest of the world,
> because some providers give very different rates for different
> locations. For example, Cloudfront publishes $0.02/gb US and $0.12/gb
> in South America.
>
> Thanks again to everybody who helped with data so far!
>
> --Mark Ramm

--
*Mark Ramm*
Director of Engineering, SourceForge Developer Experience
phone: 734-707-7266
email: m...@geek.net
skype: geekmark