Blueprint changed by Ben Howard:

Whiteboard changed:
  Rationale:
  S3 is a faster, more scalable technology that allows us to reduce costs and 
increase availability for EC2 users.
  
  Assumption:
    * Running on-EC2 mirrors is expensive and presents availability challenges.
    * On-EC2 mirrors have limited bandwidth, so download speeds can degrade 
under load
    * S3 has very high availability
    * S3 intra-region bandwidth is free
    * On-EC2 mirrors result in bandwidth charges when users use a mirror 
outside their availability zone
    * S3 is extremely fast; access speeds to S3 are generally near-native
    * Amazon has asked us to host our mirrors on S3
    * Uploads to S3 are free
  
  A prototyped-S3 mirror will be shown at UDS.
- 
- ---
- 
- Hurdles:
- https://bugs.launchpad.net/ubuntu/+source/apt/+bug/882832 - S3 does not 
accept "+"'s when fetching files.
-   * May need to change the APT meta-data to replace "+" with "%2B"
-   * May need to maintain separate meta-data for the S3 buckets and re-sign it
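For reference, the "+" to "%2B" rewrite mentioned in the hurdle above is plain 
percent-encoding; a minimal sketch in Python (the package filename is 
illustrative, not taken from the archive):

```python
# Percent-encode a pool filename before using it as an S3 key. quote() leaves
# unreserved characters (letters, digits, "-", "_", ".", "~") alone and
# rewrites "+" as "%2B". The filename below is made up for illustration.
from urllib.parse import quote

def s3_safe(name):
    return quote(name, safe="")

print(s3_safe("gtk+2.0_2.24.6.orig.tar.gz"))  # -> gtk%2B2.0_2.24.6.orig.tar.gz
```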
  
  ----
  
  Q. 'LAN' bandwidth access to the region mirrors is currently free already, 
no? -- Daviey
  A. All upload and intra-zone transit is free.
  
  Q. How do you keep people outside of EC2 from accessing the mirrors?
  A. Bucket policies enable restricting access to the Amazon CIDR address 
ranges.
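A bucket policy along these lines could express that restriction (a sketch 
only: the CIDR range here is a placeholder, not Amazon's actual published EC2 
range; the bucket name follows the naming scheme in the examples below):

```json
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AllowEC2SourceOnly",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::us-east-1.ec2.archive.ubuntu.com/*",
      "Condition": {
        "IpAddress": { "aws:SourceIp": "10.0.0.0/8" }
      }
    }
  ]
}
```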
  
  Comments:
  - Can this whiteboard be pre-filled with some examples of how others have 
implemented this?
     *  AFAIK, there is only one implementation of S3 buckets. Most examples 
that I know of use S3 as a storage backend, while having an EC2 instance front 
the S3 storage. The Amazon Linux AMI is the only pure S3 backend/frontend 
solution.
  
  - How would this work?
    * There are a number of ways to push out a repository using existing tools. 
s3fuse (albeit rather buggy) allows you to mount an S3 bucket as a file system. 
s3cmd allows for local-to-remote synchronization. However, in order to build 
something that scales well, you need something that uses multiple connections 
and pushes multiple files at once (e.g. threading and the boto Python library)
     * Example synchronization code can be found at 
lp:~utlemming/+junk/s3repo_tool (this code will be used to populate the 
prototyped-S3 mirror for UDS).
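The threaded, multi-connection push described above could be sketched roughly 
as follows (function names are illustrative; the actual boto upload call is 
shown only as a comment, since it needs credentials and a live bucket):

```python
# Sketch of a worker pool that pushes many files to S3 concurrently.
# `upload` is injected so the S3 client (e.g. boto) stays pluggable; the
# bucket name and call shown in the comment are illustrative.
import queue
import threading

def push_files(paths, upload, workers=8):
    """Call upload(path) for every path, spread across `workers` threads."""
    todo = queue.Queue()
    for p in paths:
        todo.put(p)
    done, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                path = todo.get_nowait()
            except queue.Empty:
                return
            # With boto this would be something like:
            #   bucket.new_key(path).set_contents_from_filename(path)
            upload(path)
            with lock:
                done.append(path)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done
```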
  
  --
  
  Examples:
  Amazon Linux AMI currently uses S3 for backing the mirrors. The design is 
more or less:
      - One bucket per region.
      - Buckets are named after the DNS CNAME, e.g. 
"us-east-1.ec2.archive.ubuntu.com"
      - a method of pushing pristine repositories up

-- 
Move EC2 mirrors to S3
https://blueprints.launchpad.net/ubuntu/+spec/servercloud-p-move-ec2-mirrors-to-s3

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs
