I've copied our current draft plan for the structure of the PyPI infra. I 
stress this is just a draft as it stands today and is not final.

On Sep 27, 2011, at 2:40 AM, Tarek Ziadé wrote:
> 
> == better infra ==
> 


The current plan is to have two primary load balancer VMs running Nginx acting 
as both balancer and SSL termination points. These will share a set of floating 
IPs using Heartbeat. Behind this will be the same Apache configuration 
currently in use (Apache serving static files and PyPI running as an FCGI 
script controlled by Apache) running on two VMs, both talking to the same 
master-slave Postgres 9 replication setup. Package files will initially be 
handled by a shared DRBD drive; however, this may be made obsolete by the 
project to move file hosting to CloudFront or another CDN.

A currently open question is how best to provide reliability and security for 
the SSH-based file upload system currently deployed on ximenez. Most likely we 
can set up the initial SSH endpoint on the load balancers to run a proxy to one 
of the main PyPI application servers; however, failover would have to be 
semi-manual (possibly driven by Chef, meaning a chef-client run would have to 
happen before the tunnel is updated, which could take up to 30 minutes). Given 
the relatively minimal public knowledge of this service, I think this is 
acceptable as a first pass, but a future solution involving HAProxy or another 
TCP load balancer to handle the SSH traffic might be appropriate. Similarly, 
automated database failover is not planned at this time due to the extra 
application-side complexity; however, the process will be well documented and 
executable by all tier-one on-call operations staff if the Postgres master 
goes offline for some reason.
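
To make the semi-manual SSH failover a bit more concrete, here is a rough 
Python sketch of the selection step that a Chef run (or an operator) might 
perform: pick the first application server whose SSH port accepts a TCP 
connection. The function names, the default port, and the idea of probing with 
a plain TCP connect are all assumptions for illustration, not part of the 
draft plan.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def pick_upload_backend(backends, port=22, is_alive=tcp_port_open):
    """Return the first backend that passes the liveness check, or None.

    `backends` is an ordered list of host names (hypothetical here); the
    order expresses preference, so the first live host wins.
    """
    for host in backends:
        if is_alive(host, port):
            return host
    return None
```

Something like pick_upload_backend(["app1", "app2"]) (placeholder host names) 
would return the host the tunnel should point at. Because the check only runs 
when it is invoked, the tunnel target is only as fresh as the last run, which 
is where the delay of up to 30 minutes comes from.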

This will only partially address the current reliability issues, as many of the 
current problems are linked to Apache or mod_fcgid needing to be restarted. In 
that light, I would like to see PyPI and the catalog-sig group investigate 
moving the codebase to work against mod_wsgi or gunicorn (no real preference 
between the two) to create a more reliable runtime environment.
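
For reference, both mod_wsgi and gunicorn load a standard WSGI callable, so 
the port mostly comes down to exposing one. The following is only a minimal 
placeholder sketch, not the PyPI codebase; the response body is made up and a 
real port would dispatch to the existing request handlers instead.

```python
def application(environ, start_response):
    """Minimal WSGI callable (placeholder for the real PyPI entry point)."""
    body = b"PyPI placeholder\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # WSGI expects an iterable of byte strings.
    return [body]
```

If this were saved as pypi_wsgi.py (a hypothetical file name), gunicorn could 
serve it with "gunicorn pypi_wsgi:application", and mod_wsgi could point a 
WSGIScriptAlias at the same file, since it looks for a callable named 
"application" by default.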

--Noah


_______________________________________________
Catalog-SIG mailing list
[email protected]
http://mail.python.org/mailman/listinfo/catalog-sig
