Hi all,

First of all, my apologies in advance if I’m inadvertently breaking any posting rules!

I wonder if anyone might have experience deploying OpenStack Swift server(s) for production *small-scale* web apps? If you're pressed for time, I'd appreciate any advice you can provide. The rest of this is context, a few more specific questions, etc., for anyone who enjoys the challenge of reading a 3-part novel.

I am involved in a project right now to build a sort of workflow/messaging/document control web application. The application will be deployed to a number of data centres for different organizations. Each organization will host between 5,000 and 12,000 users, although down the road a larger organization (100,000 users) might be deployed as well.

Before my time, OpenStack Swift was chosen as the object storage service for users' documents. I have no experience deploying, maintaining or troubleshooting Swift. All I know is what I've read on the OpenStack website and a few blog entries and mailing list anecdotes.

Unfortunately I do not have any hard statistics about hit rates, data flow, amount of storage, etc. My suspicion is that the application will be used casually, perhaps a few minutes per day per average user, more or less like a casual/occasional Facebook user, reading a few posts and maybe writing a response or two. I believe the average user would typically retrieve only images from object storage (corporate logos and maybe a photo or two). More rarely, a user would upload or download PDFs and Word files and those sorts of documents. Most of the documents would be small; my guess is that an average user would store nowhere near 1 gigabyte of document/object data. Some of the documents would contain private/confidential data.

The recommendations I’ve found seem to suggest that 3 storage nodes are the minimum to get much benefit out of Swift. Does anyone have any feedback on this number? Has anyone deployed a single Swift storage node to production? Or 2? For example, perhaps by replicating to 3 different devices on a single node?

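To put rough numbers on the storage question, here is a back-of-envelope sketch in Python. It assumes a deliberately pessimistic ceiling of 1 GB per user and 3 full replicas of every object; both figures are assumptions rather than measurements, and `capacity_tb` is just an illustrative helper name.

```python
# Back-of-envelope sizing sketch; every constant here is an assumption.
GB_PER_USER = 1.0   # pessimistic ceiling; real usage is expected to be far lower
REPLICAS = 3        # assumed number of full copies kept of each object


def capacity_tb(users, gb_per_user=GB_PER_USER, replicas=REPLICAS):
    """Return (logical_tb, raw_tb) in decimal units (1 TB = 1000 GB)."""
    logical_tb = users * gb_per_user / 1000
    raw_tb = logical_tb * replicas  # disk space needed once replicas are counted
    return logical_tb, raw_tb


# The three organization sizes mentioned above.
for users in (5_000, 12_000, 100_000):
    logical, raw = capacity_tb(users)
    print(f"{users:>7,} users: {logical:6.1f} TB logical, {raw:6.1f} TB raw")
```

Under these assumptions the raw disk requirement is three times the logical data size, regardless of whether the replicas live on separate nodes or on separate devices within one node.
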
From a project perspective, my superiors are concerned about costs. Assuming 3 object storage nodes, plus 1 PAC node (in order to segregate the application tier from the data tier), that’s 4 nodes to deploy at each organization.

From a performance perspective, I can’t help but question the wisdom of deploying a large, scalable distributed system like Swift in order to manage small-sized document objects for a small number of users. Again, I have no hard statistics. Guessing 1 GB per user with 12,000 users, that’s about 12 terabytes of object storage. Does anyone run production Swift object servers with 12 TB or less in object data?

From a support perspective, the complexity of maintaining and troubleshooting Swift also seems to me to be rather high for such a small deployment. With rsync running in addition to the Swift servers, and the potential sources of disaster or corruption residing on 4+ nodes and in multiple applications / services, I would expect the cost of providing technical support to be much higher than it would be to, say, maintain a RAID array. I also have no sense of whether or how this complexity would change the amount of time it takes to recover from a disaster. Does anyone have any experience maintaining and troubleshooting Swift in a small-scale production environment?

Any feedback / statistics / analysis / advice / links / etc. anyone can provide would be very much appreciated.

Thanks all,

Johann Tienhaara
Jtienhaara AT yahoo DOT com

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : [email protected]
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
