Re: [openstack-dev] [swift] Go! Swift!
I think most are missing the point a bit. The question that should really be asked is: what is right for Swift to continue to scale?

Since the inception of OpenStack, Swift has had to solve problems of scale that generally are not shared with the rest of OpenStack. When we first set out to write Swift, we set what we thought at the time were pretty lofty goals for ourselves:

* 100 billion objects
* 100 petabytes of data
* 100K requests/second
* 100 Gb/s throughput

We started with Python, figuring that when we hit major bottlenecks, we would look at other options. We have been surprised at how far we have been able to push Python, and have met most if not all of the goals above.

As we look toward the future, we realize that we are now asking how we will support trillions of objects, hundreds of petabytes to exabytes of data, and so on. We feel that we have finally hit the point where we need more than incremental improvements, and that we are running out of incremental improvements that can be made with Python.

What started as a simple experiment by Mike Barton has turned into quite a significant improvement in performance, and a base that future improvements can build on. This wasn't built because it was "shiny" but out of direct need, and it is currently being tested with great results on production workloads.

I applaud the team that has worked on this at Rackspace, and I hope the community can look at the current needs of Swift, and the merits of the work that has been accomplished, rather than the politics of "shiny".

Thanks,

--
Chuck

On Thu, Apr 30, 2015 at 11:45 AM John Dickinson wrote:

> Swift is a scalable and durable storage engine for storing unstructured
> data. It's been proven time and time again in production in clusters all
> over the world.
>
> We in the Swift developer community are constantly looking for ways to
> improve the codebase and deliver a better quality codebase to users
> everywhere. During the past year, the Rackspace Cloud Files team has been
> exploring the idea of reimplementing parts of Swift in Go. Yesterday, they
> released some of this code, called "hummingbird", for the first time. It's
> been proposed to a "feature/hummingbird" branch in Swift's source repo.
>
> https://review.openstack.org/#/c/178851
>
> I am very excited about this work being in the greater OpenStack Swift
> developer community. If you look at the patch above, you'll see that there
> are various parts of Swift reimplemented in Go. During the next six months
> (i.e. before Tokyo), I would like us to answer this question:
>
> What advantages does a compiled-language object server bring, and do they
> outweigh the costs of using a different language?
>
> Of course, there are a ton of things we need to explore on this topic, but
> I'm happy that we'll be doing it in the context of the open community
> instead of behind closed doors. We will have a fishbowl session in
> Vancouver on this topic. I'm looking forward to the discussion.
>
> --John
[openstack-dev] Fwd: [Eventletdev] Eventlet 0.15 pre-release testers needed
Just an FYI for those interested in the next eventlet version. It also looks like they have a Python 3 branch ready to start testing with.

--
Chuck

-- Forwarded message --
From: Sergey Shepelev
Date: Fri, Jun 13, 2014 at 1:18 PM
Subject: [Eventletdev] Eventlet 0.15 pre-release testers needed
To: eventletdev, Noah Glusenkamp <n...@empowerengine.com>, Victor Sergeyev, ja...@stasiak.at

Hello, everyone.

TL;DR: please test these versions in Python 2 and Python 3 ("pip install URL" should work):

(master)
https://github.com/eventlet/eventlet/archive/6c4823c80575899e98afcb12f84dcf4d54e277cd.zip

(py3-greenio branch, on top of master)
https://github.com/eventlet/eventlet/archive/9e666c78086a1eb0c05027ec6892143dfa5c32bd.zip

I am going to make the Eventlet 0.15 release in the coming week or two, and your feedback would be greatly appreciated because it's the first release since we started work on Python 3 compatibility. So please try to run your project in Python 3 too, if you can.
[openstack-dev] Concerns about the ballooning size of keystone tokens
There is a review for swift [1] that is requesting to set the max header size to 16k to be able to support v3 keystone tokens. That might be fine if you measure your request rate in requests per minute, but it continues to add significant overhead to swift. Even if you *only* have 10,000 requests/sec to your swift cluster, an 8k token adds almost 80MB/sec of bandwidth. This seems likely to be equally bad (if not worse) for services like marconi.

When PKI tokens were first introduced, we raised concerns about the unbounded size of the token in the header, and were told that uuid-style tokens would still be usable; but all I heard at the summit was not to use them, and that PKI was the future of all things.

At what point do we re-evaluate the decision to go with PKI tokens, and whether they are really the best idea for APIs like swift and marconi?

Thanks,

--
Chuck

[1] https://review.openstack.org/#/c/93356/
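For context, the back-of-the-envelope arithmetic behind that 80MB/sec figure (a quick sketch, nothing swift-specific):

    # Header overhead of carrying an 8k token on every request
    requests_per_sec = 10000
    token_bytes = 8 * 1024

    overhead_bytes_per_sec = requests_per_sec * token_bytes
    print(overhead_bytes_per_sec / (1024.0 * 1024.0))  # ~78 MB/sec, before any payload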
Re: [openstack-dev] Objects not getting distributed across the swift cluster...
Hi Shyam,

If I am reading your ring output correctly, it looks like only the devices on node .202 have a weight set, which is why all of your objects are going to that one node. You can update the weight of the other devices and rebalance, and things should get distributed correctly.

--
Chuck

On Thu, May 1, 2014 at 5:28 AM, Shyam Prasad N wrote:

> Hi,
>
> I created a swift cluster and configured the rings like this...
>
> swift-ring-builder object.builder create 10 3 1
>
> ubuntu-202:/etc/swift$ swift-ring-builder object.builder
> object.builder, build version 12
> 1024 partitions, 3.00 replicas, 1 regions, 4 zones, 12 devices, 300.00 balance
> The minimum number of hours before a partition can be reassigned is 1
> Devices: id region zone  ip address port replication ip replication port name weight partitions balance meta
>           0      1    1  10.3.0.202 6010     10.3.0.202             6010 xvdb   1.00       1024  300.00
>           1      1    1  10.3.0.202 6020     10.3.0.202             6020 xvdc   1.00       1024  300.00
>           2      1    1  10.3.0.202 6030     10.3.0.202             6030 xvde   1.00       1024  300.00
>           3      1    2  10.3.0.212 6010     10.3.0.212             6010 xvdb   1.00          0 -100.00
>           4      1    2  10.3.0.212 6020     10.3.0.212             6020 xvdc   1.00          0 -100.00
>           5      1    2  10.3.0.212 6030     10.3.0.212             6030 xvde   1.00          0 -100.00
>           6      1    3  10.3.0.222 6010     10.3.0.222             6010 xvdb   1.00          0 -100.00
>           7      1    3  10.3.0.222 6020     10.3.0.222             6020 xvdc   1.00          0 -100.00
>           8      1    3  10.3.0.222 6030     10.3.0.222             6030 xvde   1.00          0 -100.00
>           9      1    4  10.3.0.232 6010     10.3.0.232             6010 xvdb   1.00          0 -100.00
>          10      1    4  10.3.0.232 6020     10.3.0.232             6020 xvdc   1.00          0 -100.00
>          11      1    4  10.3.0.232 6030     10.3.0.232             6030 xvde   1.00          0 -100.00
>
> Container and account rings have a similar configuration.
> Once the rings were created and all the disks were added to the rings like
> above, I ran rebalance on each ring. (I ran rebalance after adding each of
> the nodes above.)
> Then I immediately scp the rings to all other nodes in the cluster.
>
> I now observe that the objects are all going to 10.3.0.202. I don't see
> the objects being replicated to the other nodes. So much so that 202 is
> approaching 100% disk usage, while the other nodes are almost completely empty.
> What am I doing wrong? Am I not supposed to run the rebalance operation
> after adding each disk/node?
>
> Thanks in advance for the help.
>
> --
> -Shyam
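A minimal sketch of the fix Chuck describes, using the builder file from the output above (device ids per the ring output; repeat set_weight for each device showing 0 partitions):

    swift-ring-builder object.builder set_weight d3 1.00
    swift-ring-builder object.builder set_weight d4 1.00
    # ... and so on through d11, then:
    swift-ring-builder object.builder rebalance
    # copy the resulting ring files back out to every node

Note that min_part_hours is 1 in this ring ("The minimum number of hours before a partition can be reassigned is 1"), so a rebalance may defer partition moves until an hour after the previous one.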
Re: [openstack-dev] Issues with Python Requests
I think I have worked out the performance issues with eventlet and Requests. Most of it comes down to swiftclient needing to use requests.Session to re-use connections, and there are likely other areas where we can make improvements.

Now on to Expect: 100-continue support: has anyone else looked into that?

--
Chuck

On Fri, Apr 4, 2014 at 9:41 AM, Chuck Thier wrote:

> Howdy,
>
> Now that swift has aligned with the other projects to use requests in
> python-swiftclient, we have lost a couple of features.
>
> 1. Requests doesn't support Expect: 100-continue. This is very useful
> for services like swift or glance where you want to make sure a request
> can continue before you start uploading GBs of data (for example, to find
> out that you need to auth).
>
> 2. Requests doesn't play nicely with eventlet or other async frameworks
> [1]. I noticed this when suddenly swift-bench (which uses swiftclient)
> wasn't performing as well as before. This also means that, for example, if
> you are using keystone with swift, the auth requests to keystone will block
> the proxy server until they complete, which is also not desirable.
>
> Does anyone know if these issues are being addressed, or if anyone has
> begun working on them?
>
> Thanks,
>
> --
> Chuck
>
> [1] http://docs.python-requests.org/en/latest/user/advanced/#blocking-or-non-blocking
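The connection re-use difference in a nutshell (a minimal sketch, not swiftclient's actual code; the URLs are placeholders):

    import requests

    # One-off calls: every request pays TCP (and TLS) connection setup again
    for i in range(100):
        requests.get('http://swift.example.com/v1/AUTH_test/c/obj-%d' % i)

    # A Session keeps a connection pool alive across requests
    session = requests.Session()
    for i in range(100):
        session.get('http://swift.example.com/v1/AUTH_test/c/obj-%d' % i)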
Re: [openstack-dev] Issues with Python Requests
On Fri, Apr 4, 2014 at 11:18 AM, Donald Stufft wrote:

> On Apr 4, 2014, at 10:56 AM, Chuck Thier wrote:
>
>> On Fri, Apr 4, 2014 at 9:44 AM, Donald Stufft wrote:
>>
>>> requests should work fine if you use eventlet to monkey patch the
>>> socket module prior to importing requests.
>>
>> That's what I had hoped as well (and is what swift-bench did already),
>> but it performs the same whether I monkey patch or not.
>>
>> --
>> Chuck
>
> Is it running inside of an eventlet.spawn thread?

It looks like I missed something the first time, as I tried again and got slightly different behavior. Monkey patching the socket helps, but it is still far slower than it was before. Currently, swift-bench running with requests does about 25 requests/second for PUTs and 50 requests/second for GETs. The same test without requests does 50 requests/second for PUTs and 200 requests/second for GETs.

I'll keep digging to figure out why there is such a performance difference, but if anyone else has experience tuning performance with requests, I would appreciate any input.

--
Chuck
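For anyone following along, the pattern under discussion looks roughly like this (a sketch with placeholder URLs, not swift-bench itself):

    import eventlet
    eventlet.monkey_patch()  # must run before requests (and socket) are imported

    import requests

    def fetch(url):
        # with patched sockets, this yields to other greenthreads while waiting on I/O
        return requests.get(url).status_code

    # spawn one greenthread per request, up to 10 concurrently
    pool = eventlet.GreenPool(10)
    urls = ['http://swift.example.com/healthcheck' for _ in range(10)]
    for status in pool.imap(fetch, urls):
        print(status)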
Re: [openstack-dev] Issues with Python Requests
On Fri, Apr 4, 2014 at 9:44 AM, Donald Stufft wrote:

> requests should work fine if you use eventlet to monkey patch the
> socket module prior to importing requests.

That's what I had hoped as well (and is what swift-bench did already), but it performs the same whether I monkey patch or not.

--
Chuck
[openstack-dev] Issues with Python Requests
Howdy,

Now that swift has aligned with the other projects to use requests in python-swiftclient, we have lost a couple of features.

1. Requests doesn't support Expect: 100-continue. This is very useful for services like swift or glance where you want to make sure a request can continue before you start uploading GBs of data (for example, to find out that you need to auth).

2. Requests doesn't play nicely with eventlet or other async frameworks [1]. I noticed this when suddenly swift-bench (which uses swiftclient) wasn't performing as well as before. This also means that, for example, if you are using keystone with swift, the auth requests to keystone will block the proxy server until they complete, which is also not desirable.

Does anyone know if these issues are being addressed, or if anyone has begun working on them?

Thanks,

--
Chuck

[1] http://docs.python-requests.org/en/latest/user/advanced/#blocking-or-non-blocking
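To illustrate what point 1 loses, here is a rough sketch of the 100-continue handshake at the socket level (hypothetical host, token, and object name; a real client needs more careful response parsing, and swift's proxy uses the same mechanism when talking to its backend object servers):

    import socket

    sock = socket.create_connection(('swift.example.com', 8080))
    sock.sendall(b'PUT /v1/AUTH_test/c/big-object HTTP/1.1\r\n'
                 b'Host: swift.example.com\r\n'
                 b'X-Auth-Token: AUTH_tk-placeholder\r\n'
                 b'Content-Length: 10737418240\r\n'
                 b'Expect: 100-continue\r\n'
                 b'\r\n')

    # The server answers before we send a single body byte:
    #   "HTTP/1.1 100 Continue"     -> safe to stream the 10GB body
    #   "HTTP/1.1 401 Unauthorized" -> abort; no bandwidth wasted on the upload
    interim = sock.recv(1024)
    if interim.startswith(b'HTTP/1.1 100'):
        pass  # stream the object body here
    else:
        print(interim.split(b'\r\n')[0])  # error status; nothing was uploaded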
Re: [openstack-dev] [legal-discuss] [Marconi] Why is marconi a queue implementation vs a provisioning API?
> I agree this is quite an issue but I also think that pretending that
> we'll be able to let OpenStack grow with a minimum set of databases,
> brokers and web servers is a bit unrealistic. The set of supported
> technologies won't be able to fulfill the needs of all the
> yet-to-be-discovered *amazing* projects.

Or continue to ostracize current *amazing* projects. ;)

There has long been a rift in the OpenStack community around the implementation details of swift. I know someone mentioned it earlier, but I want to focus on the fact that Swift (like marconi) is a very different kind of service: the API *is* the product. With something like Nova, the API can be down but users can still use their VMs. For swift, if the API is down, the whole product is down. We have a very different set of constraints to work within, which is why we often have to take very different approaches. There absolutely can't be a one-size-fits-all solution.

If we are going to be so strict about what an OpenStack project uses, are we then, by the same token, going to kick swift out of OpenStack because it will *never* use Pecan? And I say that not because I think Pecan is a bad tool, just not the right tool for swift.

--
Chuck
Re: [openstack-dev] Asynchrounous programming: replace eventlet with asyncio
Concurrency is hard, let's blame the tools!

Any lib that we use in Python is going to have a set of trade-offs. Looking at a couple of the options on the table:

1. Threads: Great! Code doesn't have to change too much, but now that code *will* be preempted at any time, so now we have to worry about locking, and we have even more race conditions that are difficult to debug.

2. Asyncio: Explicit FTW! Except now that big list of dependencies also has to support the same form of explicit concurrency. This is a trade-off that twisted makes as well: any library that might block has to have a separate library written for it.

We could dig deeper, but hopefully you see what I mean. Changing tools may solve one problem, but at the same time introduce a different set of problems.

I think the biggest issue with using Eventlet is that developers want to treat it like magic, and you can't do that. If you are monkey patching the world, then you are doing it wrong.

How about we take a moment to learn how to use the tools we have effectively, rather than just blaming them? Many projects have managed to use Eventlet effectively (including some in OpenStack). Eventlet isn't perfect, but it has gotten us quite a ways. If you do choose to use another library, please make sure you are trading for the right set of problems.

--
Chuck

On Fri, Feb 7, 2014 at 1:07 AM, Joshua Harlow wrote:

> +1
>
> To give an example as to why eventlet's implicit monkey-patch-the-world
> approach isn't especially great (although it's what we are currently using
> throughout openstack):
>
> The way I think about how it works is to think about what libraries a
> single piece of code calls, and how it is very hard to predict whether that
> code will trigger an implicit switch (conceptually similar to a context
> switch).
>
> Let's take a simple naive piece of code:
>
> >>> import logging
> >>> LOG = logging.getLogger(__name__)
> >>> LOG.info("hi")
>
> This seems rather straightforward: write 'hi' to some log location. With
> eventlet's implicitness (via ye-old monkey patch everything) it is entirely
> possible that somewhere inside the logging code there will be a write to a
> socket (say this person enabled a syslog/socket logger or something like
> that) and that write will block. This causes an implicit switch to another
> greenthread (and so on for the application's life-cycle). Now magnify the
> amount of understanding required to reason about how the logging library
> (which is pretty well understood) works with eventlet by the number of
> libraries in
> https://github.com/openstack/requirements/blob/master/global-requirements.txt.
> Understanding how all these libraries interact with I/O, threading, or
> other locations where things can implicitly switch is pretty much
> impossible. It becomes even more 'hairy' when those libraries themselves
> acquire some type of locks (did you, as an eventlet user, remember to
> monkey patch the threading module?)...
>
> IMHO eventlet has 'seduced' many developers into thinking that it
> magically makes an application C10K ready, even though it easily makes it
> possible to 'crash and burn' without too much trouble. Is the benefit
> worth it? Maybe, maybe not...
>
> I'm not saying we should abandon eventlet (likely we can't easily pull
> this off even if we wanted to), but I do agree that the randomness it
> provides is not easy to follow, debug, or analyze... It gets even more
> complicated when you start to mix threads (which do exist in python, but
> are GIL-handicapped, although this has been getting better in 3.2+ with
> GIL improvements) with greenthreads (try figuring out which one is causing
> race conditions in a gdb session, for example).
>
> Anyways, the future of this whole situation looks bright. It will be an
> interesting balance between making it easy to read/understand (eventlet
> tries to make it look so easy, with no changes needed; see above
> seduction) vs. requiring a "big" mind-set change in how libraries and
> applications are currently written.
>
> Which is the right path to the final destination, only time will tell :-)
>
> -Josh
>
> Sent from my really tiny device...
>
> On Feb 6, 2014, at 6:55 PM, "Zane Bitter" wrote:
>
>> On 04/02/14 13:53, Kevin Conway wrote:
>>> On 2/4/14 12:07 PM, "victor stinner" wrote:
>>>> The purpose of replacing eventlet with asyncio is to get a well defined
>>>> control flow, no more surprising task switching at random points.
>>>
>>> I disagree with this. Eventlet and gevent yield the execution context
>>> anytime an IO call is made or the 'sleep()' function is called
>>> explicitly. The order in which greenthreads gain execution context is
>>> deterministic, even if not entirely obvious. There is no context
>>> switching at random.
>>
>> This is technically correct of course, but in reality there's no way to
>> know whether a particular piece of code is safe from context switches
>> unless you have the entire codebase of the program and
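On Chuck's "monkey patching the world" point above, the difference is roughly this (a sketch; monkey_patch's per-module flags are part of eventlet's public API):

    import eventlet

    # Patching the world: every import after this may context-switch on any I/O
    # eventlet.monkey_patch()

    # Patching selectively: only the modules you have actually reasoned about
    # become cooperative
    eventlet.monkey_patch(socket=True, select=True, time=True,
                          thread=False, os=False)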
Re: [openstack-dev] [Swift] erasure codes, digging deeper
I think you are missing the point. What I'm talking about is who chooses which data is EC and which is not. The point that I am trying to make is that the operators of swift clusters should decide what data is EC, not the clients/users. How the data is stored should be totally transparent to the user.

Now, if down the road we want to offer user-defined classes of storage (like how S3 does reduced redundancy), I'm cool with that; it should just be orthogonal to the implementation of EC.

--
Chuck

On Thu, Jul 18, 2013 at 12:57 PM, John Dickinson wrote:

> Are you talking about the parameters for EC or the fact that something is
> erasure coded vs replicated?
>
> For the first, that's exactly what we're thinking: a deployer sets up one
> (or more) policies and calls them Alice, Bob, or whatever, and then the
> API client can set that on a particular container.
>
> This allows users who know what they are doing (i.e. those who know the
> tradeoffs and their data characteristics) to make good choices. It also
> allows deployers who want to have an automatic policy to set one up to
> migrate data.
>
> For example, a deployer may choose to run a migrator process that moves
> certain data from replicated to EC containers over time (and drops a
> manifest file in the replicated tier to point to the EC data so that the
> URL still works).
>
> Like existing features in Swift (e.g. large objects), this gives users
> the ability to flexibly store their data with a nice interface yet still
> have the ability to get at some of the pokey bits underneath.
>
> --John
>
> On Jul 18, 2013, at 10:31 AM, Chuck Thier wrote:
>
>> I'm with Chmouel though. It seems to me that EC policy should be chosen
>> by the provider and not the client. For public storage clouds, I don't
>> think you can make the assumption that all users/clients will understand
>> the storage/latency tradeoffs and benefits.
>>
>> On Thu, Jul 18, 2013 at 8:11 AM, John Dickinson wrote:
>>
>>> Check out the slides I linked. The plan is to enable an EC policy that
>>> is then set on a container. A cluster may have a replication policy and
>>> one or more EC policies. Then the user will be able to choose the policy
>>> for a particular container.
>>>
>>> --John
>>>
>>> On Jul 18, 2013, at 2:50 AM, Chmouel Boudjnah wrote:
>>>
>>>> On Thu, Jul 18, 2013 at 12:42 AM, John Dickinson wrote:
>>>>> * Erasure codes (vs replicas) will be set on a per-container basis
>>>>
>>>> I was wondering if there was any reason why it couldn't be on a
>>>> per-account basis, as this would allow an operator to have different
>>>> types of account and different pricing (i.e. tiered storage).
>>>>
>>>> Chmouel.
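For reference, the container-level selection John describes might look like this from the client side (a hypothetical sketch: the policy name, header, storage URL, and token are placeholders based on the proposal, not a shipped API):

    import swiftclient

    # create a container whose objects land in the deployer-defined "Alice" policy
    swiftclient.client.put_container(
        'http://swift.example.com/v1/AUTH_test',  # storage URL (placeholder)
        'AUTH_tk-placeholder',                    # auth token (placeholder)
        'my-ec-container',
        headers={'X-Storage-Policy': 'Alice'})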
Re: [openstack-dev] [Swift] erasure codes, digging deeper
I'm with Chmouel though. It seems to me that EC policy should be chosen by the provider and not the client. For public storage clouds, I don't think you can make the assumption that all users/clients will understand the storage/latency tradeoffs and benefits.

On Thu, Jul 18, 2013 at 8:11 AM, John Dickinson wrote:

> Check out the slides I linked. The plan is to enable an EC policy that is
> then set on a container. A cluster may have a replication policy and one
> or more EC policies. Then the user will be able to choose the policy for
> a particular container.
>
> --John
>
> On Jul 18, 2013, at 2:50 AM, Chmouel Boudjnah wrote:
>
>> On Thu, Jul 18, 2013 at 12:42 AM, John Dickinson wrote:
>>> * Erasure codes (vs replicas) will be set on a per-container basis
>>
>> I was wondering if there was any reasons why it couldn't be as
>> per-account basis as this would allow an operator to have different
>> type of an account and different pricing (i.e: tiered storage).
>>
>> Chmouel.