Re: [openstack-dev] [swift] Go! Swift!

2015-05-07 Thread Chuck Thier
I think most are missing the point a bit.  The question that should really
be asked is: what does Swift need in order to continue to scale?  Since the
inception of OpenStack, Swift has had to solve problems of scale that
generally are not shared with the rest of OpenStack.

When we first set out to write Swift, we set what we thought at the time
were pretty lofty goals for ourselves:

* 100 Billion objects
* 100 Petabytes of data
* 100 K requests/second
* 100 Gb/s throughput

We started with Python, figuring that when we hit major bottlenecks we
would look at other options.  We have been surprised at how far we have
been able to push Python, and we have met most, if not all, of the goals above.

As we look toward the future, we realize that we are now looking at how we
will support trillions of objects, hundreds of petabytes to exabytes of data,
and so on.  We feel that we have finally hit the point where we need more than
incremental improvements, and that we are running out of incremental
improvements that can be made with Python.

What started as a simple experiment by Mike Barton has turned into quite a
significant improvement in performance, and it builds a base that future
improvements can be built on.  This wasn't built because it is "shiny" but
out of direct need, and it is currently being tested with great results on
production workloads.

I applaud the team that has worked on this at Rackspace, and I hope the
community can look at the current needs of Swift and the merits of the
work that has been accomplished, rather than the politics of "shiny".

Thanks,

--
Chuck


On Thu, Apr 30, 2015 at 11:45 AM John Dickinson  wrote:

> Swift is a scalable and durable storage engine for storing unstructured
> data. It's been proven time and time again in production in clusters all
> over the world.
>
> We in the Swift developer community are constantly looking for ways to
> improve the codebase and deliver a better quality codebase to users
> everywhere. During the past year, the Rackspace Cloud Files team has been
> exploring the idea of reimplementing parts of Swift in Go. Yesterday, they
> released some of this code, called "hummingbird", for the first time. It's
> been proposed to a "feature/hummingbird" branch in Swift's source repo.
>
> https://review.openstack.org/#/c/178851
>
> I am very excited about this work being in the greater OpenStack Swift
> developer community. If you look at the patch above, you'll see that there
> are various parts of Swift reimplemented in Go. During the next six months
> (i.e. before Tokyo), I would like us to answer this question:
>
> What advantages does a compiled-language object server bring, and do they
> outweigh the costs of using a different language?
>
> Of course, there are a ton of things we need to explore on this topic, but
> I'm happy that we'll be doing it in the context of the open community
> instead of behind closed doors. We will have a fishbowl session in
> Vancouver on this topic. I'm looking forward to the discussion.
>
>
> --John
>
>
>
>


[openstack-dev] Fwd: [Eventletdev] Eventlet 0.15 pre-release testers needed

2014-06-13 Thread Chuck Thier
Just an FYI for those interested in the next eventlet version.  It also
looks like they have a Python 3 branch ready to start testing with.

--
Chuck

-- Forwarded message --
From: Sergey Shepelev 
Date: Fri, Jun 13, 2014 at 1:18 PM
Subject: [Eventletdev] Eventlet 0.15 pre-release testers needed
To: eventletdev , Noah Glusenkamp <
n...@empowerengine.com>, Victor Sergeyev ,
ja...@stasiak.at


Hello, everyone.

TL;DR: please test these versions on Python 2 and Python 3; a pip install
of either URL should work:
(master)
https://github.com/eventlet/eventlet/archive/6c4823c80575899e98afcb12f84dcf4d54e277cd.zip
(py3-greenio branch, on top of master)
https://github.com/eventlet/eventlet/archive/9e666c78086a1eb0c05027ec6892143dfa5c32bd.zip

I am going to make the Eventlet 0.15 release in the coming week or two, and
your feedback would be greatly appreciated because it's the first release
since we started work on Python 3 compatibility.  So please try to run your
project on Python 3 too, if you can.
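
If it helps, a minimal smoke test after installing one of the zips above
(just a suggested sketch, not part of the official instructions) could be
as simple as running this under both interpreters:

    import eventlet
    eventlet.monkey_patch()            # exercise the monkey patching path
    print(eventlet.__version__)        # should report the 0.15 pre-release

    # tiny green I/O check: a few greenthreads sleeping concurrently
    pool = eventlet.GreenPool(5)
    list(pool.imap(eventlet.sleep, [0.1] * 5))
    print("ok")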




[openstack-dev] Concerns about the ballooning size of keystone tokens

2014-05-21 Thread Chuck Thier
There is a review for Swift [1] that is requesting to set the max header
size to 16k in order to support v3 Keystone tokens.  That might be fine
if you measure your request rate in requests per minute, but this is
continuing to add significant overhead to Swift.  Even if you *only* have
10,000 requests/sec to your Swift cluster, an 8k token is adding almost
80MB/sec of bandwidth.  This seems like it will be equally bad (if not worse)
for services like Marconi.
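
For reference, the back-of-the-envelope arithmetic behind that figure
(assuming every request carries the token in its headers):

    requests_per_sec = 10000
    token_bytes = 8 * 1024                       # an "8k" token in the headers
    overhead = requests_per_sec * token_bytes    # bytes/sec of token traffic alone
    print(overhead / (1024.0 * 1024))            # ~78 MiB/sec, on the order of 80MB/sec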

When PKI tokens were first introduced, we raised concerns about the
unbounded size of the token in the header, and were told that uuid-style
tokens would still be usable; but all I heard at the summit was not to use
them, and that PKI was the future of all things.

At what point do we re-evaluate the decision to go with PKI tokens, and
whether they are really the best idea for APIs like Swift and Marconi?

Thanks,

--
Chuck

[1] https://review.openstack.org/#/c/93356/


Re: [openstack-dev] Objects not getting distributed across the swift cluster...

2014-05-01 Thread Chuck Thier
Hi Shyam,

If I am reading your ring output correctly, it looks like only the devices
on node .202 have a weight set, which is why all of your objects are going
to that one node.  You can update the weight of the other devices and
rebalance, and things should get distributed correctly.
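
If that is the case, something along these lines should do it (a sketch
using the device ids from your output; repeat the set_weight for each of
d3 through d11, and do the same for the account and container builders):

    swift-ring-builder object.builder set_weight d3 1.00
    swift-ring-builder object.builder set_weight d4 1.00
    # ... and so on through d11 ...
    swift-ring-builder object.builder rebalance
    # then push the regenerated object.ring.gz (and the account/container
    # rings) out to every node, as you are already doing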

--
Chuck


On Thu, May 1, 2014 at 5:28 AM, Shyam Prasad N wrote:

> Hi,
>
> I created a swift cluster and configured the rings like this...
>
> swift-ring-builder object.builder create 10 3 1
>
> ubuntu-202:/etc/swift$ swift-ring-builder object.builder
> object.builder, build version 12
> 1024 partitions, 3.00 replicas, 1 regions, 4 zones, 12 devices, 300.00 balance
> The minimum number of hours before a partition can be reassigned is 1
> Devices:  id  region  zone  ip address   port  replication ip  replication port  name  weight  partitions  balance  meta
>            0       1     1  10.3.0.202   6010      10.3.0.202              6010  xvdb    1.00        1024   300.00
>            1       1     1  10.3.0.202   6020      10.3.0.202              6020  xvdc    1.00        1024   300.00
>            2       1     1  10.3.0.202   6030      10.3.0.202              6030  xvde    1.00        1024   300.00
>            3       1     2  10.3.0.212   6010      10.3.0.212              6010  xvdb    1.00           0  -100.00
>            4       1     2  10.3.0.212   6020      10.3.0.212              6020  xvdc    1.00           0  -100.00
>            5       1     2  10.3.0.212   6030      10.3.0.212              6030  xvde    1.00           0  -100.00
>            6       1     3  10.3.0.222   6010      10.3.0.222              6010  xvdb    1.00           0  -100.00
>            7       1     3  10.3.0.222   6020      10.3.0.222              6020  xvdc    1.00           0  -100.00
>            8       1     3  10.3.0.222   6030      10.3.0.222              6030  xvde    1.00           0  -100.00
>            9       1     4  10.3.0.232   6010      10.3.0.232              6010  xvdb    1.00           0  -100.00
>           10       1     4  10.3.0.232   6020      10.3.0.232              6020  xvdc    1.00           0  -100.00
>           11       1     4  10.3.0.232   6030      10.3.0.232              6030  xvde    1.00           0  -100.00
>
> Container and account rings have a similar configuration.
> Once the rings were created and all the disks were added to the rings like
> above, I ran rebalance on each ring. (I ran rebalance after adding each of
> the nodes above.)
> Then I immediately scp'd the rings to all other nodes in the cluster.
>
> I now observe that the objects are all going to 10.3.0.202. I don't see
> the objects being replicated to the other nodes. So much so that 202 is
> approaching 100% disk usage, while the other nodes are almost completely empty.
> What am I doing wrong? Am I not supposed to run the rebalance operation after
> the addition of each disk/node?
>
> Thanks in advance for the help.
>
> --
> -Shyam
>


Re: [openstack-dev] Issues with Python Requests

2014-04-04 Thread Chuck Thier
I think I have worked out the performance issues with eventlet and Requests.
Most of it comes down to swiftclient needing to make use of requests.Session
to re-use connections, and there are likely other areas where we can make
improvements.
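
For anyone curious, the gist of the connection re-use change is just moving
from module-level calls to a session (a rough sketch, not the actual
swiftclient patch; the URL is made up):

    import requests

    # requests.get()/put() open a new connection per call; a Session keeps a
    # pool of connections per host and re-uses them across requests.
    session = requests.Session()
    for obj in ("a", "b", "c"):
        resp = session.get("http://proxy.example.com:8080/v1/AUTH_test/cont/" + obj)
        print(resp.status_code)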

Now on to Expect: 100-continue support.  Has anyone else looked into that?

--
Chuck


On Fri, Apr 4, 2014 at 9:41 AM, Chuck Thier  wrote:

> Howdy,
>
> Now that swift has aligned with the other projects to use requests in
> python-swiftclient, we have lost a couple of features.
>
> 1.  Requests doesn't support expect: 100-continue.  This is very useful
> for services like swift or glance where you want to make sure a request can
> continue before you start uploading GBs of data (for example, to find out up
> front that you need to auth).
>
> 2.  Requests doesn't play nicely with eventlet or other async frameworks
> [1].  I noticed this when suddenly swift-bench (which uses swiftclient)
> wasn't performing as well as before.  This also means that, for example, if
> you are using keystone with swift, the auth requests to keystone will block
> the proxy server until they complete, which is also not desirable.
>
> Does anyone know if these issues are being addressed, or if anyone has begun
> working on them?
>
> Thanks,
>
> --
> Chuck
>
> [1]
> http://docs.python-requests.org/en/latest/user/advanced/#blocking-or-non-blocking
>


Re: [openstack-dev] Issues with Python Requests

2014-04-04 Thread Chuck Thier
On Fri, Apr 4, 2014 at 11:18 AM, Donald Stufft  wrote:

>
> On Apr 4, 2014, at 10:56 AM, Chuck Thier  wrote:
>
> On Fri, Apr 4, 2014 at 9:44 AM, Donald Stufft  wrote:
>
>> requests should work fine if you used the eventlet monkey patch on the
>> socket module prior to importing requests.
>>
>
> That's what I had hoped as well (and is what swift-bench did already), but
> it performs the same whether I monkey patch or not.
>
> --
> Chuck
>
> Is it running inside of an eventlet.spawn thread?
>

It looks like I missed something the first time, as I tried again and got
slightly different behavior.  Monkey patching the socket helps, but it is
still far slower than it was before.

Currently, swift-bench running with requests does about 25 requests/second
for PUTs and 50 requests/second for GETs.  The same test without requests
does 50 requests/second for PUTs and 200 requests/second for GETs.

I'll try to keep digging to figure out why there is such a performance
difference, but if anyone else has had experience tuning performance with
requests, I would appreciate any input.
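
For reference, the pattern I'm testing looks roughly like this (a sketch;
the URL and pool size are made up).  The important part seems to be
patching before requests (and its urllib3 sockets) ever gets imported:

    import eventlet
    eventlet.monkey_patch()      # must happen before requests is imported

    import requests

    def head(url):
        return requests.head(url).status_code

    pool = eventlet.GreenPool(50)
    urls = ["http://proxy.example.com:8080/healthcheck"] * 100
    for status in pool.imap(head, urls):
        print(status)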

--
Chuck


Re: [openstack-dev] Issues with Python Requests

2014-04-04 Thread Chuck Thier
On Fri, Apr 4, 2014 at 9:44 AM, Donald Stufft  wrote:

> requests should work fine if you used the eventlet monkey patch on the
> socket module prior to importing requests.
>

That's what I had hoped as well (and is what swift-bench did already), but
it performs the same whether I monkey patch or not.

--
Chuck


[openstack-dev] Issues with Python Requests

2014-04-04 Thread Chuck Thier
Howdy,

Now that swift has aligned with the other projects to use requests in
python-swiftclient, we have lost a couple of features.

1.  Requests doesn't support Expect: 100-continue.  This is very useful for
services like Swift or Glance where you want to make sure a request can
continue before you start uploading GBs of data (for example, to find out up
front that you need to auth); see the sketch below.

2.  Requests doesn't play nicely with eventlet or other async frameworks
[1].  I noticed this when suddenly swift-bench (which uses swiftclient)
wasn't performing as well as before.  This also means that, for example, if
you are using keystone with swift, the auth requests to keystone will block
the proxy server until they complete, which is also not desirable.
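
To illustrate point 1 (the sketch referenced above), the behavior we lose
looks roughly like this on the wire.  This is a hand-rolled example over a
raw socket, purely illustrative, not swiftclient's actual implementation;
host, port, path, and token are made up:

    import socket

    HOST, PORT = "127.0.0.1", 8080   # made-up endpoint for illustration
    request = (
        "PUT /v1/AUTH_test/cont/obj HTTP/1.1\r\n"
        "Host: %s\r\n"
        "X-Auth-Token: AUTH_tk_example\r\n"
        "Content-Length: 4\r\n"
        "Expect: 100-continue\r\n"
        "\r\n" % HOST
    )

    sock = socket.create_connection((HOST, PORT))
    sock.sendall(request.encode())
    interim = sock.recv(4096).decode("latin-1")
    if interim.startswith("HTTP/1.1 100"):
        sock.sendall(b"data")                      # server said continue; send body
        print(sock.recv(4096).decode("latin-1"))   # final response
    else:
        # e.g. a 401: we find out *before* pushing GBs of data at the server
        print("not sending body:", interim.splitlines()[0])
    sock.close()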

Does anyone know if these issues are being addressed, or if anyone has begun
working on them?

Thanks,

--
Chuck

[1]
http://docs.python-requests.org/en/latest/user/advanced/#blocking-or-non-blocking


Re: [openstack-dev] [legal-discuss] [Marconi] Why is marconi a queue implementation vs a provisioning API?

2014-03-20 Thread Chuck Thier
>
>
> I agree this is quite an issue but I also think that pretending that
> we'll be able to let OpenStack grow with a minimum set of databases,
> brokers and web servers is a bit unrealistic. The set of supported
> technologies won't be able to fulfill the needs of all the
> yet-to-be-discovered *amazing* projects.
>

Or continue to ostracize current *amazing* projects. ;)

There has long been a rift in the OpenStack community around the
implementation details of Swift.  I know someone mentioned this earlier, but I
want to focus on the fact that Swift (like Marconi) is a very different
service.  The API *is* the product.  With something like Nova, the API can
be down, but users can still use their VMs.  For Swift, if the API is
down, the whole product is down.  We have a very different set of
constraints that we have to work with, which is why we often have to take
very different approaches.  There absolutely can't be a one-size-fits-all
solution.

If we are going to be so strict about what an OpenStack project uses, are
we then by the same token going to kick Swift out of OpenStack because it
will *never* use Pecan?  And I say that not because I think Pecan is a bad
tool, just that it is not the right tool for Swift.

--
Chuck


Re: [openstack-dev] Asynchrounous programming: replace eventlet with asyncio

2014-02-07 Thread Chuck Thier
Concurrency is hard, let's blame the tools!

Any lib that we use in Python is going to have a set of trade-offs.
Looking at a couple of the options on the table:

1.  Threads:  Great!  Code doesn't have to change too much, but now that
code *will* be preempted at any time, so we have to worry about locking
and we have even more race conditions that are difficult to debug.

2.  Asyncio:  Explicit FTW!  Except now that big list of dependencies also
has to support the same form of explicit concurrency.  This is a trade-off
that Twisted makes as well: any library that might block has to have a
separate library made for it.

We could dig deeper, but hopefully you see what I mean.  Changing tools may
solve one problem, but at the same time introduce a different set of
problems.

I think the biggest issue with using Eventlet is that developers want to
treat it like magic, and you can't do that.  If you are monkey patching the
world, then you are doing it wrong.  How about we take a moment to learn
how to use the tools we have effectively, rather than just blaming them?
Many projects have managed to use Eventlet effectively (including some in
OpenStack).
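
To make that concrete, one way to avoid patching the world is to patch
selectively and deliberately, once, at the top of the entry point (a
sketch, not a recommendation for any particular project):

    import eventlet

    # green the network path only; leave os/thread untouched so code that
    # really needs OS threads or blocking os calls keeps its normal semantics
    eventlet.monkey_patch(socket=True, select=True, time=True,
                          os=False, thread=False)

    import socket
    print(socket.socket)   # now the green socket class for this process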

Eventlet isn't perfect, but it has gotten us quite a ways.  If you do
choose to use another library, please make sure you are trading for the
right set of problems.

--
Chuck


On Fri, Feb 7, 2014 at 1:07 AM, Joshua Harlow wrote:

>  +1
>
> To give an example as to why eventlet's implicit monkey-patch-the-world
> approach isn't especially great (although it's what we are currently using
> throughout OpenStack).
>
> The way I think about how it works is to think about what libraries a
> single piece of code calls, and how it is very hard to predict whether that
> code will trigger an implicit switch (conceptually similar to a context
> switch).
>
> Let's take a simple naive piece of code.
>
> >>> import logging
> >>> LOG = logging.getLogger(__name__)
> >>> LOG.info("hi")
>
> This seems rather straightforward: write 'hi' to some log location. With
> eventlet's implicitness (via the ye-olde monkey-patch-everything approach) it
> is entirely possible that somewhere inside the logging code there will be a
> write to a socket (say this person enabled a syslog/socket logger or something
> like that) and that will block. This causes an implicit switch to another
> greenthread (and so on for the application's life-cycle). Now just magnify
> the amount of understanding required to reason about how the logging
> library (which is pretty well understood) works with eventlet by the number
> of libraries in
> https://github.com/openstack/requirements/blob/master/global-requirements.txt.
> Understanding how all these libraries interact with I/O, threading, or other
> locations where things can implicitly switch is pretty much
> impossible. It becomes even more 'hairy' when those libraries themselves
> acquire some type of lock (did you, as an eventlet user, remember to monkey
> patch the threading module?)...
>
> IMHO eventlet has 'seduced' many developers into thinking that it
> magically makes an application C10K ready, even though it easily makes it
> possible to 'crash and burn' without too much trouble. Is the benefit worth
> it? Maybe, maybe not...
>
> I'm not saying we should abandon eventlet (likely we can't easily pull
> this off even if we wanted to), but I do agree that the randomness it
> provides is not easy to follow, debug, analyze... It gets even more
> complicated when you start to mix threads (which do exist in python, but
> are GIL handicapped, although this has been getting better in 3.2+ with GIL
> improvements) with greenthreads (try figuring out which one is causing race
> conditions in a gdb session for example).
>
> Anyways, the future of this whole situation looks bright, it will be an
> interesting balance between making it easy to read/understand (eventlet
> tries to make it look so easy and no-changes needed, see above seduction)
> vs. requiring a "big" mind-set change in how libraries and applications are
> currently written.
>
> Which is the right path to get to the final destination, only time will
> tell :-)
>
>  -Josh
>
>
> Sent from my really tiny device...
>
> On Feb 6, 2014, at 6:55 PM, "Zane Bitter"  wrote:
>
>  On 04/02/14 13:53, Kevin Conway wrote:
>
> On 2/4/14 12:07 PM, "victor stinner"  wrote:
>
>  >The purpose of replacing eventlet with asyncio is to get a well defined
>
>  >control flow, no more surprising task switching at random points.
>
>
>  I disagree with this. Eventlet and gevent yield the execution context
>
> anytime an IO call is made or the 'sleep()' function is called explicitly.
>
> The order in which greenthreads gain execution context is deterministic
>
> even if not entirely obvious. There is no context switching at random.
>
>
> This is technically correct of course, but in reality there's no way to
> know whether a particular piece of code is safe from context switches
> unless you have the entire codebase of the program and 

Re: [openstack-dev] [Swift] erasure codes, digging deeper

2013-07-18 Thread Chuck Thier
I think you are missing the point.  What I'm talking about is who chooses
what data is EC and what is not.  The point that I am trying to make is
that the operators of swift clusters should decide what data is EC, not the
clients/users.  How the data is stored should be totally transparent to the
user.

Now if, down the road, we want to offer user-defined classes of storage (like
how S3 does reduced redundancy), I'm cool with that; it just should be
orthogonal to the implementation of EC.
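
For what it's worth, that "operator defines it, the API only exposes a named
class of storage" model could look something like this from the client side
(a hypothetical sketch; the header name and policy names are illustrative,
since none of this is implemented yet):

    import swiftclient

    conn = swiftclient.Connection(authurl="http://proxy.example.com:8080/auth/v1.0",
                                  user="test:tester", key="testing")

    # the user only picks a name the operator has defined (e.g. "gold" vs
    # "archive"); whether that means 3 replicas or a 10+4 erasure code stays
    # an operator detail
    conn.put_container("backups", headers={"X-Storage-Policy": "archive"})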

--
Chuck


On Thu, Jul 18, 2013 at 12:57 PM, John Dickinson  wrote:

> Are you talking about the parameters for EC or the fact that something is
> erasure coded vs replicated?
>
> For the first, that's exactly what we're thinking: a deployer sets up one
> (or more) policies and calls them Alice, Bob, or whatever, and then the API
> client can set that on a particular container.
>
> This allows users who know what they are doing (ie those who know the
> tradeoffs and their data characteristics) to make good choices. It also
> allows deployers who want to have an automatic policy to set one up to
> migrate data.
>
> For example, a deployer may choose to run a migrator process that moves
> certain data from replicated to EC containers over time (and drops a
> manifest file in the replicated tier to point to the EC data so that the
> URL still works).
>
> Like existing features in Swift (eg large objects), this gives users the
> ability to flexibly store their data with a nice interface yet still have
> the ability to get at some of the pokey bits underneath.
>
> --John
>
>
>
> On Jul 18, 2013, at 10:31 AM, Chuck Thier  wrote:
>
> > I'm with Chmouel though.  It seems to me that EC policy should be chosen
> by the provider and not the client.  For public storage clouds, I don't
> think you can make the assumption that all users/clients will understand
> the storage/latency tradeoffs and benefits.
> >
> >
> > On Thu, Jul 18, 2013 at 8:11 AM, John Dickinson  wrote:
> > Check out the slides I linked. The plan is to enable an EC policy that
> is then set on a container. A cluster may have a replication policy and one
> or more EC policies. Then the user will be able to choose the policy for a
> particular container.
> >
> > --John
> >
> >
> >
> >
> > On Jul 18, 2013, at 2:50 AM, Chmouel Boudjnah 
> wrote:
> >
> > > On Thu, Jul 18, 2013 at 12:42 AM, John Dickinson  wrote:
> > >>* Erasure codes (vs replicas) will be set on a per-container basis
> > >
> > > I was wondering if there were any reasons why it couldn't be on a
> > > per-account basis, as this would allow an operator to have different
> > > types of accounts and different pricing (i.e. tiered storage).
> > >
> > > Chmouel.
> >
> >


Re: [openstack-dev] [Swift] erasure codes, digging deeper

2013-07-18 Thread Chuck Thier
I'm with Chmouel though.  It seems to me that EC policy should be chosen by
the provider and not the client.  For public storage clouds, I don't think
you can make the assumption that all users/clients will understand the
storage/latency tradeoffs and benefits.


On Thu, Jul 18, 2013 at 8:11 AM, John Dickinson  wrote:

> Check out the slides I linked. The plan is to enable an EC policy that is
> then set on a container. A cluster may have a replication policy and one or
> more EC policies. Then the user will be able to choose the policy for a
> particular container.
>
> --John
>
>
>
>
> On Jul 18, 2013, at 2:50 AM, Chmouel Boudjnah 
> wrote:
>
> > On Thu, Jul 18, 2013 at 12:42 AM, John Dickinson  wrote:
> >>* Erasure codes (vs replicas) will be set on a per-container basis
> >
> > I was wondering if there were any reasons why it couldn't be on a
> > per-account basis, as this would allow an operator to have different
> > types of accounts and different pricing (i.e. tiered storage).
> >
> > Chmouel.
>
>