Re: [openstack-dev] [oslo.db]A proposal for DB read/write separation

2014-08-13 Thread Mike Wilson
Lee,

No problem about mixing up the Mikes, there's a bunch of us out there :-).
What you are describing here is very much like a spec I wrote for
Nova[1] a couple months ago and then never got back to. At the time I
considered gearing the feature toward oslo.db and I can't remember exactly
why I didn't. I think it probably had more to do with having folks that are
familiar with the problem reviewing code in Nova than anything else.
Anyway, I'd like to revisit this in Kilo or if you see a nice way to
integrate this into oslo.db I'd love to see your proposal.

-Mike

[1] https://review.openstack.org/#/c/93466/


On Sun, Aug 10, 2014 at 10:30 PM, Li Ma skywalker.n...@gmail.com wrote:

  not sure if I said that :).  I know extremely little about galera.

 Hi Mike Bayer, I'm so sorry I mistook you for Mike Wilson in my last
 post. :-) My apologies to Mike Wilson as well.

  I’d totally guess that Galera would need to first have SELECTs come from
 a slave node, then the moment it sees any kind of DML / writing, it
 transparently switches the rest of the transaction over to a writer node.

 You are totally right.

 
  @transaction.writer
  def read_and_write_something(arg1, arg2, …):
      # …

  @transaction.reader
  def only_read_something(arg1, arg2, …):
      # …

 The first approach that I had in mind is the decorator-based method for
 separating read/write ops, as you described. To some degree, it is almost
 the same app-level approach as the master/slave configuration in terms of
 transparency to developers. However, as I stated before, the current
 approach is rarely used in OpenStack. A decorator is friendlier than a
 use_slave flag or something like that. Even if ideal transparency cannot be
 achieved, decorator-based app-level switching is at least a great
 improvement over the current implementation.
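 As a rough illustration of the decorator idea, here is a minimal sketch in
 plain Python. It is not oslo.db code: the names writer, reader,
 current_target, MASTER and SLAVE are hypothetical, and the "connection" is
 just a string recorded in thread-local state. The one design point it tries
 to show is that a reader called from inside a writer stays on the master.

```python
import functools
import threading

# Hypothetical stand-ins for real engine/connection objects.
MASTER = "master"
SLAVE = "slave"

_local = threading.local()


def current_target():
    """Return the connection role chosen by the innermost decorator."""
    return getattr(_local, "target", None)


def _route(target):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            previous = current_target()
            # A reader nested inside a writer must stay on the master so it
            # can see rows written earlier in the same transaction.
            _local.target = MASTER if previous == MASTER else target
            try:
                return func(*args, **kwargs)
            finally:
                _local.target = previous
        return wrapper
    return decorator


writer = _route(MASTER)
reader = _route(SLAVE)


@reader
def only_read_something():
    return current_target()


@writer
def read_and_write_something():
    # The nested read is pinned to the master by the enclosing writer.
    return current_target(), only_read_something()
```

 Compared with passing a use_slave flag through every call, the routing
 decision here lives in one place and the function bodies stay unchanged.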

  OK so Galera would perhaps have some way to make this happen, and that's
 great.

 If there are any Galera experts here, please correct me. At least in my
 experiments, transactions work that way.

  this (the word “integrate”, and what does that mean) is really the only
 thing making me nervous.

 Mike, just feel free. What I'd like to do is to add a django-style routing
 method as a plus in oslo.db, like:

 [database]
 # Original master/slave configuration
 master_connection =
 slave_connection =

 # Only Support Synchronous Replication
 enable_auto_routing = True

 [db_cluster]
 master_connection =
 master_connection =
 ...
 slave_connection =
 slave_connection =
 ...

 HOWEVER, I think it needs more investigation, which is why I'd like to
 bring it to the mailing list at an early stage to start an in-depth
 discussion. I'm not a Galera expert. I really appreciate any challenges here.

 Thanks,
 Li Ma


 - Original Message -
 From: Mike Bayer mba...@redhat.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Sunday, August 10, 2014 11:57:47 PM
 Subject: Re: [openstack-dev] [oslo.db]A proposal for DB read/write
 separation


 On Aug 10, 2014, at 11:17 AM, Li Ma skywalker.n...@gmail.com wrote:

 
  How about a Galera multi-master cluster? As Mike Bayer said, it is
 virtually synchronous by default. It is still possible to read outdated
 rows, which makes results unstable.

 not sure if I said that :).  I know extremely little about galera.


 
 
  Let's move forward to synchronous replication, like Galera with
 causal-reads on. The dominant advantage is that it has consistent
 relational dataset support. The disadvantages are that it uses optimistic
 locking and its performance sucks (also said by Mike Bayer :-). As for the
 optimistic locking problem, I think it can be dealt with by
 retry-on-deadlock. It's not the topic here.
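 For readers unfamiliar with retry-on-deadlock, a minimal sketch follows. It
 assumes the database layer raises some exception when a transaction loses a
 conflict; DeadlockError and retry_on_deadlock below are hypothetical names
 used only for illustration, not an existing API.

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock/conflict error."""


def retry_on_deadlock(max_retries=3, delay=0.0):
    """Re-run the wrapped function when it fails with DeadlockError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    if attempt == max_retries:
                        raise  # give up after the last allowed retry
                    time.sleep(delay * (attempt + 1))
        return wrapper
    return decorator


calls = {"count": 0}


@retry_on_deadlock(max_retries=3)
def flaky_update():
    # Simulate a transaction that loses the conflict twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("simulated conflict")
    return "committed"
```

 The wrapped function has to be safe to run again from the start, which in
 practice means the whole transaction is inside it.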

 I *really* don’t think I said that, because I like optimistic locking, and
 I’ve never used Galera ;).

 Where I am ignorant here is of what exactly occurs if you write some rows
 within a transaction with Galera, then do some reads in that same
 transaction.   I’d totally guess that Galera would need to first have
 SELECTs come from a slave node, then the moment it sees any kind of DML /
 writing, it transparently switches the rest of the transaction over to a
 writer node.   No idea, but it has to be something like that?
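 The behaviour guessed at above can be modelled with a toy session. This is
 only a sketch of the idea (read from a replica until the first write, then
 stick to the writer for the rest of the transaction); it says nothing about
 what Galera actually does. RoutingSession is a hypothetical class and plain
 dicts stand in for the database nodes.

```python
class RoutingSession:
    """Toy session: replica reads until the first write, then master only."""

    def __init__(self, master, replica):
        self.master = master      # dict standing in for the writer node
        self.replica = replica    # dict standing in for a lagging reader node
        self._wrote = False

    def write(self, key, value):
        self._wrote = True
        self.master[key] = value

    def read(self, key):
        # After a write, reads must see this transaction's own rows.
        source = self.master if self._wrote else self.replica
        return source.get(key)

    def commit(self):
        # End of transaction: the next one may read from the replica again.
        self._wrote = False


master, replica = {"a": 1}, {"a": 1}
session = RoutingSession(master, replica)
before = session.read("b")   # served by the replica
session.write("b", 2)
after = session.read("b")    # served by the master
```

 Note that after commit() a read may go back to a replica that has not yet
 received the row, which is the stale-read problem discussed in this thread.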


 
 
  So, the transparent read/write separation is dependent on such an
 environment. The SQLAlchemy tutorial provides a code sample for it [1].
 Besides, Mike Bayer also has a blog post about it [2].

 So this thing with the “django-style routers”, the way that example is, it
 actually would work poorly with a Session that is not in “autocommit” mode,
 assuming you’re working with regular old databases that are doing some
 simple behind-the-scenes replication. Because again, if you do a flush,
 those rows go to the master; if the transaction is still open and you then
 read from the slaves, you won’t see the rows you just inserted. So in
 reality, that example is kind of crappy

Re: [openstack-dev] [oslo.db]A proposal for DB read/write separation

2014-08-08 Thread Mike Wilson
Li Ma,

This is interesting. In general I am in favor of expanding the scope of any
read/write separation capabilities that we have. I'm not clear on what exactly
you are proposing; hopefully you can answer some of my questions inline.
The thing I had thought of immediately was detection of whether an
operation is read or write and integrating that into oslo.db or sqlalchemy.
Mike Bayer has some thoughts on that[1] and there are other approaches
around that can be copied/learned from. These sorts of things are clear to
me and while moving towards more transparency for the developer, still
require context. Please, share with us more details on your proposal.

-Mike

[1]
http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/
[2]
http://www.percona.com/doc/percona-xtradb-cluster/5.5/wsrep-system-index.html


On Thu, Aug 7, 2014 at 10:03 PM, Li Ma skywalker.n...@gmail.com wrote:

 Getting a massive amount of information from data storage to be displayed
 is where most of the activity happens in OpenStack. The two activities of
 reading data and writing (creating, updating and deleting) data are
 fundamentally different.

 The optimization for these two opposite database activities can be done by
 physically separating the databases that service these two different
 activities. All the writes go to database servers, which then replicate
 the written data to the database server(s) dedicated to servicing the reads.


 Currently, AFAIK, many OpenStack deployments in production try to take
 advantage of a MySQL (including Percona or MariaDB) multi-master Galera
 cluster. It is possible to design and implement a read/write separation
 schema for such a DB cluster.


I just want to clarify here, are you suggesting that _all_ reads and _all_
writes would hit different databases? It would be interesting to see a
relational schema design that would allow that to work. That seems like
something that you wouldn't try in a relational database at all.



 Actually, OpenStack has a method for read scalability via defining
 master_connection and slave_connection in configuration, but this method
 lacks flexibility because the choice of master or slave is made in the
 logical context (code). It's not transparent to application developers.
 As a result, it is not widely used across the OpenStack projects.

 So, I'd like to propose a transparent read/write separation method
 for oslo.db that every project can take advantage of
 without any code modification.


The problem with making it transparent to the developer is that, well, you
can't unless your application is tolerant of old data in an asynchronous
replication world. If you are in a fully synchronous world you could fully
separate writes and reads, but what would be the point since your database
performance is now trash anyway. Please note that although Galera is
considered a synchronous model, it's not actually all the way there. You can
break the certification of course, but there are also things that are done
to keep the performance at an acceptable level. Take for example the
wsrep_causal_reads configuration parameter[2]. Without this sucker being
turned on you can't make read/write separation transparent to the
developer. Unfortunately, turning it on causes a significant performance
degradation.

I feel like this is a problem fundamental to a consistent relational
dataset. If you are okay with eventual consistency, fine, you can make
things transparent to the developer. But by their very nature relational
datasets are, well, relational: they need all the other pieces and those
pieces need to be consistent. I guess what I am saying is that your
proposal needs more details. Please respond with specifics and examples to
move the discussion forward.



 Moreover, I'd like to put it in the mailing list in advance to
 make sure it is acceptable for oslo.db.

 I'd appreciate any comments.

 br.
 Li Ma


 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

2014-03-14 Thread Mike Wilson
   that will be in oslo)
   2) Split the work of getting rid of soft deletion into steps (that I
   already mentioned):
   a) remove soft deletion from places where we are not using it
   b) replace internal code where we are using soft deletion with that
   framework
   c) replace API stuff using ceilometer (for logs) or this framework (for
   restorable stuff)
  
  
   To put it in a nutshell: Restoring deleted resources / Delayed Deletion !=
   Soft deletion.
  
  
   Best regards,
   Boris Pavlovic
  
  
  
   On Thu, Mar 13, 2014 at 9:21 PM, Mike Wilson geekinu...@gmail.com
   wrote:
  
   For some guests we use the LVM imagebackend and there are times when
   the guest is deleted on accident. Humans, being what they are, don't
   back up their files and don't take care of important data, so it is
   not uncommon to use lvrestore and undelete an instance so that
   people can get their data. Of course, this is not always possible if
   the data has been subsequently overwritten. But it is common enough
   that I imagine most of our operators are familiar with how to do it.
   So I guess my saying that we do it on a regular basis is not quite
   accurate. Probably would be better to say that it is not uncommon to
   do this, but definitely not a daily task or something of that ilk.
  
   I have personally undeleted an instance a few times after
   accidental deletion also. I can't remember the specifics, but I do
   remember doing it :-).
  
   -Mike
  
  
   On Tue, Mar 11, 2014 at 12:46 PM, Johannes Erdfelt
   johan...@erdfelt.com wrote:
  
   On Tue, Mar 11, 2014, Mike Wilson geekinu...@gmail.com wrote:
   Undeleting things is an important use case in my opinion. We do
   this in our environment on a regular basis. In that light I'm not
   sure that it would be appropriate just to log the deletion and get
   rid of the row. I would like to see it go to an archival table
   where it is easily restored.
  
   I'm curious, what are you undeleting and why?
  
   JE
  
  


Re: [openstack-dev] [oslo.messaging] [zeromq] nova-rpc-zmq-receiver bottleneck

2014-03-14 Thread Mike Wilson
Hi Yatin,

I'm glad you are thinking about the drawbacks the zmq-receiver causes. I
want to give you a reason to keep the zmq-receiver and get your feedback.
I think of the zmq-receiver as a tiny little mini-broker that
exists separately from any other OpenStack service. As such, its
implementation can be augmented to support store-and-forward and possibly
other messaging behaviors that are desirable for ceilometer currently and
possibly other things in the future. Integrating the receiver into each
service is going to remove its independence and black box nature and give
it all the bugs and quirks of any project it gets lumped in with. I would
prefer that we continue to improve zmq-receiver to overcome the tough
parts. Either that or find a good replacement and use that. An example of a
possible replacement might be the qpid dispatch router[1], although this
guy explicitly wants to avoid any store and forward behaviors. Of course,
dispatch router is going to be tied to qpid, I just wanted to give an
example of something with similar functionality.

-Mike


On Thu, Mar 13, 2014 at 11:36 AM, yatin kumbhare yatinkumbh...@gmail.com wrote:

 Hello Folks,

 When zeromq is used as the rpc-backend, the nova-rpc-zmq-receiver service
 needs to be run on every node.

 zmq-receiver receives messages on tcp://*:9501 with socket type PULL and,
 based on the topic name (which is extracted from the received data), it
 forwards the data to the respective local services over the IPC protocol.

 Meanwhile, the OpenStack services listen/bind on an IPC socket with socket
 type PULL.

 I see zmq-receiver as a bottleneck and an overhead in the current
 design:
 1. If this service crashes, communication is lost.
 2. There is the overhead of running this extra service on every node, where
 it just forwards messages as is.


 I'm looking to remove the zmq-receiver service and enable direct
 communication (nova-* and cinder-*) across and within nodes.

 I believe this will make the zmq experience more seamless.

 The communication will change from IPC to a zmq TCP socket type for each
 service.

 For example, an rpc.cast from scheduler to compute would be direct rpc
 message passing, with no routing through zmq-receiver.

 Now, with the TCP protocol, all services will bind to a unique port (the
 port range could be 9501-9510).

 From nova.conf: rpc_zmq_matchmaker =
 nova.openstack.common.rpc.matchmaker_ring.MatchMakerRing.

 I have put arbitrary port numbers after the service names.

 file:///etc/oslo/matchmaker_ring.json

 {
     "cert:9507": [
         "controller"
     ],
     "cinder-scheduler:9508": [
         "controller"
     ],
     "cinder-volume:9509": [
         "controller"
     ],
     "compute:9501": [
         "controller", "computenodex"
     ],
     "conductor:9502": [
         "controller"
     ],
     "consoleauth:9503": [
         "controller"
     ],
     "network:9504": [
         "controller", "computenodex"
     ],
     "scheduler:9506": [
         "controller"
     ],
     "zmq_replies:9510": [
         "controller", "computenodex"
     ]
 }

 Here, the json file would keep track of the port for each service.
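 To make the proposal concrete, here is a small sketch of how a sender could
 turn such "topic:port" keys into TCP endpoints. It assumes the ring file is
 valid JSON with quoted keys and host names; load_ring and endpoints are
 hypothetical helpers, and this is not how the existing MatchMakerRing
 interprets the file.

```python
import json

# Assumed excerpt of the ring file, with JSON quoting in place.
RING_JSON = """
{
  "compute:9501": ["controller", "computenodex"],
  "conductor:9502": ["controller"],
  "scheduler:9506": ["controller"]
}
"""


def load_ring(text):
    """Map each topic to (port, hosts) from 'topic:port' keys."""
    ring = {}
    for key, hosts in json.loads(text).items():
        topic, _, port = key.rpartition(":")
        ring[topic] = (int(port), list(hosts))
    return ring


def endpoints(ring, topic):
    """Build the tcp:// addresses a sender would connect to for a topic."""
    port, hosts = ring[topic]
    return ["tcp://%s:%d" % (host, port) for host in hosts]


ring = load_ring(RING_JSON)
```

 With this layout, endpoints(ring, "compute") yields one address per host,
 so a cast to compute would connect straight to the service's own port.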

 I look forward to community feedback on this idea.


 Regards,
 Yatin




Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

2014-03-13 Thread Mike Wilson
For some guests we use the LVM imagebackend and there are times when the
guest is deleted on accident. Humans, being what they are, don't back up
their files and don't take care of important data, so it is not uncommon to
use lvrestore and undelete an instance so that people can get their data.
Of course, this is not always possible if the data has been subsequently
overwritten. But it is common enough that I imagine most of our operators
are familiar with how to do it. So I guess my saying that we do it on a
regular basis is not quite accurate. Probably would be better to say that
it is not uncommon to do this, but definitely not a daily task or something
of that ilk.

I have personally undeleted an instance a few times after accidental
deletion also. I can't remember the specifics, but I do remember doing it
:-).

-Mike


On Tue, Mar 11, 2014 at 12:46 PM, Johannes Erdfelt johan...@erdfelt.com wrote:

 On Tue, Mar 11, 2014, Mike Wilson geekinu...@gmail.com wrote:
  Undeleting things is an important use case in my opinion. We do this in
  our environment on a regular basis. In that light I'm not sure that it
  would be appropriate just to log the deletion and get rid of the row. I
  would like to see it go to an archival table where it is easily restored.

 I'm curious, what are you undeleting and why?

 JE




Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

2014-03-13 Thread Mike Wilson
The restore use case is for sure inconsistently implemented and used. I
think I agree with Boris that we treat it as separate and just move on with
cleaning up soft delete. I imagine most deployments don't like having most
of the rows in their table be useless and make db access slow? That being
said, I am a little sad my hacky restore method will need to be reworked
:-).

-Mike


On Thu, Mar 13, 2014 at 1:30 PM, Clint Byrum cl...@fewbar.com wrote:

 Excerpts from Tim Bell's message of 2014-03-12 11:02:25 -0700:
 
  
   If you want to archive images per se, on deletion just export it to a
   'backup tape' (for example) and store enough of the metadata
   on that 'tape' to re-insert it if this is really desired and then
   delete it from the database (or do the export... asynchronously). The
   same could be said with VMs, although likely not all resources, aka
   networks/.../ make sense to do this.
  
   So instead of deleted = 1, wait for cleaner, just save the resource (if
   possible) + enough metadata on some other system ('backup tape',
   alternate storage location, hdfs, ceph...) and leave it there unless
   it's really needed. Making the database more complex (and all
   associated code) to achieve this same goal seems like a hack that just
   needs to be addressed with a better way to do archiving.
  
   In a cloudy world of course people would be able to recreate
   everything they need on-demand so who needs undelete anyway ;-)
  
 
  I have no problem if there was an existing process integrated into all
  of the OpenStack components which would produce me an archive trail with
  meta data and a command to recover the object from that data.
 
  Currently, my understanding is that there is no such function and thus
  the proposal to remove the deleted column is premature.
 

 That seems like an unreasonable request of low level tools like Nova. End
 user applications and infrastructure management should be responsible
 for these things and will do a much better job of it, as you can work
 your own business needs for reliability and recovery speed into an
 application aware solution. If Nova does it, your cloud just has to
 provide everybody with the same un-delete, which is probably overkill
 for _many_ applications.



Re: [openstack-dev] [db][all] (Proposal) Restorable Delayed deletion of OS Resources

2014-03-13 Thread Mike Wilson
After a read-through, this seems pretty good.

+1


On Thu, Mar 13, 2014 at 1:42 PM, Boris Pavlovic bpavlo...@mirantis.com wrote:

 Hi stackers,

 As a result of discussion:
 [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion
 (step by step)
 http://osdir.com/ml/openstack-dev/2014-03/msg00947.html

 I understood that there should be another proposal, about how we should
 implement Restorable Delayed Deletion of OpenStack Resources in a common
 way, without these hacks with soft deletion in the DB. It is actually very
 simple, take a look at this document:


 https://docs.google.com/document/d/1WGrIgMtWJqPDyT6PkPeZhNpej2Q9Mwimula8S8lYGV4/edit?usp=sharing


 Best regards,
 Boris Pavlovic



Re: [openstack-dev] [Neutron][LBaaS] Mini-summit Interest?

2014-03-11 Thread Mike Wilson
Hangouts worked well at the nova mid-cycle meetup. Just make sure you have
your network situation sorted out beforehand. Bandwidth and firewalls are
what come to mind immediately.

-Mike


On Tue, Mar 11, 2014 at 9:34 AM, Tom Creighton
tom.creigh...@rackspace.com wrote:

 When the Designate team had their mini-summit, they had an open Google
 Hangout for remote participants.  We could even have an open conference
 bridge if you are not partial to video conferencing.  With the issue of
 inclusion solved, let's focus on a date that is good for the team!

 Cheers,

 Tom Creighton


 On Mar 10, 2014, at 4:10 PM, Edgar Magana emag...@plumgrid.com wrote:

  Eugene,
 
  I have a few arguments why I believe this is not 100% inclusive:
  * Is the foundation involved in this process? How? What is the
  budget? Who is responsible on the foundation side?
  * If somebody has already made travel arrangements, it won't be
  possible to make changes at no cost.
  * Staying extra days in a different city could impact anyone's
  budget.
  * As an OpenStack developer, I want to understand why the summit is
  not enough for deciding the next steps for each project. If that is the
  case, I would prefer to make changes to the organization of the summit
  instead of creating mini-summits all around!
  I could continue but I think these are good enough.
 
  I could agree with your point about previous summits being distracting
  for developers; this is why this time the OpenStack foundation is trying
  very hard to allocate specific days for the conference and specific days
  for the summit.
  The point on which I totally agree with you is that we SHOULD NOT have
  sessions about work that will be done no matter what! Those are just a
  waste of good time that could be invested in very interesting discussions
  about topics that are still not clear.
  I would recommend that you express this opinion to Mark. He is the right
  guy to decide which sessions will bring interesting discussions and which
  ones will be just a declaration of intents.
 
  Thanks,
 
  Edgar
 
  From: Eugene Nikanorov enikano...@mirantis.com
  Reply-To: OpenStack List openstack-dev@lists.openstack.org
  Date: Monday, March 10, 2014 10:32 AM
  To: OpenStack List openstack-dev@lists.openstack.org
  Subject: Re: [openstack-dev] [Neutron][LBaaS] Mini-summit Interest?
 
  Hi Edgar,
 
  I'm neutral on the suggestion of a mini-summit at this point.
  Why do you think it will exclude developers?
  If we keep it 1-3 days prior to the OS Summit in Atlanta (e.g. in the same
  city), that would allow anyone who joins the OS Summit to save on extra
  travelling.
  The OS Summit itself is too distracting to have really productive
  discussions, unless you're missing the sessions and spending the time
  discussing. For instance, design sessions are basically only good for
  declarations of intent, not for real discussion of a complex topic at a
  meaningful level of detail.
 
  What would be your suggestions to make this more inclusive?
  I think the time and place are the key here, hence Atlanta and a few days
  prior to the OS Summit.
 
  Thanks,
  Eugene.
 
 
 
  On Mon, Mar 10, 2014 at 10:59 PM, Edgar Magana emag...@plumgrid.com
 wrote:
  Team,
 
  I found that having a mini-summit with a very short notice means
  excluding a lot of developers of such an interesting topic for Neutron.
  The OpenStack summit is the opportunity for all developers to come
  together and discuss the next steps, there are many developers that CAN
  NOT afford another trip for a special summit. I am personally against
  that and I do support Mark's proposal of having all the conversation
  over IRC and mailing list.
 
  Please, do not start excluding people that won't be able to attend
  another face-to-face meeting besides the summit. I believe that these are
  the little things that make an open source community weak if we do not
  control it.
 
  Thanks,
 
  Edgar
 
 
  On 3/6/14 9:51 PM, Mark McClain mmccl...@yahoo-inc.com wrote:
 
  
  On Mar 6, 2014, at 4:31 PM, Jay Pipes jaypi...@gmail.com wrote:
  
   On Thu, 2014-03-06 at 21:14 +, Youcef Laribi wrote:
   +1
  
   I think if we can have it before the Juno summit, we can take
   concrete, well thought-out proposals to the community at the summit.
  
   Unless something has changed starting at the Hong Kong design summit
   (which unfortunately I was not able to attend), the design summits
 have
   always been a place to gather to *discuss* and *debate* proposed
   blueprints and design specs. It has never been about a gathering to
   rubber-stamp proposals that have already been hashed out in private
   somewhere else.
  
  You are correct that is the goal of the design summit. While I do think
  it is wise to discuss the next steps with LBaaS at this point in time, I
  am not a proponent of in-person mini design summits. Many contributors
  to LBaaS are distributed all over the globe, and scheduling a mini
  summit with short notice will exclude 

Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

2014-03-11 Thread Mike Wilson
Undeleting things is an important use case in my opinion. We do this in our
environment on a regular basis. In that light I'm not sure that it would be
appropriate just to log the deletion and get rid of the row. I would like
to see it go to an archival table where it is easily restored.

-Mike


On Mon, Mar 10, 2014 at 3:44 PM, Joshua Harlow harlo...@yahoo-inc.com wrote:

  Sounds like a good idea to me.

  I've never understood why we treat the DB as a LOG (keeping deleted == 0
 records around) when we should just use a LOG (or similar system) to begin
 with instead.

  Does anyone use the feature of switching deleted == 1 back to deleted =
 0? Has this worked out for you?

  Seems like some of the feedback on
 https://etherpad.openstack.org/p/operators-feedback-mar14 also suggests
 that this has been an operational pain point for folks ("Tool to delete
 things properly" suggestions and such...).

   From: Boris Pavlovic bpavlo...@mirantis.com
 Reply-To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Date: Monday, March 10, 2014 at 1:29 PM
 To: OpenStack Development Mailing List openstack-dev@lists.openstack.org,
 Victor Sergeyev vserge...@mirantis.com
 Subject: [openstack-dev] [all][db][performance] Proposal: Get rid of soft
 deletion (step by step)

   Hi stackers,

  (It's proposal for Juno.)

  Intro:

  Soft deletion means that records are not actually deleted from the DB;
 they are just marked as deleted. To mark a record as deleted, we put the
 record's ID value in a special "deleted" column of the table.

  Issue 1: Indexes & Queries
 We have to add AND deleted == 0 to every query to get non-deleted
 records.
 This produces a performance issue, because we have to add one extra column
 to every index.
 It also produces extra complexity in db migrations and in building queries.

  Issue 2: Unique constraints
 Why do we store the ID in deleted and not True/False?
  The reason is that we would like to be able to create real DB unique
 constraints and avoid race conditions on insert operations.

  Sample: we have a table (id, name, password, deleted) and we would like
 the name column to hold only unique values.

  Approach without UC: if count(`select  where name = name`) == 0:
 insert(...)
 (race, because another record can be added between the select and the
 insert)

  Approach with UC: try: insert(...) except Duplicate: ...

  So to add UCs we have to add them on (name, deleted), to be able to do
 insert/delete/insert with the same name.

  This also produces performance issues, because we have to use complex
 unique constraints on 2 or more columns, plus extra code complexity in db
 migrations.
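 The insert/delete/insert case can be tried out with a small sqlite3 sketch.
 The users table and the insert_user / soft_delete helpers are made up for
 this illustration; only the (name, deleted) unique constraint and the
 "deleted = id" convention come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " deleted INTEGER NOT NULL DEFAULT 0,"
    " UNIQUE (name, deleted))"
)


def insert_user(name):
    """Insert and rely on the constraint instead of a racy count() check."""
    try:
        cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None


def soft_delete(user_id):
    # Storing the row's own id keeps (name, deleted) unique per deleted row.
    conn.execute("UPDATE users SET deleted = id WHERE id = ?", (user_id,))


first = insert_user("alice")
duplicate = insert_user("alice")   # rejected by the unique constraint
soft_delete(first)
second = insert_user("alice")      # allowed: the old row has deleted = id
live = conn.execute(
    "SELECT COUNT(*) FROM users WHERE name = ? AND deleted = 0", ("alice",)
).fetchone()[0]
```

 With a True/False flag instead, a second soft delete of the same name would
 collide on (name, True), which is why the ID is stored.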

  Issue 3: Garbage collector

  It is really hard to make a garbage collector that has good
 performance and is generic enough to work in any case for any project.
 Without a garbage collector, DevOps have to clean up records by hand (at
 the risk of breaking something). If they don't clean up the DB they will
 very soon get performance issues.

  To put the most important issues in a nutshell:
 1) Extra complexity in each select query & an extra column in each index
 2) Extra column in each Unique Constraint (worse performance)
 3) 2 extra columns in each table: (deleted, deleted_at)
 4) A common garbage collector is required


  To resolve all these issues we should just remove soft deletion.

  One of the approaches that I see is removing the deleted column from
 every table step by step, probably with code refactoring. Actually we have
 3 different cases:

  1) We don't use soft deleted records:
 1.1) Do .delete() instead of .soft_delete()
 1.2) Change queries to avoid adding the extra deleted == 0 to each query
 1.3) Drop deleted and deleted_at columns

  2) We use soft deleted records for internal stuff, e.g. periodic tasks
 2.1) Refactor code somehow: e.g. store all data required by the periodic
 task in some special table that has (id, type, json_data) columns
 2.2) On delete add a record to this table
 2.3-5) similar to 1.1, 1.2, 1.3

  3) We use soft deleted records in the API
 3.1) Deprecate the API call if possible
 3.2) Make a proxy call to ceilometer from the API
 3.3) On .delete() store info about records in ceilometer (or somewhere
 else)
 3.4-6) similar to 1.1, 1.2, 1.3
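 Steps 2.1 and 2.2 could look roughly like the sqlite3 sketch below. The
 table names instances and deleted_records and the helper hard_delete are
 hypothetical; only the (id, type, json_data) column layout is taken from
 the list above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical side table with the (id, type, json_data) columns from 2.1.
conn.execute(
    "CREATE TABLE deleted_records ("
    " id INTEGER PRIMARY KEY, type TEXT, json_data TEXT)"
)


def hard_delete(instance_id):
    """Record the row in the side table, then really delete it."""
    row = conn.execute(
        "SELECT id, name FROM instances WHERE id = ?", (instance_id,)
    ).fetchone()
    if row is None:
        return False
    payload = json.dumps({"id": row[0], "name": row[1]})
    with conn:  # commit both statements together, roll back on error
        conn.execute(
            "INSERT INTO deleted_records (type, json_data) VALUES (?, ?)",
            ("instance", payload),
        )
        conn.execute("DELETE FROM instances WHERE id = ?", (instance_id,))
    return True


conn.execute("INSERT INTO instances (id, name) VALUES (1, 'vm-1')")
deleted = hard_delete(1)
remaining = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
saved = json.loads(
    conn.execute("SELECT json_data FROM deleted_records").fetchone()[0]
)
```

 The main table then needs no deleted column or extra index column, and the
 periodic task (or a restore tool) reads the side table instead.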

 This is not a finished roadmap, just some initial thoughts to start a
 constructive discussion on the mailing list, so %stacker%, your opinion is
 very important!


  Best regards,
 Boris Pavlovic




Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing

2014-03-04 Thread Mike Wilson
On Mon, Mar 3, 2014 at 3:10 PM, Sergey Skripnick sskripn...@mirantis.com wrote:




  I can run multiple compute services on the same host without containers.
 Containers give you nice isolation and another way to try a more
 realistic scenario, but my initial goal now is to be able to simulate a
 many-fake-compute-node scenario with as few resources as possible.


 I believe it is impossible to use threads without changes in the code.


Having gone the threads route once myself, I can say from experience that
it requires changes to the code. I was able to get threads up and running
with a few modifications, but there were other issues that I never fully
resolved that make me lean more towards the container model that has been
discussed earlier in the thread. Btw, I would suggest having a look at
Rally, the OpenStack Benchmarking Service. They have deployment frameworks
that use LXC that you might be able to write a thread model for.

-Mike




 --
 Regards,
 Sergey Skripnick



[openstack-dev] [Neutron] Running multiple neutron-servers

2013-12-11 Thread Mike Wilson
Hi Neutron team,

I haven't been involved in neutron meetings for quite some time so I'm not
sure where we are on this at this point. It is often recommended in
OpenStack guides and other operational materials to run multiple
neutron-servers to deal with the API load from Nova. Things like the
_heal_instance_info_caches periodic task as well as just normal create
requests are pretty heavy. Those issues aside, I think we can all agree that
it would be good for the neutron-server to be horizontally scalable. I don't
have a handle on all the issues surrounding this. However, I did report
a bug a few months ago about concurrency and updates to the
IpAvailabilityRanges[1]. There was a fix proposed by Zhang Hua [2] that
seems like it needs more discussion.

From what I gather, Salvatore has concerns about patching over a design
flaw. At the same time, we have had this issue since the initial release
of neutron (quantum) and it is still a really big deal for
deployers. I would like to propose that we pick up the conversation where
it left off on the proposed fix and _also_ consider any possible redesign
going forward.

Could I get some feedback from Salvatore specifically and other members of
the team on this? I would also be happy to pitch in towards whatever
solution is decided on provided we can rescue the poor deployers :-).

-Mike Wilson


[1] https://bugs.launchpad.net/neutron/+bug/1214115
[2] https://review.openstack.org/43275


Re: [openstack-dev] [Oslo] First steps towards amqp 1.0

2013-12-09 Thread Mike Wilson
This is the first time I've heard of the dispatch router, I'm really
excited now that I've looked at it a bit. Thx Gordon and Russell for
bringing this up. I'm very familiar with the scaling issues associated with
any kind of brokered messaging solution. We grew an Openstack installation
to about 7,000 nodes and started having significant scaling issues with the
qpid broker. We've talked about our problems at a couple summits in a fair
amount of detail[1][2]. I won't bother repeating the information in this
thread.

I really like the idea of separating the logic of routing away from the
message emitter. Russell mentioned the 0mq matchmaker; we essentially
ditched the qpid broker for direct communication via 0mq and its
matchmaker. It still has a lot of problems which dispatch seems to address.
For example, in ceilometer we have store-and-forward behavior as a
requirement. This kind of communication requires a broker but 0mq doesn't
really officially support one, which means we would probably end up with
some broker as part of OpenStack. Matchmaker is also a fairly basic
implementation of what is essentially a directory. For any sort of serious
production use case you end up sprinkling JSON files all over the place or
maintaining a Redis backend. I feel like the matchmaker needs a bunch more
work to make modifying the directory simpler for operations. I would rather
put that work into a separate project like dispatch than have to maintain
essentially a one off in Openstack's codebase.

I wonder how this fits into messaging from a driver perspective in
Openstack or even how this fits into oslo.messaging? Right now we have
topics for binaries(compute, network, consoleauth, etc),
hostname.service_topic for nodes, fanout queue per node (not sure if kombu
also has this) and different exchanges per project. If we can abstract the
routing from the emission of the message all we really care about is
emitter, endpoint, messaging pattern (fanout, store and forward, etc). Also
not sure if there's a dispatch analogue in the rabbit world, if not we need
to have some mapping of concepts etc between impls.
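
To illustrate what "all we really care about is emitter, endpoint,
messaging pattern" could look like, here is a rough sketch. None of these
names (Pattern, Message, Router) exist in oslo.messaging or dispatch; they
are made up for illustration, and the routing rules are just one plausible
reading of the three patterns.

```python
from dataclasses import dataclass
from enum import Enum


class Pattern(Enum):
    DIRECT = "direct"
    FANOUT = "fanout"
    STORE_AND_FORWARD = "store-and-forward"


@dataclass(frozen=True)
class Message:
    emitter: str      # who sent it, e.g. "api-1"
    endpoint: str     # logical destination, e.g. "compute"
    pattern: Pattern  # how it should be delivered
    payload: dict


class Router:
    """Hypothetical router: the emitter never sees hosts or exchanges."""

    def __init__(self, directory):
        self.directory = directory  # endpoint name -> list of hosts
        self.stored = []            # messages held for later delivery

    def route(self, msg):
        hosts = self.directory.get(msg.endpoint, [])
        if msg.pattern is Pattern.FANOUT:
            return list(hosts)      # every host behind the endpoint
        if msg.pattern is Pattern.STORE_AND_FORWARD and not hosts:
            self.stored.append(msg)  # nobody listening yet, keep it
            return []
        return hosts[:1]            # direct: a single consumer


router = Router({"compute": ["node-1", "node-2"]})
fanout = router.route(Message("api-1", "compute", Pattern.FANOUT, {}))
direct = router.route(Message("api-1", "compute", Pattern.DIRECT, {}))
held = router.route(
    Message("agent-1", "ceilometer", Pattern.STORE_AND_FORWARD, {})
)
```

The point of the sketch is only that topics, per-host queues and exchanges
become the router's problem, so a driver would map this triple onto
whatever the underlying implementation offers.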

So many questions, but in general I'm really excited about this and eager
to contribute. For sure I will start playing with this in Bluehost's
environments that haven't been completely 0mqized. I also have some
lingering concerns about qpid in general. Beyond scaling issues I've run
into some other terrible bugs that motivated our move away from it. Again,
these are mentioned in our presentations at summits and I'd be happy to
talk more about them in a separate discussion. I've also been able to talk
to some other qpid+openstack users who have seen the same bugs. Another
large installation that comes to mind is Qihoo 360 in China. They run a few
thousand nodes with qpid for messaging and are familiar with the snags we
run into.

Gordon,

I would really appreciate if you could watch those two talks and comment.
The bugs are probably separate from the dispatch router discussion, but it
does dampen my enthusiasm a bit not knowing how to fix issues beyond scale
:-(.

-Mike Wilson

[1]
http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment
[2]
http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/going-brokerless-the-transition-from-qpid-to-0mq




On Mon, Dec 9, 2013 at 4:29 PM, Mark McLoughlin mar...@redhat.com wrote:

 On Mon, 2013-12-09 at 16:05 +0100, Flavio Percoco wrote:
  Greetings,
 
  As $subject mentions, I'd like to start discussing the support for
  AMQP 1.0[0] in oslo.messaging. We already have rabbit and qpid drivers
  for earlier (and different!) versions of AMQP, the proposal would be
  to add an additional driver for a _protocol_ not a particular broker.
  (Both RabbitMQ and Qpid support AMQP 1.0 now).
 
  By targeting a clear mapping on to a protocol, rather than a specific
  implementation, we would simplify the task in the future for anyone
  wishing to move to any other system that spoke AMQP 1.0. That would no
  longer require a new driver, merely different configuration and
  deployment. That would then allow openstack to more easily take
  advantage of any emerging innovations in this space.

 Sounds sane to me.

 To put it another way, assuming all AMQP 1.0 client libraries are equal,
 all the operator cares about is that we have a driver that connects into
 whatever AMQP 1.0 messaging topology they want to use.

 Of course, not all client libraries will be equal, so if we don't offer
 the choice of library/driver to the operator, then the onus is on us to
 pick the best client library for this driver.

 (Enjoying the rest of this thread too, thanks to Gordon for his
 insights)

 Mark.



Re: [openstack-dev] Neutron Distributed Virtual Router

2013-12-09 Thread Mike Wilson
I guess the question that immediately comes to mind is, is there anyone
that doesn't want a distributed router? I guess there could be someone out
there that hates the idea of traffic flowing in a balanced fashion, but
can't they just run a single router then? Does there really need to be some
flag to disable/enable this behavior? Maybe I am oversimplifying things...
you tell me.

-Mike Wilson


On Mon, Dec 9, 2013 at 3:01 PM, Vasudevan, Swaminathan (PNB Roseville) 
swaminathan.vasude...@hp.com wrote:

  Hi Folks,

 We are in the process of defining the API for the Neutron Distributed
 Virtual Router, and we have a question.



 Just wanted to get the feedback from the community before we implement and
 post for review.



 We are planning to use the “distributed” flag for the routers that are
 supposed to be routing traffic locally (both East West and North South).

 This “distributed” flag is already there in the “neutronclient” API, but
 currently only utilized by the “Nicira Plugin”.

 We would like to go ahead and use the same “distributed” flag and add an
 extension to the router table to accommodate the “distributed flag”.



 Please let us know your feedback.



 Thanks.



 Swaminathan Vasudevan

 Systems Software Engineer (TC)





 HP Networking

 Hewlett-Packard

 8000 Foothills Blvd

 M/S 5541

 Roseville, CA - 95747

 tel: 916.785.0937

 fax: 916.785.1815

 email: swaminathan.vasude...@hp.com







Re: [openstack-dev] Reg : Security groups implementation using openflows in quantum ovs plugin

2013-11-25 Thread Mike Wilson
Adding Jun to this thread since gmail is failing him.


On Tue, Nov 19, 2013 at 10:44 AM, Amir Sadoughi amir.sadou...@rackspace.com
 wrote:

  Yes, my work has been on ML2 with neutron-openvswitch-agent.  I’m
 interested to see what Jun Park has. I might have something ready before he
 is available again, but would like to collaborate regardless.

  Amir



  On Nov 19, 2013, at 3:31 AM, Kanthi P pavuluri.kan...@gmail.com wrote:

  Hi All,

  Thanks for the response!
 Amir, Mike: Is your implementation being done according to the ML2 plugin?

  Regards,
 Kanthi


 On Tue, Nov 19, 2013 at 1:43 AM, Mike Wilson geekinu...@gmail.com wrote:

 Hi Kanthi,

  Just to reiterate what Kyle said, we do have an internal implementation
 using flows that looks very similar to security groups. Jun Park was the
 guy that wrote this and is looking to get it upstreamed. I think he'll be
 back in the office late next week. I'll point him to this thread when he's
 back.

  -Mike


 On Mon, Nov 18, 2013 at 3:39 PM, Kyle Mestery (kmestery) 
 kmest...@cisco.com wrote:

 On Nov 18, 2013, at 4:26 PM, Kanthi P pavuluri.kan...@gmail.com wrote:
   Hi All,
 
  We are planning to implement quantum security groups using openflows
 for ovs plugin instead of iptables which is the case now.
 
  Doing so we can avoid the extra linux bridge which is connected
 between the vnet device and the ovs bridge, which is given as a work around
 since ovs bridge is not compatible with iptables.
 
  We are planning to create a blueprint and work on it. Could you please
 share your views on this
 
  Hi Kanthi:

 Overall, this idea is interesting and removing those extra bridges would
 certainly be nice. Some people at Bluehost gave a talk at the Summit [1] in
 which they explained they have done something similar, you may want to
 reach out to them since they have code for this internally already.

 The OVS plugin is in feature freeze during Icehouse, and will be
 deprecated in favor of ML2 [2] at the end of Icehouse. I would advise you
 to retarget your work at ML2 when running with the OVS agent instead. The
 Neutron team will not accept new features into the OVS plugin anymore.

 Thanks,
 Kyle

 [1]
 http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/towards-truly-open-and-commoditized-software-defined-networks-in-openstack
 [2] https://wiki.openstack.org/wiki/Neutron/ML2

  Thanks,
  Kanthi


Re: [openstack-dev] [Nova] Icehouse mid-cycle meetup

2013-11-25 Thread Mike Wilson
Hotel information has been posted. Look forward to seeing you all in
February :-).

-Mike


On Mon, Nov 25, 2013 at 8:14 AM, Russell Bryant rbry...@redhat.com wrote:

 Greetings,

 Other groups have started doing mid-cycle meetups with success.  I've
 received significant interest in having one for Nova.  I'm now excited
 to announce some details.

 We will be holding a mid-cycle meetup for the compute program from
 February 10-12, 2014, in Orem, UT.  Huge thanks to Bluehost for hosting us!

 Details are being posted to the event wiki page [1].  If you plan to
 attend, please register.  Hotel recommendations with booking links will
 be posted soon.

 Please let me know if you have any questions.

 Thanks,

 [1] https://wiki.openstack.org/wiki/Nova/IcehouseCycleMeetup
 --
 Russell Bryant



Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-20 Thread Mike Wilson
I agree heartily with the availability and resiliency aspect.  For me, that
is the biggest reason to consider a NOSQL backend. The other potential
performance benefits are attractive to me also.

-Mike


On Wed, Nov 20, 2013 at 9:06 AM, Soren Hansen so...@linux2go.dk wrote:

 2013/11/18 Mike Spreitzer mspre...@us.ibm.com:
  There were some concerns expressed at the summit about scheduler
  scalability in Nova, and a little recollection of Boris' proposal to
  keep the needed state in memory.


  I also heard one guy say that he thinks Nova does not really need a
  general SQL database, that a NOSQL database with a bit of
  denormalization and/or client-maintained secondary indices could
  suffice.

 I may have said something along those lines. Just to clarify -- since
 you started this post by talking about scheduler scalability -- the main
 motivation for using a non-SQL backend isn't scheduler scalability, it's
 availability and resilience. I just don't accept the failure modes that
 MySQL (and derivatives such as Galera) impose.

  Has that sort of thing been considered before?

 It's been talked about on and off since... well, probably since we
 started this project.

  What is the community's level of interest in exploring that?

 The session on adding a backend using a non-SQL datastore was pretty
 well attended.


 --
 Soren Hansen | http://linux2go.dk/
 Ubuntu Developer | http://www.ubuntu.com/
 OpenStack Developer  | http://www.openstack.org/



Re: [openstack-dev] Reg : Security groups implementation using openflows in quantum ovs plugin

2013-11-19 Thread Mike Wilson
The current implementation is fairly generic, the plan is to get it into
the ML2 plugin.

-Mike


On Tue, Nov 19, 2013 at 2:31 AM, Kanthi P pavuluri.kan...@gmail.com wrote:

 Hi All,

 Thanks for the response!
 Amir, Mike: Is your implementation being done according to the ML2 plugin?

 Regards,
 Kanthi


 On Tue, Nov 19, 2013 at 1:43 AM, Mike Wilson geekinu...@gmail.com wrote:

 Hi Kanthi,

 Just to reiterate what Kyle said, we do have an internal implementation
 using flows that looks very similar to security groups. Jun Park was the
 guy that wrote this and is looking to get it upstreamed. I think he'll be
 back in the office late next week. I'll point him to this thread when he's
 back.

 -Mike


 On Mon, Nov 18, 2013 at 3:39 PM, Kyle Mestery (kmestery) 
 kmest...@cisco.com wrote:

 On Nov 18, 2013, at 4:26 PM, Kanthi P pavuluri.kan...@gmail.com wrote:
  Hi All,
 
  We are planning to implement quantum security groups using openflows
 for ovs plugin instead of iptables which is the case now.
 
  Doing so we can avoid the extra linux bridge which is connected
 between the vnet device and the ovs bridge, which is given as a work around
 since ovs bridge is not compatible with iptables.
 
  We are planning to create a blueprint and work on it. Could you please
 share your views on this
 
 Hi Kanthi:

 Overall, this idea is interesting and removing those extra bridges would
 certainly be nice. Some people at Bluehost gave a talk at the Summit [1] in
 which they explained they have done something similar, you may want to
 reach out to them since they have code for this internally already.

 The OVS plugin is in feature freeze during Icehouse, and will be
 deprecated in favor of ML2 [2] at the end of Icehouse. I would advise you
 to retarget your work at ML2 when running with the OVS agent instead. The
 Neutron team will not accept new features into the OVS plugin anymore.

 Thanks,
 Kyle

 [1]
 http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/towards-truly-open-and-commoditized-software-defined-networks-in-openstack
 [2] https://wiki.openstack.org/wiki/Neutron/ML2

  Thanks,
  Kanthi


Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-19 Thread Mike Wilson
I've been thinking about this use case for a DHT-like design. I think I
want to do what other people have alluded to here and try to intercept
problematic requests like this one in some sort of stage that runs before
sending to a ring segment. In this case the pre-stage could decide to send
this off to a scheduler that has a more complete view of the world.
Alternatively, don't make a single request for 50 instances, just send 50
requests for one? Is that a viable thing to do for this use case?
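
Here is a rough sketch of both options, assuming a simple consistent-hash
ring over scheduler names. Everything in it (Ring, pre_stage, split, the
"global" scheduler name, the capacity threshold) is hypothetical and not
taken from Nova.

```python
import hashlib
from bisect import bisect_right


class Ring:
    """Tiny hash ring mapping request keys to scheduler names."""

    def __init__(self, schedulers):
        self._points = sorted((self._hash(s), s) for s in schedulers)
        self._keys = [point for point, _ in self._points]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)

    def owner(self, key):
        # First ring point clockwise from the key, wrapping around.
        idx = bisect_right(self._keys, self._hash(key)) % len(self._points)
        return self._points[idx][1]


def pre_stage(request_id, count, ring, segment_capacity,
              global_scheduler="global"):
    """Option 1: divert requests too big for any one ring segment."""
    if count > segment_capacity:
        return [(global_scheduler, count)]
    return [(ring.owner(request_id), count)]


def split(request_id, count, ring):
    """Option 2: turn one request for N instances into N requests for one."""
    return [(ring.owner(f"{request_id}/{i}"), 1) for i in range(count)]


schedulers = ["sched-a", "sched-b", "sched-c"]
ring = Ring(schedulers)
big = pre_stage("req-1", 50, ring, segment_capacity=40)
small = pre_stage("req-2", 5, ring, segment_capacity=40)
pieces = split("req-1", 50, ring)
```

Note that option 2 spreads the load but loses any constraint that ties the
50 instances together, which is part of why I'm asking whether it is
viable for this use case.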

-Mike


On Tue, Nov 19, 2013 at 7:03 PM, Joshua Harlow harlo...@yahoo-inc.com wrote:

 At Yahoo at least, 50+ simultaneous instances will be the common case
 (maybe we are special).

 Think of what happens on www.yahoo.com during, say, the Olympics:
 news.yahoo.com could need 50+ very, very quickly (especially if, say, a
 gold medal is won by some famous person). So I wouldn't discount those
 being the common case (may not be common for some, but is common for
 others). In fact any website with sporadic/spiky traffic will have the
 same desire, so it might be a target use case for website-like companies
 (or ones that can't predict spikes up front).

 Overall though I think what you said about 'don't fill it up' is good
 general knowledge. Filling up stuff beyond a certain threshold is
 dangerous just in general (one should only push the limits so far before
 madness).

 On 11/19/13 4:08 PM, Clint Byrum cl...@fewbar.com wrote:

 Excerpts from Chris Friesen's message of 2013-11-19 12:18:16 -0800:
  On 11/19/2013 01:51 PM, Clint Byrum wrote:
   Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:
   On 11/19/2013 12:35 PM, Clint Byrum wrote:
  
   Each scheduler process can own a different set of resources. If they
   each grab instance requests in a round-robin fashion, then they will
   fill their resources up in a relatively well balanced way until one
   scheduler's resources are exhausted. At that time it should bow out of
   taking new instances. If it can't fit a request in, it should kick the
   request out for retry on another scheduler.
  
   In this way, they only need to be in sync in that they need a way to
   agree on who owns which resources. A distributed hash table that gets
   refreshed whenever schedulers come and go would be fine for that.
  
   That has some potential, but at high occupancy you could end up
   refusing to schedule something because no one scheduler has sufficient
   resources even if the cluster as a whole does.
  
  
   I'm not sure what you mean here. What resource spans multiple compute
   hosts?
 
  Imagine the cluster is running close to full occupancy, each scheduler
  has room for 40 more instances.  Now I come along and issue a single
  request to boot 50 instances.  The cluster has room for that, but none
  of the schedulers do.
 
 
 You're assuming that all 50 come in at once. That is only one use case
 and not at all the most common.
 
   This gets worse once you start factoring in things like heat and
   instance groups that will want to schedule whole sets of resources
   (instances, IP addresses, network links, cinder volumes, etc.) at once
   with constraints on where they can be placed relative to each other.
 
   Actually that is rather simple. Such requests have to be serialized
   into a work-flow. So if you say give me 2 instances in 2 different
   locations then you allocate 1 instance, and then another one with
   'not_in_location(1)' as a condition.
 
  Actually, you don't want to serialize it, you want to hand the whole set
  of resource requests and constraints to the scheduler all at once.
 
  If you do them one at a time, then early decisions made with
  less-than-complete knowledge can result in later scheduling requests
  failing due to being unable to meet constraints, even if there are
  actually sufficient resources in the cluster.
 
  The VM ensembles document at
 
  https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit?pli=1
  has a good example of how one-at-a-time scheduling can cause spurious
  failures.
 
  And if you're handing the whole set of requests to a scheduler all at
  once, then you want the scheduler to have access to as many resources as
  possible so that it has the highest likelihood of being able to satisfy
  the request given the constraints.
 
 This use case is real and valid, which is why I think there is room for
 multiple approaches. For instance the situation you describe can also be
 dealt with by just having the cloud stay under-utilized and accepting
 that when you get over a certain percentage utilized spurious failures
 will happen. We have a similar solution in the ext3 filesystem on Linux.
 Don't fill it up, or suffer a huge performance penalty.
 

Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-18 Thread Mike Wilson
I'm not sure the problem is that we use a general SQL database. The
problems as I see them are:

-Multi-master in MySQL sucks. Complicated, problematic and not performant.
Also, no great way to do multi-master over higher latency networks.
-MySQL and Postgres require tuning to scale.
-We tend to write queries badly when using SQLA, i.e. lots of code-level
joins and filtering.
-SQLA mapping is pretty slow. See Boris and Alexei's patch to
compute_node_get_all for an example of how this can be worked around[1].
Also comstud's work on the mysql backend[2].
-Thread serialization problem in eventlet, also somewhat addressed by the
mysql backend
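
As a small illustration of the "code-level joins and filtering" point, the
sketch below answers the same question twice: once by pulling both tables
into Python and joining there, once by letting the database do it. It uses
the stdlib sqlite3 module instead of SQLA to stay self-contained, and the
schema is a simplified, hypothetical stand-in for Nova's services and
compute_nodes tables.

```python
import sqlite3


def setup():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE services (
            id INTEGER PRIMARY KEY, host TEXT, disabled INTEGER);
        CREATE TABLE compute_nodes (
            id INTEGER PRIMARY KEY, service_id INTEGER, free_ram_mb INTEGER);
        INSERT INTO services VALUES
            (1, 'host-a', 0), (2, 'host-b', 1), (3, 'host-c', 0);
        INSERT INTO compute_nodes VALUES
            (1, 1, 4096), (2, 2, 8192), (3, 3, 512);
        """
    )
    return conn


def hosts_code_level(conn, min_ram):
    """Anti-pattern: fetch everything, then join and filter in Python."""
    services = {
        sid: (host, disabled)
        for sid, host, disabled in conn.execute(
            "SELECT id, host, disabled FROM services"
        )
    }
    result = []
    for _id, service_id, free in conn.execute(
        "SELECT id, service_id, free_ram_mb FROM compute_nodes"
    ):
        host, disabled = services[service_id]
        if not disabled and free >= min_ram:
            result.append(host)
    return sorted(result)


def hosts_in_sql(conn, min_ram):
    """Same answer, but the join and filter run inside the database."""
    rows = conn.execute(
        "SELECT s.host FROM compute_nodes c "
        "JOIN services s ON s.id = c.service_id "
        "WHERE s.disabled = 0 AND c.free_ram_mb >= ? "
        "ORDER BY s.host",
        (min_ram,),
    )
    return [row[0] for row in rows]


conn = setup()
```

Both functions return the same hosts, but the first one transfers and
materializes every row of both tables, which is where the cost shows up
with a few thousand compute nodes.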

Some of these problems are addressed very well by some NOSQL DBs,
specifically the multi-master problems just go away for the most part.
However our general SQL databases provide some nice things like
transactions that would require some more work on our end to do properly.

All that being said, I am very interested in what NOSQL DBs can do for us.

-Mike Wilson

[1] https://review.openstack.org/#/c/43151/
[2] https://blueprints.launchpad.net/nova/+spec/db-mysqldb-impl


On Mon, Nov 18, 2013 at 12:35 PM, Mike Spreitzer mspre...@us.ibm.com wrote:

 There were some concerns expressed at the summit about scheduler
 scalability in Nova, and a little recollection of Boris' proposal to keep
 the needed state in memory.  I also heard one guy say that he thinks Nova
 does not really need a general SQL database, that a NOSQL database with a
 bit of denormalization and/or client-maintained secondary indices could
 suffice.  Has that sort of thing been considered before?  What is the
 community's level of interest in exploring that?

 Thanks,
 Mike


Re: [openstack-dev] Reg : Security groups implementation using openflows in quantum ovs plugin

2013-11-18 Thread Mike Wilson
Hi Kanthi,

Just to reiterate what Kyle said, we do have an internal implementation
using flows that looks very similar to security groups. Jun Park was the
guy that wrote this and is looking to get it upstreamed. I think he'll be
back in the office late next week. I'll point him to this thread when he's
back.

-Mike


On Mon, Nov 18, 2013 at 3:39 PM, Kyle Mestery (kmestery) kmest...@cisco.com
 wrote:

 On Nov 18, 2013, at 4:26 PM, Kanthi P pavuluri.kan...@gmail.com wrote:
  Hi All,
 
  We are planning to implement quantum security groups using openflows for
 ovs plugin instead of iptables which is the case now.
 
  Doing so we can avoid the extra linux bridge which is connected between
 the vnet device and the ovs bridge, which is given as a work around since
 ovs bridge is not compatible with iptables.
 
  We are planning to create a blueprint and work on it. Could you please
 share your views on this
 
 Hi Kanthi:

 Overall, this idea is interesting and removing those extra bridges would
 certainly be nice. Some people at Bluehost gave a talk at the Summit [1] in
 which they explained they have done something similar, you may want to
 reach out to them since they have code for this internally already.

 The OVS plugin is in feature freeze during Icehouse, and will be
 deprecated in favor of ML2 [2] at the end of Icehouse. I would advise you
 to retarget your work at ML2 when running with the OVS agent instead. The
 Neutron team will not accept new features into the OVS plugin anymore.

 Thanks,
 Kyle

 [1]
 http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/towards-truly-open-and-commoditized-software-defined-networks-in-openstack
 [2] https://wiki.openstack.org/wiki/Neutron/ML2

  Thanks,
  Kanthi


Re: [openstack-dev] [Neutron] Shared network between specific tenants, but not all tenants?

2013-10-29 Thread Mike Wilson
+1

I also have tenants asking for this :-). I'm interested to see a blueprint.

-Mike


On Tue, Oct 29, 2013 at 1:24 PM, Jay Pipes jaypi...@gmail.com wrote:

 On 10/29/2013 02:25 PM, Justin Hammond wrote:

 We have been considering this and have some notes on our concept, but we
 haven't made a blueprint for it. I will speak amongst my group and find
 out what they think of making it more public.


 OK, cool, glad to know I'm not the only one with tenants asking for this :)

 Looking forward to a possible blueprint on this.

 Best,
 -jay


  On 10/29/13 12:26 PM, Jay Pipes jaypi...@gmail.com wrote:

  Hi Neutron devs,

 Are there any plans to support networks that are shared/routed only
 between certain tenants (not all tenants)?

 Thanks,
 -jay



Re: [openstack-dev] Does DB schema hygiene warrant long migrations?

2013-10-24 Thread Mike Wilson
So, I observe a consensus here that long migrations suck, +1 to that.
I also observe a consensus that we need to get no-downtime schema changes
working. It seems super important. Also +1 to that.

Getting back to the original review, it got -2'd because Michael would like
to make sure that the benefit outweighs the cost of the downtime. I
completely agree with that, so far we've heard arguments from both Jay and
Boris as to why this is faster/slower but I think some sort of evidence
other than hearsay is needed. Can we get some sort of benchmark result that
clearly illustrates the performance consequences of the migration in the
long run?

-Mike



On Thu, Oct 24, 2013 at 4:53 PM, Boris Pavlovic bo...@pavlovic.me wrote:

 Michael,


  - pruning isn't done by the system automatically, so we have to assume
 it never happens


 We are working around it
 https://blueprints.launchpad.net/nova/+spec/db-purge-engine



   - we need to have a clearer consensus about what we think the maximum
  size of a nova deployment is. Are we really saying we don't support
  nova installs with a million instances? If so what is the maximum
  number of instances we're targeting? Having a top level size in mind
  isn't a bad thing, but I don't think we have one at the moment that we
  all agree on. Until that happens I'm going to continue targeting the
  largest databases people have told me about (plus a fudge factor).


 Rally https://wiki.openstack.org/wiki/Rally should help us to determine
 this.

 At the moment I can only rely on theoretical knowledge
 (and it says that even 1 million instances won't work in the current nova
 implementation)



 Best regards,
 Boris Pavlovic



 On Fri, Oct 25, 2013 at 2:35 AM, Michael Still mi...@stillhq.com wrote:

 On Fri, Oct 25, 2013 at 9:07 AM, Boris Pavlovic bo...@pavlovic.me
 wrote:
  Johannes,
 
  +1, purging should help here a lot.

 Sure, but my point is more:

  - pruning isn't done by the system automatically, so we have to
 assume it never happens

  - we need to have a clearer consensus about what we think the maximum
 size of a nova deployment is. Are we really saying we don't support
 nova installs with a million instances? If so what is the maximum
 number of instances we're targeting? Having a top level size in mind
 isn't a bad thing, but I don't think we have one at the moment that we
 all agree on. Until that happens I'm going to continue targeting the
 largest databases people have told me about (plus a fudge factor).

 Michael

 --
 Rackspace Australia

 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





Re: [openstack-dev] Scheduler meeting and Icehouse Summit

2013-10-16 Thread Mike Wilson
I need to understand better what holistic scheduling means, but I agree
with you that this is not exactly what Boris has raised as an issue. I
don't have a rock-solid design for what I want to do, but the objective I
want to achieve is that spinning up more schedulers improves your response
time and ability to schedule, perhaps at the cost of the accuracy of the
answer (just good enough) and the need to retry your request against
several scheduler threads. I will try to look for more resources to
understand holistic scheduling; a quick Google search takes me to a bunch
of EE and manufacturing engineering type papers. I'll do more research on
this.

However, this does fit under performance for sure, it is not unrelated at
all. If there is a chance to incorporate this into a performance session I
think this is where it belongs.

-Mike Wilson


On Mon, Oct 14, 2013 at 9:53 PM, Mike Spreitzer mspre...@us.ibm.com wrote:

 Yes, Rethinking Scheduler Design
 http://summit.openstack.org/cfp/details/34 is not the same as the
 performance issue that Boris raised.  I think the former would be a natural
 consequence of moving to an optimization-based joint decision-making
 framework, because such a thing necessarily takes a good enough attitude.
  The issue Boris raised is more efficient tracking of the true state of
 resources, and I am interested in that issue too.  A holistic scheduler
 needs such tracking, in addition to the needs of the individual services.
  Having multiple consumers makes the issue more interesting :-)

 Regards,
 Mike


Re: [openstack-dev] [nova] BUG? nova-compute should delete unused instance files on boot

2013-10-08 Thread Mike Wilson
+1 to what Chris suggested. Zombie state that doesn't affect quota, but
doesn't create more problems by trying to reuse resources that aren't
available. That way we can tell the customer that things are deleted, but
we don't need to break our cloud by screwing up future schedule requests.
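
A minimal sketch of the zombie idea, as a toy model only. The class and
method names (Cloud, user_delete, compute_confirms_kill) are made up for
illustration and are not nova code; quota and host capacity are reduced to
simple vcpu counters.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    ZOMBIE = "zombie"    # deleted for the user, resources still held
    DELETED = "deleted"  # compute confirmed the VM is gone


@dataclass
class Instance:
    uuid: str
    vcpus: int
    state: State = State.ACTIVE


class Cloud:
    """Toy bookkeeping for one user's quota and one host's capacity."""

    def __init__(self, host_vcpus):
        self.free_vcpus = host_vcpus   # what the scheduler may hand out
        self.quota_used = 0            # what the user is charged for

    def boot(self, inst):
        if inst.vcpus > self.free_vcpus:
            raise RuntimeError("no capacity")
        self.free_vcpus -= inst.vcpus
        self.quota_used += inst.vcpus

    def user_delete(self, inst):
        # Stage 1: release quota right away, keep host resources reserved.
        if inst.state is State.ACTIVE:
            inst.state = State.ZOMBIE
            self.quota_used -= inst.vcpus

    def compute_confirms_kill(self, inst):
        # Stage 2: only now may the scheduler reuse the resources.
        if inst.state is State.ZOMBIE:
            inst.state = State.DELETED
            self.free_vcpus += inst.vcpus
```

Between the two stages the user sees the VM as gone and gets the quota
back, while the scheduler still treats the host's vcpus as taken.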

-Mike


On Tue, Oct 8, 2013 at 11:58 AM, Joshua Harlow harlo...@yahoo-inc.com wrote:

 Sure, basically a way around this is to do migration of the VM's on the
 host u are doing maintenance on.

 That's one way y! has its ops folks work around this.

 Another solution is just don't do local_deletes :-P

 It sounds like your 'zombie' state would be useful as a way to solve this
 also.

 To me, though, any solution that creates 2 sets of the same resources in
 your cloud isn't a good way (which afaik the current local_delete
 aims for) as it causes maintenance and operator pain (and needless
 problems that a person has to go in and figure out and resolve). I'd rather
 have the delete fail, leave the quota of the user alone, and tell the user
 the hypervisor the VM is on is currently under maintenance (ideally
 the `host-update` resolves this, as long as it's supported on all
 hypervisor types). At least that gives a sane operational experience and
 doesn't cause support bugs that are hard to resolve.

 But maybe this type of action should be more configurable. Allow or
 disallow local deletes.

 On 10/7/13 11:50 PM, Chris Friesen chris.frie...@windriver.com wrote:

 On 10/07/2013 05:30 PM, Joshua Harlow wrote:
  A scenario that I've seen:
 
  Take 'nova-compute' down for software upgrade, API still accessible
 since
  you want to provide API uptime (aka not taking the whole cluster
 offline).
 
  User Y deletes VM on that hypervisor where nova-compute is currently
 down,
  DB locally deletes, at this point VM 'A' is still active but nova thinks
  it's not.
 
 Isn't this sort of thing exactly what nova host-update --maintenance
 enable hostname was intended for?  I.e., push all the VMs off that
 compute node so you can take down the services without causing problems.
 
 It's kind of a pain that the host-update stuff is implemented at the
 hypervisor level though (and isn't available for libvirt), it seems like
 it could be implemented at a more generic level.  (And on that note, why
 isn't there a host table in the database since we can have multiple
 services running on one host and we might want to take them all down?)
 
 Alternately, maybe we need to have a 2-stage delete, where the VM gets
 put into a zombie state in the database and the resources can't be
 reused until the compute service confirms that the VM has been killed.
 
 Chris
 
 
 


Re: [openstack-dev] Experiences of using Neutron in large scale

2013-10-02 Thread Mike Wilson
Kumar,

How large of a deployment are you considering it for? We've run Neutron in
a fairly large environment (10k+ nodes) for a year now and have learned
some interesting lessons. We use a modified Open vSwitch plugin and as such
have no experience with the Nicira plugin. I think the largest single
problem that we have as it pertains to scalability is the race conditions
in neutron-server. Allocating IPs, networks, ports, etc. tends to have some
racy behavior. I feel like many of these issues are being addressed by
neutron developers, but Neutron is still very viable for large-scale
production today. For instance, most of the race conditions that I mention
can be averted if you aren't writing to the database concurrently. You
could designate ONE neutron-server as the write server and the rest as
read servers. It's a little tricky to do because you have to have a router
in front of them all or reroute requests, but the API set is not very
large, so it's a very doable task. That being said, in our environment we
use a single neutron-server with another standing by as backup. It's not as
performant as we'd like it to be, but it hasn't stopped us from growing so
far.
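
The single-writer routing described above could look roughly like this. It
is only a sketch of the decision a front-end router would make: the backend
hostnames are invented, and splitting on HTTP method is an assumption about
how one might separate reads from writes, not something we run.

```python
import itertools

# Hypothetical backend addresses; substitute your own neutron-server hosts.
WRITE_SERVER = "http://neutron-write:9696"
READ_SERVERS = ["http://neutron-read-1:9696", "http://neutron-read-2:9696"]

_read_cycle = itertools.cycle(READ_SERVERS)

# HTTP methods that never modify state.
READ_METHODS = {"GET", "HEAD", "OPTIONS"}


def pick_backend(method):
    """Return the backend URL that should serve a request with this method."""
    if method.upper() in READ_METHODS:
        return next(_read_cycle)   # spread reads round-robin
    return WRITE_SERVER            # funnel every write through one server
```

Because every POST, PUT and DELETE lands on the same neutron-server, no two
servers allocate IPs or ports concurrently, while reads can still fan out.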

-Mike Wilson

P.S. Jun Park and I gave a presentation at the Portland summit in which we
talk about some of the issues around scale, although Neutron (Quantum at
the time) is a smaller part of the talk:
http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment


On Wed, Oct 2, 2013 at 11:04 AM, Kumar chvs...@gmail.com wrote:

 Hi,
   We are considering running OpenStack Neutron in a large-scale deployment.
 I would like to know the community's experience and suggestions.

 To get to know the quality I am going through Neutron bugs (I assume that
 is the best way to know the quality).
 Some of them are really concerning, like the bugs below:
 https://bugs.launchpad.net/neutron/+bug/1211915
 https://bugs.launchpad.net/neutron/+bug/1230407
 https://bugs.launchpad.net/neutron/+bug/121

 Bug 1211915 was raised for simple tempest tests; what about huge
 deployments?
 I am told even vendor Neutron plugins have similar issues when we
 create tens of instances in a single click on Horizon. And people see too
 many connection timeouts in Quantum service logs with vendor plugins as
 well.

 I was told that some were stuck with nova-network as there is no support
 yet to migrate to Neutron and they could not take advantage of new network
 services.

 I would like to know the community's thinking on the same. Please note that
 I am not concerned about fix availability.

 Thanks,
 -Kumar




[openstack-dev] Request for feedback on bp-db-slave-handle

2013-07-26 Thread Mike Wilson
Connection and session code in oslo-incubator:
https://review.openstack.org/#/c/29464/
Change to Context: https://review.openstack.org/#/c/30363/
Decorator for sqlalchemy api: https://review.openstack.org/#/c/30370/

So back at the Portland summit Jun Park and I presented about some of
our difficulties scaling OpenStack with the Folsom release:
http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment

One of the main obstacles we ran into was the amount of chattiness to
MySQL. As we were deploying literally hundreds of nodes per day we weren't
able to dig in and weed out unnecessary traffic or delve into any type of
optimization approach. Instead we utilized a well known database scaling
paradigm: shoving off reads to replication slaves and only sending reads
which are sensitive to replication latency to the write master. I feel like
replication, be it in MySQL or Postgres, is a fairly well understood
concept and has lots of tools and documentation around it. The only hard
part IMO about scaling this way is that you need to audit your queries to
understand which could be split out, but you also need to understand the
intricacies of your application to understand when it is inappropriate to
send a heavy query to a read slave. In other words, some queries hurt a
lot, but we can't _always_ just send them to read slaves.

So rather than talk about it, here's some example code. Please look at the
reviews above when you see me doing unfamiliar things with context,
slave_connection, etc.

https://review.openstack.org/#/c/38872

In my example my DBA is upset because he's getting this query from every
node that we have every periodic_interval. However, it wouldn't be good for
me to simply send every call to
nova.db.sqlalchemy.api.instance_get_all_by_host to a read slave. Some parts
of the codebase are absolutely not tolerant of data that is possibly a few
hundred milliseconds out of sync with the master. So we need a way to
indicate you hit the slave this time, but not other times. That's where the
lag_tolerant context comes in. Since context is passed all the way through
the stack to the DB layer we can indicate that we are tolerant of laggy
data and that's not going to be changed even if the call goes over RPC.
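
To make the idea concrete without reading the reviews, here is a
stripped-down sketch. It is not the code under review: RequestContext,
choose_connection and the connection strings are simplified stand-ins, and
the function only reports which database a read would go to.

```python
from dataclasses import dataclass

# Placeholder connection strings, not real hosts.
MASTER_CONNECTION = "mysql://nova@db-master/nova"
SLAVE_CONNECTION = "mysql://nova@db-slave/nova"


@dataclass
class RequestContext:
    user_id: str
    lag_tolerant: bool = False   # default: always read from the master


def choose_connection(context, slave_connection=SLAVE_CONNECTION):
    """Pick the database a read should go to for this context."""
    if context.lag_tolerant and slave_connection:
        return slave_connection
    return MASTER_CONNECTION


def instance_get_all_by_host(context, host):
    # A real implementation would open a session on the chosen connection
    # and run the query; here we only report where it would run.
    return choose_connection(context), host
```

The periodic task builds a context with lag_tolerant=True and lands on the
slave; every other caller of the same DB API function keeps hitting the
master, and so does everyone if no slave connection is configured.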

I'd appreciate any feedback on this approach. I have really only discussed
it with Devananda van der Veen briefly, but he was extremely helpful.
Hopefully this will get some more eyes on it, so yeah, fire away!


-Mike Wilson


[openstack-dev] Need feedback for approach on bp-db-slave-handle

2013-07-26 Thread Mike Wilson
So back at the Portland summit Jun Park and I presented about some of
our difficulties scaling OpenStack with the Folsom release:
http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment

One of the main obstacles we ran into was the amount of chattiness to
MySQL. As we were deploying literally hundreds of nodes per day we weren't
able to dig in and weed out unnecessary traffic or delve into any type of
optimization approach. Instead we utilized a well known database scaling
paradigm: shoving off reads to replication slaves and only sending reads
which are sensitive to replication latency to the write master. I feel like
replication, be it in MySQL or Postgres, is a fairly well understood
concept and has lots of tools and documentation around it. The only hard
part IMO about scaling this way is that you need to audit your queries to
understand which could be split out, but you also need to understand the
intricacies of your application to understand when it is inappropriate to
send a heavy query to a read slave. In other words, some queries hurt a
lot, but we can't _always_ just send them to read slaves.

So rather than talk about it, here's some example code. Please look at the
reviews below when you see me doing unfamiliar things with context,
slave_connection, etc.

Example slaveified _sync_power_states:
https://review.openstack.org/#/c/38872
Connection and session code in oslo-incubator:
https://review.openstack.org/#/c/29464/
Change to Context: https://review.openstack.org/#/c/30363/
Decorator for sqlalchemy api: https://review.openstack.org/#/c/30370/

In my example my DBA is upset because he's getting this query from every
node that we have every periodic_interval. However, it wouldn't be good for
me to simply send every call to
nova.db.sqlalchemy.api.instance_get_all_by_host to a read slave. Some parts
of the codebase are absolutely not tolerant of data that is possibly a few
hundred milliseconds out of sync with the master. So we need a way to
indicate you hit the slave this time, but not other times. That's where the
lag_tolerant context comes in. Since context is passed all the way through
the stack to the DB layer we can indicate that we are tolerant of laggy
data and that's not going to be changed even if the call goes over RPC.
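
The "survives RPC" part can be sketched as a serialize/rebuild round trip.
This is an illustration under assumptions, not the Context change itself:
the to_dict/from_dict pair and the JSON hop below are simplified stand-ins
for what happens when a context crosses the message bus.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RequestContext:
    user_id: str
    lag_tolerant: bool = False

    def to_dict(self):
        return asdict(self)

    @classmethod
    def from_dict(cls, values):
        return cls(**values)


def send_over_rpc(context):
    """Stand-in for an RPC hop: serialize on one side, rebuild on the other."""
    wire = json.dumps(context.to_dict())
    return RequestContext.from_dict(json.loads(wire))
```

As long as the flag is part of the serialized context, the receiving
service sees the same lag tolerance the caller asked for.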

I'd appreciate any feedback on this approach. I have really only discussed
it with Devananda van der Veen and Russell Bryant briefly, but they have
been extremely helpful. Hopefully this will get some more eyes on it, so
yeah, fire away!

-Mike


Re: [openstack-dev] Python overhead for rootwrap

2013-07-25 Thread Mike Wilson
In my opinion:

1. Stop using rootwrap completely and get strong argument checking support
into sudo (regex).
2. Some sort of long-lived rootwrap process, either forked by the service
that wants to shell out or a general purpose rootwrapd type thing.

I prefer #1 because it's surprising that sudo doesn't do this type of thing
already. It _must_ be something that everyone wants. But #2 may be quicker
and easier to implement, my $.02.
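
For what "strong argument checking (regex)" could mean in practice, here is
a small sketch. It is neither sudo nor rootwrap code; the FILTERS policy and
the is_allowed helper are hypothetical, and a real filter would also have to
pin the executable path.

```python
import re

# Hypothetical policy: executable name -> one regex per allowed argument.
FILTERS = {
    "ip": [r"(a|addr|link)", r"(show|list)?"],
    "kill": [r"-(9|HUP|TERM)", r"[0-9]+"],
}


def is_allowed(argv, filters=FILTERS):
    """Return True only if every argument fully matches its pattern."""
    if not argv or argv[0] not in filters:
        return False
    patterns = filters[argv[0]]
    args = list(argv[1:])
    if len(args) > len(patterns):
        return False
    # Missing trailing arguments are checked as empty strings.
    args += [""] * (len(patterns) - len(args))
    return all(re.fullmatch(p, a) for p, a in zip(patterns, args))
```

Using fullmatch rather than search matters here: an argument such as
"1234; rm -rf /" must be rejected even though it starts with digits.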

-Mike Wilson


On Thu, Jul 25, 2013 at 2:21 PM, Joe Gordon joe.gord...@gmail.com wrote:

 Hi All,

 We have recently hit some performance issues with nova-network.  It turns
 out the root cause of this was we do roughly 20 rootwrapped shell commands,
 many inside of global locks. (https://bugs.launchpad.net/oslo/+bug/1199433
 )

 It turns out that starting python itself has a fairly significant overhead
 when compared to the run time of many of the binary commands we execute.

 For example:

  $ time python -c "print 'test'"
 test

 real 0m0.023s
 user 0m0.016s
 sys 0m0.004s


 $ time ip a
 ...

 real 0m0.003s
 user 0m0.000s
 sys 0m0.000s


 While we have removed the extra overhead of using entry points, we are now
 hitting the overhead of just shelling out to python.


 While there are many possible ways to reduce this issue, such as reducing
 the number of rootwrapped calls and making locks finer grained, I think it's
 worth exploring alternatives to the current rootwrap model.

 Any ideas?  I am sending this email out to get the discussion started.


 best,
 Joe Gordon



Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-23 Thread Mike Wilson
Just some added info for that talk, we are using qpid as our messaging
backend. I have no data for RabbitMQ, but our schedulers are _always_
behind on processing updates. It may be different with rabbit.

-Mike


On Tue, Jul 23, 2013 at 1:56 PM, Joe Gordon joe.gord...@gmail.com wrote:


 On Jul 23, 2013 3:44 PM, Ian Wells ijw.ubu...@cack.org.uk wrote:
 
   * periodic updates can overwhelm things.  Solution: remove unneeded
 updates,
   most scheduling data only changes when an instance does some state
 change.
 
  It's not clear that periodic updates do overwhelm things, though.
  Boris ran the tests.  Apparently 10k nodes updating once a minute
  extend the read query by ~10% (the main problem being the read query
  is abysmal in the first place).  I don't know how much of the rest of
  the infrastructure was involved in his test, though (RabbitMQ,
  Conductor).

 A great openstack at scale talk, that covers the scheduler
 http://www.bluehost.com/blog/bluehost/bluehost-presents-operational-case-study-at-openstack-summit-2111

 
  There are reasonably solid reasons why we would want an alternative to
  the DB backend, but I'm not sure the update rate is one of them.   If
  we were going for an alternative the obvious candidate to my mind
  would be something like ZooKeeper (particularly since in some setups
  it's already a channel between the compute hosts and the control
  server).
  --
  Ian.
 


Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-23 Thread Mike Wilson
Again I can only speak for qpid, but it's not really a big load on the
qpidd server itself. I think the issue is that the updates come in serially
to each scheduler that you have running. We don't process those quickly
enough for it to do any good, which is why the lookup goes to the DB. You
can see this for yourself using the fake hypervisor: launch yourself a bunch
of simulated nova-compute services, launch a nova-scheduler on the same
host, and even with 1k or so you will notice the latency between the update
being sent and the update actually meaning anything for the scheduler.

I think a few points that have been brought up could mitigate this quite a
bit. My personal view is the following:

- Only update when you have to (i.e. 10k nodes all sending an update every
  periodic interval is heavy; only send when something changed)
- Don't fan out to schedulers; update a single scheduler which in turn
  updates a shared store that is fast, such as memcache

I guess that effectively is what you are proposing with the added twist of
the shared store.
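
A rough sketch of those two points together, with assumptions spelled out:
a plain dict stands in for memcache, the single updating scheduler is
collapsed into the reporter for brevity, and ComputeReporter and pick_host
are invented names rather than anything in nova.

```python
class ComputeReporter:
    """Sends host state to a shared store only when it has changed."""

    def __init__(self, host, store):
        self.host = host
        self.store = store        # dict stands in for memcache here
        self._last_sent = None
        self.updates_sent = 0

    def periodic_tick(self, state):
        # Called every periodic interval with the current host state.
        if state == self._last_sent:
            return False          # nothing changed, stay quiet
        self.store[self.host] = dict(state)
        self._last_sent = dict(state)
        self.updates_sent += 1
        return True


def pick_host(store, ram_mb):
    """Scheduler side: read the shared store instead of consuming a fanout."""
    fits = [h for h, s in store.items() if s["free_ram_mb"] >= ram_mb]
    return min(fits) if fits else None
```

With this shape an idle node costs nothing per interval, and adding more
schedulers adds readers of the store rather than more consumers of every
update.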

-Mike


On Tue, Jul 23, 2013 at 2:25 PM, Boris Pavlovic bo...@pavlovic.me wrote:

 Joe,
 Sure we will.

 Mike,
 Thanks for sharing information about scalability problems, presentation
 was great.
 Also, could you say whether you think 150 req/sec is a big load for
 qpid or rabbit? I think it is just nothing..


 Best regards,
 Boris Pavlovic
 ---
 Mirantis Inc.



 On Wed, Jul 24, 2013 at 12:17 AM, Joe Gordon joe.gord...@gmail.com wrote:




 On Tue, Jul 23, 2013 at 1:09 PM, Boris Pavlovic bo...@pavlovic.me wrote:

 Ian,

 There are serious scalability and performance problems with DB usage in
 the current scheduler.
 Rapid updates + joins make the current solution absolutely not scalable.

 The Bluehost example shows, for me personally, just a trivial thing. (It
 just won't work)

 Tomorrow we will add another graphic:
 Avg user req / sec in the current approach and in ours.


 Will you be releasing your code to generate the results? Without that the
 graphic isn't very useful


 I hope it will help you to better understand situation.


 Joshua,

 Our current discussion is about whether we could remove information about
 compute nodes from Nova safely.
 Both our approach and yours will remove data from the nova DB.

 Also your approach has much more:
 1) network load
 2) latency
 3) one more service (memcached)

 So I am not sure that it is better than just sending information directly
 to the scheduler.


 Best regards,
 Boris Pavlovic
 ---
 Mirantis Inc.






 On Tue, Jul 23, 2013 at 11:56 PM, Joe Gordon joe.gord...@gmail.com wrote:


 On Jul 23, 2013 3:44 PM, Ian Wells ijw.ubu...@cack.org.uk wrote:
 
   * periodic updates can overwhelm things.  Solution: remove unneeded
 updates,
   most scheduling data only changes when an instance does some state
 change.
 
  It's not clear that periodic updates do overwhelm things, though.
  Boris ran the tests.  Apparently 10k nodes updating once a minute
  extend the read query by ~10% (the main problem being the read query
  is abysmal in the first place).  I don't know how much of the rest of
  the infrastructure was involved in his test, though (RabbitMQ,
  Conductor).

 A great openstack at scale talk, that covers the scheduler
 http://www.bluehost.com/blog/bluehost/bluehost-presents-operational-case-study-at-openstack-summit-2111

 
  There are reasonably solid reasons why we would want an alternative to
  the DB backend, but I'm not sure the update rate is one of them.   If
  we were going for an alternative the obvious candidate to my mind
  would be something like ZooKeeper (particularly since in some setups
  it's already a channel between the compute hosts and the control
  server).
  --
  Ian.
 