Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-10 Thread Sean Dague
On 11/10/2015 05:12 AM, Thierry Carrez wrote:
> Kevin Carter wrote:
>>> I believe Clint already linked to
>>> https://aphyr.com/posts/309-knossos-redis-and-linearizability or
>>> similar - but 'known for general ease of use and reliability' is, uhm,
>>> a bold claim. It's worth comparing that (and the other redis writeups)
>>> to this one: https://aphyr.com/posts/291-call-me-maybe-zookeeper. "Use
>>> zookeeper, it's mature".
>>
>> Those write-ups are from 2013, and given the general improvements in Redis
>> over the last two years I find it hard to believe they're still fully
>> relevant; however, it's worth testing to confirm whether Redis is a viable
>> option.
>>
>>> The openjdk is present on the same linux distributions, and has been
>>> used in both open source and proprietary programs for decades. *what*
>>> license implications are you speaking of?
>>
>> The license issues would be related to deployers using Oracle Java, which
>> may or may not be needed by certain deployers for scale and performance
>> requirements. While I do not have specific performance numbers at my
>> fingertips to illustrate general performance issues using ZooKeeper at
>> scale with OpenJDK, I have in the past compared OpenJDK to Oracle Java and
>> found that Oracle Java was quite a bit more stable and offered considerably
>> more performance headroom. I did find [
>> http://blog.cloud-benchmarks.org/2015/07/17/cassandra-write-performance-on-gce-and-aws.html
>> ] which claims a 32% performance improvement with Cassandra using Oracle
>> Java 8 over OpenJDK, on top of the fact that it was less prone to crashes,
>> but that may not be entirely relevant to this case. Also, there's no
>> denying that Oracle has a questionable history of dealing with open-source
>> projects regarding Java, and performance/stability concerns may require
>> the use of Oracle Java, which will undoubtedly come with questionable
>> license requirements.
> 
> I can't be suspected of JVM sympathies, and I find that a bit unfair.
> I'll try to summarize:
> 
> 1- ZooKeeper is a very good DLM
> 
> 2- ZooKeeper is totally supported under OpenJDK and is run under this
> configuration by large users (see Josh's other post for data)
> 
> 3- /Some/ large Java stacks run faster / better under Oracle's non-free
> JVM, but (1) there is no evidence that ZooKeeper is one of them and (2)
> this is less and less true with modern JDKs (which are all built on top
> of OpenJDK)
> 
> 4- Still, some shops will prefer to stay out of Java stacks for various
> reasons and need an alternative
> 
> I don't think anything in this discussion invalidates the compromise
> (using tooz) we came up with during the session. ZooKeeper can totally be
> the tooz default in devstack (something has to be). If other tooz drivers
> reach the same level of maturity one day, they could run in specific
> tests and/or become the new default?

And another good datapoint: Nova has had optional ZooKeeper support for
service groups (landed in Folsom/Grizzly), which provides instantaneous
reporting of compute workers coming and going. That's been used to build a
lot of HA orchestration bits outside of Nova by a bunch of folks in the
NFV space.

So it's also not just theory that ZooKeeper is keeping up here; many
OpenStack deployments are already using it quite heavily.
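For anyone curious what enabling that looked like, switching Nova's service-group backend to ZooKeeper was a configuration change along these lines (a sketch based on the servicegroup driver options of that era; the address is a placeholder and exact option names may differ by release):

```ini
# nova.conf -- illustrative fragment, not a verified production config
[DEFAULT]
# Use the ZooKeeper service group driver instead of the default DB driver
servicegroup_driver = zk

[zookeeper]
# Comma-separated host:port pairs of the ZooKeeper ensemble (placeholder)
address = 127.0.0.1:2181
```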

-Sean

-- 
Sean Dague
http://dague.net

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-10 Thread Joshua Harlow

Sean Dague wrote:

On 11/10/2015 05:12 AM, Thierry Carrez wrote:

Kevin Carter wrote:

I believe Clint already linked to
https://aphyr.com/posts/309-knossos-redis-and-linearizability or
similar - but 'known for general ease of use and reliability' is, uhm,
a bold claim. It's worth comparing that (and the other redis writeups)
to this one: https://aphyr.com/posts/291-call-me-maybe-zookeeper. "Use
zookeeper, it's mature".

Those write-ups are from 2013, and given the general improvements in Redis
over the last two years I find it hard to believe they're still fully
relevant; however, it's worth testing to confirm whether Redis is a viable
option.


The openjdk is present on the same linux distributions, and has been
used in both open source and proprietary programs for decades. *what*
license implications are you speaking of?

The license issues would be related to deployers using Oracle Java, which
may or may not be needed by certain deployers for scale and performance
requirements. While I do not have specific performance numbers at my
fingertips to illustrate general performance issues using ZooKeeper at
scale with OpenJDK, I have in the past compared OpenJDK to Oracle Java and
found that Oracle Java was quite a bit more stable and offered considerably
more performance headroom. I did find [
http://blog.cloud-benchmarks.org/2015/07/17/cassandra-write-performance-on-gce-and-aws.html
] which claims a 32% performance improvement with Cassandra using Oracle
Java 8 over OpenJDK, on top of the fact that it was less prone to crashes,
but that may not be entirely relevant to this case. Also, there's no
denying that Oracle has a questionable history of dealing with open-source
projects regarding Java, and performance/stability concerns may require the
use of Oracle Java, which will undoubtedly come with questionable license
requirements.

I can't be suspected of JVM sympathies, and I find that a bit unfair.
I'll try to summarize:

1- ZooKeeper is a very good DLM

2- ZooKeeper is totally supported under OpenJDK and is run under this
configuration by large users (see Josh's other post for data)

3- /Some/ large Java stacks run faster / better under Oracle's non-free
JVM, but (1) there is no evidence that ZooKeeper is one of them and (2)
this is less and less true with modern JDKs (which are all built on top
of OpenJDK)

4- Still, some shops will prefer to stay out of Java stacks for various
reasons and need an alternative

I don't think anything in this discussion invalidates the compromise
(using tooz) we came up with during the session. ZooKeeper can totally be
the tooz default in devstack (something has to be). If other tooz drivers
reach the same level of maturity one day, they could run in specific
tests and/or become the new default?


And another good datapoint: Nova has had optional ZooKeeper support for
service groups (landed in Folsom/Grizzly), which provides instantaneous
reporting of compute workers coming and going. That's been used to build a
lot of HA orchestration bits outside of Nova by a bunch of folks in the
NFV space.


Are they still using the Nova ZooKeeper support (for service groups)? From
the analysis done by a coworker, I'm not sure they can even use it anymore.


The following outlines the reasons...

https://review.openstack.org/#/c/190322/ (the uberspec)

https://review.openstack.org/#/c/138607 (tooz for service groups); that
one uncovered the following:


'For example, the Zookeeper driver uses evzookeeper which is no longer 
actively maintained and doesn't work with eventlet >= 0.17.1.'


The following has more details of this investigation:

http://lists.openstack.org/pipermail/openstack-dev/2015-May/063602.html



So it's also not just theory that ZooKeeper is keeping up here; many
OpenStack deployments are already using it quite heavily.


Agreed that people do use it; they just might not be using it for
service groups (unless they use an older release of Nova) due to the
above issues (which is why those specs were created: to resolve that and
to fix the service group layer as a whole).




-Sean





Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-10 Thread Kevin Carter
>I believe Clint already linked to
>https://aphyr.com/posts/309-knossos-redis-and-linearizability or
>similar - but 'known for general ease of use and reliability' is, uhm,
>a bold claim. It's worth comparing that (and the other redis writeups)
>to this one: https://aphyr.com/posts/291-call-me-maybe-zookeeper. "Use
>zookeeper, it's mature".

Those write-ups are from 2013, and given the general improvements in Redis
over the last two years I find it hard to believe they're still fully
relevant; however, it's worth testing to confirm whether Redis is a viable
option.

>The openjdk is present on the same linux distributions, and has been
>used in both open source and proprietary programs for decades. *what*
>license implications are you speaking of?

The license issues would be related to deployers using Oracle Java, which
may or may not be needed by certain deployers for scale and performance
requirements. While I do not have specific performance numbers at my
fingertips to illustrate general performance issues using ZooKeeper at
scale with OpenJDK, I have in the past compared OpenJDK to Oracle Java and
found that Oracle Java was quite a bit more stable and offered considerably
more performance headroom. I did find [
http://blog.cloud-benchmarks.org/2015/07/17/cassandra-write-performance-on-gce-and-aws.html
] which claims a 32% performance improvement with Cassandra using Oracle
Java 8 over OpenJDK, on top of the fact that it was less prone to crashes,
but that may not be entirely relevant to this case. Also, there's no
denying that Oracle has a questionable history of dealing with open-source
projects regarding Java, and performance/stability concerns may require the
use of Oracle Java, which will undoubtedly come with questionable license
requirements.

>I believe you can do this with zookeeper - both single process, or
>three processes on one machine to emulate a cluster - very easily.
>Quoting http://qnalist.com/questions/29943/java-heap-size-for-zookeeper
>- "It's more dependent on your workload than anything. If you're
>storing on order of hundreds of small znodes then 1gb is going to [be]
>more than fine." Obviously we should test this and confirm it, but
>developer efficiency is a key part of any decision, and AFAIK there is
>absolutely nothing in the way as far as zookeeper goes.

This would be worthwhile testing, to ensure that a developer really can use
a typical work machine without major performance impacts or changes to
their workflow.
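For what it's worth, a standalone ZooKeeper for development needs very little configuration; a minimal zoo.cfg like the one below runs a single-node instance on a laptop, and the JVM heap can be capped via conf/java.env (paths and heap size here are illustrative, not recommendations):

```ini
# conf/zoo.cfg -- minimal standalone ZooKeeper for local development
tickTime=2000
dataDir=/tmp/zookeeper
clientPort=2181

# conf/java.env -- cap the JVM heap on a dev box (sourced by zkEnv.sh):
# export JVMFLAGS="-Xmx512m"
```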

>Just like rabbitmq and openvswitch, it's a mature thing, written in a
>language other than Python, which needs its own care and feeding (and that
>feeding is something like 90% zk specific, not 'java headaches').

Agreed that different solutions need care to maintain their respective
language environments; however, I'm not sure Open vSwitch/RabbitMQ is the
best comparison on the point of maturity. As I'm sure you're well aware,
both of these pieces of software have been "mature" for some time, yet
until quite recently even stable OvS has been known to wreak havoc in
large-scale production environments, and RabbitMQ clustering still leaves
something to be desired. The point is that just because something is
"mature" doesn't mean it's the most stable or the right solution.

> The default should be suitable for use in the majority of clouds

I don't know that we can claim ZooKeeper fits the "majority" of clouds, as
I believe there will be scale issues while using OpenJDK, forcing the use
of Oracle Java and raising the potential for inadvertent license issues.
That said, I have no real means to say Redis will be any better; however,
prior operational experience tells me that managing ZooKeeper at real
scale is a PITA.

I too wouldn't mind seeing a solution using Consul put forth as the
default. It's a really interesting approach: it provides multiple
interfaces (HTTP/DNS/etc.), should scale to multiple DCs without much if
any hacking, is written in Go (1.5.1+ in master and 1.4.1+ in stable),
which brings some impressive performance capabilities, is under active
development, is licensed under the Mozilla Public License, version 2.0,
should be fairly minimal in terms of resource requirements on development
machines, and hits many of the other shininess factors developers /
deployers may be interested in.


All said, I'm very interested in a DLM solution, and if there's anything I
can do to help make it happen, please let me know.

--

Kevin Carter
IRC: cloudnull



From: Robert Collins <robe...@robertcollins.net>
Sent: Tuesday, November 10, 2015 12:55 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager 
discussion @ the summit

On 10 November 2015 at 19:24, Kevin Ca

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-10 Thread Kevin Carter
Clint,

> While I'm sure it works a lot of the time, when it breaks it will break
> in very mysterious, and possibly undetectable, ways.

This is fair: it may break in unpredictable ways due to the master-slave
replication, the asynchronous nature of the cluster implementation, and how
Redis promotes a slave when a master goes down (all of which could result
in catastrophic failures due to the noted race conditions). While a test
case would be interesting, I acknowledge that it may be impossible to
reproduce such a situation in a controlled environment.

>For some things, this would be no big deal. But in some it may result in
>total disaster, like two conductors trying to both own a single Ironic
>node and one accidentally erasing what the other just wrote there.

So that may be a mark against Redis being the preferred back end for DLM;
however, a quick look into the issue tracker for ZooKeeper reveals a
similar set of race conditions that are currently open and could result in
the same kinds of situations [0]. While not ideal, it may really be a case
of weighing the technology choices (like you've said) and picking the best
fit for now.

[0] - http://bit.ly/1NGQrAd  # The search string for the ZooKeeper Jira was
too long, so I shortened it.

--

Kevin Carter
IRC: cloudnull



From: Clint Byrum <cl...@fewbar.com>
Sent: Tuesday, November 10, 2015 2:21 AM
To: openstack-dev
Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager  
discussion @ the summit

Excerpts from Kevin Carter's message of 2015-11-09 22:24:16 -0800:
> Hello all,
>
> The rationale behind using a solution like ZooKeeper makes sense; however,
> in reviewing the thread I found myself asking if there was a better way to
> address the problem without the addition of a Java-based solution as the
> default. While it has been covered that the current implementation would
> be a reference and that "other" driver support in tooz would allow for any
> backend a deployer may want, the work being proposed within devstack [0]
> would become the default development case, thus making it the de facto
> standard, and I think we could do better in terms of supporting developers
> and delivering capability.
>
> My thoughts on using Redis+Redislock instead of Java+ZooKeeper as the
> default option:
> * Tooz already supports redislock.
> * Redis has an established cluster system known for general ease of use
> and reliability on distributed systems.
> * Several OpenStack projects already support Redis as a backend option or
> have extended capabilities using Redis.
> * Redis can be installed on RHEL, SUSE, and DEB-based systems with ease.
> * Redis is open-source software licensed under the three-clause BSD
> license and would not have any of the questionable license implications
> found when dealing with anything Java.
> * The inclusion of Redis would work on a single node, allowing developers
> to continue working in VMs running on laptops with 4GB of RAM, but would
> also scale to support the multi-controller use case with ease. This would
> also give developers the ability to work on systems that actually resemble
> production.
> * Redislock brings with it no additional developer-facing language
> dependencies (Redis is written in ANSI C and works ... without external
> dependencies [1]) while also providing a plethora of language bindings [2].
>
>
> I apologize for questioning the proposed solution so late into the
> development of this thread, and for not making the summit conversations to
> talk more with everyone who worked on the proposal. While the ship may
> have sailed on this point for now, I figured I'd ask why we might go down
> the path of ZooKeeper+Java when a solution with likely little to no
> development effort already exists, can support just about any
> production/development environment, has lots of bindings, and (IMHO) would
> integrate with the larger community more easily; many OpenStack developers
> and deployers already know Redis. With the inclusion of ZK+Java in
> DevStack, the act of making it the default essentially creates new hard
> dependencies, one of which is Java, and I'd like to avoid that if at all
> possible; basically I think we can do better.
>

Kevin, thanks so much for your thoughts on this. I really do appreciate
that we've had a high diversity of opinions and facts brought to bear on
this subject.

The Aphyr/Jepsen tests that were linked before [1] show, IMO, that Redis
satisfies availability and partition tolerance in the 

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-10 Thread Thierry Carrez
Kevin Carter wrote:
>> I believe Clint already linked to
>> https://aphyr.com/posts/309-knossos-redis-and-linearizability or
>> similar - but 'known for general ease of use and reliability' is, uhm,
>> a bold claim. It's worth comparing that (and the other redis writeups)
>> to this one: https://aphyr.com/posts/291-call-me-maybe-zookeeper. "Use
>> zookeeper, it's mature".
> 
> Those write-ups are from 2013, and given the general improvements in Redis
> over the last two years I find it hard to believe they're still fully
> relevant; however, it's worth testing to confirm whether Redis is a viable
> option.
> 
>> The openjdk is present on the same linux distributions, and has been
>> used in both open source and proprietary programs for decades. *what*
>> license implications are you speaking of?
> 
> The license issues would be related to deployers using Oracle Java, which
> may or may not be needed by certain deployers for scale and performance
> requirements. While I do not have specific performance numbers at my
> fingertips to illustrate general performance issues using ZooKeeper at
> scale with OpenJDK, I have in the past compared OpenJDK to Oracle Java and
> found that Oracle Java was quite a bit more stable and offered
> considerably more performance headroom. I did find [
> http://blog.cloud-benchmarks.org/2015/07/17/cassandra-write-performance-on-gce-and-aws.html
> ] which claims a 32% performance improvement with Cassandra using Oracle
> Java 8 over OpenJDK, on top of the fact that it was less prone to crashes,
> but that may not be entirely relevant to this case. Also, there's no
> denying that Oracle has a questionable history of dealing with open-source
> projects regarding Java, and performance/stability concerns may require
> the use of Oracle Java, which will undoubtedly come with questionable
> license requirements.

I can't be suspected of JVM sympathies, and I find that a bit unfair.
I'll try to summarize:

1- ZooKeeper is a very good DLM

2- ZooKeeper is totally supported under OpenJDK and is run under this
configuration by large users (see Josh's other post for data)

3- /Some/ large Java stacks run faster / better under Oracle's non-free
JVM, but (1) there is no evidence that ZooKeeper is one of them and (2)
this is less and less true with modern JDKs (which are all built on top
of OpenJDK)

4- Still, some shops will prefer to stay out of Java stacks for various
reasons and need an alternative

I don't think anything in this discussion invalidates the compromise
(using tooz) we came up with during the session. ZooKeeper can totally be
the tooz default in devstack (something has to be). If other tooz drivers
reach the same level of maturity one day, they could run in specific
tests and/or become the new default?
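For readers unfamiliar with tooz: the point of the compromise is that the lock API is backend-agnostic, with the driver chosen by a backend URL (the real entry point is tooz.coordination.get_coordinator()). Below is a toy sketch of that dispatch pattern, not tooz's actual code; the backend names and lock name are illustrative:

```python
# Toy illustration of a tooz-style backend-agnostic lock API.
# Not tooz's real implementation -- just the URL-dispatch pattern it uses.
from contextlib import contextmanager
from urllib.parse import urlparse


class InMemoryDriver:
    """Stand-in backend; a real driver would talk to ZooKeeper or Redis."""
    def __init__(self):
        self._locks = set()

    @contextmanager
    def get_lock(self, name):
        if name in self._locks:
            raise RuntimeError("lock %r already held" % name)
        self._locks.add(name)
        try:
            yield
        finally:
            self._locks.discard(name)


# Map URL schemes to driver classes, much as tooz maps them via entry points.
DRIVERS = {"zookeeper": InMemoryDriver, "redis": InMemoryDriver}


def get_coordinator(backend_url):
    """Pick a driver from the backend URL's scheme."""
    scheme = urlparse(backend_url).scheme
    return DRIVERS[scheme]()


coord = get_coordinator("zookeeper://localhost:2181")
with coord.get_lock("compute-node-42"):
    print("critical section under the lock")
```

With the real library the shape is the same: pick a coordinator by URL (zookeeper://, redis://, etc.), start it, and take locks through one API, so the devstack default could change later without touching callers.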

-- 
Thierry Carrez (ttx)



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-10 Thread Clint Byrum
Excerpts from Kevin Carter's message of 2015-11-09 22:24:16 -0800:
> Hello all, 
> 
> The rationale behind using a solution like ZooKeeper makes sense; however,
> in reviewing the thread I found myself asking if there was a better way to
> address the problem without the addition of a Java-based solution as the
> default. While it has been covered that the current implementation would
> be a reference and that "other" driver support in tooz would allow for any
> backend a deployer may want, the work being proposed within devstack [0]
> would become the default development case, thus making it the de facto
> standard, and I think we could do better in terms of supporting developers
> and delivering capability.
>
> My thoughts on using Redis+Redislock instead of Java+ZooKeeper as the
> default option:
> * Tooz already supports redislock.
> * Redis has an established cluster system known for general ease of use
> and reliability on distributed systems.
> * Several OpenStack projects already support Redis as a backend option or
> have extended capabilities using Redis.
> * Redis can be installed on RHEL, SUSE, and DEB-based systems with ease.
> * Redis is open-source software licensed under the three-clause BSD
> license and would not have any of the questionable license implications
> found when dealing with anything Java.
> * The inclusion of Redis would work on a single node, allowing developers
> to continue working in VMs running on laptops with 4GB of RAM, but would
> also scale to support the multi-controller use case with ease. This would
> also give developers the ability to work on systems that actually resemble
> production.
> * Redislock brings with it no additional developer-facing language
> dependencies (Redis is written in ANSI C and works ... without external
> dependencies [1]) while also providing a plethora of language bindings [2].
>
>
> I apologize for questioning the proposed solution so late into the
> development of this thread, and for not making the summit conversations to
> talk more with everyone who worked on the proposal. While the ship may
> have sailed on this point for now, I figured I'd ask why we might go down
> the path of ZooKeeper+Java when a solution with likely little to no
> development effort already exists, can support just about any
> production/development environment, has lots of bindings, and (IMHO) would
> integrate with the larger community more easily; many OpenStack developers
> and deployers already know Redis. With the inclusion of ZK+Java in
> DevStack, the act of making it the default essentially creates new hard
> dependencies, one of which is Java, and I'd like to avoid that if at all
> possible; basically I think we can do better.
> 

Kevin, thanks so much for your thoughts on this. I really do appreciate
that we've had a high diversity of opinions and facts brought to bear on
this subject.

The Aphyr/Jepsen tests that were linked before [1] show, IMO, that Redis
satisfies availability and partition tolerance in the CAP theorem [2].
Consistency is entirely compromised by a partition, and having multiple
Redis nodes means using a form of replication with no consistency
guarantees. I find it somewhat confusing that Redis actually claims _ALL
THREE_ things in the description of Redlock [3].

While I'm sure it works a lot of the time, when it breaks it will break
in very mysterious, and possibly undetectable, ways.

For some things, this would be no big deal. But in some it may result in
total disaster, like two conductors trying to both own a single Ironic
node and one accidentally erasing what the other just wrote there.

So, I think we need to think hard about how Redis's weaknesses would
affect the desired goals before we adopt Redis for DLM.

[1] https://aphyr.com/posts/307-call-me-maybe-redis-redux
[2] https://en.wikipedia.org/wiki/CAP_theorem
[3] http://redis.io/topics/distlock
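That failure mode is easy to simulate. The sketch below is a toy model (not real Redis, and a simplification of the real Redlock algorithm [3]): it does the quorum acquire over five in-memory instances with TTLs, and shows that a client which stalls past its TTL ends up sharing the "exclusive" lock with a second client:

```python
# Toy model of the Redlock quorum-acquire over 5 fake Redis instances.
# Demonstrates the safety gap above: TTL expiry during a client pause
# lets two clients hold the same lock at once.
import uuid


class FakeRedis:
    """One instance: SET key value NX with expiry, against a fake clock."""
    def __init__(self):
        self.store = {}  # key -> (token, expires_at)

    def set_nx(self, key, token, ttl, now):
        cur = self.store.get(key)
        if cur is None or cur[1] <= now:     # free, or previous lock expired
            self.store[key] = (token, now + ttl)
            return True
        return False


def redlock_acquire(instances, key, ttl, now):
    """Grant the lock if a majority of instances accept our token."""
    token = uuid.uuid4().hex
    granted = sum(inst.set_nx(key, token, ttl, now) for inst in instances)
    return token if granted >= len(instances) // 2 + 1 else None


instances = [FakeRedis() for _ in range(5)]

a = redlock_acquire(instances, "ironic-node-7", ttl=10, now=0)
assert a is not None             # client A holds the lock at t=0

# Client A stalls (GC pause, partition) past its TTL without releasing...
b = redlock_acquire(instances, "ironic-node-7", ttl=10, now=11)
assert b is not None and b != a  # client B acquires at t=11

# ...and A never learned its lock lapsed: both clients now believe they
# exclusively own "ironic-node-7" -- the two-conductors scenario above.
print("client A token:", a)
print("client B token:", b)
```

The names ("ironic-node-7", the 5-instance count) are illustrative; the point is that without a fencing mechanism, the breakage is silent on A's side.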

> 
> [0] - https://review.openstack.org/#/c/241040/
> [1] - http://redis.io/topics/introduction
> [2] - http://redis.io/topics/distlock
> 
> --
> 
> Kevin Carter
> IRC: cloudnull
> 
> 
> 
> From: Fox, Kevin M <kevin@pnnl.gov>
> Sent: Monday, November 9, 2015 1:54 PM
> To: maishsk+openst...@maishsk.com; OpenStack Development Mailing List (not 
> for usage questions)
> Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager 
> discussion @ the summit
> 
> Dedicating 3 controller nodes in a small cloud is not the best allocation
> of resources sometimes. You're thinking of medium to large clouds; small
> production clouds are a thing too, and at that scale a little downtime, if
> you actually hit the rare case of a node failure on the controller, ma

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-09 Thread Fox, Kevin M
Dedicating 3 controller nodes in a small cloud is not the best allocation
of resources sometimes. You're thinking of medium to large clouds; small
production clouds are a thing too, and at that scale a little downtime, if
you actually hit the rare case of a node failure on the controller, may be
acceptable. It's up to an op to decide.

We've also experienced that HA software sometimes causes more, or longer,
downtime than it prevents, due to its complexity, the knowledge required,
proper testing, etc. Again, in some ways the risk gets higher the smaller
the cloud is.

Being able to keep it simple and small for that case, then scale by
switching out pieces as needed, does have some tangible benefits.

Thanks,
Kevin

From: Maish Saidel-Keesing [mais...@maishsk.com]
Sent: Monday, November 09, 2015 11:35 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager 
discussion @ the summit

On 11/05/15 23:18, Fox, Kevin M wrote:
> You're assuming there are only 2 choices, zk or db+rabbit. I'm claiming
> both are suboptimal at present; a 3rd might be needed. Though even with
> its flaws, the db+rabbit choice has a few benefits too.
>
> You also seem to assert that to support large clouds, the default must be
> something that can scale that large. While that would be nice, I don't
> think it's a requirement if it's overly burdensome on deployers of
> non-huge clouds.
>
> I don't have metrics, but I would be surprised if most deployments today 
> (production + other) used 3 controllers with a full ha setup. I would guess 
> that the majority are single controller setups. With those, the
I think it would be safe to assume - that any kind of production cloud -
or any operator that considers their OpenStack environment something
that is close to production ready - would not be daft enough to deploy
their whole environment based on a single controller - which is a
whopper of a single point of failure.

Most Fuel (mirantis) deployments are multiple controllers.
RHOS also recommends doing multiple controllers.

I don't think that we as a community can afford to assume that 1
controller will suffice.
This does not say that maintaining zk will be any easier though.
> overhead of maintaining a whole dlm like zk seems like overkill. If
> db+rabbit would work for that one case, that would be one less thing for
> an op to have to set up. They already have to set up db+rabbit. Or even a
> dlm plugin of some sort, that won't scale but would be very easy to deploy
> and change out later when needed, would be very useful.
>
> etcd is starting to show up in a lot of other projects, and so it may be
> at sites already. Being able to support it may be less of a burden to
> operators than zk in some cases.
>
> If your cloud grows to the point where the dlm choice really matters for 
> scalability/correctness, then you probably have enough staff members to deal 
> with adding in zk, and that's probably the right choice.
>
> You can have multiple suggested things in addition to one default.
> Default to the thing that makes the most sense in the most common
> deployments, and make specific recommendations for certain scenarios,
> like "if greater than 100 nodes, we strongly recommend using zk" or
> something to that effect.
>
> Thanks,
> Kevin
>
>
> 
> From: Clint Byrum [cl...@fewbar.com]
> Sent: Thursday, November 05, 2015 11:44 AM
> To: openstack-dev
> Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager  
> discussion @ the summit
>
> Excerpts from Fox, Kevin M's message of 2015-11-04 14:32:42 -0800:
>> To clarify that statement a little more,
>>
>> Speaking only for myself as an op, I don't want to support yet one more
>> snowflake in a sea of snowflakes, that works differently than all the
>> rest, without a very good reason.
>>
>> Java has its own set of issues associated with the JVM: care-and-feeding
>> sorts of things. If we are to invest time/money/people in learning how to
>> properly maintain it, it's easier to justify if it's not a one-off for
>> just DLM.
>>
>> So I wouldn't go so far as to say we're vehemently opposed to Java, just
>> that DLM on its own is probably not a strong enough feature to justify
>> requiring pulling in Java. It's been only a very recent thing that you
>> could convince folks that DLM was needed at all. So either make Java
>> optional, or find some other use case that needs Java badly enough that
>> you can make Java a required component. I suspect some day searchlight
>> might be compelling enough for that, but not

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-09 Thread Maish Saidel-Keesing

On 11/05/15 23:18, Fox, Kevin M wrote:

You're assuming there are only 2 choices, zk or db+rabbit. I'm claiming
both are suboptimal at present; a 3rd might be needed. Though even with its
flaws, the db+rabbit choice has a few benefits too.

You also seem to assert that to support large clouds, the default must be
something that can scale that large. While that would be nice, I don't
think it's a requirement if it's overly burdensome on deployers of non-huge
clouds.

I don't have metrics, but I would be surprised if most deployments today 
(production + other) used 3 controllers with a full ha setup. I would guess 
that the majority are single controller setups. With those, the
I think it would be safe to assume - that any kind of production cloud - 
or any operator that considers their OpenStack environment something 
that is close to production ready - would not be daft enough to deploy 
their whole environment based on a single controller - which is a 
whopper of a single point of failure.


Most Fuel (mirantis) deployments are multiple controllers.
RHOS also recommends doing multiple controllers.

I don't think that we as a community can afford to assume that 1 
controller will suffice.

This does not say that maintaining zk will be any easier though.

overhead of maintaining a whole DLM like zk seems like overkill. If db+rabbit 
would work for that one case, that would be one less thing for an op to set up; 
they already have to set up db+rabbit. Or even a DLM plugin of some sort that 
won't scale, but would be very easy to deploy and could be changed out later 
when needed, would be very useful.

etcd is starting to show up in a lot of other projects, and so it may be at 
sites already. Being able to support it may be less of a burden on operators 
than zk in some cases.

If your cloud grows to the point where the DLM choice really matters for 
scalability/correctness, then you probably have enough staff members to deal 
with adding in zk, and that's probably the right choice.

You can have multiple suggested things in addition to one default. Default to the 
thing that makes the most sense in the most common deployments, and make specific 
recommendations for certain scenarios: e.g., "if greater than 100 nodes, we strongly 
recommend using zk" or something to that effect.

Thanks,
Kevin



From: Clint Byrum [cl...@fewbar.com]
Sent: Thursday, November 05, 2015 11:44 AM
To: openstack-dev
Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager  
discussion @ the summit

Excerpts from Fox, Kevin M's message of 2015-11-04 14:32:42 -0800:

To clarify that statement a little more,

Speaking only for myself as an op, I don't want to support yet one more 
snowflake in a sea of snowflakes that works differently than all the rest, 
without a very good reason.

Java has its own set of issues associated with the JVM: care-and-feeding sorts 
of things. If we are to invest time/money/people in learning how to properly 
maintain it, it's easier to justify if it's not just a one-off for DLM.

So I wouldn't go so far as to say we're vehemently opposed to Java, just that 
DLM on its own is probably not a strong enough feature to justify requiring 
pulling in Java. It's only very recently that you could convince folks that a 
DLM was needed at all. So either make Java optional, or find some other use 
case that needs Java badly enough that you can make Java a required component. 
I suspect some day searchlight might be compelling enough for that, but not 
today.

As for the default, the default should be a good reference. If most sites would 
run with etcd or something else, since Java isn't needed, then don't default 
zookeeper on.


There are a number of reasons, but the most important are:

* Resilience in the face of failures - The current database+MQ based
   solutions are all custom made and have unknown characteristics when
   there are network partitions and node failures.
* Scalability - The current database+MQ solutions rely on polling the
   database and/or sending lots of heartbeat messages or even using the
   database to store heartbeat transactions. This scales fine for tiny
   clusters, but when every new node adds more churn to the MQ and
   database, this will (and has been observed to) be intractable.
* Tech debt - OpenStack is inventing lock solutions and then maintaining
   them. And service discovery solutions, and then maintaining them.
   Wouldn't you rather have better upgrade stories, more stability, more
   scale, and more features?
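The scalability bullet above can be made concrete with a rough back-of-the-envelope model (the heartbeat interval and per-node service count below are invented, purely illustrative numbers): with DB/MQ-based liveness, every node emits periodic heartbeats even when nothing changes, so total churn grows linearly with node count, whereas a watch-based system only generates traffic on actual state changes.

```python
def heartbeat_msgs_per_sec(nodes, services_per_node=3, interval_s=10):
    """Messages per second generated by periodic DB/MQ heartbeats.

    Illustrative model only: assumes each service on each node reports
    liveness once every `interval_s` seconds; both defaults are assumptions.
    """
    return nodes * services_per_node / interval_s

# Churn grows linearly with cluster size, even when the cluster is idle:
for n in (10, 100, 1000, 10000):
    print(f"{n:>6} nodes -> {heartbeat_msgs_per_sec(n):>8.1f} heartbeat msgs/sec")
```

At 10 nodes this background load is noise; at 10000 nodes it is thousands of messages per second spent on bookkeeping rather than servicing users.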

If those aren't compelling enough reasons to deploy a mature java service
like Zookeeper, I don't know what would be. But I do think using the
abstraction layer of tooz will at least allow us to move forward without
having to convince everybody everywhere that this is actually just the
path of least resistance.
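The tooz abstraction mentioned above can be illustrated with a minimal sketch. This is not tooz's real API (tooz's `get_coordinator()` loads real drivers via entry points); the classes and registry here are invented stand-ins showing only the shape of the idea: the backend is selected by a deployment-supplied URL, so swapping ZooKeeper for Redis or etcd is a config change, not a code change.

```python
class FakeLock:
    """Stand-in lock; a real driver would call out to zk/redis/etcd."""
    def __init__(self, name):
        self.name = name
        self.held = False
    def __enter__(self):
        self.held = True
        return self
    def __exit__(self, *exc):
        self.held = False

class FakeCoordinator:
    """Stand-in coordinator; a real system would have one class per backend."""
    def __init__(self, url):
        self.url = url
    def get_lock(self, name):
        return FakeLock(name)

# Drivers registered by URL scheme (invented registry, for illustration).
DRIVERS = {"zookeeper": FakeCoordinator, "redis": FakeCoordinator,
           "etcd": FakeCoordinator}

def get_coordinator(url):
    """Pick the driver named by the deployment-supplied URL."""
    return DRIVERS[url.split("://", 1)[0]](url)

# Application code never names the backend directly:
coord = get_coordinator("zookeeper://localhost:2181")
with coord.get_lock("resize-instance-42") as lock:
    assert lock.held  # critical section runs while the lock is held
```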




--
Best Regards,
Maish Said

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-09 Thread Joshua Harlow
On Mon, Nov 9, 2015, at 10:24 PM, Kevin Carter wrote:
> Hello all, 
> 
> The rationale behind using a solution like zookeeper makes sense; however,
> in reviewing the thread I found myself asking if there was a better way
> to address the problem without the addition of a Java-based solution as
> the default. While it has been covered that the current implementation
> would be a reference and that "other" driver support in Tooz would allow
> for any backend a deployer may want, the work being proposed within
> devstack [0] would become the default development case, thus making it the
> de-facto standard, and I think we could do better in terms of supporting
> developers and delivering capability.
> 
> My thoughts on using Redis+Redislock instead of Java+Zookeeper as the
> default option:
> * Tooz already supports redislock.
> * Redis has an established cluster system known for general ease of use
> and reliability in distributed systems. 
> * Several OpenStack projects already support Redis as a backend option or
> have extended capabilities using Redis.
> * Redis can be deployed on RHEL, SUSE, and DEB-based systems with
> ease. 
> * Redis is open source software licensed under the three-clause BSD
> license and would not have any of the questionable license
> implications found when dealing with anything Java.
> * The inclusion of Redis would work on a single node, allowing developers
> to continue working in VMs running on laptops with 4GB of RAM, but would
> also scale to support the multi-controller use case with ease. This would
> also give developers the ability to work on systems that will actually
> resemble production.
> * Redislock would bring with it no additional developer-facing language
> dependencies (Redis is written in ANSI C and works ... without external
> dependencies [1]) while also providing a plethora of language bindings
> [2].

So FYI, I wrote the tooz redis backend (and it's being used, I think,
by ceilometer and others) ;)

https://github.com/openstack/tooz/blob/master/tooz/drivers/redis.py

http://docs.openstack.org/developer/tooz/developers.html#redis

It internally uses redis-py and wraps up that library's Lua lock, which
last time I checked uses the red-lock algorithm to operate:

https://github.com/openstack/tooz/blob/master/tooz/drivers/redis.py#L57

So it exists, although IMHO it is on the potential list of drivers to
chop/deprecate; redis still doesn't really have a preference for
consistency (even with the newly developed/added clustering support):

From http://redis.io/topics/cluster-spec

'''Acceptable degree of write safety: the system
tries (in a best-effort way) to retain all the writes originating
from clients connected with the majority of the master nodes.'''

To me the above may be OK, but it doesn't seem like the most ideal goal to
have a system used for locking, service discovery, and other critical
OpenStack functionality operate in a 'best-effort' way. The spec at
https://review.openstack.org/#/c/240645/ can hopefully flesh this out soon
enough (and that will determine the fate of the above driver).

Btw, other drivers have been implemented too (not all with full API support;
see [1] for compatibility listings):

https://github.com/openstack/tooz/tree/master/tooz/drivers

TL;DR: let's flesh the desired capabilities out on
https://review.openstack.org/#/c/240645/ and go from there (redis may
be ok, but it may not be as well...)
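For readers unfamiliar with the red-lock scheme mentioned above: its single-instance building block is Redis's `SET key token NX PX ttl` for acquire, plus an atomic check-token-then-delete for release. The sketch below models that with an in-memory stand-in for one Redis server; it illustrates only the algorithm, not redis-py's actual classes, and the multi-master quorum step of full Redlock is omitted.

```python
import time
import uuid

class FakeRedis:
    """In-memory stand-in for a single Redis instance (illustration only)."""
    def __init__(self):
        self._store = {}  # key -> (holder token, expiry timestamp)

    def set_nx_px(self, key, token, ttl_ms):
        """SET key token NX PX ttl_ms: succeed only if key is absent or expired."""
        now = time.monotonic()
        current = self._store.get(key)
        if current is not None and current[1] > now:
            return False  # someone else holds an unexpired lock
        self._store[key] = (token, now + ttl_ms / 1000.0)
        return True

    def delete_if_token_matches(self, key, token):
        """In real Redis this check-and-delete must run as one Lua script,
        so that a client can never release another client's lock."""
        current = self._store.get(key)
        if current is not None and current[0] == token:
            del self._store[key]
            return True
        return False

server = FakeRedis()
token = uuid.uuid4().hex  # random token identifies this lock holder
assert server.set_nx_px("dlm:resource", token, ttl_ms=5000)
assert not server.set_nx_px("dlm:resource", "other-holder", ttl_ms=5000)
assert not server.delete_if_token_matches("dlm:resource", "other-holder")
assert server.delete_if_token_matches("dlm:resource", token)
```

The TTL is what makes it a lease: a crashed holder's lock simply expires, which is the property the heartbeat-based DB schemes emulate by hand.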

> 
> 
> I apologize for questioning the proposed solution so late into the
> development of this thread, and for not making the summit conversations to
> talk more with everyone who worked on the proposal. While the ship may
> have sailed on this point for now, I figured I'd ask why we might go down
> the path of Zookeeper+Java when a solution with likely little to no
> development effort already exists, can support just about any
> production/development environment, has lots of bindings, and (IMHO)
> would integrate with the larger community more easily; many OpenStack
> developers and deployers already know Redis. Including ZK+Java in DevStack
> and making it the default essentially creates new hard dependencies, one
> of which is Java, and I'd like to avoid that if at all possible; basically
> I think we can do better.
> 
> 
> [0] - https://review.openstack.org/#/c/241040/
> [1] - http://redis.io/topics/introduction
> [2] - http://redis.io/topics/distlock
> 
> --
> 
> Kevin Carter
> IRC: cloudnull
> 
> 
> ____
> From: Fox, Kevin M <kevin@pnnl.gov>
> Sent: Monday, November 9, 2015 1:54 PM
> To: maishsk+openst...@maishsk.com; OpenStack Development Mailing List
> (not for usage questions)
> Subject: Re: [openstack-dev] [all] Outcome of distributed loc

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-09 Thread Joshua Harlow
On Mon, Nov 9, 2015, at 10:24 PM, Kevin Carter wrote:
> Hello all, 
> 
> The rationale behind using a solution like zookeeper makes sense; however,
> in reviewing the thread I found myself asking if there was a better way
> to address the problem without the addition of a Java-based solution as
> the default. While it has been covered that the current implementation
> would be a reference and that "other" driver support in Tooz would allow
> for any backend a deployer may want, the work being proposed within
> devstack [0] would become the default development case, thus making it the
> de-facto standard, and I think we could do better in terms of supporting
> developers and delivering capability.
> 
> My thoughts on using Redis+Redislock instead of Java+Zookeeper as the
> default option:
> * Tooz already supports redislock.
> * Redis has an established cluster system known for general ease of use
> and reliability in distributed systems. 

This one I somewhat suspect; the clustering support was only released about
six months ago:

https://github.com/antirez/redis/blob/3.0/00-RELEASENOTES#L130

So I'm not exactly sure how established (or how well deployed and tested)
it is; does anyone have experience with it, configuring it, and handling
its failure modes? It'd be nice to know how it works (and I'm generally
curious).

> * Several OpenStack projects already support Redis as a backend option or
> have extended capabilities using Redis.
> * Redis can be deployed on RHEL, SUSE, and DEB-based systems with
> ease. 
> * Redis is open source software licensed under the three-clause BSD
> license and would not have any of the questionable license
> implications found when dealing with anything Java.
> * The inclusion of Redis would work on a single node, allowing developers
> to continue working in VMs running on laptops with 4GB of RAM, but would
> also scale to support the multi-controller use case with ease. This would
> also give developers the ability to work on systems that will actually
> resemble production.
> * Redislock would bring with it no additional developer-facing language
> dependencies (Redis is written in ANSI C and works ... without external
> dependencies [1]) while also providing a plethora of language bindings
> [2].
> 
> 
> I apologize for questioning the proposed solution so late into the
> development of this thread, and for not making the summit conversations to
> talk more with everyone who worked on the proposal. While the ship may
> have sailed on this point for now, I figured I'd ask why we might go down
> the path of Zookeeper+Java when a solution with likely little to no
> development effort already exists, can support just about any
> production/development environment, has lots of bindings, and (IMHO)
> would integrate with the larger community more easily; many OpenStack
> developers and deployers already know Redis. Including ZK+Java in DevStack
> and making it the default essentially creates new hard dependencies, one
> of which is Java, and I'd like to avoid that if at all possible; basically
> I think we can do better.
> 
> 
> [0] - https://review.openstack.org/#/c/241040/
> [1] - http://redis.io/topics/introduction
> [2] - http://redis.io/topics/distlock
> 
> --
> 
> Kevin Carter
> IRC: cloudnull
> 
> 
> 
> From: Fox, Kevin M <kevin@pnnl.gov>
> Sent: Monday, November 9, 2015 1:54 PM
> To: maishsk+openst...@maishsk.com; OpenStack Development Mailing List
> (not for usage questions)
> Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager
> discussion @ the summit
> 
> Dedicating 3 controller nodes in a small cloud is not always the best
> allocation of resources. You're thinking of medium to large clouds. Small
> production clouds are a thing too, and at that scale a little downtime,
> if you actually hit the rare case of a node failure on the controller, may
> be acceptable. It's up to an op to decide.
> 
> We've also experienced that sometimes HA software causes more, or longer,
> downtimes than it solves. Due to its complexity, knowledge required,
> proper testing, etc. Again, the risk gets higher the smaller
> the cloud is, in some ways.
> 
> Being able to keep it simple and small for that case, then scale by
> switching out pieces as needed, does have some tangible benefits.
> 
> Thanks,
> Kevin
> ________
> From: Maish Saidel-Keesing [mais...@maishsk.com]
> Sent: Monday, November 09, 2015 11:35 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager
> discussion @ the summit
&

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-09 Thread Robert Collins
On 10 November 2015 at 19:24, Kevin Carter  wrote:
> Hello all,
>
> The rationale behind using a solution like zookeeper makes sense; however, in 
> reviewing the thread I found myself asking if there was a better way to 
> address the problem without the addition of a Java-based solution as the 
> default. While it has been covered that the current implementation would be a 
> reference and that "other" driver support in Tooz would allow for any backend 
> a deployer may want, the work being proposed within devstack [0] would become 
> the default development case, thus making it the de-facto standard, and I 
> think we could do better in terms of supporting developers and delivering 
> capability.
>
> My thoughts on using Redis+Redislock instead of Java+Zookeeper as the default 
> option:
> * Tooz already supports redislock.
> * Redis has an established cluster system known for general ease of use and 
> reliability in distributed systems.

I believe Clint already linked to
https://aphyr.com/posts/309-knossos-redis-and-linearizability or
similar - but 'known for general ease of use and reliability' is, um,
a bold claim. It's worth comparing that (and the other redis writeups)
to this one: https://aphyr.com/posts/291-call-me-maybe-zookeeper. "Use
zookeeper, it's mature".

> * Several OpenStack projects already support Redis as a backend option or 
> have extended capabilities using Redis.
> * Redis can be deployed on RHEL, SUSE, and DEB-based systems with ease.
> * Redis is open source software licensed under the three-clause BSD license 
> and would not have any of the questionable license implications found when 
> dealing with anything Java.

OpenJDK is present on the same Linux distributions, and has been
used in both open source and proprietary programs for decades. *What*
license implications are you speaking of?

> * The inclusion of Redis would work on a single node, allowing developers to 
> continue working in VMs running on laptops with 4GB of RAM, but would also 
> scale to support the multi-controller use case with ease. This would also 
> give developers the ability to work on systems that will actually resemble 
> production.

I believe you can do this with zookeeper - either single-process, or as
three processes on one machine to emulate a cluster - very easily.
Quoting http://qnalist.com/questions/29943/java-heap-size-for-zookeeper
- "It's more dependent on your workload than anything. If you're
storing on order of hundreds of small znodes then 1gb is going to [be]
more then fine." Obviously we should test this and confirm it, but
developer efficiency is a key part of any decision, and AFAIK there is
absolutely nothing in the way as far as zookeeper goes. Just like
rabbitmq and openvswitch, it's a mature thing, written in a language
other than Python, which needs its own care and feeding (and that
feeding is something like 90% zk-specific, not 'java headaches').
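For what it's worth, the three-processes-on-one-machine setup needs only three small config files along these lines (the ports and paths below are arbitrary illustrative choices; each dataDir must also contain a `myid` file holding the matching server number):

```
# zoo1.cfg -- repeat as zoo2.cfg / zoo3.cfg with their own dataDir and clientPort
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zk1
clientPort=2181
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
```

Start each process against its own config and the three elect a leader among themselves, giving a quorum that survives any single process dying.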

> * Redislock would bring with it no additional developer-facing language 
> dependencies (Redis is written in ANSI C and works ... without external 
> dependencies [1]) while also providing a plethora of language bindings [2].
>
>
> I apologize for questioning the proposed solution so late into the 
> development of this thread, and for not making the summit conversations to 
> talk more with everyone who worked on the proposal. While the ship may have 
> sailed on this point for now, I figured I'd ask why we might go down the path 
> of Zookeeper+Java when a solution with likely little to no development effort 
> already exists, can support just about any production/development 
> environment, has lots of bindings, and (IMHO) would integrate with the larger 
> community more easily; many OpenStack developers and deployers already know 
> Redis. Including ZK+Java in DevStack and making it the default essentially 
> creates new hard dependencies, one of which is Java, and I'd like to avoid 
> that if at all possible; basically I think we can do better.


I think it's fine to raise the question, but let's perhaps set some priorities.

1) The default should be mature
2) The default should be suitable for developer use on a modest laptop
3) The default should be suitable for use in the majority of clouds

I believe zk meets all those priorities, and redis does not. It's
possible that etcd and/or consul do, though they are much newer and so
perhaps fail on the maturity scale. I'm certain redis does not - at
least, not unless the previously reported defects have been fixed in
the last 2 years.

-Rob



-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-09 Thread Kevin Carter
Hello all, 

The rationale behind using a solution like zookeeper makes sense; however, in 
reviewing the thread I found myself asking if there was a better way to address 
the problem without the addition of a Java-based solution as the default. While 
it has been covered that the current implementation would be a reference and 
that "other" driver support in Tooz would allow for any backend a deployer may 
want, the work being proposed within devstack [0] would become the default 
development case, thus making it the de-facto standard, and I think we could do 
better in terms of supporting developers and delivering capability.

My thoughts on using Redis+Redislock instead of Java+Zookeeper as the default 
option:
* Tooz already supports redislock.
* Redis has an established cluster system known for general ease of use and 
reliability in distributed systems. 
* Several OpenStack projects already support Redis as a backend option or have 
extended capabilities using Redis.
* Redis can be deployed on RHEL, SUSE, and DEB-based systems with ease. 
* Redis is open source software licensed under the three-clause BSD license 
and would not have any of the questionable license implications found when 
dealing with anything Java.
* The inclusion of Redis would work on a single node, allowing developers to 
continue working in VMs running on laptops with 4GB of RAM, but would also scale 
to support the multi-controller use case with ease. This would also give 
developers the ability to work on systems that will actually resemble 
production.
* Redislock would bring with it no additional developer-facing language 
dependencies (Redis is written in ANSI C and works ... without external 
dependencies [1]) while also providing a plethora of language bindings [2].


I apologize for questioning the proposed solution so late into the development 
of this thread, and for not making the summit conversations to talk more with 
everyone who worked on the proposal. While the ship may have sailed on this 
point for now, I figured I'd ask why we might go down the path of Zookeeper+Java 
when a solution with likely little to no development effort already exists, can 
support just about any production/development environment, has lots of 
bindings, and (IMHO) would integrate with the larger community more easily; 
many OpenStack developers and deployers already know Redis. Including ZK+Java 
in DevStack and making it the default essentially creates new hard 
dependencies, one of which is Java, and I'd like to avoid that if at all 
possible; basically I think we can do better.


[0] - https://review.openstack.org/#/c/241040/
[1] - http://redis.io/topics/introduction
[2] - http://redis.io/topics/distlock

--

Kevin Carter
IRC: cloudnull



From: Fox, Kevin M <kevin@pnnl.gov>
Sent: Monday, November 9, 2015 1:54 PM
To: maishsk+openst...@maishsk.com; OpenStack Development Mailing List (not for 
usage questions)
Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager 
discussion @ the summit

Dedicating 3 controller nodes in a small cloud is not always the best allocation 
of resources. You're thinking of medium to large clouds. Small production 
clouds are a thing too, and at that scale a little downtime, if you actually 
hit the rare case of a node failure on the controller, may be acceptable. It's 
up to an op to decide.

We've also experienced that sometimes HA software causes more, or longer, 
downtimes than it solves. Due to its complexity, knowledge required, proper 
testing, etc. Again, the risk gets higher the smaller the cloud is, in some 
ways.

Being able to keep it simple and small for that case, then scale by switching 
out pieces as needed, does have some tangible benefits.

Thanks,
Kevin

From: Maish Saidel-Keesing [mais...@maishsk.com]
Sent: Monday, November 09, 2015 11:35 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager 
discussion @ the summit

On 11/05/15 23:18, Fox, Kevin M wrote:
> You're assuming there are only 2 choices,
>   zk or db+rabbit. I'm claiming both are suboptimal at present; a third might 
> be needed. Though even with its flaws, the db+rabbit choice has a few 
> benefits too.
>
> You also seem to assert that to support large clouds, the default must be 
> something that can scale that large. While that would be nice, I don't think 
> it's a requirement if it's overly burdensome on deployers of non-huge clouds.
>
> I don't have metrics, but I would be surprised if most deployments today 
> (production + other) used 3 controllers with a full HA setup. I would guess 
> that the majority are single-controller setups. With those, the
I think it would be safe to assume that any kind of production cloud,
or any operator that

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-06 Thread Fox, Kevin M


> -Original Message-
> From: Clint Byrum [mailto:cl...@fewbar.com]
> Sent: Thursday, November 05, 2015 3:19 PM
> To: openstack-dev
> Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager
> discussion @ the summit
> 
> Excerpts from Fox, Kevin M's message of 2015-11-05 13:18:13 -0800:
> > Your assuming there are only 2 choices,  zk or db+rabbit. I'm claiming
> > both hare suboptimal at present. a 3rd might be needed. Though even
> with its flaws, the db+rabbit choice has a few benefits too.
> >
> 
> Well, I'm assuming it is zk/etcd/consul, because while the java argument is
> rather religious, the reality is all three are significantly different from
> databases and message queues and thus will be "snowflakes". But yes, I
> _am_ assuming that Zookeeper is a natural, logical, simple choice, and that
> fact that it runs in a jvm is a poor reason to avoid it.

Yes. Having a snowflake there is probably unavoidable, but how much of a 
snowflake it is matters.

I've had to tune JVM stuff like the Java stack size when things spontaneously 
break, and then they tell you: oh, yeah, when that happens, go tweak such and 
such in the JVM... Unix sysadmins usually know the common things for C apps 
without much effort, and tend to know to look in advance. In my somewhat 
limited experience with Go, the runtime seems closer to regular Unix programs 
than JVM ones.

The term 'java' is often conflated to mean both the Java language and the JVM 
runtime. When people talk about Java, often they are talking about the JVM. I 
think this is one of those cases. It's easier to debug C/Go for Unix admins not 
trained specifically in JVM behaviors/tunables.

> 
> > You also seem to assert that to support large clouds, the default must be
> something that can scale that large. While that would be nice, I don't think
> its a requirement if its overly burdensome on deployers of non huge clouds.
> >
> 
> I think the current solution even scales poorly for medium sized clouds.
> Only the tiniest of clouds with the fewest nodes can really sustain all of 
> that
> polling without incurring cost for that overhead that would be better spent
> on serviceing users.

While not ideal, I've run clouds with around 100 nodes on a single controller. 
If it's doable today, it should be doable with the new system. It's not ideal, 
but if it's a zero-effort deploy and easy to debug, that has something going 
for it.

> 
> > I don't have metrics, but I would be surprised if most deployments today
> (production + other) used 3 controllers with a full ha setup. I would guess
> that the majority are single controller setups. With those, the overhead of
> maintaining a whole dlm like zk seems like overkill. If db+rabbit would work
> for that one case, that would be one less thing to have to setup for an op.
> They already have to setup db+rabbit. Or even a clm plugin of some sort,
> that won't scale, but would be very easy to deploy, and change out later
> when needed would be very useful.
> >
> 
> We do have metrics:
> 
> http://www.openstack.org/assets/survey/Public-User-Survey-Report.pdf
> 
> Page 35, "How many physical compute nodes do OpenStack clouds have?"
> 

Not what I was asking. It was asking how many controllers, not how many compute 
nodes. Like I said above, 1 controller can handle quite a bit of compute nodes.

> 
> 10-99:   42%
> 1-9:     36%
> 100-999: 15%
> 1000+:    7%
> 
> So for respondents to that survey, yes, "most" are running less than 100
> nodes. However, by compute node count, if we extrapolate a bit:
> 
> There were 154 respondents, so:
> 
> 10-99 * 42%   = 640 - 6403 nodes
> 1-9 * 36%     = 55 - 498 nodes
> 100-999 * 15% = 2300 - 23076 nodes
> 1000+ * 7%    = 1 - 107789 nodes
>

This is good, but I believe it is biased towards the top end.

Respondents are much more likely to respond if they have a larger cloud to brag 
about. Folks doing it for development, testing, and other reasons may not 
respond because it's not worth the effort. 
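For reference, the bucket arithmetic quoted above is easy to reproduce. The bucket shares are the survey's; capping the open-ended 1000+ bucket at 9999 nodes is my own assumption (it happens to reproduce the quoted 107789 upper bound):

```python
respondents = 154
# share of respondents, then (min, max) nodes per cloud for each bucket
buckets = {
    "1-9":     (0.36, 1, 9),
    "10-99":   (0.42, 10, 99),
    "100-999": (0.15, 100, 999),
    "1000+":   (0.07, 1000, 9999),  # upper bound is an assumption
}
for name, (share, lo, hi) in buckets.items():
    clouds = respondents * share          # estimated clouds in this bucket
    print(f"{name:>8}: {int(clouds * lo):>6} - {int(clouds * hi):>6} nodes")
```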

> So in terms of the number of actual computers running OpenStack compute,
> as an example, from the survey respondents, there are more computes
> running in *one* of the clouds with more than 1000 nodes than there are in
> *all* of the clouds with less than 10 nodes, and certainly more in all of the
> clouds over 1000 nodes, than in all of the clouds with less than 100 nodes.

For the reason listed above, I don't think we have enough evidence draw too 
strong a conclusion from this.

> 
> What this means, to me, is that the investment in OpenStack should focus
> on those with > 1000, since those orgs are definitely investing a lot more
> today. We shouldn't make it _hard_ to do a tiny cloud, but I think it's 

Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Chris Dent

On Thu, 5 Nov 2015, Robert Collins wrote:


In the session we were told that zookeeper is already used in CI jobs
for ceilometer (was this wrong?) and that's why we figured it made a
sane default for devstack.


For clarity: What ceilometer (actually gnocchi) is doing is using tooz
in CI (gate-ceilometer-dsvm-integration). And for now it is using
redis as that was "simple".

Outside of CI it is possible to deploy ceilo, aodh and gnocchi to use
tooz for coordinating group partitioning in active-active HA setups
and shared locks. Again the standard deploy for that has been to use
redis because of availability. It's fairly understood that zookeeper
would be more correct but there are packaging concerns.

--
Chris Dent   (╯°□°)╯︵┻━┻   http://anticdent.org/
freenode: cdent tw: @anticdent


Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Sean Dague
On 11/05/2015 06:00 AM, Thierry Carrez wrote:
> Hayes, Graham wrote:
>> On 04/11/15 20:04, Ed Leafe wrote:
>>> On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:

 Here's a Devstack review for zookeeper in support of this initiative:

 https://review.openstack.org/241040

 Thanks,
 Dims
>>>
>>> I thought that the operators at that session made it very clear that they 
>>> would *not* run any Java applications, and that if OpenStack required a 
>>> Java app to run, they would no longer use it.
>>>
>>> I like the idea of using Zookeeper as the DLM, but I don't think it should 
>>> be set up as a default, even for devstack, given the vehement opposition 
>>> expressed.
>>>
>>>
>>> -- Ed Leafe
>>>
>>
>> I got the impression that there were *some* operators that wouldn't run
>> java.

I feel like I'd like to see that with data. Because every Ops session
I've been in around logging and debugging has had nearly everyone raise
their hand that they are running the ELK stack for log analysis. So they
are all running Java already.

I would absolutely hate to have some design point get made based on
rumors from ops and "java is icky" sentiment from the dev space.

Defaults matter, because it means you get a critical mass of operators
running similar configs, and they can build and share knowledge. For all
of the issues with Rabbit, it has demonstrably been good to have
collaboration in the field between operators that have shared patterns
and fed back the issues. So we should really say Zookeeper is the
default choice, even if there are others people could choose that have
extra mustachy / monocle goodness.

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Sean Dague
On 11/05/2015 03:08 AM, Chris Dent wrote:
> On Thu, 5 Nov 2015, Robert Collins wrote:
> 
>> In the session we were told that zookeeper is already used in CI jobs
>> for ceilometer (was this wrong?) and that's why we figured it made a
>> sane default for devstack.
> 
> For clarity: What ceilometer (actually gnocchi) is doing is using tooz
> in CI (gate-ceilometer-dsvm-integration). And for now it is using
> redis as that was "simple".
> 
> Outside of CI it is possible to deploy ceilo, aodh and gnocchi to use
> tooz for coordinating group partitioning in active-active HA setups
> and shared locks. Again the standard deploy for that has been to use
> redis because of availability. It's fairly understood that zookeeper
> would be more correct but there are packaging concerns.

What are the packaging concerns for zookeeper?

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Thierry Carrez
Hayes, Graham wrote:
> On 04/11/15 20:04, Ed Leafe wrote:
>> On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
>>>
>>> Here's a Devstack review for zookeeper in support of this initiative:
>>>
>>> https://review.openstack.org/241040
>>>
>>> Thanks,
>>> Dims
>>
>> I thought that the operators at that session made it very clear that they 
>> would *not* run any Java applications, and that if OpenStack required a Java 
>> app to run, they would no longer use it.
>>
>> I like the idea of using Zookeeper as the DLM, but I don't think it should 
>> be set up as a default, even for devstack, given the vehement opposition 
>> expressed.
>>
>>
>> -- Ed Leafe
>>
> 
> I got the impression that there were *some* operators who wouldn't run
> java.
> 
> I do not see an issue with having ZooKeeper as the default, as long as
> there is an alternate solution that also works for the operators that do
> not want to use it.

Yes, that is my recollection. We can't make Java mandatory, so we need
to have the *option* to not run any Java (for those people who don't
want to start touching it, for various reasons).

IMHO that doesn't mean ZK cannot be the early default in devstack, or
that we should hold all DLM work until a Consul/etcd driver is
production-ready. It just means we need to have people signed up to
build and maintain a Consul and/or etcd driver :)

NB: I wouldn't mind helping on an etcd driver, that sounds like a fun
side project. I'm just totally unsure I'll have time to do it.
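For a sense of what such a driver builds on, here is a sketch of the locking primitive etcd's v2 HTTP API offered at the time: atomically create a TTL'd key only if it does not already exist. The endpoint and key layout are illustrative, not tooz's actual driver, and a real driver would also refresh the TTL (heartbeat) while holding the lock.

```python
import urllib.error
import urllib.parse
import urllib.request

# Illustrative endpoint and key layout for etcd's v2 keys API.
ETCD = 'http://127.0.0.1:2379/v2/keys/locks/'

def acquire(name, holder, ttl=30):
    """Try to take the lock: atomically create a TTL'd key iff absent."""
    body = urllib.parse.urlencode(
        {'value': holder, 'ttl': ttl, 'prevExist': 'false'}).encode()
    req = urllib.request.Request(ETCD + name, data=body, method='PUT')
    try:
        urllib.request.urlopen(req)
        return True                  # created: we hold the lock until ttl
    except urllib.error.HTTPError as exc:
        if exc.code == 412:          # Precondition Failed: already held
            return False
        raise

def release(name, holder):
    """Compare-and-delete so we never remove a lock we lost to TTL expiry."""
    query = urllib.parse.urlencode({'prevValue': holder})
    req = urllib.request.Request('%s%s?%s' % (ETCD, name, query),
                                 method='DELETE')
    urllib.request.urlopen(req)
```

Note what is missing relative to ZooKeeper: there is no queue of waiters, so "waiting" means retrying the create, which is the fairness gap discussed later in this thread.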

-- 
Thierry Carrez (ttx)



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Joshua Harlow

Sean Dague wrote:

On 11/05/2015 06:00 AM, Thierry Carrez wrote:

Hayes, Graham wrote:

On 04/11/15 20:04, Ed Leafe wrote:

On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:

Here's a Devstack review for zookeeper in support of this initiative:

https://review.openstack.org/241040

Thanks,
Dims

I thought that the operators at that session made it very clear that they would 
*not* run any Java applications, and that if OpenStack required a Java app to 
run, they would no longer use it.

I like the idea of using Zookeeper as the DLM, but I don't think it should be 
set up as a default, even for devstack, given the vehement opposition expressed.


-- Ed Leafe


I got the impression that there were *some* operators who wouldn't run
java.


I feel like I'd like to see that with data. Because every Ops session
I've been in around logging and debugging has had nearly everyone raise
their hand that they are running the ELK stack for log analysis. So they
are all running Java already.

I would absolutely hate to have some design point get made based on
rumors from ops and "java is icky" sentiment from the dev space.

Defaults matter, because it means you get a critical mass of operators
running similar configs, and they can build and share knowledge. For all
of the issues with Rabbit, it has demonstrably been good to have
collaboration in the field between operators that have shared patterns
and fed back the issues. So we should really say Zookeeper is the
default choice, even if there are others people could choose that have
extra mustachy / monocle goodness.



+1 from me

I mean, I get that there will be some person out there who will say 'no,
icky, that's java', but that type of person will *always* exist no matter
what the situation, and if we are basing sound technical decisions on that
one person (and/or small set of people) it makes me wonder what the heck
we are doing...


Because that's totally crazy (IMHO). After a while we need to listen to 
the 99% and make a solution targeted at them, and accept that we will 
not make 100% of people happy all the time. This is why I personally 
like being opinionated and I think/thought that OpenStack as a group had
matured enough to do this (but I see that it still isn't ready to do this).


My 2 cents,

-Josh


-Sean





Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Robert Collins
On 5 November 2015 at 11:32, Fox, Kevin M  wrote:
> To clarify that statement a little more,
>
> Speaking only for myself as an op, I don't want to support yet one more
> snowflake in a sea of snowflakes, one that works differently than all the
> rest, without a very good reason.
>
> Java has its own set of issues associated with the JVM: care-and-feeding
> sorts of things. If we are to invest time/money/people in learning how to
> properly maintain it, it's easier to justify if it's not just a one-off
> for DLM alone.
>
> So I wouldn't go so far as to say we're vehemently opposed to Java, just
> that DLM on its own is probably not a strong enough feature to justify
> requiring pulling in Java. It's only very recently that you could convince
> folks that a DLM was needed at all. So either make Java optional, or find
> some other use case that needs Java badly enough that you can make Java a
> required component. I suspect some day searchlight might be compelling
> enough for that, but not today.
>
> As for the default: the default should be a good reference. If most sites
> would run with etcd or something else since Java isn't needed, then don't
> default zookeeper on.

So lets be clear about the discussion at the summit.

There were three non-conflicting and distinct concerns raised about Java.

One is 'it's a new platform for us operators to understand
operations around' - which is fair, and indeed Java has different
(not better, different) behaviours to the CPython VM.

Secondly, 'us operators do not want to be a special snowflake, we
*want* to run the majority configuration' - which makes sense, and is
one reason to aim for a convergent stack where possible.

Thirdly, 'many of our customers *will not* run Oracle's JVM and the
stability and performance of Zookeeper on openjdk is an unknown'. The
argument was that we can't pick zk because the herd run it on Oracle's
JVM not openjdk - now there are some unquantified bits here, but it is
known that openjdk has had sufficient differences from the Oracle JVM to
cause subtle bugs, so if most large zk shops are running Oracle JVM
then indeed this becomes a special-snowflake risk.

I don't recall *anyone* saying they thought zk was bad, or that they
would refuse to run it if we had chosen zk rather than tooz. We got
stuck on that third issue - there was no way to answer it in the
session, and it's obviously a terrifying risk to take.

And because for every option some operators were going to be unhappy,
we fell back to the choice of not making a choice.

There are a bunch of parameters around DLM usage that we haven't
quantified yet - we can talk capabilities sensibly, but we don't yet
know how much load we will put on the DLM, nor how it will scale
relative to cloud size. My naive expectation is that we'll need a
-very- large cloud to stress the cluster size of any decent DLM, but
that request rate / latency could be a potential issue as clouds scale
(e.g. need care and feeding).

-Rob


-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Clint Byrum
Excerpts from Fox, Kevin M's message of 2015-11-04 14:32:42 -0800:
> To clarify that statement a little more,
> 
> Speaking only for myself as an op, I don't want to support yet one more
> snowflake in a sea of snowflakes, one that works differently than all the
> rest, without a very good reason.
>
> Java has its own set of issues associated with the JVM: care-and-feeding
> sorts of things. If we are to invest time/money/people in learning how to
> properly maintain it, it's easier to justify if it's not just a one-off
> for DLM alone.
>
> So I wouldn't go so far as to say we're vehemently opposed to Java, just
> that DLM on its own is probably not a strong enough feature to justify
> requiring pulling in Java. It's only very recently that you could convince
> folks that a DLM was needed at all. So either make Java optional, or find
> some other use case that needs Java badly enough that you can make Java a
> required component. I suspect some day searchlight might be compelling
> enough for that, but not today.
>
> As for the default: the default should be a good reference. If most sites
> would run with etcd or something else since Java isn't needed, then don't
> default zookeeper on.
> 

There are a number of reasons, but the most important are:

* Resilience in the face of failures - The current database+MQ based
  solutions are all custom made and have unknown characteristics when
  there are network partitions and node failures.
* Scalability - The current database+MQ solutions rely on polling the
  database and/or sending lots of heartbeat messages or even using the
  database to store heartbeat transactions. This scales fine for tiny
  clusters, but when every new node adds more churn to the MQ and
  database, this will (and has been observed to) be intractable.
* Tech debt - OpenStack is inventing lock solutions and then maintaining
  them. And service discovery solutions, and then maintaining them.
  Wouldn't you rather have better upgrade stories, more stability, more
  scale, and more features?

If those aren't compelling enough reasons to deploy a mature java service
like Zookeeper, I don't know what would be. But I do think using the
abstraction layer of tooz will at least allow us to move forward without
having to convince everybody everywhere that this is actually just the
path of least resistance.
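The database-heartbeat pattern criticized above can be sketched in a few lines (schema and intervals are illustrative, loosely modeled on the services-table heartbeats): every node writes a liveness row on a timer, and every scheduler polls the whole table, so load grows with node count on both the write and the read side.

```python
import sqlite3
import time

# In-memory stand-in for the shared SQL database.
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE services (host TEXT PRIMARY KEY, updated_at REAL)')

def heartbeat(host):
    # Runs on every node every few seconds: one write per node per period.
    db.execute('INSERT OR REPLACE INTO services VALUES (?, ?)',
               (host, time.time()))

def alive_hosts(timeout=60.0):
    # Polled by every scheduler/API worker: a full-table read per poll,
    # so read volume grows with (pollers x nodes).
    cutoff = time.time() - timeout
    return sorted(h for (h, ts) in
                  db.execute('SELECT host, updated_at FROM services')
                  if ts >= cutoff)

for n in range(3):
    heartbeat('compute-%d' % n)
print(alive_hosts())  # ['compute-0', 'compute-1', 'compute-2']
```

A DLM with ephemeral sessions (ZooKeeper-style) replaces both the periodic writes and the table scans with server-side liveness tracking and notifications.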



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Clint Byrum
Excerpts from Fox, Kevin M's message of 2015-11-05 13:18:13 -0800:
> You're assuming there are only 2 choices: zk or db+rabbit. I'm claiming
> both are suboptimal at present; a 3rd might be needed. Though even with
> its flaws, the db+rabbit choice has a few benefits too.
> 

Well, I'm assuming it is zk/etcd/consul, because while the java argument
is rather religious, the reality is all three are significantly different
from databases and message queues and thus will be "snowflakes". But yes,
I _am_ assuming that Zookeeper is a natural, logical, simple choice,
and that fact that it runs in a jvm is a poor reason to avoid it.

> You also seem to assert that to support large clouds, the default must be
> something that can scale that large. While that would be nice, I don't
> think it's a requirement if it's overly burdensome on deployers of
> non-huge clouds.
> 

I think the current solution scales poorly even for medium-sized
clouds. Only the tiniest of clouds with the fewest nodes can really
sustain all of that polling without incurring cost for that overhead
that would be better spent on servicing users.

> I don't have metrics, but I would be surprised if most deployments today
> (production + other) used 3 controllers with a full HA setup. I would guess
> that the majority are single-controller setups. With those, the overhead of
> maintaining a whole DLM like zk seems like overkill. If db+rabbit would work
> for that one case, that would be one less thing for an op to set up;
> they already have to set up db+rabbit. Or even a DLM plugin of some sort
> that won't scale but would be very easy to deploy, and could be changed out
> later when needed, would be very useful.
> 

We do have metrics:

http://www.openstack.org/assets/survey/Public-User-Survey-Report.pdf

Page 35, "How many physical compute nodes do OpenStack clouds have?"


10-99:    42%
1-9:      36%
100-999:  15%
1000+:     7%

So for respondents to that survey, yes, "most" are running less than 100
nodes. However, by compute node count, if we extrapolate a bit:

There were 154 respondents so:

10-99   * 42% = 640 - 6403 nodes
1-9     * 36% = 55 - 498 nodes
100-999 * 15% = 2300 - 23076 nodes
1000+   *  7% = 10780 - 107789 nodes

So in terms of the number of actual computers running OpenStack compute,
as an example, from the survey respondents, there are more computes
running in *one* of the clouds with more than 1000 nodes than there are
in *all* of the clouds with less than 10 nodes, and certainly more in
all of the clouds over 1000 nodes, than in all of the clouds with less
than 100 nodes.
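That extrapolation is easy to reproduce. The one assumption needed is that the open "1000+" bucket is capped at 9999 nodes, which is what yields the quoted upper figure; truncation when rounding accounts for small differences from the figures in the mail (e.g. 646 vs 640).

```python
# 154 respondents; bucket shares from page 35 of the user survey.
respondents = 154
shares = {(1, 9): 0.36, (10, 99): 0.42, (100, 999): 0.15, (1000, 9999): 0.07}

totals = {}
for (low, high), share in shares.items():
    clouds = respondents * share            # e.g. 154 * 0.42 = ~64.7 clouds
    # Lower/upper bound on compute nodes if every such cloud sat at the
    # bottom/top of its bucket.
    totals[(low, high)] = (int(clouds * low), int(clouds * high))

for bucket in sorted(totals):
    print(bucket, '%d - %d nodes' % totals[bucket])
```

The point the numbers make survives the rounding: the handful of 1000+ clouds plausibly contain more compute nodes than all the sub-100-node clouds combined.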

What this means, to me, is that the investment in OpenStack should focus
on those with > 1000, since those orgs are definitely investing a lot
more today. We shouldn't make it _hard_ to do a tiny cloud, but I think
it's ok to make the tiny cloud less efficient if it means we can grow
it into a monster cloud at any point and we continue to garner support
from orgs who need to build large scale clouds.

(I realize I'm biased because I want to build a cloud with more than
1000 nodes ;)

> etcd is starting to show up in a lot of other projects, and so it may be at
> sites already. Being able to support it may be less of a burden to operators
> than zk in some cases.
> 

Sure, just like some shops already have postgres and in theory you can
still run OpenStack on postgres. But the testing level for postgres
support is so abysmal that I'd be surprised if anybody was actually
_choosing_ to do this. I can see this going the same way, where we give
everyone a choice, but then end up with almost nobody using any
alternative choices because the community has only rallied around the
one dominant choice.

> If your cloud grows to the point where the dlm choice really matters for 
> scalability/correctness, then you probably have enough staff members to deal 
> with adding in zk, and that's probably the right choice.
> 

If your cloud is 40 compute nodes, and three nines (which, let's face
it, that's the availability profile of a cloud with one controller), we
can just throw Zookeeper up untuned and satisfy the needs. Why would we
want to put up a custom homegrown db+mq solution and then force a change
later on if the cloud grows? A single code path seems a lot better than
multiple code paths, some of which are not really well tested.

> You can have multiple suggested things in addition to one default. Default to
> the thing that makes the most sense in the most common deployments, and make
> specific recommendations for certain scenarios, like "if greater than 100
> nodes, we strongly recommend using zk" or something to that effect.
> 

Choices are not free either. Just edit that statement there: "We
strongly recommend using zk." Nothing about ZK, etcd, or consul,
invalidates running on a small cloud. In many ways it makes things
simpler, since the user doesn't have to decide on a DLM, but instead
just installs the thing we recommend.


Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Fox, Kevin M
You're assuming there are only 2 choices: zk or db+rabbit. I'm claiming both
are suboptimal at present; a 3rd might be needed. Though even with its flaws,
the db+rabbit choice has a few benefits too.

You also seem to assert that to support large clouds, the default must be
something that can scale that large. While that would be nice, I don't think
it's a requirement if it's overly burdensome on deployers of non-huge clouds.

I don't have metrics, but I would be surprised if most deployments today
(production + other) used 3 controllers with a full HA setup. I would guess
that the majority are single-controller setups. With those, the overhead of
maintaining a whole DLM like zk seems like overkill. If db+rabbit would work
for that one case, that would be one less thing for an op to set up; they
already have to set up db+rabbit. Or even a DLM plugin of some sort that
won't scale but would be very easy to deploy, and could be changed out later
when needed, would be very useful.

etcd is starting to show up in a lot of other projects, and so it may be at
sites already. Being able to support it may be less of a burden to operators
than zk in some cases.

If your cloud grows to the point where the dlm choice really matters for 
scalability/correctness, then you probably have enough staff members to deal 
with adding in zk, and that's probably the right choice.

You can have multiple suggested things in addition to one default. Default to
the thing that makes the most sense in the most common deployments, and make
specific recommendations for certain scenarios, like "if greater than 100
nodes, we strongly recommend using zk" or something to that effect.

Thanks,
Kevin



From: Clint Byrum [cl...@fewbar.com]
Sent: Thursday, November 05, 2015 11:44 AM
To: openstack-dev
Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager  
discussion @ the summit

Excerpts from Fox, Kevin M's message of 2015-11-04 14:32:42 -0800:
> To clarify that statement a little more,
>
> Speaking only for myself as an op, I don't want to support yet one more
> snowflake in a sea of snowflakes, one that works differently than all the
> rest, without a very good reason.
>
> Java has its own set of issues associated with the JVM: care-and-feeding
> sorts of things. If we are to invest time/money/people in learning how to
> properly maintain it, it's easier to justify if it's not just a one-off
> for DLM alone.
>
> So I wouldn't go so far as to say we're vehemently opposed to Java, just
> that DLM on its own is probably not a strong enough feature to justify
> requiring pulling in Java. It's only very recently that you could convince
> folks that a DLM was needed at all. So either make Java optional, or find
> some other use case that needs Java badly enough that you can make Java a
> required component. I suspect some day searchlight might be compelling
> enough for that, but not today.
>
> As for the default: the default should be a good reference. If most sites
> would run with etcd or something else since Java isn't needed, then don't
> default zookeeper on.
>

There are a number of reasons, but the most important are:

* Resilience in the face of failures - The current database+MQ based
  solutions are all custom made and have unknown characteristics when
  there are network partitions and node failures.
* Scalability - The current database+MQ solutions rely on polling the
  database and/or sending lots of heartbeat messages or even using the
  database to store heartbeat transactions. This scales fine for tiny
  clusters, but when every new node adds more churn to the MQ and
  database, this will (and has been observed to) be intractable.
* Tech debt - OpenStack is inventing lock solutions and then maintaining
  them. And service discovery solutions, and then maintaining them.
  Wouldn't you rather have better upgrade stories, more stability, more
  scale, and more features?

If those aren't compelling enough reasons to deploy a mature java service
like Zookeeper, I don't know what would be. But I do think using the
abstraction layer of tooz will at least allow us to move forward without
having to convince everybody everywhere that this is actually just the
path of least resistance.



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-05 Thread Chris Dent

On Thu, 5 Nov 2015, Sean Dague wrote:

On 11/05/2015 03:08 AM, Chris Dent wrote:

Outside of CI it is possible to deploy ceilo, aodh and gnocchi to use
tooz for coordinating group partitioning in active-active HA setups
and shared locks. Again the standard deploy for that has been to use
redis because of availability. It's fairly understood that zookeeper
would be more correct but there are packaging concerns.


What are the packaging concerns for zookeeper?


I had thought there were generic issues with RPMs of Java-based packages
but I'm able to find RPMs of zookeeper for recent Fedoras[1] so I guess
the concerns are either moot or nearly so. What this means for RHEL or
CentOS I've never been too sure about.

[1] http://rpmfind.net/linux/rpm2html/search.php?query=zookeeper

--
Chris Dent   http://anticdent.org/
freenode: cdent tw: @anticdent


Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Vilobh Meshram
I will be working on adding the Consul driver to Tooz [1].

-Vilobh
[1] https://blueprints.launchpad.net/python-tooz/+spec/add-consul-driver

On Wed, Nov 4, 2015 at 2:05 PM, Mark Voelker  wrote:

> On Nov 4, 2015, at 4:41 PM, Gregory Haynes  wrote:
> >
> > Excerpts from Clint Byrum's message of 2015-11-04 21:17:15 +:
> >> Excerpts from Joshua Harlow's message of 2015-11-04 12:57:53 -0800:
> >>> Ed Leafe wrote:
>  On Nov 3, 2015, at 6:45 AM, Davanum Srinivas
> wrote:
> > Here's a Devstack review for zookeeper in support of this initiative:
> >
> > https://review.openstack.org/241040
> >
> > Thanks,
> > Dims
> 
>  I thought that the operators at that session made it very clear that
> they would *not* run any Java applications, and that if OpenStack required
> a Java app to run, they would no longer use it.
> 
>  I like the idea of using Zookeeper as the DLM, but I don't think it
> should be set up as a default, even for devstack, given the vehement
> opposition expressed.
> 
> >>>
> >>> What should be the default then?
> >>>
> >>> As for 'vehement opposition' I didn't see that as being there. I saw a
> >>> small set of people say 'I don't want to run java or I can't run java',
> >>> some comments about requiring Oracle's JVM (which isn't correct;
> >>> OpenJDK works for folks that I have asked in the zookeeper community
> >>> and elsewhere), and the rest of the folks were ok with it...
> >>>
> >>> If people want an alternate driver, propose it IMHO...
> >>>
> >>
> >> The few operators who stated this position are very much appreciated
> >> for standing up and making it clear. It has helped us not step into a
> >> minefield with a native ZK driver!
> >>
> >> Consul is the most popular second choice, and should work fine for the
> >> use cases we identified. It will not be sufficient if we ever have
> >> a use case where many agents must lock many resources, since Consul
> >> does not offer a way to grant lock access in a fair manner (ZK does,
> >> and we're not aware of any others that do actually). Using Consul or
> >> etcd for this case would result in situations where lock waiters may
> >> wait _forever_, and will likely wait longer than they should at times.
> >> Hopefully we can simply avoid the need for this in OpenStack
> >> altogether.
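To make the fairness point concrete: ZooKeeper's lock recipe queues waiters with ephemeral sequential znodes and grants the lock strictly in arrival order, while a bare compare-and-set lock (the Consul/etcd primitive) hands each release to whichever retry happens to land first. A toy simulation of the two grant policies follows; it is not either system's real implementation, and the uniform random pick is a simplification of retry racing.

```python
import random
from collections import deque

random.seed(7)
waiters = ['w%d' % i for i in range(5)]
rounds = 100

def fifo_grants(waiters, rounds):
    # ZK-style: arrival order is recorded (sequence numbers) and the
    # lowest number holds the lock, so grants are strictly FIFO.
    queue = deque(waiters)
    grants = []
    for _ in range(rounds):
        holder = queue.popleft()
        grants.append(holder)
        queue.append(holder)  # re-queues behind everyone after releasing
    return grants

def cas_grants(waiters, rounds):
    # CAS-style: no queue, so on every release whichever waiter's retry
    # lands first wins; an unlucky waiter can lose indefinitely.
    return [random.choice(waiters) for _ in range(rounds)]

fifo = fifo_grants(waiters, rounds)
cas = cas_grants(waiters, rounds)
fifo_counts = {w: fifo.count(w) for w in waiters}
cas_counts = {w: cas.count(w) for w in waiters}
print(fifo_counts)  # every waiter gets exactly rounds/len(waiters) grants
print(cas_counts)   # uneven, and unbounded wait is possible
```

The FIFO counts come out perfectly even; the CAS counts do not, which is the "wait _forever_" hazard described above.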
> >>
> >> I do _not_ think we should wait for constrained operators to scream
> >> at us about ZK to write a Consul driver. It's important enough that we
> >> should start documenting all of the issues we expect to see with Consul
> >> (it's not widely packaged, for instance) and writing a driver with its
> >> own devstack plugin.
> >>
> >> If there are Consul experts who did not make it to those sessions,
> >> it would be greatly appreciated if you can spend some time on this.
> >>
> >> What I don't want to see happen is we get into a deadlock where there's
> >> a large portion of users who can't upgrade and no driver to support
> them.
> >> So lets stay ahead of the problem, and get a set of drivers that works
> >> for everybody!
> >>
> >
> > One additional note - out of the three possible options I see for tooz
> > drivers in production (zk, consul, etcd) we currently only have drivers
> > for ZK. This means that unless new drivers are created, when we depend
> > on tooz we will be requiring folks deploy zk.
> >
> > It would be *awesome* if some folks stepped up to create and support at
> > least one of the aternate backends.
> >
> > Although I am a fan of the ZK solution, I have an old WIP patch for
> > creating an etcd driver. I would like to revive and maintain it, but I
> > would also need one more maintainer per the new rules for in tree
> > drivers…
>
> For those following along at home, said WIP etcd driver patch is here:
>
> https://review.openstack.org/#/c/151463/
>
> And said rules are at:
>
> https://review.openstack.org/#/c/240645/
>
> And FWIW, I too am personally fine with ZK as a default for devstack.
>
> At Your Service,
>
> Mark T. Voelker
>
> >
> > Cheers,
> > Greg
> >
> >


Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Davanum Srinivas
Graham,

Agree. Hence the Tooz as the abstraction layer. Folks are welcome to
write new drivers or fix existing drivers for Tooz where needed.

-- Dims

On Wed, Nov 4, 2015 at 3:04 PM, Hayes, Graham  wrote:
> On 04/11/15 20:04, Ed Leafe wrote:
>> On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
>>>
>>> Here's a Devstack review for zookeeper in support of this initiative:
>>>
>>> https://review.openstack.org/241040
>>>
>>> Thanks,
>>> Dims
>>
>> I thought that the operators at that session made it very clear that they 
>> would *not* run any Java applications, and that if OpenStack required a Java 
>> app to run, they would no longer use it.
>>
>> I like the idea of using Zookeeper as the DLM, but I don't think it should 
>> be set up as a default, even for devstack, given the vehement opposition 
>> expressed.
>>
>>
>> -- Ed Leafe
>>
>
> I got the impression that there were *some* operators who wouldn't run
> java.
>
> I do not see an issue with having ZooKeeper as the default, as long as
> there is an alternate solution that also works for the operators that do
> not want to use it.
>



-- 
Davanum Srinivas :: https://twitter.com/dims



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Sean Dague
On 11/04/2015 03:57 PM, Joshua Harlow wrote:
> Ed Leafe wrote:
>> On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
>>> Here's a Devstack review for zookeeper in support of this initiative:
>>>
>>> https://review.openstack.org/241040
>>>
>>> Thanks,
>>> Dims
>>
>> I thought that the operators at that session made it very clear that
>> they would *not* run any Java applications, and that if OpenStack
>> required a Java app to run, they would no longer use it.
>>
>> I like the idea of using Zookeeper as the DLM, but I don't think it
>> should be set up as a default, even for devstack, given the vehement
>> opposition expressed.
>>
> 
> What should be the default then?
> 
> As for 'vehement opposition' I didn't see that as being there. I saw a
> small set of people say 'I don't want to run java or I can't run java',
> some comments about requiring Oracle's JVM (which isn't correct;
> OpenJDK works for folks that I have asked in the zookeeper community and
> elsewhere), and the rest of the folks were ok with it...
>
> If people want an alternate driver, propose it IMHO...

Zookeeper has previously been used by a number of projects, so I think it
makes a sensible default to start. We even had it in the gate on the
unit test jobs for a while. We can make a plug point in devstack later,
once we see some kinds of jobs running on the zookeeper base and what
semantics would make sense for plugging more stuff in.

Kind of like the MQ path in devstack right now. One default, and a plug
point for people trying other stuff.

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Robert Collins
On 5 November 2015 at 09:02, Ed Leafe  wrote:
> On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
>>
>> Here's a Devstack review for zookeeper in support of this initiative:
>>
>> https://review.openstack.org/241040
>>
>> Thanks,
>> Dims
>
> I thought that the operators at that session made it very clear that they 
> would *not* run any Java applications, and that if OpenStack required a Java 
> app to run, they would no longer use it.
>
> I like the idea of using Zookeeper as the DLM, but I don't think it should be 
> set up as a default, even for devstack, given the vehement opposition 
> expressed.

There was no option suggested that all the operators would run happily.

Thus it doesn't matter what the 'default' is - we know only some
operators will run it.

In the session we were told that zookeeper is already used in CI jobs
for ceilometer (was this wrong?) and that's why we figured it made a
sane default for devstack.

We can always change the default later.

What is important is that folk step up and write the consul and etcd
drivers for the non-Java-happy operators to consume.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Ed Leafe
On Nov 4, 2015, at 3:17 PM, Clint Byrum  wrote:

> What I don't want to see happen is we get into a deadlock where there's
> a large portion of users who can't upgrade and no driver to support them.
> So lets stay ahead of the problem, and get a set of drivers that works
> for everybody!

I think that this is a great idea, but we also need some people familiar with 
Consul to do this work. Otherwise, ZK (and hence Java) is a de facto dependency.


-- Ed Leafe









Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Joshua Harlow

Ed Leafe wrote:

On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:

Here's a Devstack review for zookeeper in support of this initiative:

https://review.openstack.org/241040

Thanks,
Dims


I thought that the operators at that session made it very clear that they would 
*not* run any Java applications, and that if OpenStack required a Java app to 
run, they would no longer use it.

I like the idea of using Zookeeper as the DLM, but I don't think it should be 
set up as a default, even for devstack, given the vehement opposition expressed.



What should be the default then?

As for 'vehement opposition', I didn't see that as being there. I saw a 
small set of people say 'I don't want to run Java or I can't run Java', 
some comments about requiring Oracle's JVM (which isn't correct: OpenJDK 
works for the folks I have asked in the zookeeper community and 
elsewhere), and the rest of the folks were ok with it...


If people want an alternate driver, propose it IMHO...



-- Ed Leafe








Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Hayes, Graham
On 04/11/15 20:04, Ed Leafe wrote:
> On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
>>
>> Here's a Devstack review for zookeeper in support of this initiative:
>>
>> https://review.openstack.org/241040
>>
>> Thanks,
>> Dims
> 
> I thought that the operators at that session made it very clear that they 
> would *not* run any Java applications, and that if OpenStack required a Java 
> app to run, they would no longer use it.
> 
> I like the idea of using Zookeeper as the DLM, but I don't think it should be 
> set up as a default, even for devstack, given the vehement opposition 
> expressed.
> 
> 
> -- Ed Leafe
> 

I got the impression that there were *some* operators who wouldn't run
Java.

I do not see an issue with having ZooKeeper as the default, as long as
there is an alternate solution that also works for the operators that do
not want to use it.



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Monty Taylor

On 11/04/2015 04:09 PM, Davanum Srinivas wrote:

Graham,

Agree. Hence the Tooz as the abstraction layer. Folks are welcome to
write new drivers or fix existing drivers for Tooz where needed.


Yes. This is correct. We cannot grow a hard depend on a Java thing, but 
optional depends are ok - and it turns out the semantics needed from 
DLMs and DKVSs are sufficiently abstractable for it to make sense.


That said - the only usable tooz backend at the moment is zookeeper - so 
someone who cares about the not-Java use case will have to step up and 
write a consul backend. The main thing is that we allow that to happen 
and don't do things that would prevent such a thing from being written.


Reasons for making ZK the default are:

- It exists in tooz today
- It's easily installable in all the distros
- It has devstack support already

None of those three are true of consul, although none are terribly hard 
to achieve.
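The plug point Monty describes is what makes "just write a consul backend" tractable: in tooz, the deployer selects the backend with a coordination URL, and calling code never changes. A minimal sketch of that dispatch pattern follows; the driver classes and registry here are hypothetical illustrations, not tooz's actual internals.

```python
from urllib.parse import urlparse

# Hypothetical stand-ins for real backend drivers; a consul:// driver
# would simply be one more entry in this registry.
class ZooKeeperDriver:
    def __init__(self, netloc):
        self.endpoint = netloc

class EtcdDriver:
    def __init__(self, netloc):
        self.endpoint = netloc

_DRIVERS = {"zookeeper": ZooKeeperDriver, "etcd": EtcdDriver}

def get_coordinator(url):
    """Pick a driver from the URL scheme, e.g. zookeeper://host:port."""
    parsed = urlparse(url)
    try:
        cls = _DRIVERS[parsed.scheme]
    except KeyError:
        raise ValueError("no driver registered for scheme %r" % parsed.scheme)
    return cls(parsed.netloc)

coord = get_coordinator("zookeeper://127.0.0.1:2181")
```

Under this scheme, swapping ZK for another backend is a one-line config change, which is why the default matters less than the plug point itself.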



On Wed, Nov 4, 2015 at 3:04 PM, Hayes, Graham  wrote:

On 04/11/15 20:04, Ed Leafe wrote:

On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:


Here's a Devstack review for zookeeper in support of this initiative:

https://review.openstack.org/241040

Thanks,
Dims


I thought that the operators at that session made it very clear that they would 
*not* run any Java applications, and that if OpenStack required a Java app to 
run, they would no longer use it.

I like the idea of using Zookeeper as the DLM, but I don't think it should be 
set up as a default, even for devstack, given the vehement opposition expressed.


-- Ed Leafe



I got the impression that there was *some* operators that wouldn't run
java.

I do not see an issue with having ZooKeeper as the default, as long as
there is an alternate solution that also works for the operators that do
not want to use it.










Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Fox, Kevin M
To clarify that statement a little more:

Speaking only for myself as an op, I don't want to support yet one more 
snowflake in a sea of snowflakes, one that works differently than all the rest, 
without a very good reason.

Java has its own set of issues associated with the JVM: care-and-feeding sorts 
of things. If we are to invest time/money/people in learning how to properly 
maintain it, it's easier to justify if it's not just a one-off for the DLM.

So I wouldn't go so far as to say we're vehemently opposed to Java, just that 
DLM on its own is probably not a strong enough feature to justify requiring 
Java. It's only very recently that you could convince folks that a DLM was 
needed at all. So either make Java optional, or find some other use case that 
needs Java badly enough that you can make Java a required component. I suspect 
some day searchlight might be compelling enough for that, but not today.

As for the default, the default should be a good reference. If most sites would 
run with etcd or something else, since Java isn't needed, then don't default 
zookeeper on.

Thanks,
Kevin 


From: Ed Leafe [e...@leafe.com]
Sent: Wednesday, November 04, 2015 12:02 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager  
discussion @ the summit

On Nov 3, 2015, at 6:45 AM, Davanum Srinivas <dava...@gmail.com> wrote:
>
> Here's a Devstack review for zookeeper in support of this initiative:
>
> https://review.openstack.org/241040
>
> Thanks,
> Dims

I thought that the operators at that session made it very clear that they would 
*not* run any Java applications, and that if OpenStack required a Java app to 
run, they would no longer use it.

I like the idea of using Zookeeper as the DLM, but I don't think it should be 
set up as a default, even for devstack, given the vehement opposition expressed.


-- Ed Leafe








Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2015-11-04 12:57:53 -0800:
> Ed Leafe wrote:
> > On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
> >> Here's a Devstack review for zookeeper in support of this initiative:
> >>
> >> https://review.openstack.org/241040
> >>
> >> Thanks,
> >> Dims
> >
> > I thought that the operators at that session made it very clear that they 
> > would *not* run any Java applications, and that if OpenStack required a 
> > Java app to run, they would no longer use it.
> >
> > I like the idea of using Zookeeper as the DLM, but I don't think it should 
> > be set up as a default, even for devstack, given the vehement opposition 
> > expressed.
> >
> 
> What should be the default then?
> 
> As for 'vehement opposition' I didn't see that as being there, I saw a 
> small set of people say 'I don't want to run java or I can't run java', 
> some comments about requiring using oracles JVM (which isn't correct, 
> OpenJDK works for folks that I have asked in the zookeeper community and 
> else where) and the rest of the folks were ok with it...
> 
> If people want a alternate driver, propose it IMHO...
> 

The few operators who stated this position are very much appreciated
for standing up and making it clear. It has helped us not step into a
minefield with a native ZK driver!

Consul is the most popular second choice, and should work fine for the
use cases we identified. It will not be sufficient if we ever have
a use case where many agents must lock many resources, since Consul
does not offer a way to grant lock access in a fair manner (ZK does,
and we're not aware of any others that actually do). Using Consul or
etcd for this case would result in situations where lock waiters may
wait _forever_, and will likely wait longer than they should at times.
Hopefully we can simply avoid the need for this in OpenStack altogether.
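To make the fairness point concrete: a "fair" lock grants waiters strictly in arrival order (this is what ZK's sequential-ephemeral-node recipe provides), while retry-based locks let a lucky latecomer barge ahead of long-standing waiters. Below is a pure-Python, single-process sketch of the FIFO hand-off idea; it is illustrative only, not a distributed implementation.

```python
import threading
from collections import deque

class FairLock:
    """FIFO ("fair") lock: each waiter takes a ticket and is granted
    the lock in ticket order, so no waiter can starve."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._waiters = deque()   # tickets, in arrival order
        self._holder = None

    def acquire(self):
        ticket = threading.Event()
        with self._mutex:
            if self._holder is None and not self._waiters:
                self._holder = ticket
                return
            self._waiters.append(ticket)
        ticket.wait()             # woken only when we are next in line

    def release(self):
        with self._mutex:
            if self._waiters:
                nxt = self._waiters.popleft()
                self._holder = nxt
                nxt.set()         # hand off directly, preserving FIFO order
            else:
                self._holder = None
```

The direct hand-off in `release()` is the crux: waiters never re-contend, so arrival order is the grant order. A retry-based lock (the Consul/etcd session pattern) has no such queue, which is exactly why waiters there can wait longer than they should.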

I do _not_ think we should wait for constrained operators to scream
at us about ZK to write a Consul driver. It's important enough that we
should start documenting all of the issues we expect to see with Consul
(it's not widely packaged, for instance) and writing a driver with its
own devstack plugin.

If there are Consul experts who did not make it to those sessions,
it would be greatly appreciated if you can spend some time on this.

What I don't want to see happen is that we get into a deadlock where there's
a large portion of users who can't upgrade and no driver to support them.
So let's stay ahead of the problem, and get a set of drivers that works
for everybody!



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2015-11-04 21:17:15 +:
> Excerpts from Joshua Harlow's message of 2015-11-04 12:57:53 -0800:
> > Ed Leafe wrote:
> > > On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
> > >> Here's a Devstack review for zookeeper in support of this initiative:
> > >>
> > >> https://review.openstack.org/241040
> > >>
> > >> Thanks,
> > >> Dims
> > >
> > > I thought that the operators at that session made it very clear that they 
> > > would *not* run any Java applications, and that if OpenStack required a 
> > > Java app to run, they would no longer use it.
> > >
> > > I like the idea of using Zookeeper as the DLM, but I don't think it 
> > > should be set up as a default, even for devstack, given the vehement 
> > > opposition expressed.
> > >
> > 
> > What should be the default then?
> > 
> > As for 'vehement opposition' I didn't see that as being there, I saw a 
> > small set of people say 'I don't want to run java or I can't run java', 
> > some comments about requiring using oracles JVM (which isn't correct, 
> > OpenJDK works for folks that I have asked in the zookeeper community and 
> > else where) and the rest of the folks were ok with it...
> > 
> > If people want a alternate driver, propose it IMHO...
> > 
> 
> The few operators who stated this position are very much appreciated
> for standing up and making it clear. It has helped us not step into a
> minefield with a native ZK driver!
> 
> Consul is the most popular second choice, and should work fine for the
> use cases we identified. It will not be sufficient if we ever have
> a use case where many agents must lock many resources, since Consul
> does not offer a way to grant lock access in a fair manner (ZK does,
> and we're not aware of any others that do actually). Using Consul or
> etcd for this case would result in situations where lock waiters may
> wait _forever_, and will likely wait longer than they should at times.
> Hopefully we can simply avoid the need for this in OpenStack all together.
> 
> I do _not_ think we should wait for constrained operators to scream
> at us about ZK to write a Consul driver. It's important enough that we
> should start documenting all of the issues we expect to see with Consul
> (it's not widely packaged, for instance) and writing a driver with its
> own devstack plugin.
> 
> If there are Consul experts who did not make it to those sessions,
> it would be greatly appreciated if you can spend some time on this.
> 
> What I don't want to see happen is we get into a deadlock where there's
> a large portion of users who can't upgrade and no driver to support them.
> So lets stay ahead of the problem, and get a set of drivers that works
> for everybody!
> 

One additional note: out of the three possible options I see for tooz
drivers in production (zk, consul, etcd) we currently only have drivers
for ZK. This means that unless new drivers are created, when we depend
on tooz we will be requiring folks to deploy zk.

It would be *awesome* if some folks stepped up to create and support at
least one of the alternate backends.

Although I am a fan of the ZK solution, I have an old WIP patch for
creating an etcd driver. I would like to revive and maintain it, but I
would also need one more maintainer per the new rules for in-tree
drivers...

Cheers,
Greg



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Mark Voelker
On Nov 4, 2015, at 4:41 PM, Gregory Haynes  wrote:
> 
> Excerpts from Clint Byrum's message of 2015-11-04 21:17:15 +:
>> Excerpts from Joshua Harlow's message of 2015-11-04 12:57:53 -0800:
>>> Ed Leafe wrote:
 On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
> Here's a Devstack review for zookeeper in support of this initiative:
> 
> https://review.openstack.org/241040
> 
> Thanks,
> Dims
 
 I thought that the operators at that session made it very clear that they 
 would *not* run any Java applications, and that if OpenStack required a 
 Java app to run, they would no longer use it.
 
 I like the idea of using Zookeeper as the DLM, but I don't think it should 
 be set up as a default, even for devstack, given the vehement opposition 
 expressed.
 
>>> 
>>> What should be the default then?
>>> 
>>> As for 'vehement opposition' I didn't see that as being there, I saw a 
>>> small set of people say 'I don't want to run java or I can't run java', 
>>> some comments about requiring using oracles JVM (which isn't correct, 
>>> OpenJDK works for folks that I have asked in the zookeeper community and 
>>> else where) and the rest of the folks were ok with it...
>>> 
>>> If people want a alternate driver, propose it IMHO...
>>> 
>> 
>> The few operators who stated this position are very much appreciated
>> for standing up and making it clear. It has helped us not step into a
>> minefield with a native ZK driver!
>> 
>> Consul is the most popular second choice, and should work fine for the
>> use cases we identified. It will not be sufficient if we ever have
>> a use case where many agents must lock many resources, since Consul
>> does not offer a way to grant lock access in a fair manner (ZK does,
>> and we're not aware of any others that do actually). Using Consul or
>> etcd for this case would result in situations where lock waiters may
>> wait _forever_, and will likely wait longer than they should at times.
>> Hopefully we can simply avoid the need for this in OpenStack all together.
>> 
>> I do _not_ think we should wait for constrained operators to scream
>> at us about ZK to write a Consul driver. It's important enough that we
>> should start documenting all of the issues we expect to see with Consul
>> (it's not widely packaged, for instance) and writing a driver with its
>> own devstack plugin.
>> 
>> If there are Consul experts who did not make it to those sessions,
>> it would be greatly appreciated if you can spend some time on this.
>> 
>> What I don't want to see happen is we get into a deadlock where there's
>> a large portion of users who can't upgrade and no driver to support them.
>> So lets stay ahead of the problem, and get a set of drivers that works
>> for everybody!
>> 
> 
> One additional note - out of the three possible options I see for tooz
> drivers in production (zk, consul, etcd) we currently only have drivers
> for ZK. This means that unless new drivers are created, when we depend
> on tooz we will be requiring folks deploy zk.
> 
> It would be *awesome* if some folks stepped up to create and support at
> least one of the aternate backends.
> 
> Although I am a fan of the ZK solution, I have an old WIP patch for
> creating an etcd driver. I would like to revive and maintain it, but I
> would also need one more maintainer per the new rules for in tree
> drivers…

For those following along at home, said WIP etcd driver patch is here:

https://review.openstack.org/#/c/151463/

And said rules are at:

https://review.openstack.org/#/c/240645/

And FWIW, I too am personally fine with ZK as a default for devstack.

At Your Service,

Mark T. Voelker

> 
> Cheers,
> Greg
> 



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Ed Leafe
On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
> 
> Here's a Devstack review for zookeeper in support of this initiative:
> 
> https://review.openstack.org/241040
> 
> Thanks,
> Dims

I thought that the operators at that session made it very clear that they would 
*not* run any Java applications, and that if OpenStack required a Java app to 
run, they would no longer use it.

I like the idea of using Zookeeper as the DLM, but I don't think it should be 
set up as a default, even for devstack, given the vehement opposition expressed.


-- Ed Leafe









Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Sean Dague
Thanks Dims,

+2

On 11/03/2015 07:45 AM, Davanum Srinivas wrote:
> Here's a Devstack review for zookeeper in support of this initiative:
> 
> https://review.openstack.org/241040
> 
> Thanks,
> Dims
> 
> 
> On Mon, Nov 2, 2015 at 11:05 PM, Joshua Harlow  wrote:
>> Thanks robert,
>>
>> I've started to tweak https://review.openstack.org/#/c/209661/ with regard
>> to the outcome of that (at least to cover the basics)... Should be finished
>> up soon (I hope).
>>
>>
>> Robert Collins wrote:
>>>
>>> Hi, at the summit we had a big session on distributed lock managers
>>> (DLMs).
>>>
>>> I'd just like to highlight the conclusions we came to in the session (
>>>  https://etherpad.openstack.org/p/mitaka-cross-project-dlm
>>>  )
>>>
>>> Firstly OpenStack projects that want to use a DLM can make it a hard
>>> dependency. Previously we've had a unwritten policy that DLMs should
>>> be optional, which has led to us writing poor DLM-like things backed
>>> by databases :(. So this is a huge and important step forward in our
>>> architecture.
>>>
>>> As in our existing pattern of usage for database and message-queues,
>>> we'll use an oslo abstraction layer: tooz. This doesn't preclude a
>>> different answer in special cases - but they should be considered
>>> special and exception, not the general case.
>>>
>>> Based on the project requirements surfaced in the discussion, it seems
>>> likely that all of konsul, etc and zookeeper will be able to have
>>> suitable production ready drivers written for tooz. Specifically no
>>> project required a fair locking implementation in the DLM.
>>>
>>> After our experience with oslo.messaging however, we wanted to avoid
>>> the situation of having unmaintained drivers and no signalling to
>>> users about them.
>>>
>>> So, we resolved to adopt roughly the oslo.messaging requirements for
>>> drivers, with a couple of tweaks...
>>>
>>> Production drivers in-tree will need:
>>>   - two nominated developers responsible for it
>>>   - gating functional tests that use dsvm
>>> Test drivers in-tree will need:
>>>   - clear identification that the driver is a test driver - in the
>>> module name at minimum
>>>
>>> All hail our new abstraction overlords.
>>>
>>> -Rob
>>>
>>
> 
> 
> 


-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-03 Thread Davanum Srinivas
Here's a Devstack review for zookeeper in support of this initiative:

https://review.openstack.org/241040

Thanks,
Dims


On Mon, Nov 2, 2015 at 11:05 PM, Joshua Harlow  wrote:
> Thanks robert,
>
> I've started to tweak https://review.openstack.org/#/c/209661/ with regard
> to the outcome of that (at least to cover the basics)... Should be finished
> up soon (I hope).
>
>
> Robert Collins wrote:
>>
>> Hi, at the summit we had a big session on distributed lock managers
>> (DLMs).
>>
>> I'd just like to highlight the conclusions we came to in the session (
>>  https://etherpad.openstack.org/p/mitaka-cross-project-dlm
>>  )
>>
>> Firstly OpenStack projects that want to use a DLM can make it a hard
>> dependency. Previously we've had a unwritten policy that DLMs should
>> be optional, which has led to us writing poor DLM-like things backed
>> by databases :(. So this is a huge and important step forward in our
>> architecture.
>>
>> As in our existing pattern of usage for database and message-queues,
>> we'll use an oslo abstraction layer: tooz. This doesn't preclude a
>> different answer in special cases - but they should be considered
>> special and exception, not the general case.
>>
>> Based on the project requirements surfaced in the discussion, it seems
>> likely that all of konsul, etc and zookeeper will be able to have
>> suitable production ready drivers written for tooz. Specifically no
>> project required a fair locking implementation in the DLM.
>>
>> After our experience with oslo.messaging however, we wanted to avoid
>> the situation of having unmaintained drivers and no signalling to
>> users about them.
>>
>> So, we resolved to adopt roughly the oslo.messaging requirements for
>> drivers, with a couple of tweaks...
>>
>> Production drivers in-tree will need:
>>   - two nominated developers responsible for it
>>   - gating functional tests that use dsvm
>> Test drivers in-tree will need:
>>   - clear identification that the driver is a test driver - in the
>> module name at minimum
>>
>> All hail our new abstraction overlords.
>>
>> -Rob
>>
>



-- 
Davanum Srinivas :: https://twitter.com/dims



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-03 Thread Mike Perez
On 15:26 Nov 03, Robert Collins wrote:
> Hi, at the summit we had a big session on distributed lock managers (DLMs).
> 
> I'd just like to highlight the conclusions we came to in the session (
> https://etherpad.openstack.org/p/mitaka-cross-project-dlm
> )

Also Cinder will be spearheading some Tooz integration work:

https://review.openstack.org/#/c/185646/

-- 
Mike Perez



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-02 Thread Joshua Harlow

Thanks robert,

I've started to tweak https://review.openstack.org/#/c/209661/ with 
regard to the outcome of that (at least to cover the basics)... Should 
be finished up soon (I hope).


Robert Collins wrote:

Hi, at the summit we had a big session on distributed lock managers (DLMs).

I'd just like to highlight the conclusions we came to in the session (
 https://etherpad.openstack.org/p/mitaka-cross-project-dlm
 )

Firstly OpenStack projects that want to use a DLM can make it a hard
dependency. Previously we've had a unwritten policy that DLMs should
be optional, which has led to us writing poor DLM-like things backed
by databases :(. So this is a huge and important step forward in our
architecture.

As in our existing pattern of usage for database and message-queues,
we'll use an oslo abstraction layer: tooz. This doesn't preclude a
different answer in special cases - but they should be considered
special and exception, not the general case.

Based on the project requirements surfaced in the discussion, it seems
likely that all of konsul, etc and zookeeper will be able to have
suitable production ready drivers written for tooz. Specifically no
project required a fair locking implementation in the DLM.

After our experience with oslo.messaging however, we wanted to avoid
the situation of having unmaintained drivers and no signalling to
users about them.

So, we resolved to adopt roughly the oslo.messaging requirements for
drivers, with a couple of tweaks...

Production drivers in-tree will need:
  - two nominated developers responsible for it
  - gating functional tests that use dsvm
Test drivers in-tree will need:
  - clear identification that the driver is a test driver - in the
module name at minimum

All hail our new abstraction overlords.

-Rob





[openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-02 Thread Robert Collins
Hi, at the summit we had a big session on distributed lock managers (DLMs).

I'd just like to highlight the conclusions we came to in the session (
https://etherpad.openstack.org/p/mitaka-cross-project-dlm
)

Firstly, OpenStack projects that want to use a DLM can make it a hard
dependency. Previously we've had an unwritten policy that DLMs should
be optional, which has led to us writing poor DLM-like things backed
by databases :(. So this is a huge and important step forward in our
architecture.

As in our existing pattern of usage for databases and message queues,
we'll use an oslo abstraction layer: tooz. This doesn't preclude a
different answer in special cases - but they should be considered
special and exceptional, not the general case.

Based on the project requirements surfaced in the discussion, it seems
likely that all of consul, etcd and zookeeper will be able to have
suitable production-ready drivers written for tooz. Specifically, no
project required a fair locking implementation in the DLM.

After our experience with oslo.messaging however, we wanted to avoid
the situation of having unmaintained drivers and no signalling to
users about them.

So, we resolved to adopt roughly the oslo.messaging requirements for
drivers, with a couple of tweaks...

Production drivers in-tree will need:
 - two nominated developers responsible for it
 - gating functional tests that use dsvm
Test drivers in-tree will need:
 - clear identification that the driver is a test driver - in the
module name at minimum
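As one illustration of the test-driver rule above, an in-tree test backend might look like the following: the module name flags it as a test driver, and it provides lock semantics only within a single process. This is a hypothetical sketch of the idea, not an actual tooz driver.

```python
# tooz_memory_test_driver.py -- "test" in the module name, per the rule
# above: deliberately not production-grade, locks exist only in-process.
import threading

class MemoryTestLockDriver:
    """In-memory stand-in for a DLM backend, for unit tests only."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._locks = {}

    def get_lock(self, name):
        # Same name -> same lock object, so two callers really do contend.
        with self._mutex:
            return self._locks.setdefault(name, threading.Lock())

driver = MemoryTestLockDriver()
with driver.get_lock("resource-42"):
    pass  # critical section
```

A driver like this lets project unit tests exercise locking code paths without standing up zookeeper or consul, while the naming rule keeps anyone from mistaking it for a production option.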

All hail our new abstraction overlords.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-02 Thread Geoff O'Callaghan
On 03/11/2015 1:28 PM, "Robert Collins"  wrote:
>
> Hi, at the summit we had a big session on distributed lock managers
(DLMs).

Awesome.

>
> I'd just like to highlight the conclusions we came to in the session (
> https://etherpad.openstack.org/p/mitaka-cross-project-dlm
> )
>
> Firstly OpenStack projects that want to use a DLM can make it a hard
> dependency. Previously we've had a unwritten policy that DLMs should
> be optional, which has led to us writing poor DLM-like things backed
> by databases :(. So this is a huge and important step forward in our
> architecture.

Agreed and it's also a positive step.

>
> As in our existing pattern of usage for database and message-queues,
> we'll use an oslo abstraction layer: tooz. This doesn't preclude a
> different answer in special cases - but they should be considered
> special and exception, not the general case.
>
> Based on the project requirements surfaced in the discussion, it seems
> likely that all of konsul, etc and zookeeper will be able to have
> suitable production ready drivers written for tooz. Specifically no
> project required a fair locking implementation in the DLM.
>
> After our experience with oslo.messaging however, we wanted to avoid
> the situation of having unmaintained drivers and no signalling to
> users about them.
>
> So, we resolved to adopt roughly the oslo.messaging requirements for
> drivers, with a couple of tweaks...
>
> Production drivers in-tree will need:
>  - two nominated developers responsible for it
>  - gating functional tests that use dsvm
> Test drivers in-tree will need:
>  - clear identification that the driver is a test driver - in the
> module name at minimum
>
> All hail our new abstraction overlords.

This really is fantastic news. Thanks for the 'heads up'.

Geoff

>
> -Rob
>
> --
> Robert Collins 
> Distinguished Technologist
> HP Converged Cloud
>