[openstack-dev] [magnum] About cleaning unused container images

2015-04-12 Thread 449171342
Magnum now has container create and delete APIs. The container create API
pulls the Docker image from the Docker registry, but the container delete API
does not delete the image, so the image remains on the node even when no
container uses it any more. Would it be better to clean up such images in some
way?
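For illustration, a rough sketch of what such a cleanup could look like on a
node, using the Docker SDK for Python; where this would hook into Magnum (the
conductor, a periodic task, or the delete path itself) is exactly the open
question, and nothing below is existing Magnum code:

    import docker

    client = docker.from_env()

    # Images still referenced by any container (running or stopped) are kept.
    used = {c.image.id for c in client.containers.list(all=True)}

    for image in client.images.list():
        if image.id not in used:
            try:
                client.images.remove(image.id)
            except docker.errors.APIError:
                # The image may have become in-use again, or is protected.
                pass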


Re: [openstack-dev] [Neutron] Regarding neutron bug # 1432582

2015-04-12 Thread Kevin Benton
I would like to see some form of this merged, at least as an error message.
If a server has a bad CMOS battery and suffers a power outage, its clock
could easily be several years behind. In that scenario, the NTP daemon
could refuse to sync due to a sanity check.
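As a rough illustration of the kind of sanity check being discussed, an agent
could compare its local clock against an NTP server at startup and refuse to
proceed (or at least log loudly) when the skew is large. This sketch uses the
third-party ntplib package; the threshold and server name are made-up values,
not anything taken from the patch under review:

    import ntplib  # third-party: pip install ntplib

    MAX_SKEW_SECONDS = 30  # illustrative threshold only

    def check_clock_skew(server='pool.ntp.org'):
        response = ntplib.NTPClient().request(server, version=3, timeout=5)
        if abs(response.offset) > MAX_SKEW_SECONDS:
            raise RuntimeError(
                'Local clock is %.1f seconds off from %s; heartbeats may be '
                'rejected as stale by the server.' % (response.offset, server))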

On Wed, Apr 8, 2015 at 10:46 AM, Sudipto Biswas  wrote:

> Hi Guys, I'd really appreciate your feedback on this.
>
> Thanks,
> Sudipto
>
>
> On Monday 30 March 2015 12:11 PM, Sudipto Biswas wrote:
>
>> Someone from my team had installed the OS on bare metal with a wrong 'date'.
>> When this node was added to the OpenStack controller, the logs from the
>> neutron-agent on the compute node showed "AMQP connected", but the neutron
>> agent-list command would not list this agent at all.
>>
>> I could figure out the problem only when the neutron-server debug logs were
>> enabled; they vaguely pointed at the rejection of AMQP connections due to a
>> timestamp mismatch. The neutron-server was treating these requests as stale
>> because the timestamp of the node was behind the neutron-server's. However,
>> there's no good way to detect this if the agent runs on a node whose clock
>> is ahead of the server's.
>>
>> I recently raised a bug here: https://bugs.launchpad.net/neutron/+bug/1432582
>>
>> And tried to resolve this with the review:
>> https://review.openstack.org/#/c/165539/
>>
>> It went through quite a few +2s over some 15 patch sets, but we still have
>> not reached common ground w.r.t. addressing this situation.
>>
>> My fix tries to log better and to raise an exception in the neutron agent on
>> its FIRST boot, for better detection of the problem.
>>
>> I would like to get your thoughts on this fix: whether it seems legitimate
>> as proposed in the patch, whether you could suggest another approach to
>> tackle this, or whether you would suggest just abandoning the change.
>>
>>
>>



-- 
Kevin Benton


Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Attila Fazekas




- Original Message -
> From: "Kevin Benton" 
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Sent: Sunday, April 12, 2015 4:17:29 AM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> 
> 
> So IIUC tooz would be handling the liveness detection for the agents. It
> would be nice to get rid of that logic in Neutron and just register
> callbacks for rescheduling the dead.
> 
> Where does it store that state, does it persist timestamps to the DB like
> Neutron does? If so, how would that scale better? If not, who does a given
> node ask to know if an agent is online or offline when making a scheduling
> decision?
> 
You might find the proposed solution in this bug interesting:
https://bugs.launchpad.net/nova/+bug/1437199

> However, before (what I assume is) the large code change to implement tooz, I
> would like to quantify that the heartbeats are actually a bottleneck. When I
> was doing some profiling of them on the master branch a few months ago,
> processing a heartbeat took an order of magnitude less time (<50ms) than the
> 'sync routers' task of the l3 agent (~300ms). A few query optimizations
> might buy us a lot more headroom before we have to fall back to large
> refactors.
> Kevin Benton wrote:
> 
> 
> 
> One of the most common is the heartbeat from each agent. However, I
> don't think we can eliminate them, because they are used to determine
> if the agents are still alive for scheduling purposes. Did you have
> something else in mind to determine if an agent is alive?
> 
> Put each agent in a tooz[1] group; have each agent periodically heartbeat[2],
> have whoever needs to schedule read the active members of that group (or use
> [3] to get notified via a callback), profit...
> 
> Pick from your favorite (supporting) driver at:
> 
> http://docs.openstack.org/developer/tooz/compatibility.html
> 
> [1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
> [2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
> [3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes
> 
> 
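For a concrete picture, the suggestion above maps onto the tooz API from those
links roughly as follows. This is only a minimal sketch: the backend URL, group
name and member id are placeholders, and a real agent would drive the heartbeat
from its existing periodic task rather than a loop.

    import time
    import uuid

    from tooz import coordination

    # Any supported backend works here, e.g. 'zookeeper://127.0.0.1:2181',
    # 'redis://127.0.0.1:6379' or 'memcached://127.0.0.1:11211'.
    coordinator = coordination.get_coordinator(
        'zookeeper://127.0.0.1:2181', ('agent-' + str(uuid.uuid4())).encode())
    coordinator.start()

    group = b'neutron-agents'
    try:
        coordinator.create_group(group).get()
    except coordination.GroupAlreadyExist:
        pass
    coordinator.join_group(group).get()

    # Agent side: keep the membership alive.
    for _ in range(5):
        coordinator.heartbeat()
        time.sleep(1)

    # Scheduler side: the set of live members *is* the liveness answer.
    print(coordinator.get_members(group).get())

    coordinator.leave_group(group).get()
    coordinator.stop()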


Re: [openstack-dev] In loving memory of Chris Yeoh

2015-04-12 Thread Gary Kotton
Hi,
I am very saddened to read this. Not only will Chris be missed on a
professional level but on a personal level. He was a real mensh
(http://www.thefreedictionary.com/mensh). He was always helpful and
supportive. Wishing his family a long life.
Thanks
Gary

On 4/13/15, 4:33 AM, "Michael Still"  wrote:

>Hi, as promised I now have details of a charity for people to donate
>to in Chris' memory:
>
>
>http://participate.freetobreathe.org/site/TR?px=1582460&fr_id=2710&pg=personal#.VSscH5SUd90
>
>In the words of the family:
>
>"We would prefer that people donate to lung cancer research in lieu of
>flowers. Lung cancer has the highest mortality rate out of all the
>cancers, and the lowest funding out of all the cancers. There is a
>stigma attached that lung cancer is a smoker's disease, and that
>sufferers deserve their fate. They bring it on through lifestyle
>choice. Except that Chris has never smoked in his life, like a
>surprisingly large percentage of lung cancer sufferers. These people
>suffer for the incorrect beliefs of the masses, and those that are
>left behind are equally innocent. We shouldn't be doing this now. He
>shouldn't be gone. We need to do more to fix this. There will be
>charity envelopes available at the funeral, or you can choose your
>preferred research to fund, should you wish to do so. You have our
>thanks."
>
>Michael
>
>On Wed, Apr 8, 2015 at 2:49 PM, Michael Still  wrote:
>> It is my sad duty to inform the community that Chris Yeoh passed away
>>this
>> morning. Chris leaves behind a daughter Alyssa, aged 6, who I hope will
>> remember Chris as the clever and caring person that I will remember him
>>as.
>> I haven't had a chance to confirm with the family if they want flowers
>>or a
>> donation to a charity. As soon as I know those details I will reply to
>>this
>> email.
>>
>> Chris worked on open source for a very long time, with OpenStack being
>>just
>> the most recent in a long chain of contributions. He worked tirelessly
>>on
>> his contributions to Nova, including mentoring other developers. He was
>> dedicated to the cause, with a strong vision of what OpenStack could
>>become.
>> He even named his cat after the project.
>>
>> Chris might be the only person to have ever sent an email to his
>>coworkers
>> explaining what his code review strategy would be after brain surgery.
>>It
>> takes phenomenal strength to carry on in the face of that kind of
>>adversity,
>> but somehow he did. Frankly, I think I would have just sat on the beach.
>>
>> Chris was also a contributor to the Linux Standards Base (LSB), where he
>> helped improve the consistency and interoperability between Linux
>> distributions. He ran the 'Hackfest' programming contests for a number
>>of
>> years at Australia's open source conference -- linux.conf.au. He
>>supported
>> local Linux user groups in South Australia and Canberra, including
>> involvement at installfests and speaking at local meetups. He competed
>>in a
>> programming challenge called Loki Hack, and beat out the world to win
>>the
>> event[1].
>>
>> Alyssa's memories of her dad need to last her a long time, so we've
>>decided
>> to try and collect some fond memories of Chris to help her along the
>>way. If
>> you feel comfortable doing so, please contribute a memory or two at
>> 
>>https://docs.google.com/forms/d/1kX-ePqAO7Cuudppwqz1cqgBXAsJx27GkdM-eCZ0c1V8/viewform
>>
>> Chris was humble, helpful and honest. The OpenStack and broader Open
>>Source
>> communities are poorer for his passing.
>>
>> Michael
>>
>> [1] http://www.lokigames.com/hack/
>
>
>
>-- 
>Rackspace Australia
>


Re: [openstack-dev] [puppet] Puppet PTL

2015-04-12 Thread Emilien Macchi


On 04/10/2015 06:58 PM, Colleen Murphy wrote:
> Just to make it official: since we only had one nominee for PTL, we will
> go ahead and name Emilien Macchi as our new PTL without proceeding with
> an election process. Thanks, Emilien, for all your hard work and for
> taking on this responsibility!

Well, I think this is a great opportunity for me to say something here.
First of all, thank you for your trust; I'll do my best to succeed in
this new position. You know me well enough to know that I'm always open
to any feedback, so please do.

Also, I would like to thank our whole community (core & non-core) for
the hard work we have all put in over these years.
I truly think we will succeed under the big tent only if we continue to
work *together* as a team. But I'm confident: this is part of our DNA,
and it is why we are where we are today.

I'm really looking forward to seeing you at the next Summit.
Thanks,

> 
> Colleen (crinkle)
> 

-- 
Emilien Macchi





Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Joshua Harlow

Joshua Harlow wrote:

Kevin Benton wrote:

>Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that it's
backend-specific and tooz supports varying backends.

Very cool. Is the backend completely transparent so a deployer could
choose a service they are comfortable maintaining, or will that change
the properties WRT resiliency of state on node restarts,
partitions, etc?


Of course... we tried to make it 'completely' transparent, but in
reality certain backends (zookeeper which uses a paxos-like algorithm
and redis with sentinel support...) are better (more resilient, more
consistent, handle partitions/restarts better...) than others (memcached
is after all just a distributed cache). This is just the nature of the
game...



And for some more reading fun:

https://aphyr.com/posts/315-call-me-maybe-rabbitmq

https://aphyr.com/posts/291-call-me-maybe-zookeeper

https://aphyr.com/posts/283-call-me-maybe-redis

https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul

... (aphyr.com has a lot of these neat posts)...



The Nova implementation of Tooz seemed pretty straight-forward, although
it looked like it had pluggable drivers for service management already.
Before I dig into it much further I'll file a spec on the Neutron side
to see if I can get some other cores onboard to do the review work if I
push a change to tooz.


Sounds good to me.




On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow <harlo...@outlook.com> wrote:

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the
agents.
It would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not,
who does
a given node ask to know if an agent is online or offline when
making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that it's
backend-specific and tooz supports varying backends.


However, before (what I assume is) the large code change to
implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the
master branch
a few months ago, processing a heartbeat took an order of
magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent
(~300ms). A
few query optimizations might buy us a lot more headroom before
we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly the same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the latter).

[1] https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes
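To make the ephemeral-node idea in [1] concrete, here is a minimal sketch with
the kazoo client; the hosts and paths are placeholders. The point is that the
znode vanishes automatically when the owning session dies, so liveness becomes
'does the node exist', with no timestamps to compare:

    from kazoo.client import KazooClient

    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.start()

    zk.ensure_path('/agents')
    # Ephemeral: removed by ZooKeeper itself when this client's session ends.
    zk.create('/agents/l2-agent-host1', b'{"host": "host1"}', ephemeral=True)

    def on_membership_change(children):
        print('live agents: %s' % children)

    # Fires now and on every join/leave; this is what a scheduler would watch.
    zk.ChildrenWatch('/agents', on_membership_change)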





Kevin Benton wrote:


One of the most common is the heartbeat from each agent.
However, I
don't think we can eliminate them, because they are used
to determine
if the agents are still alive for scheduling purposes. Did
you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active
members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes



Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Joshua Harlow

joehuang wrote:

Hi, Kevin and Joshua,

As I understand it, Tooz only addresses the issue of agent status
management; but how do we handle the concurrent dynamic load at large
scale (for example, 100k managed nodes with dynamic load like security
group rule updates, routers_updated, etc.)?


Yes, that is correct; let's not confuse status/liveness management with
updates... since IMHO they are two very different things (the latter can
be eventually consistent, while the liveness 'question' probably
should not be...).




And one more question: if we have 100k managed nodes, how do we do the
partitioning? Or will all nodes be managed by one Tooz service, like
ZooKeeper? Can ZooKeeper manage the status of 100k nodes?


I can get u some data/numbers from some studies I've seen, but what u 
are talking about is highly specific as to what u are doing with 
zookeeper... There is no one solution for all the things IMHO; choose 
what's best from your tool-belt for each problem...




Best Regards

Chaoyi Huang ( Joe Huang )

*From:*Kevin Benton [mailto:blak...@gmail.com]
*Sent:* Monday, April 13, 2015 3:52 AM
*To:* OpenStack Development Mailing List (not for usage questions)
*Subject:* Re: [openstack-dev] [neutron] Neutron scaling datapoints?


Timestamps are just one way (and likely the most primitive), using redis
(or memcache) key/value and expiry are another (and letting memcache or
redis expire using its own internal algorithms), using zookeeper
ephemeral nodes[1] are another... The point being that it's
backend-specific and tooz supports varying backends.

Very cool. Is the backend completely transparent so a deployer could
choose a service they are comfortable maintaining, or will that change
the properties WRT resiliency of state on node restarts, partitions, etc?

The Nova implementation of Tooz seemed pretty straight-forward, although
it looked like it had pluggable drivers for service management already.
Before I dig into it much further I'll file a spec on the Neutron side
to see if I can get some other cores onboard to do the review work if I
push a change to tooz.

On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow <harlo...@outlook.com> wrote:

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the agents.
It would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not, who does
a given node ask to know if an agent is online or offline when making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using redis
(or memcache) key/value and expiry are another (and letting memcache or
redis expire using its own internal algorithms), using zookeeper
ephemeral nodes[1] are another... The point being that it's
backend-specific and tooz supports varying backends.


However, before (what I assume is) the large code change to implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the master branch
a few months ago, processing a heartbeat took an order of magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
few query optimizations might buy us a lot more headroom before we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the
latter).

[1]
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes



Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them, because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes


Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Joshua Harlow

Kevin Benton wrote:

 >Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that it's
backend-specific and tooz supports varying backends.

Very cool. Is the backend completely transparent so a deployer could
choose a service they are comfortable maintaining, or will that change
the properties WRT resiliency of state on node restarts, partitions, etc?


Of course... we tried to make it 'completely' transparent, but in 
reality certain backends (zookeeper which uses a paxos-like algorithm 
and redis with sentinel support...) are better (more resilient, more 
consistent, handle partitions/restarts better...) than others (memcached 
is after all just a distributed cache). This is just the nature of the 
game...




The Nova implementation of Tooz seemed pretty straight-forward, although
it looked like it had pluggable drivers for service management already.
Before I dig into it much further I'll file a spec on the Neutron side
to see if I can get some other cores onboard to do the review work if I
push a change to tooz.


Sounds good to me.




On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow <harlo...@outlook.com> wrote:

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the
agents.
It would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not,
who does
a given node ask to know if an agent is online or offline when
making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that it's
backend-specific and tooz supports varying backends.


However, before (what I assume is) the large code change to
implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the
master branch
a few months ago, processing a heartbeat took an order of
magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent
(~300ms). A
few query optimizations might buy us a lot more headroom before
we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly the same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the latter).

[1] https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes




Kevin Benton wrote:


 One of the most common is the heartbeat from each agent.
However, I
 don't think we can eliminate them, because they are used
to determine
 if the agents are still alive for scheduling purposes. Did
you have
 something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active
members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315





Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread joehuang
Hi, Kevin and Joshua,

As I understand it, Tooz only addresses the issue of agent status management,
but how do we handle the concurrent dynamic load at large scale (for example,
100k managed nodes with dynamic load like security group rule updates,
routers_updated, etc.)?

And one more question: if we have 100k managed nodes, how do we do the
partitioning? Or will all nodes be managed by one Tooz service, like ZooKeeper?
Can ZooKeeper manage the status of 100k nodes?

Best Regards
Chaoyi Huang ( Joe Huang )

From: Kevin Benton [mailto:blak...@gmail.com]
Sent: Monday, April 13, 2015 3:52 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

>Timestamps are just one way (and likely the most primitive), using redis (or 
>memcache) key/value and expiry are another (and letting memcache or redis 
>expire using its own internal algorithms), using zookeeper ephemeral nodes[1] 
>are another... The point being that it's backend-specific and tooz supports 
>varying backends.

Very cool. Is the backend completely transparent so a deployer could choose a 
service they are comfortable maintaining, or will that change the properties 
WRT resiliency of state on node restarts, partitions, etc?

The Nova implementation of Tooz seemed pretty straight-forward, although it 
looked like it had pluggable drivers for service management already. Before I 
dig into it much further I'll file a spec on the Neutron side to see if I can 
get some other cores onboard to do the review work if I push a change to tooz.


On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow <harlo...@outlook.com> wrote:
Kevin Benton wrote:
So IIUC tooz would be handling the liveness detection for the agents.
It would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not, who does
a given node ask to know if an agent is online or offline when making a
scheduling decision?

Timestamps are just one way (and likely the most primitive), using redis (or 
memcache) key/value and expiry are another (and letting memcache or redis 
expire using its own internal algorithms), using zookeeper ephemeral nodes[1] 
are another... The point being that it's backend-specific and tooz supports 
varying backends.
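The redis/memcache flavour of the same idea is just as small; a rough sketch
with the redis client follows, where the key prefix and TTL are made up. The
agent refreshes a key that carries a TTL, redis expires it by itself when the
agent stops, and the scheduler only ever asks whether the key still exists:

    import socket
    import time

    import redis  # third-party: pip install redis

    r = redis.Redis(host='127.0.0.1', port=6379)

    def heartbeat(ttl=30):
        # Agent side: refresh a key that outlives a few heartbeat intervals.
        r.set('agent-alive:%s' % socket.gethostname(), int(time.time()), ex=ttl)

    def is_alive(host):
        # Scheduler side: no timestamp arithmetic, just key existence.
        return r.exists('agent-alive:%s' % host) > 0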

However, before (what I assume is) the large code change to implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the master branch
a few months ago, processing a heartbeat took an order of magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
few query optimizations might buy us a lot more headroom before we have
to fall back to large refactors.

Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the latter).

[1] 
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes

Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them, because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes




Re: [openstack-dev] [all][pbr] splitting our deployment vs install dependencies

2015-04-12 Thread Robert Collins
On 13 April 2015 at 13:09, Robert Collins  wrote:
> On 13 April 2015 at 12:53, Monty Taylor  wrote:
>
>> What we have in the gate is the thing that produces the artifacts that
>> someone installing using the pip tool would get. Shipping anything with
>> those artifacts other than a direct communication of what we tested is
>> just mean to our end users.
>
> Actually it's not.
>
> What we test is point in time. At 2:45 UTC on Monday installing this
> git ref of nova worked.
>
> No one can reconstruct that today.
>
> I entirely agree with the sentiment you're expressing, but we're not
> delivering that sentiment today.

This observation led to yet more IRC discussion and eventually
https://etherpad.openstack.org/p/stable-omg-deps

In short, the proposal is that we:
 - stop trying to use install_requires to reproduce exactly what
works, and instead use it to communicate known constraints (> X, Y is
broken etc).
 - use a requirements.txt file we create *during* CI to capture
exactly what worked, and also capture the dpkg and rpm versions of
packages that were present when it worked, and so on. So we'll build a
git tree where its history is an audit trail of exactly what worked
for everything that passed CI, formatted to make it really really easy
for other people to consume.
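As a rough sketch of that second step, the 'exactly what worked' file could be
produced at the end of a passing CI run with little more than a pip freeze; the
output file name here is hypothetical and the dpkg/rpm capture is left out:

    import subprocess
    import sys

    # Record the exact Python package set present when the job passed.
    frozen = subprocess.check_output(
        [sys.executable, '-m', 'pip', 'freeze']).decode('utf-8')

    with open('known-good-requirements.txt', 'w') as f:
        f.write('# captured during CI; every pin is exact\n')
        f.write(frozen)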

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] In loving memory of Chris Yeoh

2015-04-12 Thread Michael Still
Hi, as promised I now have details of a charity for people to donate
to in Chris' memory:


http://participate.freetobreathe.org/site/TR?px=1582460&fr_id=2710&pg=personal#.VSscH5SUd90

In the words of the family:

"We would prefer that people donate to lung cancer research in lieu of
flowers. Lung cancer has the highest mortality rate out of all the
cancers, and the lowest funding out of all the cancers. There is a
stigma attached that lung cancer is a smoker's disease, and that
sufferers deserve their fate. They bring it on through lifestyle
choice. Except that Chris has never smoked in his life, like a
surprisingly large percentage of lung cancer sufferers. These people
suffer for the incorrect beliefs of the masses, and those that are
left behind are equally innocent. We shouldn't be doing this now. He
shouldn't be gone. We need to do more to fix this. There will be
charity envelopes available at the funeral, or you can choose your
preferred research to fund, should you wish to do so. You have our
thanks."

Michael

On Wed, Apr 8, 2015 at 2:49 PM, Michael Still  wrote:
> It is my sad duty to inform the community that Chris Yeoh passed away this
> morning. Chris leaves behind a daughter Alyssa, aged 6, who I hope will
> remember Chris as the clever and caring person that I will remember him as.
> I haven’t had a chance to confirm with the family if they want flowers or a
> donation to a charity. As soon as I know those details I will reply to this
> email.
>
> Chris worked on open source for a very long time, with OpenStack being just
> the most recent in a long chain of contributions. He worked tirelessly on
> his contributions to Nova, including mentoring other developers. He was
> dedicated to the cause, with a strong vision of what OpenStack could become.
> He even named his cat after the project.
>
> Chris might be the only person to have ever sent an email to his coworkers
> explaining what his code review strategy would be after brain surgery. It
> takes phenomenal strength to carry on in the face of that kind of adversity,
> but somehow he did. Frankly, I think I would have just sat on the beach.
>
> Chris was also a contributor to the Linux Standards Base (LSB), where he
> helped improve the consistency and interoperability between Linux
> distributions. He ran the ‘Hackfest’ programming contests for a number of
> years at Australia’s open source conference -- linux.conf.au. He supported
> local Linux user groups in South Australia and Canberra, including
> involvement at installfests and speaking at local meetups. He competed in a
> programming challenge called Loki Hack, and beat out the world to win the
> event[1].
>
> Alyssa’s memories of her dad need to last her a long time, so we’ve decided
> to try and collect some fond memories of Chris to help her along the way. If
> you feel comfortable doing so, please contribute a memory or two at
> https://docs.google.com/forms/d/1kX-ePqAO7Cuudppwqz1cqgBXAsJx27GkdM-eCZ0c1V8/viewform
>
> Chris was humble, helpful and honest. The OpenStack and broader Open Source
> communities are poorer for his passing.
>
> Michael
>
> [1] http://www.lokigames.com/hack/



-- 
Rackspace Australia



[openstack-dev] [all] Problems with keystoneclient stable branch (and maybe yours too)

2015-04-12 Thread Brant Knudson
There were several problems with the keystoneclient stable/juno branch that
have been or are in the process of being fixed since its creation.
Hopefully this note will be useful to other projects that create stable
branches for their libraries.


1) Unit tests didn't pass with earlier packages

The supported versions of several of the packages in requirements.txt in
the stable branch are in the process of being capped[0], so that the tests
are now running with older versions of the packages. Since we don't
normally test with the older packages we didn't know that the
keystoneclient unit tests don't actually pass with the old version of the
package. This is fixed by correcting the tests to work with the older
versions of the packages.[1][2]

[0] https://review.openstack.org/#/c/172220/
[1] https://review.openstack.org/#/c/172655/
[2] https://review.openstack.org/#/c/172256/

It would be great if we were somehow testing with the minimum versions of the
packages that we say we support, since that would have caught this.


2) Incorrect cap in requirements.txt

python-keystoneclient in stable/juno was capped at <=1.1.0, and 1.1.0 is
the version tagged for the stable branch. When you create a review in
stable/juno it installs python-keystoneclient, and the system ends up with a
version like 1.1.0.post1, which is >1.1.0, so python-keystoneclient no longer
matches the requirements and swift-proxy fails to start (swift-proxy
is very good at catching this problem for whatever reason). The cap should
have been <1.2.0 so that we can propose patches and also make fix releases
(1.1.1, 1.1.2, etc.).[3]

[3] https://review.openstack.org/#/c/172718/
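The reason the <=1.1.0 cap bites is just PEP 440 ordering: a post-release sorts
after the release it follows, so it escapes a <= cap on that release but stays
inside a < cap on the next minor version. A quick check with the third-party
packaging library illustrates this:

    from packaging import version  # pip install packaging

    post = version.parse('1.1.0.post1')

    print(post > version.parse('1.1.0'))   # True  -> violates <=1.1.0
    print(post < version.parse('1.2.0'))   # True  -> satisfied by <1.2.0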

I tried to re-cap all of the clients, but that didn't pass Jenkins, probably
because one or more clients didn't use semver correctly and had
requirements updates in a micro release.[4]

[4] https://review.openstack.org/#/c/172719/


3) Unsupported functional tests

We added support for functional tests (tox -e functional) in K, but Jenkins
was configured to run the functional job on all branches and it fails when
the tox target doesn't exist. The fix was to exclude the current stable/
branches for keystoneclient.[5]

[5] https://review.openstack.org/#/c/172658/


4) Tempest -juno job?

For some reason keystoneclient has 2 tempest-neutron jobs:
gate-tempest-dsvm-neutron-src-python-keystoneclient and
...-keystoneclient-juno , and the -juno job is failing in stable/juno. It
didn't make sense to me that we needed to run both in python-keystoneclient
stable/juno. I was told that we didn't need the -juno one anymore on any
branch since we have a stable/juno branch, so that job is removed.[6]

[6] https://review.openstack.org/#/c/172662/


Hopefully with these changes the python-keystoneclient stable/juno branch
will be working.

- Brant


Re: [openstack-dev] [all][pbr] splitting our deployment vs install dependencies

2015-04-12 Thread Robert Collins
On 13 April 2015 at 12:53, Monty Taylor  wrote:

> What we have in the gate is the thing that produces the artifacts that
> someone installing using the pip tool would get. Shipping anything with
> those artifacts other than a direct communication of what we tested is
> just mean to our end users.

Actually it's not.

What we test is point in time. At 2:45 UTC on Monday installing this
git ref of nova worked.

No one can reconstruct that today.

I entirely agree with the sentiment you're expressing, but we're not
delivering that sentiment today.

We need to balance the inability to atomically update things - which
forces a degree of freedom on install_requires - with being able to
give someone the same install that we tested.

That is the fundamental tension that we're not handling well, nor have
I seen a proposal to tackle it so far.

I'll have to spend some time noodling on this, but one of the clear
constraints is that install_requires cannot both be:
 - flexible enough to permit us to upgrade requirements across many
git based packages [because we could do coordinated releases of sdists
to approximate atomic bulk changes]
 - tight enough to give the next person trying to run that ref
of the package the same things we installed in CI.

-> I think we need something other than install_requires

...
> I disagree that anything is broken for us that is not caused by our
> inability to remember that distro packaging concerns are not the same as
> our concerns, and that the mechanism already exists for distro packagers
> to do what they want. Furthermore, it is caused by an insistence that we
> need to keep versions "open" for some ephemeral reason such as "upstream
> might release a bug fix" Since we all know that "if it's not tested,
> it's broken" - any changes to upstream software should be considered
> broken until proven otherwise. History over the last 5 years has shown
> this to be accurate more than the other thing.

This seems like a strong argument for really being able to reconstruct
what was in CI.

> If we pin the stable branches with hard pins of direct and indirect
> dependencies, we can have our stable branch artifacts be installable.
> Thats awesome. IF there is a bugfix release or a security update to a
> dependent library - someone can propose it. Otherwise, the stable
> release should not be moving.

Can we do that in stable branches? We've still got the problem of
bumping dependencies across multiple packages.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [all][pbr] splitting our deployment vs install dependencies

2015-04-12 Thread Robert Collins
On 13 April 2015 at 12:01, James Polley  wrote:
>
>

> That sounds, to me, very similar to a discussion we had a few weeks ago in
> the context of our stable branches.
>
> In that context, we have two competing requirements. One requirement is that
> our CI system wants a very tightly pinned requirements, as do downstream CI
> systems and deployers that want to test and deploy exact known-tested
> versions of things. On the other hand, downstream distributors (including OS
> packagers) need to balance OpenStack's version requirements with version
> requirements from all the other packages in their distribution; the tighter
> the requirements we list are, the harder it is for the requirements to work
> with the requirements of other packages in the distribution.

They are analogous, yes.
...
>> rust gets it right. There is a Cargo.toml and a Cargo.lock, which are
>> understood by the tooling in a manner similar to what you have
>> described, and it is not just understood but DOCUMENTED that an
>> _application_ should ship with a Cargo.lock and a _library_ should not.
>
> This sounds similar to a solution that was proposed for the stable branches:
> a requirements.in with mandatory version constraints while being as loose as
> otherwise possible, which is used to generate a requirements.txt which has
> the "local to deployment" exact versions that are used in our CI. The
> details of the proposal are in https://review.openstack.org/#/c/161047/

That proposal is still under discussion, and seems stuck between the
distro and -infra folk. *If* it ends up doing the transitive thing, I
think that would make a sensible requirements.txt, yes. However,
see the spec for that thread of discussion.
..
>> I'm also concerned that dstufft is actively wanting to move towards a
>> world where the build tooling is not needed or used as part of the
>> install pipeline (metadata 2.0) -- so I'm concerned that we're aligning
>> with a pattern that isn't very robust and isn't supported by tooling
>> directly and that we're going to need to change understood usage
>> patterns across a large developer base to chase something that _still_
>> isn't going to be "how people do it"

Monty:
So wheels are already in that space. metadata-2.0 is about bringing
that declarative stuff forward in the pipeline, from binary packages
to source packages. I'm currently using frustration based development
to bring it in at the start - for developers, in the lead-in to source
packages.

So you're concerned - but about what specifically? What goes wrong
with wheels (not wheels with C code). Whats not robust about the
pattern? The cargo package manager you referred to is entirely
declarative

James: I don't think the binary distribution stuff is relevant to my
discussion, since I'm talking entirely about 'using-pip' use cases,
whereas dpkg and rpm packages don't use that at all.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [all][pbr] splitting our deployment vs install dependencies

2015-04-12 Thread Monty Taylor
On 04/12/2015 08:01 PM, James Polley wrote:
> On Mon, Apr 13, 2015 at 9:12 AM, Monty Taylor  wrote:
> 
>> On 04/12/2015 06:43 PM, Robert Collins wrote:
>>> Right now we do something that upstream pip considers wrong: we make
>>> our requirements.txt be our install_requires.
>>>
>>> Upstream there are two separate concepts.
>>>
>>> install_requirements, which are meant to document what *must* be
>>> installed to import the package, and should encode any mandatory
>>> version constraints while being as loose as otherwise possible. E.g.
>>> if package A depends on package B version 1.5 or above, it should say
>>> B>=1.5 in A's install_requires. They should not specify maximum
>>> versions except when that is known to be a problem: they shouldn't
>>> borrow trouble.
>>>
>>> deploy requirements - requirements.txt - which are meant to be *local
>>> to a deployment*, and are commonly expected to specify very narrow (or
>>> even exact fit) versions.
>>
> 
> That sounds, to me, very similar to a discussion we had a few weeks ago in
> the context of our stable branches.
> 
> In that context, we have two competing requirements. One requirement is
> that our CI system wants a very tightly pinned requirements, as do
> downstream CI systems and deployers that want to test and deploy exact
> known-tested versions of things. On the other hand, downstream distributors
> (including OS packagers) need to balance OpenStack's version requirements
> with version requirements from all the other packages in their
> distribution; the tighter the requirements we list are, the harder it is
> for the requirements to work with the requirements of other packages in the
> distribution.

This is not accurate. During distro packaging activities, pbr does not
process dependencies at all. So no matter how we pin things in
OpenStack, it does not make it harder for the distros.

>> tl;dr - I'm mostly in support of what you're saying - but I'm going to
>> bludgeon it some.
>>
>> To be either fair or unfair, depending on how you look at it - some
>> people upstream consider those two to be a pattern, but it is not
>> encoded anywhere except in hidden lore that is shared between secret
>> people. Upstream's tools have bumpkiss for support for this, and if we
>> hadn't drawn a line in the sand encoding SOME behavior there would still
>> be nothing.
>>
>> Nor, btw, is it the right split. It optimizes for the wrong thing.
>>
>> rust gets it right. There is a Cargo.toml and a Cargo.lock, which are
>> understood by the tooling in a manner similar to what you have
>> described, and it is not just understood but DOCUMENTED that an
>> _application_ should ship with a Cargo.lock and a _library_ should not.
>>
> 
> This sounds similar to a solution that was proposed for the stable
> branches: a requirements.in with mandatory version constraints while being
> as loose as otherwise possible, which is used to generate a
> requirements.txt which has the "local to deployment" exact versions that
> are used in our CI. The details of the proposal are in
> https://review.openstack.org/#/c/161047/

I disagree with this proposal. It's also not helping any users. Because
of what I said above, there is no flexibility that we lose downstream by
being strict and pedantic with our versions. So, having the "lose" and
the "strict" file just gets us two files and doubles the confusion.
Having a list of what we know the state to be is great. We should give
that to users. If they want to use something other than pip to install,
awesome - the person in charge of curating that content can test the
version interactions in their environment.

What we have in the gate is the thing that produces the artifacts that
someone installing using the pip tool would get. Shipping anything with
those artifacts other than a direct communication of what we tested is
just mean to our end users.

> 
>> Without the library/application distinction, the effort in
>> differentiating is misplaced, I believe - because it's libraries that
>> need flexible depends - and applications where the specific set of
>> depends that were tested in CI become important to consumers.
>>
>>> What pbr, which nearly if not all OpenStack projects use, does, is to
>>> map the contents of requirements.txt into install_requires. And then
>>> we use the same requirements.txt in our CI to control whats deployed
>>> in our test environment[*]. and there we often have tight constraints
>>> like seen here -
>>>
>> http://git.openstack.org/cgit/openstack/requirements/tree/global-requirements.txt#n63
>>
>> That is, btw, because that's what the overwhelming majority of consumers
>> assume those files mean. I take "overwhelming majority" from the days
>> when we had files but did not process them automatically and everyone
>> was confused.
>>
>>> I'd like to align our patterns with those of upstream, so that we're
>>> not fighting our tooling so much.
>>
>> Ok. I mean, they don't have a better answer, but if it makes "python"
>>

Re: [openstack-dev] [all][pbr] splitting our deployment vs install dependencies

2015-04-12 Thread James Polley
On Mon, Apr 13, 2015 at 9:12 AM, Monty Taylor  wrote:

> On 04/12/2015 06:43 PM, Robert Collins wrote:
> > Right now we do something that upstream pip considers wrong: we make
> > our requirements.txt be our install_requires.
> >
> > Upstream there are two separate concepts.
> >
> > install_requirements, which are meant to document what *must* be
> > installed to import the package, and should encode any mandatory
> > version constraints while being as loose as otherwise possible. E.g.
> > if package A depends on package B version 1.5 or above, it should say
> > B>=1.5 in A's install_requires. They should not specify maximum
> > versions except when that is known to be a problem: they shouldn't
> > borrow trouble.
> >
> > deploy requirements - requirements.txt - which are meant to be *local
> > to a deployment*, and are commonly expected to specify very narrow (or
> > even exact fit) versions.
>

That sounds, to me, very similar to a discussion we had a few weeks ago in
the context of our stable branches.

In that context, we have two competing requirements. One requirement is
that our CI system wants a very tightly pinned requirements, as do
downstream CI systems and deployers that want to test and deploy exact
known-tested versions of things. On the other hand, downstream distributors
(including OS packagers) need to balance OpenStack's version requirements
with version requirements from all the other packages in their
distribution; the tighter the requirements we list are, the harder it is
for the requirements to work with the requirements of other packages in the
distribution.


> tl;dr - I'm mostly in support of what you're saying - but I'm going to
> bludgeon it some.
>
> To be either fair or unfair, depending on how you look at it - some
> people upstream consider those two to be a pattern, but it is not
> encoded anywhere except in hidden lore that is shared between secret
> people. Upstream's tools have bumpkiss for support for this, and if we
> hadn't drawn a line in the sand encoding SOME behavior there would still
> be nothing.
>
> Nor, btw, is it the right split. It optimizes for the wrong thing.
>
> rust gets it right. There is a Cargo.toml and a Cargo.lock, which are
> understood by the tooling in a manner similar to what you have
> described, and it is not just understood but DOCUMENTED that an
> _application_ should ship with a Cargo.lock and a _library_ should not.
>

This sounds similar to a solution that was proposed for the stable
branches: a requirements.in with mandatory version constraints while being
as loose as otherwise possible, which is used to generate a
requirements.txt which has the "local to deployment" exact versions that
are used in our CI. The details of the proposal are in
https://review.openstack.org/#/c/161047/


> Without the library/application distinction, the effort in
> differentiating is misplaced, I believe - because it's libraries that
> need flexible depends - and applications where the specific set of
> depends that were tested in CI become important to consumers.
>
> > What pbr, which nearly if not all OpenStack projects use, does, is to
> > map the contents of requirements.txt into install_requires. And then
> > we use the same requirements.txt in our CI to control whats deployed
> > in our test environment[*]. and there we often have tight constraints
> > like seen here -
> >
> http://git.openstack.org/cgit/openstack/requirements/tree/global-requirements.txt#n63
>
> That is, btw, because that's what the overwhelming majority of consumers
> assume those files mean. I take "overwhelming majority" from the days
> when we had files but did not process them automatically and everyone
> was confused.
>
> > I'd like to align our patterns with those of upstream, so that we're
> > not fighting our tooling so much.
>
> Ok. I mean, they don't have a better answer, but if it makes "python"
> hate us less, sweet.
>
> > Concretely, I think we need to:
> >  - teach pbr to read in install_requires from setup.cfg, not
> requirements.txt
> >  - when there are requirements in setup.cfg, stop reading
> requirements.txt
> >  - separate out the global install_requirements from the global CI
> > requirements, and update our syncing code to be aware of this
> >
> > Then, setup.cfg contains more open requirements suitable for being on
> > PyPI, requirements.txt is the local CI set we know works - and can be
> > much more restrictive as needed.
> >
> > Thoughts? If there's broad apathy-or-agreement I can turn this into a
> > spec for fine coverage of ramifications and corner cases.
>
> I'm concerned that it adds a layer of difference that is confusing to
> people for the sole benefit of pleasing someone else's pedantic worldview.
>
> I'm also concerned that dstufft is actively wanting to move towards a
> world where the build tooling is not needed or used as part of the
> install pipeline (metadata 2.0) -- so I'm concerned that we're aligning
> with a pattern that isn't very robust

Re: [openstack-dev] [all][pbr] splitting our deployment vs install dependencies

2015-04-12 Thread Monty Taylor
On 04/12/2015 06:43 PM, Robert Collins wrote:
> Right now we do something that upstream pip considers wrong: we make
> our requirements.txt be our install_requires.
> 
> Upstream there are two separate concepts.
> 
> install_requirements, which are meant to document what *must* be
> installed to import the package, and should encode any mandatory
> version constraints while being as loose as otherwise possible. E.g.
> if package A depends on package B version 1.5 or above, it should say
> B>=1.5 in A's install_requires. They should not specify maximum
> versions except when that is known to be a problem: they shouldn't
> borrow trouble.
> 
> deploy requirements - requirements.txt - which are meant to be *local
> to a deployment*, and are commonly expected to specify very narrow (or
> even exact fit) versions.

tl;dr - I'm mostly in support of what you're saying - but I'm going to
bludgeon it some.

To be either fair or unfair, depending on how you look at it - some
people upstream consider those two to be a pattern, but it is not
encoded anywhere except in hidden lore that is shared between secret
people. Upstream's tools have bumpkiss for support for this, and if we
hadn't drawn a line in the sand encoding SOME behavior there would still
be nothing.

Nor, btw, is it the right split. It optimizes for the wrong thing.

rust gets it right. There is a Cargo.toml and a Cargo.lock, which are
understood by the tooling in a manner similar to what you have
described, and it is not just understood but DOCUMENTED that an
_application_ should ship with a Cargo.lock and a _library_ should not.

Without the library/application distinction, the effort in
differentiating is misplaced, I believe - because it's libraries that
need flexible depends - and applications where the specific set of
depends that were tested in CI become important to consumers.

> What pbr, which nearly if not all OpenStack projects use, does, is to
> map the contents of requirements.txt into install_requires. And then
> we use the same requirements.txt in our CI to control whats deployed
> in our test environment[*]. and there we often have tight constraints
> like seen here -
> http://git.openstack.org/cgit/openstack/requirements/tree/global-requirements.txt#n63

That is, btw, because that's what the overwhelming majority of consumers
assume those files mean. I take "overwhelming majority" from the days
when we had files but did not process them automatically and everyone
was confused.

> I'd like to align our patterns with those of upstream, so that we're
> not fighting our tooling so much.

Ok. I mean, they don't have a better answer, but if it makes "python"
hate us less, sweet.

> Concretely, I think we need to:
>  - teach pbr to read in install_requires from setup.cfg, not requirements.txt
>  - when there are requirements in setup.cfg, stop reading requirements.txt
>  - separate out the global install_requirements from the global CI
> requirements, and update our syncing code to be aware of this
> 
> Then, setup.cfg contains more open requirements suitable for being on
> PyPI, requirements.txt is the local CI set we know works - and can be
> much more restrictive as needed.
> 
> Thoughts? If there's broad apathy-or-agreement I can turn this into a
> spec for fine coverage of ramifications and corner cases.

I'm concerned that it adds a layer of difference that is confusing to
people for the sole benefit of pleasing someone else's pedantic worldview.

I'm also concerned that dstufft is actively wanting to move towards a
world where the build tooling is not needed or used as part of the
install pipeline (metadata 2.0) -- so I'm concerned that we're aligning
with a pattern that isn't very robust and isn't supported by tooling
directly and that we're going to need to change understood usage
patterns across a large developer base to chase something that _still_
isn't going to be "how people do it".

I'm concerned that "how people do it" is a myth not worth chasing.

I'm not _opposed_ to making this richer and more useful for people. I
just don't know what's broken currently for us.

Monty

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all][pbr] splitting our deployment vs install dependencies

2015-04-12 Thread Robert Collins
Right now we do something that upstream pip considers wrong: we make
our requirements.txt be our install_requires.

Upstream there are two separate concepts.

install_requirements, which are meant to document what *must* be
installed to import the package, and should encode any mandatory
version constraints while being as loose as otherwise possible. E.g.
if package A depends on package B version 1.5 or above, it should say
B>=1.5 in A's install_requires. They should not specify maximum
versions except when that is known to be a problem: they shouldn't
borrow trouble.

deploy requirements - requirements.txt - which are meant to be *local
to a deployment*, and are commonly expected to specify very narrow (or
even exact fit) versions.

What pbr, which nearly if not all OpenStack projects use, does, is to
map the contents of requirements.txt into install_requires. And then
we use the same requirements.txt in our CI to control what's deployed
in our test environment[*]. and there we often have tight constraints
like seen here -
http://git.openstack.org/cgit/openstack/requirements/tree/global-requirements.txt#n63

I'd like to align our patterns with those of upstream, so that we're
not fighting our tooling so much.

Concretely, I think we need to:
 - teach pbr to read in install_requires from setup.cfg, not requirements.txt
 - when there are requirements in setup.cfg, stop reading requirements.txt
 - separate out the global install_requirements from the global CI
requirements, and update our syncing code to be aware of this

Then, setup.cfg contains more open requirements suitable for being on
PyPI, requirements.txt is the local CI set we know works - and can be
much more restrictive as needed.
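
As a rough illustration only (this is not pbr's actual code, and the
setup.cfg section/option names are placeholders the spec would have to pin
down), the pbr side could look something like:

    try:
        import configparser                       # Python 3
    except ImportError:
        import ConfigParser as configparser       # Python 2

    def get_install_requires(cfg_path='setup.cfg', req_path='requirements.txt'):
        # Prefer requirements declared in setup.cfg; fall back to today's
        # requirements.txt behaviour when none are present.
        cfg = configparser.ConfigParser()
        cfg.read(cfg_path)
        if (cfg.has_section('metadata')
                and cfg.has_option('metadata', 'requires-dist')):
            raw = cfg.get('metadata', 'requires-dist')
            return [line.strip() for line in raw.splitlines() if line.strip()]
        with open(req_path) as f:
            return [line.strip() for line in f
                    if line.strip() and not line.startswith('#')]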

Thoughts? If there's broad apathy-or-agreement I can turn this into a
spec for fine coverage of ramifications and corner cases.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Kevin Benton
>Timestamps are just one way (and likely the most primitive), using redis
(or memcache) key/value and expiry are another (and letting memcache or
redis expire using its own internal algorithms), using zookeeper ephemeral
nodes[1] are another... The point being that its backend specific and tooz
supports varying backends.

Very cool. Is the backend completely transparent so a deployer could choose
a service they are comfortable maintaining, or will that change the
properties WRT resiliency of state on node restarts, partitions, etc.?

The Nova implementation of Tooz seemed pretty straight-forward, although it
looked like it had pluggable drivers for service management already. Before
I dig into it much further I'll file a spec on the Neutron side to see if I
can get some other cores onboard to do the review work if I push a change
to tooz.


On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow  wrote:

> Kevin Benton wrote:
>
>> So IIUC tooz would be handling the liveness detection for the agents.
>> That would be nice to get rid of that logic in Neutron and just
>> register callbacks for rescheduling the dead.
>>
>> Where does it store that state, does it persist timestamps to the DB
>> like Neutron does? If so, how would that scale better? If not, who does
>> a given node ask to know if an agent is online or offline when making a
>> scheduling decision?
>>
>
> Timestamps are just one way (and likely the most primitive), using redis
> (or memcache) key/value and expiry are another (and letting memcache or
> redis expire using its own internal algorithms), using zookeeper ephemeral
> nodes[1] are another... The point being that its backend specific and tooz
> supports varying backends.
>
>
>> However, before (what I assume is) the large code change to implement
>> tooz, I would like to quantify that the heartbeats are actually a
>> bottleneck. When I was doing some profiling of them on the master branch
>> a few months ago, processing a heartbeat took an order of magnitude less
>> time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
>> few query optimizations might buy us a lot more headroom before we have
>> to fall back to large refactors.
>>
>
> Sure, always good to avoid prematurely optimizing things...
>
> Although this is relevant for u I think anyway:
>
> https://review.openstack.org/#/c/138607/ (same thing/nearly same in
> nova)...
>
> https://review.openstack.org/#/c/172502/ (a WIP implementation of the
> latter).
>
> [1] https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#
> Ephemeral+Nodes
>
>
>> Kevin Benton wrote:
>>
>>
>> One of the most common is the heartbeat from each agent. However, I
>> don't think we can eliminate them because they are used to determine
>> if the agents are still alive for scheduling purposes. Did you have
>> something else in mind to determine if an agent is alive?
>>
>>
>> Put each agent in a tooz[1] group; have each agent periodically
>> heartbeat[2], have whoever needs to schedule read the active members of
>> that group (or use [3] to get notified via a callback), profit...
>>
>> Pick from your favorite (supporting) driver at:
>>
>> http://docs.openstack.org/developer/tooz/compatibility.html
>>
>> [1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
>> [2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
>> [3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes



-- 
Kevin Benton
__
Ope

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Kevin Benton
>I assumed that all agents connect to the same IP address of RabbitMQ; in
that case the connections would exceed the port range limitation.

Only if the clients are all using the same IP address. If connections
weren't scoped by source IP, busy servers would be completely unreliable
because clients would keep having source port collisions.

For example, the following is a netstat output from a server with two
connections to a service running on port 4000 with both clients using
source port 5: http://paste.openstack.org/show/203211/
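
If it helps, the same behaviour can be reproduced in a few lines of Python
(a self-contained sketch; it assumes a Linux host where the whole
127.0.0.0/8 range is local, and the port numbers are arbitrary):

    import socket
    import threading
    import time

    # Two clients bind the *same* source port but *different* source IPs and
    # connect to the same server; the kernel keeps both connections distinct
    # because TCP identifies a connection by the full
    # (src_ip, src_port, dst_ip, dst_port) tuple.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(('127.0.0.1', 4000))
    server.listen(5)

    def accept_loop():
        while True:
            conn, peer = server.accept()
            print('server accepted connection from %s:%d' % peer)

    t = threading.Thread(target=accept_loop)
    t.daemon = True
    t.start()

    clients = []
    for src_ip in ('127.0.0.1', '127.0.0.2'):
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.bind((src_ip, 50000))          # same source port both times
        c.connect(('127.0.0.1', 4000))   # same destination both times
        clients.append(c)

    time.sleep(0.5)
    print('local endpoints:', [c.getsockname() for c in clients])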

>the client should be aware of cluster member failures, and reconnect to a
surviving member. No such mechanism has been implemented yet.

If I understand what you are suggesting, it already has been implemented
that way. The neutron agents and servers can be configured with multiple
rabbitmq servers and they will cycle through the list whenever there is a
failure.

The only downside to that approach is that every neutron agent and server
has to be configured with every rabbitmq server address. This gets tedious
to manage if you want to add cluster members dynamically so using a load
balancer can help relieve that.

Hi, Kevin,



I assumed that all agents connect to the same IP address of RabbitMQ; in
that case the connections would exceed the port range limitation.



For a RabbitMQ cluster, the client can certainly connect to any member of
the cluster, but in that case the client has to be designed in a
fail-safe manner: it should be aware of cluster member failures
and reconnect to a surviving member. No such mechanism has
been implemented yet.



Another way is to use an LVS- or DNS-based load balancer, or something else.
If you put one load balancer in front of a cluster, then we have to take care
of the port number limitation: there are so many agents that will require
connections concurrently, at the 100k level, and the requests cannot be rejected.



Best Regards



Chaoyi Huang ( joehuang )


 --
*From:* Kevin Benton [blak...@gmail.com]
*Sent:* 12 April 2015 9:59
*To:* OpenStack Development Mailing List (not for usage questions)
*Subject:* Re: [openstack-dev] [neutron] Neutron scaling datapoints?

  The TCP/IP stack keeps track of connections as a combination of IP + TCP
port. The two byte port limit doesn't matter unless all of the agents are
connecting from the same IP address, which shouldn't be the case unless
compute nodes connect to the rabbitmq server via one IP address running
port address translation.

 Either way, the agents don't connect directly to the Neutron server, they
connect to the rabbit MQ cluster. Since as many Neutron server processes
can be launched as necessary, the bottlenecks will likely show up at the
messaging or DB layer.

On Sat, Apr 11, 2015 at 6:46 PM, joehuang  wrote:

> As Kevin is talking about agents, I want to point out that in the TCP/IP
> stack, a port (not a Neutron port) is a two-byte field, i.e. ports range
> from 0 ~ 65535, supporting a maximum of 64k port numbers.
>
>
>
> " above 100k managed node " means more than 100k L2 agents/L3 agents...
> will be alive under Neutron.
>
>
>
> I'd like to know the detailed design for how to support scaling Neutron in
> this way with 99.9% confidence, and a PoC and tests would be good support
> for this idea.
>
>
>
> "I'm 99.9% sure, for scaling above 100k managed node,
> we do not really need to split the openstack to multiple smaller openstack,
> or use significant number of extra controller machine."
>
>
>
> Best Regards
>
>
>
> Chaoyi Huang ( joehuang )
>
>
>  --
> *From:* Kevin Benton [blak...@gmail.com]
> *Sent:* 11 April 2015 12:34
> *To:* OpenStack Development Mailing List (not for usage questions)
>  *Subject:* Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
>Which periodic updates did you have in mind to eliminate? One of the
> few remaining ones I can think of is sync_routers but it would be great if
> you can enumerate the ones you observed because eliminating overhead in
> agents is something I've been working on as well.
>
>  One of the most common is the heartbeat from each agent. However, I
> don't think we can eliminate them because they are used to determine if
> the agents are still alive for scheduling purposes. Did you have something
> else in mind to determine if an agent is alive?
>
> On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas 
> wrote:
>
>> I'm 99.9% sure, for scaling above 100k managed node,
>> we do not really need to split the openstack to multiple smaller
>> openstack,
>> or use significant number of extra controller machine.
>>
>> The problem is that OpenStack is using the right tools (SQL/AMQP/ZK),
>> but in the wrong way.
>>
>> For example:
>> Periodic updates can be avoided in almost all cases.
>>
>> The new data can be pushed to the agent just when it is needed.
>> The agent can know when the AMQP connection becomes unreliable (a queue or
>> connection loss),
>> and needs to do a full sync.
>> https://bugs.launchpad.net/neutron/

Re: [openstack-dev] [OpenStack-docs] What's Up Doc? Apr 10 2015

2015-04-12 Thread Monty Taylor
On 04/12/2015 04:16 AM, Bernd Bausch wrote:
> There is nothing like a good rage on a Sunday (yes Sunday) afternoon. Many
> thanks, Monty. You helped me make glance work for my particular case; I will
> limit any further messages to the docs mailing list.

Rage on a Sunday followed up by rage coding:

https://review.openstack.org/172728

I figured I should stop flapping my mouth and write some code.
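
For anyone following along, the task-create/task-status dance described in
the quoted part below amounts to roughly the following against the Glance v2
tasks REST API. Treat it as a sketch only: the endpoint, token and Swift
location are placeholders, and the exact status strings may differ.

    import json
    import time

    import requests

    GLANCE_URL = 'http://glance.example.com:9292'   # placeholder endpoint
    HEADERS = {'X-Auth-Token': 'TOKEN',              # placeholder token
               'Content-Type': 'application/json'}

    # Kick off the import task; the input mirrors the task-create example below.
    task = requests.post(
        GLANCE_URL + '/v2/tasks',
        headers=HEADERS,
        data=json.dumps({
            'type': 'import',
            'input': {
                'import_from': '$PATH_TO_IMAGE_IN_SWIFT',
                'image_properties': {'name': 'Human Readable Image Name'},
            },
        }),
    ).json()

    # Poll until Glance finishes (or fails) the import.
    while True:
        status = requests.get(
            GLANCE_URL + '/v2/tasks/' + task['id'], headers=HEADERS
        ).json()['status']
        if status in ('success', 'failure'):
            print('import finished with status:', status)
            break
        time.sleep(5)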

> For now I will use API v1 (export OS_IMAGE_API_VERSION=1), pending further
> discussions in the install guide team. To me, install guides are more a way
> to enter the OpenStack world than an official installation guide; no need to
> expose newbies including myself to the complexity of v2.
> 
> Bernd
> 
> -Original Message-
> From: Monty Taylor [mailto:mord...@inaugust.com] 
> Sent: Sunday, April 12, 2015 6:22 AM
> To: OpenStack Development Mailing List (not for usage questions);
> openstack-d...@lists.openstack.org; openstack-i...@lists.openstack.org
> Cc: Jesse Noller
> Subject: Re: [OpenStack-docs] [openstack-dev] What's Up Doc? Apr 10 2015
> 
> Sorry for top posting - I wasn't subscribed to the doc list before clarkb
> told me about this thread. Warning ... rage coming ... if you don't want to
> read rage on a Saturday, I recommend skipping this email.
> 
> a) There may be a doc bug here, but I'm not 100% convinced it's a doc bug -
> I'll try to characterize it in this way:
> 
> "As a user, I do not know what version of glance I am or should be
> interacting with"
> 
> That part of this is about the default version that python-glanceclient may
> or may not use and what version you may or may not need to provide on the
> command line is a badness I'll get to in a second - but a clear "so you want
> to upload an image, here's what you need to know" is, I think, what Bernd
> was looking for
> 
> b) Glance is categorically broken in all regards related to this topic.
> This thing is the most painful and most broken of everything that exists in
> OpenStack. It is the source of MONTHS of development to deal with it in
> Infra, and even the workarounds are terrible.
> 
> Let me expand:
> 
> glance image-upload MAY OR MAY NOT work on your cloud, and there is
> absolutely no way you as a user can tell. You just have to try and find out.
> 
> IF glance image-upload does not work for you, it may be because of two
> things, neither of which are possible for you as a user to find out:
> 
> Either:
> 
> - Your cloud has decided to not enable image upload permissions in their
> policy.json file, which is a completely opaque choice that you as a user
> have no way of finding out. If this is the case you have no recourse, sorry.
> - Your cloud has deployed a recent glance and has configured it for glance
> v2 and has configured it in the policy.json file to ONLY allow v2 and to
> disallow image-upload
> 
> If the second is true, which you have no way to discover except for trying,
> what you need to do is:
> 
> - upload the image to swift
> - glance task-create --type=import --input='{"import_from":
> "$PATH_TO_IMAGE_IN_SWIFT", "image_properties" : {"name": "Human Readable
> Image Name"}}'
> 
> Yes, you do have to pass JSON on the command line, because BONGHITS (/me
> glares at the now absent Brian Waldon with withering disdain for having
> inflicted such an absolutely craptastic API on the world.)
> 
> Then, you need to poll glance task-status for the status of the import_from
> task until your image has imported.
> 
> c) The python-glanceclient command line client should encapsulate that
> ridiculous logic for you, but it does not
> 
> d) It should be possible to discover from the cloud which of the approaches
> you should take, but it isn't
> 
> Now - I'm honestly not sure how far the docs team should take working around
> this - because fully describing how to successfully upload an image without
> resorting to calling people names is impossible - but is it really the Docs
> team job to make an impossible API seem user friendly? Or, should we not
> treat this as a docs bug and instead treat it as a Glance bug and demand a
> v3 API that rolls back the task interface?
> 
> I vote for the latter.
> 
> BTW - the shade library encodes as much of the logic above as it can.
> That it exists makes me sad.
> 
> Monty
> 
> On Sat, Apr 11, 2015 at 10:50 AM, Matt Kassawara 
> wrote:
> 
>> Sounds like a problem with one or more packages (perhaps
>> python-glanceclient?) because that command using the source version 
>> (not
>> packages) returns the normal list of help items. Maybe try the source 
>> version using "pip install python-glanceclient"?
>>
>> On Sat, Apr 11, 2015 at 5:55 AM, Bernd Bausch
>> wrote:
>>
>>> glance help image-create. Sorry for being vague.
>>>
>>> When running glance with the parameters from the install guide (the 
>>> trunk version), I am told that I am not doing it correctly; I don't 
>>> have the precise message handy.
>>>
>>>
>>>
>>> My fear is that I will hit similar problems later

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Joshua Harlow

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the agents.
That would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not, who does
a given node ask to know if an agent is online or offline when making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using redis 
(or memcache) key/value and expiry are another (and letting memcache or 
redis expire using its own internal algorithms), using zookeeper 
ephemeral nodes[1] are another... The point being that its backend 
specific and tooz supports varying backends.
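
For the key/value-with-expiry flavor, a tiny sketch of the idea (the redis
host, key layout and TTL are made up for illustration; a real implementation
would be more careful):

    import time

    import redis

    r = redis.StrictRedis(host='localhost', port=6379)

    AGENT_ID = 'l2-agent-host1'
    TTL = 30  # seconds; if the agent dies, its key simply expires

    def heartbeat():
        # The agent refreshes its key periodically instead of writing
        # timestamps into the Neutron DB.
        r.setex('agent-alive:%s' % AGENT_ID, TTL, int(time.time()))

    def alive_agents():
        # keys() is fine for a sketch; a real scheduler would SCAN or keep
        # an explicit membership set.
        return sorted(r.keys('agent-alive:*'))

    heartbeat()
    print(alive_agents())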




However, before (what I assume is) the large code change to implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the master branch
a few months ago, processing a heartbeat took an order of magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
few query optimizations might buy us a lot more headroom before we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the 
latter).


[1] 
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes




Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes
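
Roughly, using the tooz API from the links above (a sketch only - the
backend URL, group and member names are placeholders, zake is just the
in-memory test driver, and details may vary between tooz versions):

    import time

    from tooz import coordination

    # Each agent joins a group and heartbeats; whoever schedules just reads
    # the live members instead of comparing timestamps in the Neutron DB.
    coordinator = coordination.get_coordinator('zake://', b'l2-agent-host1')
    coordinator.start()

    group = b'neutron-agents'
    try:
        coordinator.create_group(group).get()
    except coordination.GroupAlreadyExist:
        pass
    coordinator.join_group(group).get()

    for _ in range(3):
        coordinator.heartbeat()
        time.sleep(1)

    print('alive agents:', coordinator.get_members(group).get())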





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Consistent variable documentation for diskimage-builder elements

2015-04-12 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2015-04-08 23:11:29 +:
> 
> I discussed a format for something similar here:
> 
> https://review.openstack.org/#/c/162267/
> 
> Perhaps we could merge the effort.
> 
> The design and implementation in that might take some time, but if we
> can document the variables at the same time we prepare the inputs for
> isolation, that seems like a winning path forward.
> 

The solution presented there would be awesome for not having to document
the variables manually at all - we can do some sphinx plugin magic to
autogen the doc sections and even get some annoying-to-write-out
features like static links for each var (I'm sure you knew this, just
spelling it out).
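
Purely as a toy of what that autogen could look like (the per-element
argument declaration here is just a dict invented for illustration;
presumably the real thing would read whatever structured format the
isolation work settles on):

    ARGS = {
        'DIB_EXAMPLE_BASE_URL': {
            'default': 'http://example.com/image.qcow2',
            'description': 'Where the element fetches its base image from.',
        },
    }

    def to_rst(element_name, args):
        # Emit one RST section per variable, with a stable anchor for linking.
        lines = [element_name, '=' * len(element_name), '']
        for var, meta in sorted(args.items()):
            lines += [
                '.. _%s:' % var,
                '',
                var,
                '-' * len(var),
                '',
                '* default: ``%s``' % meta.get('default', ''),
                '* %s' % meta.get('description', ''),
                '',
            ]
        return '\n'.join(lines)

    print(to_rst('example-element', ARGS))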

I agree that it'd be better to not put a lot of effort into switching all
the READMEs over right now and instead work on the argument isolation.
My hope is that in the meantime new elements we create and possibly
READMEs we end up editing get moved over to this new format. Then, we
can try to autogen something that is pretty similar when the time
comes.

Now, let's get that arg isolation done already. ;)

Cheers,
Greg

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] VMware CI

2015-04-12 Thread Gary Kotton
Thanks!

On 4/12/15, 3:04 PM, "Davanum Srinivas"  wrote:

>Gary, John,
>
>Just to speed things up, I filed a backport:
>https://review.openstack.org/#/c/172710/
>
>thanks,
>dims
>
>On Sun, Apr 12, 2015 at 4:23 AM, John Garbutt 
>wrote:
>> I have updated the bug so it's high priority and tagged with
>> kilo-rc-potential, and added your note from below as a comment on the
>>bug.
>>
>> It looks like it might be worth a backport so it gets into RC2? Can
>>anyone
>> take that bit on please?
>>
>> Thanks,
>> John
>>
>>
>> On Sunday, April 12, 2015, Gary Kotton  wrote:
>>>
>>> Hi,
>>> Can a core please take a look at
>>>https://review.openstack.org/#/c/171037.
>>> The CI is broken due to commit
>>>e7ae5bb7fbdd5b79bde8937958dd0a645554a5f0.
>>> Thanks
>>> Gary
>>
>>
>>
>
>
>
>-- 
>Davanum Srinivas ::
>https://twitter.com/dims
>


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] VMware CI

2015-04-12 Thread Davanum Srinivas
Gary, John,

Just to speed things up, I filed a backport:
https://review.openstack.org/#/c/172710/

thanks,
dims

On Sun, Apr 12, 2015 at 4:23 AM, John Garbutt  wrote:
> I have updated the bug so it's high priority and tagged with
> kilo-rc-potential, and added your note from below as a comment on the bug.
>
> It looks like it might be worth a backport so it gets into RC2? Can anyone
> take that bit on please?
>
> Thanks,
> John
>
>
> On Sunday, April 12, 2015, Gary Kotton  wrote:
>>
>> Hi,
>> Can a core please take a look at https://review.openstack.org/#/c/171037.
>> The CI is broken due to commit e7ae5bb7fbdd5b79bde8937958dd0a645554a5f0.
>> Thanks
>> Gary
>
>
>



-- 
Davanum Srinivas :: https://twitter.com/dims

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread joehuang
Hi, Kevin,



I assumed that all agents connect to the same IP address of RabbitMQ; in that
case the connections would exceed the port range limitation.



For a RabbitMQ cluster, the client can certainly connect to any member of the
cluster, but in that case the client has to be designed in a fail-safe
manner: it should be aware of cluster member failures, and reconnect
to a surviving member. No such mechanism has been implemented yet.



Another way is to use an LVS- or DNS-based load balancer, or something else. If
you put one load balancer in front of a cluster, then we have to take care of the
port number limitation: there are so many agents that will require connections
concurrently, at the 100k level, and the requests cannot be rejected.



Best Regards



Chaoyi Huang ( joehuang )




From: Kevin Benton [blak...@gmail.com]
Sent: 12 April 2015 9:59
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

The TCP/IP stack keeps track of connections as a combination of IP + TCP port. 
The two byte port limit doesn't matter unless all of the agents are connecting 
from the same IP address, which shouldn't be the case unless compute nodes 
connect to the rabbitmq server via one IP address running port address 
translation.

Either way, the agents don't connect directly to the Neutron server, they 
connect to the rabbit MQ cluster. Since as many Neutron server processes can be 
launched as necessary, the bottlenecks will likely show up at the messaging or 
DB layer.

On Sat, Apr 11, 2015 at 6:46 PM, joehuang  wrote:

As Kevin is talking about agents, I want to point out that in the TCP/IP stack,
a port (not a Neutron port) is a two-byte field, i.e. ports range from 0 ~ 65535,
supporting a maximum of 64k port numbers.



" above 100k managed node " means more than 100k L2 agents/L3 agents... will be 
alive under Neutron.



I'd like to know the detailed design for how to support scaling Neutron in this
way with 99.9% confidence, and a PoC and tests would be good support for this idea.



"I'm 99.9% sure, for scaling above 100k managed node,
we do not really need to split the openstack to multiple smaller openstack,
or use significant number of extra controller machine."



Best Regards



Chaoyi Huang ( joehuang )




From: Kevin Benton [blak...@gmail.com]
Sent: 11 April 2015 12:34
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Which periodic updates did you have in mind to eliminate? One of the few 
remaining ones I can think of is sync_routers but it would be great if you can 
enumerate the ones you observed because eliminating overhead in agents is 
something I've been working on as well.

One of the most common is the heartbeat from each agent. However, I don't think 
we can eliminate them because they are used to determine if the agents are
still alive for scheduling purposes. Did you have something else in mind to 
determine if an agent is alive?

On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas  wrote:
I'm 99.9% sure, for scaling above 100k managed node,
we do not really need to split the openstack to multiple smaller openstack,
or use significant number of extra controller machine.

The problem is that OpenStack is using the right tools (SQL/AMQP/ZK),
but in the wrong way.

For example:
Periodic updates can be avoided in almost all cases.

The new data can be pushed to the agent just when it is needed.
The agent can know when the AMQP connection becomes unreliable (a queue or
connection loss),
and needs to do a full sync.
https://bugs.launchpad.net/neutron/+bug/1438159

Also, when the agents get some notification, they start asking for details via
AMQP -> SQL. Why don't they know it already, or get it with the notification?


- Original Message -
> From: "Neil Jerram" 
> mailto:neil.jer...@metaswitch.com>>
> To: "OpenStack Development Mailing List (not for usage questions)" 
> mailto:openstack-dev@lists.openstack.org>>
> Sent: Thursday, April 9, 2015 5:01:45 PM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
> Hi Joe,
>
> Many thanks for your reply!
>
> On 09/04/15 03:34, joehuang wrote:
> > Hi, Neil,
> >
> >  From theoretic, Neutron is like a "broadcast" domain, for example,
> >  enforcement of DVR and security group has to touch each regarding host
> >  where there is VM of this project resides. Even using SDN controller, the
> >  "touch" to regarding host is inevitable. If there are plenty of physical
> >  hosts, for example, 10k, inside one Neutron, it's very hard to overcome
> >  the "broadcast storm" issue under concurrent operation, that's the
> >  bottleneck for scalability of Neutron.
>
> I think I understand that in general terms - but can you be more
> specific about the broadcast sto

Re: [openstack-dev] [Nova] VMware CI

2015-04-12 Thread John Garbutt
I have updated the bug so it's high priority and tagged with
kilo-rc-potential, and added your note from below as a comment on the bug.

It looks like it might be worth a backport so it gets into RC2? Can anyone
take that bit on please?

Thanks,
John

On Sunday, April 12, 2015, Gary Kotton  wrote:

>  Hi,
> Can a core please take a look at https://review.openstack.org/#/c/171037.
> The CI is broken due to commit e7ae5bb7fbdd5b79bde8937958dd0a645554a5f0.
> Thanks
> Gary
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev