Re: [Openstack-operators] [nova] VM HA support in trunk

2016-04-12 Thread Toshikazu Ichikawa
Hi Affan,

 

As you said, pacemaker 1.1.10 doesn’t include the pacemaker-remote feature.

 

We installed pacemaker 1.1.12-rc4 on Ubuntu 14.04 from the following PPA [1] to 
test the remote feature.

[1] https://launchpad.net/~david-gabriel/+archive/ubuntu/ppa
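For reference, the install looked roughly like the sketch below (the PPA name is
taken from [1]; the exact package names and the version reported are assumptions
and may differ):

```shell
# Sketch, untested as written: install pacemaker 1.1.12-rc4 with
# pacemaker-remote from the PPA above on Ubuntu 14.04.
$ sudo add-apt-repository ppa:david-gabriel/ppa
$ sudo apt-get update
$ sudo apt-get install pacemaker pacemaker-remote resource-agents
$ pacemakerd --version    # confirm it reports 1.1.12
```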

 

Side note: Ubuntu 16.04, coming soon, ships a newer version.

 

Thanks,

Kazu

 

 

From: Affan Syed [mailto:affan.syed@gmail.com] 
Sent: Tuesday, April 12, 2016 2:53 PM
To: Toshikazu Ichikawa <ichikawa.toshik...@lab.ntt.co.jp>; Matt Fischer 
<m...@mattfischer.com>
Cc: openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] [nova] VM HA support in trunk

 

Hi Kazu,

 

Thanks for this update. Sorry I am a bit late in replying to this thread, but 
one of my students just ran into an issue running pacemaker-based evacuation of 
hosts. It seems that pacemaker 1.1.10 is not supposed to work with 
pacemaker-remote, and the 14.04 distro comes with that version.

 

Did you get remote to work, and if so, how? The pull request [1] indicates that 
remote support was added, but it's unclear how the above version difference was 
handled. Did you resort to compiling the latest Pacemaker from source, or 
something else?

 

 

Affan

 

 

 

[1] https://github.com/ntt-sic/masakari/pull/11

On Fri, 19 Feb 2016 at 09:19 Toshikazu Ichikawa 
<ichikawa.toshik...@lab.ntt.co.jp> wrote:

Hi Affan,

 

Pacemaker works fine on either a Canonical distribution or RDO.

We use our tool [1], which relies on Pacemaker, on Ubuntu without any specific 
issues.

 

[1] https://github.com/ntt-sic/masakari

 

Thanks,

Kazu

 

From: Affan Syed [mailto:affan.syed@gmail.com] 
Sent: Tuesday, February 16, 2016 2:02 PM
To: Matt Fischer <m...@mattfischer.com>; Toshikazu Ichikawa 
<ichikawa.toshik...@lab.ntt.co.jp>
Cc: openstack-operators@lists.openstack.org 
Subject: Re: [Openstack-operators] [nova] VM HA support in trunk

 

Hi Kazu and Matt,

Thanks for the pointers. I think the discussion around Pacemaker and 
pacemaker-remote seems most promising, especially with Russell's blog post I 
found after emailing earlier [1].

 

Not sure how the tooling would differ, but Pacemaker, given its use in the 
controller cluster anyway, seems the more logical choice. Do you see any issues 
with a Canonical distribution instead of RDO?

 

Affan

 

 

[1] 
http://blog.russellbryant.net/2015/03/10/the-different-facets-of-openstack-ha/

  

On Mon, 15 Feb 2016 at 20:59 Matt Fischer <m...@mattfischer.com> wrote:

I believe you either have your customers design their apps to handle failures, 
or you have tools that are reactive to failures.

 

Unfortunately, like many other private cloud operators, we deal a lot with 
legacy applications that aren't horizontally scaled or fault tolerant, so we've 
built tooling to handle customer notifications (reactive). When we lose a 
compute host, we generate a notice to customers and then work on evacuating 
their instances. For the evacuation portion, nova host-evacuate or 
host-evacuate-live works fairly well, although we rarely get a functioning 
floating IP after host-evacuate without additional work.
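A rough sketch of that reactive flow in command form (the host name is 
illustrative, and exact client flags vary by novaclient release):

```shell
# Stop new builds landing on the failed host, then evacuate it.
$ nova service-disable failed-host.example.com nova-compute
$ nova host-evacuate failed-host.example.com        # host is down: rebuild elsewhere
# or, if the host is still up and shared storage allows it:
$ nova host-evacuate-live failed-host.example.com
# floating IPs often need to be re-associated afterwards
```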

 

Getting customers to adopt Heat or other automation tooling is a long process, 
especially when they're used to VMware, where I think they get the VM HA stuff 
for "free".

 

 

On Mon, Feb 15, 2016 at 8:25 AM, Toshikazu Ichikawa 
<ichikawa.toshik...@lab.ntt.co.jp> wrote:

Hi Affan,

 

 

I don’t think any components in Liberty provide HA VM support directly.

 

However, much related work has been published and open-sourced here:

https://etherpad.openstack.org/p/automatic-evacuation

You may find ideas and solutions there.

 

The discussion on this topic is also ongoing at the HA team meeting:

https://wiki.openstack.org/wiki/Meetings/HATeamMeeting

 

thanks,

Kazu

 

From: Affan Syed [mailto:affan.syed@gmail.com] 
Sent: Monday, February 15, 2016 12:51 PM
To: openstack-operators@lists.openstack.org 
Subject: [Openstack-operators] [nova] VM HA support in trunk

 

Reposting with the correct tag, hopefully. I would really appreciate some 
pointers. 

---------- Forwarded message ----------
From: Affan Syed <affan.syed@gmail.com>
Date: Sat, 13 Feb 2016 at 15:13
Subject: [nova] VM HA support in trunk
To: <openstack-operators@lists.openstack.org>

 

Hi all,

I have been trying to understand whether we currently have any VM HA support as 
part of Liberty.

 

To be precise, how are hosts that go down due to power failure handled, 
specifically in terms of migrating the VMs but possibly even their networking 
configs (tunnels etc.)?

Re: [Openstack-operators] [nova] VM HA support in trunk

2016-02-18 Thread Toshikazu Ichikawa
Hi Affan,

 

Pacemaker works fine on either a Canonical distribution or RDO.

We use our tool [1], which relies on Pacemaker, on Ubuntu without any specific 
issues.

 

[1] https://github.com/ntt-sic/masakari

 

Thanks,

Kazu

 

From: Affan Syed [mailto:affan.syed@gmail.com] 
Sent: Tuesday, February 16, 2016 2:02 PM
To: Matt Fischer <m...@mattfischer.com>; Toshikazu Ichikawa 
<ichikawa.toshik...@lab.ntt.co.jp>
Cc: openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] [nova] VM HA support in trunk

 

Hi Kazu and Matt,

Thanks for the pointers. I think the discussion around Pacemaker and 
pacemaker-remote seems most promising, especially with Russell's blog post I 
found after emailing earlier [1].

 

Not sure how the tooling would differ, but Pacemaker, given its use in the 
controller cluster anyway, seems the more logical choice. Do you see any issues 
with a Canonical distribution instead of RDO?

 

Affan

 

 

[1] 
http://blog.russellbryant.net/2015/03/10/the-different-facets-of-openstack-ha/

  

On Mon, 15 Feb 2016 at 20:59 Matt Fischer <m...@mattfischer.com> wrote:

I believe you either have your customers design their apps to handle failures, 
or you have tools that are reactive to failures.

 

Unfortunately, like many other private cloud operators, we deal a lot with 
legacy applications that aren't horizontally scaled or fault tolerant, so we've 
built tooling to handle customer notifications (reactive). When we lose a 
compute host, we generate a notice to customers and then work on evacuating 
their instances. For the evacuation portion, nova host-evacuate or 
host-evacuate-live works fairly well, although we rarely get a functioning 
floating IP after host-evacuate without additional work.

 

Getting customers to adopt Heat or other automation tooling is a long process, 
especially when they're used to VMware, where I think they get the VM HA stuff 
for "free".

 

 

On Mon, Feb 15, 2016 at 8:25 AM, Toshikazu Ichikawa 
<ichikawa.toshik...@lab.ntt.co.jp> wrote:

Hi Affan,

 

 

I don’t think any components in Liberty provide HA VM support directly.

 

However, much related work has been published and open-sourced here:

https://etherpad.openstack.org/p/automatic-evacuation

You may find ideas and solutions there.

 

The discussion on this topic is also ongoing at the HA team meeting:

https://wiki.openstack.org/wiki/Meetings/HATeamMeeting

 

thanks,

Kazu

 

From: Affan Syed [mailto:affan.syed@gmail.com] 
Sent: Monday, February 15, 2016 12:51 PM
To: openstack-operators@lists.openstack.org 
Subject: [Openstack-operators] [nova] VM HA support in trunk

 

Reposting with the correct tag, hopefully. I would really appreciate some 
pointers. 

---------- Forwarded message ----------
From: Affan Syed <affan.syed@gmail.com>
Date: Sat, 13 Feb 2016 at 15:13
Subject: [nova] VM HA support in trunk
To: <openstack-operators@lists.openstack.org>

 

Hi all,

I have been trying to understand whether we currently have any VM HA support as 
part of Liberty.

 

To be precise, how are hosts that go down due to power failure handled, 
specifically in terms of migrating the VMs but possibly even their networking 
configs (tunnels etc.)?

 

VM migration solutions like Xen HA or KVM clustering seem to require 1+1 HA. I 
have read in a few places about Ceilometer+Heat templates to launch VMs in an 
N+1 backup scenario, but these all seem like one-off setups. 

 

 

This issue seems very important for legacy enterprises moving their "pets"; not 
sure if we can simply wish away that mindset!

 

Affan

 

 

 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org 
<mailto:OpenStack-operators@lists.openstack.org> 
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



Re: [Openstack-operators] Draft Agenda for MAN Ops Meetup (Feb 15, 16)

2016-02-12 Thread Toshikazu Ichikawa
Hi Tom,

> * HA at the Hypervisor level

Some members from NTT and I can share what we are doing (masakari) and
recent activities from the HA team meeting, if the time slot is still available.

Thanks,
Kazu

-----Original Message-----
From: Tom Fifield [mailto:t...@openstack.org] 
Sent: Thursday, February 04, 2016 8:28 PM
To: OpenStack Operators 
Subject: Re: [Openstack-operators] Draft Agenda for MAN Ops Meetup (Feb 15,
16)

Hi all,

We still need moderators for the following:

* Upgrade challenges, LTS releases, patches and packaging
* Keystone Federation - discussion session
* Post-Puppet deployment patterns - discussion
* HA at the Hypervisor level
* OSOps - what is it, where is it going, what you can do
* OSOps working session


Have a look at the moderator's guide @
https://wiki.openstack.org/wiki/Operations/Meetups#Moderators_Guide

and let us know if you're interested!


Regards,


Tom

On 01/02/16 17:29, Matt Jarvis wrote:
> That's a very good point !
>
> OK, so the ones we definitely don't seem to have anyone moderating or 
> presenting against currently are :
>
> Tokyo Highlights - going to assume this was a talk
> Keystone Federation - discussion session
> Post-Puppet deployment patterns - discussion
> HA at the Hypervisor level - assume this was a talk too
> Ceph integration - discussion
> Writing User Stories - working group
> OSOps - what is it, where is it going, what you can do
> OSOps working session
> Monitoring and Tools WG
>
> These were almost all taken from the original etherpad ( 
> https://etherpad.openstack.org/p/MAN-ops-meetup ), so if you suggested 
> them or would like to present/moderate then let us know.
>
> If you would like to help with moderating any of the other sessions 
> apart from those above, let us know - for most of the sessions we can 
> always use two moderators.
>
>
>
>
>
>
>
> On 1 February 2016 at 09:20, Shamail Tahir wrote:
>
> Hi Matt,
>
>
> On Mon, Feb 1, 2016 at 3:47 AM, Matt Jarvis wrote:
>
> Hello All
>
> The event in Manchester is rapidly approaching, and we're still
> looking for moderators and presenters for some of these
> sessions. If you proposed one of the sessions currently in the
> schedule, please let us know so we can assign you in the
> schedule. If you'd be willing to help out and moderate one or
> more sessions, we'd really appreciate your help. Thanks to
> everyone who's volunteered so far !
>
> How can we identify which sessions are missing moderators currently?
>
>
> On 20 January 2016 at 17:54, Tom Fifield wrote:
>
> Hi all,
>
> Matt, Nick and myself took some time to take our suggestions
> from the etherpad and attempted to wrangle them into
> something that would fit in the space we have over 2 days.
>
> As a reminder, we have two different kinds of sessions -
> General Sessions, which are discussions for the operator
> community aimed at producing actions (eg best practices,
> feedback on badness), and **Working groups**, which focus on
> specific topics aiming to make concrete progress on tasks in that area.
>
> As always, some stuff has been munged and mangled in an
> attempt to fit it in so please take a look at the below and
> reply with your comments! Is anything missing? Something
> look like a terrible idea? Want to completely change the
> room layout? There's still a little bit of flexibility at
> this stage.
>
>
> Day 1          Room 1                                Room 2                               Room 3
> 9:00 - 10:00   Registration
> 10:00 - 10:30  Introduction + History of Ops Meetups + Intro to working groups
> 10:30 - 11:15  How to engage with the community
> 11:15 - 11:20  Breakout explanation
> 11:20 - 12:00  Tokyo highlights                      Keystone and Federation
> 12:00 - 13:30  Lunch
> 13:30 - 14:10  Upgrade challenges, LTS releases,     Experience with Puppet Deployments   HPC / Scientific WG
>                patches and packaging
> 14:10 - 14:50  Upgrade challenges, LTS releases,     Post-Puppet deployment patterns      HPC / Scientific WG
>                patches and packaging
> 14:50 - 15:20  Coffee
> 15:20 - 16:00  Neutron Operational best practices    HA at the Hypervisor level
> 16:00 - 16:40  OSOps - what 

Re: [Openstack-operators] Adding network node (Neutron agents) and test before deploying customer resource

2015-03-08 Thread Toshikazu Ichikawa
Hi Akihiro,

Thanks for your comment.
I agree with your suggestion on the name.

Thanks,
Kazu

-----Original Message-----
From: Akihiro Motoki [mailto:amot...@gmail.com] 
Sent: Friday, March 06, 2015 7:07 PM
To: Toshikazu Ichikawa
Cc: openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] Adding network node (Neutron agents) and
test before deploying customer resource

Hi,

It is a good idea to have such an option.
Regarding the option name, I prefer enable_new_agents over start_new_agents,
because the new agent itself starts and keeps running even when it is disabled.

Akihiro

2015-03-06 17:35 GMT+09:00 Toshikazu Ichikawa
ichikawa.toshik...@lab.ntt.co.jp:
 Hi,



 Here is my additional thought.



 Nova provides the /v2/{tenant_id}/os-services API [3], which can block 
 scheduling for a service. Cinder also has a /v1/{tenant_id}/os-services
 API [4]. When a service (nova-compute or cinder-volume) is disabled, 
 the scheduler does not select it to accommodate a new VM or volume. 
 This provides the same concept as admin_state_up=False in Neutron. 
 (Nova and Cinder call such a process a "service", whereas Neutron 
 calls it an "agent".)



 The enable_new_services option of Nova or Cinder is a config option 
 included in nova.conf or cinder.conf. The user can't change it through 
 the API. (I don't think the user needs to change it through the API.) 
 The default value is true, which means services are enabled by default. 
 I believe that's a good choice, as only production environments need to 
 change it to false, at the additional setup cost of enabling each 
 service.



 Therefore, I believe adding a config option such as start_new_agents
 (default: true) to neutron's configuration would provide a consistent 
 experience for operators maintaining nodes. A true value of 
 start_new_agents makes the agent's admin_state_up true when the agent 
 is added (the current behavior); a false value makes it false, for 
 production environments (new).



 [3]
 http://developer.openstack.org/api-ref-compute-v2-ext.html#ext-os-services

 [4] you can call this one by CLI: cinder service-enable/service-disable



 Thanks,

 Kazu



 From: Toshikazu Ichikawa [mailto:ichikawa.toshik...@lab.ntt.co.jp]
 Sent: Thursday, March 05, 2015 10:05 AM
 To: openstack-operators@lists.openstack.org
 Subject: [Openstack-operators] Adding network node (Neutron agents) 
 and test before deploying customer resource



 Hi,



 I'm looking for a way to test a newly added network node by deploying 
 test resources before any customer resource is deployed on the node. 
 I've learned in this ML that Nova and Cinder have the setting 
 enable_new_services in their conf files to disable the initial service
 status to achieve this.



 My question is: is there any function/configuration to do the same 
 thing for Neutron?

 I know there is on-going bug fix to implement the function to block 
 scheduling for Neutron agent[1].

 As mentioned here [2], this fix may enable administrators to deploy 
 routers/dhcp-servers for testing rather than having customers' ones.

 However, the initial admin_state_up status of an agent still remains
 true right after the agent or node is added.

 That means customer routers/dhcp-servers can still be deployed to the 
 node before the status is changed manually.

 To resolve this, I believe a feature similar to enable_new_services 
 of Nova/Cinder should be implemented in Neutron to change the initial 
 admin_state_up value.

 Do you know of any existing function, blueprint or other approach to 
 achieve the same goal?

 Or is this a feature you agree is wanted and that should be proposed 
 as a new blueprint?

 I'd like to hear neutron operators' comments and suggestions.



 [1] https://bugs.launchpad.net/neutron/+bug/1408488

 [2]
 http://lists.openstack.org/pipermail/openstack-dev/2015-January/054007
 .html



 Thanks,

 Kazu








--
Akihiro Motoki amot...@gmail.com



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova] Deprecation of ComputeFilter

2015-03-08 Thread Toshikazu Ichikawa
Hi Kris,

Yes. It should work with even disabled host.

Thanks,
Kazu

-----Original Message-----
From: Kris G. Lindgren [mailto:klindg...@godaddy.com] 
Sent: Saturday, March 07, 2015 3:54 AM
To: Jesse Keating; Jay Pipes; openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] [nova] Deprecation of ComputeFilter

I was wondering whether doing a targeted host deployment is affected by the
status of the compute host?

i.e., can you do "nova boot --availability-zone zone:server" on a server
that is marked as disabled in nova?

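In command form, the scenario being asked about is roughly the following sketch
(host and image/flavor names are illustrative):

```shell
# Disable the repaired host, then try to boot a test VM directly onto it
# using the forced-host form of --availability-zone (zone:hostname).
$ nova service-disable repaired-host.example.com nova-compute
$ nova boot --image cirros --flavor m1.tiny \
    --availability-zone nova:repaired-host.example.com test-vm
```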
 
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.




On 3/6/15, 11:43 AM, Jesse Keating j...@bluebox.net wrote:

On 3/6/15 10:27 AM, Jay Pipes wrote:

 As for adding another CONF option, I'm -1 on that. I see no valid 
 reason to schedule workloads to disabled hosts.

There may be a better way to skin this cat, but one scenario is we have 
a host that has alerted, we want to evacuate it and prevent any future 
builds from going there so we disable it. But after repairing, we want 
to introduce OUR load to it first to test it before putting CUSTOMER 
load on it, so we'd like to be able to schedule something to the 
disabled host.

--
-jlk

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators







Re: [Openstack-operators] Adding network node (Neutron agents) and test before deploying customer resource

2015-03-06 Thread Toshikazu Ichikawa
Hi,

 

Here is my additional thought.

 

Nova provides the /v2/{tenant_id}/os-services API [3], which can block
scheduling for a service. Cinder also has a /v1/{tenant_id}/os-services
API [4]. When a service (nova-compute or cinder-volume) is disabled, the
scheduler does not select it to accommodate a new VM or volume. This
provides the same concept as admin_state_up=False in Neutron. (Nova and
Cinder call such a process a "service", whereas Neutron calls it an "agent".)

 

The enable_new_services option of Nova or Cinder is a config option included in
nova.conf or cinder.conf. The user can't change it through the API. (I don't
think the user needs to change it through the API.) The default value is true,
which means services are enabled by default. I believe that's a good choice, as
only production environments need to change it to false, at the additional
setup cost of enabling each service.

 

Therefore, I believe adding a config option such as start_new_agents
(default: true) to neutron's configuration would provide a consistent
experience for operators maintaining nodes. A true value of start_new_agents
makes the agent's admin_state_up true when the agent is added (the current
behavior); a false value makes it false, for production environments (new).

 

[3]
http://developer.openstack.org/api-ref-compute-v2-ext.html#ext-os-services

[4] you can call this one by CLI: cinder service-enable/service-disable
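To make the comparison concrete, here is a sketch of the existing Nova/Cinder
option next to the proposed Neutron one (note: the neutron option below is the
proposal, not an existing setting):

```ini
# nova.conf / cinder.conf -- existing option, [DEFAULT] section:
# new services come up disabled until an operator enables them
enable_new_services = false

# neutron.conf -- proposed option (does not exist yet):
# new agents would start with admin_state_up = False
start_new_agents = false
```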

 

Thanks,

Kazu

 

From: Toshikazu Ichikawa [mailto:ichikawa.toshik...@lab.ntt.co.jp] 
Sent: Thursday, March 05, 2015 10:05 AM
To: openstack-operators@lists.openstack.org
Subject: [Openstack-operators] Adding network node (Neutron agents) and test
before deploying customer resource

 

Hi,

 

I'm looking for a way to test a newly added network node by deploying test
resources before any customer resource is deployed on the node. I've learned
in this ML that Nova and Cinder have the setting enable_new_services in their
conf files to disable the initial service status to achieve this.

 

My question is: is there any function/configuration to do the same thing for
Neutron?

I know there is on-going bug fix to implement the function to block
scheduling for Neutron agent[1].

As mentioned here [2], this fix may enable administrators to deploy
routers/dhcp-servers for testing rather than having customers' ones.

However, the initial admin_state_up status of an agent still remains true
right after the agent or node is added.

That means customer routers/dhcp-servers can still be deployed to the node
before the status is changed manually.

To resolve this, I believe a feature similar to enable_new_services of
Nova/Cinder should be implemented in Neutron to change the initial
admin_state_up value.

Do you know of any existing function, blueprint or other approach to achieve
the same goal?

Or is this a feature you agree is wanted and that should be proposed as a
new blueprint?

I'd like to hear neutron operators' comments and suggestions.

 

[1] https://bugs.launchpad.net/neutron/+bug/1408488

[2]
http://lists.openstack.org/pipermail/openstack-dev/2015-January/054007.html

 

Thanks,

Kazu

 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Adding network node (Neutron agents) and test before deploying customer resource

2015-03-04 Thread Toshikazu Ichikawa
Hi,

 

I'm looking for a way to test a newly added network node by deploying test
resources before any customer resource is deployed on the node. I've learned
in this ML that Nova and Cinder have the setting enable_new_services in their
conf files to disable the initial service status to achieve this.

 

My question is: is there any function/configuration to do the same thing for
Neutron?

I know there is on-going bug fix to implement the function to block
scheduling for Neutron agent[1].

As mentioned here [2], this fix may enable administrators to deploy
routers/dhcp-servers for testing rather than having customers' ones.

However, the initial admin_state_up status of an agent still remains true
right after the agent or node is added.

That means customer routers/dhcp-servers can still be deployed to the node
before the status is changed manually.

To resolve this, I believe a feature similar to enable_new_services of
Nova/Cinder should be implemented in Neutron to change the initial
admin_state_up value.

Do you know of any existing function, blueprint or other approach to achieve
the same goal?

Or is this a feature you agree is wanted and that should be proposed as a
new blueprint?

I'd like to hear neutron operators' comments and suggestions.

 

[1] https://bugs.launchpad.net/neutron/+bug/1408488

[2]
http://lists.openstack.org/pipermail/openstack-dev/2015-January/054007.html

 

Thanks,

Kazu

 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators