Re: [openstack-dev] [cinder][nova] Are disk-intensive operations managed ... or not?
On 10/19/2014 09:33 AM, Avishay Traeger wrote: Hi Preston, Replies to some of your cinder-related questions: 1. Creating a snapshot isn't usually an I/O intensive operation. Are you seeing I/O spike or CPU? If you're seeing CPU load, I've seen the CPU usage of cinder-api spike sometimes - not sure why. 2. The 'dd' processes that you see are Cinder wiping the volumes during deletion. You can either disable this in cinder.conf, or you can use a relatively new option to manage the bandwidth used for this. IMHO, deployments should be optimized to not do very long/intensive management operations - for example, use backends with efficient snapshots, use CoW operations wherever possible rather than copying full volumes/images, disabling wipe on delete, etc. In a public-cloud environment I don't think it's reasonable to disable wipe-on-delete. Arguably it would be better to use encryption instead of wipe-on-delete. When done with the backing store, just throw away the key and it'll be secure enough for most purposes. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades
-----Original Message----- From: Christopher Aedo [mailto:d...@aedo.net] Sent: 21 October 2014 04:45 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades ... Also, I would like to see maintenance mode for Nova be limited just to stopping any further VMs being sent there, and the node reporting that it's in maintenance mode. I think proactive workload migration should be handled independently, as I can imagine scenarios where maintenance mode might be desired without coupling migration to it. A typical scenario we have is a non-fatal hardware repair. If a node is reporting ECC memory errors, you want to schedule a repair which will be disruptive for any VMs running on that host. The users get annoyed when you give them their new VM and then immediately tell them the hardware is going to be repaired. Setting into maintenance, for me, should mean no new work. I assume that stopping the service has a negative impact on other functions like Telemetry. Tim I would love to keep discussing this further - a small session in Paris would be great. But it seems like there's never enough time at the summits, so I don't have high hopes for making much progress on this specific topic there. Just the same, if anything gets pulled together, I'll be keeping an eye out for it. -Christopher On Fri, Oct 17, 2014 at 9:21 PM, Joe Cropper cropper@gmail.com wrote: I’m glad to see this topic getting some focus once again. :-) From several of the administrators I talk with, when they think of putting a host into maintenance mode, the common requests I hear are: 1. Don’t schedule more VMs to the host 2. Provide an optional way to automatically migrate all (usually active) VMs off the host so that users’ workloads remain “unaffected” by the maintenance operation #1 can easily be achieved, as has been mentioned several times, by simply disabling the compute service. However, #2 involves a little more work, although it is certainly possible using the operations provided by nova today (e.g., live migration, etc.). I believe these types of discussions have come up several times over the past several OpenStack releases—certainly since Grizzly (i.e., when I started watching this space). It seems that the general direction is to have the type of workflow needed for #2 outside of nova (which is certainly a valid stance). To that end, it would be fairly straightforward to build some code that logically sits on top of nova and that, when entering maintenance: 1. Prevents VMs from being scheduled to the host; 2. Maintains state about the maintenance operation (e.g., not in maintenance, migrations in progress, in maintenance, or error); 3. Provides mechanisms to, upon entering maintenance, dictate which VMs (active, all, none) to migrate, with some throttling capabilities to prevent hundreds of parallel migrations on densely packed hosts (all done via a REST API). If anyone has additional questions, comments, or would like to discuss some options, please let me know. If interested, upon request, I could even share a video of how such cases might work. :-) My colleagues and I have given these use cases a lot of thought and consideration and I’d love to talk more about them (perhaps a small session in Paris would be possible). 
- Joe On Oct 17, 2014, at 4:18 AM, John Garbutt j...@johngarbutt.com wrote: On 17 October 2014 02:28, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: On 10/16/2014 7:26 PM, Christopher Aedo wrote: On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov mscherba...@mirantis.com wrote: On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum cl...@fewbar.com wrote: The idea is not simply to deny or hang requests from clients, but to provide them "we are in maintenance mode, retry in X seconds" You probably would want 'nova host-servers-migrate host' yeah for migrations - but as far as I understand, it doesn't help with disabling this host in the scheduler - there can be a chance that some workloads will be scheduled to the host. Regarding putting a compute host in maintenance mode using nova host-update --maintenance enable, it looks like the blueprint and associated commits were abandoned a year and a half ago: https://blueprints.launchpad.net/nova/+spec/host-maintenance It seems that nova service-disable host nova-compute effectively prevents the scheduler from trying to send new work there. Is this the best approach to use right now if you want to pull a compute host out of an environment before migrating VMs off? I agree with Tim and Mike that having something respond "down for maintenance" rather than ignore or hang would be really valuable. But it also looks like that hasn't gotten much traction in the past - anyone feel like they'd be in support of
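For concreteness, the two-step flow discussed in this thread looks roughly like the following with the stock nova CLI (a minimal sketch: the host name is a placeholder, and host-servers-migrate simply migrates everything off the host rather than offering the per-VM selection and throttling Joe describes):

    # Step 1: stop scheduling new VMs to the host
    nova service-disable compute-host-01 nova-compute

    # Step 2 (optional, the part people want handled separately):
    # migrate existing servers off the host
    nova host-servers-migrate compute-host-01

    # After maintenance, return the host to service
    nova service-enable compute-host-01 nova-compute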
[openstack-dev] [horizon] integration_tests : httplib timed out when running from eclipse
Running integration_test from the command line is ok:

(.venv)whg@whg-HP:/opt/stack/horizon$ nosetests openstack_dashboard.test.integration_tests.tests.test_user_settings
openstack_dashboard.test.integration_tests.tests.test_user_settings.TestUserSettings.test_user_settings_change ... ok
--
Ran 1 test in 86.292s

OK

But it always reports a timeout error immediately after I launch the test in Eclipse.

nosetests openstack_dashboard.test.integration_tests.tests.test_login.py:TestLogin.test_login
openstack_dashboard.test.integration_tests.tests.test_login.TestLogin.test_login ... ERROR
Destroying test database for alias 'default' (':memory:')...
==
ERROR: openstack_dashboard.test.integration_tests.tests.test_login.TestLogin.test_login
--
_StringException: Traceback (most recent call last):
  File "/opt/stack/horizon/openstack_dashboard/test/integration_tests/helpers.py", line 34, in setUp
    self.driver = webdriver.Chrome()
  File "/opt/stack/horizon/.venv/local/lib/python2.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 67, in __init__
    self.quit()
  File "/opt/stack/horizon/.venv/local/lib/python2.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 82, in quit
    self.service.stop()
  File "/opt/stack/horizon/.venv/local/lib/python2.7/site-packages/selenium/webdriver/chrome/service.py", line 97, in stop
    url_request.urlopen("http://127.0.0.1:%d/shutdown" % self.port)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1187, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1045, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 409, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
timeout: timed out
--
Ran 1 test in 3.332s

FAILED (errors=1)

Thanks Hong-Guang ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
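One workaround worth trying (a sketch, not a confirmed fix: it assumes Eclipse/PyDev installs a short global socket timeout in the test process, which is what the urlopen() call in the trace above would inherit, since urllib2 falls back to the global default when no timeout is passed) is to raise the default before the Selenium driver starts:

    import socket
    # Raise the process-wide default socket timeout (seconds) before
    # webdriver.Chrome() runs; None disables the timeout entirely.
    socket.setdefaulttimeout(60)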
Re: [openstack-dev] [neutron] HA of dhcp agents?
Thanks for the pointer! I like how the first google hit for this is: Add details on dhcp_agents_per_network option for DHCP agent HA https://bugs.launchpad.net/openstack-manuals/+bug/1370934 :) Seems reasonable to set dhcp_agents_per_network > 1. What happens when a DHCP agent dies? Does the scheduler automatically bind another agent to that network? Cheers, -- Noel On Mon, Oct 20, 2014 at 9:03 PM, Jian Wen wenjia...@gmail.com wrote: See dhcp_agents_per_network in neutron.conf. https://bugs.launchpad.net/neutron/+bug/1174132 2014-10-21 6:47 GMT+08:00 Noel Burton-Krahn n...@pistoncloud.com: I've been working on failover for dhcp and L3 agents. I see that in [1], multiple dhcp agents can host the same network. However, it looks like I have to manually assign networks to multiple dhcp agents, which won't work. Shouldn't multiple dhcp agents automatically fail over? [1] http://docs.openstack.org/trunk/config-reference/content/multi_agent_demo_configuration.html -- Best, Jian ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
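For reference, the setting being discussed lives in neutron.conf; a minimal sketch (the value 2 is just an example):

    [DEFAULT]
    # Number of DHCP agents scheduled to host each tenant network
    dhcp_agents_per_network = 2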
Re: [openstack-dev] [Neutron] Why doesn't ml2-ovs work when its host != the dhcp agent's host?
Hi Kevin, The current method outlined in [1] is to manually assign networks to dhcp agents. I need to be able to kill the node running the dhcp agent and start it up on another node without manual intervention. Someone else pointed me to the dhcp_agents_per_network option which I'm looking into now. [1] http://docs.openstack.org/trunk/config-reference/content/multi_agent_demo_configuration.html On Mon, Oct 20, 2014 at 8:17 PM, Kevin Benton blak...@gmail.com wrote: The current suggested way for DHCP agent fault tolerance is multiple agents per network. Is there a reason you don't want to use that option? On Oct 20, 2014 5:13 PM, Noel Burton-Krahn n...@pistoncloud.com wrote: Thanks, Robert. So, ML2 needs the host attribute to match to bind the port. My other requirement is that the dhcp agent must be able to migrate to a new host on failover. The issue there is that if the dhcp service starts on a new host with a new host name, then it will not take over the networks that were served by the old host name. I'm looking for a way to start the dhcp agent on a new host using the old host's config. -- Noel On Mon, Oct 20, 2014 at 11:10 AM, Robert Kukura kuk...@noironetworks.com wrote: Hi Noel, The ML2 plugin uses the binding:host_id attribute of port to control port binding. For compute ports, nova sets binding:host_id when creating/updating the neutron port, and ML2's openvswitch mechanism driver will look in agents_db to make sure the openvswitch L2 agent is running on that host, and that it has a bridge mapping for any needed physical network or has the appropriate tunnel type enabled. The binding:host_id attribute also gets set on DHCP, L3, and other agents' ports, and must match the host of the openvswitch-agent on that node or ML2 will not be able to bind the port. I suspect your configuration may be resulting in these not matching, and the DHCP port's binding:vif_type attribute being 'binding_failed'. I'd suggest running neutron port-show as admin on the DHCP port to see what the values of binding:vif_type and binding:host_id are, and running neutron agent-list as admin to make sure there is an L2 agent on that node, and maybe neutron agent-show as admin to get that agent's config details. -Bob On 10/20/14 1:28 PM, Noel Burton-Krahn wrote: I'm running OpenStack Icehouse with Neutron ML2/OVS. I've configured the ml2-ovs-plugin on all nodes with host = the IP of the host itself. However, my dhcp-agent may float from host to host for failover, so I configured it with host=floating. That doesn't work. In this case, the ml2-ovs-plugin creates a namespace and a tap interface for the dhcp agent, but OVS doesn't route any traffic to the dhcp agent. It *does* work if the dhcp agent's host is the same as the ovs plugin's host, but if my dhcp agent migrates to another host, it loses its configuration since it now has a different host name. So my question is, what does "host" mean for the ML2 dhcp agent, and how can I get it to work if the dhcp agent's host != host for the ovs plugin? 
Case 1: fails: running with dhcp agent's host = floating, ovs plugin's host = IP-of-server
 - dhcp agent is running in netns created by ovs-plugin
 - dhcp agent never receives network traffic

Case 2: ok: running with dhcp agent's host = ovs plugin's host = IP-of-server
 - dhcp agent is running in netns created by ovs-plugin (different tap name than case 1)
 - dhcp agent works

-- Noel ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
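To make Bob's debugging suggestion concrete, the checks look roughly like this (a sketch; the UUIDs are placeholders and all commands run with admin credentials):

    # find the L2 agents and confirm one is alive on the DHCP node
    neutron agent-list
    # inspect the DHCP port's binding; a binding:vif_type of
    # 'binding_failed' means ML2 could not match binding:host_id
    # to a live OVS agent on that host
    neutron port-show <dhcp-port-uuid>
    # dump the OVS agent's configuration (bridge mappings, tunnel types)
    neutron agent-show <ovs-agent-uuid>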
Re: [openstack-dev] FreeBSD host support
On Mon, Oct 20, 2014 at 10:19 PM, Joe Gordon joe.gord...@gmail.com wrote: On Sat, Oct 18, 2014 at 10:04 AM, Roman Bogorodskiy rbogorods...@mirantis.com wrote: Hi, In discussion of this spec proposal: https://review.openstack.org/#/c/127827/ it was suggested by Joe Gordon to start a discussion on the mailing list. So I'll share my thoughts and a long term plan on adding FreeBSD host support for OpenStack. An ultimate goal is to allow using libvirt/bhyve as a compute driver. However, I think it would be reasonable to start with libvirt/qemu support first as it will allow us to prepare the ground. Before diving into the technical details below, I have one question. Why? What is the benefit of this, besides the obvious 'we now support FreeBSD'? Adding support for a new kernel introduces yet another column in our support matrix, and will require a long term commitment to testing and maintaining OpenStack on FreeBSD. There are a number of FreeBSD users that are interested in virtualization and effective management of the virtualized resources. Using OpenStack would be much more convenient than using the custom scripts / home grown solutions people usually use now. High level overview of what needs to be done: - Nova * linux_net needs to be re-factored to allow plugging in FreeBSD support (that's what the spec linked above is about) * nova.virt.disk.mount needs to be extended to support FreeBSD's mdconfig(8) in a similar way to Linux's losetup - Glance and Keystone These components are fairly free of system specifics. Most likely they will require some small fixes like e.g. the one I made for Glance https://review.openstack.org/#/c/94100/ - Cinder I didn't look closely at Cinder from a porting perspective, tbh. Obviously, it'll need some backend driver that would work on FreeBSD, e.g. ZFS. I've seen some patches floating around for ZFS though. Also, I think it'll need an implementation of an iSCSI stack on FreeBSD, because it has its own stack, not stgt. On the other hand, Cinder is not required for a minimal installation and that could be done after adding support for the other components. What about neutron? We are in the process of trying to deprecate nova-network, so any new thing needs to support neutron. AFAIK, there's no defined migration plan yet, unless I missed that. Anyway, I don't see any blockers regarding an implementation of a driver similar to linuxbridge that'd work on FreeBSD. Also, Semihalf guys are working on OpenContrail/FreeBSD and Neutron/OpenContrail support, so that's an option as well. Also, it's worth mentioning that a discussion on this topic already happened on this mailing list: http://lists.openstack.org/pipermail/openstack-dev/2014-March/031431.html Some of the limitations were resolved since then; specifically, libvirt/bhyve no longer has a limitation on the number of disk and ethernet devices. Roman Bogorodskiy ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
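For anyone unfamiliar with the mdconfig(8)/losetup parallel mentioned above, a rough sketch of the equivalence (paths are placeholders):

    # Linux: attach a file as a loop device
    losetup /dev/loop0 /path/to/disk.img
    losetup -d /dev/loop0            # detach

    # FreeBSD: attach a file as a vnode-backed memory disk (prints e.g. md0)
    mdconfig -a -t vnode -f /path/to/disk.img
    mdconfig -d -u 0                 # detach md0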
[openstack-dev] Combination of Heat ResourceGroup (index) with Fn::Select doesn't work?
I am trying to find a way of creating a dynamic list of resources (Loadbalancer PoolMembers, to be exact) using Heat. The idea is that the number of PoolMembers and the required addresses would be received as Heat parameters. However, I am unable to get %index% working inside a Fn::Select block. Is this a bug with Heat, or am I doing something wrong? If this is a bug/limitation in Heat, is there some other way to get what I am trying to do working with Heat? IMO this is a very important use case for the %index%.

"Parameters": {
    "NumberOfMembers": {
        "Description": "Number of Pool Members to be created",
        "Type": "Number",
        "Default": 1
    },
    "MembersList": {
        "Description": "Pool Member IP Address",
        "Type": "Json",
        "Default": {"key0": "11.0.0.43"}
    }
},

"MemberList": {
    "Type": "OS::Heat::ResourceGroup",
    "Properties": {
        "count": {"Ref": "NumberOfMembers"},
        "resource_def": {
            "type": "OS::Neutron::PoolMember",
            "properties": {
                "address": {"Fn::Select": ["key%index%", {"Ref": "MembersList"}]},
                "admin_state_up": true,
                "pool_id": {"Ref": "HaproxyPool"},
                "protocol_port": 80,
                "weight": 1
            }
        }
    }
}

Regards, Magesh ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
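One workaround that has been suggested for this kind of problem (an untested sketch, not a confirmed fix) is to push the Fn::Select down into a nested provider template: %index% is substituted as a plain string into resource_def, so it can be passed through as a parameter and resolved inside the nested stack. Here pool_member.yaml is a hypothetical nested template that takes member_index, members_list, and pool_id parameters and performs the Fn::Select itself:

    "resource_def": {
        "type": "pool_member.yaml",
        "properties": {
            "member_index": "key%index%",
            "members_list": {"Ref": "MembersList"},
            "pool_id": {"Ref": "HaproxyPool"}
        }
    }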
Re: [openstack-dev] [neutron] HA of dhcp agents?
No, unfortunately when the DHCP agent dies there isn't automatic rescheduling at the moment. On Mon, Oct 20, 2014 at 11:56 PM, Noel Burton-Krahn n...@pistoncloud.com wrote: ... -- Kevin Benton ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
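Until automatic rescheduling exists, moving a network off a dead agent is a manual (or externally scripted) step; roughly (a sketch, IDs are placeholders):

    # list agents and see which are alive, then which agents host the network
    neutron agent-list
    neutron dhcp-agent-list-hosting-net <network-id>
    # rebind the network from the dead agent to a live one
    neutron dhcp-agent-network-remove <dead-agent-id> <network-id>
    neutron dhcp-agent-network-add <live-agent-id> <network-id>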
Re: [openstack-dev] [cinder][nova] Are disk-intensive operations managed ... or not?
I would say that wipe-on-delete is not necessary in most deployments. Most storage backends exhibit the following behavior: 1. Delete volume A that has data on physical sectors 1-10 2. Create new volume B 3. Read from volume B before writing, which happens to map to physical sector 5 - backend should return zeroes here, and not data from volume A In case the backend doesn't provide this rather standard behavior, data must be wiped immediately. Otherwise, the only risk is physical security, and if that's not adequate, customers shouldn't be storing all their data there regardless. You could also run a periodic job to wipe deleted volumes to reduce the window of vulnerability, without making delete_volume take a ridiculously long time. Encryption is a good option as well, and of course it protects the data before deletion as well (as long as your keys are protected...) Bottom line - I too think the default in devstack should be to disable this option, and think we should consider making the default False in Cinder itself. This isn't the first time someone has asked why volume deletion takes 20 minutes... As for queuing backup operations and managing bandwidth for various operations, ideally this would be done with a holistic view, so that for example Cinder operations won't interfere with Nova, or different Nova operations won't interfere with each other, but that is probably far down the road. Thanks, Avishay On Tue, Oct 21, 2014 at 9:16 AM, Chris Friesen chris.frie...@windriver.com wrote: ... ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
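For reference, the cinder.conf knobs being discussed look roughly like this for the LVM driver (a sketch; the ionice option is what I take to be the "relatively new" control over wiping cost that Avishay mentions, so treat its exact use here as an assumption):

    [DEFAULT]
    # zero | shred | none -- 'none' disables wiping entirely
    volume_clear = zero
    # MiB to wipe at the start of the volume; 0 wipes the whole volume
    volume_clear_size = 0
    # I/O priority for the wiping process (idle class), so the dd
    # stops starving other I/O on the host
    volume_clear_ionice = -c3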
Re: [openstack-dev] [Ironic] disambiguating the term discovery
On 10/21/2014 02:11 AM, Devananda van der Veen wrote: Hi all, I was reminded in the Ironic meeting today that the words hardware discovery are overloaded and used in different ways by different people. Since this is something we are going to talk about at the summit (again), I'd like to start the discussion by building consensus in the language that we're going to use. So, I'm starting this thread to explain how I use those two words, and some other words that I use to mean something else which is what some people mean when they use those words. I'm not saying my words are the right words -- they're just the words that make sense to my brain right now. If someone else has better words, and those words also make sense (or make more sense) then I'm happy to use those instead. So, here are rough definitions for the terms I've been using for the last six months to disambiguate this: hardware discovery The process or act of identifying hitherto unknown hardware, which is addressable by the management system, in order to later make it available for provisioning and management. hardware introspection The process or act of gathering information about the properties or capabilities of hardware already known by the management system. I generally agree with this separation, though it brings some troubles to me, as I'm used to calling discovery what you called introspection (it was not the case this summer, but now I changed my mind). And the term discovery is baked into the.. hmm.. introspection service that I've written [1]. So I would personally prefer to leave discovery as in discovery of hardware properties, though I realize that introspection may be a better name. [1] https://github.com/Divius/ironic-discoverd Why is this disambiguation important? At the last midcycle, we agreed that hardware discovery is out of scope for Ironic -- finding new, unmanaged nodes and enrolling them with Ironic is best left to other services or processes, at least for the forseeable future. However, introspection is definitely within scope for Ironic. Even though we couldn't agree on the details during Juno, we are going to revisit this at the Kilo summit. This is an important feature for many of our current users, and multiple proof of concept implementations of this have been done by different parties over the last year. It may be entirely possible that no one else in our developer community is using the term introspection in the way that I've defined it above -- if so, that's fine, I can stop calling that introspection, but I don't know a better word for the thing that is find-unknown-hardware. Suggestions welcome, Devananda P.S. For what it's worth, googling for hardware discovery yields several results related to identifying unknown network-connected devices and adding them to inventory systems, which is the way that I'm using the term right now, so I don't feel completely off in continuing to say discovery when I mean find unknown network devices and add them to Ironic. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [solum] N00b problems running solum
Hello hello! I'm trying to bring up a solum development environment with vagrant devstack, but I'm having problems running any of the example assemblies. They all get stuck at status BUILDING, as follows:

vagrant@devstack:/var/log/solum/worker$ solum assembly show 5c8c26fc-6c9c-460d-b26a-4ac57a86ca82
+-----------------+-------------------------------------------------------------------------+
| Property        | Value                                                                   |
+-----------------+-------------------------------------------------------------------------+
| status          | BUILDING                                                                |
| description     | test assembly                                                           |
| application_uri | None                                                                    |
| name            | ex1                                                                     |
| trigger_uri     | http://10.0.2.15:9777/v1/triggers/4664cc77-77e4-4ecc-8ba9-784204bee273  |
| uuid            | 5c8c26fc-6c9c-460d-b26a-4ac57a86ca82                                    |
+-----------------+-------------------------------------------------------------------------+

The solum worker looks like it built the docker image successfully:

{"@timestamp": "2014-10-20 22:16:37.272", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "Step 5 : CMD start web"}
{"@timestamp": "2014-10-20 22:16:37.355", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": " ---> Running in d128f920e976"}
{"@timestamp": "2014-10-20 22:16:44.986", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": " ---> 8b540a8b4899"}
{"@timestamp": "2014-10-20 22:16:45.893", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "Removing intermediate container d128f920e976"}
{"@timestamp": "2014-10-20 22:16:45.896", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "Successfully built 8b540a8b4899"}
{"@timestamp": "2014-10-20 22:16:45.919", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "Finished: sudo docker build -t nodeus . [Elapsed: 43 sec] (EXIT_STATUS=0)"}
{"@timestamp": "2014-10-20 22:18:05.596", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "= Total elapsed time: 166 sec"}
{"@timestamp": "2014-10-20 22:18:05.604", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "created_image_id= ID"}

I'm on the HEAD of master: commit 6e764bb9d7f831a722ffa2ed6530060ec2f48b82 Author: Ed Cranford ed.cranf...@rackspace.com Date: Thu Oct 16 10:27:07 2014 -0500 Any tips? Thanks, Phil. -- Philip Cheong Elastx | Public and Private PaaS email: philip.che...@elastx.se office: +46 8 557 728 10 mobile: +46 702 870 814 twitter: @Elastx https://twitter.com/Elastx http://elastx.se ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Pulling nova/virt/hardware.py into nova/objects/
On 20 Oct 2014 20:13, Dan Smith d...@danplanet.com wrote: OK, so in reviewing Dan B's patch series that refactors the virt driver's get_available_resource() method [1], I am stuck between two concerns. I like (love even) much of the refactoring work involved in Dan's patches. They replace a whole bunch of our nested dicts that are used in the resource tracker with real objects -- and this is something I've been harping on for months that really hinders developers' understanding of Nova's internals. dict['line1'] = 'Agreed, this is extremely important stuff.' dict['line2'] = 'The current dict mess that we have there is ' dict['line3'] = 'really obscure and confusing.' reply = jsonutils.dumps(dict) However, all of the object classes that Dan B has introduced have been unversioned objects -- i.e. they have not derived from nova.objects.base.NovaObject. This means that these objects cannot be sent over the wire via an RPC API call. In practical terms, this issue has not yet reared its head, because the resource tracker still sends a dictified JSON representation of the object's fields directly over the wire, in the same format as Icehouse, therefore there have been no breakages in RPC API compatibility. Right, so the blueprint for this work states that it's not to be sent over the RPC wire or stored in the database. However, it already is in some cases (at least the ComputeNode object has the unversioned JSONified version of some of these hardware models in it). If the modeling is purely for internal-to-compute-node purposes, then it's all good. However, it surely seems like with the pending scheduler isolation work, we're in a spot where we are building two parallel model hierarchies, and I'm not really sure why. As there are multiple interfaces using non-versioned dicts, and as we are looking at reducing technical debt by Kilo, there are different blueprints which can be worked on in parallel. Here, the virt-to-RT interface has to be objectified, hence Dan's work. On the other end of the RT, the RT-to-scheduler interface has to be objectified, hence Jay's and my work. I hope we will provide a clear big picture and a roadmap for the Summit so we can give you more insight. My proposal is that before we go and approve any BPs or patches that add to nova/virt/hardware.py, we first put together a patch series that moves the object models in nova/virt/hardware.py to being full-fledged objects in nova/objects/* I'm not sure that just converting them all to NovaObjects is really necessary here. If it's all stuff that is going to go over the wire eventually as part of the resource tracker's expansion, then probably so. If there are bits of the model that only serve to let the resource tracker do its calculations, then perhaps it doesn't make sense to require those be NovaObjects. Totally agreed. Here there is no need to version the interface, as the virt/RT interface is not RPC based and is purely internal to nova-compute. We just need to objectify the interface to explicitly provide what kind of resources are sent, but that's it. Regardless, it sounds like we need some discussion on how best to proceed here. Since it's entirely wrapped up in the scheduler work, we should definitely try to make sure that what we're doing here fits with those plans. Last I heard, we weren't sure where we were going to draw the line between nova bits and scheduler bits, so erring on the side of more versioned interfaces seems safest to me. Again, we hope to give you all a better understanding at the Summit. 
I can't expand further as I'm on vacation until next Wednesday, so I fully accept that my last paragraph is a horrible teaser unless someone else from the gang adds more details. -Sylvain --Dan ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Cells conversation starter
On 10/20/2014 08:00 PM, Andrew Laski wrote: One of the big goals for the Kilo cycle by users and developers of the cells functionality within Nova is to get it to a point where it can be considered a first class citizen of Nova. Ultimately I think this comes down to getting it tested by default in Nova jobs, and making it easy for developers to work with. But there's a lot of work to get there. In order to raise awareness of this effort, and get the conversation started on a few things, I've summarized a little bit about cells and this effort below.

Goals:
 - Testing of a single cell setup in the gate.
 - Feature parity.
 - Make cells the default implementation.
 - Developers write code once and it works for cells.
 - Ultimately the goal is to improve maintainability of a large feature within the Nova code base.

Thanks for the write-up Andrew! Some thoughts/questions below. Looking forward to the discussion on some of these topics, and would be happy to review the code once we get to that point.

Feature gaps:
 - Host aggregates
 - Security groups
 - Server groups

Shortcomings:
 - Flavor syncing: this needs to be addressed now.
 - Cells scheduling/rescheduling; instances can not currently move between cells. These two won't affect the default one cell setup, so they will be addressed later.

What does cells do:
 - Schedule an instance to a cell based on flavor slots available.
 - Proxy API requests to the proper cell.
 - Keep a copy of instance data at the global level for quick retrieval.
 - Sync data up from a child cell to keep the global level up to date.

Simplifying assumptions:
 - Cells will be treated as a two level tree structure.

Are we thinking of making this official by removing code that actually allows cells to be an actual tree of depth N? I am not sure if doing so would be a win; it does complicate the RPC/messaging/state code a bit, but if it's not being used, even though it's a nice generalization, why keep it around?

Plan:
 - Fix flavor breakage in child cell which causes boot tests to fail. Currently the libvirt driver needs flavor.extra_specs, which is not synced to the child cell. Some options are to sync flavor and extra specs to the child cell db, or pass full data with the request. https://review.openstack.org/#/c/126620/1 offers a means of passing full data with the request.
 - Determine proper switches to turn off Tempest tests for features that don't work, with the goal of getting a voting job. Once this is in place we can move towards feature parity and work on internal refactorings.
 - Work towards adding parity for host aggregates, security groups, and server groups. They should be made to work in a single cell setup, but the solution should not preclude them from being used in multiple cells. There needs to be some discussion as to whether a host aggregate or server group is a global concept or a per cell concept.

Have there been any previous discussions on this topic? If so I'd really like to read up on those to make sure I understand the pros and cons before the summit session.

 - Work towards merging compute/api.py and compute/cells_api.py so that developers only need to make changes/additions in one place. The goal is for as much as possible to be hidden by the RPC layer, which will determine whether a call goes to a compute/conductor/cell.
 - For syncing data between cells, look at using objects to handle the logic of writing data to the cell/parent and then syncing the data to the other. 
Some of that work has been done already, although in a somewhat ad-hoc fashion, were you thinking of extending objects to support this natively (whatever that means), or do we continue to inline the code in the existing object methods. A potential migration scenario is to consider a non cells setup to be a child cell and converting to cells will mean setting up a parent cell and linking them. There are periodic tasks in place to sync data up from a child already, but a manual kick off mechanism will need to be added. Future plans: Something that has been considered, but is out of scope for now, is that the parent/api cell doesn't need the same data model as the child cell. Since the majority of what it does is act as a cache for API requests, it does not need all the data that a cell needs and what data it does need could be stored in a form that's optimized for reads. Thoughts? ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] request_id deprecation strategy question
On Mon, Oct 20, 2014 at 03:27:19PM -0700, Joe Gordon wrote: On Mon, Oct 20, 2014 at 11:12 AM, gordon chung g...@live.ca wrote: The issue I'm highlighting is that those projects using the code now have to update their api-paste.ini files to import from the new location, presumably while giving some warning to operators about the impending removal of the old code. This was the issue i ran into when trying to switch projects to oslo.middleware where i couldn't get jenkins to pass -- grenade tests successfully did their job. we had a discussion on openstack-qa and it was suggested to add a upgrade script to grenade to handle the new reference and document the switch. [1] if there's any issue with this solution, feel free to let us know. Going down this route means every deployment that wishes to upgrade now has an extra step, and should be avoided whenever possible. Why not just have a wrapper in project.openstack.common pointing to the new oslo.middleware library. If that is not a viable solution, we should give operators one full cycle where the oslo-incubator version is deprecated and they can migrate to the new copy outside of the upgrade process itself. Since there is no deprecation warning in Juno [0], We can deprecate the oslo-incubator copy in Kilo and remove in L. Yeah, this is pretty much my take on it - it's unfortunate that we missed the opportunity to add a deprecation warning for Juno, but given that we did, we're probably stuck with deprecation for Kilo and remove in L. Unless there's some dispensation to the normal backwards-compat policy for paste configs? (as opposed to the project .conf files, where I know you can't do this, because I've been shouted at for trying it in the past ;) Having a shim put back into oslo-incubator, with deprecation warning for Kilo, then sync that to all projects before making the switch seems to be a reasonable compromise to me. Steve ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
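For the record, the operator-visible change being debated amounts to a one-line edit in api-paste.ini, along these lines (a sketch only; both factory paths here are from memory and should be checked against the incubator copy and the oslo.middleware tree before relying on them):

    [filter:request_id]
    # old, oslo-incubator copy:
    paste.filter_factory = nova.openstack.common.middleware.request_id:RequestIdMiddleware.factory
    # new, oslo.middleware library:
    paste.filter_factory = oslo.middleware.request_id:RequestId.factory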
Re: [openstack-dev] [Ironic] disambiguating the term discovery
+1 for the separation. I already gave up on the term discovery, as you can see on the DRAC Hardware Introspection[1] spec. I also don't think that introspection is the best word for that (we already use the word cloud for OpenStack, so it can't get more confusing than that). Perhaps interrogation would be another term for that. [1] https://review.openstack.org/#/c/125920 Cheers, Lucas On Tue, Oct 21, 2014 at 8:49 AM, Dmitry Tantsur dtant...@redhat.com wrote: ... ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] Tentative schedule for Kilo Design Summit in Paris
Sean Roberts wrote: Chance we can move the congress session from Monday 14:30-16:00 to co-locate Tuesday with GBP? either before or after... Let me ask Chris Hoge (who is in charge of the larger ecosystem sessions), see if you could switch from Monday to Tuesday. -- Thierry Carrez (ttx) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ironic] disambiguating the term discovery
I agree with Devananda's definition of 'hardware discovery', and other tools similar to Ironic use the term discovery in this way; however, I have found that these other tools often bundle the gathering of the system properties together with the discovery of the hardware as a single step from a user perspective. I also agree that in Ironic there needs to be a separate term for that (at least from a dev perspective), and I think Lucas's suggestion of 'hardware interrogation' or something like 'hardware inventory' would be more explanatory at first glance than 'introspection'. - Sam On 21/10/2014 09:52, Lucas Alvares Gomes lucasago...@gmail.com wrote: ... ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [horizon]Blueprint- showing a small message to the user for browser incompatibility
On 14/10/14 18:30, Aggarwal, Nikunj wrote: Instead, Horizon guys came to a conclusion that we will identify the browser type and version to deal with legacy browsers like older IE or Firefox or any other browser, and for other major features we can use feature detection with Modernizr. I remember that discussion differently, and I'm not so sure there was a definite conclusion. We definitely should not use a white list for this. -- Radomir Dopieralski ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ironic] disambiguating the term discovery
On Tue, Oct 21, 2014 at 10:27 AM, Sam Betts (sambetts) sambe...@cisco.com wrote: I agree with Devananda's definition of 'hardware discovery' and other tools similar to Ironic use the term discovery in this way, however I have found that these other tools often bundle the gathering of the system properties together with the discovery of the hardware as a single step from a user perspective. I also agree that in Ironic there needs to be a separate term for that (at least from a dev perspective) and I think Lucas's suggestion of 'hardware interrogation' or something like 'hardware inventory' would be more explanatory at first glance than 'introspection'. Thanks for the suggestion but no inventory please, this is another taboo word in Ironic. This is because when we say hardware inventory it kinda suggests that Ironic could be used as a CMDB, which is not the case. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Pulling nova/virt/hardware.py into nova/objects/
On 10/20/2014 07:38 PM, Jay Pipes wrote: Hi Dan, Dan, Nikola, all Nova devs, OK, so in reviewing Dan B's patch series that refactors the virt driver's get_available_resource() method [1], I am stuck between two concerns. I like (love even) much of the refactoring work involved in Dan's patches. They replace a whole bunch of our nested dicts that are used in the resource tracker with real objects -- and this is something I've been harping on for months that really hinders developers' understanding of Nova's internals. However, all of the object classes that Dan B has introduced have been unversioned objects -- i.e. they have not derived from nova.objects.base.NovaObject. This means that these objects cannot be sent over the wire via an RPC API call. In practical terms, this issue has not yet reared its head, because the resource tracker still sends a dictified JSON representation of the object's fields directly over the wire, in the same format as Icehouse, therefore there have been no breakages in RPC API compatibility. The problems with having all these objects not modelled by deriving from nova.objects.base.NovaObject are two-fold: * The object's fields/schema cannot be changed -- or rather, cannot be changed without introducing upgrade problems. * The objects introduce a different way of serializing the object contents than is used in nova/objects -- it's not that much different, but it's different, and has only not caused a problem because the serialization routines are not yet being used to transfer data over the wire. So, what to do? Clearly, I think the nova/virt/hardware.py objects are badly needed. However, one of (the top?) priorities of the Nova project is upgradeability, and by not deriving from nova.objects.base.NovaObject, these nova.virt.hardware objects are putting that mission in jeopardy, IMO. My proposal is that before we go and approve any BPs or patches that add to nova/virt/hardware.py, we first put together a patch series that moves the object models in nova/virt/hardware.py to being full-fledged objects in nova/objects/* I think that we should have both in some cases, and although it makes sense to have them only as objects in some cases, having them as separate classes for some and not others may be confusing. So when does it make sense to have them as separate classes? Well, basically whenever there is a need for driver-agnostic logic that will be used outside of the driver (scheduler/claims/API/). Can this stuff go in objects? Technically yes, but objects are really not a good place for such logic, as they may already be trying to solve too much (data versioning and downgrading when there is a multi-version cloud running, database access for compute, and there are at least 2 more features considered to be part of objects - cells integration and schema data migrations). Take CPU pinning as an example [1] - none of that logic would benefit from living in the NovaObject child class itself, and it would make the class quite bloated. Having it in a separate module that objects can call into is definitely beneficial, while we should definitely stay with objects for versioning/backporting support. So I say in a number of cases we need both. Both is exactly what I did for NUMA, with the exception of the compute node side (we are hoping to start the json blob cleanup in K, so I did not concern myself with it for the sake of getting things done, but we will need it). This is what I am doing now with CPU pinning. 
The question I did not touch upon is what kind of interface that leaves poor Nova developers with. Having everything as objects would allow us to write things like (in the CPU pinning case):

    instance.cpu_pinning = compute.cpu_pinning.get_pinning_for_instance(
        instance)

Pretty slick, no? While keeping it completely separate would make us do things like:

    cpu_pinning = compute.cpu_pinning.topology_from_obj()
    if cpu_pinning:
        instance_pinning = cpu_pinning.get_pinning_for_instance(
            instance.cpu_pinning.topology_from_obj())
        instance.cpu_pinning = objects.InstanceCPUPinning.obj_from_topology(
            instance_pinning)

Way less slick, but it can be easily fixed with a level of indirection. Note that the above holds only once we are objectified everywhere - until then we pretty much *have* to have both. So to sum up, what I think we should do is: 1) Don't bloat the object code with low level stuff 2) Do have objects for versioning everything 3) Make nice APIs that developers can enjoy (after we've converted all the code to use objects). N. [1] https://review.openstack.org/#/c/128738/4/nova/virt/hardware.py Thoughts? -jay [1] https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/virt-driver-get-available-resources-object,n,z ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org
Re: [openstack-dev] [oslo] request_id deprecation strategy question
On Mon, Oct 20, 2014 at 03:27:19PM -0700, Joe Gordon wrote: On Mon, Oct 20, 2014 at 11:12 AM, gordon chung g...@live.ca wrote: The issue I'm highlighting is that those projects using the code now have to update their api-paste.ini files to import from the new location, presumably while giving some warning to operators about the impending removal of the old code. This was the issue i ran into when trying to switch projects to oslo.middleware where i couldn't get jenkins to pass -- grenade tests successfully did their job. we had a discussion on openstack-qa and it was suggested to add a upgrade script to grenade to handle the new reference and document the switch. [1] if there's any issue with this solution, feel free to let us know. Going down this route means every deployment that wishes to upgrade now has an extra step, and should be avoided whenever possible. Why not just have a wrapper in project.openstack.common pointing to the new oslo.middleware library. If that is not a viable solution, we should give operators one full cycle where the oslo-incubator version is deprecated and they can migrate to the new copy outside of the upgrade process itself. Since there is no deprecation warning in Juno [0], We can deprecate the oslo-incubator copy in Kilo and remove in L. I've proposed a patch with a compatibility shim which may provide one way to resolve this: https://review.openstack.org/129858 Steve ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
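For those following along, the shim pattern amounts to keeping the old incubator module as a thin subclass of the library class. A hypothetical sketch (module and class names are assumptions, not taken from the actual patch):

    # nova/openstack/common/middleware/request_id.py -- compatibility shim
    import warnings

    from oslo.middleware import request_id

    class RequestIdMiddleware(request_id.RequestId):
        """Deprecated alias kept so old api-paste.ini entries keep working."""

        def __init__(self, *args, **kwargs):
            # Emit a deprecation warning, then defer to the library class.
            warnings.warn('nova.openstack.common.middleware.request_id is '
                          'deprecated; use oslo.middleware.request_id.',
                          DeprecationWarning)
            super(RequestIdMiddleware, self).__init__(*args, **kwargs)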
Re: [openstack-dev] [TripleO] Summit scheduling - using our time together wisely.
On 10/20/2014 04:34 PM, Ben Nemec wrote: I guess my only concern would be whether either of those things is contentious (both sound like must-do's at some point) and whether there is anything on either topic that requires f2f conversation to resolve. There's a spec out for Cinder HA already (https://review.openstack.org/#/c/101237/) that seems to have at least general support from everyone, and it's not clear to me that the L3 one can be resolved by us. It sounds like we need Nova and Neutron changes for that. Of course, if we can get some Nova and Neutron folks to commit to attending that session then I could see that being helpful. In general both of those topics on the etherpad are a little light on details, so I'd personally like to see some more specifics on what we'd be talking about. Hi, agreed - let me add some more comments here as a start. My plan for the Cinder HA session was to start with what we have today and compare it with what we propose in the spec. The spec indeed is currently 'on pause' because of issues affecting some backend drivers (Ceph included) when these are deployed in an active/active configuration. Cinder folks have been/are working on it and will have a session about their new State Machine [1], which should help in this regard, if not fix it entirely. I'm planning to join their session before having ours, so we can sync up our planning with Cinder people and update the spec, hopefully for its final revision, jointly. 1. https://etherpad.openstack.org/p/kilo-cinder-summit-topics -- Giulio Fidente GPG KEY: 08D733BA | IRC: giulivo ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [solum] N00b problems running solum
Which Linux distribution are you using? The BUILDING state ends when an AMQP message is seen by Solum from the Solum worker process. If the queue is not working, then the states will not change. There is at least one version of Ubuntu that has produced a nonfunctional queue setup before. I suggest you drop by #solum on Freenode and we can offer some tips for how to confirm this. In the meantime, if it is convenient to try a different distro, you might want to give that a try. Regards, Adrian

Original message From: Philip Cheong Date: 10/21/2014 4:03 AM (GMT-05:00) To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [solum] N00b problems running solum

Hello hello! I'm trying to bring up a solum development environment with vagrant devstack, but I'm having problems running any of the example assemblies. They all get stuck at status BUILDING like follows:

    vagrant@devstack:/var/log/solum/worker$ solum assembly show 5c8c26fc-6c9c-460d-b26a-4ac57a86ca82
    +-----------------+------------------------------------------------------------------------+
    | Property        | Value                                                                  |
    +-----------------+------------------------------------------------------------------------+
    | status          | BUILDING                                                               |
    | description     | test assembly                                                          |
    | application_uri | None                                                                   |
    | name            | ex1                                                                    |
    | trigger_uri     | http://10.0.2.15:9777/v1/triggers/4664cc77-77e4-4ecc-8ba9-784204bee273 |
    | uuid            | 5c8c26fc-6c9c-460d-b26a-4ac57a86ca82                                   |
    +-----------------+------------------------------------------------------------------------+

The solum worker looks like it built the docker image successfully:

    {"@timestamp": "2014-10-20 22:16:37.272", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "Step 5 : CMD start web"}
    {"@timestamp": "2014-10-20 22:16:37.355", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "---> Running in d128f920e976"}
    {"@timestamp": "2014-10-20 22:16:44.986", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "---> 8b540a8b4899"}
    {"@timestamp": "2014-10-20 22:16:45.893", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "Removing intermediate container d128f920e976"}
    {"@timestamp": "2014-10-20 22:16:45.896", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "Successfully built 8b540a8b4899"}
    {"@timestamp": "2014-10-20 22:16:45.919", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "Finished: sudo docker build -t nodeus . [Elapsed: 43 sec] (EXIT_STATUS=0)"}
    {"@timestamp": "2014-10-20 22:18:05.596", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "= Total elapsed time: 166 sec"}
    {"@timestamp": "2014-10-20 22:18:05.604", "project_id": "e1bbe85dcd334626891c41462c382af9", "build_id": "56476b9d5bc7d12eaf36696fe85baa75ccd51328715a9c617846f04baa94d94e", "task": "build", "message": "created_image_id= ID"}

I'm on the HEAD of master: commit 6e764bb9d7f831a722ffa2ed6530060ec2f48b82 Author: Ed Cranford ed.cranf...@rackspace.com Date: Thu Oct 16 10:27:07 2014 -0500 Any tips? Thanks, Phil.
-- Philip Cheong Elastx | Public and Private PaaS email: philip.che...@elastx.se office: +46 8 557 728 10 mobile: +46 702 870 814 twitter: @Elastx https://twitter.com/Elastx http://elastx.se/ ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
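One quick way to rule the queue in or out is to round-trip a message through the broker directly, the way the worker would (a minimal sketch using kombu; the broker URL and credentials are assumptions - use the rabbit settings from your solum.conf):

    from kombu import Connection

    # If this times out, the AMQP setup is the problem, not solum itself.
    with Connection('amqp://guest:guest@localhost:5672//') as conn:
        queue = conn.SimpleQueue('solum-smoke-test')
        queue.put({'ping': 'pong'})
        msg = queue.get(block=True, timeout=5)
        msg.ack()
        print('AMQP round-trip OK: %s' % msg.payload)
        queue.close()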
Re: [openstack-dev] [Nova] Automatic evacuate
Hi, sorry for the top posting, but it was hard to fit my complete view inline. I'm also thinking about a possible solution for automatic server evacuation. I see two separate sub-problems here: 1) compute node monitoring and fencing, 2) automatic server evacuation.

Compute node monitoring is currently implemented in the servicegroup module of nova. As far as I understand, pacemaker is the solution proposed in this thread to solve both monitoring and fencing, but we tried it and found out that pacemaker_remote on baremetal does not work together with fencing (yet), see [1]. So if we need fencing, then either we have to go for normal pacemaker instead of pacemaker_remote - but that solution doesn't scale - or we configure and call stonith directly when pacemaker detects the compute node failure. We can create a pacemaker driver for servicegroup, and that driver can hide this currently missing pacemaker functionality by calling stonith directly today and drop that extra functionality as soon as pacemaker itself is capable of doing it. However, this means that the servicegroup driver has to know the stonith configuration of the compute nodes.

Another concern of mine with pacemaker is that the 'up' state of the resource representing the compute node does not automatically mean that the nova-compute service is also up and running on that compute node. So we have to ask the deployer of the compute node to configure the nova-compute service in pacemaker in a way that the nova-compute service is a pacemaker resource tied to the compute node. Without this configuration change, another possibility would be to calculate the 'up' state of a compute service by evaluating a logical operator over a coupled set of sources (e.g. service state in DB AND pacemaker state of the node).

For automatic server evacuation we need a piece of code that gets information about the state of the compute nodes periodically and calls the nova evacuation command if necessary. Today the information source for compute node state is the servicegroup API, so either the evacuation engine has to be part of nova or the servicegroup API needs to be made available from outside of nova. For me, adding the evacuation engine to nova looks simpler than externalizing the servicegroup API. Today the nova evacuate command expects information about whether the server is on shared storage, so to be able to call evacuate automatically we also need to determine automatically whether the server is on shared storage. Also, we can consider persisting some of the scheduler hints, for example the group hint used by the ServerGroupAntiAffinityFilter, as proposed in [2].

The new pacemaker servicegroup driver can be implemented first, then we can add the evacuation engine as a next step. I'm happy to help with the BP work and the implementation of the feature.
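To sketch the shape of the proposed driver (purely illustrative - the pacemaker query and the stonith call are hypothetical helpers, and the driver interface is paraphrased from nova.servicegroup):

    from nova.servicegroup import api


    class PacemakerDriver(api.ServiceGroupDriver):
        """Illustrative only: report node state from pacemaker, and fence
        failed hosts directly until pacemaker_remote can do it itself."""

        def join(self, member_id, group_id, service=None):
            # Membership is managed by pacemaker; nothing to do here.
            pass

        def is_up(self, service_ref):
            host = service_ref['host']
            if self._pacemaker_node_online(host):  # hypothetical helper
                return True
            # Work around the missing remote-node fencing by calling
            # stonith directly; drop this once pacemaker handles it.
            self._stonith_fence(host)  # hypothetical helper
            return False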
[1] http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/#_baremetal_remote_node_use_case [2] https://blueprints.launchpad.net/nova/+spec/validate-targethost-live-migration Cheers, Gibi -Original Message- From: Jastrzebski, Michal [mailto:michal.jastrzeb...@intel.com] Sent: October 18, 2014 09:09 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Automatic evacuate -Original Message- From: Florian Haas [mailto:flor...@hastexo.com] Sent: Friday, October 17, 2014 1:49 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Automatic evacuate On Fri, Oct 17, 2014 at 9:53 AM, Jastrzebski, Michal michal.jastrzeb...@intel.com wrote: -Original Message- From: Florian Haas [mailto:flor...@hastexo.com] Sent: Thursday, October 16, 2014 10:53 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Automatic evacuate On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal michal.jastrzeb...@intel.com wrote: In my opinion flavor defining is a bit hacky. Sure, it will provide us functionality fairly quickly, but it will also strip us of the flexibility Heat would give. Healing can be done in several ways: simple destroy - create (basic convergence workflow so far), evacuate with or without shared storage, even rebuild the vm, and probably a few more when we put more thought into it. But then you'd also need to monitor the availability of *individual* guests and down you go the rabbit hole. So suppose you're monitoring a guest with a simple ping. And it stops responding to that ping. I was more referring to monitoring the host (not the guest), and for sure not by ping. I was thinking of the current zookeeper servicegroup implementation; we might want to use corosync and write a servicegroup plugin for that. There are several choices for that, and each requires testing before we make any decision. There is
[openstack-dev] [Summit] Coordination between OpenStack lower layer virt stack (libvirt, QEMU/KVM)
I was discussing $subject on #openstack-nova; Nikola Dipanov suggested it's worthwhile to bring this up on the list. I was looking at http://kilodesignsummit.sched.org/ and noticed there's no specific session (correct me if I'm wrong) that's targeted at coordination between OpenStack - libvirt - QEMU/KVM. Nova, as one of the high-profile customers (to borrow Nikola's phrasing) of the libvirt/QEMU/KVM projects, would benefit from better coordination and from keeping track of what's happening in the lower layers of the open source virtualization stack. Also, libvirt is the virtualization driver that upstream OpenStack infrastructure relies on for gating. (A relevant thread[1] from the openstack-dev list.) I have not attended an OpenStack design summit before, but if I have to guess: this topic falls under libvirt driver/cross-project sessions, or some kind of unconference/BoF that's done on the fly while coordinating notes in an etherpad. Correct? On a related note, I just returned from KVM Forum/LinuxCon Europe. More than a couple of developers expressed interest in closer collaboration with the OpenStack layer. For those not familiar, KVM Forum is a developer event that mainly focuses on the KVM, QEMU, and libvirt projects and their integration work. My report of the conference, in plain text, is here[2]. [1] http://lists.openstack.org/pipermail/openstack-dev/2014-July/040421.html "fair standards for all hypervisor drivers" [2] https://kashyapc.fedorapeople.org/virt/kvmforum-linuxcon-eu-2014-trip-report.txt -- /kashyap ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Summit] Coordination between OpenStack lower layer virt stack (libvirt, QEMU/KVM)
On Tue, Oct 21, 2014 at 12:58:48PM +0200, Kashyap Chamarthy wrote: I was discussing $subject on #openstack-nova, Nikola Dipanov suggested it's worthwhile to bring this up on the list. I was looking at http://kilodesignsummit.sched.org/ and noticed there's no specific session (correct me if I'm wrong) that's targeted at coordination between OpenStack - libvirt - QEMU/KVM. At previous summits, Nova has given each virt driver a dedicated session in its track. Those sessions have pretty much just been a walkthrough of the various features each virt team was planning. We always have far more topics to discuss than we have time available, and for this summit we want to change direction to maximise the value extracted from face-to-face meetings. As such any session which is just duplicating stuff that could easily be dealt with over email or irc is being cut, to make room for topics where we really need to have the f2f discussions. So the virt driver general sessions from previous summits are not likely to be on the schedule this time around. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Summit] Coordination between OpenStack lower layer virt stack (libvirt, QEMU/KVM)
-Original Message- From: Daniel P. Berrange [mailto:berra...@redhat.com] Sent: 21 October 2014 13:08 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Summit] Coordination between OpenStack lower layer virt stack (libvirt, QEMU/KVM) On Tue, Oct 21, 2014 at 12:58:48PM +0200, Kashyap Chamarthy wrote: I was discussing $subject on #openstack-nova, Nikola Dipanov suggested it's worthwhile to bring this up on the list. I was looking at http://kilodesignsummit.sched.org/ and noticed there's no specific session (correct me if I'm wrong) that's targeted at coordination between OpenStack - libvirt - QEMU/KVM. At previous summits, Nova has given each virt driver a dedicated session in its track. Those sessions have pretty much just been a walkthrough of the various features each virt team was planning. We always have far more topics to discuss than we have time available, and for this summit we want to change direction to maximise the value extracted from face-to-face meetings. As such any session which is just duplicating stuff that could easily be dealt with over email or irc is being cut, to make room for topics where we really need to have the f2f discussions. So the virt driver general sessions from previous summits are not likely to be on the schedule this time around. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

It would also be interesting if the features of KVM could be made available through OpenStack around the same time. virtio-blk data plane would be an example where we can't work out how to exploit it out of the box under OpenStack.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Automatic evacuate
On 10/21/2014 06:44 AM, Balázs Gibizer wrote: Hi, Sorry for the top posting but it was hard to fit my complete view inline. I'm also thinking about a possible solution for automatic server evacuation. I see two separate sub problems of this problem: 1)compute node monitoring and fencing, 2)automatic server evacuation Compute node monitoring is currently implemented in servicegroup module of nova. As far as I understand pacemaker is the proposed solution in this thread to solve both monitoring and fencing but we tried and found out that pacemaker_remote on baremetal does not work together with fencing (yet), see [1]. So if we need fencing then either we have to go for normal pacemaker instead of pacemaker_remote but that solution doesn't scale or we configure and call stonith directly when pacemaker detect the compute node failure. I didn't get the same conclusion from the link you reference. It says: That is not to say however that fencing of a baremetal node works any differently than that of a normal cluster-node. The Pacemaker policy engine understands how to fence baremetal remote-nodes. As long as a fencing device exists, the cluster is capable of ensuring baremetal nodes are fenced in the exact same way as normal cluster-nodes are fenced. So, it sounds like the core pacemaker cluster can fence the node to me. I CC'd David Vossel, a pacemaker developer, to see if he can help clarify. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Horizon and Keystone: API Versions and Discovery
- Original Message - From: Dolph Mathews dolph.math...@gmail.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Monday, October 20, 2014 4:38:25 PM Subject: Re: [openstack-dev] Horizon and Keystone: API Versions and Discovery On Mon, Oct 20, 2014 at 7:04 AM, Jamie Lennox jamielen...@redhat.com wrote: - Original Message - From: Dolph Mathews dolph.math...@gmail.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Tuesday, October 7, 2014 6:56:15 PM Subject: Re: [openstack-dev] Horizon and Keystone: API Versions and Discovery On Tuesday, October 7, 2014, Adam Young ayo...@redhat.com wrote: Horizon has a config option which says which version of the Keystone API it should work against: V2 or V3. I am not certain that there is still any reason for Horizon to go against V2. However, if we defer the decision to Keystone, we come up against the problem of discovery. On the surface it is easy, as the Keystone client supports version discovery. The problem is that discovery must be run for each new client creation, and Horizon uses a new client per request. That would mean that every request to Horizon that talks to Keystone would generate at least one additional request. The response is cacheable. Not only is it cacheable, it is cached by default within the Session object you use, so that the session will only make one discovery request per service per session. So horizon can manage how long to cache discovery for by how long it holds on to a session object. As the session object doesn't contain any personal or sensitive data (that is all restricted to the auth plugin), the session object can be persisted between requests. Is there any reason not to cache to disk across sessions? The GET response is entirely endpoint-specific, not exactly session-based. The only reason is that I didn't want to introduce a global variable cache in a library. The session should be a fairly long-running object and i'm looking at ways we could serialize it to allow horizon/CLIs to manage it themselves. A quicker way would be to make the discovery cache an actual object and allow horizon/CLIs to handle that separately from the session/auth plugin. I don't know which they would prefer. Whether horizon works that way today - and whether the other services work with discovery as well as keystone does - i'm not sure. Is this significant? It gets a little worse when you start thinking about all of the other services out there. If each new request that has to talk to multiple services needs to run discovery, you can imagine that soon the majority of network chatter would be discovery-based. It seems to me that Horizon should somehow cache this data, and share it among clients. Note that I am not talking about user-specific data like the endpoints from the service catalog for a specific project. But the overall service catalog, as well as the supported versions of the API, should be cacheable. We can use the standard HTTP cache management API on the Keystone side to specify how long Horizon can trust the data to be current. I think this actually goes for the rest of the endpoints as well: we want to get to a much smaller service catalog, and we can do that by making the catalog hold only IDs. The constraints spec for endpoint binding will be endpoint-only anyway, and so having the rest of the endpoint data cached will be valuable there as well.
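For horizon the gist is simply to build the Session once and hand it to every client, so discovery is paid once per session lifetime rather than per request. A minimal sketch with current keystoneclient (URL and credentials are placeholders):

    from keystoneclient.auth.identity import v3
    from keystoneclient import session
    from keystoneclient.v3 import client

    auth = v3.Password(auth_url='http://keystone.example.com:5000/v3',
                       username='demo', password='secret',
                       project_name='demo',
                       user_domain_name='Default',
                       project_domain_name='Default')
    # One long-lived session: discovery results are cached on it, so
    # every client constructed from it avoids the extra round trips.
    sess = session.Session(auth=auth)
    keystone = client.Client(session=sess)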
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Keystone] Question regarding Service Catalog and Identity entries...
- Original Message - From: Ben Meyer ben.me...@rackspace.com To: openstack-dev@lists.openstack.org Sent: Monday, October 20, 2014 3:53:39 PM Subject: Re: [openstack-dev] [Keystone] Question regarding Service Catalog and Identity entries... On 10/20/2014 08:12 AM, Jamie Lennox wrote: - Original Message - From: Ben Meyer ben.me...@rackspace.com To: openstack-dev@lists.openstack.org Cc: Jamie Painter jamie.pain...@rackspace.com Sent: Tuesday, October 7, 2014 4:31:16 PM Subject: [openstack-dev] [Keystone] Question regarding Service Catalog and Identity entries... I am trying to use the Python Keystone client to integrate authentication functionality into a project I am contributing to (https://github.com/rackerlabs/deuce-client). However, I ran into a situation where if I do the following: c = keystoneclient.v2_0.client.Client(username='username', password='password', auth_url='https://keystone-compatible-service.example.com/v2.0/') Failed to retrieve management_url from token I traced it through the Python Keystoneclient code and it fails due to not finding the identity service entry in the Service Catalog. The authentication otherwise happens, in that it has already received a successful response and a full Service Catalog, aside from the missing identity service. This happens with both version 0.10 and 0.11 python keystone clients; I did not try older clients. Talking with the service provider, their version does not include itself in the Service Catalog, and they learned that Keystone itself inserts itself into the Service Catalog. I can certainly understand the value of having the identity service entry be part of the Service Catalog, but for them it is (at least for now) not desirable to do so. Questions: - Is it now a standard that Keystone inserts itself into the Service Catalog? It's not a standard that keystone inserts itself into the catalog; the cloud operator should maintain the list of endpoints for their deployment, and the 'identity' service should be amongst those endpoints. I'm unclear as to why it would be undesirable to list the identity endpoint in the service catalog. How would this addition change their deployment? The argument is that the Service Catalog is too big, so they are hesitant to add new entries to it; and 'identity' in the catalog is redundant since you have to know the 'identity' endpoint to even get the service catalog in the first place. Not saying I agree, just that's the argument being made. If it is required by Keystone to be self-referential then they are likely to add it. It's required for the CRUD operations (managing users, projects, roles etc.) of keystoneclient. Whether it's realistic that you would ever separate the auth process to a different host than the keystone CRUD I'm not sure - i've never seen it - but the idea is that beyond that initial auth contact there really is no difference between keystone and any other service, and keystoneclient will look up the catalog to determine how to talk to keystone. The problem with the code that you provided is that the token being returned from your code is unscoped. That means it is not associated with a project, and therefore it doesn't have a service catalog, because the catalog can be project-specific. Thus when you go to perform an operation the client will look for the URL it is supposed to talk to in an empty list and fail to find the identity endpoint. This message really needs to be improved.
If you add a project_id or project_name to the client information then you should get back a token with a catalog. In my normal case I'm using the project_id field; but have found that it didn't really matter what was used for the credentials in this case since they simply don't have the 'identity' end-points in the Service Catalog. - Or is the Python Keystone Client broken because it is forcing it to be so? I wouldn't say that it is broken because having an identity endpoint in your catalog is a required part of a deployment, in the same way that having a 'compute' endpoint is required if you want to talk to nova. I would be surprised by any decision to purposefully omit the 'identity' endpoint from the service catalog. See above; but from what you are presenting here it sounds like the deployment is broken so it is in fact required by Keystone, even if only a required part of a deployment. As keystoneclient is used by heat, horizon etc I would think it's safe to say it's required. Thanks Ben ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org
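In other words, the snippet from the original post yields an unscoped token; adding a tenant scopes it, and the scoped response carries the catalog (a sketch against the same v2.0 endpoint, with a placeholder tenant - though as discussed, the 'identity' entry still has to be in that catalog for the CRUD operations to work):

    import keystoneclient.v2_0.client

    # tenant_name (or tenant_id) scopes the token, so the response
    # includes the service catalog and management_url can be resolved.
    c = keystoneclient.v2_0.client.Client(
        username='username',
        password='password',
        tenant_name='my-project',
        auth_url='https://keystone-compatible-service.example.com/v2.0/')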
Re: [openstack-dev] [Nova] Automatic evacuate
-Original Message- From: Russell Bryant [mailto:rbry...@redhat.com] Sent: October 21, 2014 15:07 To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Nova] Automatic evacuate On 10/21/2014 06:44 AM, Balázs Gibizer wrote: Hi, Sorry for the top posting but it was hard to fit my complete view inline. I'm also thinking about a possible solution for automatic server evacuation. I see two separate sub problems of this problem: 1)compute node monitoring and fencing, 2)automatic server evacuation Compute node monitoring is currently implemented in servicegroup module of nova. As far as I understand pacemaker is the proposed solution in this thread to solve both monitoring and fencing but we tried and found out that pacemaker_remote on baremetal does not work together with fencing (yet), see [1]. So if we need fencing then either we have to go for normal pacemaker instead of pacemaker_remote but that solution doesn't scale or we configure and call stonith directly when pacemaker detect the compute node failure. I didn't get the same conclusion from the link you reference. It says: That is not to say however that fencing of a baremetal node works any differently than that of a normal cluster-node. The Pacemaker policy engine understands how to fence baremetal remote-nodes. As long as a fencing device exists, the cluster is capable of ensuring baremetal nodes are fenced in the exact same way as normal cluster-nodes are fenced. So, it sounds like the core pacemaker cluster can fence the node to me. I CC'd David Vossel, a pacemaker developer, to see if he can help clarify. It seems there is a contradiction between chapter 1.5 and 7.2 in [1] as 7.2 states: There are some complications involved with understanding a bare-metal node's state that virtual nodes don't have. Once this logic is complete, pacemaker will be able to integrate bare-metal nodes in the same way virtual remote-nodes currently are. Some special considerations for fencing will need to be addressed. Let's wait for David's statement on this. Cheers, Gibi -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Horizon and Keystone: API Versions and Discovery
On Tue, Oct 21, 2014 at 8:34 AM, Jamie Lennox jamielen...@redhat.com wrote: The only reason is that I didn't want to introduce a global variable cache in a library. The session should be a fairly long running object and i'm looking at ways we could serialize it to allow horizon/CLIs to manage it themselves. A quicker way would be to make the discovery cache an actual object and allow horizon/CLIs to handle that seperately to the session/auth plugin. I don't know which they would prefer. We need a generalized caching layer in all of the clients, a session/auth cache is just another instance of that. I've been working under the assumption for OSC that I'd be doing most of that work and that it would live in OSC initially. I like the idea of a cache object that the app subclasses and hands back to the Session, then anything using the Session can make the callbacks. dt -- Dean Troyer dtro...@gmail.com ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [DevStack] Proposal - add support for Markdown for docs
Hi, I have a lot of documentation around DevStack and some configurations that I run for a multi-node lab, that uses Neutron and IPv6. I would love to contribute to them, but they are currently in Markdown format. Looking at the current docs, they are all currently in raw HTML, and I was hoping that it would be possible to add Pelican[1], which would make it easier to contribute documentation since we wouldn't have to write raw HTML. This may help make contributing docs more seamless. What does everyone think? [1]: http://blog.getpelican.com/ -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [devstack] Enable LVM ephemeral storage for Nova
Hello, I would like to add to DevStack the ability to stand up Nova with LVM ephemeral storage. Below is a draft of the blueprint describing the proposed feature. Suggestions on architecture, implementation and the blueprint in general are very welcome. Best, Dan

Enable LVM ephemeral storage for Nova

Currently DevStack supports only file-based ephemeral storage for Nova, e.g., raw and qcow2. This is an obstacle to Tempest testing of Nova with LVM ephemeral storage, which in the past has been inadvertently broken (see for example https://bugs.launchpad.net/nova/+bug/1373962), and to Tempest testing of new features based on LVM ephemeral storage, such as LVM ephemeral storage encryption. To enable Nova to come up with LVM ephemeral storage it must be provided a volume group. Based on an initial discussion with Dean Troyer, this is best achieved by creating a single volume group for all services that potentially need LVM storage; at the moment these are Nova and Cinder.

Implementation of this feature will:

* move code in lib/cinder/cinder_backends/lvm to lib/lvm with appropriate modifications
* rename the Cinder volume group to something generic, e.g., devstack-vg
* modify the Cinder initialization and cleanup code appropriately to use the new volume group
* initialize the volume group in stack.sh, shortly before services are launched
* clean up the volume group in unstack.sh after the services have been shut down

The question of how large to make the common Nova-Cinder volume group in order to enable LVM ephemeral Tempest testing will have to be explored. Although, given the tiny instance disks used in Nova Tempest tests, the current Cinder volume group size may already be adequate. No new configuration options will be necessary, assuming the volume group size will not be made configurable. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel] Pluggable framework in Fuel: first prototype ready
Hi, As for a separate section for plugins, I think we should not force it and should leave this decision to the plugin developer, so he can create just a single checkbox, a section of the settings tab, or a separate tab, depending on plugin functionality. Plugins should be able to modify arbitrary release fields. For example, if Ceph were a plugin, it should be able to extend the wizard config to add new options to the Storage pane. If vCenter were a plugin, it should be able to set the maximum number of Compute nodes to 0.

2014-10-20 21:21 GMT+07:00 Evgeniy L e...@mirantis.com: Hi guys,

*Roman's questions:*

I feel like we should not require the user to unpack the plugin before installing it. Moreover, we may choose to distribute plugins in our own format, which we may potentially change later. E.g. lbaas-v2.0.fp.

I like the idea of putting plugin installation functionality in the fuel client, which is installed on the master node. But in the current version plugin installation requires file operations on the master, and as a result we can have problems if the user's fuel-client is installed in another env. What we can do is try to determine where fuel-client is installed: if it's the master node, we can perform the installation; if it isn't, we can show the user a message that in the current version remote plugin installation is not supported. In the next versions, if we implement a plugin manager (which is a separate service for plugin management), we will be able to do it remotely.

How are we planning to distribute fuel plugin builder and its updates?

Yes, as Mike mentioned, our plan is to release it on PyPI, the Python package repository, so any developer will be able to run `pip install fpb` and get the tool.

What happens if an error occurs during plugin installation?

The plugin installation process is very simple; our plan is to have some kind of transaction, to make it atomic: 1. register the plugin via the API, 2. copy the files. In case of an error on the 1st step, we can do nothing; in case of an error on the 2nd step, we remove the files if there are any, delete the plugin via the REST API, and show the user a message.

What happens if an error occurs during plugin execution?

In the first iteration we are going to interrupt deployment if there are any errors in the plugin's tasks. We are also thinking about how to improve this; for example, we wanted to provide a special flag for each task, like fail_deployment_on_error, and only if it's true would we fail the deployment in case of a failed task. But that can be tricky to implement - it requires changing the current orchestrator/nailgun error-handling logic - so I'm not sure if we can implement this logic in the first release. Regarding meaningful error messages, yes, we want to show the user which plugin caused the error.

Shall we consider a separate place in UI (tab) for plugins?

+1 to Mike's answer

When are we planning to focus on the 2 plugins which were identified as must-haves for 6.0? Cinder LBaaS

For Cinder we are going to implement a plugin which configures GlusterFS as a Cinder backend: if the user has installed a GlusterFS cluster, we can configure our Cinder to work with it. I want to mention that we don't install GlusterFS nodes, we just configure Cinder to work with the user's GlusterFS cluster. Stanislaw B. already wrote some scripts which configure Cinder to work with GlusterFS, so we are at the testing stage. Regarding LBaaS, Stanislaw B. did a multinode implementation; the HA implementation is tricky and requires some additional work, and we are working on it.
Nathan's questions: Looks like Mike answered the UI-related questions.

Do we offer any kind of validation for settings on plug-ins? Or some way for the developer to ensure that settings that cannot be defaulted or computed get requested for the plug-in?

Yes, each field can have a regexp which is used during the validation.

*Mike's questions:*

One minor thing from me, which I forgot to mention during the demo: verbosity of fpb run. I understand it might sound like bikeshedding now, but I believe if we develop it right from the very beginning, then we can save some time later. So I would suggest a normal, short INFO output, and a verbose one with --debug.

Agree. Thanks for your feedback,

On Sun, Oct 19, 2014 at 1:11 PM, Mike Scherbakov mscherba...@mirantis.com wrote: Hi all, I moved this conversation to openstack-dev to get a broader audience, since we started to discuss technical details. Raw notes from the demo session: https://etherpad.openstack.org/p/cinder-neutron-plugins-second-demo. Let me start answering a few questions below from Roman and Nathan.

How are we planning to distribute fuel plugin builder and its updates?

Ideally, it should be available externally (outside of the master node). I don't want us to repeat the same mistake as we did with the Fuel client, which doesn't seem to be usable as an external dependency. The plan was to have Fuel
Re: [openstack-dev] [Nova] Cells conversation starter
Hi, to help the discussion, here is a small compilation of the bugs and previous attempts to fix the missing functionality in cells.

Aggregates:
https://bugs.launchpad.net/nova/+bug/1161208
https://blueprints.launchpad.net/nova/+spec/cells-aggregate-support
https://review.openstack.org/#/c/25813/

Server Groups:
https://bugs.launchpad.net/nova/+bug/1369518

Security Groups:
https://bugs.launchpad.net/nova/+bug/1274325

Belmiro

On Tue, Oct 21, 2014 at 10:31 AM, Nikola Đipanov ndipa...@redhat.com wrote: On 10/20/2014 08:00 PM, Andrew Laski wrote: One of the big goals for the Kilo cycle by users and developers of the cells functionality within Nova is to get it to a point where it can be considered a first-class citizen of Nova. Ultimately I think this comes down to getting it tested by default in Nova jobs, and making it easy for developers to work with. But there's a lot of work to get there. In order to raise awareness of this effort, and get the conversation started on a few things, I've summarized a little bit about cells and this effort below.

Goals: Testing of a single cell setup in the gate. Feature parity. Make cells the default implementation. Developers write code once and it works for cells. Ultimately the goal is to improve maintainability of a large feature within the Nova code base.

Thanks for the write-up Andrew! Some thoughts/questions below. Looking forward to the discussion on some of these topics, and would be happy to review the code once we get to that point.

Feature gaps: Host aggregates. Security groups. Server groups.

Shortcomings: Flavor syncing - this needs to be addressed now. Cells scheduling/rescheduling, and the fact that instances cannot currently move between cells - these two won't affect the default one-cell setup, so they will be addressed later.

What does cells do: Schedule an instance to a cell based on flavor slots available. Proxy API requests to the proper cell. Keep a copy of instance data at the global level for quick retrieval. Sync data up from a child cell to keep the global level up to date.

Simplifying assumptions: Cells will be treated as a two-level tree structure.

Are we thinking of making this official by removing code that actually allows cells to be an actual tree of depth N? I am not sure if doing so would be a win - although it does complicate the RPC/Messaging/State code a bit, if it's not being used, even though a nice generalization, why keep it around?

Plan: Fix flavor breakage in the child cell which causes boot tests to fail. Currently the libvirt driver needs flavor.extra_specs, which is not synced to the child cell. Some options are to sync flavors and extra specs to the child cell db, or pass full data with the request. https://review.openstack.org/#/c/126620/1 offers a means of passing full data with the request. Determine the proper switches to turn off Tempest tests for features that don't work, with the goal of getting a voting job. Once this is in place we can move towards feature parity and work on internal refactorings. Work towards adding parity for host aggregates, security groups, and server groups. They should be made to work in a single cell setup, but the solution should not preclude them from being used in multiple cells. There needs to be some discussion as to whether a host aggregate or server group is a global concept or a per-cell concept.

Have there been any previous discussions on this topic? If so I'd really like to read up on those to make sure I understand the pros and cons before the summit session.
Work towards merging compute/api.py and compute/cells_api.py so that developers only need to make changes/additions in one place. The goal is for as much as possible to be hidden by the RPC layer, which will determine whether a call goes to a compute/conductor/cell.

For syncing data between cells, look at using objects to handle the logic of writing data to the cell/parent and then syncing the data to the other.

Some of that work has been done already, although in a somewhat ad-hoc fashion - were you thinking of extending objects to support this natively (whatever that means), or do we continue to inline the code in the existing object methods?

A potential migration scenario is to consider a non-cells setup to be a child cell, so that converting to cells will mean setting up a parent cell and linking them. There are periodic tasks in place to sync data up from a child already, but a manual kick-off mechanism will need to be added.

Future plans: Something that has been considered, but is out of scope for now, is that the parent/api cell doesn't need the same data model as the child cell. Since the majority of what it does is act as a cache for API requests, it does not need all the data that a cell needs and what data it does need could be stored
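Returning to the object-sync idea for a moment: one way to picture objects handling it natively is a save() that writes locally and then pushes the change across the tree (purely illustrative pseudocode, not the current cells implementation - both helpers are hypothetical):

    from nova.objects import base


    class Instance(base.NovaObject):
        # ...fields elided...

        def save(self, context):
            updates = self.obj_get_changes()
            self._db_save(context, updates)            # hypothetical
            if getattr(context, 'is_child_cell', False):
                # Push the same change up so the parent's cached copy
                # of the instance stays in sync.
                self._sync_up_to_parent(context, updates)  # hypothetical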
Re: [openstack-dev] [cinder][nova] Are disk-intensive operations managed ... or not?
For LVM-thin I believe it is already disabled? It is only really needed on LVM-thick, where the returning-zeros behaviour is not done. On 21 October 2014 08:29, Avishay Traeger avis...@stratoscale.com wrote: I would say that wipe-on-delete is not necessary in most deployments. Most storage backends exhibit the following behavior:

1. Delete volume A that has data on physical sectors 1-10
2. Create new volume B
3. Read from volume B before writing, which happens to map to physical sector 5 - backend should return zeroes here, and not data from volume A

In case the backend doesn't provide this rather standard behavior, data must be wiped immediately. Otherwise, the only risk is physical security, and if that's not adequate, customers shouldn't be storing all their data there regardless. You could also run a periodic job to wipe deleted volumes to reduce the window of vulnerability, without making delete_volume take a ridiculously long time. Encryption is a good option as well, and of course it protects the data before deletion as well (as long as your keys are protected...)

Bottom line - I too think the default in devstack should be to disable this option, and think we should consider making the default False in Cinder itself. This isn't the first time someone has asked why volume deletion takes 20 minutes... As for queuing backup operations and managing bandwidth for various operations, ideally this would be done with a holistic view, so that for example Cinder operations won't interfere with Nova, or different Nova operations won't interfere with each other, but that is probably far down the road. Thanks, Avishay

On Tue, Oct 21, 2014 at 9:16 AM, Chris Friesen chris.frie...@windriver.com wrote: On 10/19/2014 09:33 AM, Avishay Traeger wrote: Hi Preston, Replies to some of your cinder-related questions: 1. Creating a snapshot isn't usually an I/O intensive operation. Are you seeing I/O spike or CPU? If you're seeing CPU load, I've seen the CPU usage of cinder-api spike sometimes - not sure why. 2. The 'dd' processes that you see are Cinder wiping the volumes during deletion. You can either disable this in cinder.conf, or you can use a relatively new option to manage the bandwidth used for this. IMHO, deployments should be optimized to not do very long/intensive management operations - for example, use backends with efficient snapshots, use CoW operations wherever possible rather than copying full volumes/images, disabling wipe on delete, etc. In a public-cloud environment I don't think it's reasonable to disable wipe-on-delete. Arguably it would be better to use encryption instead of wipe-on-delete. When done with the backing store, just throw away the key and it'll be secure enough for most purposes. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Duncan Thomas ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
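For anyone looking for the knobs being discussed: they live in cinder.conf. Something like the following disables the wipe, or keeps it but bounds its cost (option names as of Juno - worth double-checking against your release's config reference):

    [DEFAULT]
    # Skip zeroing LVM volumes on delete entirely...
    volume_clear = none
    # ...or keep wiping but bound how much is wiped and how fast:
    # volume_clear = zero
    # volume_clear_size = 100            # MiB to wipe; 0 means the whole volume
    # volume_copy_bps_limit = 100000000  # bytes/sec cap on copy/clear I/O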
[openstack-dev] oslotest 1.2.0
The Oslo team is pleased to announce the release of oslotest 1.2.0, our first release for the Kilo cycle. This release includes: cfdb562 Updated from global requirements 118885e Set install_command in tox to avoid pre-releases e5c14b7 Add an extra parameter for test directory in debugger script 356f060 Handle tempfile content encoding 2793ad9 Work toward Python 3.4 support and testing b3610f0 Add links to best practices video and etherpad e815376 Updated from global requirements b2f8b9d Drop .sh extension from oslo_debug_helper.sh d77d23e Add history/changelog to docs aa9c845 fix typo and formatting in contributing docs c5040bb warn against sorting requirements Please report issues via launchpad: https://launchpad.net/oslotest ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [DevStack] Proposal - add support for Markdown for docs
On Tue, Oct 21, 2014 at 9:58 AM, Collins, Sean sean_colli...@cable.comcast.com wrote: Looking at the current docs, they are all currently in raw HTML, and I was hoping that it would be possible to add Pelican[1], which would make it easier to contribute documentation since we wouldn't have to write raw HTML. This may help make contributing docs more seamless. Sean, I assume you realize that by standing up and saying this you become the first volunteer to do the work, right? ;) Seriously, I wouldn't be against something like this and in fact would rather like to refresh the HTML docs. Now that devstack.org is generated on each commit again (thanks anteaya!) we have the ability to do so. Questions that come to mind for further discussion: * Is there a 'good enough' reason to do something new that is different from the rest of docs.openstack.org? Bringing in the legacy site is one thing, we should have a good reason for adding yet another new build dependency. * I'm interested in the opinions of the readership if the shocco-generated script pages are useful. One of the reasons DevStack is in shell script is to make it readable by everyone and the literate-style formatting is to help with that. dt -- Dean Troyer dtro...@gmail.com ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [DevStack] Proposal - add support for Markdown for docs
I do see mentions of gh-pages in the build_docs.sh script - is devstack.org a redirect to GitHub? -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [DevStack] Proposal - add support for Markdown for docs
On Tue, Oct 21, 2014 at 11:33:44AM EDT, Dean Troyer wrote: On Tue, Oct 21, 2014 at 9:58 AM, Collins, Sean sean_colli...@cable.comcast.com wrote: Looking at the current docs, they are all currently in raw HTML, and I was hoping that it would be possible to add Pelican[1], which would make it easier to contribute documentation since we wouldn't have to write raw HTML. This may help make contributing docs more seamless. Sean, I assume you realize that by standing up and saying this you become the first volunteer to do the work, right? ;) Yep - I already have a branch that adds it as a dependency, and I'm starting to hack on it right now. I just figured I'd check on the ML and seek the core reviewers' blessing :) -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Summit] Coordination between OpenStack lower layer virt stack (libvirt, QEMU/KVM)
On Tue, Oct 21, 2014 at 11:52:26AM +0000, Tim Bell wrote: It would also be interesting if the features of KVM could be made available through OpenStack around the same time. virtio-blk data plane would be an example where we can't work out how to exploit it out of the box under OpenStack.

The rate of change in QEMU (KVM) is pretty enormous and many features are not entirely relevant to OpenStack needs. The hard bit, though, is actually figuring out how to best expose new features via OpenStack. With the cloud paradigm we explicitly want to prevent the end user (the VM instance / image owner) from knowing anything about the compute host hardware. The result is that they will typically not have sufficient knowledge of the system to be able to know whether some new QEMU flag or feature is appropriate to use. So we have to try to design things so that Nova either makes a self-driven policy decision, or defines a way for the user or cloud admin to provide hints/preferences via image metadata and/or flavour extra specs, and then uses these hints to influence the policy decision indirectly.

The NUMA work is a good example of this. QEMU provides a feature to let the app define a mapping between guest NUMA nodes and host NUMA nodes. The cloud user has no knowledge of the host NUMA topology, so it is impossible for them to take the simple approach to using QEMU's NUMA feature. Instead we have to provide a way for the user or admin to define characteristics of the guest NUMA topology, and then have the Nova scheduler figure out how to map that onto the host NUMA topology. This is a pretty major bit of work over and above what QEMU provides, so there's inevitably going to be a delay between a feature appearing in QEMU and it being available in OpenStack.

On your specific point about the virtio-blk dataplane option: libvirt has explicitly not provided any supported way to turn that feature on in the guest XML config, on advice from the QEMU maintainers. This is because the dataplane option is considered a short-term hack/experiment to prove a general conceptual idea. Libvirt does allow for QEMU command line option passthrough, but that taints a VM instance as being in an unsupported state. The reason for this is that the dataplane option will be removed from QEMU in favour of a different, supportable long-term solution, so neither libvirt nor QEMU maintainers wish it to be used by production-facing apps right now.

The long-term replacement for dataplane is a new I/O threads option that was recently wired up into QEMU and libvirt. It would be appropriate to look at how to support this I/O threads option in OpenStack now, so please feel free to file a bug requesting it. If you have specific info about usage/deployment scenarios in which dataplane has proved a benefit (or equally a negative), that would be useful to have in the bug too, so we can figure out how to best support it. Ideally we'd not have to expose this to end users, and Nova would just do the right thing to maximise the performance win.

Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
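The NUMA case shows the indirection in practice: the user states a guest-visible preference through flavor extra specs, and the scheduler maps it onto hosts. E.g., with python-novaclient (the flavor name and values are examples; hw:numa_nodes is the key from the Juno NUMA work):

    from novaclient.v1_1 import client

    nova = client.Client('admin', 'secret', 'admin',
                         'http://keystone.example.com:5000/v2.0')
    flavor = nova.flavors.find(name='m1.numa')
    # Ask for a two-node guest NUMA topology; Nova decides how (and
    # where) that maps onto host NUMA nodes at scheduling time.
    flavor.set_keys({'hw:numa_nodes': '2'})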
Re: [openstack-dev] [nova] Pulling nova/virt/hardware.py into nova/objects/
As there are multiple interfaces using non-versioned dicts and as we are looking at reducing technical debt by Kilo, there are different blueprints which can be worked in parallel.

I don't think I disagree with anything above, but I'm not sure what you're getting at. I think the parallelism we should avoid is building models that mirror things we need to send to a remote service and just require conversion. If we're modeling things between the virt driver and the RT (or the interface to the RT) that get aggregated or transformed in some way before they leave the service, then that's fine.

Here, the virt-to-RT interface has to be objectified, hence Dan's work. On the other end of the RT, the RT-to-scheduler interface has to be objectified, hence Jay's and my work.

Right, but if the RT moves out of the compute service, then we'll need to be using versioned models, regardless of whether it's RPC or REST or whatever. So *if* that's going to happen, building an unversioned object hierarchy that then necessarily has to be converted to another form might be work we don't need to do.

I hope we will provide a clear big picture and a roadmap for the Summit so we can give you more insight.

Right, since that part is still not fully defined, it's hard to know the best course of action going forward. I'd hate to delay any of this work until after summit, but if you think that would be the most efficient, I guess it's only a couple weeks away.

Totally agreed. Here there is no need to version the interface, as the virt/RT interface is not RPC-based and is purely internal to nova-compute.

Well, unless the RT is moved outside the compute node, which is (I think) what is being proposed, no?

--Dan ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [all] periodic jobs for master
Hi all, introducing a new auxiliary feature (e.g. a new messaging backend; some specific configuration of common services, like multiple workers in neutron; a new db driver supported by oslo.db; a plugin that lacks its own third-party CI, like linuxbridge in neutron...) in infra usually means creating a separate job that gates all the patches (sometimes non-voting). It requires a lot of resources on the infra side, and for voting jobs it increases the chance of the whole run failing due to intermittent problems in the gate. So there is a push to avoid adding more gating jobs to projects.

I fully support that approach, though I don't think we should leave the code without any kind of integration testing against master. Lack of such testing means it's hard to propose a change in the default components used in the gate (like a switch to an eventlet-aware db driver that I am trying to pursue [1]).

For stable branches, we have so-called periodic jobs that are triggered once in a while against the current code in a stable branch, and report to the openstack-stable-maint@ mailing list. An example of a failing periodic job report can be found at [2]. I envision that a similar approach can be applied to test auxiliary features in the gate. Then, once something is broken in master, the interested parties behind the auxiliary feature will be informed in due time.

Now, we could say that functional testing for a component that includes the feature should be enough. But that approach does not seem applicable for system-wide changes like switching to Qpid or running all services against another db driver, nor for cases when the service to be tested with a new feature is tightly coupled with core (another neutron plugin).

Note that I may be missing something on the infra side - e.g. the approach may actually already be applied in some cases unknown to me, or there may be some concerns with the approach. Tell me.

[1]: https://review.openstack.org/#/c/125044/
[2]: http://lists.openstack.org/pipermail/openstack-stable-maint/2014-October/002794.html

Cheers, /Ihar ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] Neutron documentation to update about new vendor plugin, but without code in repository?
Hi Kyle, Can you please comment on this discussion and confirm the requirements for getting an out-of-tree mechanism_driver listed in the supported plugin/driver list of the OpenStack Neutron docs. Thanks, Vad -- On Mon, Oct 20, 2014 at 12:48 PM, Anne Gentle a...@openstack.org wrote: On Mon, Oct 20, 2014 at 2:42 PM, Vadivel Poonathan vadivel.openst...@gmail.com wrote: Hi, On Fri, Oct 10, 2014 at 7:36 PM, Kevin Benton blak...@gmail.com wrote: I think you will probably have to wait until after the summit so we can see the direction that will be taken with the rest of the in-tree drivers/plugins. It seems like we are moving towards removing all of them so we would definitely need a solution to documenting out-of-tree drivers as you suggested. [Vad] While I'm waiting for the conclusion on this subject, I'm trying to set up the third-party CI/test system and meet its requirements to get my mechanism_driver listed in the Kilo documentation, in parallel. A couple of questions/confirmations before I proceed further in this direction... 1) Is there anything more required other than the third-party CI/test requirements? For example, do I still need to go through the entire development process of submit/review/approval of the blueprint and code of my ML2 driver, which was already developed and in use? The neutron PTL Kyle Mestery can answer if there are any additional requirements. 2) Who is the authority to clarify and confirm the above (and how do I contact them)? Elections just completed, and the newly elected PTL is Kyle Mestery, http://lists.openstack.org/pipermail/openstack-dev/2014-March/031433.html . Thanks again for your inputs... Regards, Vad -- On Tue, Oct 14, 2014 at 3:17 PM, Anne Gentle a...@openstack.org wrote: On Tue, Oct 14, 2014 at 5:14 PM, Vadivel Poonathan vadivel.openst...@gmail.com wrote: Agreed on the requirements of test results to qualify the vendor plugin to be listed in the upstream docs. Is there any procedure/infrastructure currently available for this purpose? Please forward any links/pointers to that info. Here's a link to the third-party testing setup information: http://ci.openstack.org/third_party.html Feel free to keep asking questions as you dig deeper. Thanks, Anne Thanks, Vad -- On Mon, Oct 13, 2014 at 10:25 PM, Akihiro Motoki amot...@gmail.com wrote: I agree with Kevin and Kyle. Even if we decided to use a separate tree for neutron plugins and drivers, they still will be regarded as part of the upstream. These plugins/drivers need to prove they are well integrated with Neutron master in some way, and gating integration proves it is well tested and integrated. I believe this is a reasonable assumption and requirement for a vendor plugin/driver to be listed in the upstream docs. This is the same kind of question as which vendor plugins are tested and worth documenting in the upstream docs. I hope you work with the neutron team and follow the third-party requirements. Thanks, Akihiro On Tue, Oct 14, 2014 at 10:09 AM, Kyle Mestery mest...@mestery.com wrote: On Mon, Oct 13, 2014 at 6:44 PM, Kevin Benton blak...@gmail.com wrote: The OpenStack dev and docs team don't have to worry about gating/publishing/maintaining the vendor specific plugins/drivers. I disagree about the gating part. If a vendor wants to have a link that shows they are compatible with openstack, they should be reporting test results on all patches. A link to a vendor driver in the docs should signify some form of testing that the community is comfortable with. I agree with Kevin here. 
If you want to play upstream, in whatever form that takes by the end of Kilo, you have to work with the existing third-party requirements and team to take advantage of being a part of things like upstream docs. Thanks, Kyle On Mon, Oct 13, 2014 at 11:33 AM, Vadivel Poonathan vadivel.openst...@gmail.com wrote: Hi, If the plan is to move ALL existing vendor specific plugins/drivers out-of-tree, then having a place-holder within the OpenStack domain would suffice, where the vendors can list their plugins/drivers along with their documentation on how to install and use them, etc. The main OpenStack Neutron documentation page can explain the plugin framework (ml2 type drivers, mechanism drivers, service plugins and so on) and its purpose/usage etc., then provide a link to the currently supported vendor specific plugins/drivers for more details. That way the documentation will be accurate to what is in-tree and limit the documentation of external plugins/drivers to just a reference link. So it's now the vendor's responsibility to keep their drivers up to date and their documentation accurate. The OpenStack dev and docs teams don't have to worry about gating/publishing/maintaining the vendor specific plugins/drivers. The built-in drivers such
Re: [openstack-dev] Combination of Heat ResourceGroup(index) with Fn::Select doesn't work?
use an OS::Neutron::PoolMember instead. Then each member template can add itself to the pool. From: Magesh GV [magesh...@oneconvergence.com] Sent: Tuesday, October 21, 2014 12:07 AM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] Combination of Heat ResourceGroup(index) with Fn::Select doesn't work? I am trying to find a way of creating a dynamic list of resources (Loadbalancer PoolMembers, to be exact) using Heat. The idea is that the number of PoolMembers and the required addresses would be received as Heat parameters. However, I am unable to get %index% working inside an Fn::Select block. Is this a bug with Heat or am I doing something wrong? If this is a bug/limitation in Heat, is there some other way to get what I am trying to do working with Heat? IMO this is a very important use case for %index%. Parameters: { NumberOfMembers: { Description: Number of Pool Members to be created, Type: Number, Default: 1 }, MembersList: { Description: Pool Member IP Address, Type: Json, Default: {key0:11.0.0.43} } }, MemberList: { Type: OS::Heat::ResourceGroup, Properties: { count: {Ref:NumberOfMembers}, resource_def: { type: OS::Neutron::PoolMember, properties: { address: { Fn::Select : [ key%index%, {Ref:MembersList}] }, admin_state_up: true, pool_id: {Ref:HaproxyPool}, protocol_port: 80, weight: 1 } } } } Regards, Magesh ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
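[Editor's note: for anyone trying to reproduce this, it helps to see what ResourceGroup's %index% expansion amounts to. The sketch below is an illustration of the substitution idea, not Heat's actual implementation; if the engine substitutes %index% only in some parts of resource_def, a select key like key%index% never becomes key0 and the Fn::Select lookup fails, which would match the symptom reported above.]

    # Illustrative sketch (not Heat's code) of per-member %index%
    # substitution over a resource_def structure.
    import json

    def expand_member(resource_def, index):
        """Return a copy of resource_def with %index% replaced everywhere."""
        def subst(node):
            if isinstance(node, dict):
                return {subst(k): subst(v) for k, v in node.items()}
            if isinstance(node, list):
                return [subst(v) for v in node]
            if isinstance(node, str):
                return node.replace("%index%", str(index))
            return node
        return subst(resource_def)

    member = {"type": "OS::Neutron::PoolMember",
              "properties": {"address": {"Fn::Select":
                                         ["key%index%",
                                          {"key0": "11.0.0.43"}]}}}
    print(json.dumps(expand_member(member, 0)))
    # -> the select key is now "key0", so Fn::Select can resolve it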
Re: [openstack-dev] [DevStack] Proposal - add support for Markdown for docs
On 10/21/2014 10:37 AM, Collins, Sean wrote: I do see mentions of gh-pages in the build_docs.sh script - is devstack.org redirect to GitHub? Nope - it used to be. It's now built and published like the rest of openstack docs. Now - I don't want to get in the way of the work you're wanting to contribute, because I'm pretty sure that's awesome ... but all of the rest of the OpenStack docs are in RST and not MD, so it seems strange to do new work in a different format. That said - devstack is a bit of a different beast, so maybe it doesn't matter? ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ironic] [Triple-O] Openstack Onboarding
Images that are premade and ready to go would be a huge step in the right direction. You currently are expected to make them all yourself, which involves a lot of work/knowledge. It's great to be able to build them, but right out of the gate, they are too much work for a new user. Thanks, Kevin From: Adam Lawson [alaw...@aqorn.com] Sent: Monday, October 20, 2014 4:47 PM To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [Ironic] [Triple-O] Openstack Onboarding I made a similar comment to the Triple-O design summit etherpad in hopes others have a similar interest in Kilo, but I wanted to share and evangelize my thoughts with the community for discussion: For better or for worse, one thing I've heard over and over is how the Openstack community/TC approves/prefers the use of TripleO and Ironic to deploy Openstack on bare metal. Cool, but for the majority of users considering using Openstack in their organization, the question always goes back to: If I'm not savvy enough yet to install Openstack without these tools, how do I set up TripleO and Ironic? Seems like a chicken-and-egg thing. There has not been much discussion (that I've noticed) regarding making the deployment process easy to erect. That should be the easy part but it's as confusing as the second part for most who are starting out. Using Openstack to deploy Openstack means the installer method should be straightforward and itself should be easy to install for users with limited understanding of Openstack or the tooling methods used by OOO and Ironic. But the bar to use Openstack continues to be a relatively high engineering hurdle. It always has been, and I'd love to see that change in the next cycle. Something that comes to mind: * Setup Process Definition * Quickstart Wizards * Tooling The above may seem to be dumbing down the process, but widespread Openstack adoption requires an easy on-boarding process and so far, it simply doesn't exist. Thoughts? Adam Lawson AQORN, Inc. 427 North Tatnall Street Ste. 58461 Wilmington, Delaware 19801-2230 Toll-free: (844) 4-AQORN-NOW ext. 101 International: +1 302-387-4660 Direct: +1 916-246-2072 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [DevStack] Proposal - add support for Markdown for docs
On Tue, Oct 21, 2014 at 11:59:12AM EDT, Monty Taylor wrote: On 10/21/2014 10:37 AM, Collins, Sean wrote: I do see mentions of gh-pages in the build_docs.sh script - is devstack.org redirect to GitHub? Nope - it used to be. It's now built and published like the rest of openstack docs. Now - I don't want to get in the way of the work you're wanting to contribute, because I'm pretty sure that's awesome ... but all of the rest of the OpenStack docs are in RST and not MD, so it seems strange to do new work in a different format. Actually after I sent the e-mail I checked, and Pelican supports multiple formats, including RST. Changing from Markdown to RST isn't too onerous - I'd rather go from Markdown to RST than from Markdown to HTML. Thinking a little more, it may be worth cribbing the code from other projects, like how the *-specs repos are generating docs from RST files, rather than importing Pelican? -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Pulling nova/virt/hardware.py into nova/objects/
On Mon, Oct 20, 2014 at 01:38:46PM -0400, Jay Pipes wrote: Hi Dan, Dan, Nikola, all Nova devs, OK, so in reviewing Dan B's patch series that refactors the virt driver's get_available_resource() method [1], I am stuck between two concerns. I like (love even) much of the refactoring work involved in Dan's patches. They replace a whole bunch of our nested dicts that are used in the resource tracker with real objects -- and this is something I've been harping on for months that really hinders developers' understanding of Nova's internals. Yep, as you say, one of the problems with understanding WTF is going on in the code is that the interface between resource_tracker.py and virt/driver.py was a completely undocumented dict. Some of the data in the dict got directly copied into the database in whatever format the virt driver sent it in. Other data fields in the dict got over-written by the resource tracker. Other fields got converted into a slightly different format, with extra info added to them. However, all of the object classes that Dan B has introduced have been unversioned objects -- i.e. they have not derived from nova.objects.base.NovaObject. This means that these objects cannot be sent over the wire via an RPC API call. In practical terms, this issue has not yet reared its head, because the resource tracker still sends a dictified JSON representation of the object's fields directly over the wire, in the same format as Icehouse, therefore there have been no breakages in RPC API compatibility. If all the data from the virt driver was going straight into the database or out over the wire, unchanged, then I'd agree that using the versioned objects would clearly make sense. When I started the cleanup though, I got the impression that most of the data from the virt driver got changed/munged in some way before hitting the database or RPC layer. There are also the long-standing discussions about the extensible resource tracker, which would represent data in the database in a completely generic, abstracted way as a list of key/value pairs. So I was imagining that long term what's put in the database by the resource tracker would be in a completely different structure than the data coming out of the virt drivers. Based on that understanding I felt it would be better to define a clear set of classes solely for the data that's coming out of the virt driver, and de-couple this from the objects used for storing stuff in the database. Of course I've not attempted to tackle the full problem space of cleaning up the entire resource tracker codebase. I just focused on the interface to the virt drivers. So as you note, in order to maintain compatibility, I was careful to ensure that the classes I defined were able to serialize into the same JSON format as is currently used in the horrible undocumented dicts. I was not really expecting the to_dict/from_dict/to_json/from_json methods in the virt/hardware.py classes to be something we use long term, though. I was just thinking of them as a temporary stepping stone, and that the rest of the people working on the (extensible) resource tracker would eventually convert the RT code to directly read the attributes in the hardware.py classes and use them to populate whatever data format the RT wants to use long term. In particular what I'd like to see is that the virt driver be decoupled from long-term changes in the resource tracker code data formats. 
e.g. if someone comes along in the L cycle and decides the resource tracker/scheduler would be much more effective if the data was persisted in a new format X, then we ought to avoid having to change the virt/driver.py and virt/hardware.py APIs/classes. The RT code would just use the existing classes and convert into whatever fancy new format is better. The problems with having all these objects not modelled by deriving from nova.objects.base.NovaObject are two-fold: * The object's fields/schema cannot be changed -- or rather, cannot be changed without introducing upgrade problems. * The objects introduce a different way of serializing the object contents than is used in nova/objects -- it's not that much different, but it's different, and has only not caused a problem because the serialization routines are not yet being used to transfer data over the wire. So, what to do? Clearly, I think the nova/virt/hardware.py objects are badly needed. However, one of (the top?) priorities of the Nova project is upgradeability, and by not deriving from nova.objects.base.NovaObject, these nova.virt.hardware objects are putting that mission in jeopardy, IMO. My proposal is that before we go and approve any BPs or patches that add to nova/virt/hardware.py, we first put together a patch series that moves the object models in nova/virt/hardware.py to being full-fledged objects in nova/objects/*. I think it really depends on how we see the resource tracker data model evolving over the next few cycles.
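[Editor's note: for readers following along without the patches open, here is a condensed sketch of the plain-class style Daniel describes: a typed model of virt driver data that can still round-trip through the legacy dict format the resource tracker consumes. The class name and fields are illustrative, not the actual nova/virt/hardware.py code.]

    # Illustrative sketch only -- not the real nova code. A plain Python
    # class documents the virt driver data, while to_dict()/from_dict()
    # preserve the legacy undocumented-dict wire format.
    class VirtNUMACell(object):
        def __init__(self, cell_id, cpuset, memory_mb):
            self.id = cell_id
            self.cpuset = cpuset        # set of host CPU ids in this cell
            self.memory_mb = memory_mb  # memory available in this cell

        def to_dict(self):
            # Serialize into the same JSON-friendly shape as before.
            return {"id": self.id,
                    "cpus": ",".join(str(c) for c in sorted(self.cpuset)),
                    "mem": {"total": self.memory_mb}}

        @classmethod
        def from_dict(cls, data):
            cpuset = set(int(c) for c in data["cpus"].split(","))
            return cls(data["id"], cpuset, data["mem"]["total"])

    cell = VirtNUMACell(0, {0, 1, 2, 3}, 1024)
    assert VirtNUMACell.from_dict(cell.to_dict()).memory_mb == 1024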
Re: [openstack-dev] [nova] Pulling nova/virt/hardware.py into nova/objects/
On Mon, Oct 20, 2014 at 11:12:57AM -0700, Dan Smith wrote: OK, so in reviewing Dan B's patch series that refactors the virt driver's get_available_resource() method [1], I am stuck between two concerns. I like (love even) much of the refactoring work involved in Dan's patches. They replace a whole bunch of our nested dicts that are used in the resource tracker with real objects -- and this is something I've been harping on for months that really hinders developers' understanding of Nova's internals. dict['line1'] = 'Agreed, this is extremely important stuff.' dict['line2'] = 'The current dict mess that we have there is ' dict['line3'] = 'really obscure and confusing.' reply = jsonutils.dumps(dict) However, all of the object classes that Dan B has introduced have been unversioned objects -- i.e. they have not derived from nova.objects.base.NovaObject. This means that these objects cannot be sent over the wire via an RPC API call. In practical terms, this issue has not yet reared its head, because the resource tracker still sends a dictified JSON representation of the object's fields directly over the wire, in the same format as Icehouse, therefore there have been no breakages in RPC API compatibility. Right, so the blueprint for this work states that it's not to be sent over the RPC wire or stored in the database. However, it already is in some cases (at least the ComputeNode object has the unversioned JSONified version of some of these hardware models in it). If the modeling is purely for internal-to-compute-node purposes, then it's all good. However, it surely seems like with the pending scheduler isolation work, we're in a spot where we are building two parallel model hierarchies, and I'm not really sure why. The rationale behind two parallel data model hierarchies is that the format the virt drivers report data in is not likely to be exactly the same as the format that the resource tracker / scheduler wishes to use in the database. If we have a single hierarchy, then whenever we need to update the data format for the scheduler to improve its performance or flexibility, we have a ripple effect where we'd have to update all the virt driver implementations too. Based on what I've seen about the extensible resource tracker, it seems like over time there was going to be greater divergence between what the virt drivers support and how the ERT wants to persist it after transforming it into an easier-to-deal-with structure. My proposal is that before we go and approve any BPs or patches that add to nova/virt/hardware.py, we first put together a patch series that moves the object models in nova/virt/hardware.py to being full-fledged objects in nova/objects/* I'm not sure that just converting them all to NovaObjects is really necessary here. If it's all stuff that is going to go over the wire eventually as part of the resource tracker's expansion, then probably so. If there are bits of the model that only serve to let the resource tracker do its calculations, then perhaps it doesn't make sense to require those be NovaObjects. Yep, pretty much agree with that position. What's difficult is that we're in a bit of a transition stage where stuff that's reported now does get stuffed straight into the database, but in the future might well undergo translation/calculation prior to being stuffed in the database. If we convert everything into NovaObjects based on what is directly stored in the DB today, long term this might prove to have been unnecessary. 
So I took the pragmatic decision that I'd define some plain Python classes, rather than NovaObject classes, on the basis that it was still an improvement over an undocumented dict. And when the scheduler/resource tracker refactor settles down, we could either stick with the plain Python classes, or convert them into full Nova objects as applicable. Regardless, it sounds like we need some discussion on how best to proceed here. Since it's entirely wrapped up in the scheduler work, we should definitely try to make sure that what we're doing here fits with those plans. Last I heard, we weren't sure where we were going to draw the line between nova bits and scheduler bits, so erring on the side of more versioned interfaces seems safest to me. FWIW, my patch series is logically split up into two parts. The first 10 or so patches are just general cleanup, and useful to Nova regardless of what we decide to do. The second 10 or so patches are where the objects start appearing and getting used - the controversial bits needing more detailed discussion. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
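[Editor's note: for contrast with the plain classes above, the versioned approach adds a declared schema version and the ability to backlevel a payload for an older receiver. The following is a rough, self-contained illustration of that idea; the real mechanism is nova.objects.base.NovaObject, and the class and fields here are made up.]

    # Self-contained illustration of versioned serialization -- not the
    # actual NovaObject implementation.
    class VersionedModel(object):
        VERSION = "1.1"  # bumped whenever the schema changes
        fields = ("id", "cpus", "memory_mb", "pinned")  # 'pinned' added in 1.1

        def __init__(self, **kwargs):
            for name in self.fields:
                setattr(self, name, kwargs.get(name))

        def to_primitive(self, target_version=None):
            data = {name: getattr(self, name) for name in self.fields}
            if target_version == "1.0":
                # Backlevel the payload so a 1.0 receiver never sees
                # a field it does not understand.
                data.pop("pinned", None)
            return {"version": target_version or self.VERSION, "data": data}

    m = VersionedModel(id=0, cpus="0-3", memory_mb=1024, pinned=True)
    print(m.to_primitive(target_version="1.0"))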
Re: [openstack-dev] [nova] Pulling nova/virt/hardware.py into nova/objects/
On Tue, Oct 21, 2014 at 08:46:10AM -0700, Dan Smith wrote: As there are multiple interfaces using non-versioned dicts and as we are looking at reducing technical debt by Kilo, there are different blueprints which can be worked in parallel. I don't think I disagree with anything above, but I'm not sure what you're getting at. I think the parallelism we should avoid is building models that mirror things we need to send to a remote service and just require conversion. If we're modeling things between the virt driver and the RT (or interface to the RT) that get aggregated or transformed in some way before they leave the service, then that's fine. Here, the virt-to-RT interface has to be objectified, hence Dan's work. On the other end of the RT, the RT-to-scheduler interface has to be objectified, hence the work Jay and I are doing. Right, but if the RT moves out of the compute service, then we'll need to be using versioned models, regardless of whether it's RPC or REST or whatever. So *if* that's going to happen, building an unversioned object hierarchy that then necessarily has to be converted to another form might be work we don't need to do. Yep, that's a bit I'm still pretty fuzzy on myself. I have been imagining that while the scheduler would split out of nova, the resource_tracker.py code would still remain part of Nova, because it is the bit that interfaces with the nova virt driver API. We have historically always considered the Nova virt driver API to be an internal-only thing that we retain the right to change at will, so anything talking to it would be in-tree. If resource_tracker.py were to move out of Nova, then this implies that the get_available_resources() method in virt/driver.py is now a de facto stable API between Nova and the scheduler project, that we have to preserve compatibility for. If that is indeed the case then clearly using versioned NovaObjects for that method's return value is going to be required. IMHO though this is not a desirable split of responsibility. Personally I was always expecting that resource_tracker.py would stay part of Nova. It would use the internal get_available_resources() API to talk to the virt driver, and transform the data it got back into whatever format the external scheduler project wants to consume via some formal API, and these formats would likely be completely separate. In that world view, there is no need for the get_available_resources() method to use NovaObject. I hope we will provide a clear big picture and a roadmap for the Summit so we can give you more insights. Right, since that part is still not fully defined, it's hard to know the best course of action going forward. I'd hate to delay any of this work until after summit, but if you think that would be the most efficient, I guess it's only a couple of weeks away. As mentioned in my other mail, my 20-patch series is split into two real pieces. The first 10 or so patches are just general cleanup / prep work that is hopefully fairly uncontroversial to consider merging. If the second 10 or so patches have to wait until after the summit, so be it; that's not a big problem. Getting the code structure right is more important long term than a fast merge. Totally agreed. Here there is no need to version the interface, as the virt/RT interface is not RPC-based and is purely internal to nova-compute. Well, unless the RT is moved outside the compute node, which is (I think) what is being proposed, no? 
Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
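[Editor's note: a tiny sketch may make Daniel's proposed split clearer: the resource tracker consumes the internal virt driver model and emits whatever the (possibly external) scheduler wants, so the virt interface never has to become a stable API. All names below are illustrative, not actual Nova code.]

    # Illustrative only: the transformation boundary Daniel describes.
    import collections

    VirtResources = collections.namedtuple(
        "VirtResources", "vcpus memory_mb memory_mb_used local_gb")

    def report_to_scheduler(virt):
        """Translate the internal virt driver model into the
        scheduler-facing payload; aggregation happens here, not in
        the driver, so the driver API can keep changing at will."""
        return {"vcpus": virt.vcpus,
                "memory_mb": virt.memory_mb,
                "local_gb": virt.local_gb,
                "free_ram_mb": virt.memory_mb - virt.memory_mb_used}

    print(report_to_scheduler(VirtResources(8, 16384, 4096, 500)))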
Re: [openstack-dev] [DevStack] Proposal - add support for Markdown for docs
On Tue, Oct 21, 2014 at 11:12 AM, Collins, Sean sean_colli...@cable.comcast.com wrote: On Tue, Oct 21, 2014 at 11:59:12AM EDT, Monty Taylor wrote: On 10/21/2014 10:37 AM, Collins, Sean wrote: I do see mentions of gh-pages in the build_docs.sh script - is devstack.org redirect to GitHub? Nope - it used to be. It's now built and published like the rest of openstack docs. Now - I don't want to get in the way of the work you're wanting to contribute, because I'm pretty sure that's awesome ... but all of the rest of the OpenStack docs are in RST and not MD, so it seems strange to do new work in a different format. Actually after I sent the e-mail I checked, and Pelican supports multiple formats, including RST. Changing from Markdown to RST isn't too onerous - I'd rather go from Markdown to RST than from Markdown to HTML. Thinking a little more, it may be worth cribbing the code from other projects, like how the *-specs repos are generating docs from RST files, rather than importing Pelican? That seems like a reasonable starting place to me. :) -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Murano] IRC meeting cancelled today
Folks, I'm canceling our weekly IRC meeting today since we've just released Murano 2014.2 and are still working on figuring out the roadmap for the Kilo cycle. Thanks, Ruslan ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] HA of dhcp agents?
We currently have a mechanism for restarting the DHCP agent on another node, but we'd like the new agent to take over all the old networks of the failed dhcp instance. Right now, since dhcp agents are distinguished by host, and the host has to match the host of the ovs agent, and the ovs agent's host has to be unique per node, the new dhcp agent is registered as a completely new agent and doesn't take over the failed agent's networks. I'm looking for a way to give the new agent the same roles as the previous one. -- Noel On Tue, Oct 21, 2014 at 12:12 AM, Kevin Benton blak...@gmail.com wrote: No, unfortunately when the DHCP agent dies there isn't automatic rescheduling at the moment. On Mon, Oct 20, 2014 at 11:56 PM, Noel Burton-Krahn n...@pistoncloud.com wrote: Thanks for the pointer! I like how the first google hit for this is: Add details on dhcp_agents_per_network option for DHCP agent HA https://bugs.launchpad.net/openstack-manuals/+bug/1370934 :) Seems reasonable to set dhcp_agents_per_network > 1. What happens when a DHCP agent dies? Does the scheduler automatically bind another agent to that network? Cheers, -- Noel On Mon, Oct 20, 2014 at 9:03 PM, Jian Wen wenjia...@gmail.com wrote: See dhcp_agents_per_network in neutron.conf. https://bugs.launchpad.net/neutron/+bug/1174132 2014-10-21 6:47 GMT+08:00 Noel Burton-Krahn n...@pistoncloud.com: I've been working on failover for dhcp and L3 agents. I see that in [1], multiple dhcp agents can host the same network. However, it looks like I have to manually assign networks to multiple dhcp agents, which won't work. Shouldn't multiple dhcp agents automatically fail over? [1] http://docs.openstack.org/trunk/config-reference/content/multi_agent_demo_configuration.html ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best, Jian ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Kevin Benton ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
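[Editor's note: until automatic rescheduling exists, the takeover Noel describes can be scripted against the Neutron API. A hedged sketch with python-neutronclient follows; the credentials and the first-live-agent placement policy are placeholder assumptions, while the client calls are the standard v2.0 agent-scheduler methods.]

    # Sketch: move networks from dead DHCP agents to a live one.
    from neutronclient.v2_0 import client

    neutron = client.Client(username="admin", password="secret",  # assumptions
                            tenant_name="admin",
                            auth_url="http://keystone:5000/v2.0")

    agents = neutron.list_agents(agent_type="DHCP agent")["agents"]
    dead = [a for a in agents if not a["alive"]]
    live = [a for a in agents if a["alive"]]

    for agent in dead:
        nets = neutron.list_networks_on_dhcp_agent(agent["id"])["networks"]
        for net in nets:
            neutron.remove_network_from_dhcp_agent(agent["id"], net["id"])
            # Naive placement policy: everything goes to the first live
            # agent; a real tool would balance load across agents.
            neutron.add_network_to_dhcp_agent(live[0]["id"],
                                              body={"network_id": net["id"]})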
Re: [openstack-dev] [DevStack] Proposal - add support for Markdown for docs
On 2014-10-21 10:59:12 -0500 (-0500), Monty Taylor wrote: On 10/21/2014 10:37 AM, Collins, Sean wrote: I do see mentions of gh-pages in the build_docs.sh script - is devstack.org redirect to GitHub? Nope - it used to be. It's now built and published like the rest of openstack docs. [...] Well, actually that's still in progress. The last bit we need is to host a redirect somewhere so that devstack.org/(.*) is 301'd to docs.openstack.org/developer/devstack/$1 instead, and then we change the DNS address records to point to wherever we stuck that redirect. -- Jeremy Stanley ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ironic] [Triple-O] Openstack Onboarding
Excerpts from Adam Lawson's message of 2014-10-20 18:47:36 -0500: I made a similar comment to the Triple-O design summit etherpad in hopes others have a similar interest in Kilo, but I wanted to share and evangelize my thoughts with the community for discussion: For better or for worse, one thing I've heard over and over is how the Openstack community/TC approves/prefers the use of TripleO and Ironic to deploy Openstack on bare metal. Cool, but for the majority of users considering using Openstack in their organization, the question always goes back to: If I'm not savvy enough yet to install Openstack without these tools, how do I set up TripleO and Ironic? Seems like a chicken-and-egg thing. There has not been much discussion (that I've noticed) regarding making the deployment process easy to erect. That should be the easy part but it's as confusing as the second part for most who are starting out. Using Openstack to deploy Openstack means the installer method should be straightforward and itself should be easy to install for users with limited understanding of Openstack or the tooling methods used by OOO and Ironic. But the bar to use Openstack continues to be a relatively high engineering hurdle. It always has been, and I'd love to see that change in the next cycle. Something that comes to mind: - Setup Process Definition - Quickstart Wizards - Tooling The above may seem to be dumbing down the process, but widespread Openstack adoption requires an easy on-boarding process and so far, it simply doesn't exist. You're so right, we do need that. And it is something we've talked about a lot amongst ourselves, but we clearly haven't publicized it. Basically what you described is what Tuskar is intended to be. The basic idea for a TripleO install remains the same as when we first defined it: * Boot a seed VM on a machine physically attached to the provisioning network. The seed VM is an Ironic-based single-machine OpenStack. * Inform the seed's Ironic about the first real machine you want it to deploy as a deployment cloud (aka undercloud) machine. * Deploy the first deployment cloud machine. * Shut down the seed. [1] * Inform the deployment cloud of inventory for all hardware. * Deploy the user cloud (aka overcloud) So Tuskar would be a part of that deployment cloud, and would ask you things about your hardware, your desired configuration, and help you get the inventory loaded. So, ideally our gate would leave the images we test as part of the artifacts for build, and we could just distribute those as part of each release. That probably wouldn't be too hard to do, but those images aren't exactly small, so I would want to have some kind of strategy for distributing them and limiting the unique images users are exposed to, so we're not encouraging people to run CD by downloading each commit's image. There's also, I think, an interesting shift coming that the Kolla project is investigating, which is that perhaps what we should distribute is docker images, not qcow2's. That idea is still pretty new, but we should definitely think about how that might be better for everyone. [1] We don't do this now because there is still some interdependency between seed and undercloud. We're working on it. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] Tentative schedule for Kilo Design Summit in Paris
Thx ~sean On Oct 21, 2014, at 2:18 AM, Thierry Carrez thie...@openstack.org wrote: Sean Roberts wrote: Chance we can move the congress session from Monday 14:30-16:00 to co-locate Tuesday with GBP? either before or after... Let me ask Chris Hoge (who is in charge of the larger ecosystem sessions), see if you could switch from Monday to Tuesday. -- Thierry Carrez (ttx) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [oslo.db] Add long-lived-transactionalized-db-fixtures - can we comment ?
Hi all - Thanks for all the responses I got on my “Make EngineFacade a Facade” spec - plenty of people have commented, pretty much all positively, so I’m pretty confident we can start building the basic idea of that out into a new review. I want to point out that there is another, closely related spec that has been around several weeks longer, which is to overhaul the capabilities of our test runner system: “Add long-lived-transactionalized-db-fixtures” - https://review.openstack.org/#/c/117335/. I’ve talked about this spec several times before on this list, and it is still out there, and I additionally have had most of the implementation working for several weeks now. The spec and implementation have been mostly twisting in the wind, partially due to a little bit of waiting for some possible changes to namespace packages and test invocation, however I’d like to reiterate that A. the whole series works right now independently of those changes (https://review.openstack.org/#/c/117335/, https://review.openstack.org/#/c/120870/), and B. the spec describing the system has hardly been +1/-1’ed by anyone in any case (comments, positive or negative, are appreciated! I redid the whole thing in response to previous comments many weeks ago). Just to try to pitch this series, yet again, here’s what we get: 1. a solid and extensible system of building up and tearing down databases (provisioning), as well as a DROP ALL of database objects, in a database-specific way (e.g. drops special objects like PG ENUM objects and such). This is completed and works right now. 2. the ability to produce “transactionalized” test fixtures, where you can run as many tests as you want against a single schema that remains in place; each test instead has all of its changes rolled back inside of a transaction. In particular this will make it lots easier for large test suites like Nova’s DB API suite to run against many databases efficiently, as it won’t have to drop and rebuild the whole schema for each test. This mechanism is completed and works right now; as soon as it’s merged I can start doing a proof of concept for Nova’s test_db_api.py. 3. the ability to run non-transactionalized tests like we do now, which going forward would remain appropriate at least for migration tests, on a fixed database-per-subprocess, emitting an unconditional DROP of all objects remaining in the schema at the end of each test without actually dropping the whole database. Completed and works right now; I’ve done a test against Neutron’s migration tests and the optimising suite system works. 4. An overhaul to how connectivity for multiple databases is set up, e.g. Postgresql, MySQL, others. The usual system of “opportunistic” looking around for backends remains unchanged, but you can affect the specific URLs that will be queried, as well as limit the test run to any particular database URL, using an environment variable. The system also supports other databases besides the three of SQLite, PG, and MySQL now, whereas it had some issues before which would prevent that. Completed and works right now! 5. The ability to have a single test suite run automatically for any number of backends, including future backends that might not be added to oslo.db yet, replacing the current system of subclassing MySQLOpportunisticTest and PostgresqlOpportunisticTest. Not completed! But is fairly trivial. 6. 
Once the EngineFacade overhaul is in place, the two systems will integrate together, so that it will be very simple for projects whose test suite currently runs off of CONF + sqlite to use the new system just by dropping in a mixin class. The four “pillars” I’m trying to get through, hopefully by the end of Kilo, are: 1. application connectivity and transaction control, 2. test connectivity and transactions, 3. query modernization, and 4. getting ready for Alembic (where I add SQLite support and multiple branch support). We definitely need #1 and #2, so I’d like to get continued feedback on #2 so I can point that in the correct direction. Thanks all for your support! ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
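[Editor's note: the core trick behind item 2 is the well-known SQLAlchemy pattern of joining a session into an externally managed transaction and rolling it back in teardown. A minimal sketch follows, assuming a single module-level engine and an in-memory URL as stand-ins; it is not the oslo.db fixture implementation itself.]

    # Minimal sketch of a "transactionalized" fixture: build the schema
    # once, wrap every test in a transaction, roll it back afterwards.
    import unittest

    import sqlalchemy
    from sqlalchemy.orm import sessionmaker

    engine = sqlalchemy.create_engine("sqlite://")  # stand-in URL

    class TransactionalFixtureTest(unittest.TestCase):
        def setUp(self):
            self.conn = engine.connect()
            self.trans = self.conn.begin()           # outermost transaction
            self.session = sessionmaker(bind=self.conn)()

        def tearDown(self):
            self.session.close()
            self.trans.rollback()                    # the test's writes vanish
            self.conn.close()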
Re: [openstack-dev] [Ironic] disambiguating the term discovery
Having written/worked on a few DC automation tools, I've typically broken down the process of getting unknown hardware into production into 4 distinct stages. 1) Discovery (the discovery of unknown hardware) 2) Normalising (push initial configs like drac/imm/ilo settings, flashing to known good firmware, etc.) 3) Analysis (figure out what the hardware is and what its constituent parts are: cpu/ram/disk/IO caps/serial numbers, etc.) 4) Burnin (run linpack or equivalent tests for 24hrs) At the end of stage 4 the hardware should be ready for provisioning. Hope that helps Stuart On Tue, Oct 21, 2014 at 2:38 AM, Lucas Alvares Gomes lucasago...@gmail.com wrote: On Tue, Oct 21, 2014 at 10:27 AM, Sam Betts (sambetts) sambe...@cisco.com wrote: I agree with Devananda's definition of 'hardware discovery' and other tools similar to Ironic use the term discovery in this way, however I have found that these other tools often bundle the gathering of the system properties together with the discovery of the hardware as a single step from a user perspective. I also agree that in Ironic there needs to be a separate term for that (at least from a dev perspective) and I think Lucas's suggestion of 'hardware interrogation' or something like 'hardware inventory' would be more explanatory at first glance than 'introspection'. Thanks for the suggestion, but no inventory please; this is another taboo word in Ironic. This is because when we say hardware inventory it kinda suggests that Ironic could be used as a CMDB, which is not the case. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- BR, Stuart ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Automatic evacuate
- Original Message - -Original Message- From: Russell Bryant [mailto:rbry...@redhat.com] Sent: October 21, 2014 15:07 To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Nova] Automatic evacuate On 10/21/2014 06:44 AM, Balázs Gibizer wrote: Hi, Sorry for the top posting but it was hard to fit my complete view inline. I'm also thinking about a possible solution for automatic server evacuation. I see two separate sub problems of this problem: 1)compute node monitoring and fencing, 2)automatic server evacuation Compute node monitoring is currently implemented in servicegroup module of nova. As far as I understand pacemaker is the proposed solution in this thread to solve both monitoring and fencing but we tried and found out that pacemaker_remote on baremetal does not work together with fencing (yet), see [1]. So if we need fencing then either we have to go for normal pacemaker instead of pacemaker_remote but that solution doesn't scale or we configure and call stonith directly when pacemaker detect the compute node failure. I didn't get the same conclusion from the link you reference. It says: That is not to say however that fencing of a baremetal node works any differently than that of a normal cluster-node. The Pacemaker policy engine understands how to fence baremetal remote-nodes. As long as a fencing device exists, the cluster is capable of ensuring baremetal nodes are fenced in the exact same way as normal cluster-nodes are fenced. So, it sounds like the core pacemaker cluster can fence the node to me. I CC'd David Vossel, a pacemaker developer, to see if he can help clarify. It seems there is a contradiction between chapter 1.5 and 7.2 in [1] as 7.2 states: There are some complications involved with understanding a bare-metal node's state that virtual nodes don't have. Once this logic is complete, pacemaker will be able to integrate bare-metal nodes in the same way virtual remote-nodes currently are. Some special considerations for fencing will need to be addressed. Let's wait for David's statement on this. Hey, That's me! I can definitely clear all this up. First off, this document is out of sync with the current state upstream. We're already past Pacemaker v1.1.12 upstream. Section 7.2 of the document being referenced is still talking about future v1.1.11 features. I'll make it simple. If the document references anything that needs to be done in the future, it's already done. Pacemaker remote is feature complete at this point. I've accomplished everything I originally set out to do. I see one change though. In 7.1 I talk about wanting pacemaker to be able to manage resources in containers. I mention something about libvirt sandbox. I scrapped whatever I was doing there. Pacemaker now has docker support. https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker I've known this document is out of date. It's on my giant list of things to do. Sorry for any confusion. As far as pacemaker remote and fencing goes, remote-nodes are fenced the exact same way as cluster-nodes. The only consideration that needs to be made is that the cluster-nodes (nodes running the full pacemaker+corosync stack) are the only nodes allowed to initiate fencing. All you have to do is make sure the fencing devices you want to use to fence remote-nodes are accessible to the cluster-nodes. From there you are good to go. Let me know if there's anything else I can clear up. Pacemaker remote was designed to be the solution for the exact scenario you all are discussing here. 
Compute nodes and pacemaker remote are made for one another :D If anyone is interested in prototyping pacemaker remote for this compute node use case, make sure to include me. I have done quite a bit research into how to maximize pacemaker's ability to scale horizontally. As part of that research I've made a few changes that are directly related to all of this that are not yet in an official pacemaker release. Come to me for the latest rpms and you'll have a less painful experience setting all this up :) -- Vossel Cheers, Gibi -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] Combination of Heat ResourceGroup(index) with Fn::Select doesn't work?
Hi We are in fact using the OS::Neutron::PoolMember resource. I guess ResourceGroup is the only iterative construct in Heat. Is the use case supported today? I think this is more than a simple usage question, hence posting it here. Thank you. Regards Subra On Tue, Oct 21, 2014 at 8:55 AM, Fox, Kevin M kevin@pnnl.gov wrote: use an OS::Neutron::PoolMember instead. Then each member template can add itself to the pool. -- *From:* Magesh GV [magesh...@oneconvergence.com] *Sent:* Tuesday, October 21, 2014 12:07 AM *To:* openstack-dev@lists.openstack.org *Subject:* [openstack-dev] Combination of Heat ResourceGroup(index) with Fn::Select doesn't work? I am trying to find a way of creating a dynamic list of resources (Loadbalancer PoolMembers, to be exact) using Heat. The idea is that the number of PoolMembers and the required addresses would be received as Heat parameters. However, I am unable to get %index% working inside an Fn::Select block. Is this a bug with Heat or am I doing something wrong? If this is a bug/limitation in Heat, is there some other way to get what I am trying to do working with Heat? IMO this is a very important use case for %index%. Parameters: { NumberOfMembers: { Description: Number of Pool Members to be created, Type: Number, Default: 1 }, MembersList: { Description: Pool Member IP Address, Type: Json, Default: {key0:11.0.0.43} } }, MemberList: { Type: OS::Heat::ResourceGroup, Properties: { count: {Ref:NumberOfMembers}, resource_def: { type: OS::Neutron::PoolMember, properties: { address: { Fn::Select : [ key%index%, {Ref:MembersList}] }, admin_state_up: true, pool_id: {Ref:HaproxyPool}, protocol_port: 80, weight: 1 } } } } Regards, Magesh ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Thanks OSM (Subrahmanyam Ongole) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ironic] disambiguating the term discovery
I fully and wholeheartedly agree that inventory management is out of scope for Ironic. But I have a small suggestion: we'd do well as a community to adopt/evangelize an informal rule which I enforce at work (because I see this happen a lot when brainstorming with cross-project goals); we cannot say no (X) without suggesting an alternative (Y)... Like a runner throwing his baton at the next guy in the race instead of handing it to him. ; ) Back on topic however, is there an existing program where inventory data (consumed by Ironic or any other program that needs to know the configuration of hardwareX) could be stored? I.e. a hardware catalog? Adam Lawson AQORN, Inc. 427 North Tatnall Street Ste. 58461 Wilmington, Delaware 19801-2230 Toll-free: (844) 4-AQORN-NOW ext. 101 International: +1 302-387-4660 Direct: +1 916-246-2072 On Tue, Oct 21, 2014 at 10:29 AM, Stuart Fox stu...@demonware.net wrote: Having written/worked on a few DC automation tools, I've typically broken down the process of getting unknown hardware into production into 4 distinct stages. 1) Discovery (the discovery of unknown hardware) 2) Normalising (push initial configs like drac/imm/ilo settings, flashing to known good firmware, etc.) 3) Analysis (figure out what the hardware is and what its constituent parts are: cpu/ram/disk/IO caps/serial numbers, etc.) 4) Burnin (run linpack or equivalent tests for 24hrs) At the end of stage 4 the hardware should be ready for provisioning. Hope that helps Stuart On Tue, Oct 21, 2014 at 2:38 AM, Lucas Alvares Gomes lucasago...@gmail.com wrote: On Tue, Oct 21, 2014 at 10:27 AM, Sam Betts (sambetts) sambe...@cisco.com wrote: I agree with Devananda's definition of 'hardware discovery' and other tools similar to Ironic use the term discovery in this way, however I have found that these other tools often bundle the gathering of the system properties together with the discovery of the hardware as a single step from a user perspective. I also agree that in Ironic there needs to be a separate term for that (at least from a dev perspective) and I think Lucas's suggestion of 'hardware interrogation' or something like 'hardware inventory' would be more explanatory at first glance than 'introspection'. Thanks for the suggestion, but no inventory please; this is another taboo word in Ironic. This is because when we say hardware inventory it kinda suggests that Ironic could be used as a CMDB, which is not the case. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- BR, Stuart ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Automatic evacuate
On 10/21/2014 07:53 PM, David Vossel wrote: - Original Message - -Original Message- From: Russell Bryant [mailto:rbry...@redhat.com] Sent: October 21, 2014 15:07 To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Nova] Automatic evacuate On 10/21/2014 06:44 AM, Balázs Gibizer wrote: Hi, Sorry for the top posting but it was hard to fit my complete view inline. I'm also thinking about a possible solution for automatic server evacuation. I see two separate sub problems of this problem: 1)compute node monitoring and fencing, 2)automatic server evacuation Compute node monitoring is currently implemented in servicegroup module of nova. As far as I understand pacemaker is the proposed solution in this thread to solve both monitoring and fencing but we tried and found out that pacemaker_remote on baremetal does not work together with fencing (yet), see [1]. So if we need fencing then either we have to go for normal pacemaker instead of pacemaker_remote but that solution doesn't scale or we configure and call stonith directly when pacemaker detect the compute node failure. I didn't get the same conclusion from the link you reference. It says: That is not to say however that fencing of a baremetal node works any differently than that of a normal cluster-node. The Pacemaker policy engine understands how to fence baremetal remote-nodes. As long as a fencing device exists, the cluster is capable of ensuring baremetal nodes are fenced in the exact same way as normal cluster-nodes are fenced. So, it sounds like the core pacemaker cluster can fence the node to me. I CC'd David Vossel, a pacemaker developer, to see if he can help clarify. It seems there is a contradiction between chapter 1.5 and 7.2 in [1] as 7.2 states: There are some complications involved with understanding a bare-metal node's state that virtual nodes don't have. Once this logic is complete, pacemaker will be able to integrate bare-metal nodes in the same way virtual remote-nodes currently are. Some special considerations for fencing will need to be addressed. Let's wait for David's statement on this. Hey, That's me! I can definitely clear all this up. First off, this document is out of sync with the current state upstream. We're already past Pacemaker v1.1.12 upstream. Section 7.2 of the document being referenced is still talking about future v1.1.11 features. I'll make it simple. If the document references anything that needs to be done in the future, it's already done. Pacemaker remote is feature complete at this point. I've accomplished everything I originally set out to do. I see one change though. In 7.1 I talk about wanting pacemaker to be able to manage resources in containers. I mention something about libvirt sandbox. I scrapped whatever I was doing there. Pacemaker now has docker support. https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker I've known this document is out of date. It's on my giant list of things to do. Sorry for any confusion. As far as pacemaker remote and fencing goes, remote-nodes are fenced the exact same way as cluster-nodes. The only consideration that needs to be made is that the cluster-nodes (nodes running the full pacemaker+corosync stack) are the only nodes allowed to initiate fencing. All you have to do is make sure the fencing devices you want to use to fence remote-nodes are accessible to the cluster-nodes. From there you are good to go. Let me know if there's anything else I can clear up. 
Pacemaker remote was designed to be the solution for the exact scenario you all are discussing here. Compute nodes and pacemaker remote are made for one another :D If anyone is interested in prototyping pacemaker remote for this compute node use case, make sure to include me. I have done quite a bit research into how to maximize pacemaker's ability to scale horizontally. As part of that research I've made a few changes that are directly related to all of this that are not yet in an official pacemaker release. Come to me for the latest rpms and you'll have a less painful experience setting all this up :) -- Vossel Hi Vossel, Could you send us a link to the source RPMs please, we have tested on CentOS7. It might need a recompile. Thank you! Geza Cheers, Gibi -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org
Re: [openstack-dev] [Nova] Automatic evacuate
- Original Message - On Thu, Oct 16, 2014 at 7:48 PM, Jay Pipes jaypi...@gmail.com wrote: While one of us (Jay or me) speaking for the other and saying we agree is a distributed consensus problem that dwarfs the complexity of Paxos You've always had a way with words, Florian :) I knew you'd like that one. :) , *I* for my part do think that an external toolset (i.e. one that lives outside the Nova codebase) is the better approach versus duplicating the functionality of said toolset in Nova. I just believe that the toolset that should be used here is Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former approach leads to *much* fewer necessary code changes *in* Nova than the latter. I agree with you that Corosync/Pacemaker is the tool of choice for monitoring/heartbeat functionality, and is my choice for compute-node-level HA monitoring. For guest-level HA monitoring, I would say use Heat/Ceilometer. For container-level HA monitoring, it looks like fleet or something like Kubernetes would be a good option. Here's why I think that's a bad idea: none of these support the concept of being subordinate to another cluster. Again, suppose a VM stops responding. Then Heat/Ceilometer/Kubernetes/fleet would need to know whether the node hosting the VM is down or not. Only if the node is up or recovered (which Pacemaker would be responsible for) would the VM HA facility be able to kick in. Effectively you have two views of the cluster membership, and that sort of thing always gets messy. In the HA space we're always facing the same issues when a replication facility (Galera, GlusterFS, DRBD, whatever) has a different view of the cluster membership than the cluster manager itself — which *always* happens for a few seconds on any failover, recovery, or fencing event. Russell's suggestion, by having remote Pacemaker instances on the compute nodes tie in with a Pacemaker cluster on the control nodes, does away with that discrepancy. I'm curious to see how the combination of compute-node-level HA and container-level HA tools will work together in some of the proposed deployment architectures (bare metal + docker containers w/ OpenStack and infrastructure services run in a Kubernetes pod or CoreOS fleet). I have absolutely nothing against an OpenStack cluster using *exclusively* Kubernetes or fleet for HA management, once those have reached sufficient maturity. It's not about reaching sufficient maturity for these two projects. They are on the wrong path to achieve proper HA. Kubernetes and fleet (I'll throw geard into the mix as well) do a great job at distributed management of containers. The difference is instead of integrating with a proper HA stack (like Nova is) kubernetes and fleet are attempting their own HA. In doing this, they've unknowingly blown the scope of their respective projects way beyond what they originally set out to do. Here's the problem. HA is both very misunderstood and deceivingly difficult to achieve. System-wide deterministic failover behavior is not a matter of monitoring and restarting failed containers. For kubernetes and fleet to succeed, they will need to integrate with a proper HA stack like pacemaker. Below are some presentation slides on how I envision pacemaker interacting with container orchestration tools. https://github.com/davidvossel/phd/blob/master/doc/presentations/HA_Container_Overview_David_Vossel.pdf?raw=true -- Vossel But just about every significant OpenStack distro out there has settled on Corosync/Pacemaker for the time being. 
Let's not shove another cluster manager down their throats for little to no real benefit. Cheers, Florian ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
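For readers unfamiliar with the remote-Pacemaker layout discussed above, a minimal sketch of what it looks like, assuming a pcs-managed cluster on the control plane (host names, resource names, and package names are illustrative and vary by distro):

# On each compute node: run the lightweight pacemaker_remote daemon
# instead of a full corosync/pacemaker stack.
yum install -y pacemaker-remote resource-agents
systemctl enable pacemaker_remote
systemctl start pacemaker_remote

# On the control-plane cluster: model the compute node as a remote node,
# so one cluster holds the single authoritative view of membership.
pcs resource create compute-1 ocf:pacemaker:remote server=compute-1.example.com

With this layout the cluster on the control nodes is the only membership authority, which is exactly the discrepancy-avoidance point made in the thread.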
[openstack-dev] [OSSN 0025] Possible Glance image exposure via Swift
Possible Glance image exposure via Swift --- ### Summary ### Glance is able to use Swift as a back end for storing virtual machine images. When Glance is configured this way (in multi-tenant mode only), it is possible for unauthenticated users to access public virtual machine images directly from Swift, even though Glance restricts access to those images to authenticated users. ### Affected Services / Software ### Glance, Swift, Havana, Icehouse ### Discussion ### The 'delay_auth_decision' Swift variable modifies the ACLs to either require authentication via Keystone or allow unauthenticated access. When 'delay_auth_decision' is set to '1' the Swift ACL uses a wildcard (*) to accept all incoming responses. When Glance is configured for multi-tenant mode, this will allow all tenants as well as unauthenticated users to have access to the Swift 'public' images. This can happen when Swift and Glance are configured in the following fashion:

- begin example Swift proxy-server.conf snippet
delay_auth_decision = 1
- end example Swift proxy-server.conf snippet

- begin example glance-api.conf snippet
default_store = swift
swift_store_multi_tenant = True
swift_store_create_container_on_put = True
- end example glance-api.conf snippet

One way to discover the URL is to take a snapshot of a public image. The URL for the snapshot combined with the owner ID of the public image will allow for the Swift URL of the public image to be inferred. This URL can then be utilized anonymously to download the image. ### Recommended Actions ### If your Swift and Glance services are configured in such a way that they are vulnerable, it is recommended that Swift image requests are audited to determine if an unauthorized image request was made. By default when images are accessed a message is logged to the Swift log file. Setting the Swift 'delay_auth_decision' value to '0' (False) will require Keystone authentication to access the Swift containers, and is only recommended for environments using Keystone for authentication. Modifying the Glance configuration to not use Swift in multi-tenant mode will mitigate the issue, but may introduce other issues depending on what configuration is used. Implementing an alternative back end (such as Ceph) will also remove the issue, but it will require additional knowledge on how to securely install and configure the new object storage service. The Swift and Glance configuration items are specific to a given environment, so test configurations before deploying them in a production environment. ### Contacts / References ### This OSSN : https://wiki.openstack.org/wiki/OSSN/OSSN-0025 Original LaunchPad Bug : https://bugs.launchpad.net/glance/+bug/1354512 OpenStack Security ML : openstack-secur...@lists.openstack.org OpenStack Security Group : https://launchpad.net/~openstack-ossg ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
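For clarity, a remediated configuration along the lines recommended above might look like the following, shown in the advisory's own snippet style (the authtoken filter section name is the conventional one; check your own paste pipeline before copying):

- begin example Swift proxy-server.conf snippet
[filter:authtoken]
delay_auth_decision = 0
- end example Swift proxy-server.conf snippet

With this setting, Keystone authentication is enforced before Swift evaluates its ACLs, closing the anonymous-access path described in the discussion above.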
Re: [openstack-dev] [devstack] Enable LVM ephemeral storage for Nova
As a side-note, the new AWS flavors seem to indicate that the Amazon infrastructure is moving to all ECS volumes (and all flash, possibly), both ephemeral and not. This makes sense, as fewer code paths and less interoperability complexity is a good thing. It seems likely that the same balance of concerns should apply in OpenStack. On Tue, Oct 21, 2014 at 7:59 AM, Dan Genin daniel.ge...@jhuapl.edu wrote: Hello, I would like to add to DevStack the ability to stand up Nova with LVM ephemeral storage. Below is a draft of the blueprint describing the proposed feature. Suggestions on architecture, implementation and the blueprint in general are very welcome. Best, Dan Enable LVM ephemeral storage for Nova Currently DevStack supports only file based ephemeral storage for Nova, e.g., raw and qcow2. This is an obstacle to Tempest testing of Nova with LVM ephemeral storage, which in the past has been inadvertently broken (see for example, https://bugs.launchpad.net/nova/+bug/1373962), and to Tempest testing of new features based on LVM ephemeral storage, such as LVM ephemeral storage encryption. To enable Nova to come up with LVM ephemeral storage it must be provided a volume group. Based on an initial discussion with Dean Troyer, this is best achieved by creating a single volume group for all services that potentially need LVM storage; at the moment these are Nova and Cinder. Implementation of this feature will: * move code in lib/cinder/cinder_backends/lvm to lib/lvm with appropriate modifications * rename the Cinder volume group to something generic, e.g., devstack-vg * modify the Cinder initialization and cleanup code appropriately to use the new volume group * initialize the volume group in stack.sh, shortly before services are launched * cleanup the volume group in unstack.sh after the services have been shutdown The question of how large to make the common Nova-Cinder volume group in order to enable LVM ephemeral Tempest testing will have to be explored. Although, given the tiny instance disks used in Nova Tempest tests, the current Cinder volume group size may already be adequate. No new configuration options will be necessary, assuming the volume group size will not be made configurable. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
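For anyone who wants to experiment before this lands in DevStack, a rough sketch of standing up a dedicated volume group on a loopback device, in the same style DevStack already uses for Cinder (file path, size, and volume group name below are placeholders):

# Create a backing file and expose it as a loop device
backing_file=/opt/stack/data/devstack-vg-backing-file
truncate -s 10G $backing_file
vg_dev=$(sudo losetup -f --show $backing_file)

# Create the shared volume group proposed in the blueprint
sudo vgcreate devstack-vg $vg_dev

# Nova would then be pointed at it, roughly:
# [libvirt]
# images_type = lvm
# images_volume_group = devstack-vg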
Re: [openstack-dev] [Ironic] disambiguating the term discovery
On Tue, Oct 21, 2014 at 11:26 AM, Adam Lawson alaw...@aqorn.com wrote: I fully and wholeheartedly agree that inventory management is out of scope of Ironic. But I have a small suggestion: We'd do well as a community to adopt/evangelize an informal rule which I enforce at work (because I see this happen a lot when brainstorming with cross-project goals); we cannot say no (X) without suggesting an alternative (Y)... Like a runner throwing his baton at the next guy in the race instead of handing it to him. ; ) Back on topic however, is there an existing program where inventory data (consumed by Ironic or any other program that needs to know the configuration of hardwareX) could be stored? I.e. hardware catalog? Nothing within OpenStack yet does this, and I am not aware of any stackforge projects providing an inventory database / hardware catalog / CMDB. That said, I have been encouraging people to look at integration between existing (enterprise or opensource) inventory management systems and Ironic. My discussions with enterprise CMDB vendors have so far been positive around the current logical boundary (Ironic is a provisioning tool, not a full CMDB). -Devananda ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Cells conversation starter
On 10/21/2014 04:31 AM, Nikola Đipanov wrote: On 10/20/2014 08:00 PM, Andrew Laski wrote: One of the big goals for the Kilo cycle by users and developers of the cells functionality within Nova is to get it to a point where it can be considered a first class citizen of Nova. Ultimately I think this comes down to getting it tested by default in Nova jobs, and making it easy for developers to work with. But there's a lot of work to get there. In order to raise awareness of this effort, and get the conversation started on a few things, I've summarized a little bit about cells and this effort below. Goals: Testing of a single cell setup in the gate. Feature parity. Make cells the default implementation. Developers write code once and it works for cells. Ultimately the goal is to improve maintainability of a large feature within the Nova code base. Thanks for the write-up Andrew! Some thoughts/questions below. Looking forward to the discussion on some of these topics, and would be happy to review the code once we get to that point. Feature gaps: Host aggregates Security groups Server groups Shortcomings: Flavor syncing This needs to be addressed now. Cells scheduling/rescheduling Instances can not currently move between cells These two won't affect the default one cell setup so they will be addressed later. What does cells do: Schedule an instance to a cell based on flavor slots available. Proxy API requests to the proper cell. Keep a copy of instance data at the global level for quick retrieval. Sync data up from a child cell to keep the global level up to date. Simplifying assumptions: Cells will be treated as a two level tree structure. Are we thinking of making this official by removing code that actually allows cells to be an actual tree of depth N? I am not sure if doing so would be a win, although it does complicate the RPC/Messaging/State code a bit, but if it's not being used, even though a nice generalization, why keep it around? My preference would be to remove that code since I don't envision anyone writing tests to ensure that functionality works and/or doesn't regress. But there's the challenge of not knowing if anyone is actually relying on that behavior. So initially I'm not creating a specific work item to remove it. But I think it needs to be made clear that it's not officially supported and may get removed unless a case is made for keeping it and work is put into testing it. Plan: Fix flavor breakage in child cell which causes boot tests to fail. Currently the libvirt driver needs flavor.extra_specs which is not synced to the child cell. Some options are to sync flavor and extra specs to child cell db, or pass full data with the request. https://review.openstack.org/#/c/126620/1 offers a means of passing full data with the request. Determine proper switches to turn off Tempest tests for features that don't work with the goal of getting a voting job. Once this is in place we can move towards feature parity and work on internal refactorings. Work towards adding parity for host aggregates, security groups, and server groups. They should be made to work in a single cell setup, but the solution should not preclude them from being used in multiple cells. There needs to be some discussion as to whether a host aggregate or server group is a global concept or per cell concept. Have there been any previous discussions on this topic? If so I'd really like to read up on those to make sure I understand the pros and cons before the summit session. 
The only discussion I'm aware of is some comments on https://review.openstack.org/#/c/59101/ , though they mention a discussion at the Utah mid-cycle. The main con I'm aware of for defining these as global concepts is that there is no rescheduling capability in the cells scheduler. So if a build is sent to a cell with a host aggregate that can't fit that instance the build will fail even though there may be space in that host aggregate from a global perspective. That should be somewhat straightforward to address though. I think it makes sense to define these as global concepts. But these are features that aren't used with cells yet so I haven't put a lot of thought into potential arguments or cases for doing this one way or another. Work towards merging compute/api.py and compute/cells_api.py so that developers only need to make changes/additions in one place. The goal is for as much as possible to be hidden by the RPC layer, which will determine whether a call goes to a compute/conductor/cell. For syncing data between cells, look at using objects to handle the logic of writing data to the cell/parent and then syncing the data to the other. Some of that work has been done already, although in a somewhat ad-hoc fashion. Were you thinking of extending objects to support this natively (whatever that means), or do we continue to inline the code in the existing object methods?
Re: [openstack-dev] [devstack] Enable LVM ephemeral storage for Nova
Sharing the vg with cinder is likely to cause some pain when testing proposed features where cinder reconciles its backend with the cinder db. Creating a second vg on the same backing device is easy and avoids all such problems. Duncan Thomas On Oct 21, 2014 4:07 PM, Dan Genin daniel.ge...@jhuapl.edu wrote: Hello, I would like to add to DevStack the ability to stand up Nova with LVM ephemeral storage. Below is a draft of the blueprint describing the proposed feature. ... ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [devstack] Enable LVM ephemeral storage for Nova
Did you mean EBS? I thought it was generally hard to get the same kind of performance from block storage that local ephemeral storage provides, but perhaps Amazon has found a way. Life would certainly be much simpler with a single ephemeral backend. Storage pools (https://blueprints.launchpad.net/nova/+spec/use-libvirt-storage-pools) should provide some of the same benefits. On 10/21/2014 02:54 PM, Preston L. Bannister wrote: As a side-note, the new AWS flavors seem to indicate that the Amazon infrastructure is moving to all ECS volumes (and all flash, possibly), both ephemeral and not. This makes sense, as fewer code paths and less interoperability complexity is a good thing. ... ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [devstack] Enable LVM ephemeral storage for Nova
Do you mean that Cinder will be confused by Nova's volumes in its volume group? Yeah, sure, that would be similarly easy to implement. Thank you for the suggestion! Dan On 10/21/2014 03:10 PM, Duncan Thomas wrote: Sharing the vg with cinder is likely to cause some pain when testing proposed features where cinder reconciles its backend with the cinder db. Creating a second vg on the same backing device is easy and avoids all such problems. Duncan Thomas ... ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [devstack] Enable LVM ephemeral storage for Nova
Yes, I meant EBS not ECS. Too many similar acronyms... The thing about the Amazon folk is that they collect a lot of metrics, and pretty much do everything on a fairly empirical basis. This is a huge advantage. I started thinking about what I could do with good metrics, building on the performance characteristics of flash. Turns out ... I can see how this could work (and very, very well). But that requires a much longer write-up than I have time for at the moment. On Tue, Oct 21, 2014 at 12:11 PM, Dan Genin daniel.ge...@jhuapl.edu wrote: Did you mean EBS? I thought it was generally hard to get the same kind of performance from block storage that local ephemeral storage provides, but perhaps Amazon has found a way. ...
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] Combination of Heat ResourceGroup(%index%) with Fn::Select doesn't work?
On 21/10/14 14:05, Subrahmanyam Ongole wrote: Hi We are in fact using OS::Neutron::PoolMember resource. I guess ResourceGroup is the only iterative construct in Heat. Is the use case supported today? I think this is more than a simple usage question, hence posting it here. Thank you. Regards Subra On Tue, Oct 21, 2014 at 8:55 AM, Fox, Kevin M kevin@pnnl.gov wrote: use a OS::Neutron::PoolMember instead. Then each member template can add itself to the pool. -- *From:* Magesh GV [magesh...@oneconvergence.com] *Sent:* Tuesday, October 21, 2014 12:07 AM *To:* openstack-dev@lists.openstack.org *Subject:* [openstack-dev] Combination of Heat ResourceGroup(%index%) with Fn::Select doesn't work? I am trying to find a way of creating a dynamic List of Resources (Loadbalancer PoolMembers to be exact) using Heat. The idea is that the number of PoolMembers and the required Addresses would be received as Heat parameters. However, I am unable to get %index% working inside a Fn::Select block. Yeah, it doesn't work inside intrinsic functions. Is this a bug with Heat or am I doing something wrong? It's arguable ;) If this is a bug/limitation in heat is there some other way to get what I am trying to do working with heat? You should define the PoolMember near the server it is linking to the pool, not in a separate ResourceGroup. Group together the things you want to scale as a group in a separate template and scale them as a unit. IMO this is a very important usecase for the %index%. I'm very uneasy about %index% existing at all, and I'm not at all convinced that allowing something like this would achieve anything beyond making bad template design more tempting. Can you talk more about why you need to define the PoolMember resources in a different place to the actual pool members? cheers, Zane.

Parameters: {
  NumberOfMembers: {
    Description: Number of Pool Members to be created,
    Type: Number,
    Default: 1
  },
  MembersList: {
    Description: Pool Member IP Address,
    Type: Json,
    Default: {key0: 11.0.0.43}
  }
},
MemberList: {
  Type: OS::Heat::ResourceGroup,
  Properties: {
    count: {Ref: NumberOfMembers},
    resource_def: {
      type: OS::Neutron::PoolMember,
      properties: {
        address: {Fn::Select: [key%index%, {Ref: MembersList}]},
        admin_state_up: true,
        pool_id: {Ref: HaproxyPool},
        protocol_port: 80,
        weight: 1
      }
    }
  }
}

Regards, Magesh ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
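A sketch of the pattern Zane describes, in HOT syntax, where the pool membership is defined next to the server being balanced so no address ever has to be selected out of a list (the template file name, image, and flavor are placeholders):

# member.yaml - the unit that is scaled
heat_template_version: 2013-05-23
parameters:
  pool_id: {type: string}
resources:
  server:
    type: OS::Nova::Server
    properties:
      image: my-image    # placeholder
      flavor: m1.small   # placeholder
  member:
    type: OS::Neutron::PoolMember
    properties:
      pool_id: {get_param: pool_id}
      address: {get_attr: [server, first_address]}
      protocol_port: 80

# in the parent template
  members:
    type: OS::Heat::ResourceGroup
    properties:
      count: {get_param: NumberOfMembers}
      resource_def:
        type: member.yaml
        properties:
          pool_id: {get_resource: HaproxyPool}

Because each member reads its own server's address, the MembersList parameter and the Fn::Select lookup disappear entirely.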
[openstack-dev] [OSSN 0039] Configuring OpenStack deployments to prevent POODLE attacks
Configuring OpenStack deployments to prevent POODLE attacks --- ### Summary ### POODLE (CVE-2014-3566) is a new attack on SSLv3 that allows an active network-based attacker to recover the plaintext from a secure connection using a CBC-mode cipher. Unfortunately, all other cipher modes in SSLv3 are also insecure. Therefore, the recommended solution is to disable SSLv3. We also discuss an alternative option below. Proper mitigation requires addressing this issue on SSLv3 clients and servers. ### Affected Services / Software ### Any service using SSLv3. Depending on the backend SSL library, this can include many components of an OpenStack cloud: - OpenStack services - OpenStack clients - Web servers (Apache, Nginx, etc) - SSL/TLS terminators (Stud, Pound, etc) - Proxy services (HAProxy, etc) - Miscellaneous services (eventlet, syslog, ldap, smtp, etc) ### Discussion ### The POODLE attack was first announced on 14 Oct 2014 [1]. For a deeper technical discussion on POODLE, we refer you to the security advisory at openssl.org [2] and Daniel Franke's write-up [3]. POODLE affects any SSL/TLS connection that can be downgraded to SSLv3. This requires both the client and the server to support SSLv3. Due to the way the protocol negotiations work, an attacker positioned on the network between the client and the server can force a downgrade to SSLv3 by selectively dropping network packets. The best remediation for POODLE is to disable SSLv3 on all clients and servers that you control. This will protect you regardless of the mitigation status on the other end of the connection. An alternative option is to deploy TLS_FALLBACK_SCSV, which will prevent the downgrade attack, but could still allow SSLv3 connections if that is the only supported protocol between the client and server. Any connection that happens over SSLv3 using a CBC-mode cipher would still be vulnerable. You can use the OpenSSL s_client tool to test if a server allows SSLv3 connections:

openssl s_client -connect <domain name>:<port> -ssl3

If the server does not support SSLv3, you will see a handshake failure message. This indicates that the server does not accept SSLv3 connections. Assuming this server also has SSLv2 disabled, which is a common default today, then no further configuration is needed. If the handshake from s_client completes, then the server requires some configuration. Note that you can perform this step for any service that has SSL/TLS enabled including OpenStack API endpoints. Testing clients is slightly more cumbersome because there are likely many more clients than servers throughout the cloud. However, this test follows a similar pattern. Using the OpenSSL s_server tool, you can create an endpoint that only accepts SSLv3:

openssl s_server -cert <filename> -key <filename> -state \
  -ssl3 -no_ssl2 -no_tls1 -no_tls1_1 -no_tls1_2 \
  -tlsextdebug

If the client can connect to this endpoint, the client needs to update their configuration as described below. ### Recommended Actions ### We recommend disabling SSLv3 altogether and will provide additional guidance on how to do this below. If SSLv3 is still required on your system due to client compatibility concerns, then TLS_FALLBACK_SCSV is your only choice. In this case you will need an underlying library that supports TLS_FALLBACK_SCSV (such as OpenSSL 1.0.1j). Applications using OpenSSL will automatically start using TLS_FALLBACK_SCSV once OpenSSL is updated. 
You should perform an audit in your cloud to verify that all SSL/TLS services do use this new library:

ldd <path to binary that uses OpenSSL> | grep ssl

Review the output and ensure that it is linked to the new version of OpenSSL that includes TLS_FALLBACK_SCSV support. Disabling SSLv3 can be done at either the application level or the library level. Doing it at the library level ensures consistency throughout the cloud. However, if you are not already compiling OpenSSL then this may not fit into your deployment workflow. In this case, you must consider each application in turn. If you are able to recompile your SSL/TLS library, then this is likely the best option. Disabling SSLv3 at the library level ensures consistency across the system. For OpenSSL, you can use the no-ssl3 build option. Then deploy that library to your cloud and verify that all SSL/TLS services are using the library using the ldd command discussed above. If you are unable to recompile your SSL/TLS library, then you should reconfigure each application that uses SSL/TLS. Each application has a different way to handle this configuration. We provide the configuration option for some of the more common applications below:

Apache: SSLProtocol All -SSLv2 -SSLv3
Nginx: ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
Stud: Requires code change, see [4]. After code change use --tls option.
Pound: Requires code change, see [5]. After code change use DisableSSLv3
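Building on the s_client test above, a quick loop for auditing several endpoints in one pass (the endpoint list is illustrative; 2>&1 is needed because the handshake alert is printed on stderr):

for ep in 192.0.2.10:5000 192.0.2.10:9292 192.0.2.10:8774; do
  out=$(echo | openssl s_client -connect $ep -ssl3 2>&1)
  if echo "$out" | grep -q 'handshake failure'; then
    echo "$ep: SSLv3 rejected (good)"
  else
    echo "$ep: SSLv3 handshake may have succeeded - review configuration"
  fi
done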
Re: [openstack-dev] [neutron] HA of dhcp agents?
As far as I can tell when you specify: dhcp_agents_per_network = X (with X > 1) the server binds the network to all the agents (up to X), which means that you have multiple instances of dnsmasq serving dhcp requests at the same time. If one agent dies, there is no fail-over needed per se, as the other agent will continue to serve dhcp requests unaffected. For instance, in my env I have dhcp_agents_per_network=2, so if I create a network, and list the agents serving the network I will see the following:

neutron dhcp-agent-list-hosting-net test
+--------------------------------------+-------+----------------+-------+
| id                                   | host  | admin_state_up | alive |
+--------------------------------------+-------+----------------+-------+
| 6dd09649-5e24-403b-9654-7aa0f69f04fb | host1 | True           | :-)   |
| 7d47721a-2725-45f8-b7c4-2731cfabdb48 | host2 | True           | :-)   |
+--------------------------------------+-------+----------------+-------+

Isn't that what you're after? Cheers, Armando On 21 October 2014 22:26, Noel Burton-Krahn n...@pistoncloud.com wrote: We currently have a mechanism for restarting the DHCP agent on another node, but we'd like the new agent to take over all the old networks of the failed dhcp instance. Right now, since dhcp agents are distinguished by host, and the host has to match the host of the ovs agent, and the ovs agent's host has to be unique per node, the new dhcp agent is registered as a completely new agent and doesn't take over the failed agent's networks. I'm looking for a way to give the new agent the same roles as the previous one. -- Noel ... ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [neutron] Design Summit Slots and Planning
We've been iterating on Summit ideas on an etherpad [1] for the past month or so now. Thanks to everyone for placing your ideas there! For the past week, the Neutron drivers team has been taking a crack at prioritizing these in preparation for the upcoming Summit. Based on the priorities I've outlined for Kilo here [2], and the fact we have less slots for face to face time in Paris, we've come up with a distilled etherpad [3] which has an early pass at what we'll be talking about in Paris. We'll discuss this during the neutron-drivers IRC meeting tomorrow. Please attend if you have an interest in this area. Thanks! Kyle [1] https://etherpad.openstack.org/p/kilo-neutron-summit-topics [2] http://lists.openstack.org/pipermail/openstack-dev/2014-October/047954.html [3] https://etherpad.openstack.org/p/kilo-neutron-summit-topics-distilled [4] https://wiki.openstack.org/wiki/Meetings/NeutronDrivers ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] HA of dhcp agents?
Hi Armando, Sort of... but what happens when the second one dies? If one DHCP agent dies, I need to be able to start a new DHCP agent on another host and take over from it. As far as I can tell right now, when one DHCP agent dies, another doesn't take up the slack. I have the same problem with L3 agents by the way; that's next on my list. -- Noel On Tue, Oct 21, 2014 at 12:52 PM, Armando M. arma...@gmail.com wrote: As far as I can tell when you specify: dhcp_agents_per_network = X (with X > 1) the server binds the network to all the agents (up to X), which means that you have multiple instances of dnsmasq serving dhcp requests at the same time. ...
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
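Until automatic rescheduling exists in Neutron itself, an external watchdog can approximate the behavior Noel describes using the existing API. A rough sketch with python-neutronclient (the credentials and the pick-the-first-live-agent policy are placeholders, not production logic):

from neutronclient.v2_0 import client

neutron = client.Client(username='admin', password='secret',
                        tenant_name='admin',
                        auth_url='http://controller:5000/v2.0')

agents = neutron.list_agents(agent_type='DHCP agent')['agents']
dead = [a for a in agents if not a['alive']]
live = [a for a in agents if a['alive']]

for agent in dead:
    nets = neutron.list_networks_on_dhcp_agent(agent['id'])['networks']
    for net in nets:
        # Move each network from the dead agent to a live one; a real
        # tool would spread the load and handle races with the server.
        neutron.remove_network_from_dhcp_agent(agent['id'], net['id'])
        neutron.add_network_to_dhcp_agent(live[0]['id'],
                                          body={'network_id': net['id']})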
Re: [openstack-dev] [devstack] Enable LVM ephemeral storage for Nova
So then it is probably best to leave existing Cinder LVM code in lib/cinder_backends/lvm alone and create a similar set of lvm scripts for Nova, perhaps in lib/nova_backends/lvm? Dan On 10/21/2014 03:10 PM, Duncan Thomas wrote: Sharing the vg with cinder is likely to cause some pain when testing proposed features where cinder reconciles its backend with the cinder db. Creating a second vg on the same backing device is easy and avoids all such problems. Duncan Thomas ... ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Pulling nova/virt/hardware.py into nova/objects/
The rationale behind two parallel data model hierarchies is that the format the virt drivers report data in is not likely to be exactly the same as the format that the resource tracker / scheduler wishes to use in the database. Yeah, and in cases where we know where that line is, it makes sense to use the lighter-weight modeling for sure. FWIW, my patch series is logically split up into two parts. The first 10 or so patches are just thought of as general cleanup and useful to Nova regardless of what we decide to do. The second 10 or so patches are where the objects start appearing and getting used, the controversial bits needing more detailed discussion. Right, so after some discussion I think we should go ahead and merge the bottom of this set (all of them are now acked I think) and continue the discussion on the top half where the modeling is introduced. Thanks! --Dan ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [NFV] NFV BoF session for OpenStack Summit Paris
- Original Message - From: Steve Gordon sgor...@redhat.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Hi all, I took an action item in one of the meetings to try and find a date/time/space to do another NFV BoF session for Paris to take advantage of the fact that many of us will be in attendance for a face to face session. To try and avoid clashing with the general and design summit sessions I am proposing that we meet either before the sessions start one morning, during the lunch break, or after the sessions finish for the day. For the lunch sessions the meeting would be shorter to ensure people actually have time to grab lunch beforehand. I've put together a form here, please register your preferred date/time if you would be interested in attending an NFV BoF session: http://doodle.com/qchvmn4sw5x39cps I will try and work out the *where* once we have a clear picture of the preferences for the above. We can discuss further in the weekly meeting. Thanks! Steve [1] https://openstacksummitnovember2014paris.sched.org/event/f5bcb6033064494390342031e48747e3#.VEWEIOKmhkM Hi all, I have just noticed an update on a conversation I had been following on the community list: http://lists.openstack.org/pipermail/community/2014-October/000921.html It seems like after hours use of the venue will not be an option in Paris, though there may be some space available for BoF style activities on Wednesday. I also noticed this Win the telco BoF session on the summit schedule for the creation of a *new* working group: https://openstacksummitnovember2014paris.sched.org/event/f5bcb6033064494390342031e48747e3#.VEbRkOKmhkM Does anyone know anything about this? It's unclear if this is the appropriate place to discuss the planning and development activities we've been working on. Let's discuss further in the meeting tomorrow. Thanks, Steve ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] Propose to define the compute capability clearly
Hi Daniel and all, This is a follow up to Daniel's http://osdir.com/ml/openstack-dev/2014-10/msg00557.html , Info on XenAPI data format for 'host_data' call. I'm considering changing the compute capability to be a nova object, with well-defined fields, for these reasons: a) currently the compute capability is a dict returned from the hypervisor, but different hypervisors may have different return values; b) currently the compute capability filter makes its decision by simply matching the flavor extra_specs against this not-well-defined dict, which is not good IMHO. Just want to get some feedback from the mailing list before I try to create a BP and spec for it. Thanks --jyh -Original Message- From: Daniel P. Berrange [mailto:berra...@redhat.com] Sent: Wednesday, October 8, 2014 8:56 AM To: Bob Ball Cc: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] Info on XenAPI data format for 'host_data' call On Wed, Oct 08, 2014 at 03:53:25PM +, Bob Ball wrote: Hi Daniel, The following is an example return value from one of my hosts {host_name-description: Default install of XenServer, host_hostname: ciceronicus, host_memory: {total: 17169604608, overhead: 266592256, free: 16132087808, free-computed: 16111337472}, enabled: true, host_capabilities: [xen-3.0-x86_64, xen-3.0-x86_32p, hvm-3.0-x86_32, hvm-3.0-x86_32p, hvm-3.0-x86_64], host_other-config: {agent_start_time: 1412774967., iscsi_iqn: iqn.2014-10.com.xensource.hq.eng:587b598c, boot_time: 1412774885.}, host_ip_address: 10.219.10.24, host_cpu_info: {physical_features: 0098e3fd-bfebfbff-0001-28100800, modelname: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, vendor: GenuineIntel, features: 0098e3fd-bfebfbff-0001-28100800, family: 6, maskable: full, cpu_count: 4, socket_count: 1, flags: fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc nonstop_tsc aperfmperf pni vmx est ssse3 sse4_1 sse4_2 popcnt hypervisor ida tpr_shadow vnmi flexpriority ept vpid, stepping: 5, model: 30, features_after_reboot: 0098e3fd-bfebfbff-0001-28100800, speed: 2394.086}, host_uuid: ec54eebe-b14b-4b0a-aa89-d2c468771cd3, host_name-label: ciceronicus} Is that enough for what you're looking at? If there is anything I can help with let me know on IRC. Yes, that is perfect, thank you. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
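A sketch of what such an object might look like, loosely following the nova.objects conventions of the time (the field names are illustrative; the real set would be hashed out in the spec):

from nova.objects import base
from nova.objects import fields


class ComputeCapability(base.NovaObject):
    # Version 1.0: Initial version
    VERSION = '1.0'

    fields = {
        'hypervisor_type': fields.StringField(),
        'hypervisor_version': fields.IntegerField(),
        'cpu_flags': fields.ListOfStringsField(),
        'memory_total_mb': fields.IntegerField(),
    }

With a fixed schema like this, the capability filter can match flavor extra_specs against known fields instead of an arbitrary per-hypervisor dict.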
Re: [openstack-dev] [Heat] Combination of Heat ResourceGroup(%index%) with Fn::Select doesn't work?
Ah, sorry. I misunderstood what you were trying to do. Why create a template that takes in a list of pool members, rather than passing the pool id to the template the instance is in, and using a PoolMember to attach it? Thanks, Kevin From: Subrahmanyam Ongole [song...@oneconvergence.com] Sent: Tuesday, October 21, 2014 11:05 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Heat] Combination of Heat ResourceGroup(%index%) with Fn::Select doesn't work? Hi We are in fact using OS::Neutron::PoolMember resource. I guess ResourceGroup is the only iterative construct in Heat. Is the use case supported today? I think this is more than a simple usage question, hence posting it here. Thank you. Regards Subra On Tue, Oct 21, 2014 at 8:55 AM, Fox, Kevin M kevin@pnnl.gov wrote: use a OS::Neutron::PoolMember instead. Then each member template can add itself to the pool. From: Magesh GV [magesh...@oneconvergence.com] Sent: Tuesday, October 21, 2014 12:07 AM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] Combination of Heat ResourceGroup(%index%) with Fn::Select doesn't work? I am trying to find a way of creating a dynamic List of Resources (Loadbalancer PoolMembers to be exact) using Heat. The idea is that the number of PoolMembers and the required Addresses would be received as Heat parameters. However, I am unable to get %index% working inside a Fn::Select block. Is this a bug with Heat or am I doing something wrong? If this is a bug/limitation in heat is there some other way to get what I am trying to do working with heat? IMO this is a very important usecase for the %index%. ... Regards, Magesh -- Thanks OSM (Subrahmanyam Ongole) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [horizon] Somewhere to host mockups
David, In the Horizon team meeting on the 14th[1], we had some discussion about where we should host additional items like mockups that are associated with a blueprint. Numerous suggestions came up but there wasn't any official clear answer, so you said you'd check with the infra team. Were you able to do that? [1] http://eavesdrop.openstack.org/meetings/horizon/2014/horizon.2014-10-14-16.00.log.html Thanks, Travis ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [rally][users]: Synchronizing between multiple scenario instances.
Does Rally provide any synchronization mechanism to synchronize between multiple scenario instances when running in parallel? Rally spawns multiple processes, with each process running the scenario. We need a way to synchronize between these to start a perf test operation at the same time. regards, Behzad ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Fuel] using keystone client in fuel master node
Using the keystone client is currently quite painful for fuel. For example, getting tokens from a fuel env when auth is required (which is needed if we want to use curl or other clients) is currently quite a mess. In order to get a token you can

python <<EOF
from fuelclient.client import Client
print Client().auth_token
EOF

or attempt to use the keystone client:

cat /etc/fuel/client/config.yaml
SERVER_ADDRESS: 10.108.0.2
SERVER_PORT: 8000
KEYSTONE_USER: admin
KEYSTONE_PASS: admin
KEYSTONE_PORT: 5000

export OS_AUTH_URL=http://10.108.0.2:8000/keystone/v2.0  # where did the auth url come from? [1]
# alternately it's also at http://10.108.0.2:5000/v2.0/
export OS_USERNAME=admin
export OS_PASSWORD=admin
export OS_TENANT_NAME=admin

keystone token-get
WARNING:keystoneclient.httpclient:Failed to retrieve management_url from token
+-----------+----------------------------------+
| Property  | Value                            |
+-----------+----------------------------------+
| expires   | 2014-10-21T00:08:52Z             |
| id        | 4acbc25ee95947e9adeafedecc2f8e31 |
| tenant_id | 8bd09f40faec4112864b23c6a03ac3bd |
| user_id   | ca080c124b8943678e0f1edc6a92b8e2 |
+-----------+----------------------------------+

[1] https://github.com/stackforge/fuel-web/blob/master/fuelclient/fuelclient/client.py#L55

As we extend our usage of keystone to include other data in the endpoints / catalog, it becomes more and more relevant for us to consume the auth information the same way as is done for the other openstack clients. To this end, I think we should be using the same parameters and patterns as in openstack. This will help admins be familiar with openstack tools, and enable us to use the same methods across multiple clients. Fuel client should be changed to take the same options --os-username, --os-password, etc., as well as accept the environment variables that correlate with them. This would also allow us to bring openrc onto the fuel master, and unify getting credentials to the various clients. Later on, I think we should be adding the fuel url to the endpoint's data so that we can use the client only with the auth url like the openstack clients. It would also allow us to set the fuel endpoint in deployed clouds so that the fuel node could be easily found later. -- Andrew Mirantis Ceph community ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
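For comparison, the pattern the other OpenStack clients follow, and which fuelclient could adopt, assuming the standard OS_* variables are exported (a sketch, not current fuelclient behavior):

import os
from keystoneclient.v2_0 import client

ks = client.Client(username=os.environ['OS_USERNAME'],
                   password=os.environ['OS_PASSWORD'],
                   tenant_name=os.environ['OS_TENANT_NAME'],
                   auth_url=os.environ['OS_AUTH_URL'])
print(ks.auth_token)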
Re: [openstack-dev] [rally][users]: Synchronizing between multiple scenario instances.
Behzad, Unfortunately, at this point there is no support for locking between scenarios. It will be quite tricky to implement, because we have different load generators, and we will need to find a common solution for all of them. If you have any ideas about how to implement it in such a way, I will be more than happy to get this in upstream. One of the ways that I see is to have some kind of chain-of-benchmarks: 1) The first benchmark runs N VMs 2) The second benchmark does something with all those VMs 3) The third benchmark deletes all these VMs (where chain elements are atomic actions) Probably this will be a better long-term solution. The only thing that we should understand is how to store those results and how to display them. If you would like to help with this, let's start discussing it in some kind of google doc. Thoughts? Best regards, Boris Pavlovic On Wed, Oct 22, 2014 at 2:13 AM, Behzad Dastur (bdastur) bdas...@cisco.com wrote: Does Rally provide any synchronization mechanism to synchronize between multiple scenario instances when running in parallel? Rally spawns multiple processes, with each process running the scenario. We need a way to synchronize between these to start a perf test operation at the same time. regards, Behzad ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
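To make the rendezvous idea concrete, a sketch of how runner processes could line up on a common start point, independent of Rally's actual runner implementation (multiprocessing.Barrier requires Python 3.3+):

import multiprocessing as mp
import time

def scenario(barrier, worker_id):
    # ... per-process setup: build clients, prepare data ...
    barrier.wait()  # all workers block here until everyone is ready
    start = time.time()
    # ... the timed perf-test operation runs here, simultaneously ...
    print("worker %d started at %.3f" % (worker_id, start))

if __name__ == '__main__':
    n = 4
    barrier = mp.Barrier(n)
    procs = [mp.Process(target=scenario, args=(barrier, i)) for i in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()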
[openstack-dev] [Heat] Stack stuck in DELETE_IN_PROGRESS even though all resources are DELETE_COMPLETE
Greetings all, I'm using Heat from Icehouse and I'm hitting a problem that I'm hoping y'all can shed some light on. I have no problems doing stack-create. I can watch the MySQL commands go by and see it happily update the stack table so that it eventually shows up as CREATE_COMPLETE. When I delete the stack, everything seems to be working fine. I see the MySQL update that sets the stack to DELETE_IN_PROGRESS as well as the updates that set my single resource to DELETE_COMPLETE... but I never see the final update to the stack table to set it to DELETE_COMPLETE. One very odd thing that I found in the MySQL logs (snippet below) is a query that includes the stack name but with an extra '*' character appended to it. My stack is named 'xyzzy8', but notice the 'xyzzy8*' in the constraint. That's not going to return anything, and I'm wondering if that is what's preventing the final stack DELETE_COMPLETE update from happening? There are no errors in any of the heat logs. Any guidance would be greatly appreciated! Ken

SELECT stack.status_reason AS stack_status_reason,
       stack.created_at AS stack_created_at,
       stack.deleted_at AS stack_deleted_at,
       stack.action AS stack_action,
       stack.status AS stack_status,
       stack.id AS stack_id,
       stack.name AS stack_name,
       stack.raw_template_id AS stack_raw_template_id,
       stack.username AS stack_username,
       stack.tenant AS stack_tenant,
       stack.parameters AS stack_parameters,
       stack.user_creds_id AS stack_user_creds_id,
       stack.owner_id AS stack_owner_id,
       stack.timeout AS stack_timeout,
       stack.disable_rollback AS stack_disable_rollback,
       stack.stack_user_project_id AS stack_stack_user_project_id,
       stack.updated_at AS stack_updated_at
FROM stack
WHERE stack.deleted_at IS NULL
  AND (stack.tenant = 'c6c488223aae4e97bf56dda8cef36b3b'
       OR stack.stack_user_project_id = 'c6c488223aae4e97bf56dda8cef36b3b')
  AND stack.name = 'xyzzy8*'
  AND stack.owner_id = '9a3e56d7-0c10-4c1c-8c54-0e5580cee121'

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] Combination of Heat ResourceGroup(index) with Fn::Select doesn't work?
On Tue, Oct 21, 2014 at 12:32 PM, Zane Bitter zbit...@redhat.com wrote: On 21/10/14 14:05, Subrahmanyam Ongole wrote: Hi We are in fact using the OS::Neutron::PoolMember resource. I guess ResourceGroup is the only iterative construct in Heat. Is the use case supported today? I think this is more than a simple usage question, hence posting it here. Thank you. Regards Subra On Tue, Oct 21, 2014 at 8:55 AM, Fox, Kevin M kevin@pnnl.gov wrote: use an OS::Neutron::PoolMember instead. Then each member template can add itself to the pool. -- *From:* Magesh GV [magesh...@oneconvergence.com] *Sent:* Tuesday, October 21, 2014 12:07 AM *To:* openstack-dev@lists.openstack.org *Subject:* [openstack-dev] Combination of Heat ResourceGroup(index) with Fn::Select doesn't work? I am trying to find a way of creating a dynamic list of resources (Loadbalancer PoolMembers, to be exact) using Heat. The idea is that the number of PoolMembers and the required addresses would be received as Heat parameters. However, I am unable to get %index% working inside a Fn::Select block. Yeah, it doesn't work inside intrinsic functions. Is this a bug with Heat or am I doing something wrong? It's arguable ;) If this is a bug/limitation in heat, is there some other way to get what I am trying to do working with heat? You should define the PoolMember near the server it is linking to the pool, not in a separate ResourceGroup. Group together the things you want to scale as a group in a separate template and scale them as a unit. IMO this is a very important use case for %index%. I'm very uneasy about %index% existing at all, and I'm not at all convinced that allowing something like this would achieve anything beyond making bad template design more tempting. Can you talk more about why you need to define the PoolMember resources in a different place to the actual pool members? The use case is to define a template for an LB service and customize it. The parameters i) number of members in the pool and ii) IP addr of each of these members are not known when the template is defined. When the service is instantiated using the template, the code that generates the values for these params will pass these in as parameter values to heat. cheers, Zane.

Parameters: {
  NumberOfMembers: {
    Description: Number of Pool Members to be created,
    Type: Number,
    Default: 1
  },
  MembersList: {
    Description: Pool Member IP Address,
    Type: Json,
    Default: {key0: 11.0.0.43}
  }
},
MemberList: {
  Type: OS::Heat::ResourceGroup,
  Properties: {
    count: {Ref: NumberOfMembers},
    resource_def: {
      type: OS::Neutron::PoolMember,
      properties: {
        address: {Fn::Select: [key%index%, {Ref: MembersList}]},
        admin_state_up: true,
        pool_id: {Ref: HaproxyPool},
        protocol_port: 80,
        weight: 1
      }
    }
  }
}

Regards, Magesh ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] Combination of Heat ResourceGroup(index) with Fn::Select doesn't work?
On Tue, Oct 21, 2014 at 4:43 PM, Hemanth Ravi hemanthrav...@gmail.com wrote: On Tue, Oct 21, 2014 at 12:32 PM, Zane Bitter zbit...@redhat.com wrote: On 21/10/14 14:05, Subrahmanyam Ongole wrote: Hi We are in fact using the OS::Neutron::PoolMember resource. I guess ResourceGroup is the only iterative construct in Heat. Is the use case supported today? I think this is more than a simple usage question, hence posting it here. Thank you. Regards Subra On Tue, Oct 21, 2014 at 8:55 AM, Fox, Kevin M kevin@pnnl.gov wrote: use an OS::Neutron::PoolMember instead. Then each member template can add itself to the pool. -- *From:* Magesh GV [magesh...@oneconvergence.com] *Sent:* Tuesday, October 21, 2014 12:07 AM *To:* openstack-dev@lists.openstack.org *Subject:* [openstack-dev] Combination of Heat ResourceGroup(index) with Fn::Select doesn't work? I am trying to find a way of creating a dynamic list of resources (Loadbalancer PoolMembers, to be exact) using Heat. The idea is that the number of PoolMembers and the required addresses would be received as Heat parameters. However, I am unable to get %index% working inside a Fn::Select block. Yeah, it doesn't work inside intrinsic functions. Is this a bug with Heat or am I doing something wrong? It's arguable ;) If this is a bug/limitation in heat, is there some other way to get what I am trying to do working with heat? You should define the PoolMember near the server it is linking to the pool, not in a separate ResourceGroup. Group together the things you want to scale as a group in a separate template and scale them as a unit. IMO this is a very important use case for %index%. I'm very uneasy about %index% existing at all, and I'm not at all convinced that allowing something like this would achieve anything beyond making bad template design more tempting. Can you talk more about why you need to define the PoolMember resources in a different place to the actual pool members? The use case is to define a template for an LB service and customize it. The parameters i) number of members in the pool and ii) IP addr of each of these members are not known when the template is defined. When the service is instantiated using the template, the code that generates the values for these params will pass these in as parameter values to heat. The servers (to be configured as pool members) are instantiated outside the heat template independently. The heat template is created only with the LB service definition and needs to be customized for these parameters. cheers, Zane.

Parameters: {
  NumberOfMembers: {
    Description: Number of Pool Members to be created,
    Type: Number,
    Default: 1
  },
  MembersList: {
    Description: Pool Member IP Address,
    Type: Json,
    Default: {key0: 11.0.0.43}
  }
},
MemberList: {
  Type: OS::Heat::ResourceGroup,
  Properties: {
    count: {Ref: NumberOfMembers},
    resource_def: {
      type: OS::Neutron::PoolMember,
      properties: {
        address: {Fn::Select: [key%index%, {Ref: MembersList}]},
        admin_state_up: true,
        pool_id: {Ref: HaproxyPool},
        protocol_port: 80,
        weight: 1
      }
    }
  }
}

Regards, Magesh ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
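Since the member IPs are generated by outside code anyway (per Hemanth's last point), one workaround for the %index%-inside-Fn::Select limitation is for that same code to emit one PoolMember resource per address before handing the template to heat. A rough sketch under that assumption; the helper below is illustrative, not a heat or heatclient API, and only HaproxyPool and OS::Neutron::PoolMember are taken from the thread's template:

import json

def build_member_resources(member_ips):
    """Emit one OS::Neutron::PoolMember per known address, so no
    %index% lookup inside an intrinsic function is needed."""
    resources = {}
    for i, ip in enumerate(member_ips):
        resources['PoolMember%d' % i] = {
            'type': 'OS::Neutron::PoolMember',
            'properties': {
                'address': ip,
                'admin_state_up': True,
                'pool_id': {'Ref': 'HaproxyPool'},  # same pool as in the thread's template
                'protocol_port': 80,
                'weight': 1,
            },
        }
    return resources

# The generator that already knows the IPs splices this into the template:
print(json.dumps(build_member_resources(['11.0.0.43', '11.0.0.44']), indent=2))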
[openstack-dev] Adding new dependencies to stackforge projects
Hi all, On the cross project meeting today, I promised to bring this to the ML [1]. So here it is: Question: Can a StackForge project (like nova-docker) depend on a library (docker-py) that is not specified in global requirements? Right now the answer seems to be No, as enforced by the CI systems. For the specific problems, see review: https://review.openstack.org/#/c/130065/ You can see that check-tempest-dsvm-f20-docker fails: http://logs.openstack.org/65/130065/1/check/check-tempest-dsvm-f20-docker/f9000d4/devstacklog.txt.gz and the gate-nova-docker-requirements fails: http://logs.openstack.org/65/130065/1/check/gate-nova-docker-requirements/34256d2/console.html For this specific instance, the reason for adding this dependency is to get rid of the custom http client in the nova-docker project, which just duplicates the functionality, needs to be maintained, does not do proper checking, etc. But the question is more general: projects should be able to add dependencies and still be able to run dsvm and requirements jobs until they are integrated, and the delta list of new dependencies against global requirements should be vetted during that process. Thanks, dims PS: A really long rambling version of this email with a proposal to add a flag in devstack-gate/devstack is at [2]. An actual review with hacks to get DSVM running by hook or by crook, which shows that docker-py can indeed be used, is at [3]. [1] http://eavesdrop.openstack.org/meetings/project/2014/project.2014-10-21-21.02.log.html [2] https://etherpad.openstack.org/p/managing-reqs-for-projects-to-be-integrated [3] https://review.openstack.org/#/c/128790/ -- Davanum Srinivas :: https://twitter.com/dims ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
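For context on what the dependency buys: the custom http client being removed hand-rolls requests against the docker daemon's REST API, which docker-py already wraps. A hedged sketch of the equivalent docker-py usage (the unix socket path is the usual default, not something stated in the thread):

from docker import Client  # provided by the docker-py package under discussion

# Talk to the local docker daemon over its unix socket; this is what the
# hand-rolled HTTP calls in nova-docker's custom client do by hand.
cli = Client(base_url='unix://var/run/docker.sock')

for container in cli.containers(all=True):
    print('%s  %s' % (container['Id'][:12], container['Status']))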
[openstack-dev] [neutron][stable] metadata agent performance
We merged caching support for the metadata agent in juno, and backported to icehouse. It was enabled by default in juno, but disabled by default in icehouse to satisfy the stable maint requirement of not changing functional behavior. While performance of the agent was improved with caching enabled, it regressed a reported 8x when caching was disabled [1]. This means that by default, the caching backport severely impacts icehouse Neutron's performance. So, what is the way forward? We definitely need to document the problem for both icehouse and juno. Is documentation enough? Or can we enable caching by default in icehouse? Or remove the backport entirely. There is also a proposal to replace the metadata agent’s use of the neutron client in favor of rpc [2]. There were comments on an old bug suggesting we didn’t want to do this [3], but assuming that we want this change in Kilo, is backporting even a possibility given that it implies a behavioral change to be useful? Thanks, Maru 1: https://bugs.launchpad.net/cloud-archive/+bug/1361357 2: https://review.openstack.org/#/c/121782 3: https://bugs.launchpad.net/neutron/+bug/1092043 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
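As background on why the cache matters: on each metadata request the agent has to resolve which port the requesting IP belongs to, which without caching means neutron API round trips per request. The juno change memoizes those lookups; the toy sketch below shows the general shape of such a TTL cache, and is illustrative rather than the actual agent code or its config options:

import time

class TTLCache(object):
    """Memoize expensive lookups (e.g. neutron API calls) for ttl seconds."""

    def __init__(self, ttl=5):
        self.ttl = ttl
        self._store = {}

    def get(self, key, fetch):
        entry = self._store.get(key)
        if entry is not None:
            value, stored_at = entry
            if time.time() - stored_at < self.ttl:
                return value   # fresh enough: skip the API round trip
        value = fetch()        # miss or stale: do the expensive lookup
        self._store[key] = (value, time.time())
        return value

# Usage shape (hypothetical): resolve the port for a requesting IP once per
# TTL window instead of once per metadata request, e.g.
# ports = cache.get(('ports', ip), lambda: neutron.list_ports(...))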
Re: [openstack-dev] Adding new dependencies to stackforge projects
On Tue, Oct 21, 2014 at 08:09:38PM -0400, Davanum Srinivas wrote: Hi all, On the cross project meeting today, i promised to bring this to the ML[1]. So here it is: Question : Can a StackForge project (like nova-docker), depend on a library (docker-py) that is not specified in global requirements? So the answer is definitely yes, and this is definitely the case for most projects which aren't in the integrated release. We should only be enforcing requirements on projects in projects.txt in the requirements repo. Right now the answer seems to be No, as enforced by the CI systems. For the specific problems, see review: https://review.openstack.org/#/c/130065/ You can see that check-tempest-dsvm-f20-docker fails: http://logs.openstack.org/65/130065/1/check/check-tempest-dsvm-f20-docker/f9000d4/devstacklog.txt.gz I think you've just hit a bug either in devstack or the nova-docker devstack bits. There isn't any reason these checks should be run on a project which isn't being tracked by global requirements. and the gate-nova-docker-requirements fails: http://logs.openstack.org/65/130065/1/check/gate-nova-docker-requirements/34256d2/console.html I'm not sure why this job is configured to be running on the nova-docker repo. The project should either decide to track global-requirements and then be added to projects.txt, or not run the requirements check job. It doesn't make much sense to enforce compliance with global requirements if the project is trying to use libraries not included there. Just remove the job template from the zuul layout for nova-docker: http://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul/layout.yaml#n4602 and then once the issue with devstack is figured out you can add docker-py to the requirements list. For this specific instance, the reason for adding this dependency is to get rid of custom http client in nova-docker project that just duplicates the functionality, needs to be maintained and does not do proper checking etc. But the question is general in the broader since projects should be able to add dependencies and be able to run dsvm and requirements jobs until they are integrated and the delta list of new dependencies to global requirements should be vetted during the process. If nova-docker isn't tracked by global requirements then there shouldn't be anything blocking you from adding docker-py to the nova-docker requirements. It looks like you're just hitting a bug and/or a configuration issue. Granted, there might be some complexity in moving the driver back into the nova tree if there are dependencies on packages not in global requirements, but that's something that can be addressed when/if the driver is being merged back into nova. Thanks, dims PS: A really long rambling version of this email with a proposal to add a flag in devstack-gate/devstack is at [2], Actual review with hacks to get DSVM running by hook/crook that shows that docker-py indeed be used is at [3] [1] http://eavesdrop.openstack.org/meetings/project/2014/project.2014-10-21-21.02.log.html [2] https://etherpad.openstack.org/p/managing-reqs-for-projects-to-be-integrated [3] https://review.openstack.org/#/c/128790/ -Matt Treinish ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Glance] Proposal to add reviewers requirement to the glance-specs.
Hi, I would like to propose a requirement of 2 reviewers (at least one of them being Glance core) for a spec to be approved. This proposal is a step towards better planning the development work and helping the team prioritize reviews and features. The anticipation is that it will help committers get regular feedback on the active feature work being done. It will also help us reduce the push for features to be merged very late in the cycle, which currently leaves the door open for bugs. If there are no objections, we will implement this change for all the features sitting in the review queue of glance-specs. The approved blueprints would be discussed with core reviewers to get a sense of their availability. Thanks, -Nikhil ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Cells conversation starter
- Original Message - From: Andrew Laski andrew.la...@rackspace.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org

One of the big goals for the Kilo cycle by users and developers of the cells functionality within Nova is to get it to a point where it can be considered a first class citizen of Nova. Ultimately I think this comes down to getting it tested by default in Nova jobs, and making it easy for developers to work with. But there's a lot of work to get there. In order to raise awareness of this effort, and get the conversation started on a few things, I've summarized a little bit about cells and this effort below.

Goals:
- Testing of a single cell setup in the gate.
- Feature parity.
- Make cells the default implementation.
- Developers write code once and it works for cells.
Ultimately the goal is to improve maintainability of a large feature within the Nova code base.

Feature gaps:
- Host aggregates
- Security groups
- Server groups

Shortcomings:
- Flavor syncing. This needs to be addressed now.
- Cells scheduling/rescheduling; instances can not currently move between cells. These two won't affect the default one cell setup so they will be addressed later.

What does cells do:
- Schedule an instance to a cell based on flavor slots available.
- Proxy API requests to the proper cell.
- Keep a copy of instance data at the global level for quick retrieval.
- Sync data up from a child cell to keep the global level up to date.

Simplifying assumptions:
- Cells will be treated as a two level tree structure.

Plan:
- Fix flavor breakage in child cell which causes boot tests to fail. Currently the libvirt driver needs flavor.extra_specs which is not synced to the child cell. Some options are to sync flavor and extra specs to the child cell db, or pass full data with the request. https://review.openstack.org/#/c/126620/1 offers a means of passing full data with the request.
- Determine proper switches to turn off Tempest tests for features that don't work, with the goal of getting a voting job. Once this is in place we can move towards feature parity and work on internal refactorings.
- Work towards adding parity for host aggregates, security groups, and server groups. They should be made to work in a single cell setup, but the solution should not preclude them from being used in multiple cells. There needs to be some discussion as to whether a host aggregate or server group is a global concept or a per cell concept.
- Work towards merging compute/api.py and compute/cells_api.py so that developers only need to make changes/additions in one place. The goal is for as much as possible to be hidden by the RPC layer, which will determine whether a call goes to a compute/conductor/cell.
- For syncing data between cells, look at using objects to handle the logic of writing data to the cell/parent and then syncing the data to the other.
- A potential migration scenario is to consider a non cells setup to be a child cell, and converting to cells will mean setting up a parent cell and linking them. There are periodic tasks in place to sync data up from a child already, but a manual kick-off mechanism will need to be added.

Future plans:
- Something that has been considered, but is out of scope for now, is that the parent/api cell doesn't need the same data model as the child cell. Since the majority of what it does is act as a cache for API requests, it does not need all the data that a cell needs, and what data it does need could be stored in a form that's optimized for reads.
In terms of future plans I'd like to also explore how continued iteration on the Cells concept might line up against the use cases presented in the recent threads on OpenStack cascading: * http://lists.openstack.org/pipermail/openstack-dev/2014-September/047470.html * http://lists.openstack.org/pipermail/openstack-dev/2014-October/047526.html Thanks, Steve ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
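On the "use objects to handle the logic of writing data to the cell/parent and then syncing" item in the plan above, the shape of that pattern is roughly the following. This is a hedged toy sketch, not nova's actual object code: instance_update_at_top is borrowed from the existing cells RPC API, while the rest of the names are invented for illustration:

class CellAwareInstance(object):
    """Toy model: persist in the local cell, then sync the parent's copy."""

    def __init__(self, uuid, db_api, cells_rpcapi):
        self.uuid = uuid
        self.db_api = db_api              # local (child) cell database API
        self.cells_rpcapi = cells_rpcapi  # stand-in for the cells RPC layer
        self.updates = {}

    def save(self):
        # 1. Write to the child cell's database, the source of truth.
        self.db_api.instance_update(self.uuid, self.updates)
        # 2. Push the same update upward so the parent/api cell's
        #    read-optimized copy stays current, instead of waiting for
        #    the periodic sync task to pick it up.
        self.cells_rpcapi.instance_update_at_top(self.uuid, self.updates)
        self.updates = {}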
Re: [openstack-dev] Cells conversation starter
Hi, Many thanks to Steve for linking these topics together. +1 Considering that there are lots of production installations of the cells solution, the improvements on cells are definitely necessary. And I also hope to add the following demands from large cloud operators (especially clouds for NFV) to the agenda: 1) large cloud operators ask multiple vendors to build the distributed but unified cloud together (each vendor has its own OpenStack-based solution); 2) a RESTful API/CLI is required for each site to keep the cloud workable and manageable at all times; 3) the unified cloud needs to expose an open and standard API. That's the driving force behind OpenStack cascading; let's try to work together to see if cells can evolve to meet these demands. Best Regards Chaoyi Huang ( Joe Huang ) -Original Message- From: Steve Gordon [mailto:sgor...@redhat.com] Sent: Wednesday, October 22, 2014 10:10 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Cells conversation starter - Original Message - From: Andrew Laski andrew.la...@rackspace.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org One of the big goals for the Kilo cycle by users and developers of the cells functionality within Nova is to get it to a point where it can be considered a first class citizen of Nova. Ultimately I think this comes down to getting it tested by default in Nova jobs, and making it easy for developers to work with. But there's a lot of work to get there. In order to raise awareness of this effort, and get the conversation started on a few things, I've summarized a little bit about cells and this effort below. Goals: Testing of a single cell setup in the gate. Feature parity. Make cells the default implementation. Developers write code once and it works for cells. Ultimately the goal is to improve maintainability of a large feature within the Nova code base. Feature gaps: Host aggregates Security groups Server groups Shortcomings: Flavor syncing This needs to be addressed now. Cells scheduling/rescheduling Instances can not currently move between cells These two won't affect the default one cell setup so they will be addressed later. What does cells do: Schedule an instance to a cell based on flavor slots available. Proxy API requests to the proper cell. Keep a copy of instance data at the global level for quick retrieval. Sync data up from a child cell to keep the global level up to date. Simplifying assumptions: Cells will be treated as a two level tree structure. Plan: Fix flavor breakage in child cell which causes boot tests to fail. Currently the libvirt driver needs flavor.extra_specs which is not synced to the child cell. Some options are to sync flavor and extra specs to child cell db, or pass full data with the request. https://review.openstack.org/#/c/126620/1 offers a means of passing full data with the request. Determine proper switches to turn off Tempest tests for features that don't work with the goal of getting a voting job. Once this is in place we can move towards feature parity and work on internal refactorings. Work towards adding parity for host aggregates, security groups, and server groups. They should be made to work in a single cell setup, but the solution should not preclude them from being used in multiple cells. There needs to be some discussion as to whether a host aggregate or server group is a global concept or per cell concept.
Work towards merging compute/api.py and compute/cells_api.py so that developers only need to make changes/additions in one place. The goal is for as much as possible to be hidden by the RPC layer, which will determine whether a call goes to a compute/conductor/cell. For syncing data between cells, look at using objects to handle the logic of writing data to the cell/parent and then syncing the data to the other. A potential migration scenario is to consider a non cells setup to be a child cell and converting to cells will mean setting up a parent cell and linking them. There are periodic tasks in place to sync data up from a child already, but a manual kick off mechanism will need to be added. Future plans: Something that has been considered, but is out of scope for now, is that the parent/api cell doesn't need the same data model as the child cell. Since the majority of what it does is act as a cache for API requests, it does not need all the data that a cell needs and what data it does need could be stored in a form that's optimized for reads. In terms of future plans I'd like to also explore how continued iteration on the Cells concept might line up against the use cases presented in the recent threads on OpenStack cascading: *
Re: [openstack-dev] [neutron][stable] metadata agent performance
It sounds like the only reasonable option we are left with right now is to document. Even if we enabled/removed the backport, it would take time until users can get their hands on a new cut of the stable branch. We would need to be more diligent in the future and limit backports to just bug fixes to prevent situations like this from occurring, or maybe we need to have better testing...um...definitely the latter :) My 2c Armando On 22 October 2014 05:56, Maru Newby ma...@redhat.com wrote: We merged caching support for the metadata agent in juno, and backported to icehouse. It was enabled by default in juno, but disabled by default in icehouse to satisfy the stable maint requirement of not changing functional behavior. While performance of the agent was improved with caching enabled, it regressed a reported 8x when caching was disabled [1]. This means that by default, the caching backport severely impacts icehouse Neutron's performance. So, what is the way forward? We definitely need to document the problem for both icehouse and juno. Is documentation enough? Or can we enable caching by default in icehouse? Or remove the backport entirely. There is also a proposal to replace the metadata agent’s use of the neutron client in favor of rpc [2]. There were comments on an old bug suggesting we didn’t want to do this [3], but assuming that we want this change in Kilo, is backporting even a possibility given that it implies a behavioral change to be useful? Thanks, Maru 1: https://bugs.launchpad.net/cloud-archive/+bug/1361357 2: https://review.openstack.org/#/c/121782 3: https://bugs.launchpad.net/neutron/+bug/1092043 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Cells conversation starter
On 22/10/14 03:07, Andrew Laski wrote: On 10/21/2014 04:31 AM, Nikola Đipanov wrote: On 10/20/2014 08:00 PM, Andrew Laski wrote: One of the big goals for the Kilo cycle by users and developers of the cells functionality within Nova is to get it to a point where it can be considered a first class citizen of Nova. Ultimately I think this comes down to getting it tested by default in Nova jobs, and making it easy for developers to work with. But there's a lot of work to get there. In order to raise awareness of this effort, and get the conversation started on a few things, I've summarized a little bit about cells and this effort below. Goals: Testing of a single cell setup in the gate. Feature parity. Make cells the default implementation. Developers write code once and it works for cells. Ultimately the goal is to improve maintainability of a large feature within the Nova code base. Thanks for the write-up Andrew! Some thoughts/questions below. Looking forward to the discussion on some of these topics, and would be happy to review the code once we get to that point. Feature gaps: Host aggregates Security groups Server groups Shortcomings: Flavor syncing This needs to be addressed now. Cells scheduling/rescheduling Instances can not currently move between cells These two won't affect the default one cell setup so they will be addressed later. What does cells do: Schedule an instance to a cell based on flavor slots available. Proxy API requests to the proper cell. Keep a copy of instance data at the global level for quick retrieval. Sync data up from a child cell to keep the global level up to date. Simplifying assumptions: Cells will be treated as a two level tree structure. Are we thinking of making this official by removing code that actually allows cells to be an actual tree of depth N? I am not sure if doing so would be a win, although it does complicate the RPC/Messaging/State code a bit, but if it's not being used, even though a nice generalization, why keep it around? My preference would be to remove that code since I don't envision anyone writing tests to ensure that functionality works and/or doesn't regress. But there's the challenge of not knowing if anyone is actually relying on that behavior. So initially I'm not creating a specific work item to remove it. But I think it needs to be made clear that it's not officially supported and may get removed unless a case is made for keeping it and work is put into testing it. While I agree that N is a bit interesting, I have seen N=3 in production:

[central API]--[state/region1]--[state/region DC1]
             |                \-[state/region DC2]
             --[state/region2 DC]
             --[state/region3 DC]
             --[state/region4 DC]

Plan: Fix flavor breakage in child cell which causes boot tests to fail. Currently the libvirt driver needs flavor.extra_specs which is not synced to the child cell. Some options are to sync flavor and extra specs to child cell db, or pass full data with the request. https://review.openstack.org/#/c/126620/1 offers a means of passing full data with the request. Determine proper switches to turn off Tempest tests for features that don't work with the goal of getting a voting job. Once this is in place we can move towards feature parity and work on internal refactorings. Work towards adding parity for host aggregates, security groups, and server groups. They should be made to work in a single cell setup, but the solution should not preclude them from being used in multiple cells.
There needs to be some discussion as to whether a host aggregate or server group is a global concept or per cell concept. Have there been any previous discussions on this topic? If so I'd really like to read up on those to make sure I understand the pros and cons before the summit session. The only discussion I'm aware of is some comments on https://review.openstack.org/#/c/59101/ , though they mention a discussion at the Utah mid-cycle. The main con I'm aware of for defining these as global concepts is that there is no rescheduling capability in the cells scheduler. So if a build is sent to a cell with a host aggregate that can't fit that instance the build will fail even though there may be space in that host aggregate from a global perspective. That should be somewhat straightforward to address though. I think it makes sense to define these as global concepts. But these are features that aren't used with cells yet so I haven't put a lot of thought into potential arguments or cases for doing this one way or another. Work towards merging compute/api.py and compute/cells_api.py so that developers only need to make changes/additions in one place. The goal is for as much as possible to be hidden by the RPC layer, which will determine whether a call goes to a compute/conductor/cell.
Re: [openstack-dev] [Nova] Cells conversation starter
Thanks for this. It would be interesting to see how much of this work you think is achievable in Kilo. How long do you see this process taking? In line with that, is it just you currently working on this? Would calling for volunteers to help be meaningful? Michael On Tue, Oct 21, 2014 at 5:00 AM, Andrew Laski andrew.la...@rackspace.com wrote: One of the big goals for the Kilo cycle by users and developers of the cells functionality within Nova is to get it to a point where it can be considered a first class citizen of Nova. Ultimately I think this comes down to getting it tested by default in Nova jobs, and making it easy for developers to work with. But there's a lot of work to get there. In order to raise awareness of this effort, and get the conversation started on a few things, I've summarized a little bit about cells and this effort below. Goals: Testing of a single cell setup in the gate. Feature parity. Make cells the default implementation. Developers write code once and it works for cells. Ultimately the goal is to improve maintainability of a large feature within the Nova code base. Feature gaps: Host aggregates Security groups Server groups Shortcomings: Flavor syncing This needs to be addressed now. Cells scheduling/rescheduling Instances can not currently move between cells These two won't affect the default one cell setup so they will be addressed later. What does cells do: Schedule an instance to a cell based on flavor slots available. Proxy API requests to the proper cell. Keep a copy of instance data at the global level for quick retrieval. Sync data up from a child cell to keep the global level up to date. Simplifying assumptions: Cells will be treated as a two level tree structure. Plan: Fix flavor breakage in child cell which causes boot tests to fail. Currently the libvirt driver needs flavor.extra_specs which is not synced to the child cell. Some options are to sync flavor and extra specs to child cell db, or pass full data with the request. https://review.openstack.org/#/c/126620/1 offers a means of passing full data with the request. Determine proper switches to turn off Tempest tests for features that don't work with the goal of getting a voting job. Once this is in place we can move towards feature parity and work on internal refactorings. Work towards adding parity for host aggregates, security groups, and server groups. They should be made to work in a single cell setup, but the solution should not preclude them from being used in multiple cells. There needs to be some discussion as to whether a host aggregate or server group is a global concept or per cell concept. Work towards merging compute/api.py and compute/cells_api.py so that developers only need to make changes/additions in one place. The goal is for as much as possible to be hidden by the RPC layer, which will determine whether a call goes to a compute/conductor/cell. For syncing data between cells, look at using objects to handle the logic of writing data to the cell/parent and then syncing the data to the other. A potential migration scenario is to consider a non cells setup to be a child cell and converting to cells will mean setting up a parent cell and linking them. There are periodic tasks in place to sync data up from a child already, but a manual kick off mechanism will need to be added. Future plans: Something that has been considered, but is out of scope for now, is that the parent/api cell doesn't need the same data model as the child cell.
Since the majority of what it does is act as a cache for API requests, it does not need all the data that a cell needs and what data it does need could be stored in a form that's optimized for reads. Thoughts? ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Rackspace Australia ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [neutron] Can ofagent connect to switches other than OVS?
Hi, all I'm trying to connect ofagent to switches other than OVS, but it's not working. I think ofagent cannot connect to those switches. Has anyone tried this? Hirofumi ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev