Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
On 01/23/2014 11:28 AM, Justin Santa Barbara wrote:
> Would appreciate feedback / opinions on this blueprint:
> https://blueprints.launchpad.net/nova/+spec/first-discover-your-peers

The blueprint starts out with:

> When running a clustered service on Nova, typically each node needs
> to find its peers. In the physical world, this is typically done
> using multicast. On the cloud, we either can't or don't want to use
> multicast.

So, it seems that at the root of this, you're looking for a
cloud-compatible way for instances to message each other. I really
don't see the metadata API as the appropriate place for that.

How about using Marconi here? If not, what's missing from Marconi's
API to solve your messaging use case to allow instances to discover
each other?

--
Russell Bryant
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Russell Bryant wrote:
> So, it seems that at the root of this, you're looking for a
> cloud-compatible way for instances to message each other.

No: discovery of peers, not messaging. After discovery, communication
between nodes will then be done directly, e.g. over TCP. Examples of
services that work using this model: Elasticsearch, JBoss Data Grid,
anything using JGroups, the next version of ZooKeeper, etc. The
instances just need some way to find each other; a nice way to think
of this is as a replacement for multicast discovery on the cloud. All
these services then switch to direct messaging, because an
intermediate service introduces too much latency. With this blueprint,
though, we could build and run a great backend for Marconi, using OOO.

> I really don't see the metadata API as the appropriate place for
> that.

But I presume you're OK with it for discovery? :-)

> How about using Marconi here? If not, what's missing from Marconi's
> API to solve your messaging use case to allow instances to discover
> each other?

Well, again: discovery, so Marconi isn't the natural fit it may at
first appear. I'm not sure whether Marconi supports 'broadcast' queues
(that would be the missing piece if it doesn't). But even if we could
abuse a Marconi queue for this:

1) Marconi isn't widely deployed.
2) There is no easy way for a node to discover Marconi, even if it
were deployed.
3) There is no easy way for a node to authenticate to Marconi, even if
we could discover it.

I absolutely think we should fix each of those obstacles, and I'm sure
we will eventually. But in the meantime, let's get this into Icehouse!

Justin
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
On 02/05/2014 04:45 PM, Justin Santa Barbara wrote:
> Russell Bryant wrote:
>> So, it seems that at the root of this, you're looking for a
>> cloud-compatible way for instances to message each other.
>
> No: discovery of peers, not messaging. After discovery, communication

I'm saying use messaging as the means to implement discovery.

> between nodes will then be done directly, e.g. over TCP.
> [...]
> But even if we could abuse a Marconi queue for this:
>
> 1) Marconi isn't widely deployed.

Yet. I think we need to look to the future and decide on the right
solution to the problem.

> 2) There is no easy way for a node to discover Marconi, even if it
> were deployed.

That's what the Keystone service catalog is for.

> 3) There is no easy way for a node to authenticate to Marconi, even
> if we could discover it.

Huh? The whole point of Marconi is to allow instances to have a
messaging service available to them to use. Of course they can auth to
it.

> I absolutely think we should fix each of those obstacles, and I'm
> sure we will eventually. But in the meantime, let's get this into
> Icehouse!

NACK.

--
Russell Bryant
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
On 01/27/2014 11:02 AM, Day, Phil wrote:
>> -Original Message-
>> From: Clint Byrum [mailto:cl...@fewbar.com]
>> Sent: 24 January 2014 21:09
>> To: openstack-dev
>> Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer
>> instances through metadata service
>>
>> Excerpts from Justin Santa Barbara's message of 2014-01-24 12:29:49
>> -0800:
>>> Clint Byrum cl...@fewbar.com wrote:
>>>> Heat has been working hard to be able to do per-instance limited
>>>> access in Keystone for a while. A trust might work just fine for
>>>> what you want.
>>>
>>> I wasn't actually aware of the progress on trusts. It would be
>>> helpful except (1) it is more work to have to create a separate
>>> trust (it is even more painful to do so with IAM) and (2) it
>>> doesn't look like we can yet lock down these delegations as much as
>>> people would probably want. I think IAM is the end-game in terms of
>>> the model that people actually want, and it ends up being
>>> incredibly complex. Delegation is very useful (particularly because
>>> clusters could auto-scale themselves), but I'd love to get an
>>> easier solution for the peer discovery problem than where
>>> delegation ends up.
>>>
>>>> Are you hesitant to just use Heat? This is exactly what it is
>>>> supposed to do: make a bunch of API calls and expose the results
>>>> to instances for use in configuration. If you're just hesitant to
>>>> use a declarative templating language, I totally understand. The
>>>> auto-scaling minded people are also feeling this way. You could
>>>> join them in the quest to create an imperative cluster-making API
>>>> for Heat.
>>>
>>> I don't want to _depend_ on Heat. My hope is that we can just
>>> launch 3 instances with the Cassandra image, and get a Cassandra
>>> cluster. It might be that we want Heat to auto-scale that cluster,
>>> Ceilometer to figure out when to scale it, Neutron to isolate it,
>>> etc., but I think we can solve the basic discovery problem cleanly
>>> without tying in all the other services. Heat's value-add doesn't
>>> come from solving this problem!
>>
>> I suppose we disagree on this fundamental point then. Heat's
>> value-add really does come from solving this exact problem. It
>> provides a layer above all of the other services to facilitate
>> expression of higher level concepts. Nova exposes a primitive API,
>> whereas Heat is meant to have a more logical expression of the
>> user's intentions. That includes exposure of details of one resource
>> to another (not just compute: swift containers, load balancers,
>> volumes, images, etc.).
>
> The main problem I see with using Heat is that it seems to depend on
> all instances having network access to the Heat server, and I'm not
> sure how that would work for a Neutron VPN network. This is already
> solved for the metadata server because the Neutron proxy already
> provides secure access.

That sounds like an integration issue we should fix (regardless of
whether it makes Justin's life any better). If we can't use Heat in
some situations because Neutron doesn't know how to securely proxy to
its metadata service... that's kinda yuck.
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Russell Bryant wrote:
> I'm saying use messaging as the means to implement discovery.

OK. Sorry that I didn't get this before.

>> 1) Marconi isn't widely deployed.
>
> Yet. I think we need to look to the future and decide on the right
> solution to the problem.

Agreed 100%. I actually believe this _is_ the correct long-term
solution. The fact that it doesn't depend on long-term roadmaps for
other projects is merely a nice bonus.

>> 2) There is no easy way for a node to discover Marconi, even if it
>> were deployed.
>
> That's what the Keystone service catalog is for.

Agreed. But, as far as I know, we have not defined how an instance
reaches the Keystone service catalog. Probably, we would need to
expose the Keystone endpoint in the metadata. (And, yes, we should do
that too, but it doesn't really matter until we solve #3...)

>> 3) There is no easy way for a node to authenticate to Marconi, even
>> if we could discover it.
>
> Huh? The whole point of Marconi is to allow instances to have a
> messaging service available to them to use. Of course they can auth
> to it.

As far as I know, we haven't defined any way for an instance to get
credentials to use. The only approach that I know of is that the
end-user puts their credentials into the metadata. But we don't have
particularly fine-grained roles, so I can't see anyone wanting that in
production!

>> I absolutely think we should fix each of those obstacles, and I'm
>> sure we will eventually. But in the meantime, let's get this into
>> Icehouse!
>
> NACK.

Well, there's no need to shout :-)

I understand the idea that everything in OpenStack should work
together: I am a big proponent of it. However, this blueprint is a
nice self-contained solution that solves a real problem today. The
alternative Marconi-based approach is not only years away from
public-cloud deployment, but will be more complicated for the end
user. Have you ever tried defining IAM roles on EC2? Yuk! Even once we
reach the happy day where we have Marconi everywhere, pub-sub queues,
IAM, instance roles, and Keystone auto-discovery, end users would
still prefer the "it just works" result this blueprint will provide.
As such we're not duplicating functionality, and we could have
discovery in June, not in Juno (or - realistically - "M").

So: is this a permanent no, or just a not-in-Icehouse no?

Justin
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Excerpts from Monty Taylor's message of 2014-02-05 14:57:33 -0800:
> On 01/27/2014 11:02 AM, Day, Phil wrote:
> [...]
>> The main problem I see with using Heat is that it seems to depend on
>> all instances having network access to the Heat server, and I'm not
>> sure how that would work for a Neutron VPN network. This is already
>> solved for the metadata server because the Neutron proxy already
>> provides secure access.
>
> That sounds like an integration issue we should fix (regardless of
> whether it makes Justin's life any better). If we can't use Heat in
> some situations because Neutron doesn't know how to securely proxy to
> its metadata service... that's kinda yuck.

Indeed, that is a known problem with Heat, and one that has several
solutions. One simple solution is for Heat to simply update the Nova
userdata, and for in-instance tools to just query EC2 metadata. The
only obstacle to that is that EC2 metadata is visible to
non-privileged users on the box unless extra restrictions are applied.
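[A minimal sketch of the in-instance half of what Clint describes: a
boot-time tool reads the userdata the launcher wrote, via the
EC2-compatible metadata endpoint that exists today. The
comma-separated seed format is an assumption for illustration only.]

    import urllib2

    # The EC2-compatible metadata service needs no credentials; the
    # userdata is whatever the launcher chose to write at boot.
    USER_DATA_URL = 'http://169.254.169.254/latest/user-data'

    user_data = urllib2.urlopen(USER_DATA_URL).read()
    # Assumed format: a comma-separated list of seed addresses.
    seeds = [s for s in user_data.strip().split(',') if s]

As Clint notes above, anything readable here is readable by any local
user on the box unless extra restrictions are applied.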
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Excerpts from Day, Phil's message of 2014-01-27 03:02:17 -0800:
> [...]
> The main problem I see with using Heat is that it seems to depend on
> all instances having network access to the Heat server, and I'm not
> sure how that would work for a Neutron VPN network. This is already
> solved for the metadata server because the Neutron proxy already
> provides secure access.

This is not actually true. For Justin's use case, only access to the
EC2 metadata is needed. When using Heat, one can set the userdata to
whatever one has already discovered before booting the server. Heat
uses its own metadata server for ongoing updates, but in Justin's
prescribed scenario the machines discover each other at boot-up only
anyway. So machine 0 sees no other machines, machine 1 sees machine 0,
machine 2 sees 0 and 1, and so on.
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Excerpts from Murray, Paul (HP Cloud Services)'s message of 2014-01-27
04:14:44 -0800:
> Hi Justin,
>
> My thought process is to go back to basics. To perform discovery
> there is no getting away from the fact that you have to start with a
> well-known address that your peers can access on the network. The
> second part is a service/protocol accessible at that address that can
> perform the discovery. So the questions are: what well-known
> addresses can I reach? And is that a suitable place to implement the
> service/protocol?
>
> The metadata service is different to the others in that it can be
> accessed without credentials (correct me if I'm wrong), so it is the
> only possibility out of the OpenStack services if you do not want to
> have credentials on the peer instances. If that is not the case then
> the other services are options. All services require security groups
> and/or networks to be configured appropriately to access them.
>
> (Yes, the question "can all instances access the same metadata
> service" really did mean "are they all local". Sorry for being
> unclear. But I think your answer is yes, they are, right?)
>
> Implementing the peer discovery in the instances themselves requires
> some kind of multicast, or knowing a list of addresses to try. In
> both cases either the actual addresses or some name resolved through
> a naming service would do. Whatever is starting your instances does
> have access to at least Nova, so it can find out if there are any
> running instances and what their addresses are. These could be used
> as the addresses they try first. These are the way that internet p2p
> services work, and they work in the cloud.

That's kind of my point about using Heat. You can use any higher-level
tool to achieve this by dropping the existing addresses into userdata
and then using a gossip protocol to spread the word to existing nodes
about new ones.
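[A sketch of the launcher-side approach Paul describes, using
python-novaclient as it existed in this era. The credential
placeholders, seed-file path, and service names are illustrative, not
part of any proposal; credentials live with the launcher, never on the
instances.]

    from novaclient.v1_1 import client

    # USER, API_KEY, PROJECT, AUTH_URL, IMAGE_ID and FLAVOR_ID are
    # placeholders for the launcher's own configuration.
    nova = client.Client(USER, API_KEY, PROJECT, AUTH_URL)

    # Collect the addresses of instances already running in the tenant.
    seeds = []
    for server in nova.servers.list():
        for addresses in server.addresses.values():
            seeds.extend(a['addr'] for a in addresses)

    # Seed the new node via userdata; a gossip protocol spreads the
    # word from there, as described above.
    user_data = ('#!/bin/sh\necho "%s" > /etc/myservice/seeds\n'
                 % ','.join(seeds))
    nova.servers.create('node-3', image=IMAGE_ID, flavor=FLAVOR_ID,
                        userdata=user_data)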
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
> -Original Message-
> From: Justin Santa Barbara [mailto:jus...@fathomdb.com]
> Sent: 28 January 2014 20:17
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer
> instances through metadata service
>
> [...]
> So I propose that sharing will work in the same way, but some values
> are visible across all instances in the project. I do not think it
> would be appropriate for all entries to be shared this way. A few
> options:
>
> 1) A separate endpoint for shared values
> 2) Keys are shared iff e.g. they start with a prefix, like 'peers_XXX'
> 3) Keys are set the same way, but a 'shared' parameter can be passed,
>    either as a query parameter or in the JSON.
>
> I like option #3 the best, but feedback is welcome.
> [...]

I think #1 or #3 would be fine. I don't really like #2 - doing this
kind of thing through naming conventions always leads to problems,
IMO.

Phil
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
> -Original Message-
> From: Vishvananda Ishaya [mailto:vishvana...@gmail.com]
> Sent: 29 January 2014 03:40
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer
> instances through metadata service
>
> [...]
> I am -1 on the post data. I think we should avoid using the metadata
> service as a cheap queue for communicating across vms, and this moves
> strongly in that direction.
>
> I am +1 on providing a list of ip addresses in the current security
> group(s) via metadata. I like limiting by security group instead of
> project because this could prevent the 1000-instance case where
> people have large shared tenants, and it also provides a single
> tenant a way to have multiple autodiscovered services. Also, the
> security group info is something that Neutron has access to, so the
> Neutron proxy should be able to generate the necessary info if
> Neutron is in use.

If the visibility is going to be controlled by security group
membership, then security groups will have to be extended to have a
"share metadata" attribute. It's not valid to assume that instances in
the same security group should be able to see information about each
other.

The fundamental problem I see here is that the user who has access to
the guest OS, and who therefore has access to the metadata, is not
always the same as the owner of the VM. A PaaS service that runs
multiple VMs in the same tenant, and makes those individual VMs
available to separate users, needs to be able to prevent those users
from discovering the other VMs in the same tenant. Those VMs normally
are in the same SG, as they have common inbound and outbound rules -
but access within the group is disabled.

The other concern I have about bounding the scope with security groups
is that it's quite possible that the VMs that want to discover each
other could be in different security groups. That would seem to lead
to folks having to create a separate SG (maybe with no rules) just to
scope discoverability. It kind of feels like we're in danger of
overloading the role of security groups here in the same way that we
want to avoid overloading the scope of the metadata service - although
I can see that a security group is closer in concept to the kind of
relationship between VMs that we're trying to express.
> Just as an interesting side note, we put this vm list in way back in
> the NASA days as an easy way to get mpi clusters running. [...]
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Certainly my original inclination (and code!) was to agree with you
Vish, but:

1) It looks like we're going to have writable metadata anyway, for
communication from the instance to the API.
2) I believe the restrictions make it impractical to abuse it as a
message bus: size limits, quotas, and write-once make it very poorly
suited for anything queue-like.
3) Anything that isn't opt-in will likely have security implications,
which means that it won't get deployed. This must be deployed to be
useful.

In short: I agree that it's not the absolute ideal solution (for me,
that would be no opt-in), but it feels like the best solution given
that we must have opt-in, or else e.g. HP won't deploy it. It uses a
(soon to be) existing mechanism, and is readily extensible without
breaking APIs.

On your idea of scoping by security group, I believe a certain someone
is looking at supporting hierarchical projects, so we will likely need
to support more advanced logic here later anyway. For example: the
ability to specify whether an entry should be shared with instances in
child projects. This will likely take the form of a sort of selector
language, so I anticipate we could offer a filter on security groups
as well if this is useful. We might well also allow selection by
instance tags. The approach allows this, though I would like to keep
it as simple as possible at first (share with other instances in the
project, or don't share).

Justin

On Tue, Jan 28, 2014 at 10:39 PM, Vishvananda Ishaya
vishvana...@gmail.com wrote:
> [...]
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
On Jan 29, 2014, at 5:26 AM, Justin Santa Barbara jus...@fathomdb.com
wrote:

> Certainly my original inclination (and code!) was to agree with you
> Vish, but:
>
> 1) It looks like we're going to have writable metadata anyway, for
> communication from the instance to the API.
> 2) I believe the restrictions make it impractical to abuse it as a
> message bus: size limits, quotas, and write-once make it very poorly
> suited for anything queue-like.
> 3) Anything that isn't opt-in will likely have security implications,
> which means that it won't get deployed. This must be deployed to be
> useful.

Fair enough. I agree that there are significant enough security
implications to skip the simple version.

Vish

> [...]
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
On 27 January 2014 14:52, Justin Santa Barbara jus...@fathomdb.com
wrote:
> Day, Phil wrote:
> [...]
>
> We would have a new endpoint, say 'discovery'. An instance can POST a
> single string value to the endpoint. A GET on the endpoint will
> return any values posted by all instances in the same project. One
> key only; name not publicly exposed ('discovery_datum'?); 255 bytes
> of value only. I expect most instances will just post their IPs, but
> I expect other uses will be found.
>
> If I provided a patch that worked in this way, would you/others be
> on-board?

I like that idea. Seems like a good compromise.

I have added my review comments to the blueprint.

We have this related blueprint going on, setting metadata on a
particular server, rather than a group:
https://blueprints.launchpad.net/nova/+spec/metadata-service-callbacks

It is limiting things using the existing quota on metadata updates.

It would be good to agree a similar format between the two.

John
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Thanks John - combining with the existing effort seems like the right
thing to do (I've reached out to Claxton to coordinate). Great to see
that the larger issues around quotas / write-once have already been
agreed.

So I propose that sharing will work in the same way, but some values
are visible across all instances in the project. I do not think it
would be appropriate for all entries to be shared this way. A few
options:

1) A separate endpoint for shared values
2) Keys are shared iff e.g. they start with a prefix, like 'peers_XXX'
3) Keys are set the same way, but a 'shared' parameter can be passed,
   either as a query parameter or in the JSON.

I like option #3 the best, but feedback is welcome.

I think I will have to store the value using a system_metadata entry
per shared key. I think this avoids issues with concurrent writes, and
also makes it easier to have more advanced sharing policies (e.g. when
we have hierarchical projects).

Thank you to everyone for helping me get to what IMHO is a much better
solution than the one I started with!

Justin

On Tue, Jan 28, 2014 at 4:38 AM, John Garbutt j...@johngarbutt.com
wrote:
> [...]
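[For concreteness, a hypothetical sketch of option #3 from the
instance side. Nothing here is an existing API: the path, field names,
and the 'shared' flag are all illustrative of the proposal only.]

    import json
    import urllib2

    # Proposed (not existing) writable-metadata endpoint.
    URL = 'http://169.254.169.254/openstack/latest/metadata'

    # Option #3: the key is set the normal way, with a 'shared' flag
    # in the JSON marking it visible to all instances in the project.
    body = json.dumps({'key': 'peers', 'value': '10.0.0.5:7000',
                       'shared': True})
    req = urllib2.Request(URL, body,
                          {'Content-Type': 'application/json'})
    urllib2.urlopen(req)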
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
On Jan 28, 2014, at 12:17 PM, Justin Santa Barbara jus...@fathomdb.com
wrote:

> [...]
> 1) A separate endpoint for shared values
> 2) Keys are shared iff e.g. they start with a prefix, like 'peers_XXX'
> 3) Keys are set the same way, but a 'shared' parameter can be passed,
>    either as a query parameter or in the JSON.
>
> I like option #3 the best, but feedback is welcome.
> [...]

I am -1 on the post data. I think we should avoid using the metadata
service as a cheap queue for communicating across vms, and this moves
strongly in that direction.

I am +1 on providing a list of ip addresses in the current security
group(s) via metadata. I like limiting by security group instead of
project because this could prevent the 1000-instance case where people
have large shared tenants, and it also provides a single tenant a way
to have multiple autodiscovered services. Also, the security group
info is something that Neutron has access to, so the Neutron proxy
should be able to generate the necessary info if Neutron is in use.

Just as an interesting side note, we put this vm list in way back in
the NASA days as an easy way to get MPI clusters running. In this case
we grouped the instances by the key_name used to launch the instance
instead of security group. I don't think it occurred to us to use
security groups at the time. Note we also provided the number of
cores, but this was for convenience because the MPI implementation
didn't support discovering the number of cores. Code below.
Vish

$ git show 2cf40bb3
commit 2cf40bb3b21d33f4025f80d175a4c2ec7a2f8414
Author: Vishvananda Ishaya vishvana...@yahoo.com
Date:   Thu Jun 24 04:11:54 2010 +0100

    Adding mpi data

diff --git a/nova/endpoint/cloud.py b/nova/endpoint/cloud.py
index 8046d42..74da0ee 100644
--- a/nova/endpoint/cloud.py
+++ b/nova/endpoint/cloud.py
@@ -95,8 +95,21 @@ class CloudController(object):
     def get_instance_by_ip(self, ip):
         return self.instdir.by_ip(ip)

+    def _get_mpi_data(self, project_id):
+        result = {}
+        for node_name, node in self.instances.iteritems():
+            for instance in node.values():
+                if instance['project_id'] == project_id:
+                    line = '%s slots=%d' % (
+                        instance['private_dns_name'],
+                        instance.get('vcpus', 0))
+                    if instance['key_name'] in result:
+                        result[instance['key_name']].append(line)
+                    else:
+                        result[instance['key_name']] = [line]
+        return result
+
     def get_metadata(self, ip):
         i = self.get_instance_by_ip(ip)
+        mpi = self._get_mpi_data(i['project_id'])
         if i is None:
             return None
         if i['key_name']:
@@ -135,7 +148,8 @@ class CloudController(object):
                 'public-keys' : keys,
                 'ramdisk-id': i.get('ramdisk_id', ''),
                 'reservation-id': i['reservation_id'],
-                'security-groups': i.get('groups', '')
+                'security-groups': i.get('groups', ''),
+                'mpi': mpi
             }
         }
         if False:  # TODO: store ancestor ids

On Tue, Jan 28, 2014 at 4:38 AM, John Garbutt j...@johngarbutt.com
wrote:
> [...]
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
>> What worried me most, I think, is that if we make this part of the
>> standard metadata then everyone would get it, and that raises a
>> couple of concerns:
>>
>> - Users with lots of instances (say 1000's) but who weren't trying
>> to run any form of discovery would start getting a lot more metadata
>> returned, which might cause performance issues
>
> The list of peers is only returned if the request comes in for
> peers.json, so there's no growth in the returned data unless it is
> requested. Because of the very clear instructions in the comment to
> always pre-fetch data, it is always pre-fetched, even though it would
> make more sense to me to fetch it lazily when it was requested! Easy
> to fix, but I'm obeying the comment because it was phrased in the
> form of a grammatically valid sentence :-)

OK, thanks for the clarification - I'd missed that this was a new JSON
object; I thought you were just adding the data onto the existing
object.

>> - Some users might be running instances on behalf of customers
>> (consider say a PaaS type service where the user gets access into an
>> instance but not to the Nova API). In that case I wouldn't want one
>> instance to be able to discover these kinds of details about other
>> instances.
>
> Yes, I do think this is a valid concern. But there is likely to be
> _much_ more sensitive information in the metadata service, so anyone
> doing this is hopefully blocking the metadata service anyway. On EC2
> with IAM, or if we use trusts, there will be an auth token in there.
> And not just for security, but also because if the PaaS program is
> auto-detecting EC2/OpenStack by looking for the metadata service,
> that will cause the program to be very confused if it sees the
> metadata for its host!

Currently the metadata service only returns information for the
instance that is requesting it (the Neutron proxy validates the source
address and project), so the concern around sensitive information is
already mitigated. But if we're now going to return information about
other instances, that changes the picture somewhat.

>> We already have a mechanism now where an instance can push metadata
>> as a way of Windows instances sharing their passwords - so maybe
>> this could build on that somehow - for example each instance pushes
>> the data it's willing to share with other instances owned by the
>> same tenant?
>
> I do like that and think it would be very cool, but it is much more
> complex to implement, I think.

I don't think it's that complicated - it just needs one extra
attribute stored per instance (for example into
instance_system_metadata) which allows the instance to be included in
the list.

> It also starts to become a different problem: I do think we need a
> state-store, like Swift or etcd or ZooKeeper, that is easily
> accessible to the instances. Indeed, one of the things I'd like to
> build using this blueprint is a distributed key-value store which
> would offer that functionality. But I think that peer discovery is a
> much more tightly defined blueprint, whereas some form of shared
> read-write data-store is probably top-level project complexity.

Isn't the metadata already in effect that state-store?

>> I'd just like to see it separate from the existing metadata blob,
>> and on an opt-in basis
>
> Separate: is peers.json enough? I'm not sure I'm understanding you
> here.

Yep - that ticks the "separate" box.

> Opt-in: IMHO, the danger of our OpenStack
> everything-is-optional-and-configurable approach is that we end up in
> a scenario where nothing is consistent and so nothing works out of
> the box. I'd much rather hash out an agreement about what is safe to
> share, even if that is just IPs, and then get to the point where it
> is globally enabled. Would you be OK with it if it was just a list of
> IPs?

I still think that would cause problems for PaaS services that
abstract the users away from direct control of the instance (i.e. the
PaaS service is the Nova tenant, and creates instances in that tenant
that are then made available to individual users). At the moment the
only data such a user can see, even from metadata, are details of
their own instance. Extending that to allow discovery of other
instances in the same tenant still feels to me to be something that
needs to be controllable.

The number of instances that want / need to be able to discover each
other is a subset of all instances, so making those explicitly declare
themselves to the metadata service (when they already have to have the
logic to get peers.json) doesn't sound like a major additional
complication to me.

Cheers,
Phil
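[A sketch of how a booting instance might consume the proposed
peers.json document. The path follows the blueprint's naming; the
field names are assumptions based on the information Justin lists
later in the thread, not a defined schema.]

    import json
    import urllib2

    PEERS_URL = 'http://169.254.169.254/openstack/latest/peers.json'

    # Returned only when explicitly requested, per Justin's note above.
    peers = json.load(urllib2.urlopen(PEERS_URL))
    candidate_ips = []
    for peer in peers:
        # Assumed fields: fixed IPs plus hints (security groups,
        # availability zone) usable for directed scanning.
        candidate_ips.extend(peer.get('fixed_ips', []))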
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Hi Justin,

My thought process is to go back to basics. To perform discovery there
is no getting away from the fact that you have to start with a
well-known address that your peers can access on the network. The
second part is a service/protocol accessible at that address that can
perform the discovery. So the questions are: what well-known addresses
can I reach? And is that a suitable place to implement the
service/protocol?

The metadata service is different to the others in that it can be
accessed without credentials (correct me if I'm wrong), so it is the
only possibility out of the OpenStack services if you do not want to
have credentials on the peer instances. If that is not the case then
the other services are options. All services require security groups
and/or networks to be configured appropriately to access them.

(Yes, the question "can all instances access the same metadata
service" really did mean "are they all local". Sorry for being
unclear. But I think your answer is yes, they are, right?)

Implementing the peer discovery in the instances themselves requires
some kind of multicast, or knowing a list of addresses to try. In both
cases either the actual addresses or some name resolved through a
naming service would do. Whatever is starting your instances does have
access to at least Nova, so it can find out if there are any running
instances and what their addresses are. These could be used as the
addresses they try first. These are the way that internet p2p services
work, and they work in the cloud.

So there are options. The metadata service is a good place in terms of
accessibility, but may not be for other reasons. In particular, the
lack of credentials relates to the fact that an instance is only
allowed to see its own information. Making that more dynamic and
including information about other things in the system might change
the security model slightly. Secondly, is it the purpose of the
metadata server to do this job? That's more a matter of choice.
Personally, I think no, this is not the right place.

Paul

From: Justin Santa Barbara [mailto:jus...@fathomdb.com]
Sent: 24 January 2014 21:01
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer
instances through metadata service

Murray, Paul (HP Cloud Services) wrote:
> Multicast is not generally used over the internet, so the comment
> about removing multicast is not really justified, and any of the
> approaches that work there could be used.

I think multicast/broadcast is commonly used 'behind the firewall',
but I'm happy to hear of any other alternatives that you would
recommend - particularly if they can work on the cloud!

> I agree that the metadata service is a sensible alternative. Do you
> imagine your instances all having access to the same metadata
> service? Is there something more generic and not tied to the
> architecture of a single openstack deployment?

Not sure I understand - doesn't every Nova instance have access to the
metadata service, and don't they all connect to the same back-end
database? Has anyone not deployed the metadata service? It is not
cross-region / cross-provider - is that what you mean? In terms of
implementation (https://review.openstack.org/#/c/68825/) it is
supposed to be the same as if you had done a list-instances call on
the API provider. I know there's been talk of federation here; when
this happens it would be awesome to have a cross-provider view
(optionally, probably).
> Although this is a simple example, it is also the first of quite a
> lot of useful primitives that are commonly provided by configuration
> services. As it is possible to do what you want by other means
> (including using an implementation that has multicast within subnets
> - I'm sure Neutron does actually have this), it seems that this makes
> less of a special case and rather a requirement for a more general
> notification service?

I don't see any other solution offering as easy a solution for users
(either the developer of the application or the person that launches
the instances). If every instance had an automatic Keystone
token/trust with read-only access to its own project, that would be
great. If Heat intercepted every Nova call and added metadata, that
would be great. If Marconi offered every instance a 'broadcast' queue
where it could reach all its peers, and we had a Keystone trust for
that, that would be great. But those are all 12-month projects, and
even if you built them and they were awesome, they still wouldn't get
deployed on all the major clouds, so I _still_ couldn't rely on them
as an application developer.

My hope is to find something that every cloud can be comfortable
deploying, that solves discovery just as broadcast/multicast solves it
on typical LANs. It may be that anything other than IP addresses will
make e.g. HP public cloud uncomfortable; if so then I'll tweak it to
just be IPs. Finding [...]
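[Paul's "well-known address" is concrete today: every instance can
reach the metadata service at the link-local address without
credentials, and can only see its own data. For example:]

    import urllib2

    # The EC2-compatible metadata service; this endpoint exists today
    # and requires no credentials.
    BASE = 'http://169.254.169.254/latest/meta-data'

    my_ip = urllib2.urlopen(BASE + '/local-ipv4').read()

The discovery proposals in this thread all amount to extending what is
reachable at that one address.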
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Day, Phil wrote:
>>> We already have a mechanism now where an instance can push metadata
>>> as a way of Windows instances sharing their passwords - so maybe
>>> this could build on that somehow - for example each instance pushes
>>> the data it's willing to share with other instances owned by the
>>> same tenant?
>>
>> I do like that and think it would be very cool, but it is much more
>> complex to implement, I think.
>
> I don't think it's that complicated - it just needs one extra
> attribute stored per instance (for example into
> instance_system_metadata) which allows the instance to be included in
> the list

Ah - OK, I think I better understand what you're proposing, and I do
like it. The hardest bit of having the metadata store be full
read/write would be defining what is and is not allowed (rate limits,
size limits, etc.). I worry that you end up with a new key-value
store, and with per-instance credentials. That would be a separate
discussion: this blueprint is trying to provide a focused replacement
for multicast discovery for the cloud.

But: thank you for reminding me about the Windows password. It may
provide a reasonable model:

We would have a new endpoint, say 'discovery'. An instance can POST a
single string value to the endpoint. A GET on the endpoint will return
any values posted by all instances in the same project. One key only;
name not publicly exposed ('discovery_datum'?); 255 bytes of value
only. I expect most instances will just post their IPs, but I expect
other uses will be found.

If I provided a patch that worked in this way, would you/others be
on-board?

Justin
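[A hypothetical sketch of the proposed 'discovery' endpoint from the
instance side. The path and response shape are illustrative; only the
POST-one-value / GET-all-values behaviour comes from the proposal
above.]

    import urllib2

    # Proposed (not existing) endpoint.
    DISCOVERY_URL = 'http://169.254.169.254/openstack/latest/discovery'

    # POST my single string value (<= 255 bytes); most instances would
    # just post their IP, per the proposal.
    urllib2.urlopen(DISCOVERY_URL, data='10.0.0.5:7000')

    # GET returns the values posted by all instances in the project;
    # one value per line is an assumption for illustration.
    values = urllib2.urlopen(DISCOVERY_URL).read().splitlines()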
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Hi Justin, I can see the value of this, but I'm a bit wary of the metadata service extending into a general API - for example I can see this extending into a debate about what information needs to be made available about the instances (would you always want all instances exposed, all details, etc) - if not we'd end up starting to implement policy restrictions in the metadata service and starting to replicate parts of the API itself.
Just seeing instances launched before me doesn't really help if they've been deleted (but are still in the cached values), does it? Since there is some external agent creating these instances, why can't that just provide the details directly as user-defined metadata? Phil
From: Justin Santa Barbara [mailto:jus...@fathomdb.com] Sent: 23 January 2014 16:29 To: OpenStack Development Mailing List Subject: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Would appreciate feedback / opinions on this blueprint: https://blueprints.launchpad.net/nova/+spec/first-discover-your-peers
The idea is: clustered services typically run some sort of gossip protocol, but need to find (just) one peer to connect to. In the physical environment, this was done using multicast. On the cloud, that isn't a great solution. Instead, I propose exposing a list of instances in the same project, through the metadata service. In particular, I'd like to know if anyone has other use cases for instance discovery.
For peer-discovery, we can cache the instance list for the lifetime of the instance, because it suffices merely to see instances that were launched before me (peer1 might not join to peer2, but peer2 will join to peer1). Other use cases are likely much less forgiving! Justin
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Good points - thank you. For arbitrary operations, I agree that it would be better to expose a token in the metadata service, rather than allowing the metadata service to expose unbounded amounts of API functionality. We should therefore also have a per-instance token in the metadata, though I don't see Keystone getting the prerequisite IAM-level functionality for two+ releases (?).
However, I think I can justify peer discovery as the 'one exception'. Here's why: discovery of peers is widely used for self-configuring clustered services, including those built in pre-cloud days. Multicast/broadcast used to be the solution, but cloud broke that. The cloud is supposed to be about distributed systems, yet we broke the primary way distributed systems do peer discovery. Today's workarounds are pretty terrible, e.g. uploading to an S3 bucket, or sharing EC2 credentials with the instance (tolerable now with IAM, but painful to configure). We're not talking about allowing instances to program the architecture (e.g. attach volumes etc), but rather just to do the equivalent of a multicast for discovery. In other words, we're restoring some functionality we took away (discovery via multicast) rather than adding programmable-infrastructure cloud functionality.
We expect the instances to start a gossip protocol to determine who is actually up/down, who else is in the cluster, etc. As such, we don't need accurate information - we only have to help a node find one living peer. (Multicast/broadcast was not entirely reliable either!) Further, instance #2 will contact instance #1, so it doesn’t matter if instance #1 doesn’t have instance #2 in the list, as long as instance #2 sees instance #1. I'm relying on the idea that instance launching takes time > 0, so other instances will be in the starting state when the metadata request comes in, even if we launch instances simultaneously. (Another reason why I don't filter instances by state!)
I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is. Anyone know where it is cached?
In terms of information exposed: An alternative would be to try to connect to every IP in the subnet we are assigned; this blueprint can be seen as an optimization on that (to avoid DDOS-ing the public clouds). So I’ve tried to expose only the information that enables directed scanning: availability zone, reservation id, security groups, and network IDs, labels, CIDRs and IPs [example below]. A naive implementation will just try every peer; a smarter implementation might check the security groups to try to filter it, or the zone information to try to connect to nearby peers first. Note that I don’t expose e.g. the instance state: if you want to know whether a node is up, you have to try connecting to it. I don't believe any of this information is at all sensitive, particularly not to instances in the same project.
On external agents doing the configuration: yes, they could put this into user-defined metadata, but then we're tied to a configuration system. We have to get 20 configuration systems to agree on a common format (Heat, Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the home-grown systems!) It also makes it hard to launch instances concurrently (because you want node #2 to have the metadata for node #1, so you have to wait for node #1 to get an IP).
More generally though, I have in mind a different model, which I call 'configuration from within' (as in 'truth comes from within'). I don’t want a big imperialistic configuration system that comes and enforces its view of the world onto primitive machines. I want a smart machine that comes into existence, discovers other machines and cooperates with them. This is the Netflix pre-baked AMI concept, rather than the configuration management approach. The blueprint does not exclude 'imperialistic' configuration systems, but it does enable e.g. just launching N instances in one API call, or just using an auto-scaling group. I suspect the configuration management systems would prefer this to having to implement this themselves. (Example JSON below) Justin
--- Example JSON:
    [
      {
        "availability_zone": "nova",
        "network_info": [
          {
            "id": "e60bbbaf-1d2e-474e-bbd2-864db7205b60",
            "network": {
              "id": "f2940cd1-f382-4163-a18f-c8f937c99157",
              "label": "private",
              "subnets": [
                {
                  "cidr": "10.11.12.0/24",
                  "ips": [
                    { "address": "10.11.12.4", "type": "fixed", "version": 4 }
                  ],
                  "version": 4
                },
                {
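A naive consumer of a peers document like the one above needs nothing more than this sketch (the port is arbitrary and the parsing follows the example's shape; a smarter client would filter on security groups or prefer same-zone peers first):

    # Sketch: directed scanning over a peers document of the shape above.
    import json
    import socket

    def find_live_peer(peers_json, port=7000, timeout=2.0):
        """Try each advertised fixed IP until one accepts a connection."""
        for instance in json.loads(peers_json):
            for vif in instance.get('network_info', []):
                for subnet in vif['network']['subnets']:
                    for ip in subnet['ips']:
                        try:
                            # One living peer is enough; gossip takes over from here.
                            return socket.create_connection((ip['address'], port), timeout)
                        except socket.error:
                            continue
        return None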
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is. Anyone know where it is cached?
Here's the code that does the caching: https://github.com/openstack/nova/blob/master/nova/api/metadata/handler.py#L84-L98 Data is only cached for 15 seconds by default - the main reason for caching is that cloud-init makes a sequence of calls to get various items of metadata, and it saves a lot of DB access if we don't have to go back for them multiple times. If you're using the OpenStack metadata calls instead then the caching doesn't buy much, as it returns a single JSON blob with all the values.
From: Justin Santa Barbara [mailto:jus...@fathomdb.com] Sent: 24 January 2014 15:43 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Good points - thank you. For arbitrary operations, I agree that it would be better to expose a token in the metadata service, rather than allowing the metadata service to expose unbounded amounts of API functionality. We should therefore also have a per-instance token in the metadata, though I don't see Keystone getting the prerequisite IAM-level functionality for two+ releases (?).
However, I think I can justify peer discovery as the 'one exception'. Here's why: discovery of peers is widely used for self-configuring clustered services, including those built in pre-cloud days. Multicast/broadcast used to be the solution, but cloud broke that. The cloud is supposed to be about distributed systems, yet we broke the primary way distributed systems do peer discovery. Today's workarounds are pretty terrible, e.g. uploading to an S3 bucket, or sharing EC2 credentials with the instance (tolerable now with IAM, but painful to configure). We're not talking about allowing instances to program the architecture (e.g. attach volumes etc), but rather just to do the equivalent of a multicast for discovery. In other words, we're restoring some functionality we took away (discovery via multicast) rather than adding programmable-infrastructure cloud functionality.
We expect the instances to start a gossip protocol to determine who is actually up/down, who else is in the cluster, etc. As such, we don't need accurate information - we only have to help a node find one living peer. (Multicast/broadcast was not entirely reliable either!) Further, instance #2 will contact instance #1, so it doesn't matter if instance #1 doesn't have instance #2 in the list, as long as instance #2 sees instance #1. I'm relying on the idea that instance launching takes time > 0, so other instances will be in the starting state when the metadata request comes in, even if we launch instances simultaneously. (Another reason why I don't filter instances by state!)
I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is. Anyone know where it is cached?
In terms of information exposed: An alternative would be to try to connect to every IP in the subnet we are assigned; this blueprint can be seen as an optimization on that (to avoid DDOS-ing the public clouds). So I've tried to expose only the information that enables directed scanning: availability zone, reservation id, security groups, and network IDs, labels, CIDRs and IPs [example below].
A naive implementation will just try every peer; a smarter implementation might check the security groups to try to filter it, or the zone information to try to connect to nearby peers first. Note that I don't expose e.g. the instance state: if you want to know whether a node is up, you have to try connecting to it. I don't believe any of this information is at all sensitive, particularly not to instances in the same project.
On external agents doing the configuration: yes, they could put this into user-defined metadata, but then we're tied to a configuration system. We have to get 20 configuration systems to agree on a common format (Heat, Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the home-grown systems!) It also makes it hard to launch instances concurrently (because you want node #2 to have the metadata for node #1, so you have to wait for node #1 to get an IP).
More generally though, I have in mind a different model, which I call 'configuration from within' (as in 'truth comes from within'). I don't want a big imperialistic configuration system that comes and enforces its view of the world onto primitive machines. I want a smart machine that comes into existence, discovers other machines and cooperates with them. This is the Netflix pre-baked AMI concept, rather than the configuration management approach. The blueprint does not exclude 'imperialistic' configuration systems, but it does enable e.g. just launching N instances in one API call, or just using an auto-scaling group.
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
On Fri, Jan 24, 2014 at 12:55 PM, Day, Phil philip@hp.com wrote:
I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is. Anyone know where it is cached?
Here’s the code that does the caching: https://github.com/openstack/nova/blob/master/nova/api/metadata/handler.py#L84-L98 Data is only cached for 15 seconds by default – the main reason for caching is that cloud-init makes a sequence of calls to get various items of metadata, and it saves a lot of DB access if we don’t have to go back for them multiple times. If you're using the OpenStack metadata calls instead then the caching doesn’t buy much, as it returns a single JSON blob with all the values.
Thanks (not quite sure how I missed that, but I did!) 15-second 'micro-caching' is probably great for peer discovery. Short enough that we'll find any peer basically as soon as it boots if we're polling (e.g. we haven't yet connected to a peer), long enough to prevent denial-of-service.
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
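The trade-off being discussed can be pictured with a toy version of that per-instance cache (the real logic lives in the handler linked above; this sketch just shows why cloud-init's call bursts and polling clients stay cheap):

    # Sketch: 15-second micro-caching of metadata, keyed by requester address.
    import time

    CACHE_TTL = 15   # seconds, the default described above
    _cache = {}      # address -> (expiry_timestamp, metadata)

    def get_metadata(address, load_from_db):
        entry = _cache.get(address)
        if entry and entry[0] > time.time():
            return entry[1]                       # cache hit: no DB access
        data = load_from_db(address)              # at most one DB trip per TTL window
        _cache[address] = (time.time() + CACHE_TTL, data)
        return data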
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Excerpts from Justin Santa Barbara's message of 2014-01-24 07:43:23 -0800:
Good points - thank you. For arbitrary operations, I agree that it would be better to expose a token in the metadata service, rather than allowing the metadata service to expose unbounded amounts of API functionality. We should therefore also have a per-instance token in the metadata, though I don't see Keystone getting the prerequisite IAM-level functionality for two+ releases (?).
Heat has been working hard to be able to do per-instance limited access in Keystone for a while. A trust might work just fine for what you want.
However, I think I can justify peer discovery as the 'one exception'. Here's why: discovery of peers is widely used for self-configuring clustered services, including those built in pre-cloud days. Multicast/broadcast used to be the solution, but cloud broke that. The cloud is supposed to be about distributed systems, yet we broke the primary way distributed systems do peer discovery. Today's workarounds are pretty terrible, e.g. uploading to an S3 bucket, or sharing EC2 credentials with the instance (tolerable now with IAM, but painful to configure). We're not talking about allowing instances to program the architecture (e.g. attach volumes etc), but rather just to do the equivalent of a multicast for discovery. In other words, we're restoring some functionality we took away (discovery via multicast) rather than adding programmable-infrastructure cloud functionality.
Are you hesitant to just use Heat? This is exactly what it is supposed to do: make a bunch of API calls and expose the results to instances for use in configuration. If you're just hesitant to use a declarative templating language, I totally understand. The auto-scaling-minded people are also feeling this way. You could join them in the quest to create an imperative cluster-making API for Heat.
We expect the instances to start a gossip protocol to determine who is actually up/down, who else is in the cluster, etc. As such, we don't need accurate information - we only have to help a node find one living peer. (Multicast/broadcast was not entirely reliable either!) Further, instance #2 will contact instance #1, so it doesn’t matter if instance #1 doesn’t have instance #2 in the list, as long as instance #2 sees instance #1. I'm relying on the idea that instance launching takes time > 0, so other instances will be in the starting state when the metadata request comes in, even if we launch instances simultaneously. (Another reason why I don't filter instances by state!)
I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is. Anyone know where it is cached?
In terms of information exposed: An alternative would be to try to connect to every IP in the subnet we are assigned; this blueprint can be seen as an optimization on that (to avoid DDOS-ing the public clouds). So I’ve tried to expose only the information that enables directed scanning: availability zone, reservation id, security groups, and network IDs, labels, CIDRs and IPs [example below]. A naive implementation will just try every peer; a smarter implementation might check the security groups to try to filter it, or the zone information to try to connect to nearby peers first. Note that I don’t expose e.g. the instance state: if you want to know whether a node is up, you have to try connecting to it.
I don't believe any of this information is at all sensitive, particularly not to instances in the same project.
On external agents doing the configuration: yes, they could put this into user-defined metadata, but then we're tied to a configuration system. We have to get 20 configuration systems to agree on a common format (Heat, Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the home-grown systems!) It also makes it hard to launch instances concurrently (because you want node #2 to have the metadata for node #1, so you have to wait for node #1 to get an IP).
More generally though, I have in mind a different model, which I call 'configuration from within' (as in 'truth comes from within'). I don’t want a big imperialistic configuration system that comes and enforces its view of the world onto primitive machines. I want a smart machine that comes into existence, discovers other machines and cooperates with them. This is the Netflix pre-baked AMI concept, rather than the configuration management approach.
:) We are on the same page. I really think Heat is where higher-level information sharing of this type belongs. I do think it might make sense for Heat to push things into user-data post-boot, rather than only expose them via its own metadata service. However, even without that, you can achieve what you're talking about right now with Heat's separate metadata.
The blueprint does not exclude 'imperialistic' configuration systems, but it does enable e.g. just launching N instances in one API call, or just using an auto-scaling group.
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Good points - thank you. For arbitrary operations, I agree that it would be better to expose a token in the metadata service, rather than allowing the metadata service to expose unbounded amounts of API functionality. We should therefore also have a per-instance token in the metadata, though I don't see Keystone getting the prerequisite IAM-level functionality for two+ releases (?).
I can also see that in Neutron not all instances have access to the API servers, so I'm not against having something in metadata, provided it's well-focused. ...
In terms of information exposed: An alternative would be to try to connect to every IP in the subnet we are assigned; this blueprint can be seen as an optimization on that (to avoid DDOS-ing the public clouds).
Well if you're on a Neutron private network then you'd only be DDOS-ing yourself. In fact I think Neutron allows broadcast and multicast on private networks, and as nova-net is going to be deprecated at some point I wonder if this is reducing to a corner case?
So I've tried to expose only the information that enables directed scanning: availability zone, reservation id, security groups, and network IDs, labels, CIDRs and IPs [example below]. A naive implementation will just try every peer; a smarter implementation might check the security groups to try to filter it, or the zone information to try to connect to nearby peers first. Note that I don't expose e.g. the instance state: if you want to know whether a node is up, you have to try connecting to it. I don't believe any of this information is at all sensitive, particularly not to instances in the same project.
Does it really need all of that - it seems that the IP address would really be enough, and the agents or whatever in the instance could take it from there?
What worried me most, I think, is that if we make this part of the standard metadata then everyone would get it, and that raises a couple of concerns:
- Users with lots of instances (say 1000s) but who weren't trying to run any form of discovery would start getting a lot more metadata returned, which might cause performance issues.
- Some users might be running instances on behalf of customers (consider say a PaaS-type service where the user gets access into an instance but not to the Nova API). In that case I wouldn't want one instance to be able to discover these kinds of details about other instances.
So it kind of feels to me that this should be some other specific set of metadata that instances can ask for, and that instances have to explicitly opt into. We already have a mechanism now where an instance can push metadata as a way of Windows instances sharing their passwords - so maybe this could build on that somehow - for example each instance pushes the data it's willing to share with other instances owned by the same tenant?
On external agents doing the configuration: yes, they could put this into user-defined metadata, but then we're tied to a configuration system. We have to get 20 configuration systems to agree on a common format (Heat, Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the home-grown systems!) It also makes it hard to launch instances concurrently (because you want node #2 to have the metadata for node #1, so you have to wait for node #1 to get an IP).
Well you've kind of got to agree on a common format anyway, haven't you, if the information is going to come from metadata? But I get your other points.
More generally though, I have in mind a different model, which I call 'configuration from within' (as in 'truth comes from within'). I don't want a big imperialistic configuration system that comes and enforces its view of the world onto primitive machines. I want a smart machine that comes into existence, discovers other machines and cooperates with them. This is the Netflix pre-baked AMI concept, rather than the configuration management approach. The blueprint does not exclude 'imperialistic' configuration systems, but it does enable e.g. just launching N instances in one API call, or just using an auto-scaling group. I suspect the configuration management systems would prefer this to having to implement this themselves.
Yep, I get the concept, and metadata does seem like the best existing mechanism to do this, as it's already available to all instances regardless of where they are on the network, and it's a controlled interface. I'd just like to see it separate from the existing metadata blob, and on an opt-in basis. Phil
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
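Phil's opt-in model might reduce to something like the following on the API side (a sketch only; the 'share_with_peers' key and the instance attribute shapes are placeholders, not an agreed design):

    # Sketch: only instances that explicitly opted in appear in the
    # discovery output; everyone else stays invisible.
    OPT_IN_KEY = 'share_with_peers'   # hypothetical system-metadata key

    def discoverable_peers(instances):
        """Return the data each opted-in instance chose to publish."""
        peers = []
        for inst in instances:
            shared = inst.system_metadata.get(OPT_IN_KEY)
            if shared is not None:    # no key means the instance never opted in
                peers.append({'uuid': inst.uuid, 'shared': shared})
        return peers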
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Hi Justin, It's nice to see someone bringing this kind of thing up. Seeding discovery is a handy primitive to have.
Multicast is not generally used over the internet, so the comment about removing multicast is not really justified, and any of the approaches that work there could be used. Alternatively your instances could use the nova or neutron APIs to obtain any information you want - if they are network connected - but certainly whatever is starting them has access, so something can at least provide the information.
I agree that the metadata service is a sensible alternative. Do you imagine your instances all having access to the same metadata service? Is there something more generic and not tied to the architecture of a single openstack deployment?
Although this is a simple example, it is also the first of quite a lot of useful primitives that are commonly provided by configuration services. As it is possible to do what you want by other means (including using an implementation that has multicast within subnets - I'm sure neutron does actually have this), it seems that this makes less of a special case and rather a requirement for a more general notification service? Having said that I do like this kind of stuff :) Paul.
From: Justin Santa Barbara [mailto:jus...@fathomdb.com] Sent: 24 January 2014 15:43 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Good points - thank you. For arbitrary operations, I agree that it would be better to expose a token in the metadata service, rather than allowing the metadata service to expose unbounded amounts of API functionality. We should therefore also have a per-instance token in the metadata, though I don't see Keystone getting the prerequisite IAM-level functionality for two+ releases (?).
However, I think I can justify peer discovery as the 'one exception'. Here's why: discovery of peers is widely used for self-configuring clustered services, including those built in pre-cloud days. Multicast/broadcast used to be the solution, but cloud broke that. The cloud is supposed to be about distributed systems, yet we broke the primary way distributed systems do peer discovery. Today's workarounds are pretty terrible, e.g. uploading to an S3 bucket, or sharing EC2 credentials with the instance (tolerable now with IAM, but painful to configure). We're not talking about allowing instances to program the architecture (e.g. attach volumes etc), but rather just to do the equivalent of a multicast for discovery. In other words, we're restoring some functionality we took away (discovery via multicast) rather than adding programmable-infrastructure cloud functionality.
We expect the instances to start a gossip protocol to determine who is actually up/down, who else is in the cluster, etc. As such, we don't need accurate information - we only have to help a node find one living peer. (Multicast/broadcast was not entirely reliable either!) Further, instance #2 will contact instance #1, so it doesn't matter if instance #1 doesn't have instance #2 in the list, as long as instance #2 sees instance #1. I'm relying on the idea that instance launching takes time > 0, so other instances will be in the starting state when the metadata request comes in, even if we launch instances simultaneously. (Another reason why I don't filter instances by state!)
I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is. Anyone know where it is cached?
In terms of information exposed: An alternative would be to try to connect to every IP in the subnet we are assigned; this blueprint can be seen as an optimization on that (to avoid DDOS-ing the public clouds). So I've tried to expose only the information that enables directed scanning: availability zone, reservation id, security groups, and network IDs, labels, CIDRs and IPs [example below]. A naive implementation will just try every peer; a smarter implementation might check the security groups to try to filter it, or the zone information to try to connect to nearby peers first. Note that I don't expose e.g. the instance state: if you want to know whether a node is up, you have to try connecting to it. I don't believe any of this information is at all sensitive, particularly not to instances in the same project.
On external agents doing the configuration: yes, they could put this into user-defined metadata, but then we're tied to a configuration system. We have to get 20 configuration systems to agree on a common format (Heat, Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the home-grown systems!) It also makes it hard to launch instances concurrently (because you want node #2 to have the metadata for node #1, so you have to wait for node #1 to get an IP).
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Would it make sense to simply have the neutron metadata service re-export every endpoint listed in keystone at /openstack/api/endpoint-name? Thanks, Kevin
From: Murray, Paul (HP Cloud Services) [pmur...@hp.com] Sent: Friday, January 24, 2014 11:04 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Hi Justin, It’s nice to see someone bringing this kind of thing up. Seeding discovery is a handy primitive to have.
Multicast is not generally used over the internet, so the comment about removing multicast is not really justified, and any of the approaches that work there could be used. Alternatively your instances could use the nova or neutron APIs to obtain any information you want – if they are network connected – but certainly whatever is starting them has access, so something can at least provide the information.
I agree that the metadata service is a sensible alternative. Do you imagine your instances all having access to the same metadata service? Is there something more generic and not tied to the architecture of a single openstack deployment?
Although this is a simple example, it is also the first of quite a lot of useful primitives that are commonly provided by configuration services. As it is possible to do what you want by other means (including using an implementation that has multicast within subnets – I’m sure neutron does actually have this), it seems that this makes less of a special case and rather a requirement for a more general notification service? Having said that I do like this kind of stuff :) Paul.
From: Justin Santa Barbara [mailto:jus...@fathomdb.com] Sent: 24 January 2014 15:43 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Good points - thank you. For arbitrary operations, I agree that it would be better to expose a token in the metadata service, rather than allowing the metadata service to expose unbounded amounts of API functionality. We should therefore also have a per-instance token in the metadata, though I don't see Keystone getting the prerequisite IAM-level functionality for two+ releases (?).
However, I think I can justify peer discovery as the 'one exception'. Here's why: discovery of peers is widely used for self-configuring clustered services, including those built in pre-cloud days. Multicast/broadcast used to be the solution, but cloud broke that. The cloud is supposed to be about distributed systems, yet we broke the primary way distributed systems do peer discovery. Today's workarounds are pretty terrible, e.g. uploading to an S3 bucket, or sharing EC2 credentials with the instance (tolerable now with IAM, but painful to configure). We're not talking about allowing instances to program the architecture (e.g. attach volumes etc), but rather just to do the equivalent of a multicast for discovery. In other words, we're restoring some functionality we took away (discovery via multicast) rather than adding programmable-infrastructure cloud functionality.
We expect the instances to start a gossip protocol to determine who is actually up/down, who else is in the cluster, etc. As such, we don't need accurate information - we only have to help a node find one living peer. (Multicast/broadcast was not entirely reliable either!)
Further, instance #2 will contact instance #1, so it doesn’t matter if instance #1 doesn’t have instance #2 in the list, as long as instance #2 sees instance #1. I'm relying on the idea that instance launching takes time > 0, so other instances will be in the starting state when the metadata request comes in, even if we launch instances simultaneously. (Another reason why I don't filter instances by state!)
I haven't actually found where metadata caching is implemented, although the constructor of InstanceMetadata documents restrictions that really only make sense if it is. Anyone know where it is cached?
In terms of information exposed: An alternative would be to try to connect to every IP in the subnet we are assigned; this blueprint can be seen as an optimization on that (to avoid DDOS-ing the public clouds). So I’ve tried to expose only the information that enables directed scanning: availability zone, reservation id, security groups, and network IDs, labels, CIDRs and IPs [example below]. A naive implementation will just try every peer; a smarter implementation might check the security groups to try to filter it, or the zone information to try to connect to nearby peers first. Note that I don’t expose e.g. the instance state: if you want to know whether a node is up, you have to try connecting to it. I don't believe any of this information is at all sensitive, particularly not to instances in the same project.
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Well if you're on a Neutron private network then you'd only be DDOS-ing yourself. In fact I think Neutron allows broadcast and multicast on private networks, and as nova-net is going to be deprecated at some point I wonder if this is reducing to a corner case?
Neutron may well re-enable multicast/broadcast, but I think that (1) multicast/broadcast is the wrong thing to use anyway, and more of an artifact of the way clusters were previously deployed, and (2) we should have an option that doesn't require people to install Neutron with multicast enabled. I think that many public clouds, particularly those that want to encourage an XaaS ecosystem, will avoid forcing people to use Neutron's isolated networks.
it seems that the IP address would really be enough, and the agents or whatever in the instance could take it from there?
Quite possibly. I'm very open to doing just that if people would prefer.
What worried me most, I think, is that if we make this part of the standard metadata then everyone would get it, and that raises a couple of concerns: - Users with lots of instances (say 1000s) but who weren't trying to run any form of discovery would start getting a lot more metadata returned, which might cause performance issues.
The list of peers is only returned if the request comes in for peers.json, so there's no growth in the returned data unless it is requested. Because of the very clear instructions in the comment to always pre-fetch data, it is always pre-fetched, even though it would make more sense to me to fetch it lazily when it was requested! Easy to fix, but I'm obeying the comment because it was phrased in the form of a grammatically valid sentence :-)
- Some users might be running instances on behalf of customers (consider say a PaaS-type service where the user gets access into an instance but not to the Nova API). In that case I wouldn't want one instance to be able to discover these kinds of details about other instances.
Yes, I do think this is a valid concern. But, there is likely to be _much_ more sensitive information in the metadata service, so anyone doing this is hopefully blocking the metadata service anyway. On EC2 with IAM, or if we use trusts, there will be auth tokens in there. And not just for security, but also because if the PaaS program is auto-detecting EC2/OpenStack by looking for the metadata service, that will cause the program to be very confused if it sees the metadata for its host!
So it kind of feels to me that this should be some other specific set of metadata that instances can ask for, and that instances have to explicitly opt into.
I think we have this in terms of the peers.json endpoint for byte-count concerns. For security, we only go per-project; I don't think we're exposing any new information; and anyone doing multi-tenant should either be using projects or be blocking 169.254 anyway.
We already have a mechanism now where an instance can push metadata as a way of Windows instances sharing their passwords - so maybe this could build on that somehow - for example each instance pushes the data it's willing to share with other instances owned by the same tenant?
I do like that and think it would be very cool, but it is much more complex to implement I think. It also starts to become a different problem: I do think we need a state-store, like Swift or etcd or Zookeeper, that is easily accessible to the instances. Indeed, one of the things I'd like to build using this blueprint is a distributed key-value store which would offer that functionality.
But I think that having peer discovery is a much more tightly defined blueprint, whereas some form of shared read-write data-store is probably top-level project complexity.
On external agents doing the configuration: yes, they could put this into user-defined metadata, but then we're tied to a configuration system. We have to get 20 configuration systems to agree on a common format (Heat, Puppet, Chef, Ansible, SaltStack, Vagrant, Fabric, all the home-grown systems!) It also makes it hard to launch instances concurrently (because you want node #2 to have the metadata for node #1, so you have to wait for node #1 to get an IP).
Well you've kind of got to agree on a common format anyway, haven't you, if the information is going to come from metadata? But I get your other points.
We do have to define a format, but because we only implement it once if we do it at the Nova level, I hope that there will be much more pragmatism than if we had to get the configuration cabal to agree. We can implement the format, and if consumers want the functionality that's the format they must parse :-)
I'd just like to see it separate from the existing metadata blob, and on an opt-in basis
Separate: is peers.json enough? I'm not sure I'm understanding you here. Opt-in: IMHO, the danger of our OpenStack everything-is-optional-and-configurable
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Clint Byrum cl...@fewbar.com wrote: Heat has been working hard to be able to do per-instance limited access in Keystone for a while. A trust might work just fine for what you want.
I wasn't actually aware of the progress on trusts. It would be helpful, except (1) it is more work to have to create a separate trust (it is even more painful to do so with IAM) and (2) it doesn't look like we can yet lock down these delegations as much as people would probably want. I think IAM is the end-game in terms of the model that people actually want, and it ends up being incredibly complex. Delegation is very useful (particularly because clusters could auto-scale themselves), but I'd love to get an easier solution for the peer discovery problem than where delegation ends up.
Are you hesitant to just use Heat? This is exactly what it is supposed to do: make a bunch of API calls and expose the results to instances for use in configuration. If you're just hesitant to use a declarative templating language, I totally understand. The auto-scaling-minded people are also feeling this way. You could join them in the quest to create an imperative cluster-making API for Heat.
I don't want to _depend_ on Heat. My hope is that we can just launch 3 instances with the Cassandra image, and get a Cassandra cluster. It might be that we want Heat to auto-scale that cluster, Ceilometer to figure out when to scale it, Neutron to isolate it, etc, but I think we can solve the basic discovery problem cleanly without tying in all the other services. Heat's value-add doesn't come from solving this problem!
:) We are on the same page. I really think Heat is where higher-level information sharing of this type belongs. I do think it might make sense for Heat to push things into user-data post-boot, rather than only expose them via its own metadata service. However, even without that, you can achieve what you're talking about right now with Heat's separate metadata. ... N instances in one API call is something Heat does well, and it does auto scaling too, so I feel like your idea is mostly just asking for a simpler way to use Heat, which I think everyone would agree would be good for all Heat users. :)
I have a personal design goal of solving the discovery problem in a way that works even on non-clouds. So I can write a clustered service, and it will run everywhere. The way I see it is that:
- If we're on physical, the instance will use multicast/broadcast to find peers on the network.
- If we're on OpenStack, the instance will use this blueprint to find its peers. The instance may be launched through Nova, or Heat, or Puppet/Chef/Salt/etc. I would like to see people use Heat, but I don't want to force people to use Heat. If Heat starts putting a more accurate list of peers into metadata, I will check that first. But if I can't find that list of peers that Heat provides, I will fall back to whatever I can get from Nova so that I can cope with people not on Heat.
- If we're on EC2, the user must configure an IAM role and assign it to their instances, and then we will query the list of instances.
It gives me great pleasure that EC2 will end up needing the most undifferentiated lifting from the user. Justin
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
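That fallback chain is easy to express as code; a compact sketch (each strategy callable is a placeholder for one of the mechanisms listed above, returning a list of peer addresses or raising LookupError when the mechanism is unavailable):

    # Sketch: prefer the most accurate peer source available, in order.
    def discover_peers(strategies):
        for strategy in strategies:
            try:
                peers = strategy()
            except LookupError:
                continue               # source not available in this environment
            if peers:
                return peers
        return []

    # e.g. discover_peers([heat_metadata_peers, nova_metadata_peers,
    #                      ec2_iam_peers, multicast_peers])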
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Murray, Paul (HP Cloud Services) wrote: Multicast is not generally used over the internet, so the comment about removing multicast is not really justified, and any of the approaches that work there could be used.
I think multicast/broadcast is commonly used 'behind the firewall', but I'm happy to hear of any other alternatives that you would recommend - particularly if they can work on the cloud!
I agree that the metadata service is a sensible alternative. Do you imagine your instances all having access to the same metadata service? Is there something more generic and not tied to the architecture of a single openstack deployment?
Not sure I understand - doesn't every Nova instance have access to the metadata service, and don't they all connect to the same back-end database? Has anyone not deployed the metadata service? It is not cross-region / cross-provider - is that what you mean? In terms of implementation (https://review.openstack.org/#/c/68825/) it is supposed to be the same as if you had done a list-instances call on the API provider. I know there's been talk of federation here; when this happens it would be awesome to have a cross-provider view (optionally, probably).
Although this is a simple example, it is also the first of quite a lot of useful primitives that are commonly provided by configuration services. As it is possible to do what you want by other means (including using an implementation that has multicast within subnets – I’m sure neutron does actually have this), it seems that this makes less of a special case and rather a requirement for a more general notification service?
I don't see any other approach offering as easy a solution for users (either the developer of the application or the person that launches the instances). If every instance had an automatic keystone token/trust with read-only access to its own project, that would be great. If Heat intercepted every Nova call and added metadata, that would be great. If Marconi offered every instance a 'broadcast' queue where it could reach all its peers, and we had a Keystone trust for that, that would be great. But, those are all 12-month projects, and even if you built them and they were awesome they still wouldn't get deployed on all the major clouds, so I _still_ couldn't rely on them as an application developer. My hope is to find something that every cloud can be comfortable deploying, that solves discovery just as broadcast/multicast solves it on typical LANs. It may be that anything other than IP addresses will make e.g. HP public cloud uncomfortable; if so then I'll tweak it to just be IPs. Finding an acceptable solution for everyone is the most important thing to me. I am very open to any alternatives that will actually get deployed!
One idea I had: I could return a flat list of IPs, as JSON objects:
    [ { "ip": "1.2.3.4" }, { "ip": "1.2.3.5" }, { "ip": "1.2.3.6" } ]
If e.g. it turns out that security groups are really important, then we can just pop the extra attribute into the same data format without breaking the API:
    ... { "ip": "1.2.3.4", "security_groups": [ "sg1", "sg2" ] } ...
Justin
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
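The forward-compatibility claim holds as long as clients read only the keys they know; a tiny sketch:

    # Sketch: a client that only reads 'ip' keeps working when extra
    # attributes (security_groups, zone, ...) are added to entries later.
    import json

    def peer_ips(document):
        return [entry['ip'] for entry in json.loads(document)]

    print(peer_ips('[{"ip": "1.2.3.4"}, '
                   '{"ip": "1.2.3.5", "security_groups": ["sg1", "sg2"]}]'))
    # -> ['1.2.3.4', '1.2.3.5']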
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Fox, Kevin M wrote: Would it make sense to simply have the neutron metadata service re-export every endpoint listed in keystone at /openstack/api/endpoint-name?
Do you mean with an implicit token for read-only access, so the instance doesn't need a token of its own? That is a superset of my proposal, so it would solve my use-case. I can't see it getting enabled in production though, given the depth of feelings about exposing just the subset of information I proposed ... :-) I would be very happy to be proved wrong here!
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
Excerpts from Justin Santa Barbara's message of 2014-01-24 12:29:49 -0800:
Clint Byrum cl...@fewbar.com wrote: Heat has been working hard to be able to do per-instance limited access in Keystone for a while. A trust might work just fine for what you want.
I wasn't actually aware of the progress on trusts. It would be helpful, except (1) it is more work to have to create a separate trust (it is even more painful to do so with IAM) and (2) it doesn't look like we can yet lock down these delegations as much as people would probably want. I think IAM is the end-game in terms of the model that people actually want, and it ends up being incredibly complex. Delegation is very useful (particularly because clusters could auto-scale themselves), but I'd love to get an easier solution for the peer discovery problem than where delegation ends up.
Are you hesitant to just use Heat? This is exactly what it is supposed to do: make a bunch of API calls and expose the results to instances for use in configuration. If you're just hesitant to use a declarative templating language, I totally understand. The auto-scaling-minded people are also feeling this way. You could join them in the quest to create an imperative cluster-making API for Heat.
I don't want to _depend_ on Heat. My hope is that we can just launch 3 instances with the Cassandra image, and get a Cassandra cluster. It might be that we want Heat to auto-scale that cluster, Ceilometer to figure out when to scale it, Neutron to isolate it, etc, but I think we can solve the basic discovery problem cleanly without tying in all the other services. Heat's value-add doesn't come from solving this problem!
I suppose we disagree on this fundamental point then. Heat's value-add really does come from solving this exact problem. It provides a layer above all of the other services to facilitate expression of higher-level concepts. Nova exposes a primitive API, whereas Heat is meant to have a more logical expression of the user's intentions. That includes exposure of details of one resource to another (not just compute: swift containers, load balancers, volumes, images, etc).
:) We are on the same page. I really think Heat is where higher-level information sharing of this type belongs. I do think it might make sense for Heat to push things into user-data post-boot, rather than only expose them via its own metadata service. However, even without that, you can achieve what you're talking about right now with Heat's separate metadata. ... N instances in one API call is something Heat does well, and it does auto scaling too, so I feel like your idea is mostly just asking for a simpler way to use Heat, which I think everyone would agree would be good for all Heat users. :)
I have a personal design goal of solving the discovery problem in a way that works even on non-clouds. So I can write a clustered service, and it will run everywhere. The way I see it is that:
- If we're on physical, the instance will use multicast/broadcast to find peers on the network.
- If we're on OpenStack, the instance will use this blueprint to find its peers. The instance may be launched through Nova, or Heat, or Puppet/Chef/Salt/etc. I would like to see people use Heat, but I don't want to force people to use Heat. If Heat starts putting a more accurate list of peers into metadata, I will check that first. But if I can't find that list of peers that Heat provides, I will fall back to whatever I can get from Nova so that I can cope with people not on Heat.
- If we're on EC2, the user must configure an IAM role and assign it to their instances, and then we will query the list of instances. It gives me great pleasure that EC2 will end up needing the most undifferentiated lifting from the user.
Heat is meant to be a facility for exactly what you want. If you don't want to ask people to use it, you're just duplicating Heat functionality in Nova. Using Heat means no query/filter for the instances you want: you have the exact addresses in your cluster. My suggestion would be that if you want to hide all of the complexity of Heat from users, you add a simplified API to Heat that enables your use case. In many ways that is exactly what Savanna, Trove, et al. are: domain-specific cluster APIs backed by orchestration.
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] bp proposal: discovery of peer instances through metadata service
I suppose we disagree on this fundamental point then. Heat's value-add really does come from solving this exact problem. It provides a layer above all of the other services to facilitate expression of higher-level concepts. Nova exposes a primitive API, whereas Heat is meant to have a more logical expression of the user's intentions. That includes exposure of details of one resource to another (not just compute: swift containers, load balancers, volumes, images, etc).
That's a great vision for Heat, and I look forward to using it.
Heat is meant to be a facility for exactly what you want. If you don't want to ask people to use it, you're just duplicating Heat functionality in Nova. Using Heat means no query/filter for the instances you want: you have the exact addresses in your cluster. My suggestion would be that if you want to hide all of the complexity of Heat from users, you add a simplified API to Heat that enables your use case. In many ways that is exactly what Savanna, Trove, et al. are: domain-specific cluster APIs backed by orchestration.
I take it as a +1 for the feature that so many projects are suggesting that they should be the one to implement it. That so many projects feel they should own it tells me that it may in fact be common functionality, and thus we should put it into the low-level project, i.e. Nova. I don't think this should preclude Heat, Marconi, Neutron and any other project in our big happy family from also implementing the feature, or from doing it more completely using their domain-specific knowledge. This is open-source after all :-) Justin
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev