Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
2015-08-27 18:43 GMT+02:00 Clint Byrum cl...@fewbar.com: Excerpts from Lucas Alvares Gomes's message of 2015-08-27 02:40:26 -0700: On Wed, Aug 26, 2015 at 11:09 PM, Julia Kreger juliaashleykre...@gmail.com wrote: My apologies for not expressing my thoughts on this matter sooner, however I've had to spend some time collecting my thoughts. To me, it seems like we do not trust our users. Granted, when I say users, I mean administrators who likely know more about the disposition and capabilities of their fleet than could ever be discovered or inferred via software. Sure, we have other users, mainly in the form of consumers, asking Ironic for hardware to be deployed, but the driver for adoption is who feels the least amount of pain. API versioning aside, I have to ask the community, what is more important? - An inflexible workflow that forces an administrator to always have a green field, and to step through a workflow that we've dictated, which may not apply to their operational scenario, ultimately driving them to write custom code to inject new nodes into the database directly, which will surely break from time to time, causing them to hate Ironic and look for a different solution. - A happy administrator that has the capabilities to do their job (and thus manage the baremetal node wherever it is in the operator's lifecycle) in an efficient fashion, thus causing them to fall in love with Ironic. I'm sorry, I find the language used in this reply very offensive. That's not even a real question, due the alternatives you're basically asking the community What's more important, be happy or be sad ? Be efficient or not efficient? Funny, I find your response a bit offensive, as a user of Ironic who has been falling in love with it for a couple of years now, and is confused by the recent changes to the API that completely ignore me. I have _zero_ interest in this workflow. I want my nodes to be available as soon as I tell Ironic about them. 
You've added a step that makes no sense to me. Why not just let me create nodes in that state?

Because we don't have a test of a user's experience level in OpenStack in our node-create command ;) It won't distinguish between you, knowing precisely what you're doing, and a confused user who picked the wrong command and is one step away from shooting themselves in the foot.

It reminds me of a funny thing Monty Taylor pointed out in the Westin in Atlanta. We had to scramble to find our room keys to work the elevator, and upon unlocking the elevator, had to then push the floor for that room. As he pointed out, "Why doesn't it just go to my floor now?" So, I get why you have the workflow, but I don't understand why you didn't include a short circuit for your existing users who are _perfectly happy_ not having the workflow. So now I have to pin to an old API version to keep working the way I want, and you will eventually remove that API version, and I will proceed to grumble about why I have to change.

Everything I know about API versioning tells me that we won't ever remove a single API version.

It's not about an inflexible workflow which dictates what people do, making them hate the project. It's about finding a common pattern for a workflow that will work for all types of machines; it's about consistency; it's about keeping the history of what happened to that node. When a node is in a specific state, you know what it's been through, so you can easily debug it (i.e., an ACTIVE node means that it passed through MANAGEABLE -> CLEAN* -> AVAILABLE -> DEPLOY* -> ACTIVE. Even if some of the states are no-ops for a given driver, it's a clear path).

Think about our API: it's not that we don't allow vendors to add every new feature they have to the core part of the API because we don't trust them or think that their shiny features are not worthy. We don't do that in order to keep the API consistent, to have an abstraction layer that works the same for all types of hardware.
I mean it when I said I want to have a fresh mind to read the proposal this new work flow. But I rather read a technical explanation than an emotional one. What I want to know for example is what it will look like when one register a node in ACTIVE state directly? What about the internal driver fields? What about the TFTP/HTTP environment that is built as part of the DEPLOY process ? What about the ports in Neutron ? and so on... Emotions matter to users. You're right that a technical argument helps us get our work done efficiently. But don't forget _why Ironic exists_. It's not for you to develop on, and it's not just for Nova to talk to. It's for your users to handle their datacenter in the wee hours without you to hold their hand. Make that hard, get somebody fired or burned out, and no technical argument will ever convince them to use Ironic again. You care
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
Excerpts from Lucas Alvares Gomes's message of 2015-08-27 02:40:26 -0700: On Wed, Aug 26, 2015 at 11:09 PM, Julia Kreger juliaashleykre...@gmail.com wrote: My apologies for not expressing my thoughts on this matter sooner, however I've had to spend some time collecting my thoughts. To me, it seems like we do not trust our users. Granted, when I say users, I mean administrators who likely know more about the disposition and capabilities of their fleet than could ever be discovered or inferred via software. Sure, we have other users, mainly in the form of consumers, asking Ironic for hardware to be deployed, but the driver for adoption is who feels the least amount of pain. API versioning aside, I have to ask the community, what is more important? - An inflexible workflow that forces an administrator to always have a green field, and to step through a workflow that we've dictated, which may not apply to their operational scenario, ultimately driving them to write custom code to inject new nodes into the database directly, which will surely break from time to time, causing them to hate Ironic and look for a different solution. - A happy administrator that has the capabilities to do their job (and thus manage the baremetal node wherever it is in the operator's lifecycle) in an efficient fashion, thus causing them to fall in love with Ironic. I'm sorry, I find the language used in this reply very offensive. That's not even a real question, due the alternatives you're basically asking the community What's more important, be happy or be sad ? Be efficient or not efficient? Funny, I find your response a bit offensive, as a user of Ironic who has been falling in love with it for a couple of years now, and is confused by the recent changes to the API that completely ignore me. I have _zero_ interest in this workflow. I want my nodes to be available as soon as I tell Ironic about them. You've added a step that makes no sense to me. 
Why not just let me create nodes in that state? It reminds me of a funny thing Monty Taylor pointed out in the Westin in Atlanta. We had to scramble to find our room keys to work the elevator, and upon unlocking the elevator, had to then push the floor for that room. As he pointed out Why doesn't it just go to my floor now? So, I get why you have the workflow, but I don't understand why you didn't include a short circuit for your existing users who are _perfectly happy_ not having the workflow. So now I have to pin to an old API version to keep working the way I want, and you will eventually remove that API version, and I will proceed to grumble about why I have to change. It's not about an inflexible workflow which dictates what people do making them hate the project. It's about finding a common pattern for an work flow that will work for all types of machines, it's about consistency, it's about keeping the history of what happened to that node. When a node is on a specific state you know what it's been through so you can easily debug it (i.e an ACTIVE node means that it passed through MANAGEABLE - CLEAN* - AVAILABLE - DEPLOY* - ACTIVE. Even if some of the states are non-op for a given driver, it's a clear path). Think about our API, it's not that we don't allow vendors to add every new features they have to the core part of the API because we don't trust them or we think that their shiny features are not worthy. We don't do that to make it consistent, to have an abstraction layer that will work the same for all types of hardware. I mean it when I said I want to have a fresh mind to read the proposal this new work flow. But I rather read a technical explanation than an emotional one. What I want to know for example is what it will look like when one register a node in ACTIVE state directly? What about the internal driver fields? What about the TFTP/HTTP environment that is built as part of the DEPLOY process ? What about the ports in Neutron ? and so on... 
Emotions matter to users. You're right that a technical argument helps us get our work done efficiently. But don't forget _why Ironic exists_. It's not for you to develop on, and it's not just for Nova to talk to. It's for your users to handle their datacenter in the wee hours without you to hold their hand. Make that hard, get somebody fired or burned out, and no technical argument will ever convince them to use Ironic again.

I think I see the problem though. Ironic needs a new mission statement:

"To produce an OpenStack service and associated libraries capable of managing and provisioning physical machines, and to do this in a security-aware and fault-tolerant manner."

Mission accomplished. It's been capable of doing that for a long time. Perhaps the project should rethink whether _users_ should be considered in a new mission statement.
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
Hi, On Thu, Aug 27, 2015 at 5:43 PM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Lucas Alvares Gomes's message of 2015-08-27 02:40:26 -0700: On Wed, Aug 26, 2015 at 11:09 PM, Julia Kreger juliaashleykre...@gmail.com wrote: My apologies for not expressing my thoughts on this matter sooner, however I've had to spend some time collecting my thoughts. To me, it seems like we do not trust our users. Granted, when I say users, I mean administrators who likely know more about the disposition and capabilities of their fleet than could ever be discovered or inferred via software. Sure, we have other users, mainly in the form of consumers, asking Ironic for hardware to be deployed, but the driver for adoption is who feels the least amount of pain. API versioning aside, I have to ask the community, what is more important? - An inflexible workflow that forces an administrator to always have a green field, and to step through a workflow that we've dictated, which may not apply to their operational scenario, ultimately driving them to write custom code to inject new nodes into the database directly, which will surely break from time to time, causing them to hate Ironic and look for a different solution. - A happy administrator that has the capabilities to do their job (and thus manage the baremetal node wherever it is in the operator's lifecycle) in an efficient fashion, thus causing them to fall in love with Ironic. I'm sorry, I find the language used in this reply very offensive. That's not even a real question, due the alternatives you're basically asking the community What's more important, be happy or be sad ? Be efficient or not efficient? Funny, I find your response a bit offensive, as a user of Ironic who has been falling in love with it for a couple of years now, and is confused by the recent changes to the API that completely ignore me. I'm sorry if you feel like that, I didn't mean to offend anyone. I have _zero_ interest in this workflow. 
I want my nodes to be available as soon as I tell Ironic about them. You've added a step that makes no sense to me. Why not just let me create nodes in that state? It reminds me of a funny thing Monty Taylor pointed out in the Westin in Atlanta. We had to scramble to find our room keys to work the elevator, and upon unlocking the elevator, had to then push the floor for that room. As he pointed out, "Why doesn't it just go to my floor now?" So, I get why you have the workflow, but I don't understand why you didn't include a short circuit for your existing users who are _perfectly happy_ not having the workflow. So now I have to pin to an old API version to keep working the way I want, and you will eventually remove that API version, and I will proceed to grumble about why I have to change.

Sure. I don't think I have said in any of my replies that I'm against the idea of having something like that; quite the opposite, I've said that I want to have a fresh mind when I hear the proposal, meaning no prejudgment. But we have a process to deal with such requests: in Ironic we have a spec process [1] which an idea has to go through before it is accepted into the project. The workflow that you have zero interest in and that makes no sense to you is the workflow that was discussed by the Ironic community in the open as part of this spec [2]. I'm sure everyone would have appreciated your input on it at the time. But even now it's not too late: the idea of having the short circuit can still be included in the project, so I encourage you to go through the spec process [1] and propose it.

[1] https://wiki.openstack.org/wiki/Ironic/Specs_Process
[2] https://review.openstack.org/#/c/133828/7

Emotions matter to users. You're right that a technical argument helps us get our work done efficiently. But don't forget _why Ironic exists_. It's not for you to develop on, and it's not just for Nova to talk to.
It's for your users to handle their datacenter in the wee hours without you to hold their hand. Make that hard, get somebody fired or burned out, and no technical argument will ever convince them to use Ironic again.

Emotions matter, yes, but that's implicit. Nobody will ever be happy if something doesn't technically work. So, I'm sure the idea that will be proposed presents technical challenges, and we are a technical community, so let's focus on that.

Cheers,
Lucas

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
On Wed, Aug 26, 2015 at 11:09 PM, Julia Kreger juliaashleykre...@gmail.com wrote:

My apologies for not expressing my thoughts on this matter sooner, however I've had to spend some time collecting my thoughts.

To me, it seems like we do not trust our users. Granted, when I say users, I mean administrators who likely know more about the disposition and capabilities of their fleet than could ever be discovered or inferred via software. Sure, we have other users, mainly in the form of consumers, asking Ironic for hardware to be deployed, but the driver for adoption is who feels the least amount of pain.

API versioning aside, I have to ask the community, what is more important?

- An inflexible workflow that forces an administrator to always have a green field, and to step through a workflow that we've dictated, which may not apply to their operational scenario, ultimately driving them to write custom code to inject new nodes into the database directly, which will surely break from time to time, causing them to hate Ironic and look for a different solution.

- A happy administrator that has the capabilities to do their job (and thus manage the baremetal node wherever it is in the operator's lifecycle) in an efficient fashion, thus causing them to fall in love with Ironic.

I'm sorry, I find the language used in this reply very offensive. That's not even a real question; given the alternatives, you're basically asking the community, "What's more important: be happy or be sad? Be efficient or not efficient?"

It's not about an inflexible workflow which dictates what people do, making them hate the project. It's about finding a common pattern for a workflow that will work for all types of machines; it's about consistency; it's about keeping the history of what happened to that node. When a node is in a specific state, you know what it's been through, so you can easily debug it (i.e., an ACTIVE node means that it passed through MANAGEABLE -> CLEAN* -> AVAILABLE -> DEPLOY* -> ACTIVE. Even if some of the states are no-ops for a given driver, it's a clear path).

Think about our API: it's not that we don't allow vendors to add every new feature they have to the core part of the API because we don't trust them or think that their shiny features are not worthy. We don't do that in order to keep the API consistent, to have an abstraction layer that works the same for all types of hardware.

I meant it when I said I want to have a fresh mind when reading the proposal for this new workflow. But I'd rather read a technical explanation than an emotional one. What I want to know, for example, is what it will look like when one registers a node in the ACTIVE state directly. What about the internal driver fields? What about the TFTP/HTTP environment that is built as part of the DEPLOY process? What about the ports in Neutron? And so on...

Cheers,
Lucas
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
On 08/27/2015 11:40 AM, Lucas Alvares Gomes wrote: On Wed, Aug 26, 2015 at 11:09 PM, Julia Kreger juliaashleykre...@gmail.com wrote: My apologies for not expressing my thoughts on this matter sooner, however I've had to spend some time collecting my thoughts. To me, it seems like we do not trust our users. Granted, when I say users, I mean administrators who likely know more about the disposition and capabilities of their fleet than could ever be discovered or inferred via software. Sure, we have other users, mainly in the form of consumers, asking Ironic for hardware to be deployed, but the driver for adoption is who feels the least amount of pain. API versioning aside, I have to ask the community, what is more important? - An inflexible workflow that forces an administrator to always have a green field, and to step through a workflow that we've dictated, which may not apply to their operational scenario, ultimately driving them to write custom code to inject new nodes into the database directly, which will surely break from time to time, causing them to hate Ironic and look for a different solution. - A happy administrator that has the capabilities to do their job (and thus manage the baremetal node wherever it is in the operator's lifecycle) in an efficient fashion, thus causing them to fall in love with Ironic. I'm sorry, I find the language used in this reply very offensive. That's not even a real question, due the alternatives you're basically asking the community What's more important, be happy or be sad ? Be efficient or not efficient? It's not about an inflexible workflow which dictates what people do making them hate the project. It's about finding a common pattern for an work flow that will work for all types of machines, it's about consistency, it's about keeping the history of what happened to that node. 
When a node is in a specific state, you know what it's been through, so you can easily debug it (i.e., an ACTIVE node means that it passed through MANAGEABLE -> CLEAN* -> AVAILABLE -> DEPLOY* -> ACTIVE. Even if some of the states are no-ops for a given driver, it's a clear path).

Think about our API: it's not that we don't allow vendors to add every new feature they have to the core part of the API because we don't trust them or think that their shiny features are not worthy. We don't do that in order to keep the API consistent, to have an abstraction layer that works the same for all types of hardware.

I meant it when I said I want to have a fresh mind when reading the proposal for this new workflow. But I'd rather read a technical explanation than an emotional one. What I want to know, for example, is what it will look like when one registers a node in the ACTIVE state directly. What about the internal driver fields? What about the TFTP/HTTP environment that is built as part of the DEPLOY process? What about the ports in Neutron? And so on...

I agree with everything Lucas said. I also want to point out that it's completely unrealistic to expect even a majority of Ironic users to have at least some idea of how Ironic actually works. And definitely not all our users are Ironic developers. I routinely help people who have never used Ironic before, and they don't have problems with running 1, 2, or 10 commands if they're written in the documentation and clearly explained. What they do have problems with is having several ways of doing the same thing, with different ways being broken under different conditions.
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
My apologies for not expressing my thoughts on this matter sooner, however I've had to spend some time collecting my thoughts.

To me, it seems like we do not trust our users. Granted, when I say users, I mean administrators who likely know more about the disposition and capabilities of their fleet than could ever be discovered or inferred via software. Sure, we have other users, mainly in the form of consumers, asking Ironic for hardware to be deployed, but the driver for adoption is who feels the least amount of pain.

API versioning aside, I have to ask the community, what is more important?

- An inflexible workflow that forces an administrator to always have a green field, and to step through a workflow that we've dictated, which may not apply to their operational scenario, ultimately driving them to write custom code to inject new nodes into the database directly, which will surely break from time to time, causing them to hate Ironic and look for a different solution.

- A happy administrator that has the capabilities to do their job (and thus manage the baremetal node wherever it is in the operator's lifecycle) in an efficient fashion, thus causing them to fall in love with Ironic.

To me, it seems like happy administrators are the most important thing for us to focus on, and while the workflow nature is extremely important for greenfield deployments, the ability to override the workflow seems absolutely vital to an existing deployment, even if it is via a trust_me super secret advanced handshake of doom that tells the API that the user knows best.

As a consumer of Ironic, an administrator of sorts, I don't care about API versions as much as has been argued. I care about being able to achieve a task to meet my goals in an efficient and repeatable fashion. I want it to be easier for an administrator to do their job.
-Julia

On Tue, Aug 18, 2015 at 8:05 PM, Ruby Loo rlooya...@gmail.com wrote:

On 17 August 2015 at 20:20, Robert Collins robe...@robertcollins.net wrote:

On 11 August 2015 at 06:13, Ruby Loo rlooya...@gmail.com wrote:

Hi, sorry for the delay. I vote no. I understand the rationale of trying to do things so that we don't break our users but that's what the versioning is meant for and more importantly -- I think adding the ENROLL state is fairly important wrt the lifecycle of a node. I don't particularly want to hide that and/or let folks opt out of it in the long term.

From a reviewer point-of-view, my concern is me trying to remember all the possible permutations/states etc that are possible to make sure that new code doesn't break existing behavior. I haven't thought out whether adding this new API would make that worse or not, but then, I don't really want to have to think about it. So KISS as much as we can! :)

I'm a little surprised by this, to be honest. Here's why: allowing the initial state to be chosen from ENROLL/AVAILABLE from the latest version of the API is precisely as complex as allowing two versions of the API {old, new} where old creates nodes in AVAILABLE and new creates nodes in ENROLL. The only difference I can see is that eventually someday if {old} stops being supported, then and only then we can go through the code and clean things up.

It seems to me that the costs to us of supporting graceful transitions for users here are:

1) A new version NEWVER of the API that supports node state being one of {not supplied, AVAILABLE, ENROLL}, on creation, defaulting to AVAILABLE when not supplied.
2) Supporting the initial state of AVAILABLE indefinitely rather than just until we *delete* version 1.10.
3) CD deployments that had rolled forward to 1.11 will need to add the state parameter to their scripts to move forward to NEWVER.
4) Don't default the client to any of the versions between 1.10 and NEWVER at any point.
That seems like a very small price to pay on our side, and the benefits for users are that they can opt into the new functionality when they are ready.

-Rob

After thinking about this some more, I'm not actually going to address Rob's points above. What I want to do is go back and discuss... what do people think about having an API that allows the initial provision state to be specified, for a node that is created in Ironic. I'm assuming that enroll state exists :)

Earlier today on IRC, Devananda mentioned that there's a very strong case for allowing a node to be created in any of the stable states (enroll, manageable, available, active). Maybe he'll elaborate later on this. I know that there's a use case where there is a desire to import nodes (with instances on them) from another system into ironic, and have them be active right away. (They don't want the nodes to go from enroll -> verifying -> manageable -> cleaning!!! -> available!!! -> active).

1. What would the default provision state be, if it wasn't specified?
A. 'available' to be backwards compatible with pre-v1.11 or
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
On Wed, Aug 19, 2015 at 09:48:29AM +0100, Lucas Alvares Gomes wrote:

Hi,

After thinking about this some more, I'm not actually going to address Rob's points above. What I want to do is go back and discuss... what do people think about having an API that allows the initial provision state to be specified, for a node that is created in Ironic. I'm assuming that enroll state exists :)

Earlier today on IRC, Devananda mentioned that there's a very strong case for allowing a node to be created in any of the stable states (enroll, manageable, available, active). Maybe he'll elaborate later on this. I know that there's a use case where there is a desire to import nodes (with instances on them) from another system into ironic, and have them be active right away. (They don't want the nodes to go from enroll -> verifying -> manageable -> cleaning!!! -> available!!! -> active).

I would like to hear the more elaborated proposal before we start digging much into this problem.

1. What would the default provision state be, if it wasn't specified?
A. 'available' to be backwards compatible with pre-v1.11 or
B. 'enroll' to be consistent with v1.11+ or
?

2. What would it mean to set the initial provision state to something other than 'enroll'?

manageable

In our state machinery[0], a node goes from enroll -> verifying -> manageable. For manageable to be the initial state, does it mean that
A. whatever is needed for enroll and verifying is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying) or
C. no enroll or verifying is done, it goes straight to manageable

I'm fine with A. I'm not sure that B makes sense and I definitely don't think C makes sense.

To date, verifying means checking that the conductor can get the power state on the node, to verify the supplied power credentials. I don't think it is a big deal if we skip this step; it just means that the next time some action is taken on the node, it might fail.
available

In our state machinery, a node goes from enroll -> verifying -> manageable -> cleaning -> available. For available to be the initial state, does it mean that
A. whatever is needed for enroll, verifying, cleaning is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying or cleaning) or
??

active

In our state machinery, a node goes from enroll -> verifying -> manageable -> cleaning -> available -> deploying -> active. For active to be the initial state, does it mean that
A. whatever is needed for enroll, verifying, cleaning, deploying is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying or cleaning) or
C. whatever is needed for enroll and I dunno, any 'takeover' stuff by conductor or whatever node states need to be updated to be in active?

What concerns me more about allowing a node to be enrolled in any stable state is that it's going to be a big change in our API. We have PATCH as a mechanism for updating a resource partially because we have read-only attributes (driver_internal_info, *_updated_at, etc...) in the API that are internal and should not be updated by the user. Some states might depend on them, i.e., a node in ACTIVE state might have indicators in the driver_internal_info field.

Another thing is that it's really cross-resource: a node in ACTIVE state will depend on a certain port which was used to deploy it (and on other things, such as registering that port in Neutron with the right DHCP information, so that if one is PXE booting after ACTIVE the node won't get stuck with a boot error). We also need to create the right TFTP (or TFTP + HTTP for iPXE) environment for that node.

Anyway, I don't want to get much deeper; I think we should all be open to hearing what will be proposed with a fresh mind.

+1, there are tons of dragons here.
Now that we're to the point where our state machine is well-defined with a single entrypoint, I think adding any entrypoints needs to be well thought out. We should be able to make assumptions about what we can do from a given state, and if we are going to allow folks to define other entrypoints, those assumptions need to be satisfied.

I'm somewhat open to adding entrypoints, but I'd like to see specs first.

// jim
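To make the single-entrypoint concern concrete, here is a minimal sketch, not Ironic's actual implementation. The linear path and the list of stable states are taken from this thread; the names `PATH`, `STABLE_STATES` and `steps_skipped` are hypothetical, and the real state machine has more states and failure branches. The function lists the states a node would bypass if it were created directly in a given stable state.

```python
# Linear provisioning path as described in this thread (simplified; the
# real state machine has additional states and failure branches).
PATH = ["enroll", "verifying", "manageable", "cleaning",
        "available", "deploying", "active"]

# Stable states named in the thread as candidate initial states.
STABLE_STATES = {"enroll", "manageable", "available", "active"}


def steps_skipped(initial_state):
    """Return the states a node bypasses if created in initial_state."""
    if initial_state not in STABLE_STATES:
        raise ValueError("not a stable state: %r" % (initial_state,))
    # Everything before the requested state on the linear path is skipped.
    return PATH[:PATH.index(initial_state)]


if __name__ == "__main__":
    for state in ("enroll", "manageable", "available", "active"):
        print(state, "skips", steps_skipped(state))
```

Each skipped state stands for work (credential verification, cleaning, deploy-time setup) whose results later code may assume exist, which is why a spec for any new entrypoint would have to say how those assumptions are satisfied.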
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
On 21 Aug 2015 6:45 am, Jim Rollenhagen wrote:

+1, there are tons of dragons here. Now that we're to the point where our state machine is well-defined with a single entrypoint, I think

I'm clearly confused. When was 1.6 deleted?

Rob
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
Hi

On 21 Aug 2015 6:45 am, Jim Rollenhagen wrote:

+1, there are tons of dragons here. Now that we're to the point where our state machine is well-defined with a single entrypoint, I think

I'm clearly confused. When was 1.6 deleted?

It wasn't and won't be AFAICT. But I think Jim is talking about versions >= 1.11 of the API, which will always use ENROLL as the entry point because that was how things were planned for the new state machine. So yeah, we have more than one entry point depending on the version of the API you use.

Cheers,
Lucas
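The version-dependent entry point described above can be sketched as follows. This is an illustration only, not Ironic's code: the function name `initial_provision_state` and the `(major, minor)` tuple representation of an API version are assumptions made for the example, while the 1.11 boundary and the two states come from the thread.

```python
def initial_provision_state(api_version):
    """Pick the entry state for a new node from a (major, minor) version."""
    # Per the thread: API versions >= 1.11 create nodes in ENROLL, while
    # older versions keep creating them in AVAILABLE.
    if api_version >= (1, 11):
        return "enroll"
    return "available"


if __name__ == "__main__":
    for version in [(1, 6), (1, 10), (1, 11)]:
        print(version, initial_provision_state(version))
```

Tuples compare element by element in Python, so `(1, 6) < (1, 11)` holds even though the string "1.6" would sort after "1.11".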
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
On Thu, Aug 20, 2015 at 09:57:14PM +0100, Lucas Alvares Gomes wrote:

Hi

On 21 Aug 2015 6:45 am, Jim Rollenhagen wrote:

+1, there are tons of dragons here. Now that we're to the point where our state machine is well-defined with a single entrypoint, I think

I'm clearly confused. When was 1.6 deleted?

It wasn't and won't be AFAICT. But I think Jim is talking about versions >= 1.11 of the API, which will always use ENROLL as the entry point because that was how things were planned for the new state machine. So yeah, we have more than one entry point depending on the version of the API you use.

Right, sorry about that. I still think the point stands. Every new entrypoint adds complexity that we need to manage, and I'd love for us to take a long hard look at new ones and not just allow whatever people want. ACTIVE is a great example of a state where we make a ton of assumptions about what the node is doing and what metadata it has.

// jim
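The point Lucas and Jim settle on above, that the entry point depends on the API version the client requests, can be sketched roughly as follows. This is only an illustration and not Ironic's actual code: the helper names `parse_version` and `initial_provision_state` and the lowercase state strings are assumptions. The only facts taken from the thread are that requests below 1.11 create nodes in AVAILABLE and requests at 1.11 or later create them in ENROLL.

```python
def parse_version(version):
    """Turn a microversion string such as "1.11" into a comparable tuple.

    Comparing tuples of ints avoids the string-comparison trap where
    "1.9" would sort after "1.11".
    """
    major, minor = version.split(".")
    return int(major), int(minor)


def initial_provision_state(api_version):
    """Hypothetical helper: the state a newly created node lands in."""
    # From 1.11 onwards the new state machine starts every node in ENROLL.
    if parse_version(api_version) >= (1, 11):
        return "enroll"
    # Older versions keep the previous behaviour for backward compatibility.
    return "available"


for version in ("1.6", "1.10", "1.11", "1.12"):
    print(version, "->", initial_provision_state(version))
```

Seen this way, "more than one entry point" is simply two branches keyed on the negotiated version, which is the complexity Jim would like to keep from growing.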
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
My opinion:

- If a new API is desired by operators who would like to skip a few steps in Ironic before making a node active, then we should do it. I mean we should allow them to skip the enroll state and manageable state, thereby giving them an opportunity to land the node in manageable or available state by default.
- The default state should be enroll, as that's where the state of a node in Ironic begins. Maybe it can optionally be tweaked in ironic.conf.
- I don't like the idea of landing a node directly in active state. The main reason is that it differs from driver to driver what it takes to bring a node to active and what is required for a take over of the active node. For example, while deploying a partition image (by pxe or virtual media drivers), the uuid of the root partition should be available in the driver_internal_info for take_over to happen. So it would mean that even for existing drivers, we would need to at least provide a mechanism for writing driver_internal_info from the API, which is not desirable.

It is very much a valid use case to do import. On first thought, I think we should have a new API endpoint to request such an import and a new method in DeployInterface (not an abstract method) for importing bare metals from another system. The API should allow parameters to be passed from the driver to do the import, optionally requesting to reboot the bare metal after it is imported (to make sure that Ironic can properly manage the node again). The new method in DeployInterface should do what it takes to import the bare metal given the parameters. But, that might be a different story :).

Regards, Ramesh

On Wed, Aug 19, 2015 at 5:35 AM, Ruby Loo rlooya...@gmail.com wrote:

On 17 August 2015 at 20:20, Robert Collins robe...@robertcollins.net wrote:

On 11 August 2015 at 06:13, Ruby Loo rlooya...@gmail.com wrote:

Hi, sorry for the delay. I vote no.
I understand the rationale of trying to do things so that we don't break our users but that's what the versioning is meant for and more importantly -- I think adding the ENROLL state is fairly important wrt the lifecycle of a node. I don't particularly want to hide that and/or let folks opt out of it in the long term.

From a reviewer point-of-view, my concern is me trying to remember all the possible permutations/states etc that are possible to make sure that new code doesn't break existing behavior. I haven't thought out whether adding this new API would make that worse or not, but then, I don't really want to have to think about it. So KISS as much as we can! :)

I'm a little surprised by this, to be honest. Here's why: allowing the initial state to be chosen from ENROLL/AVAILABLE from the latest version of the API is precisely as complex as allowing two versions of the API {old, new} where old creates nodes in AVAILABLE and new creates nodes in ENROLL. The only difference I can see is that eventually someday if {old} stops being supported, then and only then we can go through the code and clean things up.

It seems to me that the costs to us of supporting graceful transitions for users here are:

1) A new version NEWVER of the API that supports node state being one of {not supplied, AVAILABLE, ENROLL}, on creation, defaulting to AVAILABLE when not supplied.
2) Supporting the initial state of AVAILABLE indefinitely rather than just until we *delete* version 1.10.
3) CD deployments that had rolled forward to 1.11 will need to add the state parameter to their scripts to move forward to NEWVER.
4) Don't default the client to the versions between 1.10 and NEWVER at any point.

That seems like a very small price to pay on our side, and the benefits for users are that they can opt into the new functionality when they are ready.

-Rob

After thinking about this some more, I'm not actually going to address Rob's points above.
What I want to do is go back and discuss... what do people think about having an API that allows the initial provision state to be specified, for a node that is created in Ironic. I'm assuming that enroll state exists :)

Earlier today on IRC, Devananda mentioned that there's a very strong case for allowing a node to be created in any of the stable states (enroll, manageable, available, active). Maybe he'll elaborate later on this. I know that there's a use case where there is a desire to import nodes (with instances on them) from another system into ironic, and have them be active right away. (They don't want the nodes to go from enroll -> verifying -> manageable -> cleaning!!! -> available!!! -> active).

1. What would the default provision state be, if it wasn't specified?
A. 'available' to be backwards compatible with pre-v1.11 or
B. 'enroll' to be consistent with v1.11+ or
?

2. What would it mean to set the initial provision state to something other than 'enroll'?

manageable
In our state machinery[0], a node
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
To be honest, I'm tired of repeating the same arguments again and again... I personally would like to get something cool done, rather than discussing how to work around our new state machine again and again. Now to some trolling: please include a way for users to opt out of the NOSTATE -> AVAILABLE renaming.

On 08/18/2015 09:11 PM, Ruby Loo wrote:

Apologies, forgot to add [ironic] to the subject.

On 18 August 2015 at 13:27, Ruby Loo rlooya...@gmail.com wrote:

Hi, I want to start a different thread on this topic because I don't think this is about whether/how to do API microversions. Rather, given that we are going to support microversioning, how to deal with the non-backward compatible change in 1.11 with the ENROLL state (instead of AVAILABLE) being the provision state that a node is in, after being created/registered in ironic. (This was from 'Let's talk about API versions', http://lists.openstack.org/pipermail/openstack-dev/2015-August/072287.html.)

I want to think about this before replying but others are more than welcome to reply first so that I may not feel the need to reply :-)

--ruby

maybe chop off this and above when replying :-)

On 17 August 2015 at 20:20, Robert Collins robe...@robertcollins.net wrote:

On 11 August 2015 at 06:13, Ruby Loo rlooya...@gmail.com wrote:

Hi, sorry for the delay. I vote no. I understand the rationale of trying to do things so that we don't break our users but that's what the versioning is meant for and more importantly -- I think adding the ENROLL state is fairly important wrt the lifecycle of a node. I don't particularly want to hide that and/or let folks opt out of it in the long term.

From a reviewer point-of-view, my concern is me trying to remember all the possible permutations/states etc that are possible to make sure that new code doesn't break existing behavior.
I haven't thought out whether adding this new API would make that worse or not, but then, I don't really want to have to think about it. So KISS as much as we can! :)

I'm a little surprised by this, to be honest. Here's why: allowing the initial state to be chosen from ENROLL/AVAILABLE from the latest version of the API is precisely as complex as allowing two versions of the API {old, new} where old creates nodes in AVAILABLE and new creates nodes in ENROLL. The only difference I can see is that eventually someday if {old} stops being supported, then and only then we can go through the code and clean things up.

It seems to me that the costs to us of supporting graceful transitions for users here are:

1) A new version NEWVER of the API that supports node state being one of {not supplied, AVAILABLE, ENROLL}, on creation, defaulting to AVAILABLE when not supplied.

-1, it's a breaking change again. And it does not make any sense to me.

2) Supporting the initial state of AVAILABLE indefinitely rather than just until we *delete* version 1.10.

We don't delete any versions. This would be a terrible (backward incompatible) change, breaking the whole idea of versioning.

3) CD deployments that had rolled forward to 1.11 will need to add the state parameter to their scripts to move forward to NEWVER.
4) Don't default the client to the versions between 1.10 and NEWVER at any point.

That seems like a very small price to pay on our side, and the benefits for users are that they can opt into the new functionality when they are ready.

That's what versioning is for, so we're fine, nothing needs to be done.
-Rob

-- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud
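For readers following the disagreement, Rob's point 1 as quoted above (state is one of {not supplied, AVAILABLE, ENROLL}, defaulting to AVAILABLE) could look roughly like this. It is a sketch of the proposal only: the function name `resolve_initial_state` and the error handling are assumptions, NEWVER never got a concrete number in the thread, and nothing here reflects what Ironic actually implements.

```python
# States a client could ask for under the quoted NEWVER proposal.
ALLOWED_INITIAL_STATES = ("available", "enroll")


def resolve_initial_state(requested=None):
    """Hypothetical resolver for the 'state' parameter on node creation."""
    # "Not supplied" falls back to AVAILABLE, as in point 1 of the proposal.
    if requested is None:
        return "available"
    state = requested.lower()
    # Anything else (for example 'active') is outside the proposal.
    if state not in ALLOWED_INITIAL_STATES:
        raise ValueError("unsupported initial state: %r" % requested)
    return state


print(resolve_initial_state())          # no state supplied
print(resolve_initial_state("ENROLL"))  # explicit opt-in to the new flow
```

The objection in the reply is visible in the first branch: a client that moved to 1.11 and relied on ENROLL as the default would see the default flip back to AVAILABLE at NEWVER.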
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
On 08/19/2015 02:05 AM, Ruby Loo wrote:

On 17 August 2015 at 20:20, Robert Collins robe...@robertcollins.net wrote:

On 11 August 2015 at 06:13, Ruby Loo rlooya...@gmail.com wrote:

Hi, sorry for the delay. I vote no. I understand the rationale of trying to do things so that we don't break our users but that's what the versioning is meant for and more importantly -- I think adding the ENROLL state is fairly important wrt the lifecycle of a node. I don't particularly want to hide that and/or let folks opt out of it in the long term.

From a reviewer point-of-view, my concern is me trying to remember all the possible permutations/states etc that are possible to make sure that new code doesn't break existing behavior. I haven't thought out whether adding this new API would make that worse or not, but then, I don't really want to have to think about it. So KISS as much as we can! :)

I'm a little surprised by this, to be honest. Here's why: allowing the initial state to be chosen from ENROLL/AVAILABLE from the latest version of the API is precisely as complex as allowing two versions of the API {old, new} where old creates nodes in AVAILABLE and new creates nodes in ENROLL. The only difference I can see is that eventually someday if {old} stops being supported, then and only then we can go through the code and clean things up.

It seems to me that the costs to us of supporting graceful transitions for users here are:

1) A new version NEWVER of the API that supports node state being one of {not supplied, AVAILABLE, ENROLL}, on creation, defaulting to AVAILABLE when not supplied.
2) Supporting the initial state of AVAILABLE indefinitely rather than just until we *delete* version 1.10.
3) CD deployments that had rolled forward to 1.11 will need to add the state parameter to their scripts to move forward to NEWVER.
4) Don't default the client to the versions between 1.10 and NEWVER at any point.
That seems like a very small price to pay on our side, and the benefits for users are that they can opt into the new functionality when they are ready.

-Rob

After thinking about this some more, I'm not actually going to address Rob's points above. What I want to do is go back and discuss... what do people think about having an API that allows the initial provision state to be specified, for a node that is created in Ironic. I'm assuming that enroll state exists :)

Again...

Earlier today on IRC, Devananda mentioned that there's a very strong case for allowing a node to be created in any of the stable states (enroll, manageable, available, active). Maybe he'll elaborate later on this. I know that there's a use case where there is a desire to import nodes (with instances on them) from another system into ironic, and have them be active right away. (They don't want the nodes to go from enroll -> verifying -> manageable -> cleaning!!! -> available!!! -> active).

And I want a node to be created in INSPECTING state directly. I don't care that it's a transient state, I just want it :) Oh, and can I please skip MANAGEABLE? I need the following flow: INSPECTING -> AVAILABLE.

Now seriously: to what degree are we going to allow people to break our state machine? Or alternatively, are we going to allow steps to happen automatically? I'm in favor of this idea actually, maybe someone feels like writing a spec?

1. What would the default provision state be, if it wasn't specified?
A. 'available' to be backwards compatible with pre-v1.11 or
B. 'enroll' to be consistent with v1.11+ or
?

B. No more breaking changes please.

2. What would it mean to set the initial provision state to something other than 'enroll'?

manageable
In our state machinery[0], a node goes from enroll -> verifying -> manageable. For manageable to be initial state, does it mean that
A. whatever is needed for enroll and verifying is done and succeeds (under the hood) or
B.
whatever is needed for enroll is done and succeeds (but no verifying) or
C. no enroll or verifying is done, it goes straight to manageable

A sounds nice, but that's not how our state machine currently works. Being able to skip states is really an interesting feature, but it requires somewhat broader discussion. And then yes, you should allow me to go straight into INSPECTING in this case :)

If it's not implied, then my
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
Hi,

After thinking about this some more, I'm not actually going to address Rob's points above. What I want to do is go back and discuss... what do people think about having an API that allows the initial provision state to be specified, for a node that is created in Ironic. I'm assuming that enroll state exists :)

Earlier today on IRC, Devananda mentioned that there's a very strong case for allowing a node to be created in any of the stable states (enroll, manageable, available, active). Maybe he'll elaborate later on this. I know that there's a use case where there is a desire to import nodes (with instances on them) from another system into ironic, and have them be active right away. (They don't want the nodes to go from enroll -> verifying -> manageable -> cleaning!!! -> available!!! -> active).

I would like to hear the more elaborated proposal before we start digging much into this problem.

1. What would the default provision state be, if it wasn't specified?
A. 'available' to be backwards compatible with pre-v1.11 or
B. 'enroll' to be consistent with v1.11+ or
?

2. What would it mean to set the initial provision state to something other than 'enroll'?

manageable
In our state machinery[0], a node goes from enroll -> verifying -> manageable. For manageable to be initial state, does it mean that
A. whatever is needed for enroll and verifying is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying) or
C. no enroll or verifying is done, it goes straight to manageable

I'm fine with A. I'm not sure that B makes sense and I definitely don't think C makes sense. To date, verifying means checking that the conductor can get the power state on the node, to verify the supplied power credentials. I don't think it is a big deal if we skip this step; it just means that the next time some action is taken on the node, it might fail.

available
In our state machinery, a node goes from enroll -> verifying -> manageable -> cleaning -> available.
For available to be initial state, does it mean that
A. whatever is needed for enroll, verifying, cleaning is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying or cleaning) or
??

active
In our state machinery, a node goes from enroll -> verifying -> manageable -> cleaning -> available -> deploying -> active. For active to be initial state, does it mean that
A. whatever is needed for enroll, verifying, cleaning, deploying is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying or cleaning) or
C. whatever is needed for enroll and I dunno, any 'takeover' stuff by conductor or whatever node states need to be updated to be in active?

What concerns me more about allowing a node to be enrolled in any stable state is that it's going to be a big change in our API. We have PATCH as a mechanism of updating a resource partially because we have read-only attributes (driver_internal_info, *_updated_at, etc...) in the API that are internal and should not be updated by the user. Some states might depend on them, i.e. a node in ACTIVE state might have indicators in the driver_internal_info field.

Another thing is that it's really cross-resource: a node in ACTIVE state will depend on the port that was used to deploy it (and on other things, such as registering that port in Neutron with the right DHCP information, so that if one is PXE booting after ACTIVE the node won't get stuck with a boot error). We also need to create the right TFTP (or TFTP + HTTP for iPXE) environment for that node.

Anyway, I don't want to get much deeper, I think we should all be open to hear what will be proposed with a fresh mind.

Cheers, Lucas
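The questions quoted above come down to which intermediate states sit between 'enroll' and a requested initial state. A small sketch, assuming only the linear path described in the thread (enroll -> verifying -> manageable -> cleaning -> available -> deploying -> active) and the four stable states Devananda listed; the names `HAPPY_PATH`, `bypassed_states` and `bypassed_work` are made up for illustration and this is not how Ironic models its state machine:

```python
# The linear path described in the thread, in order.
HAPPY_PATH = [
    "enroll", "verifying", "manageable",
    "cleaning", "available", "deploying", "active",
]
# The stable states mentioned as candidate initial states.
STABLE_STATES = {"enroll", "manageable", "available", "active"}


def bypassed_states(initial_state):
    """Every state a node would normally pass through before initial_state."""
    if initial_state not in STABLE_STATES:
        raise ValueError("not a stable state: %r" % initial_state)
    return HAPPY_PATH[:HAPPY_PATH.index(initial_state)]


def bypassed_work(initial_state):
    """Only the transient states, i.e. the steps where real work happens."""
    return [s for s in bypassed_states(initial_state) if s not in STABLE_STATES]


for state in ("enroll", "manageable", "available", "active"):
    print(state, "skips", bypassed_work(state))
```

Under option A each of those transient steps would still run under the hood; under the other options they would be skipped outright, and for 'active' that list includes deploying, which is where the driver_internal_info, port and TFTP concerns above come from.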
Re: [openstack-dev] [ironic] Re: New API for node create, specifying initial provision state
On 17 August 2015 at 20:20, Robert Collins robe...@robertcollins.net wrote:

On 11 August 2015 at 06:13, Ruby Loo rlooya...@gmail.com wrote:

Hi, sorry for the delay. I vote no. I understand the rationale of trying to do things so that we don't break our users but that's what the versioning is meant for and more importantly -- I think adding the ENROLL state is fairly important wrt the lifecycle of a node. I don't particularly want to hide that and/or let folks opt out of it in the long term.

From a reviewer point-of-view, my concern is me trying to remember all the possible permutations/states etc that are possible to make sure that new code doesn't break existing behavior. I haven't thought out whether adding this new API would make that worse or not, but then, I don't really want to have to think about it. So KISS as much as we can! :)

I'm a little surprised by this, to be honest. Here's why: allowing the initial state to be chosen from ENROLL/AVAILABLE from the latest version of the API is precisely as complex as allowing two versions of the API {old, new} where old creates nodes in AVAILABLE and new creates nodes in ENROLL. The only difference I can see is that eventually someday if {old} stops being supported, then and only then we can go through the code and clean things up.

It seems to me that the costs to us of supporting graceful transitions for users here are:

1) A new version NEWVER of the API that supports node state being one of {not supplied, AVAILABLE, ENROLL}, on creation, defaulting to AVAILABLE when not supplied.
2) Supporting the initial state of AVAILABLE indefinitely rather than just until we *delete* version 1.10.
3) CD deployments that had rolled forward to 1.11 will need to add the state parameter to their scripts to move forward to NEWVER.
4) Don't default the client to the versions between 1.10 and NEWVER at any point.
That seems like a very small price to pay on our side, and the benefits for users are that they can opt into the new functionality when they are ready.

-Rob

After thinking about this some more, I'm not actually going to address Rob's points above. What I want to do is go back and discuss... what do people think about having an API that allows the initial provision state to be specified, for a node that is created in Ironic. I'm assuming that enroll state exists :)

Earlier today on IRC, Devananda mentioned that there's a very strong case for allowing a node to be created in any of the stable states (enroll, manageable, available, active). Maybe he'll elaborate later on this. I know that there's a use case where there is a desire to import nodes (with instances on them) from another system into ironic, and have them be active right away. (They don't want the nodes to go from enroll -> verifying -> manageable -> cleaning!!! -> available!!! -> active).

1. What would the default provision state be, if it wasn't specified?
A. 'available' to be backwards compatible with pre-v1.11 or
B. 'enroll' to be consistent with v1.11+ or
?

2. What would it mean to set the initial provision state to something other than 'enroll'?

manageable
In our state machinery[0], a node goes from enroll -> verifying -> manageable. For manageable to be initial state, does it mean that
A. whatever is needed for enroll and verifying is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying) or
C. no enroll or verifying is done, it goes straight to manageable

I'm fine with A. I'm not sure that B makes sense and I definitely don't think C makes sense. To date, verifying means checking that the conductor can get the power state on the node, to verify the supplied power credentials. I don't think it is a big deal if we skip this step; it just means that the next time some action is taken on the node, it might fail.
available
In our state machinery, a node goes from enroll -> verifying -> manageable -> cleaning -> available. For available to be initial state, does it mean that
A. whatever is needed for enroll, verifying, cleaning is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying or cleaning) or
??

active
In our state machinery, a node goes from enroll -> verifying -> manageable -> cleaning -> available -> deploying -> active. For active to be initial state, does it mean that
A. whatever is needed for enroll, verifying, cleaning, deploying is done and succeeds (under the hood) or
B. whatever is needed for enroll is done and succeeds (but no verifying or cleaning) or
C. whatever is needed for enroll and I dunno, any 'takeover' stuff by conductor or whatever node states need to be updated to be in active?

--ruby

[0] http://docs.openstack.org/developer/ironic/dev/states.html