Re: [openstack-dev] [Ironic] backwards compat issue with PXEDeply and AgentDeploy drivers
One more try on the ops list as it turns out I wasn't subscribed... :( // jim > On Oct 14, 2015, at 17:39, Jim Rollenhagen wrote: > > Cross posting to the ops list for visibility. > > To follow up on this, we decided not to fix this after all. Besides > Ramesh's (very valid) points below, we realized that fixing this for out > of tree drivers that depended on Kilo behavior would break out of tree > drivers that implemented their own boot mechanism (e.g. virtual media), > or drivers that just generally didn't expect Ironic to handle booting > for them. > > Copying my patch to our release notes here, to explain the breakage and > how to fix: > > The AgentDeploy and ISCSIDeploy (formerly known as PXEDeploy) now depend > on drivers to mix in an instance of a BootInterface. For drivers that > exist out of tree, that use these deploy drivers directly, an error will > be thrown during deployment. For drivers that expect these deploy classes > to handle PXE booting, please mix in `ironic.drivers.modules.pxe.PXEBoot` > as self.boot. Drivers that handle booting itself (for example, a driver > that implements booting from virtual media) should mix in > `ironic.drivers.modules.fake.FakeBoot` as self.boot, to make calls to the > boot interface a no-op. Additionally, as mentioned before, > `ironic.drivers.modules.pxe.PXEDeploy` has moved to > `ironic.drivers.modules.iscsi_deploy.ISCSIDeploy`, which will break drivers > that mix this class in. > > To out of tree driver authors: the Ironic team apologizes profusely for > this inconvenience. We're meeting up in Tokyo to discuss our driver API > and the boundaries there; please join us! > > // jim > >> On Tue, Oct 06, 2015 at 12:05:37PM +0530, Ramakrishnan G wrote: >> Well it's nice to fix, but I really don't know if we should be fixing it. >> As discussed in one of the Ironic meetings before, we might need to define >> what is our driver API or SDK or DDK or whatever we choose to call it . >> Please see inline for my thoughts. >> >> On Tue, Oct 6, 2015 at 5:54 AM, Devananda van der Veen < >> devananda@gmail.com> wrote: >> >>> tldr; the boot / deploy interface split we did broke an out of tree >>> driver. I've proposed a patch. We should get a fix into stable/liberty too. >>> >>> Longer version... >>> >>> I was rebasing my AMTTool driver [0] on top of master because the in-tree >>> one still does not work for me, only to discover that my driver suddenly >>> failed to deploy. I have filed this bug >>> https://bugs.launchpad.net/ironic/+bug/1502980 >>> because we broke at least one out of tree driver (mine). I highly suspect >>> we've broken many other out of tree drivers that relied on either the >>> PXEDeploy or AgentDeploy interfaces that were present in Kilo release. Both >>> classes in Liberty are making explicit calls to "task.driver.boot" -- and >>> kilo-era driver classes did not define this interface. >> >> >> I would like to think more about what really our driver API is ? We have a >> couple of well defined interfaces in ironic/drivers/base.py which people >> may follow, implement an out-of-tree driver, make it a stevedore entrypoint >> and get it working with Ironic. >> >> But >> >> 1) Do we promise them that in-tree implementations of these interfaces will >> always exist. For example in boot/deploy work done in Liberty, we removed >> the class PxeDeploy [1]. It actually got broken down to PXEBoot and >> ISCSIDeploy. In the first place, do we guarantee that they will exist for >> ever in the same place with the same name ? :) >> >> 2) Do we really promise the in-tree implementations of these interfaces >> will behave the same way ? For example, the broken stuff AgentDeploy which >> is an implementation of our DeployInterface. Do we guarantee that this >> implementation will always keep doing what ever it was every time code is >> rebased ? >> >> [1] https://review.openstack.org/#/c/166513/19/ironic/drivers/modules/pxe.py >> >> >> >>> >>> I worked out a patch for the AgentDeploy driver and have proposed it here: >>> https://review.openstack.org/#/c/231215/1 >>> >>> I'd like to ask folks to review it quickly -- we should fix this ASAP and >>> backport it to stable/liberty before the next release, if possible. We >>> should also make a similar fix for the PXEDeploy class. If anyone gets to >>> this before I do, please reply here and let me know so we don't duplicate >>> effort. >> >> >> This isn't going to be as same as above as there is no longer a PXEDeploy >> class any more. We might need to create a new class PXEDeploy which >> probably inherits from ISCSIDeploy and has task.driver.boot worked around >> in the same way as the above patch. >> >> >> >>> >>> Also, Jim already spotted something in the review that is a bit >>> concerning. It seems like the IloVirtualMediaAgentVendorInterface class >>> expects the driver it is attached to *not* to have a boot interface and >>> *not* to call boot.clean_up_ramdisk. Conversely,
Re: [openstack-dev] [Ironic] backwards compat issue with PXEDeply and AgentDeploy drivers
Cross posting to the ops list for visibility. To follow up on this, we decided not to fix this after all. Besides Ramesh's (very valid) points below, we realized that fixing this for out of tree drivers that depended on Kilo behavior would break out of tree drivers that implemented their own boot mechanism (e.g. virtual media), or drivers that just generally didn't expect Ironic to handle booting for them. Copying my patch to our release notes here, to explain the breakage and how to fix: The AgentDeploy and ISCSIDeploy (formerly known as PXEDeploy) now depend on drivers to mix in an instance of a BootInterface. For drivers that exist out of tree, that use these deploy drivers directly, an error will be thrown during deployment. For drivers that expect these deploy classes to handle PXE booting, please mix in `ironic.drivers.modules.pxe.PXEBoot` as self.boot. Drivers that handle booting itself (for example, a driver that implements booting from virtual media) should mix in `ironic.drivers.modules.fake.FakeBoot` as self.boot, to make calls to the boot interface a no-op. Additionally, as mentioned before, `ironic.drivers.modules.pxe.PXEDeploy` has moved to `ironic.drivers.modules.iscsi_deploy.ISCSIDeploy`, which will break drivers that mix this class in. To out of tree driver authors: the Ironic team apologizes profusely for this inconvenience. We're meeting up in Tokyo to discuss our driver API and the boundaries there; please join us! // jim On Tue, Oct 06, 2015 at 12:05:37PM +0530, Ramakrishnan G wrote: > Well it's nice to fix, but I really don't know if we should be fixing it. > As discussed in one of the Ironic meetings before, we might need to define > what is our driver API or SDK or DDK or whatever we choose to call it . > Please see inline for my thoughts. > > On Tue, Oct 6, 2015 at 5:54 AM, Devananda van der Veen < > devananda@gmail.com> wrote: > > > tldr; the boot / deploy interface split we did broke an out of tree > > driver. I've proposed a patch. We should get a fix into stable/liberty too. > > > > Longer version... > > > > I was rebasing my AMTTool driver [0] on top of master because the in-tree > > one still does not work for me, only to discover that my driver suddenly > > failed to deploy. I have filed this bug > > https://bugs.launchpad.net/ironic/+bug/1502980 > > because we broke at least one out of tree driver (mine). I highly suspect > > we've broken many other out of tree drivers that relied on either the > > PXEDeploy or AgentDeploy interfaces that were present in Kilo release. Both > > classes in Liberty are making explicit calls to "task.driver.boot" -- and > > kilo-era driver classes did not define this interface. > > > > > I would like to think more about what really our driver API is ? We have a > couple of well defined interfaces in ironic/drivers/base.py which people > may follow, implement an out-of-tree driver, make it a stevedore entrypoint > and get it working with Ironic. > > But > > 1) Do we promise them that in-tree implementations of these interfaces will > always exist. For example in boot/deploy work done in Liberty, we removed > the class PxeDeploy [1]. It actually got broken down to PXEBoot and > ISCSIDeploy. In the first place, do we guarantee that they will exist for > ever in the same place with the same name ? :) > > 2) Do we really promise the in-tree implementations of these interfaces > will behave the same way ? For example, the broken stuff AgentDeploy which > is an implementation of our DeployInterface. Do we guarantee that this > implementation will always keep doing what ever it was every time code is > rebased ? > > [1] https://review.openstack.org/#/c/166513/19/ironic/drivers/modules/pxe.py > > > > > > > I worked out a patch for the AgentDeploy driver and have proposed it here: > > https://review.openstack.org/#/c/231215/1 > > > > I'd like to ask folks to review it quickly -- we should fix this ASAP and > > backport it to stable/liberty before the next release, if possible. We > > should also make a similar fix for the PXEDeploy class. If anyone gets to > > this before I do, please reply here and let me know so we don't duplicate > > effort. > > > > > This isn't going to be as same as above as there is no longer a PXEDeploy > class any more. We might need to create a new class PXEDeploy which > probably inherits from ISCSIDeploy and has task.driver.boot worked around > in the same way as the above patch. > > > > > > > Also, Jim already spotted something in the review that is a bit > > concerning. It seems like the IloVirtualMediaAgentVendorInterface class > > expects the driver it is attached to *not* to have a boot interface and > > *not* to call boot.clean_up_ramdisk. Conversely, other drivers may be > > expecting AgentVendorInterface to call boot.clean_up_ramdisk -- since that > > was its default behavior in Kilo. I'm not sure what the right way to fix > > this is, but I lean towards updating the in-tree driver so
Re: [openstack-dev] [Ironic] backwards compat issue with PXEDeply and AgentDeploy drivers
On Wed, Oct 07, 2015 at 12:41:55PM -0700, Devananda van der Veen wrote: > Ramesh, > > I thought about your points over night, and then looked at our in-tree > driver code from stable/kilo and asked myself, "what if this driver was out > of tree?" They'd all have broken -- for very similar reasons as what I > encountered with my demo driver. > > When we split the boot and deploy interfaces, we kept compatibility only at > the boundary between ConductorManager and the Driver class. That's all we > set out to do, because that's all we defined the interface to be. I can > accept that, but I'd like us to think about whether the driver interface: > a) is merely the interfaces defined in ironic/drivers/base.py, as we > previously defined, or > b) also includes one or more of the hardware-agnostic interface > implementations (eg, PXEBoot, AgentDeploy, AgentRAID, inspector.Inspector) > > As recent experience has taught me, these classes provide essential > primitives for building new hardware drivers. If we want to support > development of hardware drivers out of tree, but we don't want to include > (b) in our definition of the API, then we are signalling that such drivers > must be implemented entirely out of tree (iow, they're not allowed to > borrow *any* functionality from ironic/drivers/modules/*). > > And if we're signalling that, and someone goes and implements such a driver > and then later wants to propose it upstream -- how will we feel about > accepting a completely alternative implementation of, say, the pxe boot > driver? I agree; there's some hard things to think about here. I'd like to get a definition of our driver API solidified and documented during Mitaka. It's odd, because there are two pieces to (b) above; the names/methods provided, and the actual behavior. I'm not sure that AgentDeploy is not backwards compatible when it comes to names or methods, but we've certainly changed the behavior. I've added this topic to our design summit proposal list. As for the original problem, I like ramesh's idea with the FakeBoot driver. I think it will cause the least amount of breakage to out-of-tree drivers, and still allow for an easy fix. We *do* need to be very loud about this in release notes, ops mailing list, etc. Deva, could you update the patch ASAP? I'd like to get it landed tomorrow, so we can backport and get 4.2.1 out the door this week. Thanks for digging into this! :) // jim __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ironic] backwards compat issue with PXEDeply and AgentDeploy drivers
Ramesh, I thought about your points over night, and then looked at our in-tree driver code from stable/kilo and asked myself, "what if this driver was out of tree?" They'd all have broken -- for very similar reasons as what I encountered with my demo driver. When we split the boot and deploy interfaces, we kept compatibility only at the boundary between ConductorManager and the Driver class. That's all we set out to do, because that's all we defined the interface to be. I can accept that, but I'd like us to think about whether the driver interface: a) is merely the interfaces defined in ironic/drivers/base.py, as we previously defined, or b) also includes one or more of the hardware-agnostic interface implementations (eg, PXEBoot, AgentDeploy, AgentRAID, inspector.Inspector) As recent experience has taught me, these classes provide essential primitives for building new hardware drivers. If we want to support development of hardware drivers out of tree, but we don't want to include (b) in our definition of the API, then we are signalling that such drivers must be implemented entirely out of tree (iow, they're not allowed to borrow *any* functionality from ironic/drivers/modules/*). And if we're signalling that, and someone goes and implements such a driver and then later wants to propose it upstream -- how will we feel about accepting a completely alternative implementation of, say, the pxe boot driver? Curious what others think... -deva On Mon, Oct 5, 2015 at 11:35 PM, Ramakrishnan G < rameshg87.openst...@gmail.com> wrote: > > Well it's nice to fix, but I really don't know if we should be fixing it. > As discussed in one of the Ironic meetings before, we might need to define > what is our driver API or SDK or DDK or whatever we choose to call it . > Please see inline for my thoughts. > > On Tue, Oct 6, 2015 at 5:54 AM, Devananda van der Veen < > devananda@gmail.com> wrote: > >> tldr; the boot / deploy interface split we did broke an out of tree >> driver. I've proposed a patch. We should get a fix into stable/liberty too. >> >> Longer version... >> >> I was rebasing my AMTTool driver [0] on top of master because the in-tree >> one still does not work for me, only to discover that my driver suddenly >> failed to deploy. I have filed this bug >> https://bugs.launchpad.net/ironic/+bug/1502980 >> because we broke at least one out of tree driver (mine). I highly suspect >> we've broken many other out of tree drivers that relied on either the >> PXEDeploy or AgentDeploy interfaces that were present in Kilo release. Both >> classes in Liberty are making explicit calls to "task.driver.boot" -- and >> kilo-era driver classes did not define this interface. >> > > > I would like to think more about what really our driver API is ? We have a > couple of well defined interfaces in ironic/drivers/base.py which people > may follow, implement an out-of-tree driver, make it a stevedore entrypoint > and get it working with Ironic. > > But > > 1) Do we promise them that in-tree implementations of these interfaces > will always exist. For example in boot/deploy work done in Liberty, we > removed the class PxeDeploy [1]. It actually got broken down to PXEBoot > and ISCSIDeploy. In the first place, do we guarantee that they will exist > for ever in the same place with the same name ? :) > > 2) Do we really promise the in-tree implementations of these interfaces > will behave the same way ? For example, the broken stuff AgentDeploy which > is an implementation of our DeployInterface. Do we guarantee that this > implementation will always keep doing what ever it was every time code is > rebased ? > > [1] > https://review.openstack.org/#/c/166513/19/ironic/drivers/modules/pxe.py > > > >> >> I worked out a patch for the AgentDeploy driver and have proposed it here: >> https://review.openstack.org/#/c/231215/1 >> >> I'd like to ask folks to review it quickly -- we should fix this ASAP and >> backport it to stable/liberty before the next release, if possible. We >> should also make a similar fix for the PXEDeploy class. If anyone gets to >> this before I do, please reply here and let me know so we don't duplicate >> effort. >> > > > This isn't going to be as same as above as there is no longer a PXEDeploy > class any more. We might need to create a new class PXEDeploy which > probably inherits from ISCSIDeploy and has task.driver.boot worked around > in the same way as the above patch. > > > >> >> Also, Jim already spotted something in the review that is a bit >> concerning. It seems like the IloVirtualMediaAgentVendorInterface class >> expects the driver it is attached to *not* to have a boot interface and >> *not* to call boot.clean_up_ramdisk. Conversely, other drivers may be >> expecting AgentVendorInterface to call boot.clean_up_ramdisk -- since that >> was its default behavior in Kilo. I'm not sure what the right way to fix >> this is, but I lean towards updating the in-tree driver so we remain >> backwards-
Re: [openstack-dev] [Ironic] backwards compat issue with PXEDeply and AgentDeploy drivers
Well it's nice to fix, but I really don't know if we should be fixing it. As discussed in one of the Ironic meetings before, we might need to define what is our driver API or SDK or DDK or whatever we choose to call it . Please see inline for my thoughts. On Tue, Oct 6, 2015 at 5:54 AM, Devananda van der Veen < devananda@gmail.com> wrote: > tldr; the boot / deploy interface split we did broke an out of tree > driver. I've proposed a patch. We should get a fix into stable/liberty too. > > Longer version... > > I was rebasing my AMTTool driver [0] on top of master because the in-tree > one still does not work for me, only to discover that my driver suddenly > failed to deploy. I have filed this bug > https://bugs.launchpad.net/ironic/+bug/1502980 > because we broke at least one out of tree driver (mine). I highly suspect > we've broken many other out of tree drivers that relied on either the > PXEDeploy or AgentDeploy interfaces that were present in Kilo release. Both > classes in Liberty are making explicit calls to "task.driver.boot" -- and > kilo-era driver classes did not define this interface. > I would like to think more about what really our driver API is ? We have a couple of well defined interfaces in ironic/drivers/base.py which people may follow, implement an out-of-tree driver, make it a stevedore entrypoint and get it working with Ironic. But 1) Do we promise them that in-tree implementations of these interfaces will always exist. For example in boot/deploy work done in Liberty, we removed the class PxeDeploy [1]. It actually got broken down to PXEBoot and ISCSIDeploy. In the first place, do we guarantee that they will exist for ever in the same place with the same name ? :) 2) Do we really promise the in-tree implementations of these interfaces will behave the same way ? For example, the broken stuff AgentDeploy which is an implementation of our DeployInterface. Do we guarantee that this implementation will always keep doing what ever it was every time code is rebased ? [1] https://review.openstack.org/#/c/166513/19/ironic/drivers/modules/pxe.py > > I worked out a patch for the AgentDeploy driver and have proposed it here: > https://review.openstack.org/#/c/231215/1 > > I'd like to ask folks to review it quickly -- we should fix this ASAP and > backport it to stable/liberty before the next release, if possible. We > should also make a similar fix for the PXEDeploy class. If anyone gets to > this before I do, please reply here and let me know so we don't duplicate > effort. > This isn't going to be as same as above as there is no longer a PXEDeploy class any more. We might need to create a new class PXEDeploy which probably inherits from ISCSIDeploy and has task.driver.boot worked around in the same way as the above patch. > > Also, Jim already spotted something in the review that is a bit > concerning. It seems like the IloVirtualMediaAgentVendorInterface class > expects the driver it is attached to *not* to have a boot interface and > *not* to call boot.clean_up_ramdisk. Conversely, other drivers may be > expecting AgentVendorInterface to call boot.clean_up_ramdisk -- since that > was its default behavior in Kilo. I'm not sure what the right way to fix > this is, but I lean towards updating the in-tree driver so we remain > backwards-compatible for out of tree drivers. > > > -Devananda > > [0] https://github.com/devananda/ironic/tree/new-amt-driver > > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Ironic] backwards compat issue with PXEDeply and AgentDeploy drivers
tldr; the boot / deploy interface split we did broke an out of tree driver. I've proposed a patch. We should get a fix into stable/liberty too. Longer version... I was rebasing my AMTTool driver [0] on top of master because the in-tree one still does not work for me, only to discover that my driver suddenly failed to deploy. I have filed this bug https://bugs.launchpad.net/ironic/+bug/1502980 because we broke at least one out of tree driver (mine). I highly suspect we've broken many other out of tree drivers that relied on either the PXEDeploy or AgentDeploy interfaces that were present in Kilo release. Both classes in Liberty are making explicit calls to "task.driver.boot" -- and kilo-era driver classes did not define this interface. I worked out a patch for the AgentDeploy driver and have proposed it here: https://review.openstack.org/#/c/231215/1 I'd like to ask folks to review it quickly -- we should fix this ASAP and backport it to stable/liberty before the next release, if possible. We should also make a similar fix for the PXEDeploy class. If anyone gets to this before I do, please reply here and let me know so we don't duplicate effort. Also, Jim already spotted something in the review that is a bit concerning. It seems like the IloVirtualMediaAgentVendorInterface class expects the driver it is attached to *not* to have a boot interface and *not* to call boot.clean_up_ramdisk. Conversely, other drivers may be expecting AgentVendorInterface to call boot.clean_up_ramdisk -- since that was its default behavior in Kilo. I'm not sure what the right way to fix this is, but I lean towards updating the in-tree driver so we remain backwards-compatible for out of tree drivers. -Devananda [0] https://github.com/devananda/ironic/tree/new-amt-driver __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev