Looks good. Though I do wonder if it's necessary to fail the entire load process just because the old instance doesn't get deleted. I think there are three possibilities:
1. Don't check for successful deletion; there won't be any conflicts because we're using openstackComputerMap. This would give the fastest load times, but the only way to find out that something went wrong would be to look at the list of instances and see if there are any duplicate names. Could cause issues with quotas if running near capacity since there could be extra instances lying around.

2. Check for successful deletion, but only log the error; don't fail the load. Slower load times, but the load won't fail and the error will be logged (see the sketch after this list). Could cause issues with quotas if running near capacity since there could be extra instances lying around.

3. What the module does now: check for successful deletion and fail if not. Least end-user friendly since they might encounter failures, but the safest option. Won't cause quota issues on its own, though an admin could still change the computer back to available without deleting the defunct instance.

Instance deletion time is also not very consistent in my experience; I've seen anything from seconds to over a minute, and I imagine it could go higher on OpenStack systems that see heavier usage. If we stick with option 3 I'd recommend bumping the timeout by another minute or two just to be safe. I think it's less necessary for option 2 since it doesn't fail on timeout.
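To make option 2 concrete, here's a rough sketch of the kind of loop I have in mind for _terminate_os_instance. The helper names _delete_os_instance and _get_os_instance_ids_by_name are placeholders for whatever the module actually exposes, the timeout is just a starting point, and notify()/$ERRORS are the usual VCL::utils logging bits:

sub _terminate_os_instance {
	my ($self, $instance_id, $instance_name) = @_;

	# Send the DELETE request; a success response does not guarantee
	# the instance will actually be removed (placeholder helper)
	if (!$self->_delete_os_instance($instance_id)) {
		notify($ERRORS{'WARNING'}, 0, "DELETE request failed for instance $instance_id");
		return 0;
	}

	# Poll until no instance with this name remains
	my $timeout_seconds = 120;
	my $end_time = time() + $timeout_seconds;
	while (time() < $end_time) {
		# Placeholder helper: returns IDs of instances matching the name
		my @matching_ids = $self->_get_os_instance_ids_by_name($instance_name);
		return 1 if !@matching_ids;
		sleep 10;
	}

	# Option 2 behaviour: log the problem but let the load continue
	notify($ERRORS{'WARNING'}, 0, "instance $instance_id was not deleted within $timeout_seconds seconds, continuing anyway");
	return 1;
}

The only behavioural difference for option 3 would be returning 0 instead of 1 after the timeout.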
I took a look at some of the other provisioning modules to see what they do:
- VMware logs a warning if it fails to delete the old VM, but only fails if the VM is still responding to SSH
- Libvirt fails if deletion fails
- VirtualBox doesn't check for successful deletion, though it will fail if it can't find the old VM to delete

I think options 2 or 3 would be most consistent with existing behaviour. I'd lean towards option 2 since end users won't see any extra failures and we can keep a lower timeout, which will mean lower load times even if a deletion takes a long time.
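On a related note, the _get_os_instance_id safeguard I suggested in my last message (quoted below) could be as simple as the following. The $servers argument stands in for the decoded "servers" array from the GET /servers response; the real structure and how it gets here may differ.

sub _get_os_instance_id {
	my ($self, $instance_name, $servers) = @_;

	# Collect the IDs of every instance whose name matches exactly
	my @ids = map { $_->{id} }
	          grep { defined($_->{name}) && $_->{name} eq $instance_name } @$servers;

	# Refuse to guess if the name is ambiguous
	if (scalar(@ids) > 1) {
		notify($ERRORS{'WARNING'}, 0, "found " . scalar(@ids) . " instances named '$instance_name': @ids");
		return;
	}
	if (!@ids) {
		notify($ERRORS{'WARNING'}, 0, "no instance named '$instance_name' found");
		return;
	}
	return $ids[0];
}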
What are your thoughts?

Cameron

On Tue, Jul 22, 2014 at 9:05 AM, YOUNG OH <[email protected]> wrote:

> Cameron,
>
> Yes, you are definitely right. I noticed that using the hostname to find the openstack instance id does not work properly and can also cause the problem you described. I've gone back to using the openstackComputerMap table in _get_os_instance_id when the instance is pingable, and also added a loop in _terminate_os_instance to check whether the instance is completely deleted or not. Please take a look at it again and let me know if you have any concerns. Thank you.
>
> Best regards,
> Young
>
> On Mon, Jul 21, 2014 at 4:18 PM, Cameron Mann <[email protected]> wrote:
>
> > Sounds like good progress to me. One comment though: it looks like _terminate_os_instance does the DELETE request, checks the response for success and then sleeps for 30 seconds while the instance deletes. However, I don't believe a successful response to the DELETE request guarantees the instance will actually be deleted. I've run into situations where an instance gets stuck in the error or deleting states but the command line client reports no errors when trying to delete it. This could result in a situation where multiple instances with the same name exist, which could cause _get_os_instance_id to return the wrong ID since it filters the instances based on name and selects the first in the list.
> >
> > I think either returning to using openstackComputerMap or looping with a timeout until the instance is actually deleted would be better choices. The former would allow the new instance to be created even if the deletion of the old one fails. The latter would put the computer in VCL into an error state, which would make it more obvious something has gone wrong, though at the cost of potentially failing a user's reservation. As an added precaution, it might also be worth having _get_os_instance_id fail if there's more than one instance in the response.
> >
> > Cameron
> >
> > On Fri, Jul 18, 2014 at 9:25 AM, YOUNG OH <[email protected]> wrote:
> >
> > > Cameron,
> > >
> > > I hope you had a great time and welcome back to work :-). And yes, the OpenStack module that uses the OpenStack APIs directly can resolve the concerns we've discussed, and it is more flexible for adopting new versions of the OpenStack APIs if necessary. In the updated openstack module, I've changed two main things. First, I've used the hostname in the Computer table (unique within the same VCL database) to create an instance and to get the instance id to terminate, rather than using the openstackComputerMap table. This avoids an additional table and the associated database transactions. Second, I've changed the openstackImageMap table to the openstackimagerevision table, which maps the imagerevision id to the openstack image id. This table consists of three fields (imagerevisionid, imagedetails, flavordetails). The imagedetails and flavordetails fields contain the detailed image and flavor information in JSON format. Thus, when VCL creates an instance, it retrieves each piece of detail information and parses it to find the corresponding openstack image id and flavor id. In addition, I've implemented the get_image_size() subroutine because the image size information was not available in OpenStack Essex but is supported now. This is a short summary of the changes. If you have any concerns or questions about the updates, please let me know. Thank you.
> > >
> > > Best regards,
> > > Young-Hyun
> > >
> > > On Thu, Jul 17, 2014 at 11:53 AM, Cameron Mann <[email protected]> wrote:
> > >
> > > > Sorry for the silence from my end; I realized I forgot to mention I was going to be on vacation for the last week and a half. Anyways, it looks like Young's updates have addressed the main concerns we were having with regards to the command line client. Given the progress he's made, we think going ahead with his module makes the most sense.
> > > >
> > > > Cameron
> > > >
> > > > On Wed, Jul 16, 2014 at 9:27 AM, YOUNG OH <[email protected]> wrote:
> > > >
> > > > > Andy,
> > > > >
> > > > > Thank you for your comments. I've tried to apply what you suggested and committed my module again. This module finds all openstack information using the OpenStack APIs and the database. Thank you.
> > > > >
> > > > > Best regards,
> > > > > Young-Hyun
> > > > >
> > > > > On Wed, Jul 9, 2014 at 10:24 AM, Andy Kurth <[email protected]> wrote:
> > > > >
> > > > > > Thanks Young. Looks good! If I understand correctly, you are avoiding the need to use the CLI or CPAN module by interacting directly with OpenStack via the REST API?
> > > > > >
> > > > > > It looks like the only commands you're running on the management node are "nova" and "qemu-img" in _get_flavor_type. Would it be possible to accomplish this via the API? I haven't traced through how your code works too deeply, but was wondering if the following could be used:
> > > > > > http://docs.openstack.org/api/openstack-compute/2/content/Flavors-d1e4180.html
> > > > > >
> > > > > > It would be wonderful if you could eliminate the need for these to be executed. This would mean a pure API solution with nothing special needing to be installed on the management node.
> > > > > >
> > > > > > If you do need to call these commands, instead of using qx and backticks to run commands on the management node, please change this to use:
> > > > > > my ($exit_status, $output) = $self->mn_os->execute($command);
> > > > > >
> > > > > > Also, always, always, always make sure $output and anything else you try to parse with a regex are defined first. This will avoid some nasty "Use of uninitialized value in pattern match" errors which could potentially lead to the entire process dying.
> > > > > >
> > > > > > The indentation looks great! :) There are a few places where the curly bracket style could be modified. Just about all of the existing code places opening brackets on the same line as the while/for statement, such as:
> > > > > > while ($loop > 0) {
> > > > > > -instead of-
> > > > > > while ($loop > 0)
> > > > > > {
> > > > > >
> > > > > > Please add a pod "=head2 subroutine_name ... =cut" heading for every subroutine. This is helpful for others to read/understand your code. The pod syntax can be a bit finicky. You can tell if it is formatted properly by running "pod2text openstack.pm".
> > > > > >
> > > > > > Lastly (mainly as a reminder), we will need to incorporate all of the database changes in vcl.sql and whatever method we use for the next release to replace update-vcl.sql. I made a reminder comment here:
> > > > > > https://issues.apache.org/jira/browse/VCL-764
> > > > > >
> > > > > > Regards,
> > > > > > Andy
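P.S. While rereading the thread: the defensive pattern Andy describes (quoted above) would look roughly like this. The qemu-img command is only an example and the error check is illustrative; the key points are routing commands through $self->mn_os->execute() and never regex-matching $output before checking it is defined.

my $command = "qemu-img info /tmp/example.qcow2";   # example command only
my ($exit_status, $output) = $self->mn_os->execute($command);
if (!defined($output)) {
	# execute() can return undefined output; matching it with a regex
	# would trigger "Use of uninitialized value" warnings or worse
	notify($ERRORS{'WARNING'}, 0, "failed to run command on management node: $command");
	return;
}
if (grep(/error/i, @$output)) {
	notify($ERRORS{'WARNING'}, 0, "command generated errors: $command\n" . join("\n", @$output));
	return;
}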
