Then option 2 sounds reasonable. Just log a warning if OpenStack fails to delete the instance, but continue to load the new instance. It seems okay to me.
Best regards,
Young

On Wed, Jul 23, 2014 at 12:38 PM, Cameron Mann <[email protected]> wrote:

> The new instance that gets created is completely independent from the old
> one; both could remain powered on with no problems. OpenStack handles the
> IP addresses, and if one is already in use it won't be assigned again. Just
> in case there's any confusion, the issue before was that the module was
> looking up the instance using an identifier which wouldn't be unique
> between the new and old instances. Now that it's using a unique identifier,
> it could never delete an instance and there'd still never be a conflict,
> barring a problem in OpenStack.
>
> Cameron
>
>
> On Wed, Jul 23, 2014 at 9:31 AM, Andy Kurth <[email protected]> wrote:
>
> > Option 2 sounds reasonable as long as there is no possibility that a VM
> > which could not be deleted can cause conflicts with other VMs. The main
> > concern would be if the VM remained powered on with an IP address that
> > could be reused. Would this be possible? If you encounter a problem
> > deleting a VM, can you reliably verify that it is powered off?
> >
> > I think the priorities should be:
> > 1) avoid conflicts
> > 2) reduce reservation failures
> > 3) optimize load time
> >
> > Load time is important, but the deletion issue should really only affect
> > the user experience when the image requested is not preloaded. If VMs are
> > being reloaded for every reservation then there is another problem.
> >
> > -Andy
> >
> >
> > On Tue, Jul 22, 2014 at 3:37 PM, Cameron Mann <[email protected]>
> > wrote:
> >
> > > Agreed, this goes back to more general VCL behaviour so it would be
> > > good to get others' input.
> > >
> > > Cameron
> > >
> > >
> > > On Tue, Jul 22, 2014 at 1:22 PM, YOUNG OH <[email protected]>
> > > wrote:
> > >
> > > > That's a great observation. I think options 1 and 2 can provide fast
> > > > load times since the load never fails.
> > > > But, to avoid quota issues, an admin should periodically check all
> > > > the instances for duplicate instance names and defunct instances.
> > > > Option 3 is safe, but loading is slow if an instance deletion fails.
> > > > I also agree that option 2 would be a good choice because it gives
> > > > end users lower load times and because we cannot estimate the
> > > > deletion time exactly. But I hope to hear others' thoughts.
> > > >
> > > > Best regards
> > > > Young
> > > >
> > > >
> > > > On Tue, Jul 22, 2014 at 2:46 PM, Cameron Mann <[email protected]>
> > > > wrote:
> > > >
> > > > > Looks good. Though I do wonder if it's necessary to fail the entire
> > > > > load process just because the old instance doesn't get deleted. I
> > > > > think there are three possibilities:
> > > > >
> > > > > 1. Don't check for successful deletion; there won't be any
> > > > >    conflicts because we're using openstackComputerMap. This would
> > > > >    give the fastest load times, but the only way to find out that
> > > > >    something went wrong would be to look at the list of instances
> > > > >    and see if there are any duplicate names. Could cause issues
> > > > >    with quotas if running near capacity since there could be extra
> > > > >    instances lying around.
> > > > >
> > > > > 2. Check for successful deletion, but only log the error, don't
> > > > >    fail the load. Slower load times, but the load won't fail and
> > > > >    the error will be logged. Could cause issues with quotas if
> > > > >    running near capacity since there could be extra instances
> > > > >    lying around.
> > > > >
> > > > > 3. What the module does now: check for successful deletion and fail
> > > > >    if not. Least end-user friendly since they might encounter
> > > > >    failures, but the safest option.
> > > > >    Won't cause quota issues on its own, though an admin could
> > > > >    still change the computer back to available without deleting
> > > > >    the defunct instance.
> > > > >
> > > > > Instance deletion time is also not very consistent in my
> > > > > experience; I've seen anything from seconds to over a minute, and I
> > > > > imagine it could go higher on OpenStack systems that see heavier
> > > > > usage. If we stick with option 3 I'd recommend bumping the timeout
> > > > > by another minute or two just to be safe. I think it's less
> > > > > necessary for option 2 since it doesn't fail on timeout.
> > > > >
> > > > > I took a look at some of the other provisioning modules to see what
> > > > > they do:
> > > > >
> > > > > - VMware logs a warning if it fails to delete the old VM, but only
> > > > >   fails if the VM is still responding to SSH
> > > > > - Libvirt fails if deletion fails
> > > > > - VirtualBox doesn't check for successful deletion, though it will
> > > > >   fail if it can't find the old VM to delete
> > > > >
> > > > > I think options 2 or 3 would be most consistent with existing
> > > > > behaviour. I'd lean towards option 2 since end users won't see any
> > > > > extra failures and we can keep a lower timeout, which will mean
> > > > > lower load times even if a deletion takes a long time.
> > > > >
> > > > > What are your thoughts?
> > > > >
> > > > > Cameron
> > > > >
> > > > >
> > > > > On Tue, Jul 22, 2014 at 9:05 AM, YOUNG OH <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Cameron,
> > > > > >
> > > > > > Yes, you are definitely right. I noticed that using the hostname
> > > > > > to find the OpenStack instance ID does not work properly and can
> > > > > > also cause the problem you described.
> > > > > > I've gone back to using the openstackComputerMap table for
> > > > > > get_os_instance_id when the instance is pingable, and also added
> > > > > > a loop in _terminate_os_instance to check whether the instance
> > > > > > is completely deleted. Please take a look at it again and let me
> > > > > > know if you have any concerns. Thank you.
> > > > > >
> > > > > > Best regards,
> > > > > > Young
> > > > > >
> > > > > >
> > > > > > On Mon, Jul 21, 2014 at 4:18 PM, Cameron Mann
> > > > > > <[email protected]> wrote:
> > > > > >
> > > > > > > Sounds like good progress to me. One comment though: it looks
> > > > > > > like _terminate_os_instance does the DELETE request, checks
> > > > > > > the response for success and then sleeps for 30 seconds while
> > > > > > > the instance deletes. However, I don't believe a successful
> > > > > > > response to the DELETE request guarantees the instance will
> > > > > > > actually be deleted. I've run into situations where an
> > > > > > > instance gets stuck in the error or deleting states but the
> > > > > > > command line client reports no errors when trying to delete
> > > > > > > it. This could result in a situation where multiple instances
> > > > > > > with the same name exist, which could cause
> > > > > > > _get_os_instance_id to return the wrong ID since it filters
> > > > > > > the instances based on name and selects the first in the list.
> > > > > > >
> > > > > > > I think either returning to using openstackComputerMap or
> > > > > > > looping with a timeout until the instance is actually deleted
> > > > > > > would be better choices. The former would allow the new
> > > > > > > instance to be created even if the deletion of the old one
> > > > > > > fails.
> > > > > > > The latter would put the computer in VCL into an error state,
> > > > > > > which would make it more obvious something has gone wrong,
> > > > > > > though at the cost of potentially failing a user's
> > > > > > > reservation. As an added precaution it might also be worth
> > > > > > > having _get_os_instance_id fail if there's more than one
> > > > > > > instance in the response.
> > > > > > >
> > > > > > > Cameron
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Jul 18, 2014 at 9:25 AM, YOUNG OH
> > > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > > Cameron,
> > > > > > > >
> > > > > > > > I hope you had a great time, and welcome back to work :-).
> > > > > > > > And, yes, the OpenStack module that uses the OpenStack APIs
> > > > > > > > directly can solve the concerns we've discussed, and it's
> > > > > > > > more flexible for adopting new versions of the OpenStack
> > > > > > > > APIs, if necessary. In the updated openstack module, I've
> > > > > > > > changed two main things. First, I've used the hostname in
> > > > > > > > the Computer table (unique within the same VCL database) to
> > > > > > > > create an instance and to get the instance ID to terminate,
> > > > > > > > rather than using the openstackComputerMap table. This
> > > > > > > > avoids an additional table and extra database transactions.
> > > > > > > > Second, I've replaced the openstackImageMap table with an
> > > > > > > > openstackimagerevision table that maps the imagerevision ID
> > > > > > > > to the OpenStack image ID. This table consists of three
> > > > > > > > fields (imagerevisionid, imagedetails, flavordetails). The
> > > > > > > > imagedetails and flavordetails fields contain the detailed
> > > > > > > > image and flavor information in JSON format.
> > > > > > > > Thus, when VCL creates an instance, it reads both detail
> > > > > > > > fields and parses them to find the corresponding OpenStack
> > > > > > > > image ID and flavor ID. In addition, I've implemented the
> > > > > > > > get_image_size() subroutine because image size information
> > > > > > > > was not supported in OpenStack Essex but is supported now.
> > > > > > > > This is a short summary of the changes. So, if you have any
> > > > > > > > concerns or questions about the updates, please let me
> > > > > > > > know. Thank you.
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > > Young-Hyun
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Jul 17, 2014 at 11:53 AM, Cameron Mann
> > > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Sorry for the silence from my end, I realized I forgot to
> > > > > > > > > mention I was going to be on vacation for the last week
> > > > > > > > > and a half. Anyways, it looks like Young's updates have
> > > > > > > > > addressed the main concerns we were having with regards
> > > > > > > > > to the command line client. Given the progress he's made
> > > > > > > > > we think going ahead with his module makes the most sense.
> > > > > > > > >
> > > > > > > > > Cameron
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Jul 16, 2014 at 9:27 AM, YOUNG OH
> > > > > > > > > <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Andy,
> > > > > > > > > >
> > > > > > > > > > Thank you for your comments. I've tried to apply what
> > > > > > > > > > you suggested and committed my module again. This
> > > > > > > > > > module now finds all OpenStack information using the
> > > > > > > > > > OpenStack APIs and the database. Thank you.
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > > Young-Hyun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Jul 9, 2014 at 10:24 AM, Andy Kurth
> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks Young. Looks good! If I understand correctly,
> > > > > > > > > > > you are avoiding the need to use the CLI or CPAN
> > > > > > > > > > > module by interacting directly with OpenStack via the
> > > > > > > > > > > REST API?
> > > > > > > > > > >
> > > > > > > > > > > It looks like the only commands you're running on the
> > > > > > > > > > > management node are "nova" and "qemu-img" in
> > > > > > > > > > > _get_flavor_type. Would it be possible to accomplish
> > > > > > > > > > > this via the API? I haven't traced through how your
> > > > > > > > > > > code works too deeply, but was wondering if the
> > > > > > > > > > > following could be used:
> > > > > > > > > > > http://docs.openstack.org/api/openstack-compute/2/content/Flavors-d1e4180.html
> > > > > > > > > > >
> > > > > > > > > > > It would be wonderful if you can eliminate the need
> > > > > > > > > > > for these to be executed. This would mean a pure API
> > > > > > > > > > > solution with nothing special needing to be installed
> > > > > > > > > > > on the management node.
> > > > > > > > > > >
> > > > > > > > > > > If you do need to call these commands, I see that qx
> > > > > > > > > > > and backticks are used to run commands on the
> > > > > > > > > > > management node.
> > > > > > > > > > > Please change this to use:
> > > > > > > > > > > my ($exit_status, $output) = $self->mn_os->execute($command);
> > > > > > > > > > >
> > > > > > > > > > > Also, always, always, always make sure $output and
> > > > > > > > > > > anything else you try to parse with a regex are
> > > > > > > > > > > defined first. This will avoid some nasty "Use of
> > > > > > > > > > > uninitialized value in pattern match" errors which
> > > > > > > > > > > could potentially lead to the entire process dying.
> > > > > > > > > > >
> > > > > > > > > > > The indentation looks great! :) There are a few
> > > > > > > > > > > places where the curly bracket style could be
> > > > > > > > > > > modified. Just about all of the existing code places
> > > > > > > > > > > opening brackets on the same line as the while/for
> > > > > > > > > > > statement such as:
> > > > > > > > > > > while ($loop > 0) {
> > > > > > > > > > > -instead of-
> > > > > > > > > > > while ($loop > 0)
> > > > > > > > > > > {
> > > > > > > > > > >
> > > > > > > > > > > Please add a pod "=head2 subroutine_name ... =cut"
> > > > > > > > > > > heading for every subroutine. This is helpful for
> > > > > > > > > > > others to read/understand your code. The pod syntax
> > > > > > > > > > > can be a bit finicky. You can tell if it is formatted
> > > > > > > > > > > properly by running "pod2text openstack.pm".
> > > > > > > > > > >
> > > > > > > > > > > Lastly (as mainly a reminder), we will need to
> > > > > > > > > > > incorporate all of the database changes in vcl.sql
> > > > > > > > > > > and whatever method we use for the next release to
> > > > > > > > > > > replace update-vcl.sql. I made a reminder comment
> > > > > > > > > > > here:
> > > > > > > > > > > https://issues.apache.org/jira/browse/VCL-764
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Andy
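
For reference, option 2 from the thread (issue the delete, poll until the old instance is really gone or a timeout passes, then log a warning and keep loading) could be sketched roughly as below. This is Python for illustration only, not the VCL module's Perl code. The function names, the callables passed in, and the default timeout and polling interval are all hypothetical; in a real module the `instance_exists` check would be a query against the OpenStack API.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("openstack_sketch")  # hypothetical logger name


def wait_for_deletion(instance_exists: Callable[[], bool],
                      timeout_s: float = 60.0,
                      interval_s: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until instance_exists() is False; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if not instance_exists():
            return True           # instance is confirmed gone
        if time.monotonic() >= deadline:
            return False          # still present after the timeout
        sleep(interval_s)


def terminate_then_load(delete_instance: Callable[[], None],
                        instance_exists: Callable[[], bool],
                        create_instance: Callable[[], str],
                        **wait_kwargs) -> str:
    """Option 2: a deletion that never completes is logged, not fatal."""
    delete_instance()
    if not wait_for_deletion(instance_exists, **wait_kwargs):
        log.warning("old instance still present after timeout; "
                    "continuing with load")
    return create_instance()
```

Because the load continues either way, the timeout here only bounds how long the load is delayed; leftover instances would still need periodic cleanup to avoid the quota problems mentioned above.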

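
The extra precaution suggested for _get_os_instance_id (fail when a name lookup matches more than one instance instead of taking the first) might look something like the following. Again this is an illustrative Python sketch, not the module's code: `pick_instance_id` is a hypothetical helper, and the list of dicts with "id" and "name" keys is an assumed shape for an already-parsed server list.

```python
from typing import Dict, List, Optional


def pick_instance_id(servers: List[Dict[str, str]], name: str) -> Optional[str]:
    """Return the ID of the single instance called `name`, or None.

    Raises LookupError when several instances share the name, since
    picking the first one could target the wrong instance.
    """
    matches = [s["id"] for s in servers if s.get("name") == name]
    if len(matches) > 1:
        raise LookupError(
            f"{len(matches)} instances named {name!r}; refusing to guess")
    return matches[0] if matches else None
```

With a unique key such as the openstackComputerMap entry this ambiguity should not arise, so the check mainly matters for name-based lookups.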