Agreed, this goes back to more general VCL behaviour so it would be good to get others' input.
Cameron

On Tue, Jul 22, 2014 at 1:22 PM, YOUNG OH <[email protected]> wrote:

> That's a great observation. I think options 1 and 2 can provide fast
> loading times since the load never fails. But, to avoid quota issues, an
> admin should periodically check all the instances for duplicate instance
> names and defunct instances. Option 3 is safe but slow to load if deleting
> any instance fails. I also agree that option 2 would be a good choice
> because it can provide lower load times for end users and also we cannot
> exactly estimate the deletion time. But I hope to hear others' thoughts.
>
> Best regards
> Young
>
>
> On Tue, Jul 22, 2014 at 2:46 PM, Cameron Mann <[email protected]> wrote:
>
> > Looks good. Though I do wonder if it's necessary to fail the entire load
> > process just because the old instance doesn't get deleted. I think there
> > are three possibilities:
> >
> > 1. Don't check for successful deletion; there won't be any conflicts
> > because we're using openstackComputerMap. This would give the fastest load
> > times, but the only way to find out that something went wrong would be to
> > look at the list of instances and see if there are any duplicate names.
> > Could cause issues with quotas if running near capacity since there could
> > be extra instances lying around.
> >
> > 2. Check for successful deletion, but only log the error, don't fail the
> > load. Slower load times, but the load won't fail and the error will be
> > logged. Could cause issues with quotas if running near capacity since there
> > could be extra instances lying around.
> >
> > 3. What the module does now: check for successful deletion and fail if not.
> > Least end-user friendly since they might encounter failures, but the safest
> > option. Won't cause quota issues on its own, though an admin could still
> > change the computer back to available without deleting the defunct
> > instance.
> >
> > Instance deletion time is also not very consistent in my experience; I've
> > seen anything from seconds to over a minute and I imagine it could go
> > higher on OpenStack systems that see heavier usage. If we stick with option
> > 3 I'd recommend bumping the timeout by another minute or two just to be
> > safe. I think it's less necessary for option 2 since it doesn't fail on
> > timeout.
> >
> > I took a look at some of the other provisioning modules to see what they
> > do:
> >
> > - VMware logs a warning if it fails to delete the old VM, but only fails if
> > the VM is still responding to SSH
> > - Libvirt fails if deletion fails
> > - VirtualBox doesn't check for successful deletion, though it will fail if
> > it can't find the old VM to delete
> >
> > I think options 2 or 3 would be most consistent with existing behaviour.
> > I'd lean towards option 2 since end users won't see any extra failures and
> > we can keep a lower timeout which will mean lower load times even if a
> > deletion takes a long time.
> >
> > What are your thoughts?
> >
> > Cameron
> >
> >
> > On Tue, Jul 22, 2014 at 9:05 AM, YOUNG OH <[email protected]> wrote:
> >
> > > Cameron,
> > >
> > > Yes, you are definitely right. I noticed that using the hostname to find
> > > the openstack instance id does not work properly and can also cause the
> > > problem you described. I've gone back to using the openstackComputerMap
> > > table to get_os_instance_id when the instance is pingable and also added
> > > a loop in _terminate_os_instance to check whether the instance is
> > > completely deleted or not. Please take a look at it again and let me know
> > > if you have any concerns. Thank you.
> > >
> > > Best regards,
> > > Young
> > >
> > >
> > > On Mon, Jul 21, 2014 at 4:18 PM, Cameron Mann <[email protected]> wrote:
> > >
> > > > Sounds like good progress to me.
> > > > One comment though, it looks like
> > > > _terminate_os_instance does the DELETE request, checks the response for
> > > > success and then sleeps for 30 seconds while the instance deletes.
> > > > However, I don't believe a successful response to the DELETE request
> > > > guarantees the instance will actually be deleted. I've run into
> > > > situations where an instance gets stuck in the error or deleting states
> > > > but the command line client reports no errors when trying to delete it.
> > > > This could result in a situation where multiple instances with the same
> > > > name exist, which could cause _get_os_instance_id to return the wrong ID
> > > > since it filters the instances based on name and selects the first in
> > > > the list.
> > > >
> > > > I think either returning to using openstackComputerMap or looping with a
> > > > timeout until the instance is actually deleted would be better choices.
> > > > The former would allow the new instance to be created even if the
> > > > deletion of the old one fails. The latter would put the computer in VCL
> > > > into an error state which would make it more obvious something has gone
> > > > wrong, though at the cost of potentially failing a user's reservation.
> > > > As an added precaution it might also be worth having _get_os_instance_id
> > > > fail if there's more than one instance in the response.
> > > >
> > > > Cameron
> > > >
> > > >
> > > > On Fri, Jul 18, 2014 at 9:25 AM, YOUNG OH <[email protected]> wrote:
> > > >
> > > > > Cameron,
> > > > >
> > > > > I hope you had a great time and welcome back to work :-). And, yes, the
> > > > > OpenStack module that directly uses the OpenStack APIs can solve the
> > > > > concerns we've discussed and it's more flexible for adopting new
> > > > > versions of the OpenStack APIs, if necessary. In the updated openstack
> > > > > module, I've changed two main things.
> > > > > First, I've used the hostname in the Computer table (unique in the
> > > > > same VCL database) to create an instance and get the instance id to
> > > > > terminate rather than using the openstackComputerMap table. This can
> > > > > avoid using an additional table and database transactions. Second, I've
> > > > > changed the openstackImageMap to openstackimagerevision table that maps
> > > > > the imagerevision id with the openstack image id. This table consists
> > > > > of three fields (imagerevisionid, imagedetails, flavordetails). The
> > > > > imagedetails and flavordetails fields contain the detailed image and
> > > > > flavor information in JSON format. Thus, when VCL creates an instance,
> > > > > it gets each piece of detail information and parses it to find the
> > > > > corresponding openstack image id and flavor id. In addition, I've
> > > > > implemented the get_image_size() subroutine because the image size
> > > > > information was not supported in OpenStack ESSEX but it is supported
> > > > > now. This is a short summary of the changes. So, if you have any
> > > > > concerns or questions about the updates, please let me know. Thank you.
> > > > >
> > > > > Best regards,
> > > > > Young-Hyun
> > > > >
> > > > >
> > > > > On Thu, Jul 17, 2014 at 11:53 AM, Cameron Mann <[email protected]> wrote:
> > > > >
> > > > > > Sorry for the silence from my end, I realized I forgot to mention I
> > > > > > was going to be on vacation for the last week and a half. Anyways, it
> > > > > > looks like Young's updates have addressed the main concerns we were
> > > > > > having with regards to the command line client. Given the progress
> > > > > > he's made we think going ahead with his module makes the most sense.
> > > > > >
> > > > > > Cameron
> > > > > >
> > > > > >
> > > > > > On Wed, Jul 16, 2014 at 9:27 AM, YOUNG OH <[email protected]> wrote:
> > > > > >
> > > > > > > Andy,
> > > > > > >
> > > > > > > Thank you for your comments. I've tried to apply what you addressed
> > > > > > > and committed my module again. This module finds all openstack
> > > > > > > information using OpenStack APIs and database. Thank you.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Young-Hyun
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jul 9, 2014 at 10:24 AM, Andy Kurth <[email protected]> wrote:
> > > > > > >
> > > > > > > > Thanks Young. Looks good! If I understand correctly, you are
> > > > > > > > avoiding the need to use the CLI or cpan module by interacting
> > > > > > > > directly with OpenStack via the REST API?
> > > > > > > >
> > > > > > > > It looks like the only commands you're running on the management
> > > > > > > > node are "nova" and "qemu-img" in _get_flavor_type. Would it be
> > > > > > > > possible to accomplish this via the API? I haven't traced through
> > > > > > > > how your code works too deeply, but was wondering if the following
> > > > > > > > could be used:
> > > > > > > > http://docs.openstack.org/api/openstack-compute/2/content/Flavors-d1e4180.html
> > > > > > > >
> > > > > > > > It would be wonderful if you can eliminate the need for these to be
> > > > > > > > executed. This would mean a pure API solution with nothing special
> > > > > > > > needing to be installed on the management node.
> > > > > > > >
> > > > > > > > If you do need to call these commands, note that qx and backticks
> > > > > > > > are currently used to run commands on the management node.
> > > > > > > > Please change this to use:
> > > > > > > > my ($exit_status, $output) = $self->mn_os->execute($command);
> > > > > > > >
> > > > > > > > Also, always, always, always make sure $output and anything else
> > > > > > > > you try to parse with a regex are defined first. This will avoid
> > > > > > > > some nasty "Use of uninitialized value in pattern match" errors
> > > > > > > > which could potentially lead to the entire process dying.
> > > > > > > >
> > > > > > > > The indentation looks great! :) There are a few places where the
> > > > > > > > curly bracket style could be modified. Just about all of the
> > > > > > > > existing code places opening brackets on the same line as the
> > > > > > > > while/for statement such as:
> > > > > > > > while ($loop > 0) {
> > > > > > > > -instead of-
> > > > > > > > while ($loop > 0)
> > > > > > > > {
> > > > > > > >
> > > > > > > > Please add a pod "=head2 subroutine_name ... =cut" heading for
> > > > > > > > every subroutine. This is helpful for others to read/understand
> > > > > > > > your code. The pod syntax can be a bit finicky. You can tell if it
> > > > > > > > is formatted properly by running "pod2text openstack.pm".
> > > > > > > >
> > > > > > > > Lastly (as mainly a reminder), we will need to incorporate all of
> > > > > > > > the database changes in vcl.sql and whatever method we use for the
> > > > > > > > next release to replace update-vcl.sql. I made a reminder comment
> > > > > > > > here:
> > > > > > > > https://issues.apache.org/jira/browse/VCL-764
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Andy
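
For reference, option 2 from the discussion in this thread (check for deletion, log on timeout, but do not fail the load) could look roughly like the sketch below. This is only an illustration under assumptions: the subroutine name wait_for_instance_deletion, its parameters, and the stub check are hypothetical and are not taken from openstack.pm. In the real module the check would presumably query the OpenStack API for the old instance.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the VCL OpenStack module.
# $instance_exists is a code reference standing in for an OpenStack API
# query that returns true while the old instance is still listed.
sub wait_for_instance_deletion {
	my ($instance_exists, $timeout_seconds, $interval_seconds) = @_;
	my $deadline = time + $timeout_seconds;
	while (1) {
		# Deletion confirmed as soon as the check reports the instance gone
		return 1 if !$instance_exists->();
		# Give up once the timeout has elapsed
		last if time >= $deadline;
		sleep $interval_seconds;
	}
	return 0;
}

# Stub check: reports the instance as present for the first two calls only.
my $checks = 0;
my $stub_exists = sub { return ++$checks <= 2 };

my $deleted = wait_for_instance_deletion($stub_exists, 5, 0);

# Option 2: log the problem but let the load continue.
warn "old instance was not confirmed deleted, continuing load\n" if !$deleted;
print "deleted=$deleted checks=$checks\n";
```

Because the timeout only produces a warning here, it can stay short; for option 3 the same loop would return failure to the caller instead, which is why a longer timeout was suggested for that case.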

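Andy's points about checking that output is defined before a pattern match, the brace style, and the pod heading can be shown together in one small sketch. It assumes the command output arrives as an array reference of lines; the subroutine parse_instance_id and the "id: ..." line format are made up for illustration and do not come from the module.

```perl
use strict;
use warnings;

=head2 parse_instance_id

 Parameters  : $output (array reference of output lines, may be undefined)
 Returns     : string, or undefined if no ID was found
 Description : Hypothetical example. Extracts the value from the first
               line that looks like "id: <value>".

=cut

sub parse_instance_id {
	my ($output) = @_;

	# Guard before any pattern match to avoid
	# "Use of uninitialized value in pattern match"
	if (!defined($output)) {
		warn "command produced no output\n";
		return;
	}

	# Opening bracket on the same line as the for statement
	for my $line (@$output) {
		next if !defined($line);
		if ($line =~ /^\s*id:\s*(\S+)/) {
			return $1;
		}
	}
	return;
}

my $found = parse_instance_id(["name: vm1", "id: abc123"]);
my $missing = parse_instance_id(undef);
print "found=$found\n";
print "missing is ", (defined($missing) ? "defined" : "undefined"), "\n";
```

Running "pod2text" on a file containing this block is a quick way to confirm the pod section is formatted properly.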