Option 2 sounds reasonable as long as a VM that could not be deleted
cannot cause conflicts with other VMs.  The main
concern would be if the VM remained powered on with an IP address that
could be reused.  Would this be possible?  If you encounter a problem
deleting a VM, can you reliably verify that it is powered off?

I think the priorities should be:
1) avoid conflicts
2) reduce reservation failures
3) optimize load time

Load time is important, but the deletion issue should really only affect
the user experience when the image requested is not preloaded.  If VMs are
being reloaded for every reservation then there is another problem.

-Andy


On Tue, Jul 22, 2014 at 3:37 PM, Cameron Mann <[email protected]>
wrote:

> Agreed, this goes back to more general VCL behaviour so it would be good to
> get others' input.
>
> Cameron
>
>
> On Tue, Jul 22, 2014 at 1:22 PM, YOUNG OH <[email protected]> wrote:
>
> > That's a great observation. I think options 1 and 2 can provide fast
> > loading times due to no load failure. But, to avoid quota issues, an admin
> > would have to periodically check all the instances for duplicate instance
> > names and defunct instances. Option 3 is safe but slow to load if deleting
> > an instance fails. I also agree that option 2 would be a good choice
> > because it provides lower load times for end users, and we cannot
> > precisely estimate the deletion time. But I hope to hear others'
> > thoughts.
> >
> > Best regards
> > Young
> >
> >
> > On Tue, Jul 22, 2014 at 2:46 PM, Cameron Mann <[email protected]>
> > wrote:
> >
> > > Looks good. Though I do wonder if it's necessary to fail the entire load
> > > process just because the old instance doesn't get deleted. I think there
> > > are three possibilities:
> > >
> > > 1. Don't check for successful deletion; there won't be any conflicts
> > > because we're using openstackComputerMap. This would give the fastest
> > load
> > > times, but the only way to find out that something went wrong would be
> to
> > > look at the list of instances and see if there are any duplicate names.
> > > Could cause issues with quotas if running near capacity since there
> could
> > > be extra instances lying around.
> > >
> > > 2. Check for successful deletion, but only log the error, don't fail
> the
> > > load. Slower load times, but the load won't fail and the error will be
> > > logged. Could cause issues with quotas if running near capacity since
> > there
> > > could be extra instances lying around.
> > >
> > > 3. What the module does now: check for successful deletion and fail if
> > > not. Least end-user friendly since users might encounter failures, but
> > > the safest option. Won't cause quota issues on its own, though an admin
> > > could still change the computer back to available without deleting the
> > > defunct instance.
> > >
> > > Instance deletion time is also not very consistent in my experience;
> I've
> > > seen anything from seconds to over a minute and I imagine it could go
> > > higher on OpenStack systems that see heavier usage. If we stick with
> > option
> > > 3 I'd recommend bumping the timeout by another minute or two just to be
> > > safe. I think it's less necessary for option 2 since it doesn't fail on
> > > timeout.
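[For illustration, option 2's flow could be sketched roughly as follows. This is illustrative Python with hypothetical helper names (the actual module is Perl); the check is passed in as a callable so the polling logic is independent of how the API call is made:]

```python
import time

def wait_for_deletion(instance_is_gone, timeout=120, interval=5,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll until the old instance disappears or the timeout expires.

    instance_is_gone is a callable (e.g. an API check).  Returns True
    if the instance went away in time, False otherwise.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if instance_is_gone():
            return True
        sleep(interval)
    return False

def reload_computer(instance_is_gone, log, create_instance, **poll_kwargs):
    """Option 2: log a failed deletion but carry on with the load
    instead of failing the user's reservation."""
    if not wait_for_deletion(instance_is_gone, **poll_kwargs):
        log("WARNING: old instance was not deleted; continuing with load")
    create_instance()
```

[With this shape, the timeout can stay short: a slow deletion costs at most `timeout` seconds before the load proceeds anyway.]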
> > >
> > > I took a look at some of the other provisioning modules to see what
> they
> > > do:
> > >
> > > - VMware logs a warning if it fails to delete the old VM, but only
> fails
> > if
> > > the VM is still responding to SSH
> > > - Libvirt fails if deletion fails
> > > - VirtualBox doesn't check for successful deletion, though it will fail
> > if
> > > it can't find the old VM to delete
> > >
> > > I think options 2 or 3 would be most consistent with existing
> behaviour.
> > > I'd lean towards option 2 since end users won't see any extra failures
> > and
> > > we can keep a lower timeout which will mean lower load times even if a
> > > deletion takes a long time.
> > >
> > > What are your thoughts?
> > >
> > > Cameron
> > >
> > >
> > > On Tue, Jul 22, 2014 at 9:05 AM, YOUNG OH <[email protected]>
> > wrote:
> > >
> > > > Cameron,
> > > >
> > > > Yes, you are definitely right. I noticed that using the hostname to
> > > > find the OpenStack instance id does not work properly and can also
> > > > cause the problem you described. I've gone back to using the
> > > > openstackComputerMap table in get_os_instance_id when the instance is
> > > > pingable, and I've also added a loop in _terminate_os_instance to check
> > > > whether the instance is completely deleted. Please take a look at it
> > > > again and let me know if you have any concerns. Thank you.
> > > >
> > > > Best regards,
> > > > Young
> > > >
> > > >
> > > > On Mon, Jul 21, 2014 at 4:18 PM, Cameron Mann <
> [email protected]>
> > > > wrote:
> > > >
> > > > > Sounds like good progress to me. One comment, though: it looks like
> > > > > _terminate_os_instance does the DELETE request, checks the response
> > for
> > > > > success and then sleeps for 30 seconds while the instance deletes.
> > > > However,
> > > > > I don't believe a successful response to the DELETE request
> > guarantees
> > > > the
> > > > > instance will actually be deleted. I've run into situations where
> an
> > > > > instance gets stuck in the error or deleting states but the command
> > > line
> > > > > client reports no errors when trying to delete it. This could
> result
> > > in a
> > > > > situation where multiple instances with the same name exist which
> > could
> > > > > cause _get_os_instance_id to return the wrong ID since it filters
> the
> > > > > instances based on name and selects the first in the list.
> > > > >
> > > > > I think either returning to using openstackComputerMap or looping
> > with
> > > a
> > > > > timeout until the instance is actually deleted would be better
> > choices.
> > > > The
> > > > > former would allow the new instance to be created even if the
> > deletion
> > > of
> > > > > the old one fails. The latter would put the computer in VCL into an
> > > error
> > > > > state which would make it more obvious something has gone wrong,
> > though
> > > > at
> > > > > the cost of potentially failing a user's reservation. As an added
> > > > > precaution, it might also be worth having _get_os_instance_id fail if
> > > > > there's more than one instance in the response.
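[As a sketch of that last safeguard, the lookup could refuse to pick a winner when the name filter matches more than one instance. This is illustrative Python (the real _get_os_instance_id is Perl, and the dict field names are assumptions about the API response):]

```python
def get_instance_id(instances, hostname):
    """Return the id of the single instance named `hostname`.

    Raises LookupError on zero or multiple matches, so a leftover
    duplicate instance can't silently yield the wrong id.
    """
    matches = [inst["id"] for inst in instances if inst.get("name") == hostname]
    if len(matches) != 1:
        raise LookupError("expected exactly 1 instance named %r, found %d"
                          % (hostname, len(matches)))
    return matches[0]
```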
> > > > >
> > > > > Cameron
> > > > >
> > > > >
> > > > > On Fri, Jul 18, 2014 at 9:25 AM, YOUNG OH <[email protected]
> >
> > > > wrote:
> > > > >
> > > > > > Cameron,
> > > > > >
> > > > > > I hope you had a great time, and welcome back to work :-). And, yes,
> > > > > > the OpenStack module that uses the OpenStack APIs directly can
> > > > > > address the concerns we've discussed, and it's more flexible for
> > > > > > adopting new versions of the OpenStack APIs if necessary. In the
> > > > > > updated openstack module, I've made two main changes. First, I've
> > > > > > used the hostname in the Computer table (unique within the same VCL
> > > > > > database) to create an instance and to get the instance id to
> > > > > > terminate, rather than using the openstackComputerMap table. This
> > > > > > avoids an additional table and extra database transactions. Second,
> > > > > > I've changed the openstackImageMap table to an openstackimagerevision
> > > > > > table that maps the imagerevision id to the OpenStack image id. This
> > > > > > table consists of three fields (imagerevisionid, imagedetails,
> > > > > > flavordetails). The imagedetails and flavordetails fields contain
> > > > > > the detailed image and flavor information in JSON format. Thus, when
> > > > > > VCL creates an instance, it retrieves each field and parses it to
> > > > > > find the corresponding OpenStack image id and flavor id. In
> > > > > > addition, I've implemented the get_image_size() subroutine because
> > > > > > the image size information was not supported in OpenStack ESSEX but
> > > > > > is supported now. This is a short summary of the changes. If you
> > > > > > have any concerns or questions about the updates, please let me
> > > > > > know. Thank you.
> > > > > >
> > > > > > Best regards,
> > > > > > Young-Hyun
> > > > > >
> > > > > >
> > > > > > On Thu, Jul 17, 2014 at 11:53 AM, Cameron Mann <
> > > [email protected]
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Sorry for the silence from my end, I realized I forgot to
> > mention I
> > > > was
> > > > > > > going to be on vacation for the last week and a half. Anyways,
> it
> > > > looks
> > > > > > > like Young's updates have addressed the main concerns we were
> > > having
> > > > > with
> > > > > > > regards to the command line client. Given the progress he's
> made
> > we
> > > > > think
> > > > > > > going ahead with his module makes the most sense.
> > > > > > >
> > > > > > > Cameron
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jul 16, 2014 at 9:27 AM, YOUNG OH <
> > [email protected]
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Andy,
> > > > > > > >
> > > > > > > > Thank you for your comments. I've tried to apply what you
> > > > > > > > suggested and committed my module again. This module finds all
> > > > > > > > OpenStack information using the OpenStack APIs and the database.
> > > > > > > > Thank you.
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > > Young-Hyun
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Jul 9, 2014 at 10:24 AM, Andy Kurth <
> > [email protected]
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks Young.  Looks good!  If I understand correctly, you
> > are
> > > > > > avoiding
> > > > > > > > the
> > > > > > > > > need to use the CLI or cpan module by interacting directly
> > with
> > > > > > > OpenStack
> > > > > > > > > via the REST API?
> > > > > > > > >
> > > > > > > > > It looks like the only commands you're running on the
> > > management
> > > > > node
> > > > > > > are
> > > > > > > > > "nova" and "qemu-img" in _get_flavor_type.  Would it be
> > > possible
> > > > to
> > > > > > > > > accomplish this via the API?  I haven't traced through how
> > your
> > > > > code
> > > > > > > > works
> > > > > > > > > too deeply, but was wondering if the following could be
> used:
> > > > > > > > > http://docs.openstack.org/api/openstack-compute/2/content/Flavors-d1e4180.html
> > > > > > > > >
> > > > > > > > > It would be wonderful if you can eliminate the need for
> these
> > > to
> > > > be
> > > > > > > > > executed.  This would mean a pure API solution with nothing
> > > > special
> > > > > > > > needing
> > > > > > > > > to be installed on the management node.
> > > > > > > > >
> > > > > > > > > If you do need to call these commands, instead of using qx and
> > > > > > > > > backticks to run commands on the management node, please change
> > > > > > > > > this to use:
> > > > > > > > > my ($exit_status, $output) = $self->mn_os->execute($command);
> > > > > > > > >
> > > > > > > > > Also, always, always, always make sure $output and anything
> > > else
> > > > > you
> > > > > > > try
> > > > > > > > to
> > > > > > > > > parse with a regex are defined first.  This will avoid some
> > > nasty
> > > > > > "Use
> > > > > > > of
> > > > > > > > > uninitialized value in pattern match" errors which could
> > > > > potentially
> > > > > > > lead
> > > > > > > > > to the entire process dying.
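[The same defensive pattern, in illustrative Python terms (the module itself would do this in Perl with a `defined` check; the regex and helper name here are made up for the example):]

```python
import re

def parse_exit_status(output):
    """Extract a numeric status from command output, guarding against
    an undefined (None) result before the pattern match runs."""
    if output is None:
        return None
    match = re.search(r"status:\s*(\d+)", output)
    return int(match.group(1)) if match else None
```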
> > > > > > > > >
> > > > > > > > > The indentation looks great!  :)  There are a few places
> > where
> > > > the
> > > > > > > curly
> > > > > > > > > bracket style could be modified.  Just about all of the
> > > existing
> > > > > code
> > > > > > > > > places opening brackets on the same line as the while/for
> > > > statement
> > > > > > > such
> > > > > > > > > as:
> > > > > > > > > while ($loop > 0) {
> > > > > > > > > -instead of-
> > > > > > > > > while ($loop > 0)
> > > > > > > > >    {
> > > > > > > > >
> > > > > > > > > Please add a pod "=head2 subroutine_name ... =cut" heading
> > for
> > > > > every
> > > > > > > > > subroutine.  This is helpful for others to read/understand
> > your
> > > > > code.
> > > > > > > >  The
> > > > > > > > > pod syntax can be a bit finicky.  You can tell if it is
> > > formatted
> > > > > > > > properly
> > > > > > > > > by running "pod2text openstack.pm".
> > > > > > > > >
> > > > > > > > > Lastly (as mainly a reminder), we will need to incorporate
> > all
> > > of
> > > > > the
> > > > > > > > > database changes in vcl.sql and whatever method we use for
> > the
> > > > next
> > > > > > > > release
> > > > > > > > > to replace update-vcl.sql.  I made a reminder comment here:
> > > > > > > > > https://issues.apache.org/jira/browse/VCL-764
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Andy
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
