Agreed, this goes back to more general VCL behaviour so it would be good to
get others' input.
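
For anyone weighing in, option 2 from the quoted message below could look
roughly like the following. This is only a sketch, written in Python for
brevity rather than the module's Perl. The names wait_for_deletion and
terminate_old_instance, the delete/instance_exists/log callbacks, and the
default timeout and interval are all hypothetical and are not taken from the
actual openstack module.

```python
import time


def wait_for_deletion(instance_exists, timeout=60.0, interval=5.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until instance_exists() returns False or the timeout expires.

    Returns True if the instance disappeared in time, False otherwise.
    The sleep and clock parameters exist only so the loop can be tested
    without really waiting.
    """
    deadline = clock() + timeout
    while True:
        if not instance_exists():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def terminate_old_instance(delete, instance_exists, log, **poll_args):
    """Option 2: request deletion and verify it, but never fail the load.

    delete, instance_exists and log are hypothetical callbacks standing in
    for the DELETE request, a status query, and the module's logging.
    """
    delete()
    if wait_for_deletion(instance_exists, **poll_args):
        return True
    # Only log the problem; the caller carries on creating the new instance.
    log("old instance still present after timeout; continuing with load")
    return False
```

Option 3 would be the same loop with the caller treating a False return as a
load failure, which is why the timeout matters more there.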

Cameron


On Tue, Jul 22, 2014 at 1:22 PM, YOUNG OH <[email protected]> wrote:

> That's a great observation. I think options 1 and 2 can provide fast load
> times because the load never fails. But, to avoid quota issues, an admin
> should periodically check all the instances for duplicate instance names
> and defunct instances. Option 3 is safe but slow to load if any instance
> deletion fails. I also agree that option 2 would be a good choice because
> it can provide lower load times for end users, and also because we cannot
> reliably estimate the deletion time. But I hope to hear others' thoughts.
>
> Best regards
> Young
>
>
> On Tue, Jul 22, 2014 at 2:46 PM, Cameron Mann <[email protected]>
> wrote:
>
> > Looks good. Though I do wonder if it's necessary to fail the entire load
> > process just because the old instance doesn't get deleted. I think there
> > are three possibilities:
> >
> > 1. Don't check for successful deletion; there won't be any conflicts
> > because we're using openstackComputerMap. This would give the fastest
> load
> > times, but the only way to find out that something went wrong would be to
> > look at the list of instances and see if there are any duplicate names.
> > Could cause issues with quotas if running near capacity since there could
> > be extra instances lying around.
> >
> > 2. Check for successful deletion, but only log the error, don't fail the
> > load. Slower load times, but the load won't fail and the error will be
> > logged. Could cause issues with quotas if running near capacity since
> there
> > could be extra instances lying around.
> >
> > 3. What the module does now: check for successful deletion and fail if
> > not. Least end-user friendly since they might encounter failures, but the
> > safest option. Won't cause quota issues on its own, though an admin could
> > still change the computer back to available without deleting the defunct
> > instance.
> >
> > Instance deletion time is also not very consistent in my experience; I've
> > seen anything from seconds to over a minute and I imagine it could go
> > higher on OpenStack systems that see heavier usage. If we stick with
> option
> > 3 I'd recommend bumping the timeout by another minute or two just to be
> > safe. I think it's less necessary for option 2 since it doesn't fail on
> > timeout.
> >
> > I took a look at some of the other provisioning modules to see what they
> > do:
> >
> > - VMware logs a warning if it fails to delete the old VM, but only fails
> if
> > the VM is still responding to SSH
> > - Libvirt fails if deletion fails
> > - VirtualBox doesn't check for successful deletion, though it will fail
> if
> > it can't find the old VM to delete
> >
> > I think options 2 or 3 would be most consistent with existing behaviour.
> > I'd lean towards option 2 since end users won't see any extra failures
> and
> > we can keep a lower timeout which will mean lower load times even if a
> > deletion takes a long time.
> >
> > What are your thoughts?
> >
> > Cameron
> >
> >
> > On Tue, Jul 22, 2014 at 9:05 AM, YOUNG OH <[email protected]>
> wrote:
> >
> > > Cameron,
> > >
> > > Yes, you are definitely right. I noticed that using the hostname to find
> > > the OpenStack instance ID does not work properly and can also cause the
> > > problem you described. I've gone back to using the openstackComputerMap
> > > table to get the OpenStack instance ID when the instance is pingable, and
> > > I've also added a loop in _terminate_os_instance to check whether the
> > > instance has been completely deleted. Please take a look at it again and
> > > let me know if you have any concerns. Thank you.
> > >
> > > Best regards,
> > > Young
> > >
> > >
> > > On Mon, Jul 21, 2014 at 4:18 PM, Cameron Mann <[email protected]>
> > > wrote:
> > >
> > > > Sounds like good progress to me. One comment though, it looks like
> > > > _terminate_os_instance does the DELETE request, checks the response
> for
> > > > success and then sleeps for 30 seconds while the instance deletes.
> > > However,
> > > > I don't believe a successful response to the DELETE request
> guarantees
> > > the
> > > > instance will actually be deleted. I've run into situations where an
> > > > instance gets stuck in the error or deleting states but the command
> > line
> > > > client reports no errors when trying to delete it. This could result
> > in a
> > > > situation where multiple instances with the same name exist which
> could
> > > > cause _get_os_instance_id to return the wrong ID since it filters the
> > > > instances based on name and selects the first in the list.
> > > >
> > > > I think either returning to using openstackComputerMap or looping
> with
> > a
> > > > timeout until the instance is actually deleted would be better
> choices.
> > > The
> > > > former would allow the new instance to be created even if the
> deletion
> > of
> > > > the old one fails. The latter would put the computer in VCL into an
> > error
> > > > state which would make it more obvious something has gone wrong,
> though
> > > at
> > > > the cost of potentially failing a user's reservation. As an added
> > > > > precaution it might also be worth having _get_os_instance_id fail if
> > > > there's more than one instance in the response.
> > > >
> > > > Cameron
> > > >
> > > >
> > > > On Fri, Jul 18, 2014 at 9:25 AM, YOUNG OH <[email protected]>
> > > wrote:
> > > >
> > > > > Cameron,
> > > > >
> > > > > > I hope you had a great time, and welcome back to work :-). And,
> > > > > > yes, the OpenStack module that uses the OpenStack APIs directly
> > > > > > can address the concerns we've discussed, and it makes it easier
> > > > > > to adopt new versions of the OpenStack APIs if necessary. In the
> > > > > > updated openstack module, I've changed two main things. First,
> > > > > > I've used the hostname in the Computer table (unique within the
> > > > > > same VCL database) to create an instance and to get the instance
> > > > > > ID to terminate, rather than using the openstackComputerMap
> > > > > > table. This avoids an additional table and extra database
> > > > > > transactions. Second, I've changed the openstackImageMap table to
> > > > > > an openstackimagerevision table that maps the imagerevision ID to
> > > > > > the OpenStack image ID. This table consists of three fields
> > > > > > (imagerevisionid, imagedetails, flavordetails). The imagedetails
> > > > > > and flavordetails fields contain the detailed image and flavor
> > > > > > information in JSON format. Thus, when VCL creates an instance,
> > > > > > it reads both fields and parses them to find the corresponding
> > > > > > OpenStack image ID and flavor ID. In addition, I've implemented
> > > > > > the get_image_size() subroutine because image size information
> > > > > > was not available in OpenStack Essex but is supported now. This
> > > > > > is a short summary of the changes, so if you have any concerns or
> > > > > > questions about the updates, please let me know. Thank you.
> > > > >
> > > > > Best regards,
> > > > > Young-Hyun
> > > > >
> > > > >
> > > > > On Thu, Jul 17, 2014 at 11:53 AM, Cameron Mann <
> > [email protected]
> > > >
> > > > > wrote:
> > > > >
> > > > > > Sorry for the silence from my end, I realized I forgot to
> mention I
> > > was
> > > > > > going to be on vacation for the last week and a half. Anyways, it
> > > looks
> > > > > > like Young's updates have addressed the main concerns we were
> > having
> > > > with
> > > > > > regards to the command line client. Given the progress he's made
> we
> > > > think
> > > > > > going ahead with his module makes the most sense.
> > > > > >
> > > > > > Cameron
> > > > > >
> > > > > >
> > > > > > On Wed, Jul 16, 2014 at 9:27 AM, YOUNG OH <
> [email protected]
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Andy,
> > > > > > >
> > > > > > > Thank you for your comments. I've tried to apply the points you
> > > > > > > raised and have committed my module again. This module finds
> > > > > > > all OpenStack information using the OpenStack APIs and the
> > > > > > > database. Thank you.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Young-Hyun
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jul 9, 2014 at 10:24 AM, Andy Kurth <
> [email protected]
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Thanks Young.  Looks good!  If I understand correctly, you
> are
> > > > > avoiding
> > > > > > > the
> > > > > > > > need to use the CLI or cpan module by interacting directly
> with
> > > > > > OpenStack
> > > > > > > > via the REST API?
> > > > > > > >
> > > > > > > > It looks like the only commands you're running on the
> > management
> > > > node
> > > > > > are
> > > > > > > > "nova" and "qemu-img" in _get_flavor_type.  Would it be
> > possible
> > > to
> > > > > > > > accomplish this via the API?  I haven't traced through how
> your
> > > > code
> > > > > > > works
> > > > > > > > too deeply, but was wondering if the following could be used:
> > > > > > > > http://docs.openstack.org/api/openstack-compute/2/content/Flavors-d1e4180.html
> > > > > > > >
> > > > > > > > It would be wonderful if you can eliminate the need for these
> > to
> > > be
> > > > > > > > executed.  This would mean a pure API solution with nothing
> > > special
> > > > > > > needing
> > > > > > > > to be installed on the management node.
> > > > > > > >
> > > > > > > > If you do need to call these commands: qx and backticks are
> > > > > > > > currently used to run them on the management node. Please
> > > > > > > > change this to use:
> > > > > > > > my ($exit_status, $output) = $self->mn_os->execute($command);
> > > > > > > >
> > > > > > > > Also, always, always, always make sure $output and anything
> > else
> > > > you
> > > > > > try
> > > > > > > to
> > > > > > > > parse with a regex are defined first.  This will avoid some
> > nasty
> > > > > "Use
> > > > > > of
> > > > > > > > uninitialized value in pattern match" errors which could
> > > > potentially
> > > > > > lead
> > > > > > > > to the entire process dying.
> > > > > > > >
> > > > > > > > The indentation looks great!  :)  There are a few places
> where
> > > the
> > > > > > curly
> > > > > > > > bracket style could be modified.  Just about all of the
> > existing
> > > > code
> > > > > > > > places opening brackets on the same line as the while/for
> > > statement
> > > > > > such
> > > > > > > > as:
> > > > > > > > while ($loop > 0) {
> > > > > > > > -instead of-
> > > > > > > > while ($loop > 0)
> > > > > > > >    {
> > > > > > > >
> > > > > > > > Please add a pod "=head2 subroutine_name ... =cut" heading
> for
> > > > every
> > > > > > > > subroutine.  This is helpful for others to read/understand
> your
> > > > code.
> > > > > > >  The
> > > > > > > > pod syntax can be a bit finicky.  You can tell if it is
> > formatted
> > > > > > > properly
> > > > > > > > by running "pod2text openstack.pm".
> > > > > > > >
> > > > > > > > Lastly (as mainly a reminder), we will need to incorporate
> all
> > of
> > > > the
> > > > > > > > database changes in vcl.sql and whatever method we use for
> the
> > > next
> > > > > > > release
> > > > > > > > to replace update-vcl.sql.  I made a reminder comment here:
> > > > > > > > https://issues.apache.org/jira/browse/VCL-764
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Andy
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
