Re: VMware provisioning module for vCenter clusters

Aaron Coburn Tue, 21 Feb 2012 08:23:48 -0800

Hi, Andy,

see my comments below.


On Feb 21, 2012, at 10:28 AM, Andy Kurth wrote:

> Hi Aaron C,
> I took a pretty close look at vCenter.pm and tried it out on a vCenter
> test server.  Looks good.
> 
> Question:  Where is the $DATACENTER variable declared and set?  I see
> that it should contain the vCenter datacenter name you want to use but
> don't see where it is defined.  It would be nice for the code to be
> able to handle multiple datacenter names on the same vCenter host.
> I'm guessing that's what $DATACENTER is used for.  The datacenter name
> could be added to the vmprofile table.  We ~should~ theoretically be
> able to define multiple vmhost entries which point to the same
> computer but use different profiles.  I have run into this need in
> other, non-vCenter situations.  It can be useful to assign some
> VMs to a host but configure them to use a special datastore.  You can
> create multiple vmhost table entries pointing to the same computer and
> configure the vmhost entries to use different profiles.  The backend
> can handle this but the frontend gets tripped up and doesn't show the
> VMs which are assigned to the host as available.

I put this value in /etc/vcl/vcld.conf and updated the Utils.pm module to load 
the value when the configuration is read.
My logic behind this was simply to avoid modifying the database structure for a 
custom module, but setting that value in the vmprofile would be preferred now 
that this will be part of the general release.
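
A minimal sketch of reading such a value from a key=value configuration file 
follows. It is an illustration only, not the actual Utils.pm code: the 
"datacenter" key name, the parse_conf helper, and the sample file contents are 
all assumptions.

```perl
use strict;
use warnings;

# Parse simple "key=value" lines, skipping blank lines and '#' comments.
# Keys are lower-cased so lookups are case-insensitive.
sub parse_conf {
    my ($text) = @_;
    my %conf;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(#|$)/;
        if ($line =~ /^\s*([^=\s]+)\s*=\s*(.*?)\s*$/) {
            $conf{lc $1} = $2;
        }
    }
    return \%conf;
}

# Hypothetical vcld.conf fragment; "datacenter" is an assumed key name.
my $sample = <<'END';
# vCenter settings
FQDN=mn.example.edu
datacenter=Main Datacenter
END

my $conf = parse_conf($sample);
our $DATACENTER = $conf->{datacenter};
print "datacenter: $DATACENTER\n";
```

Moving the value into the vmprofile table would replace this lookup with a 
per-profile database field, which is what allows several datacenter names on 
one vCenter host.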


> The changes you had to make to get things to work with vCenter also
> seem to work with standalone hosts.  I incorporated the differences
> between vCenter.pm and vSphere_SDK.pm into vSphere_SDK.pm in my test
> environment by doing the following:
> 
> -Get rid of all calls to VIExt::get_host_view and replace them with
> $self->_get_datacenter_view().  The objects returned by these seem like
> they can be used interchangeably.
> -Add a "begin_entity => $self->_get_datacenter_view()" to all
> Vim::find_entity_view(s) calls.
> -Add a "datacenter => $self->_get_datacenter_view()" argument to a few
> calls such as the $file_manager->* calls
> -Replace all hard-coded references to the default standalone
> datacenter name "ha-datacenter" to $datacenter->{name}
> -Make a minor change to _get_datastore_info where it checks if the
> datastore URL begins with /vmfs/volumes.  Add checks for the
> additional paths which may be returned when connecting to vCenter such
> as netfs: and sanfs:.

If the code works for standalone hosts and can all be put into the vSphere_SDK 
module, then it would make a lot of sense to just have one module. It also 
would simplify the issue you mention below regarding the VMware::initialize 
subroutine.
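
As a rough sketch of the datastore URL check in the last item of Andy's list, 
the prefix test could look something like this. The datastore_url_type name, 
its return values, and the sample URLs are made up for illustration; the real 
_get_datastore_info code is not shown in this thread.

```perl
use strict;
use warnings;

# Classify a datastore URL by its leading path or scheme.
# Returns 'vmfs', 'netfs', 'sanfs', or undef if the prefix is unrecognized.
sub datastore_url_type {
    my ($url) = @_;
    return unless defined $url;
    return 'vmfs'  if $url =~ m{^/vmfs/volumes(?:/|$)};
    return 'netfs' if $url =~ m{^netfs:};
    return 'sanfs' if $url =~ m{^sanfs:};
    return;
}

# Hypothetical sample URLs, not taken from a real host.
my @urls = (
    '/vmfs/volumes/example-datastore',
    'netfs://nas.example.edu/vol1/',
    'sanfs://example-volume/',
    'ftp://other.example.edu/',
);

for my $url (@urls) {
    my $type = datastore_url_type($url);
    printf "%-36s %s\n", $url, defined $type ? $type : 'unrecognized';
}
```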

> I encountered the problem you mentioned in the Jira issue regarding
> vCenter not being able to copy a virtual disk using CopyVirtualDisk.
> That stinks...  I like your workaround.

Just to clarify -- I have employed two different work-arounds for this. The 
first one used a FileManager object to explicitly copy each disk extent. That 
worked... except that it didn't preserve the thin-provisionedness of the 
virtual disks, which was highly problematic for our datastore. The better 
approach (for many reasons) is to use the CloneVM method. The only "gotcha" 
here is that VMware may silently truncate the names and enclosing directories 
of the virtual disks during a Clone operation. You had mentioned earlier that 
the VMware::get_vmdk_file_path subroutine could handle this with minor 
modifications.


> How would you like to proceed?  I can commit the changes to
> vSphere_SDK.pm and I don't think it will adversely affect standalone
> hosts currently using the module nor should it affect what you're
> currently using since the changes made are to subroutines overridden
> in vCenter.pm.  We can also add vCenter.pm and add code to
> VMware.pm::initialize to try to load it as discussed in the 2.3
> release thread.  Duplicated code can be removed from vCenter.pm after
> you have a chance to test it.

If it is possible to have a single module to handle both stand-alone and 
clustered hosts, then I believe that would be a much better design.

One other issue that I had when moving from a single ESX host to the vCenter is 
that the VIExt::http_(put|get)_file subroutines became intermittently 
unreliable. By intermittently, I mean that they failed about once in every 
fifty attempts. After having no luck trying to diagnose the problem, I simply defined 
an additional (optional) parameter in the copy_file_from and copy_file_to 
subroutines. That parameter ($attempts) allowed me to recursively call the 
copy_file_from/to routine if an attempt failed. This is what the code looks 
like in the copy_file_to subroutine:

sub copy_file_to {

    ...

    my ($from, $to, $attempts) = @_;
    $attempts = 0 unless $attempts;

    ...

    if ($response->is_success) {
        notify($ERRORS{'DEBUG'}, 0, "copied file from management node to VM host: '$source_file_path' --> $vmhost_hostname:'[$destination_datastore_name] $destination_relative_datastore_path'");
        return 1;
    }
    else {
        notify($ERRORS{'WARNING'}, 0, "failed to copy file from management node to VM host: '$source_file_path' --> $vmhost_hostname:'$destination_file_path'\nerror: " . $response->message);
        if ($attempts > 3){
            return;
        } else {
            notify($ERRORS{'DEBUG'}, 0, "trying again");
            sleep 5;
            return $self->copy_file_to($from, $to, ++$attempts);
        }
    }
}

A similar block exists in the copy_file_from subroutine. As you can see I set 
the $attempts value at only 3 and have had no problems in five months. When I 
inspect the logs, though, I still find that the GET/PUT requests fail 
intermittently, so I have left this code in place.
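
The same retry idea can also be written as a loop rather than a recursive call. 
The following is only a sketch under assumptions: the retry helper, its 
arguments, and the simulated flaky transfer are invented for illustration and 
do not call the vSphere SDK.

```perl
use strict;
use warnings;

# Call $action until it returns true or $max_attempts is reached.
# Returns the number of attempts used on success, 0 if every attempt failed.
sub retry {
    my ($action, $max_attempts, $delay_seconds) = @_;
    for my $attempt (1 .. $max_attempts) {
        return $attempt if $action->($attempt);
        # Pause between attempts, but not after the final failure.
        sleep $delay_seconds if $delay_seconds && $attempt < $max_attempts;
    }
    return 0;
}

# Stand-in for an HTTP PUT that fails twice and then succeeds.
my $calls = 0;
my $flaky_put = sub { return ++$calls >= 3 };

my $used = retry($flaky_put, 4, 0);
print $used ? "succeeded after $used attempts\n" : "gave up\n";
```

An explicit $max_attempts makes the limit easy to read off. In the recursive 
version above, the counter starts at 0 and the check is $attempts > 3, so a 
persistently failing copy is tried up to five times in total before giving up.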

Finally, one more issue I encountered was that the VMware module will call 
vmhost_os->create_directory($destination_directory_path) in the copy_vmdk 
subroutine. This causes problems for the CloneVM method, as it does not clone a 
virtual disk into an existing directory on a datastore. So I just wrapped that 
in a conditional:

if( ref($self->{api}) ne "VCL::Module::Provisioning::VMware::vCenter"){
        ...
}
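
For illustration, here is a self-contained sketch of that guard. The two 
packages are empty stubs that only borrow the module names, and 
create_directory and prepare_destination are hypothetical stand-ins for the 
real copy_vmdk logic.

```perl
use strict;
use warnings;

# Empty stub packages standing in for two API modules.
package VCL::Module::Provisioning::VMware::vCenter {
    sub new { return bless {}, shift }
}
package VCL::Module::Provisioning::VMware::vSphere_SDK {
    sub new { return bless {}, shift }
}

package main;

my @created;

# Hypothetical stand-in for vmhost_os->create_directory.
sub create_directory { push @created, $_[0]; return 1 }

# Create the destination directory unless the API object is the vCenter
# module, whose clone operation needs the directory not to exist yet.
sub prepare_destination {
    my ($api, $directory_path) = @_;
    if (ref($api) ne 'VCL::Module::Provisioning::VMware::vCenter') {
        return create_directory($directory_path);
    }
    return 1;
}

my $sdk     = VCL::Module::Provisioning::VMware::vSphere_SDK->new;
my $vcenter = VCL::Module::Provisioning::VMware::vCenter->new;

prepare_destination($sdk,     '/vmfs/volumes/ds1/image-a');
prepare_destination($vcenter, '/vmfs/volumes/ds1/image-b');
print "created: @created\n";
```

If everything is merged into vSphere_SDK.pm, a class-name comparison like this 
would no longer distinguish the two cases, so the condition would need to 
depend on something else, such as whether the clone method is being used.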

So yes, if you add these to the vSphere_SDK module, I will happily test it!

Aaron


> On Wed, Feb 1, 2012 at 11:24 AM, Andy Kurth <andy_ku...@ncsu.edu> wrote:
>> Regarding image/file naming:
>> I foresaw issues like this when writing VMware.pm.  That's why all of
>> the naming logic is funneled through the various get_vmdk_* and
>> get_vmx_* subroutines.  Changes to naming conventions or where files
>> are saved should (~theoretically~) only involve changing 1 or 2 of
>> these subroutines.  All of the code in VMware.pm and the API modules
>> should rely on the get_* subroutines and not try to construct paths.
>> 
>> If vCenter for example goes and renames a vmdk, a small amount of code
>> could be added to VMware.pm::get_vmdk_file_path():
>> if ($self->api->can('get_mangled_vmdk_file_path')) {
>>   $vmdk_file_path = $self->api->get_mangled_vmdk_file_path();
>> }
>> 
>> This would allow all of the other code to still rely on
>> get_vmdk_file_path() for returning the correct path.
>> 
>> It shouldn't really matter what you name the image or files.  I wasn't
>> aware of the character limit.  During image capture, the backend code
>> could rename the image in the database to something like
>> vmwarewinxp-82-v3 if necessary.  I'm currently working on the KVM code
>> to do something like this.  The issue I ran into has to do with the
>> KVM code being able to load VMware images and then capture new images
>> or update revisions.  As a result, the updated image shouldn't be
>> named "vmware<x>-".  I was thinking the backend code could possibly
>> rename imagerevision.imagename via an update_image_name subroutine.
>> There may be some gotchas with the web code if you change the format.
>> Will have to test this.
>> 
>> 
>> Linked clones:
>> Already committed.  I mentioned this earlier in this thread on 8/17.
>> Yes, there are added dangers with snapshots.  I have added some safety
>> checks into the code.  A snapshot is taken after a VM is registered
>> but before it is powered on.  If the snapshot fails, the VM is
>> immediately deleted to prevent someone from powering it on.  You have
>> to be very careful when working with the vSphere Client.  If you
>> delete a linked clone VM it may delete the backing file.
>> 
>> 
>> prepare_vmx:
>> The current VMware code writes the vmx file out and then copies it to
>> the host for a few reasons:
>> - This is similar to how the old vmware.pm module did things.  It was
>> never changed to be purely done through the vSphere SDK/vim-cmd
>> because it was working.
>> - I'm not sure if every vmx setting is configurable via the vSphere
>> SDK/vim-cmd.  There are many undocumented vmx options which can be
>> useful.
>> 
>> That said, I have nothing against changing anything.  I would like to
>> keep all of the logic in VMware.pm.  The idea is that the API modules
>> are basically utility modules which expose functionality.  The
>> different API modules should be able to be used interchangeably with
>> the resulting VMs constructed very similarly.  Each subroutine in the
>> API modules should do a single simple task.  The register_vm
>> subroutine should do only that, not actually define/construct the VM,
>> add devices, or other operations.
>> 
>> I'd prefer to add subroutines to the API modules to add devices or
>> make other VM configuration changes, replacing the vmx file generation
>> tasks done by prepare_vmx.  There could either be granular subroutines
>> for the different device types or a single subroutine named something
>> like 'vmx_add_device' which accepts somewhat elaborate arguments.  It
>> could be called from a new define_vm subroutine in VMware.pm similar
>> to:
>> 
>> if ($self->api->can('vmx_add_device')) {
>>   $self->api->vmx_add_device('virtual_disk' => {
>>         'vmdk_file_path' => $self->get_vmdk_file_path(),
>>         'adapter_type' => $self->get_vm_disk_adapter_type()
>>      }
>>   );
>>   # Add other devices...
>> }
>> else {
>>   $self->prepare_vmx();
>> }
>> 
>> I'm not at all familiar with what needs to be done for vCenter.  Do
>> you think this would work?
>> 
>> Also, I haven't had a chance to dig through all of vCenter.pm but
>> there appears to be some duplicated code.  Was everything from
>> vSphere.pm copied into vCenter.pm or are all of the subroutines in
>> vCenter.pm ones which you had to modify?  What are the main things you
>> had to change which didn't work in vSphere.pm?  I'd like to push the
>> general changes into vSphere.pm and try to keep vCenter.pm small if
>> possible.
>> 
>> Thanks,
>> Andy
>> 
>> 
>> On Tue, Jan 31, 2012 at 10:14 AM, Aaron Coburn <acob...@amherst.edu> wrote:
>>> 
>>> On Jan 31, 2012, at 9:31 AM, Sean Dilda wrote:
>>> 
>>> On 1/31/12 8:46 AM, Aaron Coburn wrote:
>>> 
>>> Sean,
>>> 
>>> You can use the vsphere api to get the file names if you really need
>>> them.
>>> 
>>> This is true, and that may very well be the better approach. I am not
>>> entirely happy with the method I described earlier, which relies on my
>>> own observation of an apparently undocumented behavior of VMware. There
>>> is, for instance, no guarantee that VMware won't start truncating
>>> virtual disk filenames at 28 characters at some point in the future.
>>> 
>>> On the other hand, if VMware is allowed to (arbitrarily?) truncate
>>> values that follow an otherwise fairly strict VCL naming convention,
>>> there could be negative implications for subsequent revisions of a given
>>> image. Imagine, for instance, a situation with multiple management nodes
>>> or at least multiple VM host infrastructures. If one infrastructure
>>> truncates the imagename value, what happens if that image is revised
>>> again later in a different VM infrastructure (possibly, even, with a
>>> different VMware API being used)? Would the new imagename be generated
>>> properly? (This example is not at all hypothetical, given the way we
>>> have set up our VCL).
>>> 
>>> Another potential problem is that the VCL database enforces unique
>>> imagename values. The VCL, however, would not be able to enforce this if
>>> VMware is allowed to truncate these values according to its own opaque
>>> contrivance. In fact, it is entirely possible to construct a scenario
>>> in which different versions of the same base image are assigned
>>> identical paths in a VMware datastore. Aside from not suiting the
>>> database schema particularly well, this could, potentially, overwrite
>>> another revision of the same image stored in the VCL repository path.
>>> 
>>> I agree completely with the concerns you mentioned.  That's why I suggested
>>> the APIs.  VMware thinks of itself as the only one managing or caring about
>>> filenames.  Thus, I think VCL trying to enforce filenames is going to lead
>>> to nothing but problems.  Instead I think it should only care about the VM's
>>> name and let VMware figure out the filenames.
>>> 
>>> 
>>> I, too, think this approach would be the best for a vCenter-based
>>> infrastructure -- that is, a setup where the VCL keeps track of the names of
>>> the virtual disks, but doesn't enforce a particular format. What I don't
>>> know is what implications that would have on a VMware setup that uses the
>>> VIM_SSH or VIX_API modules -- code with which I am almost entirely
>>> unfamiliar. I also don't know how this would affect how images are shared
>>> using the image library features of the VCL. We are moving toward a
>>> multi-node, physically distributed architecture using a shared image
>>> library, but we are not there yet, so I am only vaguely aware of how this is
>>> implemented. Perhaps someone who uses the image library to share images
>>> between management nodes could chime in?
>>> 
>>> 
>>> Why does vcl write its own vmx instead of using the apis? vSphere
>>> expects programs to use the apis, not to hand craft files.
>>> 
>>> I completely agree. I must admit that, when I wrote the vCenter
>>> provisioning module, I tried to use as much existing VCL code as
>>> possible. Rather than reimplementing everything to use vSphere managed
>>> objects, I rewrote only what appeared to be necessary. So for example,
>>> in VMware.pm, the 'load' subroutine calls 'prepare_vmx', which generates
>>> a vmx text file on the management node. This file is then transferred
>>> via VIExt::http_put_file into the VMware infrastructure. While I managed
>>> to make this sequence of commands work in the context of vCenter, I
>>> agree that it would be far better to implement this differently for a
>>> vCenter provisioning module.
>>> 
>>> Within the context of the current VMware module design, the
>>> communication with the VM host is expected to occur by means of the API
>>> object (i.e. $self->{api}). So for instance, the creation of the virtual
>>> machine via vCenter would logically occur when
>>> $self->api->vm_register(…) is called. The call to $self->prepare_vmx in
>>> the preceding lines would be superfluous, and there would be no need for
>>> that method to write and then transfer a vmx file to the VM host
>>> infrastructure. In short, there would need to be some way for the
>>> vCenter API to circumvent that sequence of commands.
>>> 
>>> Since all the VMware API modules inherit from the base VMware class,
>>> there could be a new method, such as 'require_prepare_vmx', for which
>>> the VMware class provides a simple default implementation:
>>> 
>>> 
>>> 
>>> 
>>> I haven't had a chance to look at the code, but would it make more sense for
>>> the vSphere module to override the prepare_vmx function with 'return 1;'?
>>> That way it doesn't happen for vSphere, but no code in higher modules needs
>>> to be modified.
>>> 
>>> 
>>> That would work with a simple change in the VMware module. Because the
>>> prepare_vmx function is executed in the context of the VMware base object --
>>> rather than the API -- the line in VMware.pm:
>>> 
>>> $self->prepare_vmx
>>> 
>>> would need to be changed to:
>>> 
>>> $self->api->prepare_vmx
>>> 
>>> If I understand the code correctly, any VMware API in use will fall back on
>>> the prepare_vmx subroutine implemented in the (parent) VMware module. Then,
>>> the vCenter module could implement its own short-circuited, { return 1; }
>>> block.
>>> 
>>>
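
The override-with-fallback idea discussed at the end of the quoted thread can 
be sketched with plain Perl inheritance. The My::* packages below are 
hypothetical stubs, not the real VCL modules, and the string returned by the 
default prepare_vmx is a placeholder for the actual vmx generation.

```perl
use strict;
use warnings;

# Stub base class: the default prepare_vmx pretends to write a vmx file.
package My::VMware {
    sub new         { return bless {}, shift }
    sub prepare_vmx { return 'wrote vmx file' }
}

# Stub API module that inherits the default implementation.
package My::VMware::VIM_SSH {
    our @ISA = ('My::VMware');
}

# Stub API module that short-circuits prepare_vmx.
package My::VMware::vCenter {
    our @ISA = ('My::VMware');
    sub prepare_vmx { return 1 }
}

package main;

# Calling through the API object picks the override when one exists and
# falls back to the parent class otherwise.
for my $class ('My::VMware::VIM_SSH', 'My::VMware::vCenter') {
    my $api = $class->new;
    print "$class: ", $api->prepare_vmx, "\n";
}
```

The fallback works here only because the call goes through the API object. A 
call made on the base object would always run the parent implementation, which 
is why the call site in VMware.pm would need to change.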
