On May 23, 2024 12:40 pm, Dominik Csapak wrote: > On 5/22/24 12:08, Fabian Grünbichler wrote: >> On April 29, 2024 1:21 pm, Dominik Csapak wrote: >>> since we want to handle ova files (which are only ovf+images bundled in >>> a tar file) for import, add code that handles that. >>> >>> we introduce a valid volname for files contained in ovas like this: >>> >>> storage:import/archive.ova/disk-1.vmdk >>> >>> by basically treating the last part of the path as the name for the >>> contained disk we want. >>> >>> in that case we return 'import' as type with 'vmdk/qcow2/raw' as format >>> (we cannot use something like 'ova+vmdk' without extending the 'format' >>> parsing to that for all storages/formats. This is because it runs >>> though a verify format check at least once) >>> >>> we then provide 3 functions to use for that: >>> >>> * copy_needs_extraction: determines from the given volid (like above) if >>> that needs extraction to copy it, currently only 'import' vtype + a >>> volid with the above format returns true >>> >>> * extract_disk_from_import_file: this actually extracts the file from >>> the archive. Currently only ova is supported, so the extraction with >>> 'tar' is hardcoded, but again we can easily extend/modify that should >>> we need to. >>> >>> we currently extract into the either the import storage or a given >>> target storage in the images directory so if the cleanup does not >>> happen, the user can still see and interact with the image via >>> api/cli/gui >>> >>> * cleanup_extracted_image: intended to cleanup the extracted images from >>> above >>> >>> we have to modify the `parse_ovf` a bit to handle the missing disk >>> images, and we parse the size out of the ovf part (since this is >>> informal only, it should be no problem if we cannot parse it sometimes) >>> >>> Signed-off-by: Dominik Csapak <d.csa...@proxmox.com> >>> --- >>> src/PVE/API2/Storage/Status.pm | 1 + >>> src/PVE/GuestImport.pm | 100 +++++++++++++++++++++++++++++++++ >>> src/PVE/GuestImport/OVF.pm | 53 ++++++++++++++--- >>> src/PVE/Makefile | 1 + >>> src/PVE/Storage.pm | 2 +- >>> src/PVE/Storage/DirPlugin.pm | 15 ++++- >>> src/PVE/Storage/Plugin.pm | 5 ++ >>> src/test/parse_volname_test.pm | 15 +++++ >>> 8 files changed, 182 insertions(+), 10 deletions(-) >>> create mode 100644 src/PVE/GuestImport.pm >>> >>> diff --git a/src/PVE/API2/Storage/Status.pm b/src/PVE/API2/Storage/Status.pm >>> index dc6cc69..acde730 100644 >>> --- a/src/PVE/API2/Storage/Status.pm >>> +++ b/src/PVE/API2/Storage/Status.pm >>> @@ -749,6 +749,7 @@ __PACKAGE__->register_method({ >>> 'efi-state-lost', >>> 'guest-is-running', >>> 'nvme-unsupported', >>> + 'ova-needs-extracting', >>> 'ovmf-with-lsi-unsupported', >>> 'serial-port-socket-only', >>> ], >>> diff --git a/src/PVE/GuestImport.pm b/src/PVE/GuestImport.pm >>> new file mode 100644 >>> index 0000000..d405e30 >>> --- /dev/null >>> +++ b/src/PVE/GuestImport.pm >>> @@ -0,0 +1,100 @@ >>> +package PVE::GuestImport; >>> + >>> +use strict; >>> +use warnings; >>> + >>> +use File::Path; >>> + >>> +use PVE::Storage; >> >> another circular module dependency.. >>
true, sorry for the noise! :) > why do you think so? nothing in storage is using PVE::GuestImport only > PVE::GuestImport::OVF ? > >>> +use PVE::Tools qw(run_command); >>> + >>> +sub copy_needs_extraction { >>> + my ($volid) = @_; >>> + my $cfg = PVE::Storage::config(); >>> + my ($vtype, $name, undef, undef, undef, undef, $fmt) = >>> PVE::Storage::parse_volname($cfg, $volid); >>> + >>> + # only volumes inside ovas need extraction >>> + return $vtype eq 'import' && $fmt =~ m/^ova\+(.*)$/; >>> +} >> >> this could just as well live in qemu-server, there's only two call sites >> in one module there.. one of them even already has the parsed volname ;) > > true > >> >>> + >>> +sub extract_disk_from_import_file { >> >> I don't really like that this is using lots of plugin stuff.. >> >>> + my ($volid, $vmid, $target_storeid) = @_; >>> + >>> + my ($source_storeid, $volname) = PVE::Storage::parse_volume_id($volid); >>> + $target_storeid //= $source_storeid; >>> + my $cfg = PVE::Storage::config(); >>> + my $source_scfg = PVE::Storage::storage_config($cfg, $source_storeid); >>> + my $source_plugin = PVE::Storage::Plugin->lookup($source_scfg->{type}); >>> + >>> + my ($vtype, $name, undef, undef, undef, undef, $fmt) = >>> + $source_plugin->parse_volname($volname); >> >> could be PVE::Storage::parse_volname >> >>> + >>> + die "only files with content type 'import' can be extracted\n" >>> + if $vtype ne 'import' || $fmt !~ m/^ova\+/; >>> + >>> + # extract the inner file from the name >>> + my $archive; >>> + my $inner_file; >>> + if ($name =~ m!^(.*\.ova)/(${PVE::Storage::SAFE_CHAR_CLASS_RE}+)$!) { >>> + $archive = "import/$1"; >>> + $inner_file = $2; >>> + ($fmt) = $fmt =~ /^ova\+(.*)$/; >>> + } else { >>> + die "cannot extract $volid - invalid volname $volname\n"; >>> + } >>> + >>> + my ($ova_path) = $source_plugin->path($source_scfg, $archive, >>> $source_storeid); >> >> could be PVE::Storage::path >> >>> + >>> + my $target_scfg = PVE::Storage::storage_config($cfg, $target_storeid); >>> + my $target_plugin = PVE::Storage::Plugin->lookup($target_scfg->{type}); >>> + >>> + my $destdir = $target_plugin->get_subdir($target_scfg, 'images'); >> >> could be PVE::Storage::get_image_dir >> >>> + >>> + my $pid = $$; >>> + $destdir .= "/tmp_${pid}_${vmid}"; >>> + mkpath $destdir; >>> + >>> + ($ova_path) = $ova_path =~ m|^(.*)$|; # untaint >>> + >>> + my $source_path = "$destdir/$inner_file"; >>> + my $target_path; >>> + my $target_volname; >>> + eval { >>> + run_command(['tar', '-x', '--force-local', '-C', $destdir, '-f', >>> $ova_path, $inner_file]); >>> + >>> + # check for symlinks and other non regular files >>> + if (-l $source_path || ! -f $source_path) { >>> + die "only regular files are allowed\n"; >>> + } >>> + >>> + my $target_diskname >>> + = $target_plugin->find_free_diskname($target_storeid, $target_scfg, >>> $vmid, $fmt, 1); >> >> these here requires holding a lock until the diskname is actually used >> (the rename below), else it's racey.. > > we do have a lock over vm creation in the only path this is called and it is > vmid specific... > so is this really a problem? yes, every disk allocation needs to hold the storage lock to avoid two actions in parallel thinking they own a "new" disk name that hasn't yet been allocated properly. it's possible to allocate new volumes without going over a guest specific API after all, and there are automation use cases doing just that (pre-allocating the volumes, then creating/updating the VM). >>> + $target_volname = "$vmid/" . $target_diskname; >> >> this encodes a fact about volname semantics that might not be a given >> for external, dir-based plugins (not sure if we want to worry about that >> though, or how to avoid it ;)). > > i mean we could call 'alloc' with a very small size instead > and simply "overwrite" it? then we'd also get around things like > mkpath and imagedir etc. that might actually be nice(r) than the current approach since it avoids the volname format issue entirely. the only downside is that we then briefly have a "wrong" disk visible, but since the VM has to be locked at that point there shouldn't be too much harm in that? >>> + $target_path = $target_plugin->filesystem_path($target_scfg, >>> $target_volname); >> >> this should be equivalent to PVE::Storage::path for DirPlugin based >> storages? >> >>> + >>> + print "renaming $source_path to $target_path\n"; >>> + my $imagedir = $target_plugin->get_subdir($target_scfg, 'images'); >> >> we already did this above, but see comment there ;) > > true ;) > >> >>> + mkpath "$imagedir/$vmid"; >>> + >>> + rename($source_path, $target_path) or die "unable to move - $!\n"; >>> + }; >>> + if (my $err = $@) { >>> + unlink $source_path; >>> + unlink $target_path if defined($target_path); >> >> isn't this pretty much impossible to happen? the last thing we do in the >> eval block is the rename - if that failed, $target_path can't exist yet. >> if it didn't fail, we can't end up here? > > that probably depends on the underlying filesystem no? not > every fs has posix rename semantics i guess? I think we can assume an intra-FS rename to either work and have an effect, or not work and not have an effect on anything we want to support as dir storage? :) > in that case we'd cleanup the file, and if it does not exists, it doesn't hurt > but sure, but error handling tends to get more complicated over time, so not having nops in there reduces the complexity somewhat IMHO. >>> + rmdir $destdir; >> >> this and unlink $source_path could just be a remove_tree on $destdir >> instead, with less chances of leaving stuff around? > > this is true of course and removes the issue above > >> >>> + die "error during extraction: $err\n"; >>> + } >>> + >>> + rmdir $destdir; >> >> could also be a remove_tree, just to be on the safe side? > > yup > >> >>> + >>> + return "$target_storeid:$target_volname"; >>> +} >>> + >>> +sub cleanup_extracted_image { >>> + my ($source) = @_; >>> + >>> + my $cfg = PVE::Storage::config(); >>> + PVE::Storage::vdisk_free($cfg, $source); >>> +} >> >> why do we need this helper, and not just call vdisk_free directly in >> qemu-server (we do that in tons of places there as part of error >> handling for freshly allocated volumes)? > > ok makes sense > >> >>> + >>> +1; >>> diff --git a/src/PVE/GuestImport/OVF.pm b/src/PVE/GuestImport/OVF.pm >>> index 0eb5e9c..6b79078 100644 >>> --- a/src/PVE/GuestImport/OVF.pm >>> +++ b/src/PVE/GuestImport/OVF.pm >>> @@ -85,11 +85,37 @@ sub id_to_pve { >>> } >>> } >>> >>> +# technically defined in DSP0004 (https://www.dmtf.org/dsp/DSP0004) as an >>> ABNF >>> +# but realistically this always takes the form of 'byte * base^exponent' >>> +sub try_parse_capacity_unit { >>> + my ($unit_text) = @_; >>> + >>> + if ($unit_text =~ m/^\s*byte\s*\*\s*([0-9]+)\s*\^\s*([0-9]+)\s*$/) { >>> + my $base = $1; >>> + my $exp = $2; >>> + return $base ** $exp; >>> + } >>> + >>> + return undef; >>> +} >>> + >>> # returns two references, $qm which holds qm.conf style key/values, and >>> \@disks >>> sub parse_ovf { >>> - my ($ovf, $debug) = @_; >>> + my ($ovf, $isOva, $debug) = @_; >>> + >>> + # we have to ignore missing disk images for ova >>> + my $dom; >>> + if ($isOva) { >>> + my $raw = ""; >>> + PVE::Tools::run_command(['tar', '-xO', '--wildcards', '--occurrence=1', >>> '-f', $ovf, '*.ovf'], outfunc => sub { >>> + my $line = shift; >>> + $raw .= $line; >>> + }); >>> + $dom = XML::LibXML->load_xml(string => $raw, no_blanks => 1); >>> + } else { >>> + $dom = XML::LibXML->load_xml(location => $ovf, no_blanks => 1); >>> + } >>> >>> - my $dom = XML::LibXML->load_xml(location => $ovf, no_blanks => 1); >>> >>> # register the xml namespaces in a xpath context object >>> # 'ovf' is the default namespace so it will prepended to each xml >>> element >>> @@ -177,7 +203,17 @@ sub parse_ovf { >>> # @ needs to be escaped to prevent Perl double quote interpolation >>> my $xpath_find_fileref = sprintf("/ovf:Envelope/ovf:DiskSection/\ >>> ovf:Disk[\@ovf:diskId='%s']/\@ovf:fileRef", $disk_id); >>> + my $xpath_find_capacity = sprintf("/ovf:Envelope/ovf:DiskSection/\ >>> +ovf:Disk[\@ovf:diskId='%s']/\@ovf:capacity", $disk_id); >>> + my $xpath_find_capacity_unit = sprintf("/ovf:Envelope/ovf:DiskSection/\ >>> +ovf:Disk[\@ovf:diskId='%s']/\@ovf:capacityAllocationUnits", $disk_id); >>> my $fileref = $xpc->findvalue($xpath_find_fileref); >>> + my $capacity = $xpc->findvalue($xpath_find_capacity); >>> + my $capacity_unit = $xpc->findvalue($xpath_find_capacity_unit); >>> + my $virtual_size; >>> + if (my $factor = try_parse_capacity_unit($capacity_unit)) { >>> + $virtual_size = $capacity * $factor; >>> + } >>> >>> my $valid_url_chars = qr@${valid_uripath_chars}|/@; >>> if (!$fileref || $fileref !~ m/^${valid_url_chars}+$/) { >>> @@ -217,7 +253,7 @@ ovf:Item[rasd:InstanceID='%s']/rasd:ResourceType", >>> $controller_id); >>> die "error parsing $filepath, are you using a symlink ?\n"; >>> } >>> >>> - if (!-e $backing_file_path) { >>> + if (!-e $backing_file_path && !$isOva) { >>> die "error parsing $filepath, file seems not to exist at >>> $backing_file_path\n"; >>> } >>> >>> @@ -225,16 +261,19 @@ ovf:Item[rasd:InstanceID='%s']/rasd:ResourceType", >>> $controller_id); >>> ($filepath) = $filepath =~ m|^(${PVE::Storage::SAFE_CHAR_CLASS_RE}+)$|; >>> # untaint & check no sub/parent dirs >>> die "invalid path\n" if !$filepath; >>> >>> - my $virtual_size = PVE::Storage::file_size_info($backing_file_path); >>> - die "error parsing $backing_file_path, cannot determine file size\n" >>> - if !$virtual_size; >>> + if (!$isOva) { >>> + my $size = PVE::Storage::file_size_info($backing_file_path); >>> + die "error parsing $backing_file_path, cannot determine file size\n" >>> + if !$size; >>> >>> + $virtual_size = $size; >>> + } >>> $pve_disk = { >>> disk_address => $pve_disk_address, >>> backing_file => $backing_file_path, >>> - virtual_size => $virtual_size >>> relative_path => $filepath, >>> }; >>> + $pve_disk->{virtual_size} = $virtual_size if defined($virtual_size); >>> push @disks, $pve_disk; >>> >>> } >>> diff --git a/src/PVE/Makefile b/src/PVE/Makefile >>> index e15a275..0af3081 100644 >>> --- a/src/PVE/Makefile >>> +++ b/src/PVE/Makefile >>> @@ -5,6 +5,7 @@ install: >>> install -D -m 0644 Storage.pm ${DESTDIR}${PERLDIR}/PVE/Storage.pm >>> install -D -m 0644 Diskmanage.pm ${DESTDIR}${PERLDIR}/PVE/Diskmanage.pm >>> install -D -m 0644 CephConfig.pm ${DESTDIR}${PERLDIR}/PVE/CephConfig.pm >>> + install -D -m 0644 GuestImport.pm >>> ${DESTDIR}${PERLDIR}/PVE/GuestImport.pm >>> make -C Storage install >>> make -C GuestImport install >>> make -C API2 install >>> diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm >>> index 1ed91c2..adc1b45 100755 >>> --- a/src/PVE/Storage.pm >>> +++ b/src/PVE/Storage.pm >>> @@ -114,7 +114,7 @@ our $VZTMPL_EXT_RE_1 = qr/\.tar\.(gz|xz|zst)/i; >>> >>> our $BACKUP_EXT_RE_2 = >>> qr/\.(tgz|(?:tar|vma)(?:\.(${\PVE::Storage::Plugin::COMPRESSOR_RE}))?)/; >>> >>> -our $IMPORT_EXT_RE_1 = qr/\.(ovf|qcow2|raw|vmdk)/; >>> +our $IMPORT_EXT_RE_1 = qr/\.(ova|ovf|qcow2|raw|vmdk)/; >>> >>> our $SAFE_CHAR_CLASS_RE = qr/[a-zA-Z0-9\-\.\+\=\_]/; >>> >>> diff --git a/src/PVE/Storage/DirPlugin.pm b/src/PVE/Storage/DirPlugin.pm >>> index 3e3b1e7..ea89464 100644 >>> --- a/src/PVE/Storage/DirPlugin.pm >>> +++ b/src/PVE/Storage/DirPlugin.pm >>> @@ -258,15 +258,26 @@ sub get_import_metadata { >>> # NOTE: all types of warnings must be added to the return schema of >>> the import-metadata API endpoint >>> my $warnings = []; >>> >>> + my $isOva = 0; >>> + if ($name =~ m/\.ova$/) { >>> + $isOva = 1; >>> + push @$warnings, { type => 'ova-needs-extracting' }; >>> + } >>> my $path = $class->path($scfg, $volname, $storeid, undef); >>> - my $res = PVE::GuestImport::OVF::parse_ovf($path); >>> + my $res = PVE::GuestImport::OVF::parse_ovf($path, $isOva); >>> my $disks = {}; >>> for my $disk ($res->{disks}->@*) { >>> my $id = $disk->{disk_address}; >>> my $size = $disk->{virtual_size}; >>> my $path = $disk->{relative_path}; >>> + my $volid; >>> + if ($isOva) { >>> + $volid = "$storeid:$volname/$path"; >>> + } else { >>> + $volid = "$storeid:import/$path", >>> + } >>> $disks->{$id} = { >>> - volid => "$storeid:import/$path", >>> + volid => $volid, >>> defined($size) ? (size => $size) : (), >>> }; >>> } >>> diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm >>> index 33f0f3a..640d156 100644 >>> --- a/src/PVE/Storage/Plugin.pm >>> +++ b/src/PVE/Storage/Plugin.pm >>> @@ -654,6 +654,11 @@ sub parse_volname { >>> return ('backup', $fn); >>> } elsif ($volname =~ m!^snippets/([^/]+)$!) { >>> return ('snippets', $1); >>> + } elsif ($volname =~ >>> m!^import/(${PVE::Storage::SAFE_CHAR_CLASS_RE}+\.ova\/(${PVE::Storage::SAFE_CHAR_CLASS_RE}+))$!) >>> { >>> + my $archive = $1; >>> + my $file = $2; >>> + my (undef, $format, undef) = parse_name_dir($file); >>> + return ('import', $archive, undef, undef, undef, undef, "ova+$format"); >> >> these could be improved if the format was already checked in the elsif >> condition I think, since the error message of parse_name_dir is a bit >> opaque/lacking context.. also, parse_name_dir allows subvol, which we >> don't want to allow here I think? > > ok yeah that makes sense > >> >>> } elsif ($volname =~ >>> m!^import/(${PVE::Storage::SAFE_CHAR_CLASS_RE}+$PVE::Storage::IMPORT_EXT_RE_1)$!) >>> { >>> return ('import', $1, undef, undef, undef, undef, $2); >>> } >>> diff --git a/src/test/parse_volname_test.pm b/src/test/parse_volname_test.pm >>> index a8c746f..bc7b4e8 100644 >>> --- a/src/test/parse_volname_test.pm >>> +++ b/src/test/parse_volname_test.pm >>> @@ -93,6 +93,21 @@ my $tests = [ >>> volname => 'import/import.ovf', >>> expected => ['import', 'import.ovf', undef, undef, undef ,undef, >>> 'ovf'], >>> }, >>> + { >>> + description => "Import, innner file of ova", >>> + volname => 'import/import.ova/disk.qcow2', >>> + expected => ['import', 'import.ova/disk.qcow2', undef, undef, undef, >>> undef, 'ova+qcow2'], >>> + }, >>> + { >>> + description => "Import, innner file of ova", >>> + volname => 'import/import.ova/disk.vmdk', >>> + expected => ['import', 'import.ova/disk.vmdk', undef, undef, undef, >>> undef, 'ova+vmdk'], >>> + }, >>> + { >>> + description => "Import, innner file of ova", >>> + volname => 'import/import.ova/disk.raw', >>> + expected => ['import', 'import.ova/disk.raw', undef, undef, undef, >>> undef, 'ova+raw'], >>> + }, >>> # >>> # failed matches >>> # >>> -- >>> 2.39.2 >>> >>> >>> >>> _______________________________________________ >>> pve-devel mailing list >>> pve-devel@lists.proxmox.com >>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel >>> >>> >>> >> >> >> _______________________________________________ >> pve-devel mailing list >> pve-devel@lists.proxmox.com >> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel >> >> > > > _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel