Re: [pve-devel] [PATCH v3 manager 1/1] backup: move logic to include guests into method

2020-03-18 Thread Aaron Lauterer



On 3/18/20 2:10 PM, Fabian Grünbichler wrote:

On March 18, 2020 11:57 am, Aaron Lauterer wrote:



On 3/17/20 3:33 PM, Fabian Grünbichler wrote:

On March 16, 2020 4:44 pm, Aaron Lauterer wrote:

This extracts the logic which guests are to be included in a backup job
into its own method 'get_included_guests'. This makes it possible to
develop other features around backup jobs.

Logic which was spread out accross the API2/VZDump.pm file and the
VZDump.pm file is refactored into the new method, most notably the
logic that dealt with excluded guests.

The new method will return the VMIDs for the whole cluster. If a VMID is
present on the local node and thus part of the local backup job is still
determined in the VZDump::exec_backup method.


this is not true. previously, that was handled partly by the API
already, so behaviour changed in some cases.



Permission checks are kept where they are to be able to handle missing
permissions according to the current context. The old behavior to die
on a backup job when the user is missing the permission to a guest and
the job is not an 'all' or 'exclude' job is kept.

The creation of the skiplist is moved to the VZDump::exec_backup method,
close to the place where it is used.


but it was used already in the API, so it made sense to have it there.
see comments in-line. I am not saying we can't change this behaviour,
but from this description it does not sound like it was (fully)
intentional.


Thanks for the feedback!

It seems like the early exit right after the skiplist creation got lost
while I was working on the whole thing :/
Also thanks to point out the taskid thing for a single guest backup
further down.

The idea driving this patch is to have one place to decide which guests
will be included in a backup job so it can be used in more places than
just the actual backups. Right now that logic is very spread out and
sometimes even a bit redundant AFAICT.

For example in the PVE/API2/VZDump.pm where @vmids will contain only
local vmids if the `all` parameter is not set and then in
PVE/VZDump->exec_backup the vmids are checked again against
$plugin->vmlist().

After going through all the changes and existing code once more I am not
sure if we should even try to keep the exact same behavior in place.

What if the new PVE/VZDump->get_included_guests sub will return
* all included VMIDs
* local VMIDs
* local skiplist
for the current node?


yes, or just included/local and skipped/external. a caller interested in
all of them can just merge those two mutually exclusive lists, a caller
interested only in the former (e.g., vzdump with --all) can just ignore
the latter, the regular vzdump call can just pass them both along so
that exec_backup can print a skiplist and act on the filtered one.


Sounds good. Then I will refactor it to that.




This way the logic could be nicely contained in one sub instead of being
spread out over multiple modules and subs. We can still exit quietly if
there are no local VMIDs and will do so even in other situations like
when `all` is set but there are either no guests on the local node or
they are all in the `exclude` list.

On the other hand there will be a bit higher computational load to
generate these lists all at once and not spread out with early exits
like now.


I don't see how the load would be much higher? we are talking about
calls that either end up doing a backup, or end up parsing the configs
of all the returned VMIDs. in both cases, iterating once over the
(cached) vmlist provided by pmxcfs to find out which guest is owned by
which node is very low-effort. this is not in any performance-critical
hot-path ;)


I was thinking about clusters with several hundred to maybe a low 4 
digit number of guests. But yeah, even then it should be negligible.





I hope I did not miss anything.


___
pve-devel mailing list
pve-devel@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH v3 manager 1/1] backup: move logic to include guests into method

2020-03-18 Thread Fabian Grünbichler
On March 18, 2020 11:57 am, Aaron Lauterer wrote:
> 
> 
> On 3/17/20 3:33 PM, Fabian Grünbichler wrote:
>> On March 16, 2020 4:44 pm, Aaron Lauterer wrote:
>>> This extracts the logic which guests are to be included in a backup job
>>> into its own method 'get_included_guests'. This makes it possible to
>>> develop other features around backup jobs.
>>>
>>> Logic which was spread out accross the API2/VZDump.pm file and the
>>> VZDump.pm file is refactored into the new method, most notably the
>>> logic that dealt with excluded guests.
>>>
>>> The new method will return the VMIDs for the whole cluster. If a VMID is
>>> present on the local node and thus part of the local backup job is still
>>> determined in the VZDump::exec_backup method.
>> 
>> this is not true. previously, that was handled partly by the API
>> already, so behaviour changed in some cases.
>> 
>>>
>>> Permission checks are kept where they are to be able to handle missing
>>> permissions according to the current context. The old behavior to die
>>> on a backup job when the user is missing the permission to a guest and
>>> the job is not an 'all' or 'exclude' job is kept.
>>>
>>> The creation of the skiplist is moved to the VZDump::exec_backup method,
>>> close to the place where it is used.
>> 
>> but it was used already in the API, so it made sense to have it there.
>> see comments in-line. I am not saying we can't change this behaviour,
>> but from this description it does not sound like it was (fully)
>> intentional.
> 
> Thanks for the feedback!
> 
> It seems like the early exit right after the skiplist creation got lost 
> while I was working on the whole thing :/
> Also thanks to point out the taskid thing for a single guest backup 
> further down.
> 
> The idea driving this patch is to have one place to decide which guests 
> will be included in a backup job so it can be used in more places than 
> just the actual backups. Right now that logic is very spread out and 
> sometimes even a bit redundant AFAICT.
> 
> For example in the PVE/API2/VZDump.pm where @vmids will contain only 
> local vmids if the `all` parameter is not set and then in 
> PVE/VZDump->exec_backup the vmids are checked again against 
> $plugin->vmlist().
> 
> After going through all the changes and existing code once more I am not 
> sure if we should even try to keep the exact same behavior in place.
> 
> What if the new PVE/VZDump->get_included_guests sub will return
> * all included VMIDs
> * local VMIDs
> * local skiplist
> for the current node?

yes, or just included/local and skipped/external. a caller interested in 
all of them can just merge those two mutually exclusive lists, a caller 
interested only in the former (e.g., vzdump with --all) can just ignore 
the latter, the regular vzdump call can just pass them both along so 
that exec_backup can print a skiplist and act on the filtered one.

> This way the logic could be nicely contained in one sub instead of being 
> spread out over multiple modules and subs. We can still exit quietly if 
> there are no local VMIDs and will do so even in other situations like 
> when `all` is set but there are either no guests on the local node or 
> they are all in the `exclude` list.
> 
> On the other hand there will be a bit higher computational load to 
> generate these lists all at once and not spread out with early exits 
> like now.

I don't see how the load would be much higher? we are talking about 
calls that either end up doing a backup, or end up parsing the configs 
of all the returned VMIDs. in both cases, iterating once over the 
(cached) vmlist provided by pmxcfs to find out which guest is owned by 
which node is very low-effort. this is not in any performance-critical 
hot-path ;)

> I hope I did not miss anything.

___
pve-devel mailing list
pve-devel@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH v3 manager 1/1] backup: move logic to include guests into method

2020-03-18 Thread Aaron Lauterer



On 3/17/20 3:33 PM, Fabian Grünbichler wrote:

On March 16, 2020 4:44 pm, Aaron Lauterer wrote:

This extracts the logic which guests are to be included in a backup job
into its own method 'get_included_guests'. This makes it possible to
develop other features around backup jobs.

Logic which was spread out accross the API2/VZDump.pm file and the
VZDump.pm file is refactored into the new method, most notably the
logic that dealt with excluded guests.

The new method will return the VMIDs for the whole cluster. If a VMID is
present on the local node and thus part of the local backup job is still
determined in the VZDump::exec_backup method.


this is not true. previously, that was handled partly by the API
already, so behaviour changed in some cases.



Permission checks are kept where they are to be able to handle missing
permissions according to the current context. The old behavior to die
on a backup job when the user is missing the permission to a guest and
the job is not an 'all' or 'exclude' job is kept.

The creation of the skiplist is moved to the VZDump::exec_backup method,
close to the place where it is used.


but it was used already in the API, so it made sense to have it there.
see comments in-line. I am not saying we can't change this behaviour,
but from this description it does not sound like it was (fully)
intentional.


Thanks for the feedback!

It seems like the early exit right after the skiplist creation got lost 
while I was working on the whole thing :/
Also thanks to point out the taskid thing for a single guest backup 
further down.


The idea driving this patch is to have one place to decide which guests 
will be included in a backup job so it can be used in more places than 
just the actual backups. Right now that logic is very spread out and 
sometimes even a bit redundant AFAICT.


For example in the PVE/API2/VZDump.pm where @vmids will contain only 
local vmids if the `all` parameter is not set and then in 
PVE/VZDump->exec_backup the vmids are checked again against 
$plugin->vmlist().


After going through all the changes and existing code once more I am not 
sure if we should even try to keep the exact same behavior in place.


What if the new PVE/VZDump->get_included_guests sub will return
* all included VMIDs
* local VMIDs
* local skiplist
for the current node?

This way the logic could be nicely contained in one sub instead of being 
spread out over multiple modules and subs. We can still exit quietly if 
there are no local VMIDs and will do so even in other situations like 
when `all` is set but there are either no guests on the local node or 
they are all in the `exclude` list.


On the other hand there will be a bit higher computational load to 
generate these lists all at once and not spread out with early exits 
like now.


I hope I did not miss anything.






Signed-off-by: Aaron Lauterer 
---
changes: rebased

  PVE/API2/VZDump.pm | 36 +++---
  PVE/VZDump.pm  | 74 --
  2 files changed, 55 insertions(+), 55 deletions(-)

diff --git a/PVE/API2/VZDump.pm b/PVE/API2/VZDump.pm
index f01e4de0..bc380099 100644
--- a/PVE/API2/VZDump.pm
+++ b/PVE/API2/VZDump.pm
@@ -69,43 +69,15 @@ __PACKAGE__->register_method ({
return 'OK' if $param->{node} && $param->{node} ne $nodename;
  
  	my $cmdline = PVE::VZDump::Common::command_line($param);

-   my @vmids;
-   # convert string lists to arrays
-   if ($param->{pool}) {
-   @vmids = 
@{PVE::API2Tools::get_resource_pool_guest_members($param->{pool})};
-   } else {
-   @vmids = PVE::Tools::split_list(extract_param($param, 'vmid'));
-   }
+
+   $param->{vmids} = PVE::VZDump->get_included_guests($param, $nodename);
+   my @vmids = @{$param->{vmids}};


so this is just to not change a few call sites down below ;)

  
  	if($param->{stop}){

PVE::VZDump::stop_running_backups();
return 'OK' if !scalar(@vmids);
}
  
-	my $skiplist = [];

-   if (!$param->{all}) {
-   if (!$param->{node} || $param->{node} eq $nodename) {


so here skip list was only done when not including all VMs (since when
doing --all, we'd usually skip lots of VMs and don't want to be too
noisy I guess?)


-   my $vmlist = PVE::Cluster::get_vmlist();
-   my @localvmids = ();
-   foreach my $vmid (@vmids) {
-   my $d = $vmlist->{ids}->{$vmid};
-   if ($d && ($d->{node} ne $nodename)) {
-   push @$skiplist, $vmid;
-   } else {
-   push @localvmids, $vmid;
-   }
-   }
-   @vmids = @localvmids;


and here the actual skipping happened (well, see below ;)). which means
that the API from this point on already had the filtered list, for
decisions such as


-   # silent exit if specified VMs run on other nodes
-   return "OK" if 

Re: [pve-devel] [PATCH v3 manager 1/1] backup: move logic to include guests into method

2020-03-17 Thread Fabian Grünbichler
On March 16, 2020 4:44 pm, Aaron Lauterer wrote:
> This extracts the logic which guests are to be included in a backup job
> into its own method 'get_included_guests'. This makes it possible to
> develop other features around backup jobs.
> 
> Logic which was spread out accross the API2/VZDump.pm file and the
> VZDump.pm file is refactored into the new method, most notably the
> logic that dealt with excluded guests.
> 
> The new method will return the VMIDs for the whole cluster. If a VMID is
> present on the local node and thus part of the local backup job is still
> determined in the VZDump::exec_backup method.

this is not true. previously, that was handled partly by the API 
already, so behaviour changed in some cases.

> 
> Permission checks are kept where they are to be able to handle missing
> permissions according to the current context. The old behavior to die
> on a backup job when the user is missing the permission to a guest and
> the job is not an 'all' or 'exclude' job is kept.
> 
> The creation of the skiplist is moved to the VZDump::exec_backup method,
> close to the place where it is used.

but it was used already in the API, so it made sense to have it there.
see comments in-line. I am not saying we can't change this behaviour, 
but from this description it does not sound like it was (fully) 
intentional.

> 
> Signed-off-by: Aaron Lauterer 
> ---
> changes: rebased
> 
>  PVE/API2/VZDump.pm | 36 +++---
>  PVE/VZDump.pm  | 74 --
>  2 files changed, 55 insertions(+), 55 deletions(-)
> 
> diff --git a/PVE/API2/VZDump.pm b/PVE/API2/VZDump.pm
> index f01e4de0..bc380099 100644
> --- a/PVE/API2/VZDump.pm
> +++ b/PVE/API2/VZDump.pm
> @@ -69,43 +69,15 @@ __PACKAGE__->register_method ({
>   return 'OK' if $param->{node} && $param->{node} ne $nodename;
>  
>   my $cmdline = PVE::VZDump::Common::command_line($param);
> - my @vmids;
> - # convert string lists to arrays
> - if ($param->{pool}) {
> - @vmids = 
> @{PVE::API2Tools::get_resource_pool_guest_members($param->{pool})};
> - } else {
> - @vmids = PVE::Tools::split_list(extract_param($param, 'vmid'));
> - }
> +
> + $param->{vmids} = PVE::VZDump->get_included_guests($param, $nodename);
> + my @vmids = @{$param->{vmids}};

so this is just to not change a few call sites down below ;)

>  
>   if($param->{stop}){
>   PVE::VZDump::stop_running_backups();
>   return 'OK' if !scalar(@vmids);
>   }
>  
> - my $skiplist = [];
> - if (!$param->{all}) {
> - if (!$param->{node} || $param->{node} eq $nodename) {

so here skip list was only done when not including all VMs (since when 
doing --all, we'd usually skip lots of VMs and don't want to be too 
noisy I guess?)

> - my $vmlist = PVE::Cluster::get_vmlist();
> - my @localvmids = ();
> - foreach my $vmid (@vmids) {
> - my $d = $vmlist->{ids}->{$vmid};
> - if ($d && ($d->{node} ne $nodename)) {
> - push @$skiplist, $vmid;
> - } else {
> - push @localvmids, $vmid;
> - }
> - }
> - @vmids = @localvmids;

and here the actual skipping happened (well, see below ;)). which means 
that the API from this point on already had the filtered list, for 
decisions such as

> - # silent exit if specified VMs run on other nodes
> - return "OK" if !scalar(@vmids);

this ^ , but also whether to associate the forked worker with a single 
VMID or not, etc.pp.

> - }
> -
> - $param->{vmids} = PVE::VZDump::check_vmids(@vmids)
> - }
> -
> - my @exclude = PVE::Tools::split_list(extract_param($param, 'exclude'));
> - $param->{exclude} = PVE::VZDump::check_vmids(@exclude);
> -
>   # exclude-path list need to be 0 separated
>   if (defined($param->{'exclude-path'})) {
>   my @expaths = split(/\0/, $param->{'exclude-path'} || '');
> @@ -130,7 +102,7 @@ __PACKAGE__->register_method ({
>   die "interrupted by signal\n";
>   };
>  
> - my $vzdump = PVE::VZDump->new($cmdline, $param, $skiplist);
> + my $vzdump = PVE::VZDump->new($cmdline, $param);
>  
>   eval {
>   $vzdump->getlock($upid); # only one process allowed
> diff --git a/PVE/VZDump.pm b/PVE/VZDump.pm
> index f3274196..9368d439 100644
> --- a/PVE/VZDump.pm
> +++ b/PVE/VZDump.pm
> @@ -21,6 +21,7 @@ use PVE::RPCEnvironment;
>  use PVE::Storage;
>  use PVE::VZDump::Common;
>  use PVE::VZDump::Plugin;
> +use PVE::Tools qw(extract_param);
>  
>  my @posix_filesystems = qw(ext3 ext4 nfs nfs4 reiserfs xfs);
>  
> @@ -379,7 +380,7 @@ sub sendmail {
>  };
>  
>  sub new {
> -my ($class, $cmdline, $opts, $skiplist) = @_;
> +my ($class, $cmdline, $opts) = @_;
>  
>  mkpath $logdir;
>  
> @@ -418,8 +419,7 @@ sub new {
>  $opts->{dumpdir}