Re: [Libguestfs] [PATCH 5/5] v2v: Add -o rhv-upload output mode.

2018-02-23 Thread Pino Toscano
On Thursday, 22 February 2018 14:57:25 CET Richard W.M. Jones wrote:
> PROBLEMS:
>  - Spools to a temporary disk
>  - Need to specify direct/indirect upload via flag
>  - Location of ca.pem
>  - Target cluster
>  - Delete disks on failure, or rename disks on success?
>  - Handling of sparseness in raw format disks
> 
> This adds a new output mode to virt-v2v.  virt-v2v -o rhv-upload
> streams images directly to an oVirt or RHV >= 4 Data Domain using the
> oVirt SDK v4.  It is more efficient than -o rhv because it does not
> need to go via the Export Storage Domain, and is possible for humans
> to use unlike -o vdsm.
> 
> The implementation uses the Python SDK by running snippets of Python
> code to interact with the ‘ovirtsdk4’ module.  It requires both Python 3
> and the Python SDK v4 to be installed at run time (these are not,
> however, new dependencies of virt-v2v since most people wouldn't have
> them).
> ---

Since the patch is still RFC, and I cannot comment on the actual
procedure, I'll just leave few misc comments.

> +| `RHV_Upload ->
> +  let output_conn =
> +match output_conn with
> +| None ->
> +   error (f_"-o rhv-upload: output connection was not specified, use 
> ‘-oc’ to point to the oVirt or RHV server REST API URL")

More than "output connection", I'd just mention "REST API URL"
directly.

> +# Wait til the disk is up.  The transfer cannot start if the
> +# disk is locked.
> +disk_service = disks_service.disk_service(disk_id)
> +while True:
> +time.sleep(5)
> +disk = disk_service.get()
> +if disk.status == types.DiskStatus.OK:
> +break

Is a busy loop the only way to wait for an OK disk?

Regardless, this (and other busy loops in the Python snippets) IMHO
need also a timeout, otherwise any error coming from oVirt will block
v2v indefinitely...

> +# This seems to give the best throughput when uploading from Yaniv's
> +# machine to a server that drops the data. You may need to tune this
> +# on your setup.
> +BUF_SIZE = 128 * 1024
> +
> +with open(image_path, \"rb\") as disk:
> +pos = 0
> +while pos < image_size:
> +# Send the next chunk to the proxy.
> +to_read = min(image_size - pos, BUF_SIZE)
> +chunk = disk.read(to_read)
> +if not chunk:
> +transfer_service.pause()
> +raise RuntimeError(\"Unexpected end of file at pos=%%d\" %% pos)
> +
> +proxy_connection.send(chunk)
> +pos += len(chunk)

No need for 'pos', just use disk.tell() to get the current position
in the file being read.

Also, I'd flip it to the other side: instead of count the read bytes
until the file size, just keep read & send bytes, and error out when
reading more data than the file size.

Maybe something like the untested:

  with open(image_path, "rb") as disk:
  while True:
  # Send the next chunk to the proxy.
  chunk = disk.read(BUF_SIZE)
  if not chunk:
  transfer_service.pause()
  raise RuntimeError("Unexpected end of file at pos=%d" % 
file.tell())

  if disk.tell() > image_size:
  transfer_service.pause()
  # maybe even stat() image_path, and show more details in the 
exception message
  raise RuntimeError("Read more data than expected: pos=%d vs 
size=%d" % file.tell(), image_size)

  proxy_connection.send(chunk)

> +# Get the response
> +response = proxy_connection.getresponse()
> +if response.status != 200:
> +transfer_service.pause()
> +print(\"Upload failed: %%s %%s\" %%
> +  (response.status, response.reason))
> +sys.exit(1)

sys.exit() vs raise?

> +(* Find the Python 3 binary. *)
> +let find_python3 () =
> +  let rec loop = function
> +| [] ->
> +   error "could not locate Python 3 binary on the $PATH.  You may have 
> to install Python 3.  If Python 3 is already installed then you may need to 
> create a directory containing a binary called ‘python3’ which runs Python 3."
> +| python :: rest ->
> +   (* Use shell_command first to check the binary exists. *)
> +   let cmd = sprintf "%s --help >/dev/null 2>&1" (quote python) in

Std_utils.which here fits better IMHO.  If the binary is broken, then
the run_python after this will fail anyway.

> +(* Parse the -oc URI. *)
> +let parse_output_conn oc =
> +  let uri = Xml.parse_uri oc in
> +  if uri.Xml.uri_scheme <> Some "https" then
> +error (f_"rhv-upload: -oc: URI must start with https://...";);
> +  if uri.Xml.uri_server = None then
> +error (f_"rhv-upload: -oc: no remote server name in the URI");
> +  if uri.Xml.uri_path = None || uri.Xml.uri_path = Some "/" then
> +error (f_"rhv-upload: -oc: URI path component looks incorrect");
> +  let username =
> +match uri.Xml.uri_user with
> +| None ->
> +   warning (f_"rhv-upload: -oc: username was missing from URI, assuming 
> ‘admin@internal’");
> +   "admin@internal"
> +| Some user -> user in
> +  (* Reconst

Re: [Libguestfs] [PATCH 5/5] v2v: Add -o rhv-upload output mode.

2018-02-23 Thread Tomáš Golembiovský
On Fri, 23 Feb 2018 10:06:25 +
"Richard W.M. Jones"  wrote:

> On Thu, Feb 22, 2018 at 07:19:39PM +0100, Tomáš Golembiovský wrote:
> > The previous patches seem OK. On a quick glance this one also seems a
> > good start.
> > 
> > On Thu, 22 Feb 2018 13:57:25 +
> > "Richard W.M. Jones"  wrote:
> >   
> > > PROBLEMS:
> > >  - Spools to a temporary disk
> > >  - Need to specify direct/indirect upload via flag
> > >  - Location of ca.pem  
> > 
> > We surely need a new argument for that.
> > 
> > When running on VDSM host the certificate is in
> > '/etc/pki/vdsm/certs/cacert.pem'. Maybe we can use it as default?   
> 
> OK.
> 
> > >  - Target cluster
> > >  - Delete disks on failure, or rename disks on success?  
> > 
> > Definitely delete disks on failure. Unless we want to make it
> > administrators task. If you want to mark disks created by upload maybe
> > you can put something like 'uploaded by virt-v2v' in a comment? That may
> > help avoiding any renames... just an idea.  
> 
> So I guess the first question is _can_ we rename disks?

Good question! The answer is no.

I looked at the API documentation and it seems you cannot do virtually
any changes to the disk metadata once it's created. The only thing that
can be changed is 'qcow2_version' which is uninteresting for us.

Tomas


> 
> It's not always possible for virt-v2v to delete disks on every
> conceivable failure -- eg. if we're linked into a C library and that
> segfaults, there is no recovery.
> 
> If we could rename disks then we could give them temporary names
> during the upload part, and rename then right at the very end.  This
> would greatly reduce the window where we could leave an unused disk
> around.  It would mean that something would need to periodically scan
> RHV to look for the temporary disks and delete them (perhaps a single
> job after the whole conversion has finished?)
> 
> Rich.
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-df lists disk usage of guests without needing to install any
> software inside the virtual machine.  Supports Linux and Windows.
> http://people.redhat.com/~rjones/virt-df/


-- 
Tomáš Golembiovský 

___
Libguestfs mailing list
Libguestfs@redhat.com
https://www.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [PATCH 5/5] v2v: Add -o rhv-upload output mode.

2018-02-23 Thread Richard W.M. Jones
On Thu, Feb 22, 2018 at 07:19:39PM +0100, Tomáš Golembiovský wrote:
> The previous patches seem OK. On a quick glance this one also seems a
> good start.
> 
> On Thu, 22 Feb 2018 13:57:25 +
> "Richard W.M. Jones"  wrote:
> 
> > PROBLEMS:
> >  - Spools to a temporary disk
> >  - Need to specify direct/indirect upload via flag
> >  - Location of ca.pem
> 
> We surely need a new argument for that.
> 
> When running on VDSM host the certificate is in
> '/etc/pki/vdsm/certs/cacert.pem'. Maybe we can use it as default? 

OK.

> >  - Target cluster
> >  - Delete disks on failure, or rename disks on success?
> 
> Definitely delete disks on failure. Unless we want to make it
> administrators task. If you want to mark disks created by upload maybe
> you can put something like 'uploaded by virt-v2v' in a comment? That may
> help avoiding any renames... just an idea.

So I guess the first question is _can_ we rename disks?

It's not always possible for virt-v2v to delete disks on every
conceivable failure -- eg. if we're linked into a C library and that
segfaults, there is no recovery.

If we could rename disks then we could give them temporary names
during the upload part, and rename then right at the very end.  This
would greatly reduce the window where we could leave an unused disk
around.  It would mean that something would need to periodically scan
RHV to look for the temporary disks and delete them (perhaps a single
job after the whole conversion has finished?)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

___
Libguestfs mailing list
Libguestfs@redhat.com
https://www.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [PATCH 5/5] v2v: Add -o rhv-upload output mode.

2018-02-22 Thread Tomáš Golembiovský
On Thu, 22 Feb 2018 14:31:28 +
"Richard W.M. Jones"  wrote:

> On Thu, Feb 22, 2018 at 01:57:25PM +, Richard W.M. Jones wrote:
> > PROBLEMS:
> >  - Spools to a temporary disk
> >  - Need to specify direct/indirect upload via flag
> >  - Location of ca.pem
> >  - Target cluster
> >  - Delete disks on failure, or rename disks on success?
> >  - Handling of sparseness in raw format disks  
> 
> Actually there are a few more problems I forgot about:
> 
> - How should disks be named?  Currently it uses -000, -001 etc

IMHO this is good enough. The names are not that important, the UUID is
the key.

> - How should the OVA file refer to disks?

By disk UUID and volume UUID. You should have the first one from the
upload API and I guess you'll have to query for the volume UUID
afterwards.

Tomas

> 
> Rich.
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-builder quickly builds VMs from scratch
> http://libguestfs.org/virt-builder.1.html
> 
> ___
> Libguestfs mailing list
> Libguestfs@redhat.com
> https://www.redhat.com/mailman/listinfo/libguestfs


-- 
Tomáš Golembiovský 

___
Libguestfs mailing list
Libguestfs@redhat.com
https://www.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [PATCH 5/5] v2v: Add -o rhv-upload output mode.

2018-02-22 Thread Tomáš Golembiovský
The previous patches seem OK. On a quick glance this one also seems a
good start.

On Thu, 22 Feb 2018 13:57:25 +
"Richard W.M. Jones"  wrote:

> PROBLEMS:
>  - Spools to a temporary disk
>  - Need to specify direct/indirect upload via flag
>  - Location of ca.pem

We surely need a new argument for that.

When running on VDSM host the certificate is in
'/etc/pki/vdsm/certs/cacert.pem'. Maybe we can use it as default? 

>  - Target cluster
>  - Delete disks on failure, or rename disks on success?

Definitely delete disks on failure. Unless we want to make it
administrators task. If you want to mark disks created by upload maybe
you can put something like 'uploaded by virt-v2v' in a comment? That may
help avoiding any renames... just an idea.


>  - Handling of sparseness in raw format disks
> 
> This adds a new output mode to virt-v2v.  virt-v2v -o rhv-upload
> streams images directly to an oVirt or RHV >= 4 Data Domain using the
> oVirt SDK v4.  It is more efficient than -o rhv because it does not
> need to go via the Export Storage Domain, and is possible for humans
> to use unlike -o vdsm.
> 
> The implementation uses the Python SDK by running snippets of Python
> code to interact with the ‘ovirtsdk4’ module.  It requires both Python 3
> and the Python SDK v4 to be installed at run time (these are not,
> however, new dependencies of virt-v2v since most people wouldn't have
> them).
> ---
>  v2v/Makefile.am   |   2 +
>  v2v/cmdline.ml|  25 +++
>  v2v/output_rhv_upload.ml  | 403 
> ++
>  v2v/output_rhv_upload.mli |  26 +++
>  4 files changed, 456 insertions(+)
> 
> diff --git a/v2v/Makefile.am b/v2v/Makefile.am
> index 83f0c30c7..c028babe6 100644
> --- a/v2v/Makefile.am
> +++ b/v2v/Makefile.am
> @@ -64,6 +64,7 @@ SOURCES_MLI = \
>   output_null.mli \
>   output_qemu.mli \
>   output_rhv.mli \
> + output_rhv_upload.mli \
>   output_vdsm.mli \
>   parse_ovf_from_ova.mli \
>   parse_libvirt_xml.mli \
> @@ -116,6 +117,7 @@ SOURCES_ML = \
>   output_local.ml \
>   output_qemu.ml \
>   output_rhv.ml \
> + output_rhv_upload.ml \
>   output_vdsm.ml \
>   inspect_source.ml \
>   target_bus_assignment.ml \
> diff --git a/v2v/cmdline.ml b/v2v/cmdline.ml
> index fceda1f82..4424863fe 100644
> --- a/v2v/cmdline.ml
> +++ b/v2v/cmdline.ml
> @@ -138,6 +138,8 @@ let parse_cmdline () =
>  | "disk" | "local" -> output_mode := `Local
>  | "null" -> output_mode := `Null
>  | "ovirt" | "rhv" | "rhev" -> output_mode := `RHV
> +| "ovirt-upload" | "ovirt_upload" | "rhv-upload" | "rhv_upload" ->
> +   output_mode := `RHV_Upload
>  | "qemu" -> output_mode := `QEmu
>  | "vdsm" -> output_mode := `VDSM
>  | s ->
> @@ -537,6 +539,29 @@ read the man page virt-v2v(1).
>Output_rhv.output_rhv os output_alloc,
>output_format, output_alloc
>  
> +| `RHV_Upload ->
> +  let output_conn =
> +match output_conn with
> +| None ->
> +   error (f_"-o rhv-upload: output connection was not specified, use 
> ‘-oc’ to point to the oVirt or RHV server REST API URL")
> +| Some oc -> oc in
> +  (* In theory we could make the password optional in future. *)
> +  let output_password =
> +match output_password with
> +| None ->
> +   error (f_"-o rhv-upload: output password file was not specified, 
> use ‘-op’ to point to a file which contains the password used to connect to 
> the oVirt or RHV server")
> +| Some op -> op in
> +  let os =
> +match output_storage with
> +| None ->
> +   error (f_"-o rhv-upload: output storage was not specified, use 
> ‘-os’");
> +| Some os -> os in
> +  if qemu_boot then
> +error_option_cannot_be_used_in_output_mode "rhv-upload" 
> "--qemu-boot";
> +  Output_rhv_upload.output_rhv_upload output_alloc output_conn
> +  output_password os,
> +  output_format, output_alloc
> +
>  | `VDSM ->
>if output_password <> None then
>  error_option_cannot_be_used_in_output_mode "vdsm" "-op";
> diff --git a/v2v/output_rhv_upload.ml b/v2v/output_rhv_upload.ml
> new file mode 100644
> index 0..77f7bc988
> --- /dev/null
> +++ b/v2v/output_rhv_upload.ml
> @@ -0,0 +1,403 @@
> +(* virt-v2v
> + * Copyright (C) 2009-2018 Red Hat Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should hav

Re: [Libguestfs] [PATCH 5/5] v2v: Add -o rhv-upload output mode.

2018-02-22 Thread Richard W.M. Jones
On Thu, Feb 22, 2018 at 01:57:25PM +, Richard W.M. Jones wrote:
> PROBLEMS:
>  - Spools to a temporary disk
>  - Need to specify direct/indirect upload via flag
>  - Location of ca.pem
>  - Target cluster
>  - Delete disks on failure, or rename disks on success?
>  - Handling of sparseness in raw format disks

Actually there are a few more problems I forgot about:

- How should disks be named?  Currently it uses -000, -001 etc
- How should the OVA file refer to disks?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

___
Libguestfs mailing list
Libguestfs@redhat.com
https://www.redhat.com/mailman/listinfo/libguestfs


[Libguestfs] [PATCH 5/5] v2v: Add -o rhv-upload output mode.

2018-02-22 Thread Richard W.M. Jones
PROBLEMS:
 - Spools to a temporary disk
 - Need to specify direct/indirect upload via flag
 - Location of ca.pem
 - Target cluster
 - Delete disks on failure, or rename disks on success?
 - Handling of sparseness in raw format disks

This adds a new output mode to virt-v2v.  virt-v2v -o rhv-upload
streams images directly to an oVirt or RHV >= 4 Data Domain using the
oVirt SDK v4.  It is more efficient than -o rhv because it does not
need to go via the Export Storage Domain, and is possible for humans
to use unlike -o vdsm.

The implementation uses the Python SDK by running snippets of Python
code to interact with the ‘ovirtsdk4’ module.  It requires both Python 3
and the Python SDK v4 to be installed at run time (these are not,
however, new dependencies of virt-v2v since most people wouldn't have
them).
---
 v2v/Makefile.am   |   2 +
 v2v/cmdline.ml|  25 +++
 v2v/output_rhv_upload.ml  | 403 ++
 v2v/output_rhv_upload.mli |  26 +++
 4 files changed, 456 insertions(+)

diff --git a/v2v/Makefile.am b/v2v/Makefile.am
index 83f0c30c7..c028babe6 100644
--- a/v2v/Makefile.am
+++ b/v2v/Makefile.am
@@ -64,6 +64,7 @@ SOURCES_MLI = \
output_null.mli \
output_qemu.mli \
output_rhv.mli \
+   output_rhv_upload.mli \
output_vdsm.mli \
parse_ovf_from_ova.mli \
parse_libvirt_xml.mli \
@@ -116,6 +117,7 @@ SOURCES_ML = \
output_local.ml \
output_qemu.ml \
output_rhv.ml \
+   output_rhv_upload.ml \
output_vdsm.ml \
inspect_source.ml \
target_bus_assignment.ml \
diff --git a/v2v/cmdline.ml b/v2v/cmdline.ml
index fceda1f82..4424863fe 100644
--- a/v2v/cmdline.ml
+++ b/v2v/cmdline.ml
@@ -138,6 +138,8 @@ let parse_cmdline () =
 | "disk" | "local" -> output_mode := `Local
 | "null" -> output_mode := `Null
 | "ovirt" | "rhv" | "rhev" -> output_mode := `RHV
+| "ovirt-upload" | "ovirt_upload" | "rhv-upload" | "rhv_upload" ->
+   output_mode := `RHV_Upload
 | "qemu" -> output_mode := `QEmu
 | "vdsm" -> output_mode := `VDSM
 | s ->
@@ -537,6 +539,29 @@ read the man page virt-v2v(1).
   Output_rhv.output_rhv os output_alloc,
   output_format, output_alloc
 
+| `RHV_Upload ->
+  let output_conn =
+match output_conn with
+| None ->
+   error (f_"-o rhv-upload: output connection was not specified, use 
‘-oc’ to point to the oVirt or RHV server REST API URL")
+| Some oc -> oc in
+  (* In theory we could make the password optional in future. *)
+  let output_password =
+match output_password with
+| None ->
+   error (f_"-o rhv-upload: output password file was not specified, 
use ‘-op’ to point to a file which contains the password used to connect to the 
oVirt or RHV server")
+| Some op -> op in
+  let os =
+match output_storage with
+| None ->
+   error (f_"-o rhv-upload: output storage was not specified, use 
‘-os’");
+| Some os -> os in
+  if qemu_boot then
+error_option_cannot_be_used_in_output_mode "rhv-upload" "--qemu-boot";
+  Output_rhv_upload.output_rhv_upload output_alloc output_conn
+  output_password os,
+  output_format, output_alloc
+
 | `VDSM ->
   if output_password <> None then
 error_option_cannot_be_used_in_output_mode "vdsm" "-op";
diff --git a/v2v/output_rhv_upload.ml b/v2v/output_rhv_upload.ml
new file mode 100644
index 0..77f7bc988
--- /dev/null
+++ b/v2v/output_rhv_upload.ml
@@ -0,0 +1,403 @@
+(* virt-v2v
+ * Copyright (C) 2009-2018 Red Hat Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ *)
+
+open Printf
+open Unix
+
+open Std_utils
+open Tools_utils
+open Unix_utils
+open Common_gettext.Gettext
+
+open Types
+open Utils
+
+(* These correspond mostly to the fields in the Python
+ * sdk.Connection object, except for the password which
+ * is handled separately.
+ *)
+type connection = {
+  conn_url : string;
+  conn_username : string;
+  conn_debug : bool;
+}
+
+let string_of_connection conn =
+  sprintf "url=%s username=%s debug=%b"
+  conn.conn_url conn.conn_username conn.conn_debug
+
+(* Python code fragments go first.  Note the