subject:"\[Libguestfs\] \[libnbd PATCH 3\/2\] python\: Slice pread

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-06-09 Thread Eric Blake

On Tue, May 31, 2022 at 07:24:03PM -0500, Eric Blake wrote:
> On Wed, Jun 01, 2022 at 02:20:48AM +0300, Nir Soffer wrote:
> > On Tue, May 31, 2022 at 6:52 PM Eric Blake  wrote:
> > >
> > > On Tue, May 31, 2022 at 10:49:03AM -0500, Eric Blake wrote:
> > > > This patch fixes the corner-case regression introduced by the previous
> > > > patch where the pread_structured callback buffer lifetime ends as soon
> > > > as the callback (that is, where accessing a stashed callback parameter
> > > > can result in ValueError instead of modifying a harmless copy).  With
> > > > careful effort, we can create a memoryview of the Python object that
> > > > we read into, then slice that memoryview as part of the callback; now
> > > > the slice will be valid as long as the original object is also valid.
> > > >
> > >
> > > > | @@ -76,8 +77,24 @@ chunk_wrapper (void *user_data, const vo
> > > > |PyObject *py_subbuf = NULL;
> > > > |PyObject *py_error = NULL;
> > > > |
> > > > | -  /* Cast away const */
> > > > | -  py_subbuf = PyMemoryView_FromMemory ((char *) subbuf, count, 
> > > > PyBUF_READ);
> > > > | +  if (data->buf) {
> > 
> > In which case we don't have data->buf?
> 
> Right now, in nbd_internal_py_aio_read_structured.  Fixing that will
> eventually become patch 4/2 for this series (the idea is that instead
> of requiring the user to pass in an nbd.Buffer object, we should take
> any buffer-like object, populate data->buf with zero-copy semantics,
> and we're good to go.  But to avoid breaking back-compat, we either
> have to also special-case existing code using nbd.Buffer, or enhance
> the nbd.Buffer class to implement the buffer-like interface).

Following up on this, after first reworking aio_pread to be more
efficient in other series that landed first, I was able to respin this
series to no longer care about pread_structured
vs. aio_pread_structured, and with no temporary regression in being
unable to stash off information that survives past the callback:

https://listman.redhat.com/archives/libguestfs/2022-June/029148.html

I'm still working on patches to make nbd.Buffer not do as much
copying; right now, I'm leaning towards this design:

Add nbd.Buffer.copy_default, set to False.

Enhance existing functions along the lines of:

@classmethod
def nbd.Buffer.from_bytearray(cls, buffer, copy=None):
  if copy is None:
copy = nbd.Buffer.copy_default
  if copy:
buffer = bytearray(buffer)
  self = cls(0)
  self._o = buffer
  self._init = True
  return self

def nbd.Buffer.to_bytearray(self, copy=None):
  if copy is None:
copy = self.copy_default # falls back to nbd.Buffer.copy_default as needed
  if not hasattr(self, '_init'):
self._o = bytearray(len(self))
self._init = True
  if copy:
return bytearray(self._o)
  return self._o

With this design, newer libnbd clients default to zero-copy, and can
request a copy when needed; while code that wants to be portable to
older and newer libnbd at the same time can start out by doing:

nbd.Buffer.copy_default = True

at initialization time (since they can't pass a copy= parameter to
.{to,from}_bytearray() in older libnbd).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-06-01 Thread Nir Soffer

On Wed, Jun 1, 2022 at 3:24 AM Eric Blake  wrote:
>
> On Wed, Jun 01, 2022 at 02:20:48AM +0300, Nir Soffer wrote:
> > On Tue, May 31, 2022 at 6:52 PM Eric Blake  wrote:
> > >
> > > On Tue, May 31, 2022 at 10:49:03AM -0500, Eric Blake wrote:
> > > > This patch fixes the corner-case regression introduced by the previous
> > > > patch where the pread_structured callback buffer lifetime ends as soon
> > > > as the callback (that is, where accessing a stashed callback parameter
> > > > can result in ValueError instead of modifying a harmless copy).  With
> > > > careful effort, we can create a memoryview of the Python object that
> > > > we read into, then slice that memoryview as part of the callback; now
> > > > the slice will be valid as long as the original object is also valid.
> > > >
> > >
> > > > | @@ -76,8 +77,24 @@ chunk_wrapper (void *user_data, const vo
> > > > |PyObject *py_subbuf = NULL;
> > > > |PyObject *py_error = NULL;
> > > > |
> > > > | -  /* Cast away const */
> > > > | -  py_subbuf = PyMemoryView_FromMemory ((char *) subbuf, count, 
> > > > PyBUF_READ);
> > > > | +  if (data->buf) {
> >
> > In which case we don't have data->buf?
>
> Right now, in nbd_internal_py_aio_read_structured.  Fixing that will
> eventually become patch 4/2 for this series (the idea is that instead
> of requiring the user to pass in an nbd.Buffer object, we should take
> any buffer-like object, populate data->buf with zero-copy semantics,
> and we're good to go.

Avoiding the nbd.Buffer class and using the buffer protocol sounds like
the way to go.

> But to avoid breaking back-compat, we either
> have to also special-case existing code using nbd.Buffer, or enhance
> the nbd.Buffer class to implement the buffer-like interface).

Backward compatibility is very important, but I'm not sure if we
have enough users of the python bindings to care about it now.

>
> >
> > > > | +PyObject *start, *end, *slice;
> > > > | +assert (PyMemoryView_Check (data->buf));
> >
> > Why do we need data->buf to be a memoryview?
>
> Maybe it doesn't.  It looks like (at least from python, rather than in
> the C coding side of things) that you can apply the slice operation to
> bytes and bytearray.  But there may be other buffer-like objects that
> don't directly support slice while memoryview always does; and one of
> the reasons memoryview is part of the standard python library is to
> make it easy to add operations on top of any buffer-like object.
> memoryview also takes care of doing a temporary copy to consolidate
> out a contiguous view if the original buffer is not contiguous; you
> don't need that with bytes or bytearray, but definitely need it with
> array.array and friends.

In the python side, the reason you need memoryview is to avoid the copy
when creating a slice. I guess the C API works in the same way but I did
not check.


> > We can create a new memoryview form data->buf - it only needs to be an 
> > object
> > supporting the buffer protocol. Basically we need the C version of:
> >
> > memoryview(buf)[start:stop]
> >
> > buf can be bytes, bytearray, mmap, or another memoryview.
>
> Yes - and that's what this C code is.  memoryview(buf) was created
> when populating data->buf (whether the original was bytes, bytearray,
> mmap, or other buffer-like object), then this code follows up with
> creating the slice [start:stop] then pulling it altogether with
> view.__getitem__(slice).

Sounds good!

I think Richard's suggestion to extract a helper for slicing a memoryview
will make this code much easier to read and review.

>
> >
> > > > | +ptrdiff_t offs = subbuf - PyMemoryView_GET_BUFFER 
> > > > (data->buf)->buf;
> >
> > Because PyMemoryView_Check is inside the assert, build with NDEBUG will
> > remove the check, and this call may crash if data->buf is not a memoryview.
>
> It's more a proof that the earlier code in
> nbd_internal_py_pread_structured correctly set data->buf.  I'm not
> worried about an NDEBUG build failing; this is one case where an
> assert() really is more for documentation.

This looks like a coding error as is. We can add a comment or change the code
to look more correct. If we really  don't have a memory view, the
slice will copy
the data (unless C level slices are unsafe, unlikely), and we want to get a view
without coping. So it would be better to fail loudly.

> > It would be nicer if we could get the offset without looking into the 
> > internal
> > buffer of the memoryview.
>
> We could pass the void* alongside the PyObject* in struct user_data.
> But ultimately, we HAVE to look at the internal pointer of the
> memoryview, in order to call nbd_pread_structured in the first place.
> Whether we look once (and populate two fields in data) or every time
> the callback is reached is less important.
>
> It would also be possible to store the original offset to
> nbd_pread_structured in user_data, then compute the slice start as the
> callback's offset - original offset, without

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-06-01 Thread Richard W.M. Jones



I agree with Nir's feedback and R-b's and don't have anything else to add.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-06-01 Thread Richard W.M. Jones

On Tue, May 31, 2022 at 07:24:03PM -0500, Eric Blake wrote:
> On Wed, Jun 01, 2022 at 02:20:48AM +0300, Nir Soffer wrote:
> > Does this work?
> > 
> > PySlice_New(NULL, NULL, NULL);
> > PySlice_AdjustIndices(length, start, stop, step);
> 
> New in python 3.6.1.  README says we still target python 3.3.

There are two things we could do here.  Either move the baseline up.
Python 3.6.1 was released in March 2017 (5 years ago), and more
importantly is available in RHEL 7 which is the oldest Linux distro we
care about.

Or (harder, not necessarily better) detect the required function in
configure.ac, and add a utility function to hide the difference.

> Worse, look at the signature - it does NOT take a PyObject *, and
> therefore it cannot modify an existing slice object (rather, it is
> for easing the computation of how to turn [-1:-2:2] into a slice
> with positive bounds) - you still have to go through a step that
> converts integers into PyObject* before creating a PySlice object.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-06-01 Thread Richard W.M. Jones

On Tue, May 31, 2022 at 10:52:35AM -0500, Eric Blake wrote:
> And I really wish python didn't make it so hard to grab a slice of
> another object using C code.  Having to create 3 temporary PyObjects
> instead of having a utility C function that takes normal integers was
> annoying.

You can add hand-written utility functions to python/utils.c if that
makes things easier.  Note you will need to add a prototype for the
new function(s) to the generated code in python/methods.h.

See nbd_internal_py_get_string_list for an example of this.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-05-31 Thread Eric Blake

On Wed, Jun 01, 2022 at 02:20:48AM +0300, Nir Soffer wrote:
> On Tue, May 31, 2022 at 6:52 PM Eric Blake  wrote:
> >
> > On Tue, May 31, 2022 at 10:49:03AM -0500, Eric Blake wrote:
> > > This patch fixes the corner-case regression introduced by the previous
> > > patch where the pread_structured callback buffer lifetime ends as soon
> > > as the callback (that is, where accessing a stashed callback parameter
> > > can result in ValueError instead of modifying a harmless copy).  With
> > > careful effort, we can create a memoryview of the Python object that
> > > we read into, then slice that memoryview as part of the callback; now
> > > the slice will be valid as long as the original object is also valid.
> > >
> >
> > > | @@ -76,8 +77,24 @@ chunk_wrapper (void *user_data, const vo
> > > |PyObject *py_subbuf = NULL;
> > > |PyObject *py_error = NULL;
> > > |
> > > | -  /* Cast away const */
> > > | -  py_subbuf = PyMemoryView_FromMemory ((char *) subbuf, count, 
> > > PyBUF_READ);
> > > | +  if (data->buf) {
> 
> In which case we don't have data->buf?

Right now, in nbd_internal_py_aio_read_structured.  Fixing that will
eventually become patch 4/2 for this series (the idea is that instead
of requiring the user to pass in an nbd.Buffer object, we should take
any buffer-like object, populate data->buf with zero-copy semantics,
and we're good to go.  But to avoid breaking back-compat, we either
have to also special-case existing code using nbd.Buffer, or enhance
the nbd.Buffer class to implement the buffer-like interface).

> 
> > > | +PyObject *start, *end, *slice;
> > > | +assert (PyMemoryView_Check (data->buf));
> 
> Why do we need data->buf to be a memoryview?

Maybe it doesn't.  It looks like (at least from python, rather than in
the C coding side of things) that you can apply the slice operation to
bytes and bytearray.  But there may be other buffer-like objects that
don't directly support slice while memoryview always does; and one of
the reasons memoryview is part of the standard python library is to
make it easy to add operations on top of any buffer-like object.
memoryview also takes care of doing a temporary copy to consolidate
out a contiguous view if the original buffer is not contiguous; you
don't need that with bytes or bytearray, but definitely need it with
array.array and friends.

> 
> We can create a new memoryview form data->buf - it only needs to be an object
> supporting the buffer protocol. Basically we need the C version of:
> 
> memoryview(buf)[start:stop]
> 
> buf can be bytes, bytearray, mmap, or another memoryview.

Yes - and that's what this C code is.  memoryview(buf) was created
when populating data->buf (whether the original was bytes, bytearray,
mmap, or other buffer-like object), then this code follows up with
creating the slice [start:stop] then pulling it altogether with
view.__getitem__(slice).

> 
> > > | +ptrdiff_t offs = subbuf - PyMemoryView_GET_BUFFER (data->buf)->buf;
> 
> Because PyMemoryView_Check is inside the assert, build with NDEBUG will
> remove the check, and this call may crash if data->buf is not a memoryview.

It's more a proof that the earlier code in
nbd_internal_py_pread_structured correctly set data->buf.  I'm not
worried about an NDEBUG build failing; this is one case where an
assert() really is more for documentation.

> 
> It would be nicer if we could get the offset without looking into the internal
> buffer of the memoryview.

We could pass the void* alongside the PyObject* in struct user_data.
But ultimately, we HAVE to look at the internal pointer of the
memoryview, in order to call nbd_pread_structured in the first place.
Whether we look once (and populate two fields in data) or every time
the callback is reached is less important.

It would also be possible to store the original offset to
nbd_pread_structured in user_data, then compute the slice start as the
callback's offset - original offset, without having to do pointer
math.  But we still have to store the memoryview PyObject that we will
be slicing.  So it just becomes a question of whether struct user_data
needs to grow.

> 
> > > | +start = PyLong_FromLong (offs);
> > > | +if (!start) { PyErr_PrintEx (0); goto out; }
> > > | +end = PyLong_FromLong (offs + count);
> > > | +if (!end) { Py_DECREF (start); PyErr_PrintEx (0); goto out; }
> > > | +slice = PySlice_New (start, end, NULL);
> > > | +Py_DECREF (start);
> > > | +Py_DECREF (end);
> > > | +if (!slice) { PyErr_PrintEx (0); goto out; }
> > > | +py_subbuf = PyObject_GetItem (data->buf, slice);
> >
> > Missing a Py_DECREF (slice) here.  Easy enough to add...
> >
> > > +++ b/generator/Python.ml
> > > @@ -187,8 +187,24 @@ let
> > > pr "PyList_SET_ITEM (py_%s, i_%s, py_e_%s);\n" n n n;
> > > pr "  }\n"
> > >  | CBBytesIn (n, len) ->
> > > -   pr "  /* Cast away const */\n";
> > > -   pr "  py_%s = PyMemoryView_FromMemory ((char *) %s, %s, 
> > >

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-05-31 Thread Nir Soffer

On Tue, May 31, 2022 at 6:52 PM Eric Blake  wrote:
>
> On Tue, May 31, 2022 at 10:49:03AM -0500, Eric Blake wrote:
> > This patch fixes the corner-case regression introduced by the previous
> > patch where the pread_structured callback buffer lifetime ends as soon
> > as the callback (that is, where accessing a stashed callback parameter
> > can result in ValueError instead of modifying a harmless copy).  With
> > careful effort, we can create a memoryview of the Python object that
> > we read into, then slice that memoryview as part of the callback; now
> > the slice will be valid as long as the original object is also valid.
> >
>
> > | @@ -76,8 +77,24 @@ chunk_wrapper (void *user_data, const vo
> > |PyObject *py_subbuf = NULL;
> > |PyObject *py_error = NULL;
> > |
> > | -  /* Cast away const */
> > | -  py_subbuf = PyMemoryView_FromMemory ((char *) subbuf, count, 
> > PyBUF_READ);
> > | +  if (data->buf) {

In which case we don't have data->buf?

> > | +PyObject *start, *end, *slice;
> > | +assert (PyMemoryView_Check (data->buf));

Why do we need data->buf to be a memoryview?

We can create a new memoryview form data->buf - it only needs to be an object
supporting the buffer protocol. Basically we need the C version of:

memoryview(buf)[start:stop]

buf can be bytes, bytearray, mmap, or another memoryview.

> > | +ptrdiff_t offs = subbuf - PyMemoryView_GET_BUFFER (data->buf)->buf;

Because PyMemoryView_Check is inside the assert, build with NDEBUG will
remove the check, and this call may crash if data->buf is not a memoryview.

It would be nicer if we could get the offset without looking into the internal
buffer of the memoryview.

> > | +start = PyLong_FromLong (offs);
> > | +if (!start) { PyErr_PrintEx (0); goto out; }
> > | +end = PyLong_FromLong (offs + count);
> > | +if (!end) { Py_DECREF (start); PyErr_PrintEx (0); goto out; }
> > | +slice = PySlice_New (start, end, NULL);
> > | +Py_DECREF (start);
> > | +Py_DECREF (end);
> > | +if (!slice) { PyErr_PrintEx (0); goto out; }
> > | +py_subbuf = PyObject_GetItem (data->buf, slice);
>
> Missing a Py_DECREF (slice) here.  Easy enough to add...
>
> > +++ b/generator/Python.ml
> > @@ -187,8 +187,24 @@ let
> > pr "PyList_SET_ITEM (py_%s, i_%s, py_e_%s);\n" n n n;
> > pr "  }\n"
> >  | CBBytesIn (n, len) ->
> > -   pr "  /* Cast away const */\n";
> > -   pr "  py_%s = PyMemoryView_FromMemory ((char *) %s, %s, 
> > PyBUF_READ);\n" n n len;
> > +   pr "  if (data->buf) {\n";
> > +   pr "PyObject *start, *end, *slice;\n";
> > +   pr "assert (PyMemoryView_Check (data->buf));\n";
> > +   pr "ptrdiff_t offs = %s - PyMemoryView_GET_BUFFER 
> > (data->buf)->buf;\n" n;
> > +   pr "start = PyLong_FromLong (offs);\n";
> > +   pr "if (!start) { PyErr_PrintEx (0); goto out; }\n";
> > +   pr "end = PyLong_FromLong (offs + %s);\n" len;
> > +   pr "if (!end) { Py_DECREF (start); PyErr_PrintEx (0); goto out; 
> > }\n";
> > +   pr "slice = PySlice_New (start, end, NULL);\n";
> > +   pr "Py_DECREF (start);\n";
> > +   pr "Py_DECREF (end);\n";
> > +   pr "if (!slice) { PyErr_PrintEx (0); goto out; }\n";
> > +   pr "py_%s = PyObject_GetItem (data->buf, slice);\n" n;
>
> ...here
>
> And I really wish python didn't make it so hard to grab a slice of
> another object using C code.  Having to create 3 temporary PyObjects
> instead of having a utility C function that takes normal integers was
> annoying.

Does this work?

PySlice_New(NULL, NULL, NULL);
PySlice_AdjustIndices(length, start, stop, step);

Nir

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-05-31 Thread Eric Blake

On Tue, May 31, 2022 at 10:49:03AM -0500, Eric Blake wrote:
> This patch fixes the corner-case regression introduced by the previous
> patch where the pread_structured callback buffer lifetime ends as soon
> as the callback (that is, where accessing a stashed callback parameter
> can result in ValueError instead of modifying a harmless copy).  With
> careful effort, we can create a memoryview of the Python object that
> we read into, then slice that memoryview as part of the callback; now
> the slice will be valid as long as the original object is also valid.
> 

> | @@ -76,8 +77,24 @@ chunk_wrapper (void *user_data, const vo
> |PyObject *py_subbuf = NULL;
> |PyObject *py_error = NULL;
> |
> | -  /* Cast away const */
> | -  py_subbuf = PyMemoryView_FromMemory ((char *) subbuf, count, PyBUF_READ);
> | +  if (data->buf) {
> | +PyObject *start, *end, *slice;
> | +assert (PyMemoryView_Check (data->buf));
> | +ptrdiff_t offs = subbuf - PyMemoryView_GET_BUFFER (data->buf)->buf;
> | +start = PyLong_FromLong (offs);
> | +if (!start) { PyErr_PrintEx (0); goto out; }
> | +end = PyLong_FromLong (offs + count);
> | +if (!end) { Py_DECREF (start); PyErr_PrintEx (0); goto out; }
> | +slice = PySlice_New (start, end, NULL);
> | +Py_DECREF (start);
> | +Py_DECREF (end);
> | +if (!slice) { PyErr_PrintEx (0); goto out; }
> | +py_subbuf = PyObject_GetItem (data->buf, slice);

Missing a Py_DECREF (slice) here.  Easy enough to add...

> +++ b/generator/Python.ml
> @@ -187,8 +187,24 @@ let
> pr "PyList_SET_ITEM (py_%s, i_%s, py_e_%s);\n" n n n;
> pr "  }\n"
>  | CBBytesIn (n, len) ->
> -   pr "  /* Cast away const */\n";
> -   pr "  py_%s = PyMemoryView_FromMemory ((char *) %s, %s, 
> PyBUF_READ);\n" n n len;
> +   pr "  if (data->buf) {\n";
> +   pr "PyObject *start, *end, *slice;\n";
> +   pr "assert (PyMemoryView_Check (data->buf));\n";
> +   pr "ptrdiff_t offs = %s - PyMemoryView_GET_BUFFER 
> (data->buf)->buf;\n" n;
> +   pr "start = PyLong_FromLong (offs);\n";
> +   pr "if (!start) { PyErr_PrintEx (0); goto out; }\n";
> +   pr "end = PyLong_FromLong (offs + %s);\n" len;
> +   pr "if (!end) { Py_DECREF (start); PyErr_PrintEx (0); goto out; 
> }\n";
> +   pr "slice = PySlice_New (start, end, NULL);\n";
> +   pr "Py_DECREF (start);\n";
> +   pr "Py_DECREF (end);\n";
> +   pr "if (!slice) { PyErr_PrintEx (0); goto out; }\n";
> +   pr "py_%s = PyObject_GetItem (data->buf, slice);\n" n;

...here

And I really wish python didn't make it so hard to grab a slice of
another object using C code.  Having to create 3 temporary PyObjects
instead of having a utility C function that takes normal integers was
annoying.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

[Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

2022-05-31 Thread Eric Blake

This patch fixes the corner-case regression introduced by the previous
patch where the pread_structured callback buffer lifetime ends as soon
as the callback (that is, where accessing a stashed callback parameter
can result in ValueError instead of modifying a harmless copy).  With
careful effort, we can create a memoryview of the Python object that
we read into, then slice that memoryview as part of the callback; now
the slice will be valid as long as the original object is also valid.

Note that this does not tackle the case of aio_pread_structured; for
that, we either need to make nbd.Buffer honor the Python buffer
protocol, or else change our handling of BytesPersistOut to utilize
the buffer protocol directly instead of requiring the user to
round-trip through nbd.Buffer.

The runtime of this patch is still about the same 3.0s as in the test
of the previous patch, but now you can do:

$ nbdsh
nbd> h.connect_command(["nbdkit","-s","memory","10"])
nbd> copy=None
nbd> def f(b,o,s,e):
...  global copy
...  copy = b
...  print(b[0])
...
nbd> print(copy)
None
nbd> buf = h.pread_structured(10,0,f)
0
nbd> copy

nbd> copy[0]=0x31
nbd> print(buf)
bytearray(b'1\x00\x00\x00\x00\x00\x00\x00\x00\x00')

That is, because we passed in a memoryview slice of the original
buffer instead of a temporary slice of C memory, we can now stash that
slice into a global, modify it, and see those modifications reflected
back to the overall bytearray object returned by pread_structured.

The generated diff is a bit more verbose, including some no-op lines
added into pread, aio_pread, and aio_pread_structured.  In practice,
only pread_structured is actually impacted.

| --- python/methods.c.bak  2022-05-31 09:51:30.406757934 -0500
| +++ python/methods.c  2022-05-31 10:26:01.797636894 -0500
| @@ -27,6 +27,7 @@
|  #include 
|  #include 
|  #include 
| +#include 
|
|  #include 
|
| @@ -76,8 +77,24 @@ chunk_wrapper (void *user_data, const vo
|PyObject *py_subbuf = NULL;
|PyObject *py_error = NULL;
|
| -  /* Cast away const */
| -  py_subbuf = PyMemoryView_FromMemory ((char *) subbuf, count, PyBUF_READ);
| +  if (data->buf) {
| +PyObject *start, *end, *slice;
| +assert (PyMemoryView_Check (data->buf));
| +ptrdiff_t offs = subbuf - PyMemoryView_GET_BUFFER (data->buf)->buf;
| +start = PyLong_FromLong (offs);
| +if (!start) { PyErr_PrintEx (0); goto out; }
| +end = PyLong_FromLong (offs + count);
| +if (!end) { Py_DECREF (start); PyErr_PrintEx (0); goto out; }
| +slice = PySlice_New (start, end, NULL);
| +Py_DECREF (start);
| +Py_DECREF (end);
| +if (!slice) { PyErr_PrintEx (0); goto out; }
| +py_subbuf = PyObject_GetItem (data->buf, slice);
| +  }
| +  else {
| +/* Cast away const */
| +py_subbuf = PyMemoryView_FromMemory ((char *) subbuf, count, PyBUF_READ);
| +  }
|if (!py_subbuf) { PyErr_PrintEx (0); goto out; }
|PyObject *py_error_modname = PyUnicode_FromString ("ctypes");
|if (!py_error_modname) { PyErr_PrintEx (0); goto out; }
| @@ -115,11 +132,11 @@ chunk_wrapper (void *user_data, const vo
|};
|
|   out:
| -  if (py_subbuf) {
| +  if (py_subbuf && !data->buf) {
|  PyObject *tmp = PyObject_CallMethod(py_subbuf, "release", NULL);
|  Py_XDECREF (tmp);
| -Py_DECREF (py_subbuf);
|}
| +  Py_XDECREF (py_subbuf);
|if (py_error) {
|  PyObject *py_error_ret = PyObject_GetAttrString (py_error, "value");
|  *error = PyLong_AsLong (py_error_ret);
| @@ -2304,7 +2321,7 @@ nbd_internal_py_pread (PyObject *self, P
|struct nbd_handle *h;
|int ret;
|PyObject *py_ret = NULL;
| -  PyObject *buf = NULL;
| +  PyObject *py_buf = NULL;
|Py_ssize_t count;
|uint64_t offset_u64;
|unsigned long long offset; /* really uint64_t */
| @@ -2318,20 +2335,20 @@ nbd_internal_py_pread (PyObject *self, P
|h = get_handle (py_h);
|if (!h) goto out;
|flags_u32 = flags;
| -  buf = PyByteArray_FromStringAndSize (NULL, count);
| -  if (buf == NULL) goto out;
| +  py_buf = PyByteArray_FromStringAndSize (NULL, count);
| +  if (py_buf == NULL) goto out;
|offset_u64 = offset;
|
| -  ret = nbd_pread (h, PyByteArray_AS_STRING (buf), count, offset_u64, 
flags_u32);
| +  ret = nbd_pread (h, PyByteArray_AS_STRING (py_buf), count, offset_u64, 
flags_u32);
|if (ret == -1) {
|  raise_exception ();
|  goto out;
|}
| -  py_ret = buf;
| -  buf = NULL;
| +  py_ret = py_buf;
| +  py_buf = NULL;
|
|   out:
| -  Py_XDECREF (buf);
| +  Py_XDECREF (py_buf);
|return py_ret;
|  }
|
| @@ -2342,7 +2359,7 @@ nbd_internal_py_pread_structured (PyObje
|struct nbd_handle *h;
|int ret;
|PyObject *py_ret = NULL;
| -  PyObject *buf = NULL;
| +  PyObject *py_buf = NULL;
|Py_ssize_t count;
|uint64_t offset_u64;
|unsigned long long offset; /* really uint64_t */
| @@ -2360,8 +2377,8 @@ nbd_internal_py_pread_structured (PyObje
|h = get_handle (py_h);
|if (!h) goto out;
|flags_u32 = flags;
| -  buf =

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

Re: [Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

[Libguestfs] [libnbd PATCH 3/2] python: Slice pread_structured buffer from original

9 matches

Site Navigation

Mail list logo

Footer information