Re: [Qemu-devel] [PATCH v9 35/37] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-29 Thread Eric Blake
On 01/29/2016 05:03 AM, Markus Armbruster wrote:

> 
> With (1) don't assign, the caller can pick an error value by assigning
> it before the visit, and it must not access the value on error unless it
> does.
> 
> With (2) assign zero, the caller can't pick an error value, but may
> safely access the value even on error.
> 
> Tradeoff.  I figure either can work for us.
> 
>>> (3) Assign null pointer, else don't assign anything
>>>
>>> CON: inconsistent
>>> CON: mix of (1)'s and (2)'s CON
>>
>> Which I think is what I did in this patch.
> 
> I don't like the inconsistency.  It complicates the interface.

I'll go ahead and audit to see whether more scalar visits were relying
on (1) and would have to be rewritten to use style (2); vs. whether more
pointer visits were passing in uninitialized obj and would have to be
rewritten to use style (1).

> I think behavior (1) don't assign and (2) assign zero both work, we just
> have to pick one and run with it.
> 
> If we pick behavior (1) don't assign, then we should assert something
> like !obj || !*obj on entry.  With such assertions in place, I think (1)
> should be roughly as safe as (2).

I think your assessment is right, and it's now just a matter of seeing
which way to get to a consistent state is less effort (I may still end
up doing both ways as competing patches, for comparison purposes).


> or maybe returns whether something was allocated:
> 
> out_obj:
> if (visit_end_struct(v) && err) {
>qapi_free_T(*obj);
> }

I'm liking that.  Dealloc and output visitors always return false, and
input visitors may need to track something on their stack for whether
they allocated or returned error earlier on, but it results in less
generated output.  Basically, it's lowering the 'bool allocated' that I
added in this attempt out of the generated code and into the visitors.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v9 35/37] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-29 Thread Markus Armbruster
Eric Blake  writes:

> On 01/28/2016 08:24 AM, Markus Armbruster wrote:
>> Eric Blake  writes:
>> 
>>> Returning a partial object on error is an invitation for a careless
>>> caller to leak memory.  As no one outside the testsuite was actually
>>> relying on these semantics, it is cleaner to just document and
>>> guarantee that ALL pointer-based visit_type_FOO() functions always
>>> leave a safe value in *obj during an input visitor (either the new
>>> object on success, or NULL if an error is encountered).
>>>
>>> Since input visitors have blind assignment semantics, we have to
>>> track the result of whether an assignment is made all the way down
>>> to each visitor callback implementation, to avoid making decisions
>>> based on potentially uninitialized storage.
>> 
>> I'm not sure I get this paragraph.  What's "blind assignment semantics"?
>
> The caller does:
>
> {
> Foo *bar; /* uninit */
> visit_type_Foo();
> if (no error) {
> /* use bar */
> }
> }
>
> Which means the visitor core can't do 'if (*obj)', because obj may be
> uninitialized on entry; if it dereferences obj at all, it must be via
> '*obj = ...' which I'm terming a blind assignment.
>
> But I can try and come up with better wording.

I'd suggest one, but I think we should first make up our minds on error
behavior.

>>> Note that we still leave *obj unchanged after a scalar-based
>>> visit_type_FOO(); I did not feel like auditing all uses of
>>> visit_type_Enum() to see if the callers would tolerate a specific
>>> sentinel value (not to mention having to decide whether it would
>>> be better to use 0 or ENUM__MAX as that sentinel).
>> 
>> The assigning input visitor functions (core and generated) all assign
>> either a pointer to a newly allocated object, or a non-pointer scalar
>> value.
>> 
>> Possible behaviors on error:
>> 
>> (0) What we have now: assign something that must be cleaned up with the
>> dealloc visitor if it's a pointer, but is otherwise useless
>> 
>> CON: callers have to clean up
>> CON: exposes careless callers to useless values
>> 
>> (1) Don't assign anything

Need to be very careful to store only when success is assured.

Consider visiting a QAPI type that is actually two levels of containers,
say a list of a struct of scalars.  The visit is a walk of this tree
then:

list
___/ ... \___
   / \
struct1  structN
   /  ...  \ /  ...  \   
scal11   scal1M   scalN1   scalNM

Now let's consider the state when the visit reaches scalN1:

* The visits of scal11..scal1M and struct 1 have all succeeded already,
  and stored their value into their container.  Same for the visits of
  structI (I < N) omitted in the diagram, and their scalars.

* The visit of list and struct N are in progress: the object has been
  partially constructed, but not yet stored.

* The remaining visits haven't begun, and their members in objects
  already allocated are still zero.

Now the visit of scalN1 fails.  The visit of structN fails in turn,
freeing its (not yet stored, partially constructed) object.  Finally,
the visit of list fails, freeing its object.

That the failed visits of scalN1 and structN didn't store is actually
unimportant.

Of course, this isn't how things work now.  Visits of containers store
the newly allocated object before visiting members, in visit_start_*(),
and the member visits use that to find the object.

>> PRO: consistent
>> CON: exposes careless callers to uninitialized values
>
> Half-PRO: Caller can pre-initialize a default, and rely on that value on
> error.  In fact, I think we have callers doing that when visiting an
> enum, and I didn't feel up to auditing them all when first writing the
> patch.
>
> But a small audit right now shows:
>
> qom/object.c:object_property_get_enum() starts with uninitialized 'int
> ret;', hardcodes 'return 0;' on some failures, but otherwise passes it
> to visit_type_enum() then blindly returns that value even if errp is
> set.  Yuck.  Callers HAVE to check errp rather than relying on the
> return value to flag errors; although it looks like the lone caller is
> in numa.c and passes _abort.
>
> Maybe I should just bite the bullet, and audit ALL uses of visitor for
> their behavior of what to expect in *obj on error.
>> 
>> (2) Assign zero bits
>> 
>> PRO: consistent
>> CON: exposes careless callers to bogus zero values
>
> Half-CON: Caller cannot pre-initialize a default

With (1) don't assign, the caller can pick an error value by assigning
it before the visit, and it must not access the value on error unless it
does.

With (2) assign zero, the caller can't pick an error value, but may
safely access the value even on error.

Tradeoff.  I figure either can work for us.

>> (3) Assign null pointer, else don't assign anything
>> 
>> CON: inconsistent
>> CON: mix of (1)'s and (2)'s CON
>
> Which I think is what 

Re: [Qemu-devel] [PATCH v9 35/37] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-28 Thread Eric Blake
On 01/28/2016 08:24 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> Returning a partial object on error is an invitation for a careless
>> caller to leak memory.  As no one outside the testsuite was actually
>> relying on these semantics, it is cleaner to just document and
>> guarantee that ALL pointer-based visit_type_FOO() functions always
>> leave a safe value in *obj during an input visitor (either the new
>> object on success, or NULL if an error is encountered).
>>
>> Since input visitors have blind assignment semantics, we have to
>> track the result of whether an assignment is made all the way down
>> to each visitor callback implementation, to avoid making decisions
>> based on potentially uninitialized storage.
> 
> I'm not sure I get this paragraph.  What's "blind assignment semantics"?

The caller does:

{
Foo *bar; /* uninit */
visit_type_Foo();
if (no error) {
/* use bar */
}
}

Which means the visitor core can't do 'if (*obj)', because obj may be
uninitialized on entry; if it dereferences obj at all, it must be via
'*obj = ...' which I'm terming a blind assignment.

But I can try and come up with better wording.

> 
>> Note that we still leave *obj unchanged after a scalar-based
>> visit_type_FOO(); I did not feel like auditing all uses of
>> visit_type_Enum() to see if the callers would tolerate a specific
>> sentinel value (not to mention having to decide whether it would
>> be better to use 0 or ENUM__MAX as that sentinel).
> 
> The assigning input visitor functions (core and generated) all assign
> either a pointer to a newly allocated object, or a non-pointer scalar
> value.
> 
> Possible behaviors on error:
> 
> (0) What we have now: assign something that must be cleaned up with the
> dealloc visitor if it's a pointer, but is otherwise useless
> 
> CON: callers have to clean up
> CON: exposes careless callers to useless values
> 
> (1) Don't assign anything
> 
> PRO: consistent
> CON: exposes careless callers to uninitialized values

Half-PRO: Caller can pre-initialize a default, and rely on that value on
error.  In fact, I think we have callers doing that when visiting an
enum, and I didn't feel up to auditing them all when first writing the
patch.

But a small audit right now shows:

qom/object.c:object_property_get_enum() starts with uninitialized 'int
ret;', hardcodes 'return 0;' on some failures, but otherwise passes it
to visit_type_enum() then blindly returns that value even if errp is
set.  Yuck.  Callers HAVE to check errp rather than relying on the
return value to flag errors; although it looks like the lone caller is
in numa.c and passes _abort.

Maybe I should just bite the bullet, and audit ALL uses of visitor for
their behavior of what to expect in *obj on error.

> 
> (2) Assign zero bits
> 
> PRO: consistent
> CON: exposes careless callers to bogus zero values

Half-CON: Caller cannot pre-initialize a default

> 
> (3) Assign null pointer, else don't assign anything
> 
> CON: inconsistent
> CON: mix of (1)'s and (2)'s CON

Which I think is what I did in this patch.

> 
> (4) Other ideas?

Store the caller's initial value (often all zero, but not necessarily),
and restore THAT value on error (preserves a pre-initialized default,
but leaves uninitialized garbage in place to bite careless callers)

> 
> Note that *obj is almost always null on entry, because we allocate
> objects zero-initialized.  Only root visits can expose their caller to
> uninitialized values.
> 
>> Signed-off-by: Eric Blake 
>>

>> +/* All qapi types have a corresponding function with a signature
>> + * roughly compatible with this:
>> + *
>> + * void visit_type_FOO(Visitor *v, void *obj, const char *name, Error 
>> **errp);
>> + *
>> + * where *@obj is itself a pointer or a scalar.  The visit functions for
>> + * built-in types are declared here, while the functions for qapi-defined
>> + * struct, union, enum, and list types are generated (see qapi-visit.h).
>> + * Input visitors may receive an uninitialized *@obj, and guarantee a
>> + * safe value is assigned (a new object on success, or NULL on failure).
> 
> This specifies the safe value: NULL.  Makes no sense for non-pointer
> scalars.
> 
> May input visitors really receive uninitialized *@obj?  As far as I can
> see, we routinely store a newly allocated object to *@obj on success,
> and store nothing on failure.  To be able to pass this to the dealloc
> visitor, *@obj must have been null initially, mustn't it?

Correct.  Pre-patch: either the caller pre-initialized obj to NULL (and
can then blindly pass it to the dealloc visitor regardless of whether
visit_start_struct() succeeded), or it did not (in which case the
dealloc visitor must not be called if *obj remains uninitialized because
visit_start_struct() failed, but MUST be called if visit_start_struct()
succeeded to clean up the partial object).

Post-patch: calling 

Re: [Qemu-devel] [PATCH v9 35/37] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-28 Thread Markus Armbruster
Eric Blake  writes:

> Returning a partial object on error is an invitation for a careless
> caller to leak memory.  As no one outside the testsuite was actually
> relying on these semantics, it is cleaner to just document and
> guarantee that ALL pointer-based visit_type_FOO() functions always
> leave a safe value in *obj during an input visitor (either the new
> object on success, or NULL if an error is encountered).
>
> Since input visitors have blind assignment semantics, we have to
> track the result of whether an assignment is made all the way down
> to each visitor callback implementation, to avoid making decisions
> based on potentially uninitialized storage.

I'm not sure I get this paragraph.  What's "blind assignment semantics"?

> Note that we still leave *obj unchanged after a scalar-based
> visit_type_FOO(); I did not feel like auditing all uses of
> visit_type_Enum() to see if the callers would tolerate a specific
> sentinel value (not to mention having to decide whether it would
> be better to use 0 or ENUM__MAX as that sentinel).

The assigning input visitor functions (core and generated) all assign
either a pointer to a newly allocated object, or a non-pointer scalar
value.

Possible behaviors on error:

(0) What we have now: assign something that must be cleaned up with the
dealloc visitor if it's a pointer, but is otherwise useless

CON: callers have to clean up
CON: exposes careless callers to useless values

(1) Don't assign anything

PRO: consistent
CON: exposes careless callers to uninitialized values

(2) Assign zero bits

PRO: consistent
CON: exposes careless callers to bogus zero values

(3) Assign null pointer, else don't assign anything

CON: inconsistent
CON: mix of (1)'s and (2)'s CON

(4) Other ideas?

Note that *obj is almost always null on entry, because we allocate
objects zero-initialized.  Only root visits can expose their caller to
uninitialized values.

> Signed-off-by: Eric Blake 
>
> ---
> v9: fix bug in use of errp
> v8: rebase to earlier changes
> v7: rebase to earlier changes, enhance commit message, also fix
> visit_type_str() and visit_type_any()
> v6: rebase on top of earlier doc and formatting improvements, mention
> that *obj can be uninitialized on entry to an input visitor, rework
> semantics to keep valgrind happy on uninitialized input, break some
> long lines
> ---
>  include/qapi/visitor-impl.h|  6 ++---
>  include/qapi/visitor.h | 53 
> --
>  qapi/opts-visitor.c| 11 ++---
>  qapi/qapi-dealloc-visitor.c|  9 ---
>  qapi/qapi-visit-core.c | 45 ---
>  qapi/qmp-input-visitor.c   | 18 +-
>  qapi/qmp-output-visitor.c  |  6 +++--
>  qapi/string-input-visitor.c|  6 +++--
>  qapi/string-output-visitor.c   |  3 ++-
>  scripts/qapi-visit.py  | 40 +++
>  tests/test-qmp-commands.c  | 13 +--
>  tests/test-qmp-input-strict.c  | 19 +++
>  tests/test-qmp-input-visitor.c | 10 ++--
>  13 files changed, 157 insertions(+), 82 deletions(-)
>
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index acbe7d6..8df4ba1 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -26,7 +26,7 @@ struct Visitor
>  {
>  /* Must be provided to visit structs (the string visitors do not
>   * currently visit structs). */
> -void (*start_struct)(Visitor *v, const char *name, void **obj,
> +bool (*start_struct)(Visitor *v, const char *name, void **obj,
>   size_t size, Error **errp);
>  /* May be NULL; most useful for input visitors. */
>  void (*check_struct)(Visitor *v, Error **errp);
> @@ -34,13 +34,13 @@ struct Visitor
>  void (*end_struct)(Visitor *v);
>
>  /* May be NULL; most useful for input visitors. */
> -void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
> +bool (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
>Error **errp);
>  /* May be NULL */
>  void (*end_implicit_struct)(Visitor *v);
>
>  /* Must be set */
> -void (*start_list)(Visitor *v, const char *name, GenericList **list,
> +bool (*start_list)(Visitor *v, const char *name, GenericList **list,
> Error **errp);
>  /* Must be set */
>  GenericList *(*next_list)(Visitor *v, GenericList *element, Error 
> **errp);
> diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
> index 4638863..4eae633 100644
> --- a/include/qapi/visitor.h
> +++ b/include/qapi/visitor.h
> @@ -31,6 +31,27 @@
>   * visitor-impl.h.
>   */
>
> +/* All qapi types have a corresponding function with a signature
> + * roughly compatible with this:
> + *
> + * void visit_type_FOO(Visitor *v, void *obj, const char *name, Error 
> **errp);
> + *
> + * 

[Qemu-devel] [PATCH v9 35/37] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-19 Thread Eric Blake
Returning a partial object on error is an invitation for a careless
caller to leak memory.  As no one outside the testsuite was actually
relying on these semantics, it is cleaner to just document and
guarantee that ALL pointer-based visit_type_FOO() functions always
leave a safe value in *obj during an input visitor (either the new
object on success, or NULL if an error is encountered).

Since input visitors have blind assignment semantics, we have to
track the result of whether an assignment is made all the way down
to each visitor callback implementation, to avoid making decisions
based on potentially uninitialized storage.

Note that we still leave *obj unchanged after a scalar-based
visit_type_FOO(); I did not feel like auditing all uses of
visit_type_Enum() to see if the callers would tolerate a specific
sentinel value (not to mention having to decide whether it would
be better to use 0 or ENUM__MAX as that sentinel).

Signed-off-by: Eric Blake 

---
v9: fix bug in use of errp
v8: rebase to earlier changes
v7: rebase to earlier changes, enhance commit message, also fix
visit_type_str() and visit_type_any()
v6: rebase on top of earlier doc and formatting improvements, mention
that *obj can be uninitialized on entry to an input visitor, rework
semantics to keep valgrind happy on uninitialized input, break some
long lines
---
 include/qapi/visitor-impl.h|  6 ++---
 include/qapi/visitor.h | 53 --
 qapi/opts-visitor.c| 11 ++---
 qapi/qapi-dealloc-visitor.c|  9 ---
 qapi/qapi-visit-core.c | 45 ---
 qapi/qmp-input-visitor.c   | 18 +-
 qapi/qmp-output-visitor.c  |  6 +++--
 qapi/string-input-visitor.c|  6 +++--
 qapi/string-output-visitor.c   |  3 ++-
 scripts/qapi-visit.py  | 40 +++
 tests/test-qmp-commands.c  | 13 +--
 tests/test-qmp-input-strict.c  | 19 +++
 tests/test-qmp-input-visitor.c | 10 ++--
 13 files changed, 157 insertions(+), 82 deletions(-)

diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index acbe7d6..8df4ba1 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -26,7 +26,7 @@ struct Visitor
 {
 /* Must be provided to visit structs (the string visitors do not
  * currently visit structs). */
-void (*start_struct)(Visitor *v, const char *name, void **obj,
+bool (*start_struct)(Visitor *v, const char *name, void **obj,
  size_t size, Error **errp);
 /* May be NULL; most useful for input visitors. */
 void (*check_struct)(Visitor *v, Error **errp);
@@ -34,13 +34,13 @@ struct Visitor
 void (*end_struct)(Visitor *v);

 /* May be NULL; most useful for input visitors. */
-void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
+bool (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
   Error **errp);
 /* May be NULL */
 void (*end_implicit_struct)(Visitor *v);

 /* Must be set */
-void (*start_list)(Visitor *v, const char *name, GenericList **list,
+bool (*start_list)(Visitor *v, const char *name, GenericList **list,
Error **errp);
 /* Must be set */
 GenericList *(*next_list)(Visitor *v, GenericList *element, Error **errp);
diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 4638863..4eae633 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -31,6 +31,27 @@
  * visitor-impl.h.
  */

+/* All qapi types have a corresponding function with a signature
+ * roughly compatible with this:
+ *
+ * void visit_type_FOO(Visitor *v, void *obj, const char *name, Error **errp);
+ *
+ * where *@obj is itself a pointer or a scalar.  The visit functions for
+ * built-in types are declared here, while the functions for qapi-defined
+ * struct, union, enum, and list types are generated (see qapi-visit.h).
+ * Input visitors may receive an uninitialized *@obj, and guarantee a
+ * safe value is assigned (a new object on success, or NULL on failure).
+ * Output visitors expect *@obj to be properly initialized on entry.
+ *
+ * Additionally, all qapi structs have a generated function compatible
+ * with this:
+ *
+ * void qapi_free_FOO(void *obj);
+ *
+ * which behaves like free(), even if @obj is NULL or was only partially
+ * allocated before encountering an error.
+ */
+
 /* This struct is layout-compatible with all other *List structs
  * created by the qapi generator. */
 typedef struct GenericList
@@ -62,11 +83,12 @@ typedef struct GenericList
  * Set *@errp on failure; for example, if the input stream does not
  * have a member @name or if the member is not an object.
  *
- * FIXME: For input visitors, *@obj can be assigned here even if later
- * visits will fail; this can lead to memory leaks if clients aren't
- * careful.
+ * Returns true if *@obj