Re: [PATCH 28/52] migration/rdma: Check negative error values the same way everywhere

2023-10-05 Thread Markus Armbruster
Peter Xu  writes:

> On Thu, Oct 05, 2023 at 07:45:00AM +0200, Markus Armbruster wrote:
>> Peter Xu  writes:
>> 
>> > Sorry Zhijian, I missed this email.
>> >
>> > On Wed, Oct 04, 2023 at 06:32:14PM +0200, Juan Quintela wrote:
>> >> > * Avoid non-negative integer error values.
>> >
>> > Perhaps we need to forbid that if doing this.
>> >
>> > I can see Zhijian's point, where "if (ret)" can also capture unexpected
>> > positive returns, while "if (ret < 0)" is not clear on who's handling ret>0
>> > case.  Personally I like that, too.
>> 
>> It's clear either way :)
>> 
>> The problem is calling a function whose contract specifies "return 0 on
>> success, negative value on failure".
>> 
>> If it returns positive value, the contract is broken, and all bets are
>> off.
>> 
>> If you check the return value like
>> 
>> if (ret < 0) {
>> ... handle error and fail ...
>> }
>> ... carry on ...
>> 
>> then an unexpected positive value will clearly be treated as success.
>> 
>> If you check it like
>> 
>> if (ret) {
>> ... handle error and fail ...
>> }
>> ... carry on ...
>> 
>> then it will clearly be treated as failure.
>> 
>> But we don't know what it is!  Treating it as success can be wrong,
>> treating it as failure can be just as wrong.
>
> Right, IMHO the major difference is when there's a bug in the retval
> protocl of the API we're invoking.
>
> With "if (ret)" we capture that protocol bug, treating it as a failure (of
> that buggy API). With "if (ret<0)" we don't yet capture it, either
> everything will just keep working, or something weird happens later.  Not
> so predictable in this case.

I don't think misinterpreting a violation of the contract as failure is
safer than misinterpreting it as success.

Where we have reason to worry about contract violations, we should
assert() it's not violated.




Re: [PATCH 28/52] migration/rdma: Check negative error values the same way everywhere

2023-10-05 Thread Peter Xu
On Thu, Oct 05, 2023 at 07:45:00AM +0200, Markus Armbruster wrote:
> Peter Xu  writes:
> 
> > Sorry Zhijian, I missed this email.
> >
> > On Wed, Oct 04, 2023 at 06:32:14PM +0200, Juan Quintela wrote:
> >> > * Avoid non-negative integer error values.
> >
> > Perhaps we need to forbid that if doing this.
> >
> > I can see Zhijian's point, where "if (ret)" can also capture unexpected
> > positive returns, while "if (ret < 0)" is not clear on who's handling ret>0
> > case.  Personally I like that, too.
> 
> It's clear either way :)
> 
> The problem is calling a function whose contract specifies "return 0 on
> success, negative value on failure".
> 
> If it returns positive value, the contract is broken, and all bets are
> off.
> 
> If you check the return value like
> 
> if (ret < 0) {
> ... handle error and fail ...
> }
> ... carry on ...
> 
> then an unexpected positive value will clearly be treated as success.
> 
> If you check it like
> 
> if (ret) {
> ... handle error and fail ...
> }
> ... carry on ...
> 
> then it will clearly be treated as failure.
> 
> But we don't know what it is!  Treating it as success can be wrong,
> treating it as failure can be just as wrong.

Right, IMHO the major difference is when there's a bug in the retval
protocl of the API we're invoking.

With "if (ret)" we capture that protocol bug, treating it as a failure (of
that buggy API). With "if (ret<0)" we don't yet capture it, either
everything will just keep working, or something weird happens later.  Not
so predictable in this case.

Thanks,

-- 
Peter Xu




Re: [PATCH 28/52] migration/rdma: Check negative error values the same way everywhere

2023-10-04 Thread Markus Armbruster
Peter Xu  writes:

> Sorry Zhijian, I missed this email.
>
> On Wed, Oct 04, 2023 at 06:32:14PM +0200, Juan Quintela wrote:
>> > * Avoid non-negative integer error values.
>
> Perhaps we need to forbid that if doing this.
>
> I can see Zhijian's point, where "if (ret)" can also capture unexpected
> positive returns, while "if (ret < 0)" is not clear on who's handling ret>0
> case.  Personally I like that, too.

It's clear either way :)

The problem is calling a function whose contract specifies "return 0 on
success, negative value on failure".

If it returns positive value, the contract is broken, and all bets are
off.

If you check the return value like

if (ret < 0) {
... handle error and fail ...
}
... carry on ...

then an unexpected positive value will clearly be treated as success.

If you check it like

if (ret) {
... handle error and fail ...
}
... carry on ...

then it will clearly be treated as failure.

But we don't know what it is!  Treating it as success can be wrong,
treating it as failure can be just as wrong.

>> 3 - I fully agree that code is more maintenable this way.  I.e. if any
>> function changes to return positive values for non error, we get
>> better coverage.
>
> This patch does at least try to unify error handling, though..
>
> $ git grep "ret < 0" migration/rdma.c | wc -l
> 36
>
> So we have half doing this and half doing that, makes sense to do the same.
>
> Let's assume no vote from my side due to pros and cons, so it's 2:1. :)
>
> Thanks,

Since I'm not a maintainer here, my vote is advisory.




Re: [PATCH 28/52] migration/rdma: Check negative error values the same way everywhere

2023-10-04 Thread Peter Xu
Sorry Zhijian, I missed this email.

On Wed, Oct 04, 2023 at 06:32:14PM +0200, Juan Quintela wrote:
> > * Avoid non-negative integer error values.

Perhaps we need to forbid that if doing this.

I can see Zhijian's point, where "if (ret)" can also capture unexpected
positive returns, while "if (ret < 0)" is not clear on who's handling ret>0
case.  Personally I like that, too.

> 3 - I fully agree that code is more maintenable this way.  I.e. if any
> function changes to return positive values for non error, we get
> better coverage.

This patch does at least try to unify error handling, though..

$ git grep "ret < 0" migration/rdma.c | wc -l
36

So we have half doing this and half doing that, makes sense to do the same.

Let's assume no vote from my side due to pros and cons, so it's 2:1. :)

Thanks,

-- 
Peter Xu




Re: [PATCH 28/52] migration/rdma: Check negative error values the same way everywhere

2023-10-04 Thread Juan Quintela
Markus Armbruster  wrote:
> "Zhijian Li (Fujitsu)"  writes:
>
>> On 18/09/2023 22:41, Markus Armbruster wrote:
>>> When a function returns 0 on success, negative value on error,
>>> checking for non-zero suffices, but checking for negative is clearer.
>>> So do that.
>>> 
>>
>> This patch is no my favor convention.
>
> Certainly a matter of taste, which means maintainers get to decide, not
> me.
>
> Failure checks can be confusing in C.  Is
>
> if (foo(...))
>
> checking for success or for failure?  Impossible to tell.  If foo()
> returns a pointer, it almost certainly checks for success.  If it
> returns bool, likewise.  But if it returns an integer, it probably
> checks for failure.
>
> Getting a condition backwards is a common coding mistake.  Consider
> patch review of
>
> if (condition) {
> obviously the error case
> }
>
> Patch review is more likely to catch a backward condition when the
> condition's sense is locally obvious.
>
> Conventions can help.  Here's the one I like:
>
> * Check for a function's failure the same way everywhere.
>
> * When a function returns something "truthy" on success, and something
>   "falsy" on failure, use
>
> if (!fun(...))
>
>   Special cases:
>
>   - bool true on success, false on failure
>
>   - non-null pointer on success, null pointer on failure
>
> * When a function returns non-negative value on success, negative value
>   on failure, use
>
> if (fun(...) < 0)
>
> * Avoid non-negative integer error values.
>
> * Avoid if (fun(...)), because it's locally ambiguous.
>
>> @Peter, Juan
>>
>> I'd like to hear your opinions.
>
> Yes, please.

I agree with Markus here for three reasons:

1 - He is my C - lawyer of reference O-)

2 - He has done a lot of work cleaning the error handling on this file,
that was completely ugly.

3 - I fully agree that code is more maintenable this way.  I.e. if any
function changes to return positive values for non error, we get
better coverage.

Later, Juan.




Re: [PATCH 28/52] migration/rdma: Check negative error values the same way everywhere

2023-09-25 Thread Markus Armbruster
"Zhijian Li (Fujitsu)"  writes:

> On 18/09/2023 22:41, Markus Armbruster wrote:
>> When a function returns 0 on success, negative value on error,
>> checking for non-zero suffices, but checking for negative is clearer.
>> So do that.
>> 
>
> This patch is no my favor convention.

Certainly a matter of taste, which means maintainers get to decide, not
me.

Failure checks can be confusing in C.  Is

if (foo(...))

checking for success or for failure?  Impossible to tell.  If foo()
returns a pointer, it almost certainly checks for success.  If it
returns bool, likewise.  But if it returns an integer, it probably
checks for failure.

Getting a condition backwards is a common coding mistake.  Consider
patch review of

if (condition) {
obviously the error case
}

Patch review is more likely to catch a backward condition when the
condition's sense is locally obvious.

Conventions can help.  Here's the one I like:

* Check for a function's failure the same way everywhere.

* When a function returns something "truthy" on success, and something
  "falsy" on failure, use

if (!fun(...))

  Special cases:

  - bool true on success, false on failure

  - non-null pointer on success, null pointer on failure

* When a function returns non-negative value on success, negative value
  on failure, use

if (fun(...) < 0)

* Avoid non-negative integer error values.

* Avoid if (fun(...)), because it's locally ambiguous.

> @Peter, Juan
>
> I'd like to hear your opinions.

Yes, please.




Re: [PATCH 28/52] migration/rdma: Check negative error values the same way everywhere

2023-09-25 Thread Zhijian Li (Fujitsu)


On 18/09/2023 22:41, Markus Armbruster wrote:
> When a function returns 0 on success, negative value on error,
> checking for non-zero suffices, but checking for negative is clearer.
> So do that.
> 

This patch is no my favor convention.

@Peter, Juan

I'd like to hear your opinions.

Thanks
Zhijian


> Signed-off-by: Markus Armbruster 
> ---
>   migration/rdma.c | 82 
>   1 file changed, 41 insertions(+), 41 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 62d95b7d2c..6c643a1b30 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -947,7 +947,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>   
>   /* create CM id */
>   ret = rdma_create_id(rdma->channel, >cm_id, NULL, RDMA_PS_TCP);
> -if (ret) {
> +if (ret < 0) {
>   ERROR(errp, "could not create channel id");
>   goto err_resolve_create_id;
>   }
> @@ -968,10 +968,10 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>   
>   ret = rdma_resolve_addr(rdma->cm_id, NULL, e->ai_dst_addr,
>   RDMA_RESOLVE_TIMEOUT_MS);
> -if (!ret) {
> +if (ret >= 0) {
>   if (e->ai_family == AF_INET6) {
>   ret = qemu_rdma_broken_ipv6_kernel(rdma->cm_id->verbs, 
> errp);
> -if (ret) {
> +if (ret < 0) {
>   continue;
>   }
>   }
> @@ -988,7 +988,7 @@ route:
>   qemu_rdma_dump_gid("source_resolve_addr", rdma->cm_id);
>   
>   ret = rdma_get_cm_event(rdma->channel, _event);
> -if (ret) {
> +if (ret < 0) {
>   ERROR(errp, "could not perform event_addr_resolved");
>   goto err_resolve_get_addr;
>   }
> @@ -1004,13 +1004,13 @@ route:
>   
>   /* resolve route */
>   ret = rdma_resolve_route(rdma->cm_id, RDMA_RESOLVE_TIMEOUT_MS);
> -if (ret) {
> +if (ret < 0) {
>   ERROR(errp, "could not resolve rdma route");
>   goto err_resolve_get_addr;
>   }
>   
>   ret = rdma_get_cm_event(rdma->channel, _event);
> -if (ret) {
> +if (ret < 0) {
>   ERROR(errp, "could not perform event_route_resolved");
>   goto err_resolve_get_addr;
>   }
> @@ -1118,7 +1118,7 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
>   attr.qp_type = IBV_QPT_RC;
>   
>   ret = rdma_create_qp(rdma->cm_id, rdma->pd, );
> -if (ret) {
> +if (ret < 0) {
>   return -1;
>   }
>   
> @@ -1551,7 +1551,7 @@ static int qemu_rdma_wait_comp_channel(RDMAContext 
> *rdma,
>   
>   if (pfds[1].revents) {
>   ret = rdma_get_cm_event(rdma->channel, _event);
> -if (ret) {
> +if (ret < 0) {
>   error_report("failed to get cm event while wait "
>"completion channel");
>   return -1;
> @@ -1652,12 +1652,12 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma,
>   
>   while (1) {
>   ret = qemu_rdma_wait_comp_channel(rdma, ch);
> -if (ret) {
> +if (ret < 0) {
>   goto err_block_for_wrid;
>   }
>   
>   ret = ibv_get_cq_event(ch, , _ctx);
> -if (ret) {
> +if (ret < 0) {
>   perror("ibv_get_cq_event");
>   goto err_block_for_wrid;
>   }
> @@ -1888,7 +1888,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, 
> RDMAControlHeader *head,
>*/
>   if (resp) {
>   ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_DATA);
> -if (ret) {
> +if (ret < 0) {
>   error_report("rdma migration: error posting"
>   " extra control recv for anticipated result!");
>   return -1;
> @@ -1899,7 +1899,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, 
> RDMAControlHeader *head,
>* Post a WR to replace the one we just consumed for the READY message.
>*/
>   ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
> -if (ret) {
> +if (ret < 0) {
>   error_report("rdma migration: error posting first control recv!");
>   return -1;
>   }
> @@ -1986,7 +1986,7 @@ static int qemu_rdma_exchange_recv(RDMAContext *rdma, 
> RDMAControlHeader *head,
>* Post a new RECV work request to replace the one we just consumed.
>*/
>   ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
> -if (ret) {
> +if (ret < 0) {
>   error_report("rdma migration: error posting second control recv!");
>   return -1;
>   }
> @@ -2311,7 +2311,7 @@ static int qemu_rdma_write(QEMUFile *f, RDMAContext 
> *rdma,
>   /* If we cannot merge it, we flush the current buffer first. */
>   if (!qemu_rdma_buffer_mergable(rdma, current_addr, len)) {
>   ret = qemu_rdma_write_flush(f, rdma);