Re: [dm-devel] [PATCH] hex2bin: make the function hex_to_bin constant-time

2022-04-27 Thread David Laight
From: Linus Torvalds
> Sent: 24 April 2022 22:42
> 
> On Sun, Apr 24, 2022 at 2:37 PM Linus Torvalds
>  wrote:
> >
> > Finally, for the same reason - please don't use ">> 8".  Because I do
> > not believe that bit 8 is well-defined in your arithmetic. The *sign*
> > bit will be, but I'm not convinced bit 8 is.
> 
> Hmm.. I think it's ok. It can indeed overflow in 'char' and change the
> sign in bit #7, but I suspect bit #8 is always fine.
> 
> Still, if you want to just extend the sign bit, ">> 31" _is_ the
> obvious thing to use (yeah, yeah, properly "sizeof(int)*8-1" or
> whatever, you get my drift).

Except that right shifts of signed values are UB.
In particular it has always been valid to do an unsigned
shift right on a 2's complement negative number.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH] hex2bin: make the function hex_to_bin constant-time

2022-04-27 Thread David Laight
From: Mikulas Patocka
> Sent: 25 April 2022 12:04
> 
> On Mon, 25 Apr 2022, David Laight wrote:
> 
> > From: Linus Torvalds
> > > Sent: 24 April 2022 22:42
> > >
> > > On Sun, Apr 24, 2022 at 2:37 PM Linus Torvalds
> > >  wrote:
> > > >
> > > > Finally, for the same reason - please don't use ">> 8".  Because I do
> > > > not believe that bit 8 is well-defined in your arithmetic. The *sign*
> > > > bit will be, but I'm not convinced bit 8 is.
> > >
> > > Hmm.. I think it's ok. It can indeed overflow in 'char' and change the
> > > sign in bit #7, but I suspect bit #8 is always fine.
> > >
> > > Still, if you want to just extend the sign bit, ">> 31" _is_ the
> > > obvious thing to use (yeah, yeah, properly "sizeof(int)*8-1" or
> > > whatever, you get my drift).
> >
> > Except that right shifts of signed values are UB.
> > In particular it has always been valid to do an unsigned
> > shift right on a 2's complement negative number.
> >
> > David
> 
> Yes. All the standard versions (C89, C99, C11, C2X) say that right shift
> of a negative value is implementation-defined.
> 
> So, we should cast it to "unsigned" before shifting it.

Except that the intent appears to be to replicate the sign bit.

If it is 'implementation defined' (rather than suddenly being UB)
it might be that the linux kernel requires sign propagating
right shifts of negative values.
This is typically what happens on 2's complement systems.
But not all small CPUs have the required shift instruction.
OTOH all the ones big enough to run Linux probably do.
(And gcc doesn't support '1's complement' or 'sign overpunch' CPUs.)

The problem is that the compiler writers seem to be entering
a mindset where they are optimising code based on UB behaviour.
So given:
void foo(int x)
{
	if (x >> 1 < 0)
		return;
	do_something();
}
they decide the test is UB, so can always be assumed to be true
and thus do_something() is compiled away.

David




Re: [dm-devel] [PATCH] hex2bin: make the function hex_to_bin constant-time

2022-04-25 Thread Mikulas Patocka



On Mon, 25 Apr 2022, David Laight wrote:

> From: Mikulas Patocka
> > Sent: 25 April 2022 12:04
> > 
> > On Mon, 25 Apr 2022, David Laight wrote:
> > 
> > > From: Linus Torvalds
> > > > Sent: 24 April 2022 22:42
> > > >
> > > > On Sun, Apr 24, 2022 at 2:37 PM Linus Torvalds
> > > >  wrote:
> > > > >
> > > > > Finally, for the same reason - please don't use ">> 8".  Because I do
> > > > > not believe that bit 8 is well-defined in your arithmetic. The *sign*
> > > > > bit will be, but I'm not convinced bit 8 is.
> > > >
> > > > Hmm.. I think it's ok. It can indeed overflow in 'char' and change the
> > > > sign in bit #7, but I suspect bit #8 is always fine.
> > > >
> > > > Still, if you want to just extend the sign bit, ">> 31" _is_ the
> > > > obvious thing to use (yeah, yeah, properly "sizeof(int)*8-1" or
> > > > whatever, you get my drift).
> > >
> > > Except that right shifts of signed values are UB.
> > > In particular it has always been valid to do an unsigned
> > > shift right on a 2's complement negative number.
> > >
> > >   David
> > 
> > Yes. All the standard versions (C89, C99, C11, C2X) say that right shift
> > of a negative value is implementation-defined.
> > 
> > So, we should cast it to "unsigned" before shifting it.
> 
> Except that the intent appears to be to replicate the sign bit.
> 
> If it is 'implementation defined' (rather than suddenly being UB)

The standard says "If E1 has a signed type and a negative value, the 
resulting value is implementation-defined."

So, it's not undefined behavior.

> it might be that the linux kernel requires sign propagating
> right shifts of negative values.

It may be that some code in the Linux kernel already assumes that right 
shifts keep the sign. It's hard to say if such code exists.

BTW. ubsan warns about left shift of negative values, but it doesn't warn 
about right shift of negative values.

> This is typically what happens on 2's complement systems.
> But not all small CPUs have the required shift instruction.
> OTOH all the ones big enough to run Linux probably do.
> (And gcc doesn't support '1's complement' or 'sign overpunch' CPUs.)
> 
> The problem is that the compiler writers seem to be entering
> a mindset where they are optimising code based on UB behaviour.
> So given:
> void foo(int x)
> {
>	if (x >> 1 < 0)
>		return;
>	do_something();
> }
> they decide the test is UB, so can always be assumed to be true
> and thus do_something() is compiled away.
> 
>   David

If it's implementation-defined (rather than undefined), the compiler 
shouldn't do such an optimization.

The linux kernel uses "-fno-strict-overflow" which disables some of these 
UB optimizations.

Mikulas



Re: [dm-devel] [PATCH] hex2bin: make the function hex_to_bin constant-time

2022-04-25 Thread Mikulas Patocka



On Mon, 25 Apr 2022, David Laight wrote:

> From: Linus Torvalds
> > Sent: 24 April 2022 22:42
> > 
> > On Sun, Apr 24, 2022 at 2:37 PM Linus Torvalds
> >  wrote:
> > >
> > > Finally, for the same reason - please don't use ">> 8".  Because I do
> > > not believe that bit 8 is well-defined in your arithmetic. The *sign*
> > > bit will be, but I'm not convinced bit 8 is.
> > 
> > Hmm.. I think it's ok. It can indeed overflow in 'char' and change the
> > sign in bit #7, but I suspect bit #8 is always fine.
> > 
> > Still, if you want to just extend the sign bit, ">> 31" _is_ the
> > obvious thing to use (yeah, yeah, properly "sizeof(int)*8-1" or
> > whatever, you get my drift).
> 
> Except that right shifts of signed values are UB.
> In particular it has always been valid to do an unsigned
> shift right on a 2's complement negative number.
> 
>   David

Yes. All the standard versions (C89, C99, C11, C2X) say that right shift 
of a negative value is implementation-defined.

So, we should cast it to "unsigned" before shifting it.

Mikulas
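
One way the "cast to unsigned before shifting" idea could look in isolation is sketched below. The helper name sign_mask_unsigned is hypothetical and does not appear in the patch; the sketch also assumes that unsigned int has no padding bits, so that sizeof(int) * CHAR_BIT is its width. It produces an all-ones mask for negative inputs without ever right-shifting a signed value.

```c
#include <limits.h>

/*
 * Hypothetical helper, not taken from the patch: returns all-ones when x
 * is negative and zero otherwise, without right-shifting a signed value.
 * Assumes unsigned int has no padding bits.
 */
static inline unsigned int sign_mask_unsigned(int x)
{
	/*
	 * Conversion to unsigned is defined modulo 2^N, so the top bit of
	 * the converted value is set exactly when x is negative.
	 */
	unsigned int top = (unsigned int)x >> (sizeof(int) * CHAR_BIT - 1);

	/* top is 0 or 1; unsigned subtraction wraps 0 - 1 to all-ones. */
	return 0u - top;
}
```

The result is unsigned, so any value ANDed with it is computed in unsigned arithmetic; a caller that needs an int back has to convert explicitly.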



Re: [dm-devel] [PATCH] hex2bin: make the function hex_to_bin constant-time

2022-04-24 Thread Linus Torvalds
On Sun, Apr 24, 2022 at 2:37 PM Linus Torvalds
 wrote:
>
> Finally, for the same reason - please don't use ">> 8".  Because I do
> not believe that bit 8 is well-defined in your arithmetic. The *sign*
> bit will be, but I'm not convinced bit 8 is.

Hmm.. I think it's ok. It can indeed overflow in 'char' and change the
sign in bit #7, but I suspect bit #8 is always fine.

Still, if you want to just extend the sign bit, ">> 31" _is_ the
obvious thing to use (yeah, yeah, properly "sizeof(int)*8-1" or
whatever, you get my drift).

   Linus
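
A minimal sketch of the "extend the sign bit" shift with the width spelled out rather than hard-coded as 31. The helper name sign_mask_arith is hypothetical. It assumes the compiler implements ">>" on a negative int as an arithmetic (sign-propagating) shift, which the C standard leaves implementation-defined, and that int has no padding bits.

```c
#include <limits.h>

/*
 * Hypothetical helper: replicates the sign bit of x across the whole int,
 * giving -1 for negative x and 0 otherwise.
 * Assumption: >> on a negative int is an arithmetic shift on this compiler.
 */
static inline int sign_mask_arith(int x)
{
	return x >> (sizeof(int) * CHAR_BIT - 1);
}
```

Shifting by the full width minus one leaves only copies of the sign bit, so the result does not depend on what the intermediate bits (such as bit 8) happen to contain.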




Re: [dm-devel] [PATCH] hex2bin: make the function hex_to_bin constant-time

2022-04-24 Thread Linus Torvalds
On Sun, Apr 24, 2022 at 1:54 PM Mikulas Patocka  wrote:
>
> + *
> + * Explanation of the logic:
> + * (ch - '9' - 1) is negative if ch <= '9'
> + * ('0' - 1 - ch) is negative if ch >= '0'

True, but...

Please, just to make me happier, make the sign of 'ch' be something
very explicit. Right now that code uses 'char ch', which could be
signed or unsigned.

It doesn't really matter in this case, since the arithmetic will be
done in 'int', and as long as 'int' is larger than 'char' as a type
(to be really nit-picky), it all ends up working ok regardless.

But just to make me happier, and to make the algorithm actually do the
_same_ thing on every architecture, please use an explicit signedness
for that 'ch' type.

Because then that 'ch >= X' is well-defined.

Again - your code _works_. That's not what I worry about. But when
playing these kinds of tricks, please make it have the same behavior
across architectures, not just "the end result will be the same
regardless".

Yes, a 'ch' with the high bit set will always be either >= '0' or <=
'9', but note how *which* one it is depends on the exact type, and
'char' is simply not well-defined.

Finally, for the same reason - please don't use ">> 8".  Because I do
not believe that bit 8 is well-defined in your arithmetic. The *sign*
bit will be, but I'm not convinced bit 8 is.

So use ">> 31" or similar.

Also, I do worry that this is *exactly* the kind of trick that a
compiler could easily turn back into a conditional. Usually compilers
tend to go the other way (ie turn conditionals into arithmetic if
possible), but..

Linus
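
The point about 'char' signedness can be illustrated with a small sketch. Both helper names are hypothetical and not part of the patch. They evaluate the same expression from the patch comment, (ch - '9' - 1), once with a signed and once with an unsigned character type.

```c
/*
 * Hypothetical helpers, not part of the patch. Each returns 1 when
 * (ch - '9' - 1) is negative, which is the patch's test for ch <= '9'.
 */
static int le_nine_signed(signed char ch)
{
	/* ch is promoted to int; a high-bit byte arrives as a negative value. */
	return (ch - '9' - 1) < 0;
}

static int le_nine_unsigned(unsigned char ch)
{
	/* ch is promoted to int; a high-bit byte arrives as 128..255. */
	return (ch - '9' - 1) < 0;
}
```

For the byte pattern 0x80, the signed variant sees -128 and reports "ch <= '9'", while the unsigned variant sees 128 and does not. The final hex_to_bin result is the same either way, but the intermediate comparison that fires differs, which is the behaviour an explicit signedness pins down.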




Re: [dm-devel] [PATCH] hex2bin: make the function hex_to_bin constant-time

2022-04-24 Thread Joe Perches
On Sun, 2022-04-24 at 16:54 -0400, Mikulas Patocka wrote:
> This patch changes the function hex_to_bin so that it contains no branches
> and no memory accesses.
[]
> +++ linux-2.6/lib/hexdump.c   2022-04-24 18:51:20.0 +0200
[]
> + * the next line is similar to the previous one, but we need to decode both
> + *   uppercase and lowercase letters, so we use (ch & 0xdf), which converts
> + *   lowercase to uppercase
>   */
>  int hex_to_bin(char ch)
>  {
> -	if ((ch >= '0') && (ch <= '9'))
> -		return ch - '0';
> -	ch = tolower(ch);
> -	if ((ch >= 'a') && (ch <= 'f'))
> -		return ch - 'a' + 10;
> -	return -1;
> +	return -1 +
> +		((ch - '0' + 1) & (((ch - '9' - 1) & ('0' - 1 - ch)) >> 8)) +
> +		(((ch & 0xdf) - 'A' + 11) & ((((ch & 0xdf) - 'F' - 1) & ('A' - 1 - (ch & 0xdf))) >> 8));

probably easier to read using a temporary for ch & 0xdf

int CH = ch & 0xdf;

return -1 +
   ((ch - '0' +  1) & (((ch - '9' - 1) & ('0' - 1 - ch)) >> 8)) +
   ((CH - 'A' + 11) & (((CH - 'F' - 1) & ('A' - 1 - CH)) >> 8));
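
For anyone who wants to check the temporary-variable form outside the kernel, here is a userspace sketch that compares it against the old branching code. The names hex_to_bin_ref, hex_to_bin_tmp and count_mismatches are hypothetical, and the parameter is declared 'unsigned char' here as an assumption (the quoted patch uses plain 'char'). The ">> 8" still relies on an arithmetic right shift of negative ints, which is implementation-defined.

```c
#include <ctype.h>

/* Reference: the branching logic that the patch removes. */
static int hex_to_bin_ref(unsigned char ch)
{
	if (ch >= '0' && ch <= '9')
		return ch - '0';
	ch = (unsigned char)tolower(ch);
	if (ch >= 'a' && ch <= 'f')
		return ch - 'a' + 10;
	return -1;
}

/*
 * Branchless form using a temporary for (ch & 0xdf).
 * Assumption: >> on a negative int propagates the sign bit.
 */
static int hex_to_bin_tmp(unsigned char ch)
{
	int CH = ch & 0xdf;	/* clears bit 5, folding 'a'..'f' onto 'A'..'F' */

	return -1 +
		((ch - '0' +  1) & (((ch - '9' - 1) & ('0' - 1 - ch)) >> 8)) +
		((CH - 'A' + 11) & (((CH - 'F' - 1) & ('A' - 1 - CH)) >> 8));
}

/* Counts the byte values 0..255 on which the two versions disagree. */
static int count_mismatches(void)
{
	int bad = 0;

	for (int c = 0; c < 256; c++)
		if (hex_to_bin_ref((unsigned char)c) !=
		    hex_to_bin_tmp((unsigned char)c))
			bad++;
	return bad;
}
```

Calling count_mismatches() walks every possible byte, so a return value of zero means the two versions agree on the whole input range for the compiler in use.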

