warning in crypto_wait_for_test+0x84/0x88

2017-01-13 Thread Daniel Borkmann

Hello,

while booting the latest -net.git kernel on an s390x test machine, I hit
the warning below; reporting it just in case it was not reported yet:

[...]
[3.317031] qeth 0.0.1000: Outbound TSO not supported on eth0
[3.317123] qeth 0.0.1000: MAC address 02:a1:11:0e:bf:f0 successfully 
registered on device eth0
[3.317127] qeth 0.0.1000: Device is a Virtual NIC QDIO card (level: V633)
with link type Virt.NIC QDIO.
[3.364637] qeth 0.0.1000 enccw0.0.1000: renamed from eth0
[3.374169] Adding 501740k swap on /dev/dasda3.  Priority:-1 extents:1 
across:501740k SSFS
[3.436378] EXT4-fs (dasdb1): mounted filesystem with ordered data mode. 
Opts: (null)
[3.459612] EXT4-fs (dasda1): mounted filesystem with ordered data mode. 
Opts: (null)
[   36.777259] random: crng init done
[  123.421672] device-mapper: uevent: version 1.0.3
[  123.421747] device-mapper: ioctl: 4.35.0-ioctl (2016-06-23) initialised: 
dm-de...@redhat.com
[  124.163728] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[  184.741563] [ cut here ]
[  184.741579] WARNING: CPU: 1 PID: 537 at crypto/algapi.c:348 
crypto_wait_for_test+0x84/0x88
[  184.741581] Modules linked in: ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter dm_mirror 
dm_region_hash dm_log dm_mod xts gf128mul aes_s390(+) des_s390 des_generic 
sha512_s390 qeth_l2 vmur nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables 
ext4 jbd2 mbcache dasd_eckd_mod dasd_mod qeth ccwgroup qdio
[  184.741598] CPU: 1 PID: 537 Comm: systemd-udevd Not tainted 4.10.0-rc3+ #4
[  184.741601] Hardware name: IBM 2964 NE1 716 (z/VM)
[  184.741605] task: f544e200 task.stack: f419c000
[  184.741608] Krnl PSW : 0704f0018000 003cc484 
(crypto_wait_for_test+0x84/0x88)
[  184.741613]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 
RI:0 EA:3
[  184.741616] Krnl GPRS: 00058e00 fe00 0001 
0001
[  184.741619]006863e0 f3f00800  
03ff80460850
[  184.741633]03ff80460650 03ff80460618 03ff80460980 
f419fe88
[  184.741634] f4e40c00 003cc45a 
f419fb60
[  184.741644] Krnl Code: 003cc476: f073a7f40001   srp  2036(8,%r10),1,3
                          003cc47c: a7f4fff6       brc  15,3cc468
                         #003cc480: a7f40001       brc  15,3cc482
                         >003cc484: a7f4fff2       brc  15,3cc468
                          003cc488: ebdff0800024   stmg %r13,%r15,128(%r15)
                          003cc48e: a7f13fc0       tmll %r15,16320
                          003cc492: b90400ef       lgr  %r14,%r15
                          003cc496: e3f0ffd0ff71   lay  %r15,-48(%r15)
[  184.741658] Call Trace:
[  184.741659] ([<003cc45a>] crypto_wait_for_test+0x5a/0x88)
[  184.741661]  [<003cc7ee>] crypto_register_alg+0x76/0x88
[  184.741712]  [<03ff8045e490>] aes_s390_register_alg+0x28/0xb98 [aes_s390]
[  184.741722]  [<03ff80463116>] cpu_feature_match_MSA_init+0x116/0x1000 
[aes_s390]
[  184.741724]  [<00100276>] do_one_initcall+0x46/0x148
[  184.741729]  [<0024e264>] do_init_module+0x74/0x218
[  184.741735]  [<001cc586>] load_module+0x141e/0x1888
[  184.741737]  [<001ccbd0>] SyS_finit_module+0x98/0xd0
[  184.741741]  [<0068a1e6>] system_call+0xd6/0x25c
[  184.741742] Last Breaking-Event-Address:
[  184.741743]  [<003cc480>] crypto_wait_for_test+0x80/0x88
[  184.741744] ---[ end trace bf4d899231b60bca ]---
[  184.741765] [ cut here ]
[  184.741767] WARNING: CPU: 1 PID: 537 at crypto/algapi.c:342 
crypto_wait_for_test+0x7c/0x88
[  184.741768] Modules linked in: ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter dm_mirror 
dm_region_hash dm_log dm_mod xts gf128mul aes_s390(+) des_s390 des_generic 
sha512_s390 qeth_l2 vmur nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables 
ext4 jbd2 mbcache dasd_eckd_mod dasd_mod qeth ccwgroup qdio
[  184.741775] CPU: 1 PID: 537 Comm: systemd-udevd Tainted: G W 4.10.0-rc3+ #4
[  184.741776] Hardware name: IBM 2964 NE1 716 (z/VM)
[  184.741777] task: f544e200 task.stack: f419c000
[  184.741778] Krnl PSW : 0704c0018000 003cc47c 
(crypto_wait_for_test+0x7c/0x88)
[  184.741780]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 
RI:0 EA:3
[  184.741781] Krnl GPRS: 0001 8001 0001 
0002
[  184.741783]003cc432 115e  
03ff80460850
[  184.741784]03ff80460650 03ff80460618 03ff80460980 
f419fe88
[  184.741785] f6f3d000 

Re: [PATCH v2 7/8] net: Rename TCA*BPF_DIGEST to ..._SHA256

2017-01-13 Thread Daniel Borkmann

On 01/11/2017 07:19 PM, Andy Lutomirski wrote:

On Wed, Jan 11, 2017 at 1:09 AM, Daniel Borkmann <dan...@iogearbox.net> wrote:

[...]

Ok, after sleeping over this a bit: how about a general rename to
"prog_tag" for fdinfo, and TCA_BPF_TAG resp. TCA_ACT_BPF_TAG for the
netlink attributes? Fwiw, it might reduce any assumptions being made
about this. If that would be preferable, I could cook up a patch
against -net for the rename.


That would be fine with me.

I think there are two reasonable approaches to computing the actual tag.

1. Use a standard, modern cryptographic hash.  SHA-256, SHA-512,
Blake2b, whatever.  SHA-1 is a bad choice in part because it's partly
broken and in part because the implementation in lib/ is a real mess
to use (as you noticed while writing the code).

2. Use whatever algorithm you like but make the tag so short that it's
obviously not collision-free.  48 or 64 bits is probably reasonable.

The intermediate versions are just asking for trouble.
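As a minimal user-space sketch of option 2 (purely illustrative; it
assumes OpenSSL's SHA256() and is not the in-kernel implementation,
which would go through the crypto API), a deliberately short tag can
be derived like this:

  #include <openssl/sha.h>
  #include <stdint.h>
  #include <string.h>

  /* Hash the instructions with a full SHA-256, then keep only the
   * first 64 bits, so nobody mistakes the tag for a collision-free
   * identifier. */
  static uint64_t prog_tag(const void *insns, size_t len)
  {
          unsigned char md[SHA256_DIGEST_LENGTH];
          uint64_t tag;

          SHA256(insns, len, md);
          memcpy(&tag, md, sizeof(tag));
          return tag;
  }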


Yeah, agreed. I've just sent a patch that reworks this a bit, and it
also turned out reasonably small for net. Cleanups, if needed, can be
done in net-next once that's been pulled in.

Thanks,
Daniel


Re: [PATCH v2 7/8] net: Rename TCA*BPF_DIGEST to ..._SHA256

2017-01-11 Thread Daniel Borkmann

Hi Andy,

On 01/11/2017 04:11 AM, Andy Lutomirski wrote:

On Tue, Jan 10, 2017 at 4:50 PM, Daniel Borkmann <dan...@iogearbox.net> wrote:

On 01/11/2017 12:24 AM, Andy Lutomirski wrote:


This makes it easier to add another digest algorithm down the road if
needed.  It also serves to force any programs that might have been
written against a kernel that had the old field name to notice the
change and make any necessary changes.

This shouldn't violate any stable API policies, as no released kernel
has ever had TCA*BPF_DIGEST.


Imho, this and patch 6/8 are not really needed. Should another digest
alg ever be used (I doubt it), then you'd need a new netlink attribute
and fdinfo line anyway to keep existing stuff intact. Nobody has
claimed that you can just change this underneath without respecting
the ABI for existing applications, yet from the above I read that such
apps will now get "forced" to notice a change.


Fair enough.  I was more concerned about prerelease iproute2 versions,
but maybe that's a nonissue.  I'll drop these two patches.


Ok, after sleeping over this a bit: how about a general rename to
"prog_tag" for fdinfo, and TCA_BPF_TAG resp. TCA_ACT_BPF_TAG for the
netlink attributes? Fwiw, it might reduce any assumptions being made
about this. If that would be preferable, I could cook up a patch
against -net for the rename.

Thanks,
Daniel


Re: [PATCH v2 7/8] net: Rename TCA*BPF_DIGEST to ..._SHA256

2017-01-10 Thread Daniel Borkmann

On 01/11/2017 12:24 AM, Andy Lutomirski wrote:

This makes it easier to add another digest algorithm down the road if
needed.  It also serves to force any programs that might have been
written against a kernel that had the old field name to notice the
change and make any necessary changes.

This shouldn't violate any stable API policies, as no released kernel
has ever had TCA*BPF_DIGEST.


Imho, this and patch 6/8 are not really needed. Should another digest
alg ever be used (I doubt it), then you'd need a new netlink attribute
and fdinfo line anyway to keep existing stuff intact. Nobody has
claimed that you can just change this underneath without respecting
the ABI for existing applications, yet from the above I read that such
apps will now get "forced" to notice a change.


Re: [RFC PATCH 4.10 1/6] crypto/sha256: Refactor the API so it can be used without shash

2016-12-27 Thread Daniel Borkmann

On 12/27/2016 10:58 AM, Herbert Xu wrote:

On Mon, Dec 26, 2016 at 10:08:48AM -0800, Andy Lutomirski wrote:


According to Daniel, the networking folks want to let embedded systems
include BPF without requiring the crypto core.


Last I checked the IPv4 stack depended on the crypto API so this
sounds bogus.


I think there's a bit of a mixup here with what I said. To clarify,
the requirement back then from the tracing folks was that the bpf
engine, and therefore the bpf syscall, can be built without networking
enabled for small devices, so dependencies preferably need to be kept
to an absolute minimum. The same goes for suddenly making it depend on
CRYPTO, or select CRYPTO, for just those few lines that could be
pulled in from lib/ code instead.


Re: [RFC PATCH 4.10 3/6] bpf: Use SHA256 instead of SHA1 for bpf digests

2016-12-24 Thread Daniel Borkmann

On 12/24/2016 03:22 AM, Andy Lutomirski wrote:

BPF digests are intended to be used to avoid reloading programs that
are already loaded.  For use cases (CRIU?) where untrusted programs
are involved, intentional hash collisions could cause the wrong BPF
program to execute.  Additionally, if BPF digests are ever used
in-kernel to skip verification, a hash collision could give privilege
escalation directly.


Just for the record, digests will never ever be used to skip the
verification step, so I don't know why this idea even comes up here
or is part of the changelog. As this will never be done anyway, could
you rather drop that part so we can avoid confusion on this?

Wrt untrusted programs, I don't see much use for this facility for
them in general. Something like a tail call map would quite likely
only be private to the application. And again, I really doubt we'll
have something like user namespace support in the foreseeable future.
That said, I don't really have a big issue if you want to switch to
sha256, though.


SHA1 is no longer considered adequately collision-resistant (see, for
example, all the major browsers dropping support for SHA1
certificates).  Use SHA256 instead.

I moved the digest field to keep all of the bpf program metadata in
the same cache line.

Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Alexei Starovoitov <a...@kernel.org>
Signed-off-by: Andy Lutomirski <l...@kernel.org>




Re: BPF hash algo (Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5)

2016-12-23 Thread Daniel Borkmann

On 12/23/2016 11:59 AM, Hannes Frederic Sowa wrote:

On Fri, 2016-12-23 at 11:04 +0100, Daniel Borkmann wrote:

On 12/22/2016 05:59 PM, Hannes Frederic Sowa wrote:

On Thu, 2016-12-22 at 08:07 -0800, Andy Lutomirski wrote:

[...]

The hashing is not proper sha1 either, unfortunately. I think that
is why it will have a custom implementation in iproute2?


Still trying to catch up on this admittedly somewhat confusing thread.
I did run automated tests over a couple of days comparing the data I
got from fdinfo with the one from af_alg and found no mismatch on test
cases varying from the minimum to the maximum possible program size.
In the process of testing, as you might have seen on netdev, I found a
couple of other bugs in the bpf code along the way and fixed them up
as well. So my question: do you, Andy, or anyone participating in
claiming this have any concrete data or test cases that suggest
something different? If yes, I'm very curious to hear about it and
willing to fix it up, of course. When I'm back from pto I'll prep and
cook up my test suite to be included into selftests/bpf/; I should
have done this initially, sorry about that. I'll also post something
to expose the alg, that sounds fine to me.


Looking into your code more closely, I noticed that you indeed seem to
do the finalization of sha-1 by hand, by aligning and padding the
buffer accordingly and also patching in the necessary payload length.

Apologies from my side for claiming that this is not correct sha1
output; I was only looking at sha_transform and its implementation,
couldn't see the padding and finalization round with the data length
embedded in there, and hadn't thought of it being done manually.

Anyway, would it be difficult to get the sha finalization into some
common code library? It is not very bpf specific, and crypto code
reviewers won't find it there at all.
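
For context, the manual finalization being discussed boils down to
roughly the following sketch against the old lib/sha1 interface
(sha_init()/sha_transform() from linux/cryptohash.h); the function
name and buffer layout here are illustrative, not the actual bpf code:

  #include <linux/cryptohash.h>
  #include <linux/string.h>

  /* Caller has run sha_init(digest); buf holds the message and has at
   * least 72 spare bytes after len for the padding. Appends the 0x80
   * terminator, zero-pads to 56 mod 64, patches in the big-endian bit
   * count, then runs every 64-byte block through sha_transform(). */
  static void sha1_pad_and_run(u32 digest[SHA_DIGEST_WORDS], u8 *buf,
                               size_t len)
  {
          u32 ws[SHA_WORKSPACE_WORDS];
          u64 bits = (u64)len << 3;
          size_t pad = (len % 64 < 56 ? 56 : 120) - (len % 64);
          size_t i;

          buf[len] = 0x80;
          memset(buf + len + 1, 0, pad - 1);
          for (i = 0; i < 8; i++)
                  buf[len + pad + i] = bits >> (56 - 8 * i);

          for (i = 0; i < len + pad + 8; i += 64)
                  sha_transform(digest, (const char *)(buf + i), ws);
  }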


Yes, sure, I'll rework it that way (early next year when I'm back if
that's fine with you).

Thanks,
Daniel


Re: BPF hash algo (Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5)

2016-12-23 Thread Daniel Borkmann

On 12/22/2016 06:25 PM, Andy Lutomirski wrote:

On Thu, Dec 22, 2016 at 8:59 AM, Hannes Frederic Sowa

[...]

I wondered if bpf program loading should have used the module loading
infrastructure from the beginning...


That would be way too complicated and would be nasty for the unprivileged cases.


Also, there are users, be it privileged or not, that don't require a
full object loader from user space, but are fine with just hard-coding
parts or all of the insns in their application. Back then we settled
on using fds based on Andy's suggestion; it has both ups and downs as
we saw along the way, but has worked okay thus far.


Re: BPF hash algo (Re: [kernel-hardening] Re: [PATCH v7 3/6] random: use SipHash in place of MD5)

2016-12-23 Thread Daniel Borkmann

On 12/22/2016 05:59 PM, Hannes Frederic Sowa wrote:

On Thu, 2016-12-22 at 08:07 -0800, Andy Lutomirski wrote:

On Thu, Dec 22, 2016 at 7:51 AM, Hannes Frederic Sowa
<han...@stressinduktion.org> wrote:

On Thu, 2016-12-22 at 16:41 +0100, Jason A. Donenfeld wrote:

On Thu, Dec 22, 2016 at 4:33 PM, Hannes Frederic Sowa
<han...@stressinduktion.org> wrote:

IPv6 you cannot touch anymore. The hashing algorithm is part of uAPI.
You don't want to give people new IPv6 addresses with the same stable
secret (across reboots) after a kernel upgrade. Maybe they lose
connectivity then and it is extra work?


Ahh, too bad. So it goes.


If no other users survive we can put it into the ipv6 module.


The bpf hash stuff can be changed during this merge window, as it is
not yet in a released kernel. Albeit I would probably have preferred
something like sha256 here, which can be easily replicated by user
space tools (minus the problem of patching out references to not
hashable data, which must be zeroed).


Oh, interesting, so time is of the essence then. Do you want to handle
changing the new eBPF code to something not-SHA1 before it's too late,
as part of a new patchset that can fast track itself to David? And
then I can preserve my large series for the next merge window.


This algorithm should be a non-seeded algorithm, because the hashes
should be stable and verifiable by user space tooling. Thus this would
need a hashing algorithm that is hardened against pre-image
attacks/collision resistance, which siphash is not. I would prefer some
higher order SHA algorithm for that actually.



You mean:

commit 7bd509e311f408f7a5132fcdde2069af65fa05ae
Author: Daniel Borkmann <dan...@iogearbox.net>
Date:   Sun Dec 4 23:19:41 2016 +0100

 bpf: add prog_digest and expose it via fdinfo/netlink


Yes, please!  This actually matters for security -- imagine a
malicious program brute-forcing a collision so that it gets loaded
wrong.  And this is IMO a use case for SHA-256 or SHA-512/256
(preferably the latter).  Speed basically doesn't matter here and
Blake2 is both less stable (didn't they slightly change it recently?)
and much less well studied.


We don't prevent ebpf programs from being loaded based on the digest;
it is just there to uniquely identify loaded programs from user space
and match them up with their source.

There have been talks about signing bpf programs, thus this would
probably need another digest algorithm in addition to that one,
probably wasting instructions. Going somewhere in the direction of
PKCS#7 might be the thing to do (which leads to the problem of making
PKCS#7 attackable by ordinary unpriv users, hmpf).


My inclination would have been to seed them with something that isn't
exposed to userspace for the precise reason that it would prevent user
code from making assumptions about what's in the hash.  But if there's
a use case for why user code needs to be able to calculate the hash on
its own, then that's fine.  But perhaps the actual fdinfo string
should be "sha256:abcd1234..." to give some flexibility down the road.

Also:

+   result = (__force __be32 *)fp->digest;
+   for (i = 0; i < SHA_DIGEST_WORDS; i++)
+   result[i] = cpu_to_be32(fp->digest[i]);

Everyone, please, please, please don't open-code crypto primitives.
Is this and the code above it even correct?  It might be but on a very
brief glance it looks wrong to me.  If you're doing this to avoid
depending on crypto, then fix crypto so you can pull in the algorithm
without pulling in the whole crypto core.


The hashing is not proper sha1 either, unfortunately. I think that
is why it will have a custom implementation in iproute2?


Still trying to catch up on this admittedly somewhat confusing thread.
I did run automated tests over a couple of days comparing the data I
got from fdinfo with the one from af_alg and found no mismatch on test
cases varying from the minimum to the maximum possible program size.
In the process of testing, as you might have seen on netdev, I found a
couple of other bugs in the bpf code along the way and fixed them up
as well. So my question: do you, Andy, or anyone participating in
claiming this have any concrete data or test cases that suggest
something different? If yes, I'm very curious to hear about it and
willing to fix it up, of course. When I'm back from pto I'll prep and
cook up my test suite to be included into selftests/bpf/; I should
have done this initially, sorry about that. I'll also post something
to expose the alg, that sounds fine to me.


I wondered if bpf program loading should have used the module loading
infrastructure from the beginning...


At the very least, there should be a separate function that calculates
the hash of a buffer and that function should explicitly run itself
against test vectors of various lengths to make sure that it's
calculating what it claims to be calculating.  And it doesn't look
like the userspace code has landed, so, if this thing i

[PATCH crypto-2.6] lib: make memzero_explicit more robust against dead store elimination

2015-04-28 Thread Daniel Borkmann
In commit 0b053c951829 (lib: memzero_explicit: use barrier instead
of OPTIMIZER_HIDE_VAR), we made memzero_explicit() more robust in
case LTO would decide to inline memzero_explicit() and eventually
find out it could be eliminated as a dead store.

While using barrier() works well for the case of gcc, recent efforts
from the LLVMLinux people suggest using llvm as an alternative to gcc,
and there, Stephan found in a simple stand-alone user space example
that llvm could nevertheless optimize and thus eliminate the memset().
A similar issue has been observed in the referenced llvm bug report,
which is regarded as not-a-bug.

The fix in this patch now works for both compilers (also tested with
more aggressive optimization levels). Arguably, in the current kernel
tree it's more of a theoretical issue, but imho, it's better to be
pedantic about it.

It's clearly visible with the code below: had we used barrier() only
here, llvm would have omitted the clearing, but not so with the
barrier_data() variant:

  static inline void memzero_explicit(void *s, size_t count)
  {
memset(s, 0, count);
barrier_data(s);
  }

  int main(void)
  {
char buff[20];
memzero_explicit(buff, sizeof(buff));
return 0;
  }

  $ gcc -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x00400400 <+0>:  lea    -0x28(%rsp),%rax
   0x00400405 <+5>:  movq   $0x0,-0x28(%rsp)
   0x0040040e <+14>: movq   $0x0,-0x20(%rsp)
   0x00400417 <+23>: movl   $0x0,-0x18(%rsp)
   0x0040041f <+31>: xor    %eax,%eax
   0x00400421 <+33>: retq
  End of assembler dump.

  $ clang -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x004004f0 <+0>:  xorps  %xmm0,%xmm0
   0x004004f3 <+3>:  movaps %xmm0,-0x18(%rsp)
   0x004004f8 <+8>:  movl   $0x0,-0x8(%rsp)
   0x00400500 <+16>: lea    -0x18(%rsp),%rax
   0x00400505 <+21>: xor    %eax,%eax
   0x00400507 <+23>: retq
  End of assembler dump.

As clang (but also icc) defines __GNUC__, it's sufficient to define this
in compiler-gcc.h only.

Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
Reported-by: Stephan Mueller smuel...@chronox.de
Signed-off-by: Daniel Borkmann dan...@iogearbox.net
Cc: Theodore Ts'o ty...@mit.edu
Cc: Stephan Mueller smuel...@chronox.de
Cc: Hannes Frederic Sowa han...@stressinduktion.org
Cc: mancha security manc...@zoho.com
Cc: Mark Charlebois charl...@gmail.com
Cc: Behan Webster beh...@converseincode.com
---
 include/linux/compiler-gcc.h | 16 +++-
 include/linux/compiler.h |  4 
 lib/string.c |  2 +-
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index cdf13ca..371e560 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -9,10 +9,24 @@
   + __GNUC_MINOR__ * 100 \
   + __GNUC_PATCHLEVEL__)
 
-
 /* Optimization barrier */
+
 /* The volatile is due to gcc bugs */
 #define barrier() __asm__ __volatile__("": : :"memory")
+/*
+ * This version is here e.g. to prevent dead store elimination on @ptr
+ * where gcc and llvm may behave differently when otherwise using
+ * normal barrier(): while gcc behavior gets along with a normal
+ * barrier(), llvm needs an explicit input variable to be assumed
+ * clobbered. The issue is as follows: while the inline asm might
+ * access any memory it wants, the compiler could have fit all of
+ * @ptr into memory registers instead, and since @ptr never escaped
+ * from that, it proved that the inline asm wasn't touching any of
+ * it. This version works well with both compilers, i.e. we're telling
+ * the compiler that the inline asm absolutely may see the contents
+ * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
+ */
+#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
 
 /*
  * This macro obfuscates arithmetic on a variable address so that gcc
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 0e41ca0..8677225 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -169,6 +169,10 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 # define barrier() __memory_barrier()
 #endif
 
+#ifndef barrier_data
+# define barrier_data(ptr) barrier()
+#endif
+
 /* Unreachable code */
 #ifndef unreachable
 # define unreachable() do { } while (1)
diff --git a/lib/string.c b/lib/string.c
index a579201..bb3d4b6 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -607,7 +607,7 @@ EXPORT_SYMBOL(memset);
 void memzero_explicit(void *s, size_t count)
 {
memset(s, 0, count);
-   barrier();
+   barrier_data(s);
 }
 EXPORT_SYMBOL(memzero_explicit);
 
-- 
1.9.3


Re: [BUG/PATCH] kernel RNG and its secrets

2015-04-27 Thread Daniel Borkmann

On 04/27/2015 10:41 PM, Stephan Mueller wrote:
...

It seems you have the code already in mind, so please if you could write it
:-)


Ok, sure. I'll cook something by tomorrow morning.

Cheers,
Daniel


Re: [BUG/PATCH] kernel RNG and its secrets

2015-04-27 Thread Daniel Borkmann

On 04/27/2015 09:10 PM, Stephan Mueller wrote:
...

I posted the issue on the clang mailing list on April 10 -- no word so far. I
would interpret this as a sign that it is a no-issue for them.


Hm. ;)

Here's a bug report on the topic, gcc vs llvm:

  https://llvm.org/bugs/show_bug.cgi?id=15495

Lets add a new barrier macro to linux/compiler{,-gcc}.h, f.e.

  #define barrier_data(ptr) __asm__ __volatile__("" : : "r" (ptr) : "memory")

or the version Mancha proposed. You could wrap that ...

  #define OPTIMIZER_HIDE(ptr)   barrier_data(ptr)

... and use that one for memzero_explicit() instead:

  void memzero_explicit(void *s, size_t count)
  {
 memset(s, 0, count);
 OPTIMIZER_HIDE(s);
  }

It certainly needs comments explaining in what situations to use
which OPTIMIZER_HIDE* variants, etc.

Do you want to send a patch?


Re: crypto: fips - Move fips_enabled sysctl into fips.c

2015-04-23 Thread Daniel Borkmann

On 04/22/2015 07:02 AM, Herbert Xu wrote:

There is currently a large ifdef FIPS code section in proc.c.
Ostensibly it's there because the fips_enabled sysctl sits under
/proc/sys/crypto.  However, no other crypto sysctls exist.

In fact, the whole ethos of the crypto API is against such user
interfaces so this patch moves all the FIPS sysctl code over to
fips.c.

Signed-off-by: Herbert Xu herb...@gondor.apana.org.au

...


  int fips_enabled;
  EXPORT_SYMBOL_GPL(fips_enabled);


While you're at it, this should also be marked as __read_mostly.

Thanks,
Daniel


Re: [BUG/PATCH] kernel RNG and its secrets

2015-03-18 Thread Daniel Borkmann

[ Cc'ing Cesar ]

On 03/18/2015 10:53 AM, mancha wrote:

Hi.

The kernel RNG introduced memzero_explicit in d4c5efdb9777 to protect
memory cleansing against things like dead store optimization:

void memzero_explicit(void *s, size_t count)
{
memset(s, 0, count);
OPTIMIZER_HIDE_VAR(s);
}

OPTIMIZER_HIDE_VAR, introduced in fe8c8a126806 to protect crypto_memneq
against timing analysis, is defined when using gcc as:

#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

My tests with gcc 4.8.2 on x86 find it insufficient to prevent gcc from
optimizing out memset (i.e. secrets remain in memory).


Could you elaborate on your test case?

memzero_explicit() is actually an EXPORT_SYMBOL(), are you saying
that gcc removes the call to memzero_explicit() entirely, inlines
it, and then optimizes the memset() eventually away?

Last time I looked, it emitted a call to memzero_explicit(), and
inside memzero_explicit() it did the memset() as it cannot make
any assumption from there. I'm using gcc (GCC) 4.8.3 20140911
(Red Hat 4.8.3-7).


Two things that do work:

__asm__ __volatile__ ("" : "=r" (var) : "0" (var))

and

__asm__ __volatile__("": : :"memory")

The first is OPTIMIZER_HIDE_VAR plus a volatile qualifier and the second
is barrier() [as defined when using gcc].

I propose memzero_explicit use barrier().

--- a/lib/string.c
+++ b/lib/string.c
@@ -616,7 +616,7 @@ EXPORT_SYMBOL(memset);
  void memzero_explicit(void *s, size_t count)
  {
 memset(s, 0, count);
-   OPTIMIZER_HIDE_VAR(s);
+   barrier();
  }
  EXPORT_SYMBOL(memzero_explicit);

For any attribution deemed necessary, please use mancha security.
Please CC me on replies.

--mancha

PS CC'ing Herbert Xu in case this impacts crypto_memneq.





Re: [BUG/PATCH] kernel RNG and its secrets

2015-03-18 Thread Daniel Borkmann

On 03/18/2015 11:50 AM, Hannes Frederic Sowa wrote:



On Wed, Mar 18, 2015, at 10:53, mancha wrote:

Hi.

The kernel RNG introduced memzero_explicit in d4c5efdb9777 to protect
memory cleansing against things like dead store optimization:

void memzero_explicit(void *s, size_t count)
{
memset(s, 0, count);
OPTIMIZER_HIDE_VAR(s);
}

OPTIMIZER_HIDE_VAR, introduced in fe8c8a126806 to protect crypto_memneq
against timing analysis, is defined when using gcc as:

#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

My tests with gcc 4.8.2 on x86 find it insufficient to prevent gcc from
optimizing out memset (i.e. secrets remain in memory).

Two things that do work:

__asm__ __volatile__ ("" : "=r" (var) : "0" (var))


You are correct, a volatile qualifier should be added to
OPTIMIZER_HIDE_VAR. Because we use an output variable "=r", gcc is
allowed to check if it is needed and may remove the asm statement.
Another option would be to just use var as an input variable - asm
blocks without output variables are always considered volatile by
gcc.

Can you send a patch?

I don't think it is security critical, as Daniel pointed out, the call
will happen because the function is an external call to the crypto
functions, thus the compiler has to flush memory on return.
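
As a sketch of the two working variants described above (the macro
names here are illustrative, not actual kernel macros): either add
volatile to the output-operand form, or pass the pointer as an input
operand, since gcc implicitly treats asm without output operands as
volatile:

  #define OPTIMIZER_HIDE_VAR_V(var) \
          __asm__ __volatile__ ("" : "=r" (var) : "0" (var))

  /* No outputs, thus implicitly volatile; the "memory" clobber keeps
   * preceding stores through var alive. */
  #define OPTIMIZER_HIDE_VAR_IN(var) \
          __asm__ ("" : : "r" (var) : "memory")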


Just had a look.

$ gdb vmlinux
(gdb) disassemble memzero_explicit
Dump of assembler code for function memzero_explicit:
   0x813a18b0 <+0>:   push   %rbp
   0x813a18b1 <+1>:   mov    %rsi,%rdx
   0x813a18b4 <+4>:   xor    %esi,%esi
   0x813a18b6 <+6>:   mov    %rsp,%rbp
   0x813a18b9 <+9>:   callq  0x813a7120 <memset>
   0x813a18be <+14>:  pop    %rbp
   0x813a18bf <+15>:  retq
End of assembler dump.

(gdb) disassemble extract_entropy
[...]
   0x814a5000 <+304>: sub    %r15,%rbx
   0x814a5003 <+307>: jne    0x814a4f80 <extract_entropy+176>
   0x814a5009 <+313>: mov    %r12,%rdi
   0x814a500c <+316>: mov    $0xa,%esi
   0x814a5011 <+321>: callq  0x813a18b0 <memzero_explicit>
   0x814a5016 <+326>: mov    -0x48(%rbp),%rax
[...]

I would be fine with __volatile__.

Thanks a lot mancha, could you send a patch?

Best,
Daniel


Re: [BUG/PATCH] kernel RNG and its secrets

2015-03-18 Thread Daniel Borkmann

On 03/18/2015 01:20 PM, Stephan Mueller wrote:

On Wednesday, 18 March 2015 at 13:19:07, Hannes Frederic Sowa wrote:

Hi Hannes,


On Wed, Mar 18, 2015, at 13:14, Stephan Mueller wrote:

On Wednesday, 18 March 2015 at 13:02:12, Hannes Frederic Sowa wrote:

Hi Hannes,


On Wed, Mar 18, 2015, at 12:09, Stephan Mueller wrote:

On Wednesday, 18 March 2015 at 11:56:43, Daniel Borkmann wrote:

On 03/18/2015 11:50 AM, Hannes Frederic Sowa wrote:

On Wed, Mar 18, 2015, at 10:53, mancha wrote:

Hi.

The kernel RNG introduced memzero_explicit in d4c5efdb9777 to protect
memory cleansing against things like dead store optimization:

 void memzero_explicit(void *s, size_t count)
 {
         memset(s, 0, count);
         OPTIMIZER_HIDE_VAR(s);
 }

OPTIMIZER_HIDE_VAR, introduced in fe8c8a126806 to protect crypto_memneq
against timing analysis, is defined when using gcc as:

 #define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

My tests with gcc 4.8.2 on x86 find it insufficient to prevent gcc
from optimizing out memset (i.e. secrets remain in memory).

Two things that do work:

 __asm__ __volatile__ ("" : "=r" (var) : "0" (var))


You are correct, a volatile qualifier should be added to
OPTIMIZER_HIDE_VAR. Because we use an output variable "=r", gcc is
allowed to check if it is needed and may remove the asm statement.
Another option would be to just use var as an input variable - asm
blocks without output variables are always considered volatile by gcc.

Can you send a patch?

I don't think it is security critical, as Daniel pointed out; the call
will happen because the function is an external call to the crypto
functions, thus the compiler has to flush memory on return.


Just had a look.

$ gdb vmlinux
(gdb) disassemble memzero_explicit

Dump of assembler code for function memzero_explicit:
   0x813a18b0 <+0>:   push   %rbp
   0x813a18b1 <+1>:   mov    %rsi,%rdx
   0x813a18b4 <+4>:   xor    %esi,%esi
   0x813a18b6 <+6>:   mov    %rsp,%rbp
   0x813a18b9 <+9>:   callq  0x813a7120 <memset>
   0x813a18be <+14>:  pop    %rbp
   0x813a18bf <+15>:  retq
End of assembler dump.

(gdb) disassemble extract_entropy
[...]
   0x814a5000 <+304>: sub    %r15,%rbx
   0x814a5003 <+307>: jne    0x814a4f80 <extract_entropy+176>
   0x814a5009 <+313>: mov    %r12,%rdi
   0x814a500c <+316>: mov    $0xa,%esi
   0x814a5011 <+321>: callq  0x813a18b0 <memzero_explicit>
   0x814a5016 <+326>: mov    -0x48(%rbp),%rax
[...]

I would be fine with __volatile__.


Are we sure that simply adding a __volatile__ works in any case? I
just did a test with a simple user space app:

 static inline void memset_secure(void *s, int c, size_t n)
 {
         memset(s, c, n);
         //__asm__ __volatile__("": : :"memory");
         __asm__ __volatile__("" : "=r" (s) : "0" (s));
 }


Good point, thanks!

Of course an input or output of s does not force the memory pointed
to by s to be flushed.


My proposal would be to add a

 #define OPTIMIZER_HIDE_MEM(ptr, len) \
         __asm__ __volatile__ ("" : : \
                 "m"( ({ struct { u8 b[len]; } *p = (void *)ptr; *p; }) ))

and use this in the code function.

This is documented in gcc manual 6.43.2.5.


That one adds the zeroization instructions. But now there are many
more than with the barrier.

   400469:   48 c7 04 24 00 00 00movq   $0x0,(%rsp)
   400470:   00
   400471:   48 c7 44 24 08 00 00movq   $0x0,0x8(%rsp)
   400478:   00 00
   40047a:   c7 44 24 10 00 00 00movl   $0x0,0x10(%rsp)
   400481:   00
   400482:   48 c7 44 24 20 00 00movq   $0x0,0x20(%rsp)
   400489:   00 00
   40048b:   48 c7 44 24 28 00 00movq   $0x0,0x28(%rsp)
   400492:   00 00
   400494:   c7 44 24 30 00 00 00movl   $0x0,0x30(%rsp)
   40049b:   00

Any ideas?


Hmm, correct definition of u8?


I use unsigned char


Which version of gcc do you use? I can't see any difference if I
compile your example at -O2.


gcc-Version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC)


I can see the same with the gcc version I previously posted. So
it clears the 20 bytes from your example (movq, movq, movl) at
two locations, presumably buf[] and b[].

Best,
Daniel


[PATCH -crypto] lib: memzero_explicit: use barrier instead of OPTIMIZER_HIDE_VAR

2015-03-18 Thread Daniel Borkmann
From: mancha security manc...@zoho.com

OPTIMIZER_HIDE_VAR(), as defined when using gcc, is insufficient to
ensure protection from dead store optimization.

For the random driver and crypto drivers, calls are emitted ...

  $ gdb vmlinux
  (gdb) disassemble memzero_explicit
  Dump of assembler code for function memzero_explicit:
    0x813a18b0 <+0>:   push   %rbp
    0x813a18b1 <+1>:   mov    %rsi,%rdx
    0x813a18b4 <+4>:   xor    %esi,%esi
    0x813a18b6 <+6>:   mov    %rsp,%rbp
    0x813a18b9 <+9>:   callq  0x813a7120 <memset>
    0x813a18be <+14>:  pop    %rbp
    0x813a18bf <+15>:  retq
  End of assembler dump.

  (gdb) disassemble extract_entropy
  [...]
    0x814a5009 <+313>: mov    %r12,%rdi
    0x814a500c <+316>: mov    $0xa,%esi
    0x814a5011 <+321>: callq  0x813a18b0 <memzero_explicit>
    0x814a5016 <+326>: mov    -0x48(%rbp),%rax
  [...]

... but in case in future we might use facilities such as LTO, then
OPTIMIZER_HIDE_VAR() is not sufficient to protect gcc from a possible
eviction of the memset(). We have to use a compiler barrier instead.

Minimal test example when we assume memzero_explicit() would *not* be
a call, but would have been *inlined* instead:

  static inline void memzero_explicit(void *s, size_t count)
  {
memset(s, 0, count);
foo
  }

  int main(void)
  {
char buff[20];

snprintf(buff, sizeof(buff) - 1, "test");
printf("%s", buff);

memzero_explicit(buff, sizeof(buff));
return 0;
  }

With foo := OPTIMIZER_HIDE_VAR():

  (gdb) disassemble main
  Dump of assembler code for function main:
  [...]
   0x00400464 <+36>: callq  0x400410 <printf@plt>
   0x00400469 <+41>: xor    %eax,%eax
   0x0040046b <+43>: add    $0x28,%rsp
   0x0040046f <+47>: retq
  End of assembler dump.

With foo := barrier():

  (gdb) disassemble main
  Dump of assembler code for function main:
  [...]
   0x00400464 <+36>: callq  0x400410 <printf@plt>
   0x00400469 <+41>: movq   $0x0,(%rsp)
   0x00400471 <+49>: movq   $0x0,0x8(%rsp)
   0x0040047a <+58>: movl   $0x0,0x10(%rsp)
   0x00400482 <+66>: xor    %eax,%eax
   0x00400484 <+68>: add    $0x28,%rsp
   0x00400488 <+72>: retq
  End of assembler dump.

As can be seen, the movq, movq, movl stores are emitted inline
via memset().

Reference: http://thread.gmane.org/gmane.linux.kernel.cryptoapi/13764/
Fixes: d4c5efdb9777 (random: add and use memzero_explicit() for clearing data)
Cc: Hannes Frederic Sowa han...@stressinduktion.org
Cc: Stephan Mueller smuel...@chronox.de
Cc: Theodore Ts'o ty...@mit.edu
Signed-off-by: mancha security manc...@zoho.com
Signed-off-by: Daniel Borkmann dan...@iogearbox.net
---
 Sending to Herbert as crypto/random are the main users.
 Based against -crypto tree. Thanks!

 lib/string.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/string.c b/lib/string.c
index ce81aae..a579201 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -607,7 +607,7 @@ EXPORT_SYMBOL(memset);
 void memzero_explicit(void *s, size_t count)
 {
memset(s, 0, count);
-   OPTIMIZER_HIDE_VAR(s);
+   barrier();
 }
 EXPORT_SYMBOL(memzero_explicit);
 
-- 
1.9.3



Re: [BUG/PATCH] kernel RNG and its secrets

2015-03-18 Thread Daniel Borkmann

On 03/18/2015 06:14 PM, mancha wrote:
...

Patch 0001 fixes the dead store issue in memzero_explicit().


Thanks! I have issued the fix for the memzero bug to Herbert in
your authorship as discussed, also giving some more context.

For the 2nd issue, lets wait for Cesar.

Thanks again!


[PATCH] lib: memzero_explicit: add comment for its usage

2015-01-05 Thread Daniel Borkmann
Let's improve the comment to add a note on when to use memzero_explicit()
for those not digging through the git logs. We don't want people to
pollute places with memzero_explicit() where it's not really necessary.

Reference: https://lkml.org/lkml/2015/1/4/190
Suggested-by: Herbert Xu herb...@gondor.apana.org.au
Signed-off-by: Daniel Borkmann dbork...@redhat.com
---
 [ Sending to -crypto as it's most relevant here and suggested by
   Herbert anyway. ]

 lib/string.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/string.c b/lib/string.c
index 1006330..d984ec4 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -604,6 +604,11 @@ EXPORT_SYMBOL(memset);
  * @s: Pointer to the start of the area.
  * @count: The size of the area.
  *
+ * Note: usually using memset() is just fine (!), but in cases
+ * where clearing out _local_ data at the end of a scope is
+ * necessary, memzero_explicit() should be used instead in
+ * order to prevent the compiler from optimising away zeroing.
+ *
  * memzero_explicit() doesn't need an arch-specific version as
  * it just invokes the one of memset() implicitly.
  */
-- 
1.7.11.7
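
The local-scope case the added note is about looks, for example, like
this (a sketch; derive_key() and do_encrypt() are hypothetical helpers):

  static void crypt_one(u8 *dst, const u8 *src, size_t len)
  {
          u8 key[32];

          derive_key(key, sizeof(key));
          do_encrypt(dst, src, len, key);
          /* key goes out of scope here; a plain memset() could be
           * elided as a dead store, memzero_explicit() cannot. */
          memzero_explicit(key, sizeof(key));
  }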



Re: [PATCH 7/8] crypto: AF_ALG: add random number generator support

2014-11-12 Thread Daniel Borkmann

On 11/12/2014 08:05 AM, Stephan Mueller wrote:

This patch adds the random number generator support for AF_ALG.

A random number generator's purpose is to generate data without
requiring the caller to provide any data. Therefore, the AF_ALG
interface handler for RNGs only implements a callback handler for
recvmsg.

...

+static int rng_recvmsg(struct kiocb *unused, struct socket *sock,
+  struct msghdr *msg, size_t len, int flags)
+{
+   struct sock *sk = sock->sk;
+   struct alg_sock *ask = alg_sk(sk);
+   struct rng_ctx *ctx = ask->private;
+   int err = -EFAULT;
+
+   if (0 == len)


if (len == 0)
...

[And also other places.]

We don't use Yoda condition style in the kernel.


+   return 0;
+   if (MAXSIZE < len)
+   len = MAXSIZE;
+
+   lock_sock(sk);
+   len = crypto_rng_get_bytes(ctx->drng, ctx->result, len);
+   if (0 > len)
+   goto unlock;
+
+   err = memcpy_toiovec(msg->msg_iov, ctx->result, len);
+   memset(ctx->result, 0, err);
+


This looks buggy.

If copy_to_user() fails from within memcpy_toiovec(), we call memset()
with a negative return value which is interpreted as size_t and thus
causes a buffer overflow writing beyond ctx->result, no?

If it succeeds, we call memset(ctx->result, 0, 0).


+unlock:
+   release_sock(sk);
+
+   return err ? err : len;
+}
+
+static struct proto_ops algif_rng_ops = {
+   .family =   PF_ALG,
+
+   .connect=   sock_no_connect,
+   .socketpair =   sock_no_socketpair,
+   .getname=   sock_no_getname,
+   .ioctl  =   sock_no_ioctl,
+   .listen =   sock_no_listen,
+   .shutdown   =   sock_no_shutdown,
+   .getsockopt =   sock_no_getsockopt,
+   .mmap   =   sock_no_mmap,
+   .bind   =   sock_no_bind,
+   .accept =   sock_no_accept,
+   .setsockopt =   sock_no_setsockopt,
+   .poll   =   sock_no_poll,
+   .sendmsg=   sock_no_sendmsg,
+   .sendpage   =   sock_no_sendpage,
+
+   .release=   af_alg_release,
+   .recvmsg=   rng_recvmsg,
+};
+
+static void *rng_bind(const char *name, u32 type, u32 mask)
+{
+   return crypto_alloc_rng(name, type, mask);
+}
+
+static void rng_release(void *private)
+{
+   crypto_free_rng(private);
+}
+
+static void rng_sock_destruct(struct sock *sk)
+{
+   struct alg_sock *ask = alg_sk(sk);
+   struct rng_ctx *ctx = ask->private;
+
+   memset(ctx->result, 0, MAXSIZE);


memset(ctx->result, 0, sizeof(ctx->result));


+   sock_kfree_s(sk, ctx, ctx->len);
+   af_alg_release_parent(sk);
+}
+
+static int rng_accept_parent(void *private, struct sock *sk)
+{
+   struct rng_ctx *ctx;
+   struct alg_sock *ask = alg_sk(sk);
+   unsigned int len = sizeof(*ctx);
+   int seedsize = crypto_rng_seedsize(private);
+   int ret = -ENOMEM;
+
+   ctx = sock_kmalloc(sk, len, GFP_KERNEL);
+   if (!ctx)
+   return -ENOMEM;
+   memset(ctx->result, 0, MAXSIZE);


Ditto...


+   ctx->len = len;
+
+   if (seedsize) {
+   u8 *buf = kmalloc(seedsize, GFP_KERNEL);
+   if (!buf)
+   goto err;
+   get_random_bytes(buf, seedsize);
+   ret = crypto_rng_reset(private, buf, len);
+   kzfree(buf);




Re: [PATCH 7/8] crypto: AF_ALG: add random number generator support

2014-11-12 Thread Daniel Borkmann

On 11/12/2014 05:54 PM, Stephan Mueller wrote:

On Wednesday, 12 November 2014 at 17:15:52, Daniel Borkmann wrote:

On 11/12/2014 08:05 AM, Stephan Mueller wrote:

This patch adds the random number generator support for AF_ALG.

A random number generator's purpose is to generate data without
requiring the caller to provide any data. Therefore, the AF_ALG
interface handler for RNGs only implements a callback handler for
recvmsg.


...


+static int rng_recvmsg(struct kiocb *unused, struct socket *sock,
+  struct msghdr *msg, size_t len, int flags)
+{
+   struct sock *sk = sock->sk;
+   struct alg_sock *ask = alg_sk(sk);
+   struct rng_ctx *ctx = ask->private;
+   int err = -EFAULT;
+
+   if (0 == len)


if (len == 0)
...

[And also other places.]

We don't use Yoda condition style in the kernel.


Well, there is a very good reason for using the approach I have: we
have all made the error of forgetting the second = sign.

In my case, the compiler will complain and we fix the error right away.

In your case, nobody complains, but we have introduced a nasty,
potentially hard to debug error. Thus, I would very much like to keep
my version, just to be on the safe side.

Note, I have even seen a backdoor where the missing 2nd equal sign
introduced a privilege escalation.

Therefore, my standard coding practice is to have a fixed value on the
left side and the variable on the right side of any comparison.


I understand, but then please add this proposal first into ...

  Documentation/CodingStyle

The problem is that since the rest of the kernel does not follow this
coding style, it's also much harder to read and/or program this way
for people not used to it. So the danger of bugs slipping in this way
is at least equally high. Besides that, this argument would also only
account for '==' checks.
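
As a small illustration of both sides of the argument (the exact gcc
wording varies by version):

  if (len == 0)   /* kernel style; the typo 'if (len = 0)' still
                   * compiles, but gcc -Wall warns about an assignment
                   * used as a truth value */
          return 0;

  if (0 == len)   /* Yoda style; the typo 'if (0 = len)' is a hard
                   * compile error */
          return 0;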


+   return 0;
+   if (MAXSIZE < len)
+   len = MAXSIZE;
+
+   lock_sock(sk);
+   len = crypto_rng_get_bytes(ctx->drng, ctx->result, len);
+   if (0 > len)
+   goto unlock;
+
+   err = memcpy_toiovec(msg->msg_iov, ctx->result, len);
+   memset(ctx->result, 0, err);
+


This looks buggy.

If copy_to_user() fails from within memcpy_toiovec(), we call memset()
with a negative return value which is interpreted as size_t and thus
causes a buffer overflow writing beyond ctx->result, no?

If it succeeds, we call memset(ctx->result, 0, 0).


Right, good catch; I have to add handling for a negative error here.


Hm? Don't you rather mean to unconditionally do something like ...

memzero_explicit(ctx->result, len);

...
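
Spelled out against the patch under review (a sketch using the same
names; the point is that the buffer gets wiped unconditionally, for
the actual number of bytes produced):

  lock_sock(sk);
  len = crypto_rng_get_bytes(ctx->drng, ctx->result, len);
  if (0 > len)
          goto unlock;

  err = memcpy_toiovec(msg->msg_iov, ctx->result, len);
  /* wipe regardless of whether the copy to user space succeeded */
  memzero_explicit(ctx->result, len);

  unlock:
  release_sock(sk);
  return err ? err : len;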

+   memset(ctx->result, 0, MAXSIZE);


memset(ctx->result, 0, sizeof(ctx->result));


Ok, if this is desired, fine with me.


Yes, please.


Re: [PATCH 7/8] crypto: AF_ALG: add random number generator support

2014-11-12 Thread Daniel Borkmann

On 11/12/2014 06:46 PM, Stephan Mueller wrote:
...

* I unconditionally use the memset after memcpy as you indicated. Once
the cryptodev tree contains the memzero_explicit call, I will start
picking up that function.


Herbert actually merged it this morning, so it's already part of
the cryptodev tree by now.


Essentially, I thought of the line you suggested.


Ok, thanks.


Re: crypto: zeroization of sensitive data in af_alg

2014-11-11 Thread Daniel Borkmann

On 11/11/2014 05:16 AM, Stephan Mueller wrote:
...

That is a good idea.

Herbert: I can prepare a patch that uses memzero_explicit. However, your
current tree does not yet implement that function as it was added to Linus'
tree after you pulled from it.


Yep, Ted took it [1] on top of the random driver fix as discussed.

  [1] 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7185ad2672a7d50bc384de0e38d90b75d99f3d82


Re: [PATCH 1/2] crypto: AF_ALG - zeroize message digest buffer

2014-11-11 Thread Daniel Borkmann

Hi Stephan,

On 11/11/2014 05:37 AM, Stephan Mueller wrote:

Zeroize the buffer holding the message digest calculated for the
consumer before the buffer is released by the hash AF_ALG interface
handler.

Signed-off-by: Stephan Mueller smuel...@chronox.de
---
  crypto/algif_hash.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 8502462..f75db4c 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -258,6 +258,8 @@ static void hash_sock_destruct(struct sock *sk)
struct alg_sock *ask = alg_sk(sk);
struct hash_ctx *ctx = ask->private;

+   memzero_explicit(ctx->result,
+crypto_ahash_digestsize(crypto_ahash_reqtfm(ctx->req)));
sock_kfree_s(sk, ctx->result,


Perhaps something like this (alternatively kzfree() would work, too) ...

static void __sock_kfree_s(struct sock *sk, void *mem, int size,
   bool clear_mem)
{
if (WARN_ON_ONCE(!mem))
return;
if (clear_mem)
memzero_explicit(mem, size);
kfree(mem);
atomic_sub(size, &sk->sk_omem_alloc);
}

void sock_kfree_s(struct sock *sk, void *mem, int size)
{
__sock_kfree_s(sk, mem, size, false);
}
EXPORT_SYMBOL(sock_kfree_s);

void sock_kzfree_s(struct sock *sk, void *mem, int size)
{
__sock_kfree_s(sk, mem, size, true);
}
EXPORT_SYMBOL(sock_kzfree_s);

... so you could then just use it as drop-in in various places:

sock_kzfree_s(sk, ctx->result, ...);


 crypto_ahash_digestsize(crypto_ahash_reqtfm(ctx->req)));
sock_kfree_s(sk, ctx, ctx->len);



Thanks,
Daniel


Re: memset() in crypto code?

2014-10-08 Thread Daniel Borkmann

On 10/08/2014 04:30 AM, Sandy Harris wrote:

I have started a thread about this on the gcc help mailing list
https://gcc.gnu.org/ml/gcc-help/2014-10/msg00047.html


Great, perhaps you want to pass a patch proposal to gcc folks?


We might consider replacing memzero_explicit with memset_s() since
that is in the C11 standard, albeit I think as optional. IBM, Apple,
NetBSD, ... have that.
https://mail-index.netbsd.org/tech-userlevel/2012/02/24/msg006125.html
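
For reference, a sketch of that optional C11 Annex K interface (only
available where the implementation defines __STDC_LIB_EXT1__):

  #define __STDC_WANT_LIB_EXT1__ 1
  #include <string.h>

  static void wipe(void *buf, size_t n)
  {
          /* Unlike plain memset(), memset_s() may not be optimized
           * away; it returns non-zero only on constraint violations
           * (e.g. n larger than the buffer bound smax). */
          memset_s(buf, n, 0, n);
  }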


The patch you point to for NetBSD does nothing other than what the
memzero_explicit() or bzero_explicit() variants already do internally;
they're merely discussing whether or not to adopt it in their user
space libc ...

Cheers,
Daniel


Re: memset() in crypto code?

2014-10-06 Thread Daniel Borkmann

On 10/06/2014 08:52 PM, Sandy Harris wrote:

On Mon, Oct 6, 2014 at 1:44 PM, Jason Cooper ja...@lakedaemon.net wrote:

On Sat, Oct 04, 2014 at 11:09:40PM -0400, Sandy Harris wrote:

...

There was recently a patch to the random driver to replace memset()
because, according to the submitter, gcc sometimes optimises memset()
away ...



memzero_explicit() is a good start, ...


As I see it, memzero_explicit() is a rather ugly kluge, albeit an
acceptable one in the circumstances.


Right.


A real fix would make memset() do the right thing reliably; if the
programmer puts in memset( x, 0, nbytes) then the memory should be
cleared, no ifs or buts. I do not know or care if that means changes
in the compiler or in the library code or even both, but the fix
should make the standard library code work right, not require adding a
new function and expecting everyone to use it.


That would be a desirable goal, ideally perhaps as a built-in from
the compiler itself, just as memset() is. Applications such as openssh
implement their own explicit_bzero() variant for the very same
purpose.


Re: memset() in crypto code?

2014-10-05 Thread Daniel Borkmann

Hi Sandy,

On 10/05/2014 05:09 AM, Sandy Harris wrote:

There was recently a patch to the random driver to replace memset()
because, according to the submitter, gcc sometimes optimises memset()
away which might leave data unnecessarily exposed. The solution
suggested was a function called memzero_explicit(). There was a fair
bit of discussion and the patch was accepted.

In the crypto directory of the kernel source I have:

$ grep memset *.c | wc -l
133
$

I strongly suspect some of these should be fixed.


I submitted it here one month ago for crypto and it's still awaiting
being applied:

  http://www.spinics.net/lists/linux-crypto/msg11965.html

As the random driver patch has been applied to random -dev, it will be
available from 3.18 onwards, but the dependency for crypto is currently
there; that's why I asked Ted to take it through his tree. Hopefully
this will happen soonish (but I haven't heard anything back ever since) ...

Thanks!

Daniel


Re: [PATCH v2] crypto: memzero_explicit - make sure to clear out sensitive data

2014-09-25 Thread Daniel Borkmann

Hi Ted,

On 09/15/2014 01:52 PM, Herbert Xu wrote:

On Sun, Sep 07, 2014 at 11:23:38PM +0200, Daniel Borkmann wrote:

Recently, in commit 13aa93c70e71 (random: add and use memzero_explicit()
for clearing data), we have found that GCC may optimize some memset()
cases away when it detects a stack variable is not being used anymore
and going out of scope. This can happen, for example, in cases when we
are clearing out sensitive information such as keying material or any
e.g. intermediate results from crypto computations, etc.

With the help of Coccinelle, we can figure out and fix such occurrences
in the crypto subsystem as well. Julia Lawall provided the following
Coccinelle program:

   @@
   type T;
   identifier x;
   @@

   T x;
   ... when exists
   when any
   -memset
   +memzero_explicit
  (x,
   -0,
  ...)
   ... when != x
   when strict

   @@
   type T;
   identifier x;
   @@

   T x[...];
   ... when exists
   when any
   -memset
   +memzero_explicit
  (x,
   -0,
  ...)
   ... when != x
   when strict

Therefore, make use of the drop-in replacement memzero_explicit() for
exactly such cases instead of using memset().

Signed-off-by: Daniel Borkmann dbork...@redhat.com
Cc: Julia Lawall julia.law...@lip6.fr
Cc: Herbert Xu herb...@gondor.apana.org.au
Cc: Theodore Ts'o ty...@mit.edu
Cc: Hannes Frederic Sowa han...@stressinduktion.org


Acked-by: Herbert Xu herb...@gondor.apana.org.au

Thanks,


Sorry for the noise, but given that Herbert acked the patch and the
infrastructure to fix this security issue is currently present in the
random tree, could you take it through the random tree as originally
proposed, between the --- and diffstat lines of this patch [1]?

Thanks a lot,

Daniel

 [1] http://www.spinics.net/lists/linux-crypto/msg11965.html


[PATCH] crypto: memzero_explicit - make sure to clear out sensitive data

2014-09-07 Thread Daniel Borkmann
Recently, in commit 13aa93c70e71 (random: add and use memzero_explicit()
for clearing data), we have found that GCC may optimize some memset()
cases away when it detects a stack variable is not being used anymore
and going out of scope. This can happen, for example, in cases when we
are clearing out sensitive information such as keying material or any
e.g. intermediate results from crypto computations, etc.

With the help of Coccinelle, we can figure out and fix such occurrences
in the crypto subsystem as well. Julia Lawall provided the following
Coccinelle program:

  @@
  type T;
  identifier s;
  @@

  T s[...];
  ... when any
  memset(s,...);
  ... when != s

Therefore, make use of the drop-in replacement memzero_explicit() for
exactly such cases instead of using memset().

Signed-off-by: Daniel Borkmann dbork...@redhat.com
Cc: Julia Lawall julia.law...@lip6.fr
Cc: Herbert Xu herb...@gondor.apana.org.au
Cc: Theodore Ts'o ty...@mit.edu
Cc: Hannes Frederic Sowa han...@stressinduktion.org
---
 Herbert, Ted, I have posted my original patch that introduces the
 memzero_explicit() infrastructure, along with its first users that
 fix such occurrences, to Ted's random tree [1]. That means this one
 last patch in that regard would be on top of it, as an outcome of an
 audit in the crypto subsystem. Would the two of you be okay if Ted
 takes it to random as well? The changes are minimal and
 uncomplicated, but I think it is quite important to get this fixed.
 Thanks!

 [1] 
https://git.kernel.org/cgit/linux/kernel/git/tytso/random.git/commit/?h=devid=13aa93c70e716b597d2451abee5e9f0b140b831c

 crypto/cts.c| 3 ++-
 crypto/sha1_generic.c   | 2 +-
 crypto/sha256_generic.c | 5 ++---
 crypto/sha512_generic.c | 2 +-
 crypto/tgr192.c | 4 ++--
 crypto/wp512.c  | 8 
 6 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/crypto/cts.c b/crypto/cts.c
index 042223f..133f087 100644
--- a/crypto/cts.c
+++ b/crypto/cts.c
@@ -202,7 +202,8 @@ static int cts_cbc_decrypt(struct crypto_cts_ctx *ctx,
/* 5. Append the tail (BB - Ln) bytes of Xn (tmp) to Cn to create En */
memcpy(s + bsize + lastn, tmp + lastn, bsize - lastn);
/* 6. Decrypt En to create Pn-1 */
-   memset(iv, 0, sizeof(iv));
+   memzero_explicit(iv, sizeof(iv));
+
sg_set_buf(sgsrc[0], s + bsize, bsize);
sg_set_buf(sgdst[0], d, bsize);
err = crypto_blkcipher_decrypt_iv(lcldesc, sgdst, sgsrc, bsize);
diff --git a/crypto/sha1_generic.c b/crypto/sha1_generic.c
index 4279480..7bb0474 100644
--- a/crypto/sha1_generic.c
+++ b/crypto/sha1_generic.c
@@ -64,7 +64,7 @@ int crypto_sha1_update(struct shash_desc *desc, const u8 
*data,
src = data + done;
} while (done + SHA1_BLOCK_SIZE <= len);
 
-   memset(temp, 0, sizeof(temp));
+   memzero_explicit(temp, sizeof(temp));
partial = 0;
}
memcpy(sctx->buffer + partial, src, len - done);
diff --git a/crypto/sha256_generic.c b/crypto/sha256_generic.c
index 5433667..32c5e5e 100644
--- a/crypto/sha256_generic.c
+++ b/crypto/sha256_generic.c
@@ -210,10 +210,9 @@ static void sha256_transform(u32 *state, const u8 *input)
 
/* clear any sensitive info... */
a = b = c = d = e = f = g = h = t1 = t2 = 0;
-   memset(W, 0, 64 * sizeof(u32));
+   memzero_explicit(W, 64 * sizeof(u32));
 }
 
-
 static int sha224_init(struct shash_desc *desc)
 {
struct sha256_state *sctx = shash_desc_ctx(desc);
@@ -316,7 +315,7 @@ static int sha224_final(struct shash_desc *desc, u8 *hash)
sha256_final(desc, D);
 
memcpy(hash, D, SHA224_DIGEST_SIZE);
-   memset(D, 0, SHA256_DIGEST_SIZE);
+   memzero_explicit(D, SHA256_DIGEST_SIZE);
 
return 0;
 }
diff --git a/crypto/sha512_generic.c b/crypto/sha512_generic.c
index 6ed124f..04d295a 100644
--- a/crypto/sha512_generic.c
+++ b/crypto/sha512_generic.c
@@ -238,7 +238,7 @@ static int sha384_final(struct shash_desc *desc, u8 *hash)
sha512_final(desc, D);
 
memcpy(hash, D, 48);
-   memset(D, 0, 64);
+   memzero_explicit(D, 64);
 
return 0;
 }
diff --git a/crypto/tgr192.c b/crypto/tgr192.c
index 8740355..3c7af0d 100644
--- a/crypto/tgr192.c
+++ b/crypto/tgr192.c
@@ -612,7 +612,7 @@ static int tgr160_final(struct shash_desc *desc, u8 * out)
 
tgr192_final(desc, D);
memcpy(out, D, TGR160_DIGEST_SIZE);
-   memset(D, 0, TGR192_DIGEST_SIZE);
+   memzero_explicit(D, TGR192_DIGEST_SIZE);
 
return 0;
 }
@@ -623,7 +623,7 @@ static int tgr128_final(struct shash_desc *desc, u8 * out)
 
tgr192_final(desc, D);
memcpy(out, D, TGR128_DIGEST_SIZE);
-   memset(D, 0, TGR192_DIGEST_SIZE);
+   memzero_explicit(D, TGR192_DIGEST_SIZE);
 
return 0;
 }
diff --git a/crypto/wp512.c b/crypto/wp512.c
index 180f1d6..ec64e77 100644
--- a/crypto/wp512.c
+++ b/crypto/wp512.c
@@ -1102,8 +1102,8 @@ static int wp384_final

Re: [PATCH] crypto: memzero_explicit - make sure to clear out sensitive data

2014-09-07 Thread Daniel Borkmann

Hi Milan,

On 09/07/2014 07:15 PM, Milan Broz wrote:

On 09/07/2014 06:46 PM, Daniel Borkmann wrote:

Recently, in commit 13aa93c70e71 (random: add and use memzero_explicit()
for clearing data), we found that GCC may optimize some memset() calls
away when it detects that a stack variable is no longer used and is
going out of scope. This can happen, for example, when we are clearing
out sensitive information such as keying material or, e.g., intermediate
results from crypto computations.


Hi,

do you plan to send patches also for other crypto code in kernel?
(I am almost sure we have the same pattern in dmcrypt.)

If not, I can do this for the dmcrypt part.


Yes, please feel free and go ahead.

I have checked random driver, crypto and networking subsystem. With this
patch that I've sent here, these three are covered (in networking, there
was no such candidate, just one false positive in Bluetooth). But if you
find other areas with a similar case, feel free to go ahead and fix it,
very much appreciated.

Thanks & best,
Daniel


Milan




[PATCH cryptodev] crypto: arch - use crypto_memneq instead of memcmp

2013-12-11 Thread Daniel Borkmann
Replace the remaining occurrences (just as we did in crypto/) under
arch/*/crypto/ that make use of memcmp() for comparing keys or
authentication tags with crypto_memneq(). It can simply be used as a
drop-in replacement for the normal memcmp().
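(As an illustration only -- the buffer names here are hypothetical, not from
the patch -- a typical authentication-tag check then reads:

   /* Like memcmp(), crypto_memneq() returns zero iff the regions are
    * equal, but it accumulates the difference over all bytes instead
    * of exiting early on the first mismatching one. */
   if (crypto_memneq(computed_tag, received_tag, authsize))
           return -EBADMSG;
)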

Signed-off-by: Daniel Borkmann dbork...@redhat.com
Cc: James Yonan ja...@openvpn.net
---
 arch/s390/crypto/des_s390.c| 6 +++---
 arch/x86/crypto/aesni-intel_glue.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/s390/crypto/des_s390.c b/arch/s390/crypto/des_s390.c
index bcca01c..200f2a1 100644
--- a/arch/s390/crypto/des_s390.c
+++ b/arch/s390/crypto/des_s390.c
@@ -237,9 +237,9 @@ static int des3_setkey(struct crypto_tfm *tfm, const u8 
*key,
	struct s390_des_ctx *ctx = crypto_tfm_ctx(tfm);
	u32 *flags = &tfm->crt_flags;
 
-	if (!(memcmp(key, &key[DES_KEY_SIZE], DES_KEY_SIZE) &&
-	    memcmp(&key[DES_KEY_SIZE], &key[DES_KEY_SIZE * 2],
-		   DES_KEY_SIZE)) &&
+	if (!(crypto_memneq(key, &key[DES_KEY_SIZE], DES_KEY_SIZE) &&
+	    crypto_memneq(&key[DES_KEY_SIZE], &key[DES_KEY_SIZE * 2],
+			  DES_KEY_SIZE)) &&
 	    (*flags & CRYPTO_TFM_REQ_WEAK_KEY)) {
*flags |= CRYPTO_TFM_RES_WEAK_KEY;
return -EINVAL;
diff --git a/arch/x86/crypto/aesni-intel_glue.c 
b/arch/x86/crypto/aesni-intel_glue.c
index 835488b..aba34b8 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -1001,7 +1001,7 @@ static int __driver_rfc4106_decrypt(struct aead_request 
*req)
authTag, auth_tag_len);
 
/* Compare generated tag with passed in tag. */
-   retval = memcmp(src + tempCipherLen, authTag, auth_tag_len) ?
+   retval = crypto_memneq(src + tempCipherLen, authTag, auth_tag_len) ?
-EBADMSG : 0;
 
if (one_entry_in_sg) {
-- 
1.8.3.1



[PATCH cryptodev] crypto: memneq: fix for archs without efficient unaligned access

2013-12-05 Thread Daniel Borkmann
Commit fe8c8a126806 introduced a possible build error for archs
that do not have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set. :/
Fix this up by moving the braces of the else branch outside of the ifdef.

Reported-by: Fengguang Wu fengguang...@intel.com
Fixes: fe8c8a126806 (crypto: more robust crypto_memneq)
Cc: Cesar Eduardo Barros ces...@cesarb.eti.br
Signed-off-by: Daniel Borkmann dbork...@redhat.com
---
 Sending stand-alone fix as original patch is already applied.

 crypto/memneq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/crypto/memneq.c b/crypto/memneq.c
index 570f6f3..afed1bd 100644
--- a/crypto/memneq.c
+++ b/crypto/memneq.c
@@ -108,8 +108,9 @@ static inline unsigned long __crypto_memneq_16(const void 
*a, const void *b)
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned int *)(a+12) ^ *(unsigned int *)(b+12);
OPTIMIZER_HIDE_VAR(neq);
-   } else {
+   } else
 #endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
+   {
neq |= *(unsigned char *)(a)    ^ *(unsigned char *)(b);
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned char *)(a+1)  ^ *(unsigned char *)(b+1);
-- 
1.8.3.1



Re: [PATCH v3] crypto: more robust crypto_memneq

2013-11-27 Thread Daniel Borkmann

On 11/26/2013 10:44 PM, Cesar Eduardo Barros wrote:

Em 26-11-2013 17:27, Daniel Borkmann escreveu:

On 11/26/2013 01:00 AM, Cesar Eduardo Barros wrote:

Compile-tested on x86_64.


Actually with yet another version, I hoped that the compile-tested-only
statement would eventually disappear, ohh well. ;)


I did compile test each version ;-) including verifying (with make 
crypto/memneq.i) that the macro was really generating the expected inline assembly 
(with these #ifdef chains, one has to be careful with typos).

(Actually, I compile tested with make crypto/memneq.o crypto/memneq.s 
crypto/memneq.i. I took a peek at the assembly to see if it made sense.)


Resolving the OPTIMIZER_HIDE_VAR() macro for compilers other than GCC into a
barrier() seems a bit suboptimal, but assuming 99% of people will use GCC
anyway, then for the minority of the remaining ones, they will in the worst
case have a clever compiler that eventually mimics memcmp() in some
situations, or a not-so-clever compiler that executes the full code as is.


I do not think any compiler other than gcc and icc can compile unmodified 
upstream kernel code. LLVM's clang would be the one which comes closest, but it 
has gcc-compatible inline assembly, as does icc AFAIK.

The #define to barrier() within compiler-intel.h is for some compiler called
ECC (not icc). From what I could find about it on a quick search, it appears
to be some kind of Itanium compiler.

That part of the header was added back in 2003, and I do not believe it is
still relevant. A comment within that #ifdef block says "Intel ECC compiler
doesn't support gcc specific asm stmts", but there are many uses of
unprotected inline assembly all over the kernel (including in the ia64
headers), so if that comment is true, the kernel will not compile with that
compiler. It is probably a piece of leftover dead code. I only added to it
because I am following RELOC_HIDE's example, and RELOC_HIDE is there.


Yep.

Acked-by: Daniel Borkmann dbork...@redhat.com


Anyway, I think it's still better than the rather ugly Makefile workaround
imho, so I'm generally fine with this.



Re: [PATCH v3] crypto: more robust crypto_memneq

2013-11-26 Thread Daniel Borkmann

On 11/26/2013 01:00 AM, Cesar Eduardo Barros wrote:

Disabling compiler optimizations can be fragile, since a new
optimization could be added to -O0 or -Os that breaks the assumptions
the code is making.

Instead of disabling compiler optimizations, use a dummy inline assembly
(based on RELOC_HIDE) to block the problematic kinds of optimization,
while still allowing other optimizations to be applied to the code.

The dummy inline assembly is added after every OR, and has the
accumulator variable as its input and output. The compiler is forced to
assume that the dummy inline assembly could both depend on the
accumulator variable and change the accumulator variable, so it is
forced to compute the value correctly before the inline assembly, and
cannot assume anything about its value after the inline assembly.
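(For gcc-compatible compilers, the hiding macro this refers to,
OPTIMIZER_HIDE_VAR() in include/linux/compiler-gcc.h, is essentially the
one-liner below; treat the exact spelling as a sketch:

   #define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

The empty asm claims to both consume and produce var, so the compiler must
materialize the correct value before it and may assume nothing afterwards.)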

This change should be enough to make crypto_memneq work correctly (with
data-independent timing) even if it is inlined at its call sites. That
can be done later in a followup patch.

Compile-tested on x86_64.


Actually with yet another version, I hoped that the compile-tested-only
statement would eventually disappear, ohh well. ;)


Signed-off-by: Cesar Eduardo Barros ces...@cesarb.eti.br


Resolving the OPTIMIZER_HIDE_VAR() macro for compilers other than GCC into a
barrier() seems a bit suboptimal, but assuming 99% of people will use GCC
anyway, then for the minority of the remaining ones, they will in the worst
case have a clever compiler that eventually mimics memcmp() in some
situations, or a not-so-clever compiler that executes the full code as is.

Anyway, I think it's still better than the rather ugly Makefile workaround
imho, so I'm generally fine with this.


Re: [PATCH] crypto: more robust crypto_memneq

2013-11-25 Thread Daniel Borkmann

On 11/25/2013 04:59 PM, James Yonan wrote:

On 24/11/2013 14:12, Cesar Eduardo Barros wrote:

Disabling compiler optimizations can be fragile, since a new
optimization could be added to -O0 or -Os that breaks the assumptions
the code is making.

Instead of disabling compiler optimizations, use a dummy inline assembly
(based on RELOC_HIDE) to block the problematic kinds of optimization,
while still allowing other optimizations to be applied to the code.

The dummy inline assembly is added after every OR, and has the
accumulator variable as its input and output. The compiler is forced to
assume that the dummy inline assembly could both depend on the
accumulator variable and change the accumulator variable, so it is
forced to compute the value correctly before the inline assembly, and
cannot assume anything about its value after the inline assembly.

This change should be enough to make crypto_memneq work correctly (with
data-independent timing) even if it is inlined at its call sites. That
can be done later in a followup patch.

Compile-tested on x86_64.

Signed-off-by: Cesar Eduardo Barros ces...@cesarb.eti.br


This approach using __asm__ ("" : "=r" (var) : "0" (var)) to try to prevent
compiler optimizations of var is interesting.

I like the fact that it's finer-grained than -Os and doesn't preclude inlining.


Agreed. This looks much better than the Makefile workaround. Do we have
a hard guarantee that in future, this will not be detected and optimized
away by the compiler?

Otherwise, works fine, e.g.:
int main(void)
{
int foo = 5;
__asm__ __volatile__ ("" : "=r" (foo) : "0" (foo));
if (foo == 5)
return 1;
else
return 0;
}

gcc -O2 -Wall foo.c, w/ asm code:
Dump of assembler code for function main:
   0x00400390 +0:   mov$0x5,%eax
   0x00400395 +5:   cmp$0x5,%eax
   0x00400398 +8:   sete   %al
   0x0040039b +11:  movzbl %al,%eax
   0x0040039e +14:  retq

gcc -O2 -Wall foo.c, w/o asm code:
Dump of assembler code for function main:
   0x00400390 +0:   mov$0x1,%eax
   0x00400395 +5:   retq


One concern would be that __asm__ could be optimized out unless __volatile__ is 
present.

James



Re: [PATCH v2] crypto: more robust crypto_memneq

2013-11-25 Thread Daniel Borkmann

On 11/25/2013 11:35 PM, Cesar Eduardo Barros wrote:

Disabling compiler optimizations can be fragile, since a new
optimization could be added to -O0 or -Os that breaks the assumptions
the code is making.

Instead of disabling compiler optimizations, use a dummy inline assembly
(based on RELOC_HIDE) to block the problematic kinds of optimization,
while still allowing other optimizations to be applied to the code.

The dummy inline assembly is added after every OR, and has the
accumulator variable as its input and output. The compiler is forced to
assume that the dummy inline assembly could both depend on the
accumulator variable and change the accumulator variable, so it is
forced to compute the value correctly before the inline assembly, and
cannot assume anything about its value after the inline assembly.

This change should be enough to make crypto_memneq work correctly (with
data-independent timing) even if it is inlined at its call sites. That
can be done later in a followup patch.

Compile-tested on x86_64.

Signed-off-by: Cesar Eduardo Barros ces...@cesarb.eti.br


Acked-by: Daniel Borkmann dbork...@redhat.com

Looks good, thanks!


Re: [PATCH] crypto_mem_not_equal: add constant-time equality testing of memory regions

2013-09-19 Thread Daniel Borkmann

On 09/19/2013 02:13 AM, James Yonan wrote:
[...]

We can easily specify -Os in the Makefile rather than depending on #pragma 
optimize or __attribute__ optimize if they are considered broken.

Re: arch/*/crypto/... asm, not sure it's worth it given the extra effort to 
develop, test, and maintain asm for all archs.  The two things we care about 
(constant time and performance) seem readily achievable in C.

Regarding O0 vs. Os, I would tend to prefer Os because it's much faster than 
O0, but still carries the desirable property that optimizations that increase 
code size are disabled.  It seems that short-circuit optimizations would be 
disabled by this, since by definition a short-circuit optimization requires the 
addition of a compare and branch.


Ok, if we can make sure that this would override global defaults under any
circumstances, then that approach should be fine, imho.
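(For the record, a per-object override in kbuild would look roughly like the
line below in crypto/Makefile -- a sketch of the idea with a hypothetical
object name, not something that was applied:

   CFLAGS_crypto_mem_not_equal.o := -Os
)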

I would suggest that you use the crypto_mem_not_equal() function that you
originally had or that I was proposing, and still allow the possibility of
an arch-optimized version, if people want to.

In that way, it can be kept simple and stupid and easy to review, just like
all the other util functions such as memcmp etc. are implemented in [1].

 [1] http://lingrok.org/xref/linux-net-next/lib/string.c#643


Re: [PATCH] crypto_mem_not_equal: add constant-time equality testing of memory regions

2013-09-17 Thread Daniel Borkmann

On 09/16/2013 07:10 PM, James Yonan wrote:

On 16/09/2013 01:56, Daniel Borkmann wrote:

On 09/15/2013 06:59 PM, James Yonan wrote:

On 15/09/2013 09:45, Florian Weimer wrote:

* James Yonan:


+ * Constant-time equality testing of memory regions.
+ * Returns 0 when data is equal, non-zero otherwise.
+ * Fast path if size == 16.
+ */
+noinline unsigned long crypto_mem_not_equal(const void *a, const
void *b, size_t size)


I think this should really return unsigned or int, to reduce the risk
that the upper bytes are truncated because the caller uses an
inappropriate type, resulting in a bogus zero result.  Reducing the
value to 0/1 probably doesn't hurt performance too much.  It also
doesn't encode any information about the location of the difference in
the result value, which helps if that ever leaks.


The problem with returning 0/1 within the function body of
crypto_mem_not_equal is that it makes it easier for the compiler to
introduce a short-circuit optimization.

It might be better to move the test where the result is compared
against 0 into an inline function:

noinline unsigned long __crypto_mem_not_equal(const void *a, const
void *b, size_t size);

static inline int crypto_mem_not_equal(const void *a, const void *b,
size_t size) {
 return __crypto_mem_not_equal(a, b, size) != 0UL ? 1 : 0;
}

This hides the fact that we are only interested in a boolean result
from the compiler when it's compiling crypto_mem_not_equal.c, but also
ensures type safety when users test the return value.  It's also
likely to have little or no performance impact.


Well, the code snippet I've provided from NaCl [1] is not really fast-path
as you say, but rather to prevent the compiler from doing such optimizations
by having a transformation of the accumulated bits into 0 and 1 as an end
result (likely to prevent a short circuit), plus it has static size, so no
loops applied here that could screw up.

Variable size could be done under arch/ in asm, and if not available, that
just falls back to normal memcmp that is being transformed into a same
return value. By that, all other archs could easily migrate afterwards.
What do you think?

  [1] http://www.spinics.net/lists/linux-crypto/msg09558.html


I'm not sure that the differentbits -> 0/-1 transform in NaCl really buys us
anything, because we don't care very much about making the final test of
differentbits != 0 constant-time. An attacker already knows whether the test
succeeded or failed -- we care more about making the failure cases
constant-time.

To do this, we need to make sure that the compiler doesn't insert one or
more early instructions to compare differentbits with 0xFF and then bypass
the rest of the F(n) lines, because it knows then that the value of
differentbits cannot be changed by subsequent F(n) lines. It seems that all
of the approaches that use |= to build up the differentbits value suffer
from this problem.


My proposed solution is rather than trying to outsmart the compiler with
code that resists optimization, why not just turn optimization off directly
with #pragma GCC optimize. Or better yet, use an optimization setting that
provides reasonable speed without introducing potential short-circuit
optimizations.

By optimizing for size (-Os), the compiler will need to turn off
optimizations that add code for a possible speed benefit, and the kind of
short-circuit optimizations that we are trying to avoid fall precisely into
this class -- they add an instruction to check if the OR accumulator has all
of its bits set, then if so, do an early exit. So by using -Os, we still
benefit from optimizations that increase speed but don't increase code size.

While I was looking a bit further into this, I thought using an attribute
like this might be a cleaner variant ...

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 92669cd..2505b1b 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -351,6 +351,11 @@ void ftrace_likely_update(struct ftrace_branch_data *f, 
int val, int expect);
  */
 #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

+/* Tell GCC to turn off optimization for a particular function. */
+#ifndef __do_not_optimize
+#define __do_not_optimize  __attribute__((optimize("O0")))
+#endif
+
 /* Ignore/forbid kprobes attach on very low level functions marked by this 
attribute: */
 #ifdef CONFIG_KPROBES
 # define __kprobes __attribute__((__section__(.kprobes.text)))

... however, then I found on the GCC developer mailing list [1]: "That said,
I consider the optimize attribute code seriously broken and unmaintained
(but sometimes useful for debugging - and only that). [...] Does #pragma GCC
optimize work more reliably? No, it uses the same mechanism internally. [...]
And if these options are so broken, should they be marked as such in the
manual? [...] Probably yes."

Therefore, I still need to ask ... what about the option of an 
arch/*/crypto/... asm

Re: [PATCH] crypto_mem_not_equal: add constant-time equality testing of memory regions

2013-09-16 Thread Daniel Borkmann

On 09/15/2013 06:59 PM, James Yonan wrote:

On 15/09/2013 09:45, Florian Weimer wrote:

* James Yonan:


+ * Constant-time equality testing of memory regions.
+ * Returns 0 when data is equal, non-zero otherwise.
+ * Fast path if size == 16.
+ */
+noinline unsigned long crypto_mem_not_equal(const void *a, const void *b, 
size_t size)


I think this should really return unsigned or int, to reduce the risk
that the upper bytes are truncated because the caller uses an
inappropriate type, resulting in a bogus zero result.  Reducing the
value to 0/1 probably doesn't hurt performance too much.  It also
doesn't encode any information about the location of the difference in
the result value, which helps if that ever leaks.


The problem with returning 0/1 within the function body of crypto_mem_not_equal 
is that it makes it easier for the compiler to introduce a short-circuit 
optimization.

It might be better to move the test where the result is compared against 0 into 
an inline function:

noinline unsigned long __crypto_mem_not_equal(const void *a, const void *b, 
size_t size);

static inline int crypto_mem_not_equal(const void *a, const void *b, size_t 
size) {
 return __crypto_mem_not_equal(a, b, size) != 0UL ? 1 : 0;
}

This hides the fact that we are only interested in a boolean result from the 
compiler when it's compiling crypto_mem_not_equal.c, but also ensures type 
safety when users test the return value.  It's also likely to have little or no 
performance impact.


Well, the code snippet I've provided from NaCl [1] is not really fast-path
as you say, but rather to prevent the compiler from doing such optimizations
by having a transformation of the accumulated bits into 0 and 1 as an end
result (likely to prevent a short circuit), plus it has static size, so no
loops applied here that could screw up.

Variable size could be done under arch/ in asm, and if not available, that
just falls back to normal memcmp that is being transformed into a same return
value. By that, all other archs could easily migrate afterwards. What do you
think?

 [1] http://www.spinics.net/lists/linux-crypto/msg09558.html


Re: [PATCH] crypto_memcmp: add constant-time memcmp

2013-09-13 Thread Daniel Borkmann

On 09/11/2013 07:20 PM, James Yonan wrote:

On 10/09/2013 12:57, Daniel Borkmann wrote:

There was a similar patch posted some time ago [1] on lkml, where
Florian (CC) made a good point in [2] that future compiler optimizations
could short circuit on this. This issue should probably be addressed in
such a patch here as well.

 [1] https://lkml.org/lkml/2013/2/10/131
 [2] https://lkml.org/lkml/2013/2/11/381


On 11/09/2013 06:19, Marcelo Cerri wrote:

The discussion that Daniel pointed out has another interesting point
regarding the function name. I don't think it's a good idea to name it
crypto_memcmp, since it doesn't behave the same way as strcmp.

Florian suggested in the thread names such as crypto_mem_equal, which I
think fits better here.


Ok, here's another stab at this:

* Changed the name to crypto_mem_not_equal.  The not_equal seems to
make more sense because the function returns a nonzero true value if
the memory regions are not equal.


Ok, sounds good.


* Good point that a smart optimizer might add instructions to
short-circuit the loop if all bits in ret have been set.  One way to
deal with this is to disable optimizations that might increase code
size, since a short-circuit optimization in this case would require
adding instructions.

#pragma GCC optimize ("Os")

The nice thing about using #pragma is that older versions of gcc that
don't recognize it will simply ignore it, and we can probably presume
that older versions of gcc do not support a short-circuit optimization
if the latest one does not.  I did a quick test using gcc 3.4.6 at -O2,
and did not see any evidence of a short-circuit optimization.

* Improved performance when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
enabled.  This makes the performance roughly on-par with memcmp.


Hm, why don't we take fixed-size versions from Daniel J. Bernstein's NaCl
library [1], e.g. for comparing hashes?

E.g. for 16 bytes:

int crypto_verify(const unsigned char *x,const unsigned char *y)
{
  unsigned int differentbits = 0;
#define F(i) differentbits |= x[i] ^ y[i];
  F(0)
  F(1)
  F(2)
  F(3)
  F(4)
  F(5)
  F(6)
  F(7)
  F(8)
  F(9)
  F(10)
  F(11)
  F(12)
  F(13)
  F(14)
  F(15)
  return (1 & ((differentbits - 1) >> 8)) - 1;
}

It will return 0 if x[0], x[1], ..., x[15] are the same as y[0], y[1], ...,
y[15], otherwise it returns -1: if differentbits is 0, then (differentbits
- 1) wraps around to all-ones, so the masked bit is 1 and the expression
yields 1 - 1 = 0; for any nonzero differentbits up to 255, bit 8 of
(differentbits - 1) is 0, yielding 0 - 1 = -1. That's w/o for loops, so
probably more compiler-proof ...

  [1] http://nacl.cr.yp.to/




#pragma GCC optimize ("Os")

noinline unsigned long crypto_mem_not_equal(const void *a, const void *b, 
size_t size)
{
 unsigned long ret = 0;

#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
#if BITS_PER_LONG == 64
 while (size >= 8) {
 ret |= *(unsigned long *)a ^ *(unsigned long *)b;
 a += 8;
 b += 8;
 size -= 8;
 }
 if (!size)
 return ret;
#endif /* BITS_PER_LONG == 64 */
 if (sizeof(unsigned int) == 4) {
 while (size >= 4) {
 ret |= *(unsigned int *)a ^ *(unsigned int *)b;
 a += 4;
 b += 4;
 size -= 4;
 }
 if (!size)
 return ret;
 }
#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
 while (size > 0) {
 ret |= *(unsigned char *)a ^ *(unsigned char *)b;
 a += 1;
 b += 1;
 size -= 1;
 }
 return ret;
}

James



Re: [PATCH] crypto_memcmp: add constant-time memcmp

2013-09-10 Thread Daniel Borkmann

On 09/10/2013 08:38 PM, James Yonan wrote:

When comparing MAC hashes, AEAD authentication tags, or other hash
values in the context of authentication or integrity checking, it
is important not to leak timing information to a potential attacker.

Bytewise memory comparisons (such as memcmp) are usually optimized so
that they return a nonzero value as soon as a mismatch is found.
This early-return behavior can leak timing information, allowing an
attacker to iteratively guess the correct result.

This patch adds a new method crypto_memcmp that has the same prototype
as standard memcmp, but that compares strings of the same length in
roughly constant time (cache misses could change the timing, but
since they don't reveal information about the content of the strings
being compared, they are effectively benign).  Note that crypto_memcmp
(unlike memcmp) can only be used to test for equality or inequality,
NOT greater-than or less-than.  This is not an issue for its use-cases
within the Crypto API.

I tried to locate all of the places in the Crypto API where memcmp was
being used for authentication or integrity checking, and convert them
over to crypto_memcmp.

crypto_memcmp is declared noinline and placed in its own source file
because a very smart compiler (or LTO) might notice that the return
value is always compared against zero/nonzero, and might then
reintroduce the same early-return optimization that we are trying to
avoid.
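(The body of crypto/crypto_memcmp.c is not reproduced in this excerpt; a
minimal sketch of such a constant-time comparison, assuming the prototype
stated above, would be:

   noinline int crypto_memcmp(const void *a, const void *b, size_t size)
   {
           const u8 *pa = a, *pb = b;
           u8 ret = 0;

           /* Accumulate differences over every byte; no early return,
            * so the runtime does not depend on where a mismatch sits. */
           while (size--)
                   ret |= *pa++ ^ *pb++;
           return ret;
   }
)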


There was a similar patch posted some time ago [1] on lkml, where
Florian (CC) made a good point in [2] that future compiler optimizations
could short circuit on this. This issue should probably be addressed in
such a patch here as well.

 [1] https://lkml.org/lkml/2013/2/10/131
 [2] https://lkml.org/lkml/2013/2/11/381


Signed-off-by: James Yonan ja...@openvpn.net
---
  crypto/Makefile |  2 +-
  crypto/asymmetric_keys/rsa.c|  5 +++--
  crypto/authenc.c|  7 ---
  crypto/authencesn.c |  9 +
  crypto/ccm.c|  5 +++--
  crypto/crypto_memcmp.c  | 31 +++
  crypto/gcm.c|  3 ++-
  include/crypto/internal/crypto_memcmp.h | 20 
  8 files changed, 69 insertions(+), 13 deletions(-)
  create mode 100644 crypto/crypto_memcmp.c
  create mode 100644 include/crypto/internal/crypto_memcmp.h

diff --git a/crypto/Makefile b/crypto/Makefile
index 2ba0df2..39a574d 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -3,7 +3,7 @@
  #

  obj-$(CONFIG_CRYPTO) += crypto.o
-crypto-y := api.o cipher.o compress.o
+crypto-y := api.o cipher.o compress.o crypto_memcmp.o

  obj-$(CONFIG_CRYPTO_WORKQUEUE) += crypto_wq.o

diff --git a/crypto/asymmetric_keys/rsa.c b/crypto/asymmetric_keys/rsa.c
index 4a6a069..4f9a250 100644
--- a/crypto/asymmetric_keys/rsa.c
+++ b/crypto/asymmetric_keys/rsa.c
@@ -13,6 +13,7 @@
 #include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/slab.h>
+#include <crypto/internal/crypto_memcmp.h>
 #include "public_key.h"
 
 MODULE_LICENSE("GPL");
@@ -189,12 +190,12 @@ static int RSA_verify(const u8 *H, const u8 *EM, size_t 
k, size_t hash_size,
}
}

-   if (memcmp(asn1_template, EM + T_offset, asn1_size) != 0) {
+   if (crypto_memcmp(asn1_template, EM + T_offset, asn1_size) != 0) {
kleave(" = -EBADMSG [EM[T] ASN.1 mismatch]");
return -EBADMSG;
}

-   if (memcmp(H, EM + T_offset + asn1_size, hash_size) != 0) {
+   if (crypto_memcmp(H, EM + T_offset + asn1_size, hash_size) != 0) {
kleave(" = -EKEYREJECTED [EM[T] hash mismatch]");
return -EKEYREJECTED;
}
diff --git a/crypto/authenc.c b/crypto/authenc.c
index ffce19d..82ca98f 100644
--- a/crypto/authenc.c
+++ b/crypto/authenc.c
@@ -13,6 +13,7 @@
 #include <crypto/aead.h>
 #include <crypto/internal/hash.h>
 #include <crypto/internal/skcipher.h>
+#include <crypto/internal/crypto_memcmp.h>
 #include <crypto/authenc.h>
 #include <crypto/scatterwalk.h>
 #include <linux/err.h>
@@ -188,7 +189,7 @@ static void authenc_verify_ahash_update_done(struct 
crypto_async_request *areq,
	scatterwalk_map_and_copy(ihash, areq_ctx->sg, areq_ctx->cryptlen,
				 authsize, 0);

-	err = memcmp(ihash, ahreq->result, authsize) ? -EBADMSG : 0;
+	err = crypto_memcmp(ihash, ahreq->result, authsize) ? -EBADMSG : 0;
if (err)
goto out;

@@ -227,7 +228,7 @@ static void authenc_verify_ahash_done(struct 
crypto_async_request *areq,
	scatterwalk_map_and_copy(ihash, areq_ctx->sg, areq_ctx->cryptlen,
				 authsize, 0);

-	err = memcmp(ihash, ahreq->result, authsize) ? -EBADMSG : 0;
+	err = crypto_memcmp(ihash, ahreq->result, authsize) ? -EBADMSG : 0;
if (err)
goto out;

@@ -462,7 +463,7 @@ static int crypto_authenc_verify(struct 

Re: [PATCH crypto] crypto: algboss: fix NULL pointer dereference in cryptomgr_probe

2013-06-24 Thread Daniel Borkmann

On 06/24/2013 03:59 PM, Herbert Xu wrote:
...

Author: Herbert Xu herb...@gondor.apana.org.au
Date:   Mon Jun 24 21:57:42 2013 +0800

 crypto: algboss - Hold ref count on larval


...


 The use of wait_for_completion_interruptible is intentional so that
 we don't lock up the thread if a bug causes us to never wake up.

 This bug is caused by the helper thread using the larval without
 holding a reference count on it.  If the helper thread completes
 after the original thread requesting for help has gone away and
 destroyed the larval, then we get the crash above.

 So the fix is to hold a reference count on the larval.
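(In a hypothetical simplification -- the real patch threads the larval
through the probe parameters -- the idea is:

   /* before spawning the helper thread: pin the larval */
   crypto_alg_get(&larval->alg);

   /* in cryptomgr_probe(), once probing is finished: */
   complete_all(&larval->completion);
   crypto_alg_put(&larval->alg);

Holding the reference keeps the completion object alive until the helper
thread has signalled it, even if the original requester has already gone
away.)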

 Cc: sta...@vger.kernel.org # 3.6+
 Reported-by: Daniel Borkmann dbork...@redhat.com
 Signed-off-by: Herbert Xu herb...@gondor.apana.org.au


Tested-by: Daniel Borkmann dbork...@redhat.com

This fixes the panic for me with the reproducer I sent off-list.

Thanks Herbert !


[PATCH crypto] crypto: algboss: fix NULL pointer dereference in cryptomgr_probe

2013-06-20 Thread Daniel Borkmann
After having fixed a NULL pointer dereference in SCTP 1abd165e (net:
sctp: fix NULL pointer dereference in socket destruction), I ran into
the following NULL pointer dereference in the crypto subsystem with
the same reproducer, easily hit each time:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [81070321] __wake_up_common+0x31/0x90
PGD 0
Oops:  [#1] SMP
Modules linked in: padlock_sha(F-) sha256_generic(F) sctp(F) libcrc32c(F) [..]
CPU: 6 PID: 3326 Comm: cryptomgr_probe Tainted: GF3.10.0-rc5+ #1
Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
task: 88007b6cf4e0 ti: 88007b7cc000 task.ti: 88007b7cc000
RIP: 0010:[81070321]  [81070321] __wake_up_common+0x31/0x90
RSP: 0018:88007b7cde08  EFLAGS: 00010082
RAX: ffe8 RBX: 88003756c130 RCX: 
RDX:  RSI: 0003 RDI: 88003756c130
RBP: 88007b7cde48 R08:  R09: 88012b173200
R10:  R11:  R12: 0282
R13: 88003756c138 R14:  R15: 
FS:  () GS:88012fc6() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2:  CR3: 01a0b000 CR4: 07e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Stack:
 88007b7cde28 0003 88007b7cde28 88003756c130
 0282 88003756c128 81227670 
 88007b7cde78 810722b7 88007cdcf000 81a90540
Call Trace:
 [81227670] ? crypto_alloc_pcomp+0x20/0x20
 [810722b7] complete_all+0x47/0x60
 [81227708] cryptomgr_probe+0x98/0xc0
 [81227670] ? crypto_alloc_pcomp+0x20/0x20
 [8106760e] kthread+0xce/0xe0
 [81067540] ? kthread_freezable_should_stop+0x70/0x70
 [815450dc] ret_from_fork+0x7c/0xb0
 [81067540] ? kthread_freezable_should_stop+0x70/0x70
Code: 41 56 41 55 41 54 53 48 83 ec 18 66 66 66 66 90 89 75 cc 89 55 c8
  4c 8d 6f 08 48 8b 57 08 41 89 cf 4d 89 c6 48 8d 42 e
RIP  [81070321] __wake_up_common+0x31/0x90
 RSP 88007b7cde08
CR2: 
---[ end trace b495b19270a4d37e ]---

My assumption is that the following is happening: the minimal SCTP
tool runs under ``echo 1 > /proc/sys/net/sctp/auth_enable'', hence
it's making use of crypto_alloc_hash() via sctp_auth_init_hmacs().
It forks itself, heavily allocates, binds, listens and waits in
accept on sctp sockets, and then randomly kills some of them (no
need for an actual client in this case to hit this). Then, again,
allocating, binding, etc, and then killing child processes.

The problem that might be happening here is that cryptomgr requests
the module to probe/load through cryptomgr_schedule_probe(), but
before the thread handler cryptomgr_probe() returns, we return from
the wait_for_completion_interruptible() function and probably already
have cleared up larval, thus we run into a NULL pointer dereference
when in cryptomgr_probe() complete_all() is being called.

If we wait with wait_for_completion() instead, this panic will not
occur anymore. This is valid, because in case a signal is pending,
cryptomgr_probe() returns from probing anyway with properly calling
complete_all().

Signed-off-by: Daniel Borkmann dbork...@redhat.com
---
 v1->v2:
  - Submitting as non-RFC
  - Slightly improving commit message

 crypto/algboss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/algboss.c b/crypto/algboss.c
index 769219b..eee89a5 100644
--- a/crypto/algboss.c
+++ b/crypto/algboss.c
@@ -195,7 +195,7 @@ static int cryptomgr_schedule_probe(struct crypto_larval 
*larval)
if (IS_ERR(thread))
goto err_free_param;
 
-	wait_for_completion_interruptible(&larval->completion);
+	wait_for_completion(&larval->completion);
 
return NOTIFY_STOP;
 
-- 
1.7.11.7


[RFC, PATCH] crypto: algboss: fix NULL pointer dereference in cryptomgr_probe

2013-06-14 Thread Daniel Borkmann
After having fixed a NULL pointer dereference in SCTP 1abd165e (net:
sctp: fix NULL pointer dereference in socket destruction), I ran into
the following NULL pointer dereference in the crypto subsystem with
the same reproducer, easily hit each time:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [81070321] __wake_up_common+0x31/0x90
PGD 0
Oops:  [#1] SMP
Modules linked in: padlock_sha(F-) sha256_generic(F) sctp(F) libcrc32c(F) [..]
CPU: 6 PID: 3326 Comm: cryptomgr_probe Tainted: GF3.10.0-rc5+ #1
Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
task: 88007b6cf4e0 ti: 88007b7cc000 task.ti: 88007b7cc000
RIP: 0010:[81070321]  [81070321] __wake_up_common+0x31/0x90
RSP: 0018:88007b7cde08  EFLAGS: 00010082
RAX: ffe8 RBX: 88003756c130 RCX: 
RDX:  RSI: 0003 RDI: 88003756c130
RBP: 88007b7cde48 R08:  R09: 88012b173200
R10:  R11:  R12: 0282
R13: 88003756c138 R14:  R15: 
FS:  () GS:88012fc6() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2:  CR3: 01a0b000 CR4: 07e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Stack:
 88007b7cde28 0003 88007b7cde28 88003756c130
 0282 88003756c128 81227670 
 88007b7cde78 810722b7 88007cdcf000 81a90540
Call Trace:
 [81227670] ? crypto_alloc_pcomp+0x20/0x20
 [810722b7] complete_all+0x47/0x60
 [81227708] cryptomgr_probe+0x98/0xc0
 [81227670] ? crypto_alloc_pcomp+0x20/0x20
 [8106760e] kthread+0xce/0xe0
 [81067540] ? kthread_freezable_should_stop+0x70/0x70
 [815450dc] ret_from_fork+0x7c/0xb0
 [81067540] ? kthread_freezable_should_stop+0x70/0x70
Code: 41 56 41 55 41 54 53 48 83 ec 18 66 66 66 66 90 89 75 cc 89 55 c8
  4c 8d 6f 08 48 8b 57 08 41 89 cf 4d 89 c6 48 8d 42 e
RIP  [81070321] __wake_up_common+0x31/0x90
 RSP 88007b7cde08
CR2: 
---[ end trace b495b19270a4d37e ]---

My assumption is that the following is happening: the minimal SCTP
tool runs under ``echo 1 > /proc/sys/net/sctp/auth_enable'', hence
it's making use of crypto_alloc_hash() via sctp_auth_init_hmacs().
It forks itself, heavily allocates, binds, listens and waits in
accept on sctp sockets, and then randomly kills some of them (no
need for an actual client in this case to hit this). Then, again,
allocating, binding, etc, and then killing child processes.

The problem that might be happening here is that cryptomgr requests
the module to probe/load through cryptomgr_schedule_probe(), but
before the thread handler cryptomgr_probe() returns, we return from
the wait_for_completion_interruptible() function and probably already
have cleared up larval, thus we run into a NULL pointer dereference
when in cryptomgr_probe() complete_all() is being called.

If we wait with wait_for_completion() instead, this panic will not
occur anymore.

Signed-off-by: Daniel Borkmann dbork...@redhat.com
---
 I'm not very familiar with the crypto subsystem and not entirely
 sure if this is the best solution. However, it has fixed the panic
 in my case.

 crypto/algboss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/algboss.c b/crypto/algboss.c
index 769219b..eee89a5 100644
--- a/crypto/algboss.c
+++ b/crypto/algboss.c
@@ -195,7 +195,7 @@ static int cryptomgr_schedule_probe(struct crypto_larval 
*larval)
if (IS_ERR(thread))
goto err_free_param;
 
-	wait_for_completion_interruptible(&larval->completion);
+	wait_for_completion(&larval->completion);
 
return NOTIFY_STOP;
 
-- 
1.7.11.7
