Re: [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations

2017-07-19 Thread H. Peter Anvin
<cmetc...@mellanox.com>,"Paul E . McKenney" <paul...@linux.vnet.ibm.com>,Andrew 
Morton <a...@linux-foundation.org>,Christopher Li <spa...@chrisli.org>,Dou 
Liyang <douly.f...@cn.fujitsu.com>,Masahiro Yamada 
<yamada.masah...@socionext.com>,Daniel Borkmann <dan...@iogearbox.net>,Markus 
Trippelsdorf <mar...@trippelsdorf.de>,Peter Foley <pefol...@pefoley.com>,Steven 
Rostedt <rost...@goodmis.org>,Tim Chen <tim.c.c...@linux.intel.com>,Ard 
Biesheuvel <ard.biesheu...@linaro.org>,Catalin Marinas 
<catalin.mari...@arm.com>,Matthew Wilcox <mawil...@microsoft.com>,Michal Hocko 
<mho...@suse.com>,Rob Landley <r...@landley.net>,Jiri Kosina 
<jkos...@suse.cz>,"H . J . Lu" <hjl.to...@gmail.com>,Paul Bolle 
<pebo...@tiscali.nl>,Baoquan He <b...@redhat.com>,Daniel Micay 
<danielmi...@gmail.com>,the arch/x86 maintainers 
<x...@kernel.org>,linux-crypto@vger.kernel.org,LKML 
<linux-ker...@vger.kernel.org>,xen-de...@lists.xenproject.org,kvm list 
<k...@vger.kernel.org>,Linux PM list
<linux...@vger.kernel.org>,linux-arch 
<linux-a...@vger.kernel.org>,linux-spa...@vger.kernel.org,Kernel Hardening 
<kernel-harden...@lists.openwall.com>
From: h...@zytor.com
Message-ID: <0ef6faaa-a99c-4f0d-9e4a-ad25e9395...@zytor.com>

On July 19, 2017 4:25:56 PM PDT, Thomas Garnier <thgar...@google.com> wrote:
>On Wed, Jul 19, 2017 at 4:08 PM, H. Peter Anvin <h...@zytor.com> wrote:
>> On 07/19/17 15:47, Thomas Garnier wrote:
>>> On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin <h...@zytor.com>
>wrote:
>>>> On 07/18/17 15:33, Thomas Garnier wrote:
>>>>> The x86 relocation tool generates a list of 32-bit signed
>integers. There
>>>>> was no need to use 64-bit integers because all addresses where
>above the 2G
>>>>> top of the memory.
>>>>>
>>>>> This change add a large-reloc option to generate 64-bit unsigned
>integers.
>>>>> It can be used when the kernel plan to go below the top 2G and
>32-bit
>>>>> integers are not enough.
>>>>
>>>> Why on Earth?  This would only be necessary if the *kernel itself*
>was
>>>> more than 2G, which isn't going to happen for the forseeable
>future.
>>>
>>> Because the relocation integer is an absolute address, not an offset
>>> in the binary. Next iteration, I can try using a 32-bit offset for
>>> everyone.
>>
>> It is an absolute address *as the kernel was originally linked*, for
>> obvious reasons.
>
>Sure when the kernel was just above 0x8000, it doesn't
>work when it goes down to 0x. That's why using an
>offset might make more sense in general.
>
>>
>> -hpa
>>

What is the motivation for changing the pre linked address at all?
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [RFC 06/22] kvm: Adapt assembly for PIE support

2017-07-19 Thread H. Peter Anvin
<paul.gortma...@windriver.com>,Chris Metcalf <cmetc...@mellanox.com>,"Paul E . 
McKenney" <paul...@linux.vnet.ibm.com>,Andrew Morton 
<a...@linux-foundation.org>,Christopher Li <spa...@chrisli.org>,Dou Liyang 
<douly.f...@cn.fujitsu.com>,Masahiro Yamada 
<yamada.masah...@socionext.com>,Daniel Borkmann <dan...@iogearbox.net>,Markus 
Trippelsdorf <mar...@trippelsdorf.de>,Peter Foley <pefol...@pefoley.com>,Steven 
Rostedt <rost...@goodmis.org>,Tim Chen <tim.c.c...@linux.intel.com>,Catalin 
Marinas <catalin.mari...@arm.com>,Matthew Wilcox 
<mawil...@microsoft.com>,Michal Hocko <mho...@suse.com>,Rob Landley 
<r...@landley.net>,Jiri Kosina <jkos...@suse.cz>,"H . J . Lu" 
<hjl.to...@gmail.com>,Paul Bolle <pebo...@tiscali.nl>,Baoquan He 
<b...@redhat.com>,Daniel Micay <danielmi...@gmail.com>,the arch/x86 maintainers 
<x...@kernel.org>,"linux-crypto@vger.kernel.org" 
<linux-crypto@vger.kernel.org>,Linux Kernel Mailing List 
<linux-ker...@vger.kernel.org>,xen-de...@lists.xenproject.org,kvm list
<k...@vger.kernel.org>,linux-pm <linux...@vger.kernel.org>,linux-arch 
<linux-a...@vger.kernel.org>,Linux-Sparse <linux-spa...@vger.kernel.org>,Kernel 
Hardening <kernel-harden...@lists.openwall.com>
From: h...@zytor.com
Message-ID: <83ba7600-bc8d-4c91-812c-dd2a0bf44...@zytor.com>

On July 19, 2017 3:58:07 PM PDT, Ard Biesheuvel <ard.biesheu...@linaro.org> 
wrote:
>On 19 July 2017 at 23:27, H. Peter Anvin <h...@zytor.com> wrote:
>> On 07/19/17 08:40, Thomas Garnier wrote:
>>>>
>>>> This doesn't look right.  It's accessing a per-cpu variable.  The
>>>> per-cpu section is an absolute, zero-based section and not subject
>to
>>>> relocation.
>>>
>>> PIE does not respect the zero-based section, it tries to have
>>> everything relative. Patch 16/22 also adapt per-cpu to work with PIE
>>> (while keeping the zero absolute design by default).
>>>
>>
>> This is silly.  The right thing is for PIE is to be explicitly
>absolute,
>> without (%rip).  The use of (%rip) memory references for percpu is
>just
>> an optimization.
>>
>
>Sadly, there is an issue in binutils that may prevent us from doing
>this as cleanly as we would want.
>
>For historical reasons, bfd.ld emits special symbols like
>__GLOBAL_OFFSET_TABLE__ as absolute symbols with a section index of
>SHN_ABS, even though it is quite obvious that they are relative like
>any other symbol that points into the image. Unfortunately, this means
>that binutils needs to emit R_X86_64_RELATIVE relocations even for
>SHN_ABS symbols, which means we lose the ability to use both absolute
>and relocatable symbols in the same PIE image (unless the reloc tool
>can filter them out)
>
>More info here:
>https://sourceware.org/bugzilla/show_bug.cgi?id=19818

The reloc tool already has the ability to filter symbols.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH 5/7] x86, boot, LLVM: Use regparm=0 for memcpy and memset

2017-06-22 Thread H. Peter Anvin
On 03/17/17 05:08, Peter Zijlstra wrote:
> On Thu, Mar 16, 2017 at 05:15:18PM -0700, Michael Davidson wrote:
>> Use the standard regparm=0 calling convention for memcpy and
>> memset when building with clang.
>>
>> This is a work around for a long standing clang bug
>> (see https://llvm.org/bugs/show_bug.cgi?id=3997) where
>> clang always uses the standard regparm=0 calling convention
>> for any implcit calls to memcpy and memset that it generates
>> (eg for structure assignments and initialization) even if an
>> alternate calling convention such as regparm=3 has been specified.
> 
> Seriously, fix LLVM already.
> 

Yes, this is a real stinker, in no small part because once clang is
fixed to DTRT then this is actually broken...

-hpa



Re: [PATCH 3/7] x86, LLVM: suppress clang warnings about unaligned accesses

2017-04-13 Thread H. Peter Anvin
On 04/13/17 16:14, Matthias Kaehlcke wrote:
> El Mon, Apr 03, 2017 at 04:01:58PM -0700 Matthias Kaehlcke ha dit:
> 
>> El Fri, Mar 17, 2017 at 04:50:19PM -0700 h...@zytor.com ha dit:
>>
>>> On March 16, 2017 5:15:16 PM PDT, Michael Davidson  wrote:
 Suppress clang warnings about potential unaliged accesses
 to members in packed structs. This gets rid of almost 10,000
 warnings about accesses to the ring 0 stack pointer in the TSS.

 Signed-off-by: Michael Davidson 
 ---
 arch/x86/Makefile | 5 +
 1 file changed, 5 insertions(+)

 diff --git a/arch/x86/Makefile b/arch/x86/Makefile
 index 894a8d18bf97..7f21703c475d 100644
 --- a/arch/x86/Makefile
 +++ b/arch/x86/Makefile
 @@ -128,6 +128,11 @@ endif
 KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
 endif

 +ifeq ($(cc-name),clang)
 +# Suppress clang warnings about potential unaligned accesses.
 +KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
 +endif
 +
 ifdef CONFIG_X86_X32
x32_ld_ok := $(call try-run,\
/bin/echo -e '1: .quad 1b' | \
>>>
>>> Why conditional on clang?
>>
>> My understanding is that this warning is clang specific, it is not
>> listed on https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
> 
> Actually this warning affects other platforms besides x86
> (e.g. arm64), I'll submit a patch that disables the warning on all
> platforms.
> 

Drop the ifeq ($(cc-name),clang).

You should assume that if you have to add one of those ifeq's then you
are doing something fundamentally wrong.

-hpa




Re: [PATCH 2/7] Makefile, x86, LLVM: disable unsupported optimization flags

2017-03-17 Thread H. Peter Anvin
On 03/17/17 14:32, H. Peter Anvin wrote:
> 
> NAK.  Fix your compiler, or use a wrapper script or something.  It is
> absolutely *not* acceptable to disable this since future versions of
> clang *should* support that.
> 
> That being said, it might make sense to look for a key pattern like
> "(un|not )supported" on stderr the try-run macro.  Is there really no
> -Wno- or -Werror= option to turn off this craziness?
> 

Well, guess what... I found it myself.

-W{no-,error=}ignored-optimization-argument

Either variant will make this sane.

-hpa




Re: [PATCH] x86/crypto: fix %progbits -> @progbits

2017-01-25 Thread H. Peter Anvin
On 01/19/17 13:28, Denys Vlasenko wrote:
> %progbits form is used on ARM (where @ is a comment char).
> 
> x86 consistently uses @progbits everywhere else.

However, it looks like %progbits works on all architectures (at least
include/linux/init.h seems to imply so.)  Perhaps a tree-wide
replacement the other way would make more sense.

Personally I would also like to see these parameters macroized, to keep
someone from getting them wrong, just like we have __INIT, __INITRODATA
etc already, we should just have plain __TEXT __DATA __BSS...

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Entropy sources

2016-08-25 Thread H. Peter Anvin
On 08/25/16 16:35, Sandy Harris wrote:
> On Thu, Aug 25, 2016 at 5:30 PM, H. Peter Anvin <h...@zytor.com> wrote:
> 
>> The network stack is a good source of entropy, *once it is online*.
>> However, the most serious case is while the machine is still booting,
>> when the network will not have enabled yet.
>>
>> -hpa
> 
> One possible solution is at:
> https://github.com/sandy-harris/maxwell
> 
> A small (< 700 lines) daemon that gets entropy from timer imprecision
> and variations in time for arithmetic (cache misses, interrupts, etc.)
> and pumps it into /dev/random. Make it the first userspace program
> started and all should be covered. Directory above includes a PDF doc
> with detailed rationale and some discussion of alternate solutions.
> 
> Of course if you are dealing with a system-on-a-chip or low-end
> embedded CPU & the timer is really inadequate, this will not work
> well. Conceivably well enough, but we could not know that without
> detailed analysis for each chip in question.
> 

A lot of this is exactly the same thing /dev/random already does in
kernel space.  I have already in the past expressed skepticism toward
this approach, because a lot of the analysis done has simply been bogus.

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Entropy sources (was: /dev/random - a new approach)

2016-08-25 Thread H. Peter Anvin
On 08/20/16 22:37, Jeffrey Walton wrote:
>>
>> The biggest problem there is that the timer interrupt adds *no* entropy
>> unless there is a source of asynchronicity in the system.  On PCs,
>> traditionally the timer has been run from a completely different crystal
>> (14.31818 MHz) than the CPU, which is the ideal situation, but if they
>> are run off the same crystal and run in lockstep, there is very little
>> if anything there.  On some systems, the timer may even *be* the only
>> source of time, and the entropy truly is zero.
> 
> It seems like a networked computer should have an abundance on entropy
> available from the network stack. Every common case I can come up with
> includes a networked computer. If a handheld is outside of coverage,
> then it probably does not have the randomness demands because it can't
> communicate (i.e., TCP sequence numbers, key agreement, etc).
> 
> In fact, there are at least two papers that use bits from the network stack:
> 

The network stack is a good source of entropy, *once it is online*.
However, the most serious case is while the machine is still booting,
when the network will not have enabled yet.

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 0/5] /dev/random - a new approach

2016-08-19 Thread H. Peter Anvin
On 08/18/16 22:56, Herbert Xu wrote:
> On Thu, Aug 18, 2016 at 10:49:47PM -0400, Theodore Ts'o wrote:
>>
>> That really depends on the system.  We can't assume that people are
>> using systems with a 100Hz clock interrupt.  More often than not
>> people are using tickless kernels these days.  That's actually the
>> problem with changing /dev/urandom to block until things are
>> initialized.
> 
> Couldn't we disable tickless until urandom has been seeded? In fact
> perhaps we should accelerate the timer interrupt rate until it has
> been seeded?
> 

The biggest problem there is that the timer interrupt adds *no* entropy
unless there is a source of asynchronicity in the system.  On PCs,
traditionally the timer has been run from a completely different crystal
(14.31818 MHz) than the CPU, which is the ideal situation, but if they
are run off the same crystal and run in lockstep, there is very little
if anything there.  On some systems, the timer may even *be* the only
source of time, and the entropy truly is zero.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 0/5] /dev/random - a new approach

2016-08-16 Thread H. Peter Anvin
On 08/16/16 15:28, H. Peter Anvin wrote:
> On 08/15/16 22:45, Stephan Mueller wrote:
>> Am Montag, 15. August 2016, 13:42:54 CEST schrieb H. Peter Anvin:
>>
>> Hi H,
>>
>>> On 08/11/16 05:24, Stephan Mueller wrote:
>>>> * prevent fast noise sources from dominating slow noise sources
>>>>
>>>>   in case of /dev/random
>>>
>>> Can someone please explain if and why this is actually desirable, and if
>>> this assessment has been passed to someone who has actual experience
>>> with cryptography at the professional level?
>>
>> There are two motivations for that:
>>
>> - the current /dev/random is compliant to NTG.1 from AIS 20/31 which 
>> requires 
>> (in brief words) that entropy comes from auditible noise sources. Currently 
>> in 
>> my LRNG only RDRAND is a fast noise source which is not auditible (and it is 
>> designed to cause a VM exit making it even harder to assess it). To make the 
>> LRNG to comply with NTG.1, RDRAND can provide entropy but must not become 
>> the 
>> sole entropy provider which is the case now with that change.
>>
>> - the current /dev/random implementation follows the same concept with the 
>> exception of 3.15 and 3.16 where RDRAND was not rate-limited. In later 
>> versions, this was changed.
>>
> 
> I'm not saying it should be *sole*.  I am questioning the value in
> limiting it, as it seems to me that it could only ever produce a worse
> result.
> 

Also, it would be great to actually get a definition for "auditable".  A
quantum white noise source which exceeds the sampling bandwidth is an
ideal RNG; how do you "audit" that?  If what you are doing is looking
for imperfections, those imperfections can be trivially emulated.  If
what you mean is an audit on the chip or circuit level, that would
require some mechanism to know that all items were built identically
without deviation, which may be possible for intelligence agencies or
the military who have full control of their supply chain, but for anyone
else that is most likely an impossible task.  How many people are going
to crack the case and look at even a discrete transistor circuit, and
how many of *those* are going to be able to discern if that circuit is
subject to RF capture, or its output even used?

I have been trying to figure out how to reasonably solve this problem
for a long time now, and it is not just a problem for RDSEED (RDRAND is
a slightly different beast.)  The only reason RDSEED exposes the problem
particularly harshly is because it is extremely high bandwidth compared
to other noise sources and it is architecturally integrated into the
CPU, but the same would apply to an external noise generator connected
via PCIe, for example.

Incidentally, I am hoping -- and this is a personal statement and
nothing official from Intel -- that at some future date RDRAND (not
RDSEED) will be fast enough that it can completely replace even
prandom_u32(), which I really hope can be non-controversial as
prandom_u32() isn't cryptographically strong in the first place.

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 0/5] /dev/random - a new approach

2016-08-16 Thread H. Peter Anvin
On 08/15/16 22:45, Stephan Mueller wrote:
> Am Montag, 15. August 2016, 13:42:54 CEST schrieb H. Peter Anvin:
> 
> Hi H,
> 
>> On 08/11/16 05:24, Stephan Mueller wrote:
>>> * prevent fast noise sources from dominating slow noise sources
>>>
>>>   in case of /dev/random
>>
>> Can someone please explain if and why this is actually desirable, and if
>> this assessment has been passed to someone who has actual experience
>> with cryptography at the professional level?
> 
> There are two motivations for that:
> 
> - the current /dev/random is compliant to NTG.1 from AIS 20/31 which requires 
> (in brief words) that entropy comes from auditible noise sources. Currently 
> in 
> my LRNG only RDRAND is a fast noise source which is not auditible (and it is 
> designed to cause a VM exit making it even harder to assess it). To make the 
> LRNG to comply with NTG.1, RDRAND can provide entropy but must not become the 
> sole entropy provider which is the case now with that change.
> 
> - the current /dev/random implementation follows the same concept with the 
> exception of 3.15 and 3.16 where RDRAND was not rate-limited. In later 
> versions, this was changed.
> 

I'm not saying it should be *sole*.  I am questioning the value in
limiting it, as it seems to me that it could only ever produce a worse
result.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 0/5] /dev/random - a new approach

2016-08-15 Thread H. Peter Anvin
On 08/11/16 05:24, Stephan Mueller wrote:
> * prevent fast noise sources from dominating slow noise sources
>   in case of /dev/random

Can someone please explain if and why this is actually desirable, and if
this assessment has been passed to someone who has actual experience
with cryptography at the professional level?

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] crypto: sha256-mb - cleanup a || vs | typo

2016-06-29 Thread H. Peter Anvin
On 06/29/16 07:42, Dan Carpenter wrote:
> || and | behave basically the same here but || is intended.  It causes a
> static checker warning to mix up bitwise and logical operations.
> 
> Signed-off-by: Dan Carpenter 
> 
> diff --git a/arch/x86/crypto/sha256-mb/sha256_mb.c 
> b/arch/x86/crypto/sha256-mb/sha256_mb.c
> index c9d5dcc..4ec895a 100644
> --- a/arch/x86/crypto/sha256-mb/sha256_mb.c
> +++ b/arch/x86/crypto/sha256-mb/sha256_mb.c
> @@ -299,7 +299,7 @@ static struct sha256_hash_ctx 
> *sha256_ctx_mgr_submit(struct sha256_ctx_mgr *mgr,
>* Or if the user's buffer contains less than a whole block,
>* append as much as possible to the extra block.
>*/
> - if ((ctx->partial_block_buffer_length) | (len < SHA256_BLOCK_SIZE)) {
> + if ((ctx->partial_block_buffer_length) || (len < SHA256_BLOCK_SIZE)) {
>   /* Compute how many bytes to copy from user buffer into
>* extra block
>*/
> 

As far as I know the | was an intentional optimization, so you may way
to look at the generated code.

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/7] random: replace non-blocking pool with a Chacha20-based CRNG

2016-06-20 Thread H. Peter Anvin
On 06/20/16 08:49, Stephan Mueller wrote:
> Am Montag, 20. Juni 2016, 11:01:47 schrieb Theodore Ts'o:
> 
> Hi Theodore,
> 
>>
>> So simply doing chacha20 encryption in a tight loop in the kernel
>> might not be a good proxy for what would actually happen in real life
>> when someone calls getrandom(2).  (Another good question to ask is
>> when someone might be needing to generate millions of 256-bit session
>> keys per second, when the D-H setup, even if you were using ECCDH,
>> would be largely dominating the time for the connection setup anyway.)
> 
> Is speed everything we should care about? What about:
> 
> - offloading of crypto operation from the CPU
> 

This sounds like a speed operation (and very unlikely to be a win given
the usage).

> - potentially additional security features a hardware cipher may provide like 
> cache coloring attack resistance?

How about burning that bridge when and if we get to it?  It sounds very
hypothetical.

I guess I could add in some comments here about how a lot of these
problems can be eliminated by offloading an entire DRNG into hardware,
but I don't think it is productive.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: linux/bitops.h

2016-05-06 Thread H. Peter Anvin
On May 6, 2016 1:07:13 PM PDT, Sasha Levin <sasha.le...@oracle.com> wrote:
>On 05/04/2016 08:30 PM, H. Peter Anvin wrote:
>> On 05/04/16 15:06, John Denker wrote:
>>> On 05/04/2016 02:56 PM, H. Peter Anvin wrote:
>>>>> Beware that shifting by an amount >= the number of bits in the
>>>>> word remains Undefined Behavior.
>>>
>>>> This construct has been supported as a rotate since at least gcc2.
>>>
>>> How then should we understand the story told in commit d7e35dfa?
>>> Is the story wrong?
>>>
>>> At the very least, something inconsistent is going on.  There
>>> are 8 functions.  Why did d7e35dfa change one of them but
>>> not the other 7?
>> 
>> Yes. d7e35dfa is baloney IMNSHO.  All it does is produce worse code,
>and the description even says so.
>
>No, the description says that it produces worse code for *really
>really* ancient
>GCC versions.
>
>> As I said, gcc has treated the former code as idiomatic since gcc 2,
>so that support is beyond ancient.
>
>Because something works in a specific way on one compiler isn't a
>reason to
>ignore this noncompliance with the standard.
>
>
>Thanks,
>Sasha

When the compiler in question is our flagship target and our reference 
compiler, then yes, it matters.
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: linux/bitops.h

2016-05-06 Thread H. Peter Anvin
On May 6, 2016 1:07:13 PM PDT, Sasha Levin <sasha.le...@oracle.com> wrote:
>On 05/04/2016 08:30 PM, H. Peter Anvin wrote:
>> On 05/04/16 15:06, John Denker wrote:
>>> On 05/04/2016 02:56 PM, H. Peter Anvin wrote:
>>>>> Beware that shifting by an amount >= the number of bits in the
>>>>> word remains Undefined Behavior.
>>>
>>>> This construct has been supported as a rotate since at least gcc2.
>>>
>>> How then should we understand the story told in commit d7e35dfa?
>>> Is the story wrong?
>>>
>>> At the very least, something inconsistent is going on.  There
>>> are 8 functions.  Why did d7e35dfa change one of them but
>>> not the other 7?
>> 
>> Yes. d7e35dfa is baloney IMNSHO.  All it does is produce worse code,
>and the description even says so.
>
>No, the description says that it produces worse code for *really
>really* ancient
>GCC versions.
>
>> As I said, gcc has treated the former code as idiomatic since gcc 2,
>so that support is beyond ancient.
>
>Because something works in a specific way on one compiler isn't a
>reason to
>ignore this noncompliance with the standard.
>
>
>Thanks,
>Sasha

4.6.2 is not "really, really ancient."
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: better patch for linux/bitops.h

2016-05-05 Thread H. Peter Anvin
On 05/05/2016 03:18 PM, ty...@mit.edu wrote:
> 
> So this is why I tend to take a much more pragmatic viewpoint on
> things.  Sure, it makes sense to pay attention to what the C standard
> writers are trying to do to us; but if we need to suppress certain
> optimizations to write sane kernel code --- I'm ok with that.  And
> this is why using a trust-but-verify on a specific set of compilers
> and ranges of compiler versions is a really good idea
> 

For the record, the "portable" construct has apparently only been
supported since gcc 4.6.3.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: better patch for linux/bitops.h

2016-05-05 Thread H. Peter Anvin
On May 5, 2016 3:18:09 PM PDT, ty...@mit.edu wrote:
>On Thu, May 05, 2016 at 05:34:50PM -0400, Sandy Harris wrote:
>> 
>> I completely fail to see why tests or compiler versions should be
>> part of the discussion. The C standard says the behaviour in
>> certain cases is undefined, so a standard-compliant compiler
>> can generate more-or-less any code there.
>> 
>
>> As long as any of portability, reliability or security are among our
>> goals, any code that can give undefined behaviour should be
>> considered problematic.
>
>Because compilers have been known not necessarily to obey the specs,
>and/or interpret the specs in way that not everyone agrees with.  It's
>also the case that we are *already* disabling certain C optimizations
>which are technically allowed by the spec, but which kernel
>programmers consider insane (e.g., strict aliasing).
>
>And of course, memzero_explicit() which crypto people understand is
>necessary, is something that technically compilers are allowed to
>optimize according to the spec.  So trying to write secure kernel code
>which will work on arbitrary compilers may well be impossible.
>
>And which is also why people have said (mostly in jest), "A
>sufficiently advanced compiler is indistinguishable from an
>adversary."  (I assume people will agree that optimizing away a memset
>needed to clear secrets from memory would be considered adversarial,
>at the very least!)
>
>So this is why I tend to take a much more pragmatic viewpoint on
>things.  Sure, it makes sense to pay attention to what the C standard
>writers are trying to do to us; but if we need to suppress certain
>optimizations to write sane kernel code --- I'm ok with that.  And
>this is why using a trust-but-verify on a specific set of compilers
>and ranges of compiler versions is a really good idea
>
>- Ted

I have filed a gcc bug to have the preexisting rotate idiom officially 
documented as a GNU C extension.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70967
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: better patch for linux/bitops.h

2016-05-05 Thread H. Peter Anvin
On 05/05/2016 03:18 PM, ty...@mit.edu wrote:
> On Thu, May 05, 2016 at 05:34:50PM -0400, Sandy Harris wrote:
>>
>> I completely fail to see why tests or compiler versions should be
>> part of the discussion. The C standard says the behaviour in
>> certain cases is undefined, so a standard-compliant compiler
>> can generate more-or-less any code there.
>>
> 
>> As long as any of portability, reliability or security are among our
>> goals, any code that can give undefined behaviour should be
>> considered problematic.
> 
> Because compilers have been known not necessarily to obey the specs,
> and/or interpret the specs in way that not everyone agrees with.  It's
> also the case that we are *already* disabling certain C optimizations
> which are technically allowed by the spec, but which kernel
> programmers consider insane (e.g., strict aliasing).
> 
> And of course, memzero_explicit() which crypto people understand is
> necessary, is something that technically compilers are allowed to
> optimize according to the spec.  So trying to write secure kernel code
> which will work on arbitrary compilers may well be impossible.
> 
> And which is also why people have said (mostly in jest), "A
> sufficiently advanced compiler is indistinguishable from an
> adversary."  (I assume people will agree that optimizing away a memset
> needed to clear secrets from memory would be considered adversarial,
> at the very least!)
> 
> So this is why I tend to take a much more pragmatic viewpoint on
> things.  Sure, it makes sense to pay attention to what the C standard
> writers are trying to do to us; but if we need to suppress certain
> optimizations to write sane kernel code --- I'm ok with that.  And
> this is why using a trust-but-verify on a specific set of compilers
> and ranges of compiler versions is a really good idea
> 

In theory, theory and practice should agree, but in practice, practice
is what counts.  I fully agree we should get rid of UD behavior where
doing so is practical, but not at the cost of breaking real-life
compilers, expecially not gcc, and to a lesser but still very real
extent icc and clang.

I would also agree that we should push the gcc developers to add to the
manual C-standard-UD behavior which are well-defined under the
gnu89/gnu99/gnu11 C dialects.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: better patch for linux/bitops.h

2016-05-05 Thread H. Peter Anvin

On 05/04/16 21:03, Jeffrey Walton wrote:

On Wed, May 4, 2016 at 11:50 PM, Theodore Ts'o  wrote:

...
But instead of arguing over what works and doesn't, let's just create
the the test set and just try it on a wide range of compilers and
architectures, hmmm?


What are the requirements? Here's a short list:

   * No undefined behavior
 - important because the compiler writers use the C standard
   * Compiles to native "rotate IMMEDIATE" if the rotate amount is a
"constant expression" and the machine provides it
 - translates to a native rotate instruction if available
 - "rotate IMM" can be 3 times faster than "rotate REG"
 - do any architectures *not* provide a rotate?
   * Compiles to native "rotate REGISTER" if the rotate is variable and
the machine provides it
 - do any architectures *not* provide a rotate?
   * Constant time
 - important to high-integrity code
 - Non-security code paths probably don't care

Maybe the first thing to do is provide a different rotates for the
constant-time requirement when its in effect?



The disagreement here is the priority between these points.  In my very 
strong opinion, "no undefined behavior" per the C standard is way less 
important than the others; what matters is what gcc and the other 
compilers we care about do.  The kernel relies on various versions of 
C-standard-undefined behavior *all over the place*; for one thing 
sizeof(void *) == sizeof(size_t) == sizeof(unsigned long)!! but they are 
well-defined in the subcontext we care about.


(And no, not all architectures provide a rotate instruction.)

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: better patch for linux/bitops.h

2016-05-04 Thread H. Peter Anvin
On May 4, 2016 7:54:12 PM PDT, Jeffrey Walton <noloa...@gmail.com> wrote:
>On Wed, May 4, 2016 at 10:41 PM, H. Peter Anvin <h...@zytor.com> wrote:
>> On May 4, 2016 6:35:44 PM PDT, Jeffrey Walton <noloa...@gmail.com>
>wrote:
>>>On Wed, May 4, 2016 at 5:52 PM, John Denker <j...@av8n.com> wrote:
>>>> On 05/04/2016 02:42 PM, I wrote:
>>>>
>>>>> I find it very odd that the other seven functions were not
>>>>> upgraded. I suggest the attached fix-others.diff would make
>>>>> things more consistent.
>>>>
>>>> Here's a replacement patch.
>>>> ...
>>>
>>>+1, commit it.
>>>
>>>Its good for three additional reasons. First, this change means the
>>>kernel is teaching the next generation the correct way to do things.
>>>Many developers aspire to be kernel hackers, and they sometimes use
>>>the code from bitops.h. I've actually run across the code in
>>>production during an audit where the developers cited bitops.h.
>>>
>>>Second, it preserves a "silent and dark" cockpit during analysis.
>That
>>>is, when analysis is run, no findings are generated. Auditors and
>>>security folks like quiet tools, much like the way pilots like their
>>>cockpits (flashing lights and sounding buzzers usually means
>something
>>>is wrong).
>>>
>>>Third, the pattern is recognized by the major compilers, so the
>kernel
>>>should not have any trouble when porting any of the compilers. I
>often
>>>use multiple compiler to tease out implementation defined behavior in
>>>a effort to achieve greater portability. Here are the citations to
>>>ensure the kernel is safe with the pattern:
>>>
>>>  * GCC: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157
>>>  * ICC: http://software.intel.com/en-us/forums/topic/580884
>>>
>>>However, Clang may cause trouble because they don't want the
>>>responsibility of recognizing the pattern:
>>>
>>> * https://llvm.org/bugs/show_bug.cgi?id=24226#c8
>>>
>>>Instead, they provide a defective rotate. The "defect" here is its
>>>non-constant time due to the branch, so it may not be suitable for
>>>high-integrity or high-assurance code like linux-crypto:
>>>
>>>  * https://llvm.org/bugs/show_bug.cgi?id=24226#c5
>>>
>>>Jeff
>>
>> So you are actually saying outright that we should sacrifice *actual*
>portability in favor of *theoretical* portability?  What kind of
>twilight zone did we just step into?!
>
>I'm not sure what you mean. It will be well defined on all platforms.
>Clang may not recognize the pattern, which means they could run
>slower. GCC and ICC will be fine.
>
>Slower but correct code is what you have to live with until the Clang
>dev's fix their compiler.
>
>Its kind of like what Dr. Jon Bentley said: "If it doesn't have to be
>correct, I can make it as fast as you'd like it to be".
>
>Jeff

The current code works on all compilers we care about.  The code you propose 
does not; it doesn't work on anything but very recent versions of our flagship 
target compiler, and pretty your own admission might even cause security 
hazards in the kernel if compiled on clang.

That qualifies as insane in my book.  The church code is de facto portable with 
the intended outcome, the one you propose is not.
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: better patch for linux/bitops.h

2016-05-04 Thread H. Peter Anvin
On May 4, 2016 6:35:44 PM PDT, Jeffrey Walton  wrote:
>On Wed, May 4, 2016 at 5:52 PM, John Denker  wrote:
>> On 05/04/2016 02:42 PM, I wrote:
>>
>>> I find it very odd that the other seven functions were not
>>> upgraded. I suggest the attached fix-others.diff would make
>>> things more consistent.
>>
>> Here's a replacement patch.
>> ...
>
>+1, commit it.
>
>Its good for three additional reasons. First, this change means the
>kernel is teaching the next generation the correct way to do things.
>Many developers aspire to be kernel hackers, and they sometimes use
>the code from bitops.h. I've actually run across the code in
>production during an audit where the developers cited bitops.h.
>
>Second, it preserves a "silent and dark" cockpit during analysis. That
>is, when analysis is run, no findings are generated. Auditors and
>security folks like quiet tools, much like the way pilots like their
>cockpits (flashing lights and sounding buzzers usually means something
>is wrong).
>
>Third, the pattern is recognized by the major compilers, so the kernel
>should not have any trouble when porting any of the compilers. I often
>use multiple compiler to tease out implementation defined behavior in
>a effort to achieve greater portability. Here are the citations to
>ensure the kernel is safe with the pattern:
>
>  * GCC: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157
>  * ICC: http://software.intel.com/en-us/forums/topic/580884
>
>However, Clang may cause trouble because they don't want the
>responsibility of recognizing the pattern:
>
> * https://llvm.org/bugs/show_bug.cgi?id=24226#c8
>
>Instead, they provide a defective rotate. The "defect" here is its
>non-constant time due to the branch, so it may not be suitable for
>high-integrity or high-assurance code like linux-crypto:
>
>  * https://llvm.org/bugs/show_bug.cgi?id=24226#c5
>
>Jeff

So you are actually saying outright that we should sacrifice *actual* 
portability in favor of *theoretical* portability?  What kind of twilight zone 
did we just step into?!
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: linux/bitops.h

2016-05-04 Thread H. Peter Anvin
On May 4, 2016 6:20:32 PM PDT, Jeffrey Walton <noloa...@gmail.com> wrote:
>On Wed, May 4, 2016 at 7:06 PM, Andi Kleen <a...@firstfloor.org> wrote:
>> On Wed, May 04, 2016 at 03:06:04PM -0700, John Denker wrote:
>>> On 05/04/2016 02:56 PM, H. Peter Anvin wrote:
>>> >> Beware that shifting by an amount >= the number of bits in the
>>> >> word remains Undefined Behavior.
>>>
>>> > This construct has been supported as a rotate since at least gcc2.
>>>
>>> How then should we understand the story told in commit d7e35dfa?
>>> Is the story wrong?
>>
>> I don't think Linux runs on a system where it would make a difference
>> (like a VAX), and also gcc always converts it before it could.
>> Even UBSan should not complain because it runs after the conversion
>> to ROTATE.
>>
>From what I understand, its a limitation in the barrel shifter and the
>way the shift bits are handled.
>
>Linux runs on a great number of devices, so its conceivable (likely?)
>a low-cost board would have hardware limitations that not found in
>modern desktops and servers or VAX.
>
>Jeff

This is a compiler feature, though.  And if icc or clang don't support the 
idiom they would never have compiled a working kernel.
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: linux/bitops.h

2016-05-04 Thread H. Peter Anvin

On 05/04/16 15:06, John Denker wrote:

On 05/04/2016 02:56 PM, H. Peter Anvin wrote:

Beware that shifting by an amount >= the number of bits in the
word remains Undefined Behavior.



This construct has been supported as a rotate since at least gcc2.


How then should we understand the story told in commit d7e35dfa?
Is the story wrong?

At the very least, something inconsistent is going on.  There
are 8 functions.  Why did d7e35dfa change one of them but
not the other 7?


Yes. d7e35dfa is baloney IMNSHO.  All it does is produce worse code, and 
the description even says so.


As I said, gcc has treated the former code as idiomatic since gcc 2, so 
that support is beyond ancient.


-hpa




--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] random: replace non-blocking pool with a Chacha20-based CRNG

2016-05-04 Thread H. Peter Anvin
On May 4, 2016 2:42:53 PM PDT, John Denker <j...@av8n.com> wrote:
>On 05/04/2016 12:07 PM, ty...@thunk.org wrote:
>
>> it doesn't hit the
>> UB case which Jeffrey was concerned about. 
>
>That should be good enough for present purposes
>
>However, in the interests of long-term maintainability, I
>would suggest sticking in a comment or assertion saying 
>that ror32(,shift) is never called with shift=0.  This
>can be removed if/when bitops.h is upgraded.
>
>There is a track record of compilers doing Bad Things in
>response to UB code, including some very counterintuitive
>Bad Things.
>
>On Wed, May 04, 2016 at 11:29:57AM -0700, H. Peter Anvin wrote:
>>>
>>> If bitops.h doesn't do the right thing, we need to
>>> fix bitops.h.
>
>Most of the ror and rol functions in linux/bitops.h
>should be considered unsafe, as currently implemented.
>http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/bitops.h?id=04974df8049fc4240d22759a91e035082ccd18b4#n103
>
>I don't see anything in the file that suggests any limits
>on the range of the second argument.  So at best it is an
>undocumented trap for the unwary.  This has demonstrably
>been a problem in the past.  The explanation in the attached
>fix-rol32.diff makes amusing reading.
>
>Of the eight functions
>  ror64, rol64, ror32, rol32, ror16, rol16, ror8, and rol8,
>only one of them can handle shifting by zero, namely rol32.
>It was upgraded on Thu Dec 3 22:04:01 2015; see the attached
>fix-rol32.diff.
>
>I find it very odd that the other seven functions were not
>upgraded.  I suggest the attached fix-others.diff would make
>things more consistent.
>
>Beware that shifting by an amount >= the number of bits in the
>word remains Undefined Behavior.  This should be either documented
>or fixed.  It could be fixed easily enough.

This construct has been supported as a rotate since at least gcc2.
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] random: replace non-blocking pool with a Chacha20-based CRNG

2016-05-04 Thread H. Peter Anvin
On 05/04/2016 12:07 PM, ty...@thunk.org wrote:
> 
> If we are all agreed that what is in bitops.h is considered valid,
> then we can start converting people over to using the version defined
> in bitops.h, and if there is some compiler issue we need to work
> around, at least we only need to put the workaround in a single header
> file.
> 

Yes, please.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] random: replace non-blocking pool with a Chacha20-based CRNG

2016-05-04 Thread H. Peter Anvin
On May 4, 2016 11:22:25 AM PDT, Jeffrey Walton  wrote:
>On Wed, May 4, 2016 at 1:49 PM,   wrote:
>> On Wed, May 04, 2016 at 10:40:20AM -0400, Jeffrey Walton wrote:
>>> > +static inline u32 rotl32(u32 v, u8 n)
>>> > +{
>>> > +   return (v << n) | (v >> (sizeof(v) * 8 - n));
>>> > +}
>>>
>>> That's undefined behavior when n=0.
>>
>> Sure, but it's never called with n = 0; I've double checked and the
>> compiler seems to do the right thing with the above pattern as well.
>
>> Hmm, it looks like there is a "standard" version rotate left and
>right
>> defined in include/linux/bitops.h.  So I suspect it would make sense
>> to use rol32 as defined in bitops.h --- and this is probably
>something
>
>bitops.h could work in this case, but its not an ideal solution. GCC
>does not optimize the code below as expected under all use cases
>because GCC does not recognize it as a rotate (see
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157):
>
>return (v << n) | (v >> (sizeof(v) * 8 - n));
>
>And outside of special cases like Salsa, ChaCha and BLAKE2, the code
>provided in bitops.h suffers UB on arbitrary data. So I think care
>needs to be taken when selecting functions from bitops.h.
>
>> that we should do for the rest of crypto/*.c, where people seem to be
>> defininig their own version of something like rotl32 (I copied the
>> contents of crypto/chacha20_generic.c to lib/chacha20, so this
>pattern
>> of defining one's own version of rol32 isn't new).
>
>Yeah, I kind of thought there was some duplication going on.
>
>But I think bitops.h should be fixed. Many folks don't realize the
>lurking UB, and many folks don't realize its not always optimized
>well.
>
>>> I think the portable way to do a rotate that avoids UB is the
>>> following. GCC, Clang and ICC recognize the pattern, and emit a
>rotate
>>> instruction.
>>>
>>> static const unsigned int MASK=31;
>>> return (v<>(-n));
>>>
>>> You should also avoid the following because its not constant time
>due
>>> to the branch:
>>>
>>> return n == 0 ? v : (v << n) | (v >> (sizeof(v) * 8 - n));
>>>
>>
>> Where is this coming from?  I don't see this construct in the patch.
>
>My bad... It was a general observation. I've seen folks try to correct
>the UB by turning to something like that.
>
>Jeff

We don't care about UB, we care about gcc, and to a lesser extent LLVM and ICC. 
 If bitops.h doesn't do the right thing, we need to fix bitops.h.
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: General flags to turn things off (getrandom, pid lookup, etc)

2014-07-25 Thread H. Peter Anvin
On 07/25/2014 11:30 AM, Andy Lutomirski wrote:
 - 32-bit GDT code segments [huge attack surface]
 - 64-bit GDT code segments [probably pointless]

I presume you mean s/GDT/LDT/.

We already don't allow 64-bit LDT code segments.  Also, it is unclear to
me how 32-bit LDT segments have a huge attack surface, given that there
will realistically always be a 32-bit *GDT* segment present.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -v5] random: introduce getrandom(2) system call

2014-07-24 Thread H. Peter Anvin
On 07/24/2014 01:54 PM, Andy Lutomirski wrote:
 
 Or that someone writes userspace code that gets -EPERM/-EACCESS on
 getrandom with GRND_RANDOM and falls back to something worse than
 getrandom w/o GRND_RANDOM.
 

-ENXIO?

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] random: limit the contribution of the hw rng to at most half

2014-07-17 Thread H. Peter Anvin
On 07/17/2014 03:03 AM, Theodore Ts'o wrote:
 For people who don't trust a hardware RNG which can not be audited,
 the changes to add support for RDSEED can be troubling since 97% or
 more of the entropy will be contributed from the in-CPU hardware RNG.
 
 We now have a in-kernel khwrngd, so for those people who do want to
 implicitly trust the CPU-based system, we could create an arch-rng
 hw_random driver, and allow khwrng refill the entropy pool.  This
 allows system administrator whether or not they trust the CPU (I
 assume the NSA will trust RDRAND/RDSEED implicitly :-), and if so,
 what level of entropy derating they want to use.
 
 The reason why this is a really good idea is that if different people
 use different levels of entropy derating, it will make it much more
 difficult to design a backdoor'ed hwrng that can be generally
 exploited in terms of the output of /dev/random when different attack
 targets are using differing levels of entropy derating.
 
 Signed-off-by: Theodore Ts'o ty...@mit.edu

I saw exactly one complaint to that nature, but that was from someone
who really wanted the nordrand option (at which point I observed that
it had inadvertently left RDSEED enabled which quickly got rectified.)
The implication was that this was a request from a specific customer who
presumably have their own audited hardware RNG.

There may have been other complaints (justified or not) but if so I
haven't seen them.  I'm wondering if we are overgeneralizing here and if
so if it wouldn't be better to defer this until the hwrng supplier for
this is ready, which probably won't happen in time for 3.17 just given
the current timeline.

-hpa

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: arch_random_refill

2014-05-11 Thread H. Peter Anvin
On 05/11/2014 04:01 PM, Stephan Mueller wrote:
 Hi Peter,
 
 some time back when the RDRAND instruction was debated, a patch was offered 
 for driver/char/random.c that in essence turned /dev/random into a frontend 
 for RDRAND in case that instruction was available. The patch kind of 
 monopolized the noise sources such that if a user space random number hog 
 pulled data from /dev/random endlessly, the (almost) only noise source was 
 RDRAND. As that patch treated RDRAND to provide entropy, the blocking 
 behavior 
 went away for /dev/random.
 

This is false in a number of ways.

First of all... we NEVER pulled either /dev/random or /dev/urandom
directly from RDRAND.  We used RDRAND directly for kernel internal
randomness uses.  Users did object to this.

 That patch did not sit well with some developers and it got finally changed 
 such that the output of RDRAND is now just XORed with the output of the 
 classic /dev/random behavior -- /dev/random is still blocking. 

Mixing in RDRAND into /dev/random and /dev/urandom is actually

 With the current development cycle for 3.15, the function arch_random_refill 
 is added as presented in [1]. It now uses RDSEED instead of RDRAND. Yet, the 
 way this function is called in random_read seems (as I have no system with an 
 RDSEED, I cannot test) to show the very same behavior as the aforementioned 
 RDRAND patch: the blocking behavior of /dev/random will be gone and RDSEED 
 will monopolize the noise sources in case of a user space hog.

There is a huge difference between this and what people objected to
earlier: we filter everything through the kernel random number pool
system, which would require a herculean mathematical effort to reverse
even if the output of RDSEED was 100% predictable.

 Note, I do not see an issue with the patch that adds RDSEED as part of 
 add_interrupt_randomness outlined in [2]. The reason is that this patch does 
 not monopolizes the noise sources.
 
 I do not want to imply that Intel (or any other chip manufacturer that will 
 hook into arch_random_refill) intentionally provides bad entropy (and this 
 email shall not start a discussion about entropy again), but I would like to 
 be able to only use noise sources that I can fully audit. As it is with 
 hardware, I am not able to see what it is doing.

I have to point out the irony in this given your previous proposals,
however...

 Thus, may I ask that arch_random_refill is revised such that it will not 
 monopolize the noise sources? If somebody wants that, he can easily use rngd.

Feel free to build the kernel without CONFIG_ARCH_RANDOM, or use the
nordrand option to the kernel.  These options are there for a reason.

Now when you mention it, though, the nordrand option should turn off
RDSEED as well as RDRAND.  It currently doesn't; that is a bug, plain
and simple.

-hpa

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: arch_random_refill

2014-05-11 Thread H. Peter Anvin
On 05/11/2014 08:36 PM, Stephan Mueller wrote:
 
 Ohh, ok, thanks for fixing that. :-) 
 
 Though what makes me wonder is the following: why are some RNGs forced to use 
 the hw_random framework whereas some others are not? What is the driver for 
 that?
 
 The current state of random.c vs. drivers/char/hw_random and the strong in-
 kernel separation between both makes me wonder. Isn't that all kind of 
 inconsistent?
 

The main differences are speed of access, trivial interface, and
architectural guarantees.  You also don't have to deal with enumeration,
DMA engines, interrupts, indirect access, or bus drivers, which all are
utterly unacceptable on a synchronous path.

That being said, it is getting clear that we most likely would be better
off with the kernel directly feeding from at least a subset of the
hw_random drivers, rather than waiting for user space to come along and
launch a daemon... after $DEITY knows how many other processes have
already been launched.  There are patches being worked on to make that
happen, although there are a fair number of potential issues, including
the fact that some of the hw_random drivers are believed to be dodgy --
for example, the TPM driver: some TPMs are believed to not contain any
entropy element and simply rely on a factory-seeded nonvolatile counter
(since the TPM has to have support for nonvolatile counters anyway, this
hardware is already present.)

-hpa



--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: arch_random_refill

2014-05-11 Thread H. Peter Anvin
On 05/11/2014 08:36 PM, Stephan Mueller wrote:
 
 But in our current predicament, not everybody trusts a few potentially easily 
 manipulated gates that have no other purpose than produce white noise which 
 are developed by the biggest chip vendor in the US. Gates which have other 
 purposes may not be that easily manipulated.


Incidentally, I disagree with the easily manipulated bit.  Yes, I have
seen the paper which says that you can do it in such a way that it
doesn't show up on *visual* examination.  However, put an electrical
probe on it and it shows up immediately.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: sha{256,512}_ssse3 - remove asmlinkage from static functions

2014-04-17 Thread H. Peter Anvin
On 04/17/2014 08:28 AM, Marek Vasut wrote:
 On Wednesday, April 16, 2014 at 06:19:50 PM, Jianyu Zhan wrote:
 Commit 128ea04a9885(lto: Make asmlinkage __visible) restricts
 asmlinkage to externally_visible, this causes compilation warnings:

 arch/x86/crypto/sha256_ssse3_glue.c:56:1:
 warning: ‘externally_visible’ attribute have effect only on public
 objects [-Wattributes]

 static asmlinkage void (*sha256_transform_asm)(const char *, u32 *,
 u64); ^

 arch/x86/crypto/sha512_ssse3_glue.c:55:1:
 warning: ‘externally_visible’ attribute have effect only on public
 objects [-Wattributes] static asmlinkage void
 (*sha512_transform_asm)(const char *, u64 *, ^

 Drop asmlinkage here to avoid such warnings.

 Also see Commit 8783dd3a37a5853689e1(irqchip: Remove asmlinkage from
 static functions)

 Signed-off-by: Jianyu Zhan nasa4...@gmail.com
 
 Makes sense, please add my humble
 
 Reviewed-by: Marek Vasut ma...@denx.de
 

It doesn't make sense, sorry.  The right thing to drop here is not
asmlinkage, it is static: this is an external declaration.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: sha{256,512}_ssse3 - remove asmlinkage from static functions

2014-04-17 Thread H. Peter Anvin
On 04/17/2014 09:58 PM, Herbert Xu wrote:

 It doesn't make sense, sorry.  The right thing to drop here is not
 asmlinkage, it is static: this is an external declaration.
 
 It's a function pointer that's static, not the function that
 it's pointing to.
 

{facepalm} Right, function *pointer*.

Duh.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] x86/crypto: ghash: use C implementation for setkey()

2014-03-27 Thread H. Peter Anvin
On 03/27/2014 04:46 AM, Ard Biesheuvel wrote:
 On 27 March 2014 12:36, Herbert Xu herb...@gondor.apana.org.au wrote:
 On Thu, Mar 27, 2014 at 12:29:00PM +0100, Ard Biesheuvel wrote:
 The GHASH setkey() function uses SSE registers but fails to call
 kernel_fpu_begin()/kernel_fpu_end(). Instead of adding these calls, and
 then having to deal with the restriction that they cannot be called from
 interrupt context, move the setkey() implementation to the C domain.

 Note that setkey cannot be called from interrupt context since
 allocation/setkey is supposed to be slow-path material.

 But your approach is fine by me.
 
 I agree that it makes little sense to call this from atomic context,
 but that still means (I think, but the x86 guys should confirm) that
 you are supposed to call kernel_fpu_begin() and kernel_fpu_end().
 

Yes.  I'm suspecting calling kernel_fpu_begin() for a single GF
operation is probably not worth it, so I'm fine with reimplementing it
in integer logic.

Acked-by: H. Peter Anvin h...@linux.intel.com

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

2014-03-24 Thread H. Peter Anvin
On 03/24/2014 09:10 AM, Mathias Krause wrote:
 The recent addition of the AVX2 variant of the SHA1 hash function wrongly
 disabled the AVX variant by introducing a flaw in the feature test. Fixed
 in patch 1.
 
 The alignment calculations of the AVX2 assembler implementation are
 questionable, too. Especially the page alignment of the stack pointer is
 broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
 code alignment is fixed.
 
 Please apply!
 

Yikes.  Yes... the alignment calculation is confused in the extreme.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes

2014-03-24 Thread H. Peter Anvin
On 03/24/2014 09:10 AM, Mathias Krause wrote:
 The recent addition of the AVX2 variant of the SHA1 hash function wrongly
 disabled the AVX variant by introducing a flaw in the feature test. Fixed
 in patch 1.
 
 The alignment calculations of the AVX2 assembler implementation are
 questionable, too. Especially the page alignment of the stack pointer is
 broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
 code alignment is fixed.
 
 Please apply!
 
 Mathias Krause (3):
   crypto: x86/sha1 - re-enable the AVX variant
   crypto: x86/sha1 - fix stack alignment of AVX2 variant
   crypto: x86/sha1 - reduce size of the AVX2 asm implementation
 
  arch/x86/crypto/sha1_avx2_x86_64_asm.S |8 ++--
  arch/x86/crypto/sha1_ssse3_glue.c  |   26 --
  2 files changed, 18 insertions(+), 16 deletions(-)
 

For all these patches:

Reviewed-by: H. Peter Anvin h...@linux.intel.com

Thank you for doing these.

-hpa

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 1/1] crypto: SHA1 transform x86_64 AVX2

2014-03-20 Thread H. Peter Anvin
Again... Herbert, David... yours or mine?

-hpa
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 1/1] crypto: SHA1 transform x86_64 AVX2

2014-03-20 Thread H. Peter Anvin
On 03/20/2014 05:09 PM, Herbert Xu wrote:
 On Thu, Mar 20, 2014 at 03:23:27PM -0700, H. Peter Anvin wrote:
 Again... Herbert, David... yours or mine?
 
 Yes I will be taking this through cryptodev.
 

Thanks!

Acked-by: H. Peter Anvin h...@linux.intel.com

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/1] crypto: SHA1 transform x86_64 AVX2

2014-03-19 Thread H. Peter Anvin
Herbert, do you want to take this or should I?

In the former case, please feel free to add my:

Acked-by: H. Peter Anvin h...@zytor.com

-hpa


On 03/19/2014 02:21 PM, chandramouli narayanan wrote:
 From 8948c828aa929a3e2531ca88e3f05fbeb1ff53db Mon Sep 17 00:00:00 2001
 From: Chandramouli Narayanan mo...@linux.intel.com
 Date: Wed, 19 Mar 2014 09:30:12 -0700
 Subject: [PATCH v4 1/1] crypto: SHA1 transform x86_64 AVX2
 
 This git patch adds x86_64 AVX2 optimization of SHA1
 transform to crypto support. The patch has been tested with 3.14.0-rc1
 kernel.
 
 On a Haswell desktop, with turbo disabled and all cpus running
 at maximum frequency, tcrypt shows AVX2 performance improvement
 from 3% for 256 bytes update to 16% for 1024 bytes update over
 AVX implementation.
 
 This patch adds sha1_avx2_transform(), the glue, build and
 configuration changes needed for AVX2 optimization of
 SHA1 transform to crypto support.
 
 sha1-ssse3 is one module which adds the necessary optimization
 support (SSSE3/AVX/AVX2) for the low-level SHA1 transform function. With 
 better
 optimization support, transform function is overridden as the case may be.
 In the case of AVX2, due to performance reasons across datablock sizes,
 the AVX or AVX2 transform function is used at run-time as it suits best.
 The Makefile change therefore appends the necessary objects to the linkage.
 Due to this, the patch merely appends AVX2 transform to the existing build mix
 and Kconfig support and leaves the configuration build support as is.
 
 Changes noted from the initial version of this patch are based on the
 feedback from the community:
 a) check for BMI2 in addition to AVX2 support since
 __sha1_transform_avx2() uses rorx
 b) Since the module build has dependency on 64bit, it is
 redundant to check it in the code here.
 c) coding style cleanup
 d) simplification of the assembly code where macros are repetitively used.
 
 Signed-off-by: Chandramouli Narayanan mo...@linux.intel.com
 ---
  arch/x86/crypto/Makefile   |   3 +
  arch/x86/crypto/sha1_avx2_x86_64_asm.S | 706 
 +
  arch/x86/crypto/sha1_ssse3_glue.c  |  50 ++-
  crypto/Kconfig |   4 +-
  4 files changed, 754 insertions(+), 9 deletions(-)
  create mode 100644 arch/x86/crypto/sha1_avx2_x86_64_asm.S
 
 diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
 index 6ba54d6..61d6e28 100644
 --- a/arch/x86/crypto/Makefile
 +++ b/arch/x86/crypto/Makefile
 @@ -79,6 +79,9 @@ aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o fpu.o
  aesni-intel-$(CONFIG_64BIT) += aesni-intel_avx-x86_64.o
  ghash-clmulni-intel-y := ghash-clmulni-intel_asm.o ghash-clmulni-intel_glue.o
  sha1-ssse3-y := sha1_ssse3_asm.o sha1_ssse3_glue.o
 +ifeq ($(avx2_supported),yes)
 +sha1-ssse3-y += sha1_avx2_x86_64_asm.o
 +endif
  crc32c-intel-y := crc32c-intel_glue.o
  crc32c-intel-$(CONFIG_64BIT) += crc32c-pcl-intel-asm_64.o
  crc32-pclmul-y := crc32-pclmul_asm.o crc32-pclmul_glue.o
 diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S 
 b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
 new file mode 100644
 index 000..559eb6c
 --- /dev/null
 +++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
 @@ -0,0 +1,706 @@
 +/*
 + *   Implement fast SHA-1 with AVX2 instructions. (x86_64)
 + *
 + * This file is provided under a dual BSD/GPLv2 license.  When using or
 + * redistributing this file, you may do so under either license.
 + *
 + * GPL LICENSE SUMMARY
 + *
 + * Copyright(c) 2014 Intel Corporation.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of version 2 of the GNU General Public License as
 + * published by the Free Software Foundation.
 + *
 + * This program is distributed in the hope that it will be useful, but
 + * WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 + * General Public License for more details.
 + *
 + * Contact Information:
 + * Ilya Albrekht ilya.albre...@intel.com
 + * Maxim Locktyukhin maxim.locktyuk...@intel.com
 + * Ronen Zohar ronen.zo...@intel.com
 + * Chandramouli Narayanan mo...@linux.intel.com
 + *
 + * BSD LICENSE
 + *
 + * Copyright(c) 2014 Intel Corporation.
 + *
 + * Redistribution and use in source and binary forms, with or without
 + * modification, are permitted provided that the following conditions
 + * are met:
 + *
 + * Redistributions of source code must retain the above copyright
 + * notice, this list of conditions and the following disclaimer.
 + * Redistributions in binary form must reproduce the above copyright
 + * notice, this list of conditions and the following disclaimer in
 + * the documentation and/or other materials provided with the
 + * distribution.
 + * Neither the name of Intel Corporation nor the names of its
 + * contributors may be used to endorse or promote products derived
 + * from this software without specific prior written permission

Re: [PATCH 2/2] SHA1 transform: x86_64 AVX2 optimization - glue build-v2

2014-03-17 Thread H. Peter Anvin
On 03/17/2014 09:53 AM, chandramouli narayanan wrote:
 On second thoughts, with sha1-sse3-(CONFIG_AS_AVX2) +=
 sha1_avx2_x86_64_asm.o, I have build issues and sha1_transform_avx2
 undefined in sha1-sss3.ko. 
 
 I can rid #ifdef CONFIG_AS_AVX2 in patch1. The following works though:
 ifeq ($(avx2_supported),yes)
 sha1-ssse3-y += sha1_avx2_x86_64_asm.o
 endif

Yes, the sad thing is that the CONFIG_AS_* things aren't real config
symbols, despite the name.  They might be in the future when Kconfig can
run test probes (something we have needed for a very long time.)

The yes versus y, though, is a total faceplant.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][RESEND 3] hwrng: add randomness to system from rng sources

2014-03-16 Thread H. Peter Anvin
On 03/03/2014 03:51 PM, Kees Cook wrote:
 When bringing a new RNG source online, it seems like it would make sense
 to use some of its bytes to make the system entropy pool more random,
 as done with all sorts of other devices that contain per-device or
 per-boot differences.
 
 Signed-off-by: Kees Cook keesc...@chromium.org

I would like to raise again the concept of at least optionally using a
kernel thread, rather than a user-space daemon, to feed hwrng output to
the kernel pools.  The main service rngd provides is FIPS tests, but
those FIPS tests were withdrawn as a standard over 10 years ago and are
known to be extremely weak, at the very best a minimal sanity check.
Furthermore, they are completely useless against the output of any RNG
which contains a cryptographic whitener, which is the vast majority of
commercial sources -- this is especially so since rngd doesn't expect to
have to do any data reduction for output coming from hwrng.

Furthermore, if more than one hwrng device is available, rngd will only
be able to read one of them because of the way /dev/hwrng is implemented
in the kernel.

For contrast, the kernel could do data reduction just fine by only
crediting the entropy coming out of each hwrng driver with a fractional
amount.

That does *not* mean that there aren't random number generators which
require significant computation better done in user space.  For example,
an audio noise daemon or a lava lamp camera which requires video processing.

-hpa

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][RESEND 3] hwrng: add randomness to system from rng sources

2014-03-16 Thread H. Peter Anvin
On 03/04/2014 02:39 PM, Matt Mackall wrote:
 
 [temporarily coming out of retirement to provide a clue]
 
 The pool mixing function is intentionally _reversible_. This is a
 crucial security property.
 
 That means, if I have an initial secret pool state X, and hostile
 attacker controlled data Y, then we can do:
 
 X' = mix(X, Y)
 
  and 
 
 X = unmix(X', Y)
 
 We can see from this that the combination of (X' and Y) still contain
 the information that was originally in X. Since it's clearly not in Y..
 it must all remain in X'.
 

This of course assumes that the attacker doesn't know the state of the
pool X.

The other thing to note is that reversible doesn't necessarily mean
linear (the current mixing function is linear.)  AES, for example, is
reversible (if and only if you possess the key) but is highly nonlinear.

I'm not saying we should use AES to mix the pool -- it is almost
guaranteed to be too expensive.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] crypto: aesni - fix build on x86 (32bit)

2014-01-06 Thread H. Peter Anvin
Can the code be adjusted to compile for 32 bit x86 or is that pointless?

Tim Chen tim.c.c...@linux.intel.com wrote:


On Mon, 2013-12-30 at 15:52 +0200, Andy Shevchenko wrote:
 It seems commit d764593a crypto: aesni - AVX and AVX2 version of
AESNI-GCM
 encode and decode breaks a build on x86_32 since it's designed only
for
 x86_64. This patch makes a compilation unit conditional to
CONFIG_64BIT and
 functions usage to CONFIG_X86_64.

Thanks for catching and fixing it.

Tim
 
 Signed-off-by: Andy Shevchenko andriy.shevche...@linux.intel.com

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] crypto: aesni - fix build on x86 (32bit)

2014-01-06 Thread H. Peter Anvin
On 01/06/2014 09:57 AM, Tim Chen wrote:
 On Mon, 2014-01-06 at 09:45 -0800, H. Peter Anvin wrote:
 Can the code be adjusted to compile for 32 bit x86 or is that pointless?

 
 Code was optimized for wide registers.  So it is only meant for x86_64.
 

Aren't the wide registers the vector registers?  Or are you also
relying on 64-bit integer registers (in which case we should just rename
the file to make it clear it is x86-64 only.)

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] crypto: aesni - fix build on x86 (32bit)

2014-01-06 Thread H. Peter Anvin
On 01/06/2014 12:26 PM, Borislav Petkov wrote:
 On Mon, Jan 06, 2014 at 10:10:55AM -0800, Tim Chen wrote:
 Yes, the code is in the file named aesni_intel_avx.S. So it should be
 clear that the code is meant for x86_64.
 
 How do you deduce aesni_intel_avx.S is meant for x86_64 only from the
 name?
 
 Shouldn't it be called aesni_intel_avx-x86_64.S, as is the naming
 convention in arch/x86/crypto/
 

Quite.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] crypto: aesni - fix build on x86 (32bit)

2014-01-06 Thread H. Peter Anvin
On 01/06/2014 03:39 PM, Tim Chen wrote:
 
 Will renaming the file to aesni_intel_avx-x86_64.S make things clearer
 now?
 
 Tim

Yes.

Acked-by: H. Peter Anvin h...@linux.intel.com

Herbert, can you pick it up?

-hpa

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

2013-11-10 Thread H. Peter Anvin
On 11/10/2013 09:21 AM, Stephan Mueller wrote:
 
 Here you say it: we *assume*. And please bear in mind that we all know for a 
 fact that the theory behind quantum physics is not complete, because it does 
 not work together with the theory of relativity. That again is a hint that 
 there is NO proof that decay is really unpredictable.
 

Honestly, if you want to be taken seriously, this is not the kind of
thing to put in your discussion.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-19 Thread H. Peter Anvin
On 07/19/2013 04:16 PM, Greg Kroah-Hartman wrote:
 
 udev isn't doing any module loading, 'modprobe' is just being called for
 any new module alias that shows up in the system, and all of the drivers
 that match it then get loaded.
 
 How is it a problem if a module is attempted to be loaded that is
 already loaded?  How is it a problem if a different module is loaded for
 a device already bound to a driver?  Both of those should be total
 no-ops for the kernel.
 
 But, I don't know anything about the cpu code, how is loading a module
 causing problems?  That sounds like it needs to be fixes, as any root
 user can load modules whenever they want, you can't protect the kernel
 from doing that.
 

The issue here seems to be the dynamic binding nature of the crypto
subsystem.  When something needs crypto, it will request the appropriate
crypto module (e.g. crct10dif), which may race with detecting a specific
hardware accelerator based on CPUID or device information (e.g.
crct10dif_pclmul).

RAID has effectively the same issue, and we just solved it by
compiling in all the accelerators into the top-level module.

-hpa

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-19 Thread H. Peter Anvin
On 07/19/2013 04:26 PM, Greg Kroah-Hartman wrote:

 RAID has effectively the same issue, and we just solved it by
 compiling in all the accelerators into the top-level module.
 
 Then there's nothing to be done in udev or kmod, right?
 

I don't know.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3 v2] Optimize CRC32C calculation using PCLMULQDQ in crc32c-intel module

2012-09-27 Thread H. Peter Anvin

On 09/27/2012 03:44 PM, Tim Chen wrote:

Version 2
This version of the patch series fixes compilation errors for
32 bit x86 targets.

Version 1
This patch series optimized CRC32C calculations with PCLMULQDQ
instruction for crc32c-intel module.  It speeds up the original
implementation by 1.6x for 1K buffer and by 3x for buffer 4k or
more.  The tcrypt module was enhanced for doing speed test
on crc32c calculations.



Herbert - you are handling this one, right?

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] crypto: camellia-x86_64 - module init/exit functions should be static

2012-03-22 Thread H. Peter Anvin
On 03/21/2012 05:46 PM, Herbert Xu wrote:
 On Wed, Mar 21, 2012 at 05:40:03PM -0700, Randy Dunlap wrote:
 On 03/15/2012 01:11 PM, Jussi Kivilinna wrote:

 This caused conflict with twofish-x86_64-3way when compiled into kernel,
 same function names and not static.

 Have these patches been merged anywhere?
 I'm still seeing build problems in linux-next 20120321.
 
 Thanks for the reminder, I'll push these through today.

Please push these to Linus ASAP, it is breaking the x86-64 allyesconfig
build upstream right now.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH 00/16] Crypto keys and module signing [ver #2]

2011-12-05 Thread H. Peter Anvin
On 11/29/2011 03:42 PM, David Howells wrote:
 
 I have provided a couple of subtypes: DSA and RSA.  Both types have signature
 verification facilities available within the kernel, and both can be used for
 module signature verification with any encryption algorithm known by the PGP
 parser, provided the appropriate algorithm is compiled directly into the
 kernel.
 

Do we really need the complexity of a full OpenPGP parser?  Parsers are
notorious security problems.  Furthermore, using DSA in anything but a
hard legacy application is not something you want to encourage, so why
support DSA?

-hpa
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

2011-06-19 Thread H. Peter Anvin
On 06/19/2011 08:07 AM, Herbert Xu wrote:
 On Sun, Jun 19, 2011 at 09:38:43AM -0400, Neil Horman wrote:

 It sounds to me like, if its desireous to bypass the entropy pool, then we
 should bypass the /dev/random path altogether.  Why not write a hwrng driver
 that can export access to the rdrand instruction via a misc device.
 
 I presume the rdrand instruction can be used from user-space
 directly.
 

Yes, it can.

Again, RDRAND is not suitable for /dev/random (as opposed to
/dev/urandom users.)  /dev/urandom is used both by user space (and here
the only reason to hook it up to /dev/urandom is compatibility with
existing userspace; we are working separately to enabling user space
users like OpenSSL to use RDRAND directly) and by kernel users via the
internal APIs.

/dev/random as far as I can tell is only ever fed to userspace, however,
the guarantees that it is at least supposed to give are very, very
strict.  RDRAND do not fulfill those criteria, but we should be able to
use it as part of its implementation.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

2011-06-18 Thread H. Peter Anvin
On 06/17/2011 01:28 PM, Matt Mackall wrote:

 The one use case that it is cryptographically insufficient for is to
 seed a new PRNG, which probably means it is unsuitable for being fed
 as-is into /dev/random.
 
 The thing to understand about the input side of /dev/random is that it's
 COMPLETELY immune to untrusted data. So there's absolutely no harm in
 sending it data of questionable entropy so long as you don't tell it to
 account it. And, of course, if it DOES contain entropy, it makes things
 better.
 
 Think of it this way: I have a coin in my pocket. You, the attacker,
 tell me to flip it. You can do that any number of times and not improve
 your guess about the coin's state over your initial guess. This is what
 it means to have a reversible mixing function: no number of iterations
 reduces the degrees of freedom of the pool.
 

What I meant is that it is unsuitable to *bypass the pool* for
/dev/random.  I think we can -- and almost certainly should -- use
RDRAND on the input side; we just have to figure out the accounting.

However, RDRAND is high enough quality (and high enough bandwidth) that
it should be not just possible but desirable to completely bypass the
pool system for /dev/urandom users and pull straight from the RDRAND
instruction.  I don't actually know what the exact numbers look like,
but the stall conditions being looked at are of the order of every core
in the socket trying to execute RDRAND at the same time.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/5] tsc: wire up entropy generation function

2011-06-14 Thread H. Peter Anvin
On 06/13/2011 04:10 PM, Matt Mackall wrote:
 
 Well the issue is that samples are going to be roughly aligned to some
 multiple of the bus frequency. If an interrupt occurs on bus cycle X,
 this code will be hit at roughly TSC cycle X*M+d.
 
 This is correct; at the very least I would multiply the low 32 bits of
 the TSC with a 32-bit prime number before mixing.
 
 This multiply doesn't actually accomplish anything. Mixing known values
 into the pool is harmless because the mixing function is fully
 reversable. ie:
 
  unmix(sample, mix(sample, pool)) = pool
 
 If you didn't know the contents of pool before, you can't guess it
 after.
 
 The danger comes if you (a) already know pool and (b) can narrow sample
 to a tractable set of values. Then you can calculate a set of possible
 pool' values and compare the resulting output to confirm the actual
 state of pool' (aka state extension attack). Sticking a constant
 multiplier on sample doesn't affect this attack at all.
 

It only matters if you actually truncate it to LSBs.

 However, the big issue with this is that it's recursive... what causes
 this to be invoked... probably an interrupt, which is going to have been
 invoked by a timer, quite possible the TSC deadline timer.  Oops.
 
 ..and the sampling function already mixes in a TSC timestamp with the
 provided timestamp.

Right.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/5] tsc: wire up entropy generation function

2011-06-13 Thread H. Peter Anvin
On 06/13/2011 05:39 PM, Kent Borg wrote:
 I was assuming that drivers, responding to an interrupt from some
 external event, would want to make this call.
 Such as a network driver.

Those already are doing this.

 Two points:
 
 1. Why look at the high-order bits?  How are they going to have values
 that cannot be inferred?  If you are looking for entropy, the low-order
 bits is where they're going keep it.  (Think Willie Sutton.)  If looking
 at the low-byte is cheaper, then let's be cheap.  If, however, if
 looking at more bytes *is* as cheap, then there is no harm.  But in any
 case let's keep the code lean enough that it can be called from an
 interrupt service routine.

Consider the case where the TSC is running in steps of 64 and you're
using the low 6 bits.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-03 Thread H. Peter Anvin
On 11/03/2009 01:03 AM, Ingo Molnar wrote:

 .macro xmm_num opd xmm
 .ifc \xmm,%xmm0
 \opd = 0
 .endif
 .ifc \xmm,%xmm1
 \opd = 1
 .endif
 .ifc \xmm,%xmm2
 \opd = 2
 .endif
 .ifc \xmm,%xmm3
 \opd = 3
 .endif
 .ifc \xmm,%xmm4
 \opd = 4
 .endif
 .ifc \xmm,%xmm5
 \opd = 5
 .endif
 .ifc \xmm,%xmm6
 \opd = 6
 .endif
 .ifc \xmm,%xmm7
 \opd = 7
 .endif
 .ifc \xmm,%xmm8
 \opd = 8
 .endif
 .ifc \xmm,%xmm9
 \opd = 9
 .endif
 .ifc \xmm,%xmm10
 \opd = 10
 .endif
 .ifc \xmm,%xmm11
 \opd = 11
 .endif
 .ifc \xmm,%xmm12
 \opd = 12
 .endif
 .ifc \xmm,%xmm13
 \opd = 13
 .endif
 .ifc \xmm,%xmm14
 \opd = 14
 .endif
 .ifc \xmm,%xmm15
 \opd = 15
 .endif
 .endm

 .macro PSHUFB_XMM xmm1 xmm2
 xmm_num pshufb_opd1 \xmm1
 xmm_num pshufb_opd2 \xmm2
 .if (pshufb_opd1  8)  (pshufb_opd2  8)
 .byte 0x66, 0x0f, 0x38, 0x00, 0xc0 | pshufb_opd1 | (pshufb_opd23)
 .elseif (pshufb_opd1 = 8)  (pshufb_opd2  8)
 .byte 0x66, 0x41, 0x0f, 0x38, 0x00, 0xc0 | (pshufb_opd1-8) | (pshufb_opd23)
 .elseif (pshufb_opd1  8)  (pshufb_opd2 = 8)
 .byte 0x66, 0x44, 0x0f, 0x38, 0x00, 0xc0 | pshufb_opd1 | ((pshufb_opd2-8)3)
 .else
 .byte 0x66, 0x45, 0x0f, 0x38, 0x00, 0xc0 | (pshufb_opd1-8) | 
 ((pshufb_opd2-8)3)
 .endif
 .endm
 
 Looks far too clever, i like it :-) We have quite a few assembly macros 
 in arch/x86/include/asm/. The above one could be put into calling.h for 
 example.
 

I would really like to see something like that, with only one minor
tweak: please use submacros to generate the REX and MODRM bytes, since
we are *guaranteed* to want to do the same thing again.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h

2009-06-17 Thread H. Peter Anvin
Ingo Molnar wrote:
  
 +static inline int kernel_fpu_using(void)
 +{
 +if (in_interrupt()  !(read_cr0()  X86_CR0_TS))
 +return 1;
 +return 0;
 +}
 +
 
 Looks sane to me. Herbert, do you ack it?
 

Although I have to say, the structure of:

if (boolean test)
return 1;
return 0;

... truly was hit with the ugly stick.  It really should be:

static inline bool kernel_fpu_using(void)
{
return in_interrupt()  !(read_cr0()  C86_CR0_TS);
}

Huang: if I recall correctly, these functions were originally designed
to deal with the fact that VIA processors generate spurious #TS faults
due to broken design of the Padlock instructions.  The AES and PCLMUL
instructions actually use SSE registers and so will require different
structure.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h

2009-06-17 Thread H. Peter Anvin
Herbert Xu wrote:
 On Wed, Jun 17, 2009 at 10:06:44AM -0700, H. Peter Anvin wrote:
 
 Huang: if I recall correctly, these functions were originally designed
 to deal with the fact that VIA processors generate spurious #TS faults
 due to broken design of the Padlock instructions.  The AES and PCLMUL
 instructions actually use SSE registers and so will require different
 structure.
 
 No irq_ts_save was the one designed for the VIA, the Intel stuff
 does save the FPU state.
 

Ah, sorry, misremembered.  Great!  Will apply.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h

2009-06-17 Thread H. Peter Anvin
Huang Ying wrote:
 
 After some thinking, I think something as follow may be more
 appropriate:
 
 /* This may be useful for someone else */
 static inline bool fpu_using(void)
 {
   return !(read_cr0()  X86_CR0_TS);
 }
 
 static inline bool irq_fpu_using(void)
 {
   return in_interrupt()  fpu_using();
 }
 

Yes, looks good.  I'll pull in the patch as soon as I get it.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html