Re: [PATCH] Add Streamlined NTRU Prime sntrup761.

2023-06-19 Thread Niels Möller
Niels Möller  writes:

> And int32_divmod_uint14 looked unused. 

My mistake, it's not unused. It is used (via int32_mod_uint14) by
F3_freeze and Fq_freeze, which appear to use signed representation, |x|
<= 1 and |x| <= (q-1)/2 respectively.

> For sorting, it may need a minor reorg to get rid of the unneeded
> variant.

See patch below. I think that makes the code simpler, but it might be
best to leave as is for now.

>> My take was that it would be nice to add sntrup761 to Nettle ASAP to
>> stabilize API and establish support for the algorithm -- we can optimize
>> or improve the implementation later on (there are many optimized
>> implementations around for different architectures out there).
>
> Makes sense, if it's clear what api and abi should look like (but, e.g.,
> use of union nettle_block16 does affect the abi, I think).

Having a closer look at the header file defining the api. I see no abi
subtleties here, only naming nits.

  * sntrup761_keypair: should be sntrup761_generate_keypair, for consistency.

  * sntrup761_enc, sntrup761_dec: Maybe abbreviate less, is
_encapsulate and _decapsulate too much? Or is _enc and _dec really
established in the area?

  * SNTRUP761_PUBLICKEY_SIZE: I think it would be more consistent with
an underscore, _PUBLIC_KEY_SIZE.

  * SNTRUP761_SECRETKEY_SIZE: I prefer SNTRUP761_PRIVATE_KEY_SIZE is
more consistent (maybe "private key" is not modern, but it's the
terminology used for all other asymmetric algorithms in nettle).

  * SNTRUP761_CIPHERTEXT_SIZE: Probably right, even though I'm a bit
confused by the "ciphertext" terminology when there's no
corresponding plaintext.

  * SNTRUP761_SIZE: This needs a more specific name, maybe _SECRET_SIZE,
_SHARED_SECRET_SIZE, _SESSION_KEY_SIZE, _OUTPUT_SIZE, ...? 

In your docs, I noticed a copy-paste error in the docs for
SNTRUP761_PUBLICKEY_SIZE.

Things I think are desirable to do before merging an initial version:
Agree on naming. Rename the single-lower-case-letter macros in the .c
file to macro-like names. Add valgrind-based tests of side-channel silence.

(I'd need to read both spec and implementation closer to have more
opinions on the implementation).

Regards,
/Niels

diff --git a/sntrup761.c b/sntrup761.c
index dc1ca015..fb7fd761 100644
--- a/sntrup761.c
+++ b/sntrup761.c
@@ -55,24 +55,20 @@ crypto_hash_sha512 (unsigned char *out, const unsigned char 
*in, int inlen)
   sha512_digest (&ctx, SHA512_DIGEST_SIZE, out);
 }
 
-/* from supercop-20201130/crypto_sort/int32/portable4/int32_minmax.inc */
-#define int32_MINMAX(a,b) \
+#define uint32_MINMAX(a,b) \
 do { \
-  int64_t ab = (int64_t)b ^ (int64_t)a; \
-  int64_t c = (int64_t)b - (int64_t)a; \
-  c ^= ab & (c ^ b); \
-  c >>= 31; \
-  c &= ab; \
-  a ^= c; \
-  b ^= c; \
+  uint64_t d = (uint64_t)b - (uint64_t)a; \
+  uint32_t masked_d = (d >> 32) & d; \
+  a += masked_d; \
+  b -= masked_d; \
 } while(0)
 
-/* from supercop-20201130/crypto_sort/int32/portable4/sort.c */
+/* Based on supercop-20201130/crypto_sort/int32/portable4/sort.c, but
+   using uint32_t rather than int32_t. */
 static void
-crypto_sort_int32 (void *array, long long n)
+crypto_sort_uint32 (uint32_t *x, long long n)
 {
   long long top, p, q, r, i, j;
-  int32_t *x = array;
 
   if (n < 2)
 return;
@@ -86,11 +82,11 @@ crypto_sort_int32 (void *array, long long n)
   while (i + 2 * p <= n)
{
  for (j = i; j < i + p; ++j)
-   int32_MINMAX (x[j], x[j + p]);
+   uint32_MINMAX (x[j], x[j + p]);
  i += 2 * p;
}
   for (j = i; j < n - p; ++j)
-   int32_MINMAX (x[j], x[j + p]);
+   uint32_MINMAX (x[j], x[j + p]);
 
   i = 0;
   j = 0;
@@ -101,9 +97,9 @@ crypto_sort_int32 (void *array, long long n)
  {
if (j == n - q)
  goto done;
-   int32_t a = x[j + p];
+   uint32_t a = x[j + p];
for (r = q; r > p; r >>= 1)
- int32_MINMAX (a, x[j + r]);
+ uint32_MINMAX (a, x[j + r]);
x[j + p] = a;
++j;
if (j == i + p)
@@ -116,9 +112,9 @@ crypto_sort_int32 (void *array, long long n)
{
  for (j = i; j < i + p; ++j)
{
- int32_t a = x[j + p];
+ uint32_t a = x[j + p];
  for (r = q; r > p; r >>= 1)
-   int32_MINMAX (a, x[j + r]);
+   uint32_MINMAX (a, x[j + r]);
  x[j + p] = a;
}
  i += 2 * p;
@@ -127,9 +123,9 @@ crypto_sort_int32 (void *array, long long n)
  j = i;
  while (j < n - q)
{
- int32_t a = x[j + p];
+ uint32_t a = x[j + p];
  for (r = q; r > p; r >>= 1)
-   int32_MINMAX (a, x[j + r]);
+   uint32_MINMAX (a, x[j + r]);
  x[j + p] = a;
  ++j;
}
@@ -139,23 +135,6 @@ cr

Re: [PATCH] Add Streamlined NTRU Prime sntrup761.

2023-06-19 Thread Niels Möller
Simon Josefsson  writes:

> No objection, but I find it challenging to come up with a revised patch
> that I feel comfortable with in the near future.  I'm not sure I even
> understood what unused functions you noticed (and how?); that fix would
> be easy to do.  Gaining confidence in rewritten parts feels a bit more
> complicated.  Would you like to revise the code?

I may be able to try it out in the next few weeks. If I just check out
your branch, do tests work out of the box or do I need to somehow patch
in the DRBG-CTR-AES256 too? 

There are some style details in the current patch that bothers me a bit,
e.g., "q" used as a regular parameter/variable in most of the code, and
then defined as a macro further down.

  #define q 4591

should be changed to something like

  #define SNTRUP761_Q 4591

(or maybe just SNTRUP_Q if it is intended to parametrize the code). 

> Aligned API/ABI's are nice, and good to get right early.  Is
> 'nettle_block16' still the right way to do this, or is this possible to
> (with arguably more readable) aligned() or alignof() attributes?

I think nettle_block16 is the way to go, for representing 16 byte blocks
where we're in control of the alignment (in contrast, e.g, to user's
plaintext/cryptotext for which we don't require any alignment).

I'd like to use alignas, and make nettle_block16 16-byte aligned at
least on x86_64 (and on other archs where assembly code can benefit from
alignment larger than that of an uint64). But aligned is C11, and I
hesitate to require that (while I'm considering requiring C99). And
since this is in public header files and part of the abi, it doesn't
make sense with compiler specific ifdefs.

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: [PATCH] Add Streamlined NTRU Prime sntrup761.

2023-06-19 Thread Simon Josefsson
Niels Möller  writes:

> Simon Josefsson  writes:
>
>>> In general, it makes sense to add support for post-quantum key exchange
>>> methods, another candidate seems to be https://classic.mceliece.org/
>>> (with the drawback of much larger pubkeys).
>>
>> +1
>
> I've been asking some other people too. sntrup seems to be a good option.
> Classic mcelice makes sense too, with a different underlying problem,
> and a different tradeoff (possible more conservative security, but
> larger pubkeys). Other NIST candidates Saber and Kyber I'm told have
> some patent issues, so I prefer not to touch them until that has been
> sorted out.

Sounds good, and the same for non-streamlined NTRU Prime.

> So should we focus on getting sntrup761 in as the first post-quantum
> key exchange algorithm?

+1

>> Also, do we want to deviate from audited implementations?
>
> Good question. I think the answer is yes. Some considerations:
>
> * We need proper tests for changes, including side-channel things that
>   can be tested with valgrind.
>  
> * If I have to choose between audited code and readable code, I think I
>   would usually go for the latter.
>
> * It's nice to have code consistent with general style in Nettle. And
>   more importantly, run-time selection of code should be done with the
>   same fat machinery as for other algorithms.
>
> * To me, it seems unlikely that we could wrap the audited reference
>   implementation in a way that is both practical, and makes the audit
>   provide significant confidence in the complete Nettle implementation.

No objection, but I find it challenging to come up with a revised patch
that I feel comfortable with in the near future.  I'm not sure I even
understood what unused functions you noticed (and how?); that fix would
be easy to do.  Gaining confidence in rewritten parts feels a bit more
complicated.  Would you like to revise the code?  Maybe we can all
review on the list.

>> My take was that it would be nice to add sntrup761 to Nettle ASAP to
>> stabilize API and establish support for the algorithm -- we can optimize
>> or improve the implementation later on (there are many optimized
>> implementations around for different architectures out there).
>
> Makes sense, if it's clear what api and abi should look like (but, e.g.,
> use of union nettle_block16 does affect the abi, I think).

Aligned API/ABI's are nice, and good to get right early.  Is
'nettle_block16' still the right way to do this, or is this possible to
(with arguably more readable) aligned() or alignof() attributes?

/Simon


signature.asc
Description: PGP signature
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: [PATCH] Add Streamlined NTRU Prime sntrup761.

2023-06-19 Thread Niels Möller
Simon Josefsson  writes:

>> In general, it makes sense to add support for post-quantum key exchange
>> methods, another candidate seems to be https://classic.mceliece.org/
>> (with the drawback of much larger pubkeys).
>
> +1

I've been asking some other people too. sntrup seems to be a good option.
Classic mcelice makes sense too, with a different underlying problem,
and a different tradeoff (possible more conservative security, but
larger pubkeys). Other NIST candidates Saber and Kyber I'm told have
some patent issues, so I prefer not to touch them until that has been
sorted out.

So should we focus on getting sntrup761 in as the first post-quantum
key exchange algorithm?

>>> Please consider it a first iteration for early review.
>>
>> I initially looked at the arithmetics. The signed (int32) sorting and
>> division seems unused?
>
> Do you mean crypto_sort_int32?  It is called by crypto_sort_uint32.

And int32_divmod_uint14 looked unused. For sorting, it may need a minor
reorg to get rid of the unneeded variant.

>> For the side-channel silent divmod function, it seems we divide
>> exclusively with one or a few constants, then we could precompute
>> needed constants and perhaps simplify a bit.
>
> Possibly - this is reference code and supports other sntrup lengths.
> Supporting multiple lengths often leads to complexity which leads to
> reduced security.  As far as I can tell, the non-sntrup761 lengths were
> insisted upon by NIST.  So the answer depends on if we want to allow
> this code to be re-used by other sntrup lengths too.

One doesn't have to hard code a single divisor. I'm thinking of
precomputing the reciprocal (and any other magic constant depending on
the divisor), currently done at runtime as

   uint32_t v = 0x8000;
   v /= m;

(unless compiler does lots of inlining).

> Also, do we want to deviate from audited implementations?

Good question. I think the answer is yes. Some considerations:

* We need proper tests for changes, including side-channel things that
  can be tested with valgrind.
 
* If I have to choose between audited code and readable code, I think I
  would usually go for the latter.

* It's nice to have code consistent with general style in Nettle. And
  more importantly, run-time selection of code should be done with the
  same fat machinery as for other algorithms.

* To me, it seems unlikely that we could wrap the audited reference
  implementation in a way that is both practical, and makes the audit
  provide significant confidence in the complete Nettle implementation.

> My take was that it would be nice to add sntrup761 to Nettle ASAP to
> stabilize API and establish support for the algorithm -- we can optimize
> or improve the implementation later on (there are many optimized
> implementations around for different architectures out there).

Makes sense, if it's clear what api and abi should look like (but, e.g.,
use of union nettle_block16 does affect the abi, I think).

Regards,
/Niels


-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: [PATCH] Add Streamlined NTRU Prime sntrup761.

2023-06-19 Thread Simon Josefsson
Thanks for reviewing this!

Niels Möller  writes:

> Simon Josefsson  writes:
>
>> This adds sntrup761, what do you think?
>
> What's the context/usecase? I saw some mails on the ietf-ssh list, but
> it was a bit unclear to me what the status of this algorithm is.

Sntrup761 is used by default in OpenSSH (hybrid with X25519) since some
years ago, and they are committed to support it for a many more years.
The ietf-ssh posts were about documenting that protocol.

My context for wanting it in Nettle is to reduce code duplication for
other projects that will end up implementing it too.  Unfortunately I
think we'll need to add native sntrup761 code to some other projects
anyway, until Nettle with sntrup761 is widely available.

> In general, it makes sense to add support for post-quantum key exchange
> methods, another candidate seems to be https://classic.mceliece.org/
> (with the drawback of much larger pubkeys).

+1

>> Please consider it a first iteration for early review.
>
> I initially looked at the arithmetics. The signed (int32) sorting and
> division seems unused?

Do you mean crypto_sort_int32?  It is called by crypto_sort_uint32.

> For the side-channel silent divmod function, it seems we divide
> exclusively with one or a few constants, then we could precompute
> needed constants and perhaps simplify a bit.

Possibly - this is reference code and supports other sntrup lengths.
Supporting multiple lengths often leads to complexity which leads to
reduced security.  As far as I can tell, the non-sntrup761 lengths were
insisted upon by NIST.  So the answer depends on if we want to allow
this code to be re-used by other sntrup lengths too.  Also, do we want
to deviate from audited implementations?

My take was that it would be nice to add sntrup761 to Nettle ASAP to
stabilize API and establish support for the algorithm -- we can optimize
or improve the implementation later on (there are many optimized
implementations around for different architectures out there).  The
patch I posted is similar to the reference code that is also used by
OpenSSH.

/Simon


signature.asc
Description: PGP signature
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se