Hi

On 2/12/22 11:58 PM, Niels Möller wrote:
Tianjia Zhang <[email protected]> writes:

Signed-off-by: Tianjia Zhang <[email protected]>
---
  Makefile.in                  |   1 +
  nettle-meta-ciphers.c        |   1 +
  nettle-meta.h                |   2 +
  sm4-meta.c                   |  49 ++++++++
  sm4.c                        | 225 +++++++++++++++++++++++++++++++++++
  sm4.h                        |  71 +++++++++++
  testsuite/meta-cipher-test.c |   3 +-
  7 files changed, 351 insertions(+), 1 deletion(-)
  create mode 100644 sm4-meta.c
  create mode 100644 sm4.c
  create mode 100644 sm4.h

Overall looks pretty good. But I wonder if one could avoid having two
copies of the subkeys.

+void
+sm4_set_key(struct sm4_ctx *ctx, const uint8_t *key)
+{
+  uint32_t rk[4];
+  int i;
+
+  rk[0] = READ_UINT32(key +  0) ^ fk[0];
+  rk[1] = READ_UINT32(key +  4) ^ fk[1];
+  rk[2] = READ_UINT32(key +  8) ^ fk[2];
+  rk[3] = READ_UINT32(key + 12) ^ fk[3];
+
+  for (i = 0; i < 32; i += 4)
+    {
+      rk[0] ^= sm4_key_sub(rk[1] ^ rk[2] ^ rk[3] ^ ck[i + 0]);
+      rk[1] ^= sm4_key_sub(rk[2] ^ rk[3] ^ rk[0] ^ ck[i + 1]);
+      rk[2] ^= sm4_key_sub(rk[3] ^ rk[0] ^ rk[1] ^ ck[i + 2]);
+      rk[3] ^= sm4_key_sub(rk[0] ^ rk[1] ^ rk[2] ^ ck[i + 3]);
+
+      ctx->rkey_enc[i + 0] = rk[0];
+      ctx->rkey_enc[i + 1] = rk[1];
+      ctx->rkey_enc[i + 2] = rk[2];
+      ctx->rkey_enc[i + 3] = rk[3];
+      ctx->rkey_dec[31 - 0 - i] = rk[0];
+      ctx->rkey_dec[31 - 1 - i] = rk[1];
+      ctx->rkey_dec[31 - 2 - i] = rk[2];
+      ctx->rkey_dec[31 - 3 - i] = rk[3];
+    }
+}

So subkeys are identical for encrypt and decrypt, just used in opposite
order? It seems unnecessary to use two copies.

+static void
+sm4_crypt_block(const uint32_t *rk, uint8_t *dst, const uint8_t *src)
+{
+  uint32_t x[4], i;

The loop counter i should have type plain int (or unsigned; in Nettle I
tend to use unsigned for non-negative values).

+  x[0] = READ_UINT32(src + 0 * 4);
+  x[1] = READ_UINT32(src + 1 * 4);
+  x[2] = READ_UINT32(src + 2 * 4);
+  x[3] = READ_UINT32(src + 3 * 4);
+
+  for (i = 0; i < 32; i += 4)
+    {
+      x[0] = sm4_round(x[0], x[1], x[2], x[3], rk[i + 0]);
+      x[1] = sm4_round(x[1], x[2], x[3], x[0], rk[i + 1]);
+      x[2] = sm4_round(x[2], x[3], x[0], x[1], rk[i + 2]);
+      x[3] = sm4_round(x[3], x[0], x[1], x[2], rk[i + 3]);
+    }

Since the x[] array is indexed only by constants, you could consider
using scalar variables

   uint32_t x0, x1, x2 x3;

Probably makes no difference with modern compilers, but we'd like the
values to be allocated in registers, not as an array on the stack. And
similarly for the rk array above.

+  WRITE_UINT32(dst + 0 * 4, x[3 - 0]);
+  WRITE_UINT32(dst + 1 * 4, x[3 - 1]);
+  WRITE_UINT32(dst + 2 * 4, x[3 - 2]);
+  WRITE_UINT32(dst + 3 * 4, x[3 - 3]);
+}

If this is the same for encrypt and decrypt, you could move the block
loop into this function too, to avoid code duplication.

+void
+sm4_encrypt(const struct sm4_ctx *context,
+           size_t length,
+           uint8_t *dst,
+           const uint8_t *src)
+{
+  const uint32_t *keys = context->rkey_enc;
[...]
+void
+sm4_decrypt(const struct sm4_ctx *context,
+           size_t length,
+           uint8_t *dst,
+           const uint8_t *src)
+{
+  const uint32_t *keys = context->rkey_dec;

I see two alternatives to use only one copy of the subkeys:

1. Keep using a single sm4_set_key function, but store only the first
    copy (currently rkey_enc). Change the sm4_decrypt function to access
    subkeys in the opposite order. To keep a shared sm4_crypt_block
    function, that function would need another +1/-1 argument used to
    determine subkey access order.

2. Implement two separate functions sm4_set_encrypt_key and
    sm4_set_decrypt_key, where the latter stores the subkeys in opposite
    order. sm4_set_decrypt_key could be implemented as
    sm4_set_encrypt_key + sm4_invert_key, where sm4_invert_key is a function
    that just reverses the order (a bit similar to _nettle_aes_invert,
    but simpler). Then the same sm4_crypt function can be used for both
    encrypt and decrypt.

Not sure what's best, but I'd lean towards (2), since simplicity of the
more performance critical encrypt/decrypt operation seems more important
than a simplicity at key setup.

Regards,
/Niels


sorry for the late reply.

Thanks for your suggestion, it helped me a lot, I agree with your second
suggestion, implement two separate set_key functions, and use one
function to unify encryption and decryption, and also improve the code
with other suggestions you gave, these will be send in a later v2 patch.

Best regards,
Tianjia
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs

Reply via email to