Re: additional API for SHAKE streaming read

Niels Möller Thu, 14 Mar 2024 04:01:40 -0700

Daiki Ueno <[email protected]> writes:

>> * One could perhaps use index == 0 instead of index == block_size for
>>   the case that there is no buffered data. But the current convention
>>   does make your "if (length <= left)" nice and simple.
>
> I agree that the current convention is a bit awkward, so in the attached
> patch I changed to use index == 0 as the indicator where buffering is
> needed.  That actually makes the code simpler as we can defer buffering
> until when the data is read.  One drawback though is that it causes
> additional memcpy in a corner case where the _shake_output is used to
> retrieve data smaller than the block size.


I wonder if that will still be simpler if one also moves the
sha3_permute calls?

I have merged your previous version to a branch
add-sha3_256_shake_output, and ci looks green. So perhaps best to merge
that to master, and iterate from there?

>> * It looks a bit backwards to me that each iteration *first* copies data
>>   to the digest, and *then* calls sha3_permute. In case no more data is
>>   to be output, that sha3_permute call is wasted. It would be more
>>   natural to me to not call sha3_permute until we know the output is
>>   needed. But to fix that and still keep things nice for the first
>>   output block, I think one would need to reorganize _nettle_sha3_pad to
>>   not imply a call to sha3_permute (via sha3_absorb). So that's better
>>   done in a separate change.
>
> Right, I can do that after the current patch is settled.

I've done a bit of hacking locally. What I did was to take out the
xoring parts of sha3_absorb into it's own function sha3_xor_block, and
let sha3_pad_shake use that, without any call to sha3_permute. And then
call sha3_permute as output is needed.

>> * I'm still tempted to use ctx->index = ~index rather than ctx->index =
>>   index | INDEX_HIGH_BIT. But maybe that would just be too obfuscated.
>
> I'm actually not sure how this works.  For example, if unsigned int is
> 32-bit and index is 3, wouldn't ~index turn to 0xfffffffc, while index |
> INDEX_HIGH_BIT is 0x80000003?

It would be a different representation, with the very minor advantage
that the INDEX_HIGH_BIT value isn't needed (in source code, or handled
at runtime). Like

  index = ctx->index;

  if (index < sizeof(ctx->block)) 
    { ... first call to shake_output, pad and initialize...  }
  else
    index = ~index;

  assert (index <= sizeof(ctx->block));

  ... output processing ...

  ctx->index = ~index;

>> In next step, to also support shake128, we should generalize your code
>> using an internal function _sha3_shake_output taking block and block
>> size as arguments.
>
> Yes.

I've tried that in my local hack, I think it's rather straight-forward.
(I might be able to post corresponding patch later). What's unclear is
how much to share between _shake and shake_output. One could define
_shake as _shake_output + _init. The drawback I see is that (i) we would
allow _shake_output followed by _shake, which isn't proper api usage,
and (ii) _shake needs a lot less logic since it should always start by
padding, and it doesn't need to buffer any data, so it seems a bit wrong
to have it call shake_output that does this unneeded extra work.

/Regards,
Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
_______________________________________________
nettle-bugs mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Re: additional API for SHAKE streaming read

Reply via email to