On 31 August 2018 at 17:56, Ard Biesheuvel wrote:
> Hi Eric,
>
> On 31 August 2018 at 10:01, Eric Biggers wrote:
>> From: Eric Biggers
>>
>> Optimize ChaCha20 NEON performance by:
>>
>> - Implementing the 8-bit rotations using the 'vtbl.8' instruction.
>> - Streamlining the part that adds the
Hi Eric,
On 31 August 2018 at 10:01, Eric Biggers wrote:
> From: Eric Biggers
>
> Optimize ChaCha20 NEON performance by:
>
> - Implementing the 8-bit rotations using the 'vtbl.8' instruction.
> - Streamlining the part that adds the original state and XORs the data.
> - Making some other small
From: Eric Biggers
Optimize ChaCha20 NEON performance by:
- Implementing the 8-bit rotations using the 'vtbl.8' instruction.
- Streamlining the part that adds the original state and XORs the data.
- Making some other small tweaks.
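
(For readers following along: the first item above relies on the fact that
rotating a 32-bit word by 8 bits is just a byte permutation, which a table
lookup can do in one step. Below is a minimal Python model of that idea, not
the patch's code. It assumes little-endian lanes and a 64-bit D register
holding two words; the names ROL8_TABLE, rotl32 and rotl8_by_table are
hypothetical and chosen for illustration only.)

```python
# Model of an 8-bit left rotation done as a byte-table lookup, in the
# spirit of a NEON 'vtbl.8' on one 64-bit D register (two 32-bit lanes).
# Assumption: lanes are little-endian, so byte 0 of a lane is its LSB.

# Output byte i of each lane takes input byte (i - 1) mod 4 of that lane.
ROL8_TABLE = [lane * 4 + (i - 1) % 4 for lane in range(2) for i in range(4)]


def rotl32(x, n):
    """Reference rotate-left of a 32-bit word using shifts and OR."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def rotl8_by_table(words):
    """Rotate two 32-bit words left by 8 bits via a byte permutation."""
    assert len(words) == 2
    data = b"".join(w.to_bytes(4, "little") for w in words)
    permuted = bytes(data[idx] for idx in ROL8_TABLE)
    return [int.from_bytes(permuted[i:i + 4], "little") for i in (0, 4)]


if __name__ == "__main__":
    sample = [0x11223344, 0xAABBCCDD]
    print([hex(w) for w in rotl8_by_table(sample)])
    print([hex(rotl32(w, 8)) for w in sample])
```

(The shift-based reference needs a shift plus an insert or OR per rotation,
whereas the table form is a single permute once the index table is loaded,
which is presumably where the gain on in-order cores comes from.)
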
On ARM Cortex-A7, these optimizations improve ChaCha20