On Thu, May 24, 2018 at 12:11:35PM -0500, Denis Kenzior wrote:
> 
> Well, I'm not sure where the laziness comment is coming from.  We have at
> least two user-space implementations that implement PBKDF on top of AF_ALG.
> But a typical invocation of PBKDF runs a couple of thousand iterations.
> That is a lot of system call overhead.  Would it not be better to fix things
> to be more efficient rather than worry about how 'mistakes were made'?

Even where there is hardware acceleration, I suspect that it might be
more efficient (as in, result in a faster implementation) if the user
PBKDF application was changed to use its own in-userspace software
implementation.  Many/most hardware implementations are optimized for
throughput (e.g., bulk data operations), and it's not obvious to me
that once you add the syscall overhead, it's actually faster to use
the hardware acceleration.

Has anyone actually done the experiment and verified that it was in
fact a win to use AF_ALG on at least _some_ platform?  What about the
common case for most platforms, even those that have some kind of
hardware acceleration that can only be accessed by the kernel?
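
A minimal sketch of the user-space half of such an experiment, in
Python for brevity.  It assumes only the standard hashlib and hmac
modules; the names pbkdf2_sha256 and time_pbkdf2 are hypothetical, and
nothing here touches AF_ALG.  The explicit loop shows why the iteration
count matters: PBKDF2 needs one HMAC per iteration, so a version that
sends every HMAC through a kernel socket pays at least one system call
round trip per iteration.

```python
import hashlib
import hmac
import time


def pbkdf2_sha256(password: bytes, salt: bytes, iterations: int,
                  dklen: int = 32) -> bytes:
    """User-space PBKDF2-HMAC-SHA256: one HMAC call per iteration."""
    out = b""
    block = 1
    while len(out) < dklen:
        # U_1 = HMAC(password, salt || INT_32_BE(block))
        u = hmac.new(password, salt + block.to_bytes(4, "big"),
                     hashlib.sha256).digest()
        t = int.from_bytes(u, "big")
        for _ in range(iterations - 1):
            # U_i = HMAC(password, U_{i-1}); T is the XOR of all U_i
            u = hmac.new(password, u, hashlib.sha256).digest()
            t ^= int.from_bytes(u, "big")
        out += t.to_bytes(32, "big")
        block += 1
    return out[:dklen]


def time_pbkdf2(iterations: int = 4096) -> float:
    """Wall-clock seconds for one library PBKDF2 run, no kernel crypto."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"password", b"salt", iterations)
    return time.perf_counter() - start
```

Timing the same derivation through AF_ALG on the target machine and
comparing it against time_pbkdf2() would give the number asked for
above; this sketch does not attempt that part.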

(Amusing war story: on the hardware where we first experimented with
ext4 encryption, the hardware "acceleration" offered by the ARM core in
question was *slower* than a well-tuned software-only implementation
on the same ARM CPU!  :-)

                                                - Ted
