https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218203
Bug ID: 218203
Summary: Implement AVX2 accelerated Fletcher algorithms
Product: Base System
Version: CURRENT
Hardware: amd64
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: [email protected]
Reporter: [email protected]
Intel has published a fairly straightforward implementation of Fletcher4
leveraging AVX2 instructions:
https://software.intel.com/en-us/articles/fast-computation-of-fletcher-checksums
Using this white paper and compiler intrinsics, I was able to build a
rudimentary version that is nearly twice as fast as the existing code. It
should be feasible to swap out the existing scalar, portable implementation for
this faster variant, similar to the way Linux offers SIMD-accelerated versions
of cryptographic and hashing routines within its kernel.
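
To make the idea concrete, here is a minimal sketch in portable C, not the
code from my prototype or from the white paper. It shows the scalar fletcher4
loop next to a four-lane model of the vectorized approach. The names
fletcher4_sums_t, fletcher4_scalar and fletcher4_lanes are made up for
illustration, and the lane version assumes the word count is a multiple of 4.
In a real AVX2 kernel each four-element array would presumably live in one
256-bit register and the inner loop would become a handful of 64-bit vector
adds.

```c
#include <stddef.h>
#include <stdint.h>

/* Four 64-bit running sums over 32-bit input words (hypothetical type). */
typedef struct { uint64_t a, b, c, d; } fletcher4_sums_t;

/* Scalar reference: every step depends on the previous one. */
static void
fletcher4_scalar(const uint32_t *w, size_t nwords, fletcher4_sums_t *out)
{
    uint64_t a = 0, b = 0, c = 0, d = 0;

    for (size_t i = 0; i < nwords; i++) {
        a += w[i];
        b += a;
        c += b;
        d += c;
    }
    out->a = a; out->b = b; out->c = c; out->d = d;
}

/*
 * Lane model: lane j sees words j, j+4, j+8, ... and keeps its own four
 * sums, so the four lanes are independent inside the loop.
 * Assumption: nwords is a multiple of 4 (trailing words are not handled).
 */
static void
fletcher4_lanes(const uint32_t *w, size_t nwords, fletcher4_sums_t *out)
{
    uint64_t a[4] = {0}, b[4] = {0}, c[4] = {0}, d[4] = {0};

    for (size_t i = 0; i + 4 <= nwords; i += 4) {
        for (int j = 0; j < 4; j++) {
            a[j] += w[i + j];
            b[j] += a[j];
            c[j] += b[j];
            d[j] += c[j];
        }
    }

    /*
     * Fold the per-lane sums back into the scalar result. The
     * coefficients follow from the position of each word in the
     * interleaved stream; arithmetic wraps modulo 2^64 as in the
     * scalar loop.
     */
    out->a = a[0] + a[1] + a[2] + a[3];
    out->b = 4 * (b[0] + b[1] + b[2] + b[3])
        - a[1] - 2 * a[2] - 3 * a[3];
    out->c = 16 * (c[0] + c[1] + c[2] + c[3])
        - 6 * b[0] - 10 * b[1] - 14 * b[2] - 18 * b[3]
        + a[2] + 3 * a[3];
    out->d = 64 * (d[0] + d[1] + d[2] + d[3])
        - 48 * c[0] - 64 * c[1] - 80 * c[2] - 96 * c[3]
        + 4 * b[0] + 10 * b[1] + 20 * b[2] + 34 * b[3]
        - a[3];
}
```

Both functions should produce identical sums for any buffer whose length is a
multiple of four words, which makes the scalar routine a convenient
self-check for any accelerated variant.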
As a matter of fact, zfsonlinux is already doing this:
https://github.com/zfsonlinux/zfs/tree/482cd9ee69e88710e9241fac220501ea4e101d19/module/zcommon
I understand the desire to stay close to the reference ZFS implementation in
Illumos, and we may not need quite as many versions of fletcher4 as zfsonlinux
carries. (They include a superscalar version that presumably tries to take
advantage of out-of-order execution, hoping the microarchitecture can schedule
the instructions efficiently once it notices the lack of data dependencies.)
Still, it seems silly to ignore a working implementation that is measurably
faster on CPUs that support it. It has even been backported to SSSE3
instructions:
https://github.com/zfsonlinux/zfs/blob/482cd9ee69e88710e9241fac220501ea4e101d19/module/zcommon/zfs_fletcher_sse.c
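
For illustration only, choosing between several variants at run time could
look roughly like the sketch below. This is not the zfsonlinux interface; all
names (cksum_t, fletcher4_impl_t, fletcher4_select and so on) are hypothetical.
The "avx2" slot reuses the generic routine as a stand-in because no
accelerated kernel is included here, and the CPU check relies on the GCC/Clang
builtin __builtin_cpu_supports, which is only available on x86. Kernel code
would also need to save and restore FPU state around a real SIMD routine,
which this userland sketch ignores.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t a, b, c, d; } cksum_t;
typedef void (*fletcher4_fn)(const uint32_t *, size_t, cksum_t *);

/* Portable scalar routine; always available. */
static void
fletcher4_generic(const uint32_t *w, size_t nwords, cksum_t *out)
{
    uint64_t a = 0, b = 0, c = 0, d = 0;

    for (size_t i = 0; i < nwords; i++) {
        a += w[i];
        b += a;
        c += b;
        d += c;
    }
    out->a = a; out->b = b; out->c = c; out->d = d;
}

/* Hypothetical table entry: a name, a usability check, and the routine. */
typedef struct {
    const char *name;
    int (*usable)(void);
    fletcher4_fn compute;
} fletcher4_impl_t;

static int
always_usable(void)
{
    return 1;
}

static int
avx2_usable(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;   /* no detection available in this sketch */
#endif
}

/* Fastest candidates first; the generic entry is the guaranteed fallback. */
static const fletcher4_impl_t fletcher4_impls[] = {
    /* Stand-in: a real AVX2 kernel would be plugged in here. */
    { "avx2 (placeholder)", avx2_usable, fletcher4_generic },
    { "generic", always_usable, fletcher4_generic },
};

/* Return the first implementation whose check passes. */
static const fletcher4_impl_t *
fletcher4_select(void)
{
    size_t n = sizeof(fletcher4_impls) / sizeof(fletcher4_impls[0]);

    for (size_t i = 0; i < n; i++) {
        if (fletcher4_impls[i].usable())
            return &fletcher4_impls[i];
    }
    return NULL;    /* not reached: the generic entry is always usable */
}
```

A caller would run fletcher4_select() once, cache the result, and invoke
impl->compute(buf, nwords, &sums) for each buffer, so the per-call cost of
dispatch is a single indirect call.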