On 26/12/2020 22:54, Pádraig Brady wrote:
On 26/12/2020 13:53, Kristoffer Brånemyr via GNU coreutils General Discussion
wrote:
Hi,
I modified cksum to use the well known slice by 8 algorithm in the CRC
calculation, to make it faster. On my machine it is several times faster than
the unmodified cksum. It took me a while to figure out since the CRC
calculation in cksum shifts in the opposite direction than most other
implementations I've seen. I would be glad if someone could check this patch on
a big endian machine to see if it produces the correct output! I think it
might, but I'm not sure.
You can see the patch here:
https://github.com/coreutils/coreutils/pull/43
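For readers following along, here is a minimal sketch of slice by 8 for a CRC that shifts left (most significant bit first), as cksum's does. It is not the code from the pull request. The names crc_init, crc_bytewise and crc_slice8 are made up for illustration. It loads input one byte at a time, so it is endian independent, whereas the patch reads uint32_t words and relies on byteswap. It also leaves out the length bytes and final complement that cksum applies.

```c
#include <stddef.h>
#include <stdint.h>

#define POLY 0x04C11DB7u   /* generator polynomial used by POSIX cksum */

/* crctab[t][b] is the CRC of byte b followed by t zero bytes. */
uint32_t crctab[8][256];

/* Illustrative helper: build the eight lookup tables. */
void crc_init(void)
{
    for (unsigned i = 0; i < 256; i++) {
        uint32_t c = (uint32_t)i << 24;
        for (int k = 0; k < 8; k++)
            c = (c & 0x80000000u) ? (c << 1) ^ POLY : c << 1;
        crctab[0][i] = c;
    }
    for (unsigned i = 0; i < 256; i++)
        for (int t = 1; t < 8; t++) {
            uint32_t p = crctab[t - 1][i];
            /* Feed one more zero byte through the byte-wise step. */
            crctab[t][i] = (p << 8) ^ crctab[0][p >> 24];
        }
}

/* Reference: one table lookup per input byte. */
uint32_t crc_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len--)
        crc = (crc << 8) ^ crctab[0][((crc >> 24) ^ *p++) & 0xFF];
    return crc;
}

/* Slice by 8: consume eight bytes per iteration with eight lookups. */
uint32_t crc_slice8(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len >= 8) {
        /* The old CRC is folded into the first four bytes, top byte first. */
        crc = crctab[7][((crc >> 24) ^ p[0]) & 0xFF]
            ^ crctab[6][((crc >> 16) ^ p[1]) & 0xFF]
            ^ crctab[5][((crc >> 8) ^ p[2]) & 0xFF]
            ^ crctab[4][(crc ^ p[3]) & 0xFF]
            ^ crctab[3][p[4]] ^ crctab[2][p[5]]
            ^ crctab[1][p[6]] ^ crctab[0][p[7]];
        p += 8;
        len -= 8;
    }
    /* Finish the last 0-7 bytes byte by byte. */
    return crc_bytewise(crc, p, len);
}
```

Call crc_init once before either function. Both functions should return the same value for any buffer; the speedup comes from the eight lookups in one iteration being independent of each other, while the byte-wise loop must wait for each lookup before starting the next.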
Thanks for the patch!
I wouldn't focus on big endian perf,
but I will test on a SPARC-Enterprise-T5220 system I have access to.
I would explicitly depend on byteswap in bootstrap.conf now,
rather than relying on the transitive dependency through md5sum etc.
A 100MB file improves from 2.50s to 1.80s on the T5220
A 100MB file improves from 0.54s to 0.13s on an i3-2310M
I applied the attached to avoid:
src/cksum.c:201:15:
error: cast increases required alignment of target type
[-Werror=cast-align]
201 | datap = (uint32_t *)buf;
| ^
I'll apply after a bit more testing.
thanks again,
Pádraig
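To illustrate the warning outside of cksum.c, here is a small sketch under a few assumptions: sum_words is a made-up function, BUFLEN is set to an arbitrary small value, and a plain XOR stands in for the CRC. Declaring the buffer as uint32_t means the word pointer needs no cast at all, and the later cast to unsigned char * only lowers the alignment requirement, which -Wcast-align does not complain about.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUFLEN 64   /* arbitrary size for this sketch */

/* Hypothetical stand-in for the cksum loop: XOR words, then tail bytes. */
uint32_t sum_words(const unsigned char *src, size_t n)
{
    uint32_t buf[BUFLEN / 4];        /* storage is aligned for uint32_t */
    if (n > sizeof buf)
        n = sizeof buf;
    memcpy(buf, src, n);

    const uint32_t *datap = buf;     /* no cast, so no cast-align issue */
    uint32_t acc = 0;
    size_t words = n / 4;
    while (words--)
        acc ^= *datap++;

    /* Going from uint32_t * to unsigned char * reduces the required
       alignment, so this cast is fine. */
    const unsigned char *cp = (const unsigned char *)datap;
    size_t tail = n % 4;
    while (tail--)
        acc ^= *cp++;
    return acc;
}
```

The opposite direction, casting an unsigned char array to uint32_t *, is what triggered the error, because the compiler cannot assume a char array is aligned for 4-byte loads.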
From 75b11bfc90201b91839ef81c2bfbcee6294a0053 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <p...@draigbrady.com>
Date: Sat, 26 Dec 2020 23:40:57 +0000
Subject: [PATCH] maint: avoid -Werror=cast-align
src/cksum.c:201:15:
error: cast increases required alignment of target type
[-Werror=cast-align]
201 | datap = (uint32_t *)buf;
| ^
---
src/cksum.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/src/cksum.c b/src/cksum.c
index 26d670d4a..714e33391 100644
--- a/src/cksum.c
+++ b/src/cksum.c
@@ -161,7 +161,7 @@ static bool have_read_stdin;
static bool
cksum (const char *file, bool print_name)
{
- unsigned char buf[BUFLEN];
+ uint32_t buf[BUFLEN/4];
uint_fast32_t crc = 0;
uintmax_t length = 0;
size_t bytes_read;
@@ -189,7 +189,6 @@ cksum (const char *file, bool print_name)
while ((bytes_read = fread (buf, 1, BUFLEN, fp)) > 0)
{
- unsigned char *cp = buf;
uint32_t *datap;
uint32_t second = 0;
@@ -209,7 +208,7 @@ cksum (const char *file, bool print_name)
}
/* And finish up last 0-7 bytes in a byte by byte fashion */
- cp = (unsigned char *)datap;
+ unsigned char *cp = (unsigned char *)datap;
while (bytes_read--)
crc = (crc << 8) ^ crctab[0][((crc >> 24) ^ *cp++) & 0xFF];
if (feof (fp))
--
2.26.2