On 05/08/10 00:12, Paul Eggert wrote: > @@ -2038,41 +2027,117 @@ static int > compare_random (char *restrict texta, size_t lena, > char *restrict textb, size_t lenb) > { > - int diff; > + uint32_t dig[2][MD5_DIGEST_SIZE / sizeof (uint32_t)];
I know this is just moved code but it confused me. Is it uint32_t for alignment (speed)? Would the following be better on 64 bit while being immune to hash size? diff --git a/src/sort.c b/src/sort.c index a8d0c14..ac913b6 100644 --- a/src/sort.c +++ b/src/sort.c @@ -2026,7 +2026,8 @@ compare_random (char *restrict texta, size_t lena, char *buf = stackbuf; size_t bufsize = sizeof stackbuf; void *allocated = NULL; - uint32_t dig[2][MD5_DIGEST_SIZE / sizeof (uint32_t)]; + /* align the digests on a size_t boundary for speed. */ + size_t dig[2][(MD5_DIGEST_SIZE + sizeof (size_t) - 1) / sizeof (size_t)]; struct md5_ctx s[2]; s[0] = s[1] = random_md5_state; @@ -2121,7 +2122,7 @@ compare_random (char *restrict texta, size_t lena, /* Compute and compare the checksums. */ md5_process_bytes (texta, lena, &s[0]); md5_finish_ctx (&s[0], dig[0]); md5_process_bytes (textb, lenb, &s[1]); md5_finish_ctx (&s[1], dig[1]); - int diff = memcmp (dig[0], dig[1], sizeof dig[0]); + int diff = memcmp (dig[0], dig[1], MD5_DIGEST_SIZE); /* Fall back on the tiebreaker if the checksums collide. */ if (! diff) > + struct md5_ctx s[2]; > + s[0] = s[1] = random_md5_state; > > - if (! hard_LC_COLLATE) > - diff = cmp_hashes (texta, lena, textb, lenb); > - else > + if (hard_LC_COLLATE) > { > - /* Transform the text into the basis of comparison, so that byte > - strings that would otherwise considered to be equal are > - considered equal here even if their bytes differ. */ > - > - char *buf = NULL; > - char stackbuf[4000]; > - size_t tlena = xmemxfrm (stackbuf, sizeof stackbuf, texta, lena); Just off hand ideas... There are a lot of expensive operations done here. Would it be worth doing a memcmp() first and only doing the rest if the bytes differ. Depends on the data I know, but given caching the up front memcmp may be worth it? Also I wonder would caching the xfrm() data in the line buffers be worth it, as the number of comparisons increases? cheers, Pádraig.