Re: [coreutils] [PATCH] sort: -R now uses less memory on long lines with internal NULs

Pádraig Brady Wed, 11 Aug 2010 15:50:07 -0700

On 05/08/10 00:12, Paul Eggert wrote:

> @@ -2038,41 +2027,117 @@ static int
>  compare_random (char *restrict texta, size_t lena,
>                  char *restrict textb, size_t lenb)
>  {
> -  int diff;
> +  uint32_t dig[2][MD5_DIGEST_SIZE / sizeof (uint32_t)];


I know this is just moved code but it confused me.
Is it uint32_t for alignment (speed)?
Would the following be better on 64 bit while
being immune to hash size?

diff --git a/src/sort.c b/src/sort.c
index a8d0c14..ac913b6 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2026,7 +2026,8 @@ compare_random (char *restrict texta, size_t lena,
   char *buf = stackbuf;
   size_t bufsize = sizeof stackbuf;
   void *allocated = NULL;
-  uint32_t dig[2][MD5_DIGEST_SIZE / sizeof (uint32_t)];
+  /* align the digests on a size_t boundary for speed.  */
+  size_t dig[2][(MD5_DIGEST_SIZE + sizeof (size_t) - 1) / sizeof (size_t)];
   struct md5_ctx s[2];
   s[0] = s[1] = random_md5_state;

@@ -2121,7 +2122,7 @@ compare_random (char *restrict texta, size_t lena,
   /* Compute and compare the checksums.  */
   md5_process_bytes (texta, lena, &s[0]); md5_finish_ctx (&s[0], dig[0]);
   md5_process_bytes (textb, lenb, &s[1]); md5_finish_ctx (&s[1], dig[1]);
-  int diff = memcmp (dig[0], dig[1], sizeof dig[0]);
+  int diff = memcmp (dig[0], dig[1], MD5_DIGEST_SIZE);

   /* Fall back on the tiebreaker if the checksums collide.  */
   if (! diff)

> +  struct md5_ctx s[2];
> +  s[0] = s[1] = random_md5_state;
>  
> -  if (! hard_LC_COLLATE)
> -    diff = cmp_hashes (texta, lena, textb, lenb);
> -  else
> +  if (hard_LC_COLLATE)
>      {
> -      /* Transform the text into the basis of comparison, so that byte
> -         strings that would otherwise considered to be equal are
> -         considered equal here even if their bytes differ.  */
> -
> -      char *buf = NULL;
> -      char stackbuf[4000];
> -      size_t tlena = xmemxfrm (stackbuf, sizeof stackbuf, texta, lena);

Just off hand ideas...
There are a lot of expensive operations done here.
Would it be worth doing a memcmp() first and only doing
the rest if the bytes differ. Depends on the data I know,
but given caching the up front memcmp may be worth it?
Also I wonder would caching the xfrm() data in the line buffers
be worth it, as the number of comparisons increases?

cheers,
Pádraig.

Re: [coreutils] [PATCH] sort: -R now uses less memory on long lines with internal NULs

Reply via email to