Package: coreutils
Version: 8.32-4+b1

This bug exists in both Debian Buster and Debian Bullseye.

It has been fixed upstream.

It can be reproduced by splitting a file such that the size of each
chunk produced by split is larger than the block size used to read
the file (io_blksize(), stored in bufsize, in split.c).

Example:

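        # create source file (iflag=fullblock guards against a short read
        # from /dev/urandom, which would leave the file smaller than bs)
        dd if=/dev/urandom of=/tmp/datafile bs=317634560 count=1 iflag=fullblock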

        # create known good chunks
        split -n 635 /tmp/datafile /tmp/datafile.

        # attempt to extract one of the chunks using -n
        split -n 4/635 /tmp/datafile > /tmp/chunk4

        # compare - this will fail
        cmp /tmp/datafile.ad /tmp/chunk4
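
To decide which of the two files is wrong, the chunk can also be cut out
without split at all.  The following is only a sketch: extract_chunk is a
hypothetical helper, not part of coreutils, and it assumes the chunk layout
used by the coreutils versions in this report (every chunk is size/N bytes
and the last chunk also takes the remainder).  It is demonstrated on a small
temporary file rather than on /tmp/datafile.

```shell
# Hypothetical helper (not part of coreutils): print chunk K of N from FILE.
# Assumes equal chunks of size/N bytes, with the last chunk taking the rest.
extract_chunk() {
    file=$1
    k=$2
    n=$3
    size=$(wc -c < "$file")
    chunk=$((size / n))
    start=$(( (k - 1) * chunk ))
    if [ "$k" -eq "$n" ]; then
        len=$((size - start))
    else
        len=$chunk
    fi
    # tail -c +N is 1-based, hence start + 1
    tail -c +"$((start + 1))" "$file" | head -c "$len"
}

# Small demonstration on a 22-byte temporary file split into 4 chunks
tmp=$(mktemp)
printf '0123456789abcdefghijkl' > "$tmp"
third=$(extract_chunk "$tmp" 3 4)
last=$(extract_chunk "$tmp" 4 4)
rm -f "$tmp"
echo "chunk 3/4: $third"
echo "chunk 4/4: $last"
```

Against the files from the example, something like
"extract_chunk /tmp/datafile 4 635 | cmp - /tmp/datafile.ad" should report
no difference if the layout assumption holds.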

The reason is that when bytes_chunk_extract() is called, the previous
call to input_file_size() has already read bufsize bytes and left
the file offset there.  Then bytes_chunk_extract() performs an
lseek(fd, start, SEEK_CUR) using its calculated offset "start".
Since SEEK_CUR is relative to the current offset, not to the beginning
of the file, the seek overshoots the real start point by the number of
bytes already read.
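
The effect of the relative seek can be sketched with a toy simulation.  This
is not split's code: the 20-byte file, the value initial_read=4 and the
variable names buggy_pos and fixed_pos are made up for illustration, and it
assumes the number of bytes copied after the seek is the same in both cases.

```shell
# Toy simulation only; the sizes are invented and far smaller than
# split's real read buffer.
tmp=$(mktemp)
printf '0123456789abcdefghij' > "$tmp"   # 20-byte stand-in for the input

n=4
k=3
size=$(wc -c < "$tmp")
chunk=$((size / n))              # bytes per chunk
start=$(( (k - 1) * chunk ))     # correct absolute offset of chunk k
initial_read=4                   # pretend this many bytes were already read

# Buggy behaviour: move "start" bytes forward from the current offset.
buggy_pos=$((initial_read + start))
# Fixed behaviour: move "start - initial_read" bytes forward instead.
fixed_pos=$((initial_read + (start - initial_read)))

# tail -c +N is 1-based, hence the + 1
buggy=$(tail -c +"$((buggy_pos + 1))" "$tmp" | head -c "$chunk")
fixed=$(tail -c +"$((fixed_pos + 1))" "$tmp" | head -c "$chunk")
rm -f "$tmp"

echo "buggy: $buggy"
echo "fixed: $fixed"
```

Here the fixed offset yields "abcde", the real third chunk, while the buggy
offset yields "efghi", which is shifted by initial_read bytes.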

Here's the fix against Debian Buster; the same change applies to Bullseye.
Upstream has an extra if statement, but it is unnecessary, since that case
is already handled by the surrounding if (start < initial_read) check.  We are
in the else branch, so we know start >= initial_read and the new seek
offset, start - initial_read, cannot be negative.


diff -ru coreutils-8.30/src/split.c coreutils-8.30-chunk-fix/src/split.c
--- coreutils-8.30/src/split.c  2018-05-14 00:20:24.000000000 -0400
+++ coreutils-8.30-chunk-fix/src/split.c        2023-05-04 22:31:29.521398067 -0400
@@ -1000,7 +1000,7 @@
     }
   else
     {
-      if (lseek (STDIN_FILENO, start, SEEK_CUR) < 0)
+      if (lseek (STDIN_FILENO, start - initial_read, SEEK_CUR) < 0)
         die (EXIT_FAILURE, errno, "%s", quotef (infile));
       initial_read = SIZE_MAX;
     }
