Package: coreutils
Version: 8.32-4+b1

This bug exists in both Debian Buster and Debian Bullseye.
It has been fixed upstream.

It can be reproduced by splitting a file such that the size of each
chunk produced by split is larger than the block size used to read the
file (io_blksize(), bufsize, in split.c).

Example:

  # create source file
  dd if=/dev/urandom of=/tmp/datafile bs=317634560 count=1

  # create known good chunks
  split -n 635 /tmp/datafile /tmp/datafile.

  # attempt to extract one of the chunks using -n
  split -n 4/635 /tmp/datafile > /tmp/chunk4

  # compare - this will fail
  cmp /tmp/datafile.ad /tmp/chunk4

The reason is that when bytes_chunk_extract() is called, the previous
call to input_file_size() has already read bufsize bytes and left the
file pointer there. bytes_chunk_extract() then performs an
lseek(fd, start, SEEK_CUR) using its calculated offset "start". Because
SEEK_CUR is relative to the current position rather than the beginning
of the file, the seek lands bufsize bytes past the real start point.

Here's the fix against Debian Buster; the same change can be made on
Bullseye. Upstream has an extra if statement, but it is unnecessary
because that case is already handled by the surrounding
if (start < initial_read) check. We are in the else branch, so we know
start >= initial_read.

diff -ru coreutils-8.30/src/split.c coreutils-8.30-chunk-fix/src/split.c
--- coreutils-8.30/src/split.c	2018-05-14 00:20:24.000000000 -0400
+++ coreutils-8.30-chunk-fix/src/split.c	2023-05-04 22:31:29.521398067 -0400
@@ -1000,7 +1000,7 @@
         }
       else
         {
-          if (lseek (STDIN_FILENO, start, SEEK_CUR) < 0)
+          if (lseek (STDIN_FILENO, start - initial_read, SEEK_CUR) < 0)
             die (EXIT_FAILURE, errno, "%s", quotef (infile));
           initial_read = SIZE_MAX;
         }
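
For anyone who wants to see the offset arithmetic in isolation, here is a minimal sketch that is not taken from split.c. The helper offset_after_seek() and its parameters are hypothetical names made up for illustration; it assumes a POSIX system with a writable /tmp and a non-zero initial_read. It reads initial_read bytes from a scratch file (standing in for the read done by input_file_size()), seeks with SEEK_CUR the way the else branch does, and returns the absolute offset the descriptor ends up at.

```c
#define _POSIX_C_SOURCE 200809L  /* for mkstemp and ftruncate */
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of coreutils.
   Creates a scratch file of FILE_SIZE bytes, consumes INITIAL_READ bytes
   from it, then seeks relative to the current position.
   If COMPENSATE is zero, it seeks by START (the unpatched behaviour);
   otherwise it seeks by START - INITIAL_READ (the patched behaviour).
   Returns the resulting absolute file offset, or -1 on error.  */
off_t
offset_after_seek (off_t file_size, size_t initial_read, off_t start,
                   int compensate)
{
  /* Mirrors the else branch, which is only reached when
     start >= initial_read.  */
  assert (start >= (off_t) initial_read);

  char path[] = "/tmp/seek-demo-XXXXXX";  /* assumed scratch location */
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);  /* the file disappears once fd is closed */

  off_t result = -1;
  char *buf = malloc (initial_read);
  if (buf != NULL
      && ftruncate (fd, file_size) == 0
      && read (fd, buf, initial_read) == (ssize_t) initial_read)
    {
      /* The descriptor now sits at offset initial_read, not 0.  */
      off_t delta = compensate ? start - (off_t) initial_read : start;
      result = lseek (fd, delta, SEEK_CUR);
    }

  free (buf);
  close (fd);
  return result;
}
```

With a 65536-byte file, initial_read = 4096 and start = 8192, calling the helper with compensate = 0 ends at offset 12288 (start + initial_read), while compensate = 1 ends at 8192, which is the intended chunk start.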