On 24/01/2021 19:55, Paul Eggert wrote:
> On 1/24/21 8:52 AM, Pádraig Brady wrote:
>> - if (lseek (STDIN_FILENO, start, SEEK_CUR) < 0)
>> + if (lseek (STDIN_FILENO, start, SEEK_SET) < 0)
> Dumb question: will this handle the case where you're splitting from
> stdin and stdin is a seekable file and its initial file offset is nonzero?
Right. Following on the logic from input_file_size(),
I'm going with the attached, which I'll push later.
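As a rough illustration of the offset arithmetic (not part of the patch), the shell sketch below uses dd as a stand-in for the read and lseek calls in split. The file name seek-demo.txt and the values of start and initial_read are made up for the demonstration. It relies on dd's skip= being applied relative to the current offset of stdin, which is how I understand dd to behave whether it seeks or reads past the skipped blocks.

```shell
# Sample input: the byte at offset N is easy to recognise.
printf '0123456789abcdef' > seek-demo.txt

start=6          # hypothetical chunk start, relative to where stdin began
initial_read=3   # hypothetical count of bytes already consumed from stdin

{
  # Consume $initial_read bytes; the shared file offset moves to 3.
  dd bs=1 count="$initial_read" of=/dev/null 2>/dev/null
  # Old behaviour: skip $start from the current offset, landing at 3 + 6 = 9.
  dd bs=1 skip="$start" count=1 2>/dev/null
} < seek-demo.txt > wrong.txt

{
  dd bs=1 count="$initial_read" of=/dev/null 2>/dev/null
  # Fixed behaviour: skip only the remainder, landing at 3 + (6 - 3) = 6.
  dd bs=1 skip=$((start - initial_read)) count=1 2>/dev/null
} < seek-demo.txt > right.txt
```

Because the skip is relative, the same arithmetic also holds when stdin starts at a nonzero offset, which an absolute SEEK_SET would not respect.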
Marking this as done.
thanks,
Pádraig
From 8741d726327bddce3271de23af4aae4cfc185774 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <p...@draigbrady.com>
Date: Mon, 25 Jan 2021 14:12:48 +0000
Subject: [PATCH] split: fix --number=K/N to output correct part of file
This functionality regressed with the adjustments
in commit v8.25-4-g62e7af032.
* src/split.c (bytes_chunk_extract): Account for already read data
when seeking into the file.
* tests/split/b-chunk.sh: Use the hidden ---io-blksize option,
to test this functionality.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/46048
---
NEWS | 4 ++++
src/split.c | 2 +-
tests/split/b-chunk.sh | 45 ++++++++++++++++++++++++------------------
3 files changed, 31 insertions(+), 20 deletions(-)
diff --git a/NEWS b/NEWS
index c2474fee3..e7fbde8ed 100644
--- a/NEWS
+++ b/NEWS
@@ -27,6 +27,10 @@ GNU coreutils NEWS -*- outline -*-
rm no longer skips an extra file when the removal of an empty directory fails.
[bug introduced by the rewrite to use fts in coreutils-8.0]
+ split --number=K/N will again correctly split chunk K of N to stdout.
+ Previously, a chunk starting after 128KiB would output the wrong part of the file.
+ [bug introduced in coreutils-8.26]
+
tr no longer crashes when using --complement with certain
invalid combinations of case character classes.
[bug introduced in coreutils-8.6]
diff --git a/src/split.c b/src/split.c
index 0660da13f..59c234c12 100644
--- a/src/split.c
+++ b/src/split.c
@@ -1001,7 +1001,7 @@ bytes_chunk_extract (uintmax_t k, uintmax_t n, char *buf, size_t bufsize,
}
else
{
- if (lseek (STDIN_FILENO, start, SEEK_CUR) < 0)
+ if (lseek (STDIN_FILENO, start - initial_read, SEEK_CUR) < 0)
die (EXIT_FAILURE, errno, "%s", quotef (infile));
initial_read = SIZE_MAX;
}
diff --git a/tests/split/b-chunk.sh b/tests/split/b-chunk.sh
index 8238dcb6d..dbed681f7 100755
--- a/tests/split/b-chunk.sh
+++ b/tests/split/b-chunk.sh
@@ -35,32 +35,39 @@ split -e -n 10 /dev/null || fail=1
returns_ 1 stat x?? 2>/dev/null || fail=1
printf '1\n2\n3\n4\n5\n' > input || framework_failure_
+printf '1\n2' > exp-1 || framework_failure_
+printf '\n3\n' > exp-2 || framework_failure_
+printf '4\n5\n' > exp-3 || framework_failure_
for file in input /proc/version /sys/kernel/profiling; do
test -f $file || continue
- split -n 3 $file > out || fail=1
- split -n 1/3 $file > b1 || fail=1
- split -n 2/3 $file > b2 || fail=1
- split -n 3/3 $file > b3 || fail=1
+ for blksize in 1 2 4096; do
+ if ! test "$file" = 'input'; then
+ # For /proc like files we must be able to read all
+ # into the internal buffer to be able to determine size.
+ test "$blksize" = 4096 || continue
+ fi
- case $file in
- input)
- printf '1\n2' > exp-1
- printf '\n3\n' > exp-2
- printf '4\n5\n' > exp-3
+ split -n 3 ---io-blksize=$blksize $file > out || fail=1
+ split -n 1/3 ---io-blksize=$blksize $file > b1 || fail=1
+ split -n 2/3 ---io-blksize=$blksize $file > b2 || fail=1
+ split -n 3/3 ---io-blksize=$blksize $file > b3 || fail=1
- compare exp-1 xaa || fail=1
- compare exp-2 xab || fail=1
- compare exp-3 xac || fail=1
- ;;
- esac
+ case $file in
+ input)
+ compare exp-1 xaa || fail=1
+ compare exp-2 xab || fail=1
+ compare exp-3 xac || fail=1
+ ;;
+ esac
- compare xaa b1 || fail=1
- compare xab b2 || fail=1
- compare xac b3 || fail=1
- cat xaa xab xac | compare - $file || fail=1
- test -f xad && fail=1
+ compare xaa b1 || fail=1
+ compare xab b2 || fail=1
+ compare xac b3 || fail=1
+ cat xaa xab xac | compare - $file || fail=1
+ test -f xad && fail=1
+ done
done
Exit $fail
--
2.26.2