Re: [PATCH] cat: use splice if operating on non-files or if copy_file_range fails

Pádraig Brady Tue, 31 Mar 2026 05:27:25 -0700

On 31/03/2026 04:08, Collin Funk wrote:

Collin Funk <[email protected]> writes:

Pádraig Brady <[email protected]> writes:

On 30/03/2026 00:13, Collin Funk wrote:

On an AMD Ryzen 7 3700X system:
      $ timeout 10 taskset 1 ./src/cat-prev /dev/zero \
          | taskset 2 pv -r > /dev/null
      [1.84GiB/s]
      $ timeout 10 taskset 1 ./src/cat /dev/zero \
          | taskset 2 pv -r > /dev/null
      [7.92GiB/s]



Very nice.

Did you test on a Power10 system like the NEWS suggests?
(cfarm120.cfarm.net is Power10 BTW)


I forgot to test it there after fixing the patch.


Actually, I think I did test it with my v1 patch. However, the splice
fails there with EINVAL instead of working like on my personal system.

It is probably just configured differently than my machine. It is
unfortunate that splice fails with EINVAL for 5 different reasons...

Anyways, we can use a pipe instead of /dev/zero and it works.

However this shows a problematic case,
where we don't diagnose errors, either in error message or exit status:

   $ strace -o /dev/null -e inject=splice:error=EIO:when=3 \
     src/cat /dev/zero > t.c
   spliced 524288
   spliced -1
   splice status 1


Thanks, I'll have a look at that.

I'm thinking we should diagnose all errors if some data is spliced,
and non splice specific errors always?

Also the lseek() fallback worries me for non regular files.
What if we are splicing from some non-seekable device to a file system that
doesn't support splice?
Then the splice from the input to the intermediate pipe would work, but the
lseek() could not restore the bytes.
I wonder could we instead do a probe with read() + vmsplice(),
and the fallback could then use the bytes in the read buffer?
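
One possible reading of the read() + vmsplice() idea, as a rough sketch
only. The name probe_splice_sketch, the 4096-byte buffer, and the use of
SPLICE_F_NONBLOCK are assumptions for illustration, not anything from the
patch. The point it shows is that the bytes stay in the read buffer, so a
failed splice to the output needs no lseek() to recover them:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical probe.  Read one buffer from IN, then try to move it to
   OUT through an intermediate pipe using vmsplice and splice.  Whatever
   splice does not deliver is still in BUF, so it is sent with write.
   Sets *CAN_SPLICE if splice moved any data.  Returns the number of
   bytes copied, 0 at end of input, or -1 on error.  */
static ssize_t
probe_splice_sketch (int in, int out, bool *can_splice)
{
  static char buf[4096];        /* Assumed size; fits a default pipe.  */
  *can_splice = false;

  ssize_t n_read = read (in, buf, sizeof buf);
  if (n_read <= 0)
    return n_read;

  ssize_t sent = 0;
  int p[2];
  if (pipe (p) == 0)
    {
      struct iovec iov = { .iov_base = buf, .iov_len = n_read };
      /* Queue the buffer's pages on the pipe without blocking.  */
      if (vmsplice (p[1], &iov, 1, SPLICE_F_NONBLOCK) == n_read)
        {
          ssize_t moved = splice (p[0], NULL, out, NULL, n_read, 0);
          if (moved > 0)
            {
              sent = moved;
              *can_splice = true;
            }
        }
      /* Closing both ends discards anything still queued.  */
      close (p[0]);
      close (p[1]);
    }

  /* Fallback: the unsent bytes never left BUF.  */
  while (sent < n_read)
    {
      ssize_t w = write (out, buf + sent, n_read - sent);
      if (w < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      sent += w;
    }
  return n_read;
}
```

A caller would run this once per output, then pick the splice loop or the
read/write loop depending on *can_splice.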


My main concern when writing it was that the first splice may succeed,
but then the second splice might fail because standard output is a file
on a file system that does not support splice. In that case, I think it
would be poor behavior to print an error; we should still fall back to
read and write.

I'll think about it some more.


I attached a v2 patch which I think handles this well. Basically, we
make sure that each splice call in the first iteration succeeds. If so,
any subsequent splice error is treated as fatal. If one of the splice
calls fails in the first iteration, we fall back to read and write. If the
intermediate pipe has data, we can just use a read and write loop
instead of using lseek.
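
For readers without the attachment, the policy described above can be
sketched roughly as follows. This is not the v2 patch itself: the names
splice_copy_sketch and rw_copy_sketch and the 64 KiB chunk size are
assumptions for illustration, and the sketch is Linux-only. A splice
failure in the first iteration falls back to read and write, draining the
intermediate pipe first if it holds data; any later splice failure is
treated as fatal:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: copy IN to OUT with plain read and write until
   end of input.  Returns false on error.  */
static bool
rw_copy_sketch (int in, int out)
{
  char buf[65536];
  for (;;)
    {
      ssize_t n = read (in, buf, sizeof buf);
      if (n == 0)
        return true;
      if (n < 0)
        {
          if (errno == EINTR)
            continue;
          return false;
        }
      for (ssize_t done = 0; done < n; )
        {
          ssize_t w = write (out, buf + done, n - done);
          if (w < 0)
            {
              if (errno == EINTR)
                continue;
              return false;
            }
          done += w;
        }
    }
}

/* Hypothetical sketch of the policy.  Returns false on a fatal error.  */
static bool
splice_copy_sketch (int in, int out)
{
  int p[2];
  if (pipe (p) < 0)
    return rw_copy_sketch (in, out);

  bool ok = true;
  for (bool first = true; ; first = false)
    {
      /* Fill the intermediate pipe from the input.  */
      ssize_t n = splice (in, NULL, p[1], NULL, 65536, 0);
      if (n == 0)
        break;
      if (n < 0)
        {
          /* Nothing was moved in this iteration.  In the first
             iteration fall back; later, the error is fatal.  */
          ok = first && rw_copy_sketch (in, out);
          break;
        }

      /* Drain the intermediate pipe to the output.  */
      while (n > 0)
        {
          ssize_t m = splice (p[0], NULL, out, NULL, n, 0);
          if (m <= 0)
            break;
          n -= m;
        }
      if (n > 0)
        {
          if (first)
            {
              /* Close the write end so the drain sees end of file,
                 then copy the buffered bytes and the rest of IN.  */
              close (p[1]);
              p[1] = -1;
              ok = rw_copy_sketch (p[0], out) && rw_copy_sketch (in, out);
            }
          else
            ok = false;
          break;
        }
    }

  close (p[0]);
  if (p[1] >= 0)
    close (p[1]);
  return ok;
}
```

One way to exercise the fallback without editing the source might be to
open the output with O_APPEND, since splice rejects such a target with
EINVAL.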

It would be nice to add a test for that, but I couldn't really think of
one. I just edited the source code to behave as if neither input nor
output were pipes, then used strace to make the splice call that drains
the first intermediate pipe fail.

Also, I changed it so that we only call isapipe (STDOUT_FILENO) once
instead of once per input file.


Cool. The read/write fallback should work.
I see this is now diagnosed:

  $ strace  -e inject=splice:error=EIO:when=3 \
    src/cat /dev/zero > t.c

However errors on the writing splice are not always diagnosed:

  $ strace  -e inject=splice:error=EIO:when=4 \
    src/cat /dev/zero > t.c

I think that just needs a clause like the one you have already added on the
read side.

As for tests, the odd and even straces above seem useful.

cheers,
Padraig
