Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Jirka Hladky
Hi,

thanks for the script, it definitely provides more control than the plain
tee command. I have made some modifications so that progress can be watched
live.

I have tried two versions of the main command (one with >/dev/full, other
one with >&-).

cat /dev/zero | head -c500M | (/dev/shm/AAA/coreutils-8.24/src/tee -p
$d/fifo1 $d/fifo2 $d/fifo3 $d/fifo4 >/dev/full ) 2>&1 | tee $d/run.log &
and
cat /dev/zero | head -c500M | (/dev/shm/AAA/coreutils-8.24/src/tee -p
$d/fifo1 $d/fifo2 $d/fifo3 $d/fifo4 >&- ) 2>&1 | tee $d/run.log &

Both of them work fine, except that the following messages are emitted
(for >/dev/full and >&- respectively):

tee: standard output: No space left on device
tee: standard output: Bad file descriptor

I still think that a "--no-stdout" option would be helpful for this use case.
Any feedback on that? Should I open an enhancement request ticket and post
an implementation of the "--no-stdout" option there?

Thanks
Jirka

PS: Modified script is attached.
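
The two diagnostics quoted above can be reproduced in isolation. The following is a minimal sketch rather than the attached script: it assumes a tee that still diagnoses these errors and a Linux-style /dev/full, and the variable names err_closed and err_full are made up for illustration.

```sh
#!/bin/sh
# Capture what tee prints on stderr for each way of discarding stdout.
# "2>&1" is applied first, so stderr lands in the command substitution,
# and only then is stdout closed or pointed at /dev/full.
# LC_ALL=C keeps the error text in English.

# Variant 1: stdout closed (write fails with EBADF).
err_closed=$(printf 'x\n' | LC_ALL=C tee 2>&1 >&-) || :

# Variant 2: stdout on /dev/full (write fails with ENOSPC); Linux-specific.
err_full=
if [ -c /dev/full ] && [ -w /dev/full ]; then
  err_full=$(printf 'x\n' | LC_ALL=C tee 2>&1 >/dev/full) || :
fi

printf 'closed stdout: %s\n' "$err_closed"
printf '/dev/full:     %s\n' "$err_full"
```

The "|| :" only keeps the snippet going when tee exits non-zero; the exact wording and quoting of the messages may differ between tee versions.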



On Fri, Nov 20, 2015 at 12:02 PM, Pádraig Brady  wrote:

> On 20/11/15 04:33, Assaf Gordon wrote:
> > Hello Jirka,
> >
> > Regarding this:
> >
> > On 11/19/2015 08:58 PM, Jirka Hladky wrote:
> >>> The general problem I have with >(process substitutions) are that
> >>> they are completely asynchronous.  There is no way to tell if they
> >>> are done.
> >>
> >> Yes, I agree with you on this one. However, I don't see the other way
> >> how to send the output of one process to multiple sub-processes in
> >> shell.
> >
> > If I may suggest this slightly verbose shell script (attached) - it
> should do what you want (sending output to multiple processes)
> > while still allowing tight control over each background process, and
> also collecting their results in an organized fashion
> > (ie keeping stdout,stderr,exitcode in a file for each test) - making
> further diagnosis much easier.
> >
> > if there's a need to combine the outputs from all the tests (e.g. to
> find the smallest p-value from all tests) -  it's just a matter of "cat
> *.out" once
> > all the tests are done.
> >
> > Note that this does not solve the "--no-stdout" issue - just the ">()"
> part. It should also make the shell script portable (except using GNU tee's
> "-p" parameter).
>
> Note there is no async issue with >() once the output is piped further,
> as then the background processes are waited for.
> Though yes, using fifos gives more fine-grained control over processes,
> exit status, etc.
>
> > The output should be:
> >
> >  tee: standard output: Bad file descriptor
> >  == Test 1 exited with code 0 ==
> >  == Test 1 STDOUT ==
> >  104857600
> >  == Test 2 exited with code 0 ==
> >  == Test 2 STDOUT ==
> >  1
> >  == Test 3 exited with code 1 ==
> >  == Test 3 STDOUT ==
> >  == Test 3 STDERR ==
> >  wc: unrecognized option '--foo'
> >  Try 'wc --help' for more information.
> >  == Test 4 exited with code 0 ==
> >  == Test 4 STDOUT ==
> >  32768
> >  ==
> >  Test results stored in /tmp/tmp.esLAoUxeLQ
> >
> >
> > Comments and corrections welcomed.
>
> Yes this is a useful pattern.
> I noted something similar for use with split(1) at:
> http://lists.gnu.org/archive/html/coreutils/2011-05/msg00012.html
> with the number of parallel processes potentially determined with nproc(1).
>
> Minor comments on the script. I'd probably `rm -f fifo*` before creating
> them, to allow a clean rerun after Ctrl-C. Also the eval can be simplified to:
>   eval TEST_PID=\$TEST${i}_PID
>
> cheers,
> Pádraig.
>
>
>


5.1.sh
Description: Bourne shell script


Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Pádraig Brady
On 20/11/15 02:20, Pádraig Brady wrote:
> I'm coming around to making a change here.
> 
> Either be quiet about:
>   datagen | tee >(sha1sum --tag) >(md5sum --tag) >&- | sort | gpg --clearsign
> 
> Or support:
>   datagen | tee --no-stdout >(sha1sum --tag) >(md5sum --tag) | sort | gpg 
> --clearsign
> 
> I like the idea of supporting this with no new option.
> I see we have similar EBADF handling in touch and nohup.
> I'll sleep on it.

The attached supports the >&- usage above.

cheers,
Pádraig.

From bbd741b0eb290ff94b8f0f4bbe40d4fc7e9e5ea3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Fri, 20 Nov 2015 11:54:00 +
Subject: [PATCH] tee: don't diagnose a closed standard output

This can be useful if you want data from process substitutions
coalesced through stdout. For example:
  datagen | tee >(md5sum --tag) >(sha256sum --tag) >&- | sort

* src/tee.c (tee_files): Don't diagnose EBADF on stdout.
* tests/misc/tee.sh: Add a test case.
* doc/coreutils.texi (tee invocation): Mention that -p is
useful with pipes that may not consume all data.
Add a closed stdout example, similar to the one above.
* NEWS: Mention the change in behavior.
* THANKS.in: Add the suggester, Jirka Hladky.
---
 NEWS   |  3 +++
 THANKS.in  |  1 +
 doc/coreutils.texi | 16 
 src/tee.c  |  8 +++-
 tests/misc/tee.sh  |  7 +++
 5 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index fc5e927..5fe0ea2 100644
--- a/NEWS
+++ b/NEWS
@@ -35,6 +35,9 @@ GNU coreutils NEWS-*- outline -*-
   ls now quotes file names unambiguously and appropriate for use in a shell,
   when outputting to a terminal.
 
+  tee no longer diagnoses write errors to a closed standard output, as this
+  can be useful when further piping the output from process substitutions.
+
 ** Improvements
 
   All utilities now quote user supplied arguments in error strings,
diff --git a/THANKS.in b/THANKS.in
index 51c77ef..5c49006 100644
--- a/THANKS.in
+++ b/THANKS.in
@@ -299,6 +299,7 @@ Jesse Thilo j...@eecs.lehigh.edu
 Jie Xu  x...@iag.net
 Jim Blandy  j...@cyclic.com
 Jim Dennis  j...@starshine.org
+Jirka Hladky  jhla...@redhat.com
 Joakim Rosqvist dvl...@cs.umu.se
 Jochen Hein joc...@jochen.org
 Joe Orton   j...@manyfish.co.uk
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 8034807..1755a51 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -13019,6 +13019,11 @@ so it works with @command{zsh}, @command{bash}, and @command{ksh},
 but not with @command{/bin/sh}.  So if you write code like this
 in a shell script, be sure to start the script with @samp{#!/bin/bash}.
 
+Note also that if any of the process substitutions (or piped stdout)
+might exit early without consuming all the data, the @option{-p} option
+is needed to allow @command{tee} to continue to process the input
+to any remaining outputs.
+
 Since the above example writes to one file and one process,
 a more conventional and portable use of @command{tee} is even better:
 
@@ -13087,6 +13092,17 @@ tar chof - "$tardir" \
   | bzip2 -9 -c > your-pkg-M.N.tar.bz2
 @end example
 
+If you want to further process the output from process substitutions,
+and those outputs are smaller than the system's PIPE_BUF size, resulting
+in atomic writes, it's useful to close stdout like:
+
+@example
+tardir=your-pkg-M.N
+tar chof - "$tardir" \
+  | tee >(md5sum --tag) >(sha256sum --tag) >&- \
+  | sort | gpg --clearsign > your-pkg-M.N.tar.sig
+@end example
+
 @exitstatus
 
 
diff --git a/src/tee.c b/src/tee.c
index ae1bb30..9ef7742 100644
--- a/src/tee.c
+++ b/src/tee.c
@@ -246,7 +246,13 @@ tee_files (int nfiles, char **files)
 bool fail = errno != EPIPE || (output_error == output_error_exit
   || output_error == output_error_warn);
 if (descriptors[i] == stdout)
-  clearerr (stdout); /* Avoid redundant close_stdout diagnostic.  */
+  {
+/* Don't diagnose a closed stdout.  */
+if (errno == EBADF)
+  fail = false;
+/* Avoid redundant close_stdout diagnostic.  */
+clearerr (stdout);
+  }
 if (fail)
   {
 error (output_error == output_error_exit
diff --git a/tests/misc/tee.sh b/tests/misc/tee.sh
index f457a0b..bc51c9a 100755
--- a/tests/misc/tee.sh
+++ b/tests/misc/tee.sh
@@ -63,6 +63,13 @@ if test -w /dev/full && test -c /dev/full; then
   test $(wc -l < err) = 1 || { cat err; fail=1; }
 fi
 
+# Ensure tee doesn't diagnose a closed stdout
+# which can be useful when coalescing small atomic outputs
+# from process substitutions like:
+# $ seq 10 | tee -p 

Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Pádraig Brady
On 20/11/15 04:33, Assaf Gordon wrote:
> Hello Jirka,
> 
> Regarding this:
> 
> On 11/19/2015 08:58 PM, Jirka Hladky wrote:
>>> The general problem I have with >(process substitutions) are that
>>> they are completely asynchronous.  There is no way to tell if they
>>> are done.
>>
>> Yes, I agree with you on this one. However, I don't see the other way
>> how to send the output of one process to multiple sub-processes in
>> shell.
> 
> If I may suggest this slightly verbose shell script (attached) - it should do 
> what you want (sending output to multiple processes)
> while still allowing tight control over each background process, and also 
> collecting their results in an organized fashion
> (ie keeping stdout,stderr,exitcode in a file for each test) - making further 
> diagnosis much easier.
> 
> if there's a need to combine the outputs from all the tests (e.g. to find the 
> smallest p-value from all tests) -  it's just a matter of "cat *.out" once
> all the tests are done.
> 
> Note that this does not solve the "--no-stdout" issue - just the ">()" part. 
> It should also make the shell script portable (except using GNU tee's "-p" 
> parameter).

Note there is no async issue with >() once the output is piped further,
as then the background processes are waited for.
Though yes, using fifos gives more fine-grained control over processes, exit
status, etc.
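
A small sketch of the point about piping further: the downstream reader only sees end-of-file once every process substitution has closed its copy of the pipe, so the pipeline cannot finish before the slowest one does. The pipeline is run through bash -c because >() is not available in plain /bin/sh; the "slow:" and "fast:" labels and the variable name out are made up for illustration.

```sh
#!/bin/sh
# Both process substitutions inherit the pipe to the final "cat" as their
# stdout. "cat" keeps reading until all writers are gone, so the delayed
# substitution's output is still collected.
out=$(bash -c '
  echo foo | tee >(sleep 1; sed "s/^/slow: /") > >(sed "s/^/fast: /") | cat
')
printf '%s\n' "$out"
```

Without the trailing pipe, the shell would not wait for the delayed substitution before moving on to the next command.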

> The output should be:
> 
>  tee: standard output: Bad file descriptor
>  == Test 1 exited with code 0 ==
>  == Test 1 STDOUT ==
>  104857600
>  == Test 2 exited with code 0 ==
>  == Test 2 STDOUT ==
>  1
>  == Test 3 exited with code 1 ==
>  == Test 3 STDOUT ==
>  == Test 3 STDERR ==
>  wc: unrecognized option '--foo'
>  Try 'wc --help' for more information.
>  == Test 4 exited with code 0 ==
>  == Test 4 STDOUT ==
>  32768
>  ==
>  Test results stored in /tmp/tmp.esLAoUxeLQ
> 
> 
> Comments and corrections welcomed.

Yes this is a useful pattern.
I noted something similar for use with split(1) at:
http://lists.gnu.org/archive/html/coreutils/2011-05/msg00012.html
with the number of parallel processes potentially determined with nproc(1).

Minor comments on the script. I'd probably `rm -f fifo*` before creating them
to allow a clean rerun after Ctrl-C. Also the eval can be simplified to:
  eval TEST_PID=\$TEST${i}_PID

cheers,
Pádraig.
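
For readers without the attachment, here is a sketch of how the simplified eval might be used. It assumes the script stores each background PID in a variable named TESTn_PID; the commands "true" and "false" and the variable codes are stand-ins, not taken from the attached script.

```sh
#!/bin/sh
# Start two stand-in "tests" in the background and remember their PIDs
# in numbered variables TEST1_PID, TEST2_PID.
i=1
for cmd in true false; do
  $cmd &
  eval "TEST${i}_PID=\$!"
  i=$((i + 1))
done

# Look each PID up again with the simplified eval and collect exit codes.
codes=
for i in 1 2; do
  eval TEST_PID=\$TEST${i}_PID
  status=0
  wait "$TEST_PID" || status=$?
  codes="$codes $status"
done
printf 'exit codes:%s\n' "$codes"
```

The backslash keeps the first "$" literal, so eval sees e.g. TEST_PID=$TEST1_PID and performs the second expansion itself.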




Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Bernhard Voelker
On 11/20/2015 01:08 PM, Pádraig Brady wrote:
> +  tee no longer diagnoses write errors to a closed standard output, as this
> +  can be useful when further piping the output from process substitutions.

I'm not sure this is allowed by POSIX, but at the least it regresses for other
causes of EBADF, such as when the file descriptor is valid but not open for
writing:


  $ echo | /usr/bin/tee >&0 ; echo $?
  /usr/bin/tee: standard output: Bad file descriptor
  /usr/bin/tee: write error
  1

  $ echo | src/tee >&0 ; echo $?
  0

As process substitutions do not give that exact control over what happens
anyway, I'm 20:80 for adding this, because the proposed EBADF handling blurs
the situation for all other cases that rely on exact error diagnostics
and exit codes.

Have a nice day,
Berny
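
The ">&0" case can be checked from a script as follows. This is a sketch assuming an unpatched tee and a system where writing to the read end of a pipe fails with EBADF, as it does on Linux; the variable names status and err are illustrative.

```sh
#!/bin/sh
# stdout becomes a duplicate of stdin, i.e. the read end of the pipe from
# "echo". The descriptor is valid but not open for writing, so tee's write
# fails with EBADF even though nothing is closed.
status=0
err=$(echo | LC_ALL=C tee 2>&1 >&0) || status=$?
printf 'exit status: %s\n' "$status"
printf 'diagnostic:  %s\n' "$err"
```

A script relying on tee's exit status would lose this signal if every EBADF on stdout were silently ignored.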




Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Jim Meyering
On Fri, Nov 20, 2015 at 1:08 PM, Pádraig Brady  wrote:
> On 20/11/15 02:20, Pádraig Brady wrote:
>> I'm coming around to making a change here.
>>
>> Either be quiet about:
>>   datagen | tee >(sha1sum --tag) >(md5sum --tag) >&- | sort | gpg --clearsign
>>
>> Or support:
>>   datagen | tee --no-stdout >(sha1sum --tag) >(md5sum --tag) | sort | gpg 
>> --clearsign
>>
>> I like the idea of supporting this with no new option.
>> I see we have similar EBADF handling in touch and nohup.
>> I'll sleep on it.
>
> The attached supports the >&- usage above.

Doesn't this suppress a diagnostic that is likely to be valuable to anyone who
accidentally runs an affected tool from a context with closed standard output?



Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Bernhard Voelker
On 11/20/2015 03:35 PM, Jirka Hladky wrote:
> I have tried two versions of the main command (one with >/dev/full, other one 
> with >&-). 
> 
> cat /dev/zero | head -c500M | (/dev/shm/AAA/coreutils-8.24/src/tee -p 
> $d/fifo1 $d/fifo2 $d/fifo3 $d/fifo4 >/dev/full )
> 2>&1 | tee $d/run.log &
> and
> cat /dev/zero | head -c500M | (/dev/shm/AAA/coreutils-8.24/src/tee -p 
> $d/fifo1 $d/fifo2 $d/fifo3 $d/fifo4 >&- ) 2>&1 |
> tee $d/run.log &
> 
> Both of them are working fine, except that following messages are emitted:
> 
> tee: standard output: No space left on device
> tee: standard output: Bad file descriptor

I'm not convinced that a new --no-stdout option is warranted:
why not simply redirect stdout to the last fifo?

  cat /dev/zero | head -c500M \
    | (/dev/shm/AAA/coreutils-8.24/src/tee -p \
         $d/fifo1 $d/fifo2 $d/fifo3 > $d/fifo4 ) 2>&1 \
    | tee $d/run.log &

Have a nice day,
Berny
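
A self-contained sketch of the same idea with two FIFOs, so that it can be run without the test setup from this thread. The reader commands (wc -c, wc -l), the file names and the variables bytes and lines are made up for illustration; -p is left out because neither reader exits early here.

```sh
#!/bin/sh
# Fan one stream out to two readers without leaving tee's stdout unused:
# the last FIFO simply takes the place of stdout.
d=$(mktemp -d) || exit 1
mkfifo "$d/fifo1" "$d/fifo2"

# Start the readers first; opening a FIFO for writing blocks until a
# reader has opened the other end.
wc -c < "$d/fifo1" > "$d/out1" &
wc -l < "$d/fifo2" > "$d/out2" &

seq 1000 | tee "$d/fifo1" > "$d/fifo2"
wait

# Strip the padding some wc implementations add.
bytes=$(tr -d ' ' < "$d/out1")
lines=$(tr -d ' ' < "$d/out2")
rm -rf "$d"
printf 'bytes=%s lines=%s\n' "$bytes" "$lines"
```

Every output of tee has a consumer, so no write to stdout can fail and no diagnostic is printed.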



Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Eric Blake
On 11/20/2015 10:38 AM, Bernhard Voelker wrote:
> On 11/20/2015 01:08 PM, Pádraig Brady wrote:
>> +  tee no longer diagnoses write errors to a closed standard output, as this
>> +  can be useful when further piping the output from process substitutions.
> 
> I'm not sure this is allowed by POSIX,

POSIX says you are non-conforming the moment you start an application
with fd 0, 1, or 2 closed, and that all bets are off (so we can do
whatever we think makes the most sense, but if it is more than just tee
with stdout closed we may be aggravating the problem).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





bug#21790: [PATCH] coreutils/cp: handle EOF extents correctly

2015-11-20 Thread Pádraig Brady
On 30/10/15 18:54, Pádraig Brady wrote:
> On 30/10/15 16:57, Pádraig Brady wrote:
>> On 30/10/15 09:02, Dmitry Monakhov wrote:
>>> fallocate can allocate extents beyond EOF via FALLOC_FL_KEEP_SIZE.
>>> Currently the sparse engine tries to copy such extents, which is wrong and
>>> results in silent data corruption (leaving the file with an incorrect size).
>>>
>>> ##TESTCASE
>>> echo blabla > sparse_falloc.in
>>> truncate -s 2M sparse_falloc.in
>>> fallocate -n -o 4M -l 1M sparse_falloc.in
>>> cp sparse_falloc.in sparse_falloc.out
>>> cmp sparse_falloc.in sparse_falloc.out
>>
>> Ouch.  Thanks for the analysis and patch.
>> It looks correct.  I'll analyze further before applying.
> 
> This doesn't handle the --sparse=never case
> (which is broken with and without the patch).
> 
> Also one might have an extent spanning the file size boundary,
> in which case this patch could miss the remaining data?
> 
> Also currently if the source file is being extended
> while being copied, we continue to read, whereas we now won't.
> Theoretically this is an issue when st_size doesn't match
> what's available to be read, though maybe not a practical issue
> since we don't use this path for a zero st_size, nor
> sources that don't support fiemap anyway.
> 
> This might be an opportune time to rip out the fiemap stuff
> in favor of SEEK{DATA,HOLE} anyway, which I intended to do
> during this cycle.  For fix backporting sake though,
> we should apply the minimal fix to the fiemap code first.
> 
> Attached is the current minimally tested patch.

I'll apply the attached in your name soon.
Please review.

thanks,
Pádraig.
From f79f3a83e4e43b70d2a209e6d1ec8185dac16625 Mon Sep 17 00:00:00 2001
From: Dmitry Monakhov 
Date: Fri, 30 Oct 2015 22:04:46 +
Subject: [PATCH] copy: fix copying of extents beyond the apparent file size

fallocate can allocate extents beyond EOF via FALLOC_FL_KEEP_SIZE.
Where there is a gap (hole) between the extents, and EOF is within
that gap, the final hole wasn't reproduced, resulting in silent
data corruption in the copied file (size too small).

* src/copy.c (extent_copy): Ensure we don't process extents
beyond the apparent file size, since processing and allocating
those is not currently supported.
* tests/cp/fiemap-extents.sh: Rename from tests/cp/fiemap-empty.sh
and re-enable parts checking the extents at and beyond EOF.
* tests/local.mk: Reference the renamed test.
* NEWS: Mention the bug fix.
---
 NEWS   |   5 +++
 src/copy.c |  18 +++-
 tests/cp/fiemap-empty.sh   | 102 -
 tests/cp/fiemap-extents.sh |  81 +++
 tests/local.mk |   2 +-
 5 files changed, 104 insertions(+), 104 deletions(-)
 delete mode 100755 tests/cp/fiemap-empty.sh
 create mode 100755 tests/cp/fiemap-extents.sh

diff --git a/NEWS b/NEWS
index fc5e927..814e29d 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,11 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Bug fixes
 
+  cp now correctly copies files with a hole at the end of the file,
+  and extents allocated beyond the apparent size of the file.
+  That combination resulted in the trailing hole not being reproduced.
+  [bug introduced in coreutils-8.10]
+
   ls no longer prematurely wraps lines when printing short file names.
   [bug introduced in 5.1.0]
 
diff --git a/src/copy.c b/src/copy.c
index dc1cd29..4fb3fb2 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -432,6 +432,19 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
   ext_len = 0;
 }
 
+  /* Truncate extent to EOF.
+ Generally this will trigger with an extent starting after
+ src_total_size, and result in creating a hole or zeros until EOF.
+ Though in a file with changing extents since src_total_size
+ was determined, we might have an extent spanning that size,
+ in which case we'll only copy data up to that size.  */
+  if (src_total_size < ext_start + ext_len)
+{
+  if (src_total_size < ext_start)
+ext_start = src_total_size;
+  ext_len = src_total_size - ext_start;
+}
+
   ext_hole_size = ext_start - last_ext_start - last_ext_len;
 
   wrote_hole_at_eof = false;
@@ -495,14 +508,17 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
   off_t n_read;
   empty_extent = false;
   last_ext_len = ext_len;
+  bool read_hole;
 
   if ( ! sparse_copy (src_fd, dest_fd, buf, buf_size,
   sparse_mode == SPARSE_ALWAYS ? hole_size: 0,
   true, src_name, dst_name, ext_len, _read,
-  _hole_at_eof))
+  _hole))
 goto fail;
 
   dest_pos = ext_start + n_read;
+   

Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Pádraig Brady
On 20/11/15 17:38, Jim Meyering wrote:
> On Fri, Nov 20, 2015 at 1:08 PM, Pádraig Brady  wrote:
>> On 20/11/15 02:20, Pádraig Brady wrote:
>>> I'm coming around to making a change here.
>>>
>>> Either be quiet about:
>>>   datagen | tee >(sha1sum --tag) >(md5sum --tag) >&- | sort | gpg 
>>> --clearsign
>>>
>>> Or support:
>>>   datagen | tee --no-stdout >(sha1sum --tag) >(md5sum --tag) | sort | gpg 
>>> --clearsign
>>>
>>> I like the idea of supporting this with no new option.
>>> I see we have similar EBADF handling in touch and nohup.
>>> I'll sleep on it.
>>
>> The attached supports the >&- usage above.
> 
> Doesn't this suppress a diagnostic that is likely to be valuable to anyone who
> accidentally runs an affected tool from a context with closed standard output?

Yes it's not ideal.
Also it doesn't map directly to closed stdout.
If we were to support it then --no-stdout would probably be best.
That would allow symmetric use of process substitutions also.
i.e. tee --no-stdout >(cmd1) >(cmd2)
rather than the slightly awkward: tee --no-stdout >(cmd1) | cmd2

cheers,
Pádraig



Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Jirka Hladky
>
> > Doesn't this suppress a diagnostic that is likely to be valuable to
> anyone who
> > accidentally runs an affected tool from a context with closed standard
> output?
> Yes it's not ideal.
> Also it doesn't map directly to closed stdout.
> If we were to support it then --no-stdout would probably be best.
> That would allow symmetric use of process substitutions also.
> i.e. tee --no-stdout >(cmd1) >(cmd2)
> rather than the slightly awkward: tee --no-stdout >(cmd1) | cmd2


This is exactly my point of view as well. If we want to support it we
should do this right. I consider both >&- and >/dev/full as workarounds. I
think that --no-stdout is the best solution for this.

Jirka

On Fri, Nov 20, 2015 at 7:08 PM, Eric Blake  wrote:

> On 11/20/2015 10:38 AM, Bernhard Voelker wrote:
> > On 11/20/2015 01:08 PM, Pádraig Brady wrote:
> >> +  tee no longer diagnoses write errors to a closed standard output, as
> this
> >> +  can be useful when further piping the output from process
> substitutions.
> >
> > I'm not sure this is allowed by POSIX,
>
> POSIX says you are non-conforming the moment you start an application
> with fd 0, 1, or 2 closed, and that all bets are off (so we can do
> whatever we think makes the most sense, but if it is more than just tee
> with stdout closed we may be aggravating the problem).
>
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
>


Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Jirka Hladky
Yes, *this is* the solution I was looking for!

tee -p >(head -c1 | wc -c ) > >(head -c10M | wc -c)

Thanks to everybody for taking part in the discussion and finally coming up
with the solution.

Could we please add this example to the EXAMPLES section of tee's manual
page? If there is anything I can do to make it happen please let me
know.

Thanks a lot!
Jirka

On Sat, Nov 21, 2015 at 12:01 AM, Bob Proulx  wrote:

> Bernhard Voelker wrote:
> > I'm not convinced that a new --no-stdout option is warranted:
> > why not simply redirect stdout to the last fifo?
> >
> >   cat /dev/zero | head -c500M \
> > | (/dev/shm/AAA/coreutils-8.24/src/tee -p \
> >  $d/fifo1 $d/fifo2 $d/fifo3 > $d/fifo4 ) 2>&1 \
> > | tee $d/run.log &
>
> Of course!  It was so obvious that we missed seeing it!  Simply do a
> normal redirect of stdout to the process.  Thanks Bernhard for
> pointing this out.
>
> This is also true of the >(process substitutions) too.
>
>   echo foo | tee >(sleep 2 && cat) > >(sleep 5 && cat)
>
> This really argues against any need for --no-stdout.  Because if one
> wants --no-stdout it means one has forgotten about a normal
> redirection.
>
> Bob
>
>


Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Bob Proulx
Bernhard Voelker wrote:
> I'm not convinced that a new --no-stdout option is warranted:
> why not simply redirect stdout to the last fifo?
> 
>   cat /dev/zero | head -c500M \
> | (/dev/shm/AAA/coreutils-8.24/src/tee -p \
>  $d/fifo1 $d/fifo2 $d/fifo3 > $d/fifo4 ) 2>&1 \
> | tee $d/run.log &

Of course!  It was so obvious that we missed seeing it!  Simply do a
normal redirect of stdout to the process.  Thanks Bernhard for
pointing this out.

This is also true of >(process substitutions).

  echo foo | tee >(sleep 2 && cat) > >(sleep 5 && cat)

This really argues against any need for --no-stdout.  Because if one
wants --no-stdout it means one has forgotten about a normal
redirection.

Bob



Re: Enhancement request for tee - please add the option to not quit on SIGPIPE when someother files are still opened

2015-11-20 Thread Pádraig Brady
On 20/11/15 23:58, Jirka Hladky wrote:
>> On Sat, Nov 21, 2015 at 12:01 AM, Bob Proulx wrote:
>> 
>> Bernhard Voelker wrote:
>> > I'm not convinced that a new --no-stdout option is warranted:
>> > why not simply redirect stdout to the last fifo?
>> >
>> >   cat /dev/zero | head -c500M \
>> > | (/dev/shm/AAA/coreutils-8.24/src/tee -p \
>> >  $d/fifo1 $d/fifo2 $d/fifo3 > $d/fifo4 ) 2>&1 \
>> > | tee $d/run.log &
>> 
>> Of course!  It was so obvious that we missed seeing it!  Simply do a
>> normal redirect of stdout to the process.  Thanks Bernhard for
>> pointing this out.
>> 
>> This is also true of the >(process substitutions) too.
>> 
>>   echo foo | tee >(sleep 2 && cat) > >(sleep 5 && cat)
>> 
>> This really argues against any need for --no-stdout.  Because if one
>> wants --no-stdout it means one has forgotten about a normal
>> redirection.
>
> Yes, *this is* the solution I was looking for! 
> 
> tee -p >(head -c1 | wc -c ) > >(head -c10M | wc -c)
> 
> Thanks to everybody for taking part in the discussion and finally coming up
> with the solution.
> 
> Could we please add this example to the EXAMPLES section of tee's manual
> page? If there is anything I can do to make it happen please let me know.

Yes it works well.
I've added the construct to the existing tee examples in the attached.

thanks,
Pádraig.

From 15e9168976444fbe67a6fcf95035352bd0dcbbe2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Fri, 20 Nov 2015 11:54:00 +
Subject: [PATCH] doc: give a tee example for combining process substitution
 outputs

This can be useful if you want to further process data
from process substitutions. For example:
  datagen | tee >(md5sum --tag) > >(sha256sum --tag) | sort

* doc/coreutils.texi (tee invocation): Mention that -p is
useful with pipes that may not consume all data.
Add an example, similar to the one above.
* THANKS.in: Add Jirka Hladky.
---
 THANKS.in  |  1 +
 doc/coreutils.texi | 16 
 2 files changed, 17 insertions(+)

diff --git a/THANKS.in b/THANKS.in
index 51c77ef..5c49006 100644
--- a/THANKS.in
+++ b/THANKS.in
@@ -299,6 +299,7 @@ Jesse Thilo j...@eecs.lehigh.edu
 Jie Xu  x...@iag.net
 Jim Blandy  j...@cyclic.com
 Jim Dennis  j...@starshine.org
+Jirka Hladky  jhla...@redhat.com
 Joakim Rosqvist dvl...@cs.umu.se
 Jochen Hein joc...@jochen.org
 Joe Orton   j...@manyfish.co.uk
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 8034807..a73a635 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -13019,6 +13019,11 @@ so it works with @command{zsh}, @command{bash}, and @command{ksh},
 but not with @command{/bin/sh}.  So if you write code like this
 in a shell script, be sure to start the script with @samp{#!/bin/bash}.
 
+Note also that if any of the process substitutions (or piped stdout)
+might exit early without consuming all the data, the @option{-p} option
+is needed to allow @command{tee} to continue to process the input
+to any remaining outputs.
+
 Since the above example writes to one file and one process,
 a more conventional and portable use of @command{tee} is even better:
 
@@ -13087,6 +13092,17 @@ tar chof - "$tardir" \
   | bzip2 -9 -c > your-pkg-M.N.tar.bz2
 @end example
 
+If you want to further process the output from process substitutions,
+and those processes write atomically (i.e. write less than the system's
+PIPE_BUF size at a time), that's possible with a construct like:
+
+@example
+tardir=your-pkg-M.N
+tar chof - "$tardir" \
+  | tee >(md5sum --tag) > >(sha256sum --tag) \
+  | sort | gpg --clearsign > your-pkg-M.N.tar.sig
+@end example
+
 @exitstatus
 
 
-- 
2.5.0
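
The construct from the documentation patch can be tried on a small fixed input instead of a tarball. This sketch assumes GNU md5sum and sha256sum (for --tag) and an installed bash; the input string and the variable name sums are illustrative. Each checksum line is far shorter than PIPE_BUF, so the two writes should not interleave.

```sh
#!/bin/sh
# One substitution is passed as an argument, the other takes over stdout.
# Both inherit the pipe to "sort", which coalesces their output.
# Run under bash explicitly, since >() does not work in plain /bin/sh.
sums=$(bash -c '
  printf "hello\n" | tee >(md5sum --tag) > >(sha256sum --tag) | LC_ALL=C sort
')
printf '%s\n' "$sums"
```

Sorting gives a stable order regardless of which checksum finishes first, which matters if the result is then signed.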



Re: feature request: tail -H

2015-11-20 Thread Pádraig Brady
On 01/10/15 17:07, Pádraig Brady wrote:
> On 30/09/15 12:45, Stephen Shirley wrote:
>> Hi,
>> Here's the scenario: you're in a directory of updating log files
>> (could be /var/log), and you want to watch all files for specific
>> keywords. For a single file, "tail -F file | grep keyword" is
>> sufficient, but if you want to watch multiple files, "tail -F file1
>> file2 file3 | grep keyword" is much less helpful because you have no
>> way of knowing which log file the matching text is from.
>>
>> My suggestion is to add a -H flag (convention taken from grep -H aka
>> --with-filename) to tail. With -H specified, tail would no longer
>> print out headers before file contents, it would instead prefix the
>> line with the file name. With this, "tail -HF file1 file2 file3 | grep
>> keyword" is useful, because you get the filename included in the
>> matching lines.
>>
>> The workaround i've come up with in the meantime is:
>>
>>   tail -F "$@" | awk '/^$/ {next} /^==>/ {prefix=$2; next} {print
>> prefix ": " $0}'
>>
>> but it's a bit of a hack; there's no way to be sure that a header
>> string is actually a header, and not part of the file contents.
> 
> I like that. It would be similar to the grep option: -H, --with-filename

Upon more careful consideration, I'm 50:50
about adding per line processing to tail.

More robust awk would be:

  tail -Fv "$@" | awk '
    /^==> .* <==$/ {prefix=substr($0,5,length-8); next}
    {print prefix ":" $0}
  ' |
  grep 'blah'

Now whether that's AWKward enough and not general enough
to warrant a new tail option I'm not sure.

Perhaps we could just add the above snippet to the docs?
The big advantage is that it works everywhere already.

cheers,
Pádraig.
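
To see the awk filter at work without following live logs, it can be fed static files. This is a sketch: the file names a.log and b.log and their contents are invented, "tail -v -n +1" stands in for "tail -Fv" so that the command terminates, and length($0) is spelled out for clarity.

```sh
#!/bin/sh
# Build two small sample logs in a scratch directory.
d=$(mktemp -d) || exit 1
printf 'alpha\nerror one\n' > "$d/a.log"
printf 'beta\nerror two\n' > "$d/b.log"

# tail -v prints a "==> name <==" header before each file; awk remembers
# the name from the header and prefixes every following line with it.
prefixed=$(cd "$d" && tail -v -n +1 a.log b.log | awk '
  /^==> .* <==$/ {prefix=substr($0,5,length($0)-8); next}
  {print prefix ":" $0}
' | grep 'error')
rm -rf "$d"
printf '%s\n' "$prefixed"
```

The caveat from the original request still applies: a line in a log that itself looks like a header would be mistaken for one.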