bug#23031: reporting write errors and handling SIGPIPE

2016-03-18 Thread Eric Blake
On 03/18/2016 11:30 AM, Assaf Gordon wrote:

> I found that on my weird setup, programs start with SIGPIPE set to
> SIG_IGN by default (not sure how I got myself into such situation).

There have been various automated robot testsuite runners (such as
Jenkins at one point in the past, although I don't know if it is still
the case) that do the equivalent of "trap '' PIPE" in their master shell;
the rules of POSIX then say that 'sh' can't do anything to undo that
setting (it's very annoying - an inherited ignored SIGPIPE is impossible
to undo in straight shell, and requires an intermediary C program).
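
To illustrate what such an intermediary program has to do, here is a
minimal sketch, written in Python rather than C purely for brevity. The
helper name run_into_closed_pipe and the use of 'cat /dev/zero' as the
test command are assumptions made for this illustration; nothing here is
taken from grep or from any testsuite runner. The child is started with
SIGPIPE either left ignored (as if inherited from such a runner) or reset
to SIG_DFL just before exec, and then writes into a pipe whose read end
has been closed.

```python
import signal
import subprocess


def run_into_closed_pipe(argv, reset_sigpipe):
    """Run argv so that its stdout is a pipe nobody reads.

    When reset_sigpipe is true, SIGPIPE is put back to SIG_DFL between
    fork and exec, which is the job of the intermediary program.
    Otherwise the child starts with SIGPIPE ignored.
    Returns the exit status; a negative value means "killed by signal".
    """
    def child_setup():
        # Runs in the child after fork and before exec. Both SIG_IGN and
        # SIG_DFL dispositions survive the exec.
        disposition = signal.SIG_DFL if reset_sigpipe else signal.SIG_IGN
        signal.signal(signal.SIGPIPE, disposition)

    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        restore_signals=False,  # keep Python from resetting SIGPIPE itself
        preexec_fn=child_setup,
    )
    proc.stdout.close()  # close the read end: further writes hit a broken pipe
    return proc.wait()


killed = run_into_closed_pipe(["cat", "/dev/zero"], reset_sigpipe=True)
ignored = run_into_closed_pipe(["cat", "/dev/zero"], reset_sigpipe=False)
```

With the disposition reset, the child should be killed by SIGPIPE (a
negative return code from subprocess). With SIGPIPE left ignored, cat is
expected to see EPIPE from write() and exit with a nonzero status instead.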

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





bug#23031: reporting write errors and handling SIGPIPE

2016-03-18 Thread Assaf Gordon

Hello,

First,
Attached is a patch that adds errno information to 'write error' messages, e.g.:
  $ grep [...] > /dev/full
  grep: write error: No space left on device
I hope it is self-explanatory enough (comments and suggestions are welcome).
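
For readers without the attachment, the idea of appending the errno text
to a "write error" message can be sketched as follows. This is not the
attached patch (which changes grep's C source); it is a small Python
illustration, and the function name describe_write_error is made up for
this example. It relies on /dev/full, which on Linux fails every write
with ENOSPC.

```python
import os


def describe_write_error(prog, path, data=b"x\n"):
    """Try to write data to path; return a 'write error' message or None."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, data)
    except OSError as exc:
        # exc.errno carries the reason for the failure, for example
        # ENOSPC for /dev/full or EPIPE for a pipe with no reader.
        return f"{prog}: write error: {os.strerror(exc.errno)}"
    finally:
        os.close(fd)
    return None


if os.path.exists("/dev/full"):
    print(describe_write_error("grep", "/dev/full"))
```

The point of including the errno text is that ENOSPC and EPIPE then
produce different messages, instead of the same bare "write error".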


Second,
On one GNU/Linux server I'm experiencing strange behavior (or at least,
behavior I don't understand): grep is not always killed immediately by
SIGPIPE. Instead, it sometimes prints "write error" (for EPIPE) and exits.
That is partially why I wrote the above patch, to try to understand what's
going on.

An example, reproducible on my machine (running on real hardware), though hard 
to reproduce inside a VM and on other servers:
   seq 10 > in
   for i in $(seq 100) ; do
     ./src/grep -s --line-buffered -v '^$' < in | head -n1 > /dev/null ;
   done
For some of the runs (out of 100) I get the error
"./src/grep: write error: Broken pipe".

Attached are strace/ltrace logs of such cases. The key lines:

When grep is killed by SIGPIPE:

 == strace ==
   write(1, "1\n", 2)  = 2
   write(1, "2\n", 2)  = -1 EPIPE (Broken pipe)
   --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=2960, si_uid=1004} ---
   +++ killed by SIGPIPE +++

 == ltrace ==
   fwrite_unlocked("1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14"..., 1, 2, 0x7efc5dab9400) = 2
   __errno_location() = 0x7efc5dee46a0
   fflush_unlocked(0x7efc5dab9400, 0x18da000, 10, 4096) = 0
   memchr("2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n1"..., '\n', 32766) = 0x18da003
   fwrite_unlocked("2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n1"..., 1, 2, 0x7efc5dab9400) = 2
   __errno_location() = 0x7efc5dee46a0
   fflush_unlocked(0x7efc5dab9400, 0x18da004, 0, 2610 <unfinished ...>
   --- SIGPIPE (Broken pipe) ---
   +++ killed by SIGPIPE +++


When grep is not killed by SIGPIPE:

 == strace ==
   write(1, "1\n", 2)  = 2
   write(1, "2\n", 2)  = -1 EPIPE (Broken pipe)
   --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=2893, si_uid=1004} ---
   write(2, "./src/grep: ", 12)  = 12
   write(2, "write error", 11)   = 11
   write(2, ": Broken pipe", 13) = 13
   write(2, "\n", 1)             = 1
   exit_group(2)                 = ?

 == ltrace ==
   fwrite_unlocked("1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14"..., 1, 2, 0x7f9e62f50400) = 2
   __errno_location() = 0x7f9e6337b6a0
   fflush_unlocked(0x7f9e62f50400, 0x25af000, 10, 4096) = 0
   memchr("2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n1"..., '\n', 32766) = 0x25af003
   fwrite_unlocked("2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n1"..., 1, 2, 0x7f9e62f50400) = 2
   __errno_location() = 0x7f9e6337b6a0
   fflush_unlocked(0x7f9e62f50400, 0x25af004, 0, 2610 <unfinished ...>
   --- SIGPIPE (Broken pipe) ---
   <... fflush_unlocked resumed> ) = 0x
   __errno_location() = 0x7f9e6337b6a0
   error(2, 32, 0x41caa0, 32



The server is:
  $ uname -a
  Linux x 3.13.0-77-generic #121-Ubuntu SMP Wed Jan 20 10:50:42 UTC 2016 x86_64 GNU/Linux

  $ gcc -v
  Using built-in specs.
  COLLECT_GCC=gcc
  COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.2.0/lto-wrapper
  Target: x86_64-unknown-linux-gnu
  Configured with: ../gcc-5.2.0/configure --enable-languages=c,c++
  Thread model: posix
  gcc version 5.2.0 (GCC)


Thanks for any feedback,
regards,
 - assaf


grep-write-error-msg.patch.xz
Description: application/xz


grep-SIGPIPE-killed.ltrace.log.xz
Description: application/xz


grep-SIGPIPE-killed.strace.log.xz
Description: application/xz


grep-SIGPIPE-not-killed.ltrace.log.xz
Description: application/xz


grep-SIGPIPE-not-killed.strace.log.xz
Description: application/xz


bug#23052: Make grep be able to separate output by NULL characters?

2016-03-18 Thread Chiel ten Brinke
Suppose we are doing a multiline regex pattern search on a bunch of files
and we want to extract the matches, e.g. for further processing. By
default, grep outputs matches separated by newlines, but since we are doing
multiline patterns this creates the inconvenience that we cannot easily
extract the individual matches. So we would want to have the matches
separated by null bytes. This seems to be a very straightforward feature,
and I was surprised that this was not already possible.

Here is a tiny example:

grep -rzPIho '}\n\n\w\w\b' | od -a

Depending on the files in your file tree, this may yield an output like

0000000   }  nl  nl   m   y  nl   }  nl  nl   i   f  nl   }  nl  nl   m
0000020   y  nl   }  nl  nl   m   y  nl   }  nl  nl   i   f  nl   }  nl
0000040  nl   m   y  nl
0000044

As you can see, we cannot split on newlines to obtain the matches for
further processing, since the matches contain newline characters themselves.

Now grep already has the -Z/--null flag, but that only puts a null byte after
file names (for example with the -l flag, which makes grep output filenames
instead of matches); it does not change how the matches are separated.

So here is the feature request: can we make the -z flag also affect the
normal output?
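
To show why null-terminated matches would be easier to consume, here is a
small Python sketch. It assumes a hypothetical output stream in which each
match is terminated by a null byte, which is what this request asks for and
not what the od dump above shows. The function name split_matches and the
sample bytes are made up for the illustration.

```python
def split_matches(data):
    """Split null-terminated records, keeping embedded newlines intact."""
    records = data.split(b"\0")
    # A trailing terminator leaves one empty piece at the end; drop it.
    if records and records[-1] == b"":
        records.pop()
    return records


# Hypothetical output: three multiline matches, each ending in a null byte.
sample = b"}\n\nmy\0}\n\nif\0}\n\nmy\0"
matches = split_matches(sample)
```

Each element of matches is then one complete match, newlines included,
which splitting on newlines cannot give us.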


Regards,

Chiel