We have a parallel build that occasionally fails with the error message
"make: write error".  Make prints that error message as it is exiting when
it detects that it has seen errors while writing to stdout.  The error it
is encountering is an EAGAIN error, which implies that something has made
its stdout non-blocking.  As far as I've been able to tell so far, this is
occurring while make is running the command "git fetch --quiet --tags".
Once that command finishes, stdout goes back to being blocking, but since
this is a parallel build, make is doing other work while this git command
is running, and may attempt to write to stdout during that time.
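
To illustrate the failure mode itself, here is a minimal sketch (not taken
from make or git) of what a writer sees when its output descriptor has been
made non-blocking and the reader is not keeping up.  The function name
fill_nonblocking_pipe is made up for this example, and the pipe stands in
for whatever make's stdout is connected to.

```python
import errno
import fcntl
import os


def fill_nonblocking_pipe():
    """Write to a non-blocking pipe until the kernel refuses; return the errno."""
    read_fd, write_fd = os.pipe()
    try:
        # Turn on O_NONBLOCK on the write end, as if some other process
        # sharing the descriptor had done it behind the writer's back.
        flags = fcntl.fcntl(write_fd, fcntl.F_GETFL)
        fcntl.fcntl(write_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        chunk = b"x" * 4096
        try:
            while True:
                os.write(write_fd, chunk)  # nobody reads, so the pipe fills up
        except BlockingIOError as exc:
            # A blocking descriptor would have stalled here instead.
            return exc.errno
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

On Linux the returned value should compare equal to errno.EAGAIN, which is
the error a writer that assumes a blocking stdout would treat as a write
failure.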

By stracing this git command, I can see it running the subcommand

ssh -p 29418 user@gerrit.domain "git-upload-pack '/repo'"

and I can see that ssh command doing this:

39828 dup(0)                            = 5
39828 dup(1)                            = 6
39828 dup(2)                            = 7
39828 ioctl(5, TCGETS, 0x7ffea2880800)  = -1 ENOTTY (Inappropriate ioctl for 
39828 fcntl(5, F_GETFL)                 = 0 (flags O_RDONLY)
39828 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
39828 ioctl(6, TCGETS, 0x7ffea2880800)  = -1 ENOTTY (Inappropriate ioctl for 
39828 fcntl(6, F_GETFL)                 = 0x1 (flags O_WRONLY)
39828 fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0
39828 ioctl(7, TCGETS, {B38400 opost isig icanon echo ...}) = 0
39828 fcntl(5, F_SETFD, FD_CLOEXEC)     = 0
39828 fcntl(6, F_SETFD, FD_CLOEXEC)     = 0
39828 fcntl(7, F_SETFD, FD_CLOEXEC)     = 0
39828 ioctl(0, TCGETS, 0x7ffea28806e0)  = -1 ENOTTY (Inappropriate ioctl for 
39828 fcntl(0, F_GETFL)                 = 0x800 (flags O_RDONLY|O_NONBLOCK)
39828 fcntl(0, F_SETFL, O_RDONLY)       = 0
39828 ioctl(1, TCGETS, 0x7ffea28806e0)  = -1 ENOTTY (Inappropriate ioctl for 
39828 fcntl(1, F_GETFL)                 = 0x801 (flags O_WRONLY|O_NONBLOCK)
39828 fcntl(1, F_SETFL, O_WRONLY)       = 0
39828 ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0

So ssh has dup'd descriptors 0, 1, and 2, and then turned on the O_NONBLOCK
flag on the copies of stdin and stdout.  Since a dup'd descriptor shares its
file status flags with the original, you can see afterwards that when ssh
reads the flags on descriptors 0 and 1, both have O_NONBLOCK set.  It then
clears that bit.  In other words, ssh set O_NONBLOCK near the beginning of
its run and cleared it near the end.
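
The reason a flag set on descriptor 6 shows up on descriptor 1 is that
O_NONBLOCK is a file status flag, and those live in the open file
description that dup'd descriptors (and descriptors inherited by child
processes) share.  A small sketch of that behaviour follows, with a pipe
standing in for ssh's stdout; the function name is made up for this example.

```python
import fcntl
import os


def nonblock_leaks_through_dup():
    """Set O_NONBLOCK on a dup'd descriptor; report what the original sees."""
    read_fd, write_fd = os.pipe()
    copy_fd = os.dup(write_fd)  # analogous to ssh's dup(1) = 6
    try:
        before = bool(fcntl.fcntl(write_fd, fcntl.F_GETFL) & os.O_NONBLOCK)
        # Change the flag only through the copy, as ssh does on fd 6.
        flags = fcntl.fcntl(copy_fd, fcntl.F_GETFL)
        fcntl.fcntl(copy_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        # The original descriptor now reports the flag as well.
        after = bool(fcntl.fcntl(write_fd, fcntl.F_GETFL) & os.O_NONBLOCK)
        return before, after
    finally:
        for fd in (read_fd, write_fd, copy_fd):
            os.close(fd)
```

The same sharing applies across fork and exec, so any process that
inherited the same open file description is affected until the flag is
cleared again.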

Should this be considered a git bug or an ssh bug or something else?

I thought I had finally figured out exactly what is happening, but while
writing this I'm no longer sure why my workaround appears to be working.  My
workaround is to pipe make's stdout into a simple program that reads make's
output and writes it to wherever make used to write, except it does a
select() on descriptor 1 before writing, and it makes sure to handle short
counts.  But now I'm thinking that if it's the ssh started indirectly by
make that is messing with O_NONBLOCK, presumably it would be messing with
O_NONBLOCK on the write side of the pipe that make writes to, so make should
still be encountering EAGAIN errors.  And yet my workaround does seem to
work.
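
For reference, a filter of the kind described above might look roughly like
the following.  This is only a sketch of the idea, not the actual program
in use: the names write_all and copy_stream are made up, and it assumes the
input descriptor itself stays blocking.

```python
import os
import select


def write_all(fd, data):
    """Write all of data to fd, tolerating O_NONBLOCK and short writes."""
    view = memoryview(data)
    while view:
        select.select([], [fd], [])  # wait until fd can accept some bytes
        try:
            written = os.write(fd, view)
        except BlockingIOError:
            continue  # fd was non-blocking and filled up again; wait, retry
        view = view[written:]  # short count: keep only the unwritten tail


def copy_stream(in_fd, out_fd, bufsize=65536):
    """Copy in_fd to out_fd until EOF (assumes in_fd is blocking)."""
    while True:
        chunk = os.read(in_fd, bufsize)
        if not chunk:
            break
        write_all(out_fd, chunk)
```

Calling copy_stream(0, 1) at the end of the script turns it into a filter
that make's output can be piped through.  Because every write is preceded
by select() and EAGAIN is retried, the filter does not care whether some
other process flips O_NONBLOCK on its output descriptor.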

Thanks for any light you can shed on this.
