On Thu, Apr 15, 2021 at 04:29:17PM +0200, Christian Weisgerber wrote:
> Jordan Geoghegan:
>
> > --- /tmp/bad.txt Wed Apr 14 21:06:51 2021
> > +++ /tmp/good.txt Wed Apr 14 21:06:41 2021
>
> I'll note that no characters have been lost between the two files.
> Only the order is different.
>
> > The only thing that changed between these runs was me using either xargs -P
> > 1 or -P 2.
>
> What do you expect? You run two processes in parallel that write
> to the same file. Obviously their output will be interspersed in
> unpredictable order.
>
> You seem to imagine that awk's output is line-buffered. But when
> it writes to a pipe or file, its output is block-buffered. This
> is default stdio behavior. Output is written in block-size increments
> (16 kB in practice) without regard to lines. So, yes, you can end
> up with a fragment from a line written by process #1, followed by
> lines from process #2, followed by the remainder of the line from
> #1, etc.
>
> --
> Christian "naddy" Weisgerber na...@mips.inka.de
>
Right, an fflush() call after the printf makes the issue go away, but
only because awk is being nice and issues a single write call for that
single printf.

Since awk, AFAIK, gives no such guarantee, it is better to have each
parallel invocation write to a separate file and then cat them
together after all the awk runs are done.

	-Otto
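
For illustration, a minimal sketch of the separate-file approach. The
file names (in.N, in.N.out, merged.txt), the scratch directory and the
toupper() transformation are all made up for the example and are not
taken from the original poster's script; it assumes an xargs that
supports -P, as used earlier in the thread.

```shell
#!/bin/sh
# Sketch only: every parallel awk run writes to its own output file,
# and the results are concatenated once all runs have finished.
set -eu

work=$(mktemp -d)               # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT

# Hypothetical input: four small files of three lines each.
for i in 1 2 3 4; do
	printf 'file%s line1\nfile%s line2\nfile%s line3\n' \
	    "$i" "$i" "$i" > "$work/in.$i"
done

# Run up to two awk processes at a time. -n 1 hands each invocation
# exactly one input file, and the inner sh redirects awk's output to a
# file named after that input, so no two processes share an output file.
printf '%s\n' "$work"/in.* |
    xargs -n 1 -P 2 sh -c 'awk "{ print toupper(\$0) }" "$1" > "$1.out"' sh

# xargs returns only after all of its children have exited, so the
# per-file outputs are complete here. Merge them in file-name order.
cat "$work"/in.*.out > "$work/merged.txt"
cat "$work/merged.txt"
```

The merged order is decided by the glob (file name), not by which awk
happened to finish first, so the result is the same with -P 1 and -P 2
and does not depend on how awk buffers its output.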