Paul Eggert wrote:
> OK, I scratched my head for a bit and came up with the following
> further patch, which addresses the issues that I mentioned.
...
> Subject: [PATCH] sort: simpler fix for sort -u data-loss bug
>
> * src/sort.c (overlap): Remove.
> (fillbuf): Do not try to copy saved lines, as that is too risky
> in the presence of parallelism, reallocated buffers, etc.
> (sort): Invalidate any saved line before sorting a new batch.
> ---
> src/sort.c | 36 +-----------------------------------

Very nice! That fixes not just the original bug, but also the FMR
(free-memory read), and eliminates my entire patch. The only cost is
in writing at most one more line per buffer.

I hate to look such a nice gift horse in the mouth, but it's getting
late here... Would you mind adjusting that to add NEWS and mention
that you've fixed the second, free-memory-read bug, too? And even add
the test? If you don't find time, I'll get to that over the weekend.

===============
Regarding your patch...

For the record, at first I thought an input that used one (long) line
per buffer would make --unique a no-op, but then I realized that in
that case, each buffer's worth (one line each) would be written to its
own temporary file, and the merge phase would handle the --unique
semantics.

Thanks again!
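
The one-line-per-buffer reasoning above can be illustrated with a small model. This is only a sketch under stated assumptions, not the code in src/sort.c: the function name `sort_unique_batches` and the `batch_size` parameter are hypothetical, in-memory lists stand in for the temporary files, and the per-batch and merge-phase deduplication are simplified to exact string equality.

```python
import heapq
from itertools import groupby


def sort_unique_batches(lines, batch_size):
    """Hypothetical model of a batch-then-merge 'sort -u'.

    Each batch is sorted and deduplicated on its own, then all
    batches are merged with a second deduplication pass.
    """
    runs = []
    for start in range(0, len(lines), batch_size):
        batch = lines[start:start + batch_size]
        # Per-batch uniqueness: a no-op when the batch holds one line.
        runs.append(sorted(set(batch)))  # stands in for a temporary file

    # Merge phase: duplicates that sat in different batches become
    # adjacent in the merged stream, so groupby can drop them here.
    merged = heapq.merge(*runs)
    return [line for line, _ in groupby(merged)]


lines = ["b", "a", "b", "c", "a"]
# With one line per batch, only the merge phase removes duplicates.
print(sort_unique_batches(lines, batch_size=1))
```

In this model the result is the same whether `batch_size` is 1 or larger than the input, which is the point of the observation: uniqueness across batches is enforced during the merge, so a batch that holds a single line loses nothing.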
