On 2017-Nov-21, Peter Geoghegan wrote:

> On Mon, Oct 2, 2017 at 6:04 AM, Robert Haas <robertmh...@gmail.com> wrote:
> > Progress reporting on sorts seems like a tricky problem to me, as I
> > said before.  In most cases, a sort is going to involve an initial
> > stage where it reads all the input tuples and writes out quicksorted
> > runs, and then a merge phase where it merges all the output tapes into
> > a sorted result.  There are some complexities; for example, if the
> > number of tapes is really large, then we might need multiple merge
> > phases, only the last of which will produce tuples.
>
> This would ordinarily be the point at which I'd say "but you're very
> unlikely to require multiple passes for an external sort these days".
> But I won't say that on this thread, because CLUSTER generally has
> unusually wide tuples, and so is much more likely to be I/O bound, to
> require multiple passes, etc. (I bet the v10 enhancements
> disproportionately improved CLUSTER performance.)

When the seqscan-and-sort strategy is used, we feed tuplesort with every
tuple from the scan.  Once that's completed, we call `performsort`, then
retrieve tuples.

If we see this in terms of tapes and merges, we can report the total
number of each of those that we have completed.  As far as I understand,
we write one tape to completion, and only then start another one, right?
Since there's no way to know how many tapes/merges are needed in total,
it's not possible to compute a percentage of completion.  That seems
okay -- we're just telling the user that progress is being made, and we
only report facts, not theory.  Perhaps we can (also?) indicate disk I/O
utilization, in terms of the number of blocks written by tuplesort.

I suppose that in order to have tuplesort.c report progress, we would
have to have some kind of API that tuplesort would invoke internally to
indicate events such as "tape started/completed", "merge
started/completed".  One idea is to use a callback system; each tuplesort
caller could optionally pass a callback to the "begin" function, for
progress reporting purposes.  Initially only cluster.c would use it, but
I suppose eventually every tuplesort caller would want that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
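
To make the callback idea a bit more concrete, here is a rough,
standalone sketch of what such an interface might look like.  Nothing
here is existing tuplesort.c API: the event names, `SketchSortState`,
`sketch_sort_begin`, `sketch_sort_report` and the caller-side
`ClusterSortProgress` struct are all hypothetical, and the "sort state"
is only a stand-in that holds the callback.  It reports counts of
completed runs and merges plus blocks written, and no percentage, in
line with the "facts, not theory" approach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; not part of the real tuplesort.c API. */
typedef enum SortProgressEvent
{
    SORT_EVENT_RUN_COMPLETED,   /* one sorted run (tape) fully written */
    SORT_EVENT_MERGE_COMPLETED, /* one merge pass finished */
    SORT_EVENT_BLOCKS_WRITTEN   /* 'amount' more blocks written to disk */
} SortProgressEvent;

/* Callback type the caller may optionally hand to the "begin" function. */
typedef void (*SortProgressCallback) (SortProgressEvent event,
                                      long amount, void *arg);

/* Stand-in for the sort state; it only remembers the callback. */
typedef struct SketchSortState
{
    SortProgressCallback progress_cb;   /* NULL means "no reporting" */
    void       *progress_arg;           /* opaque caller data */
} SketchSortState;

/* Hypothetical "begin" function taking an optional progress callback. */
void
sketch_sort_begin(SketchSortState *state, SortProgressCallback cb, void *arg)
{
    assert(state != NULL);
    state->progress_cb = cb;
    state->progress_arg = arg;
}

/* What the sort code would call internally whenever an event occurs. */
void
sketch_sort_report(const SketchSortState *state,
                   SortProgressEvent event, long amount)
{
    if (state->progress_cb != NULL)
        state->progress_cb(event, amount, state->progress_arg);
}

/* Caller side (think cluster.c): plain counters, no percentages. */
typedef struct ClusterSortProgress
{
    long        runs_completed;
    long        merges_completed;
    long        blocks_written;
} ClusterSortProgress;

void
cluster_sort_progress(SortProgressEvent event, long amount, void *arg)
{
    ClusterSortProgress *p = (ClusterSortProgress *) arg;

    switch (event)
    {
        case SORT_EVENT_RUN_COMPLETED:
            p->runs_completed += amount;
            break;
        case SORT_EVENT_MERGE_COMPLETED:
            p->merges_completed += amount;
            break;
        case SORT_EVENT_BLOCKS_WRITTEN:
            p->blocks_written += amount;
            break;
    }
}
```

A caller that doesn't care about progress would simply pass NULL as the
callback, so existing tuplesort users would not need to change their
behavior.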