Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)
On Thu, Jul 30, 2020 at 6:40 PM Justin Pryzby wrote: > Feel free to close it out. I'm satisfied that we've had a discussion about > it. Closed it out. -- Peter Geoghegan
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)
On Thu, Jul 30, 2020 at 6:39 PM James Coleman wrote: > I very much do not like this approach, and I think it's actually > fundamentally wrong, at least for the memory check. Quicksort is not the only > option that uses memory. For now, there's only one option that spills to disk > (external merge sort), but there's no reason it has to remain that way. I wouldn't be surprised if it was possible to get SORT_TYPE_EXTERNAL_SORT even today (though I'm not sure if that's truly possible). That will happen for a regular sort node if we require randomAccess to the sort, and it happens to spill -- we can randomly access the final tape, but cannot do a final on-the-fly merge. Say for a merge join. -- Peter Geoghegan
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)
On Thu, Jul 30, 2020 at 06:33:32PM -0700, Peter Geoghegan wrote: > On Thu, Jul 30, 2020 at 5:22 PM Justin Pryzby wrote: > > Because filtering out zero values is exactly what's intended to be avoided > > for > > nontext output. > > > > I think checking whether the method was used should result in the same > > output, > > without the literal check for zero value (which itself sets a bad example). > > It seems fine to me as-is. What about SORT_TYPE_TOP_N_HEAPSORT? Or any > other sort methods we add in the future? Feel free to close it out. I'm satisfied that we've had a discussion about it. -- Justin
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)
On Thu, Jul 30, 2020 at 8:22 PM Justin Pryzby wrote: > On Wed, Jul 29, 2020 at 09:18:44PM -0700, Peter Geoghegan wrote: > > On Wed, Jul 29, 2020 at 9:05 PM Justin Pryzby > wrote: > > > So my 2ndary suggestion is to conditionalize based on the method > rather than > > > value of space used. > > > > What's the advantage of doing it that way? > > Because filtering out zero values is exactly what's intended to be avoided > for > nontext output. > > I think checking whether the method was used should result in the same > output, > without the literal check for zero value (which itself sets a bad example). > > --- a/src/backend/commands/explain.c > +++ b/src/backend/commands/explain.c > @@ -2824,13 +2824,13 @@ > show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo, > appendStringInfo(, "%s Groups", groupLabel); > ExplainOpenGroup("Incremental Sort Groups", > groupName.data, true, es); > ExplainPropertyInteger("Group Count", NULL, > groupInfo->groupCount, es); > > ExplainPropertyList("Sort Methods Used", methodNames, es); > > - if (groupInfo->maxMemorySpaceUsed > 0) > + if (groupInfo->sortMethods & SORT_TYPE_QUICKSORT) > { > longavgSpace = > groupInfo->totalMemorySpaceUsed / groupInfo->groupCount; > const char *spaceTypeName; > StringInfoData memoryName; > > spaceTypeName = > tuplesort_space_type_name(SORT_SPACE_TYPE_MEMORY); > @@ -2841,13 +2841,13 @@ > show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo, > ExplainPropertyInteger("Average Sort Space Used", > "kB", avgSpace, es); > ExplainPropertyInteger("Peak Sort Space Used", > "kB", > > groupInfo->maxMemorySpaceUsed, es); > > ExplainCloseGroup("Sort Spaces", memoryName.data, > true, es); > } > - if (groupInfo->maxDiskSpaceUsed > 0) > + if (groupInfo->sortMethods & SORT_TYPE_EXTERNAL_SORT) > { > longavgSpace = > groupInfo->totalDiskSpaceUsed / groupInfo->groupCount; > const char *spaceTypeName; > StringInfoData diskName; > > spaceTypeName = > tuplesort_space_type_name(SORT_SPACE_TYPE_DISK); > I very much do not like this approach, and I think it's actually fundamentally wrong, at least for the memory check. Quicksort is not the only option that uses memory. For now, there's only one option that spills to disk (external merge sort), but there's no reason it has to remain that way. And in the future we might accurately report memory consumed even when we've eventually spilled to disk also, so memory used would be relevant potentially even if no in-memory sort was ever performed. So I'm pretty confident checking the space used is the correct way to do this. James
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)
On Thu, Jul 30, 2020 at 5:22 PM Justin Pryzby wrote: > Because filtering out zero values is exactly what's intended to be avoided for > nontext output. > > I think checking whether the method was used should result in the same output, > without the literal check for zero value (which itself sets a bad example). It seems fine to me as-is. What about SORT_TYPE_TOP_N_HEAPSORT? Or any other sort methods we add in the future? The way that we flatten maxDiskSpaceUsed and maxMemorySpaceUsed into "space used" on output might be kind of questionable, but it's something that we have to live with for the foreseeable future. I don't think that this is a bad example -- we don't output maxDiskSpaceUsed or maxMemorySpaceUsed at the conceptual level. -- Peter Geoghegan
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)
On Wed, Jul 29, 2020 at 09:18:44PM -0700, Peter Geoghegan wrote: > On Wed, Jul 29, 2020 at 9:05 PM Justin Pryzby wrote: > > So my 2ndary suggestion is to conditionalize based on the method rather than > > value of space used. > > What's the advantage of doing it that way? Because filtering out zero values is exactly what's intended to be avoided for nontext output. I think checking whether the method was used should result in the same output, without the literal check for zero value (which itself sets a bad example). --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -2824,13 +2824,13 @@ show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo, appendStringInfo(, "%s Groups", groupLabel); ExplainOpenGroup("Incremental Sort Groups", groupName.data, true, es); ExplainPropertyInteger("Group Count", NULL, groupInfo->groupCount, es); ExplainPropertyList("Sort Methods Used", methodNames, es); - if (groupInfo->maxMemorySpaceUsed > 0) + if (groupInfo->sortMethods & SORT_TYPE_QUICKSORT) { longavgSpace = groupInfo->totalMemorySpaceUsed / groupInfo->groupCount; const char *spaceTypeName; StringInfoData memoryName; spaceTypeName = tuplesort_space_type_name(SORT_SPACE_TYPE_MEMORY); @@ -2841,13 +2841,13 @@ show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo, ExplainPropertyInteger("Average Sort Space Used", "kB", avgSpace, es); ExplainPropertyInteger("Peak Sort Space Used", "kB", groupInfo->maxMemorySpaceUsed, es); ExplainCloseGroup("Sort Spaces", memoryName.data, true, es); } - if (groupInfo->maxDiskSpaceUsed > 0) + if (groupInfo->sortMethods & SORT_TYPE_EXTERNAL_SORT) { longavgSpace = groupInfo->totalDiskSpaceUsed / groupInfo->groupCount; const char *spaceTypeName; StringInfoData diskName; spaceTypeName = tuplesort_space_type_name(SORT_SPACE_TYPE_DISK);