Re: Use of "long" in incremental sort code

2020-11-23 Thread Anastasia Lubennikova

On 05.11.2020 02:53, James Coleman wrote:

On Tue, Nov 3, 2020 at 4:42 PM Tomas Vondra
 wrote:

On Tue, Nov 03, 2020 at 03:53:53AM +0100, Tomas Vondra wrote:

Hi,

I took another look at this, and 99% of the patch (the fixes to sort
debug messages) seems fine to me.  Attached is the part I plan to get
committed, including commit message etc.


I've pushed this part. Thanks for the patch, Haiying Tang.


The one change I decided to remove is this change in tuplesort_free:

-   longspaceUsed;
+   int64   spaceUsed;

The reason why I think this variable should be 'long' is that we're
using it for this:

spaceUsed = LogicalTapeSetBlocks(state->tapeset);

and LogicalTapeSetBlocks is defined like this:

extern long LogicalTapeSetBlocks(LogicalTapeSet *lts);

FWIW the "long" is not introduced by incremental sort - it used to be in
tuplesort_end, the incremental sort patch just moved it to a different
function. It's a bit confusing that tuplesort_updatemax has this:

int64   spaceUsed;

But I'd argue this is actually wrong, and should be "long" instead. (And
this actually comes from the incremental sort patch, by me.)


FWIW while looking at what the other places calling LogicalTapeSetBlocks
do, and I noticed this:

uint64  disk_used = LogicalTapeSetBlocks(...);

in the disk-based hashagg patch. So that's a third data type ...


IMHO this should simply switch the current int64 variable to long, as it
was before. Not sure about about the hashagg uint64 variable.

Is there anything that actually limits tape code to using at most 4GB
on 32-bit systems?


At first glance, I haven't found anything that could limit tape code. It 
uses BufFile, which is not limited by the OS file size limit.
Still, If we want to change 'long' in LogicalTapeSetBlocks, we should 
probably also update nBlocksWritten and other variables.


As far as I see, the major part of the patch was committed, so l update 
the status of the CF entry to "Committed". Feel free to create a new 
entry, if you're going to continue working on the remaining issue.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Use of "long" in incremental sort code

2020-11-04 Thread James Coleman
On Tue, Nov 3, 2020 at 4:42 PM Tomas Vondra
 wrote:
>
> On Tue, Nov 03, 2020 at 03:53:53AM +0100, Tomas Vondra wrote:
> >Hi,
> >
> >I took another look at this, and 99% of the patch (the fixes to sort
> >debug messages) seems fine to me.  Attached is the part I plan to get
> >committed, including commit message etc.
> >
>
> I've pushed this part. Thanks for the patch, Haiying Tang.
>
> >
> >The one change I decided to remove is this change in tuplesort_free:
> >
> >-   longspaceUsed;
> >+   int64   spaceUsed;
> >
> >The reason why I think this variable should be 'long' is that we're
> >using it for this:
> >
> >spaceUsed = LogicalTapeSetBlocks(state->tapeset);
> >
> >and LogicalTapeSetBlocks is defined like this:
> >
> >extern long LogicalTapeSetBlocks(LogicalTapeSet *lts);
> >
> >FWIW the "long" is not introduced by incremental sort - it used to be in
> >tuplesort_end, the incremental sort patch just moved it to a different
> >function. It's a bit confusing that tuplesort_updatemax has this:
> >
> >int64   spaceUsed;
> >
> >But I'd argue this is actually wrong, and should be "long" instead. (And
> >this actually comes from the incremental sort patch, by me.)
> >
> >
> >FWIW while looking at what the other places calling LogicalTapeSetBlocks
> >do, and I noticed this:
> >
> >uint64  disk_used = LogicalTapeSetBlocks(...);
> >
> >in the disk-based hashagg patch. So that's a third data type ...
> >
>
> IMHO this should simply switch the current int64 variable to long, as it
> was before. Not sure about about the hashagg uint64 variable.

Is there anything that actually limits tape code to using at most 4GB
on 32-bit systems?

James




Re: Use of "long" in incremental sort code

2020-11-04 Thread Tomas Vondra



On 11/4/20 10:58 PM, David Rowley wrote:

On Wed, 4 Nov 2020 at 10:42, Tomas Vondra  wrote:

IMHO this should simply switch the current int64 variable to long, as it
was before. Not sure about about the hashagg uint64 variable.


IMO, we should just get rid of the use of "long" here.   As far as I'm
concerned, using long in the core code at all is just unnecessary and
just increases the chances of having bugs.

How often do people forget that we support a 64-bit platform that has
sizeof(long) == 4?

Can't we use size_t and ssize_t if we really need a processor
word-sized type? And use int64/uint64 when we really want a 64-bit
type.



Perhaps. But I guess it's a bit strange to have function declared as 
returning long, but store the result in int64 everywhere. That was the 
point I was trying to make - it's not just a matter of changing all the 
variables to int64, IMHO.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Use of "long" in incremental sort code

2020-11-04 Thread David Rowley
On Wed, 4 Nov 2020 at 10:42, Tomas Vondra  wrote:
> IMHO this should simply switch the current int64 variable to long, as it
> was before. Not sure about about the hashagg uint64 variable.

IMO, we should just get rid of the use of "long" here.   As far as I'm
concerned, using long in the core code at all is just unnecessary and
just increases the chances of having bugs.

How often do people forget that we support a 64-bit platform that has
sizeof(long) == 4?

Can't we use size_t and ssize_t if we really need a processor
word-sized type? And use int64/uint64 when we really want a 64-bit
type.

David




Re: Use of "long" in incremental sort code

2020-11-03 Thread Tomas Vondra

On Tue, Nov 03, 2020 at 03:53:53AM +0100, Tomas Vondra wrote:

Hi,

I took another look at this, and 99% of the patch (the fixes to sort
debug messages) seems fine to me.  Attached is the part I plan to get
committed, including commit message etc.



I've pushed this part. Thanks for the patch, Haiying Tang.



The one change I decided to remove is this change in tuplesort_free:

-   longspaceUsed;
+   int64   spaceUsed;

The reason why I think this variable should be 'long' is that we're
using it for this:

   spaceUsed = LogicalTapeSetBlocks(state->tapeset);

and LogicalTapeSetBlocks is defined like this:

   extern long LogicalTapeSetBlocks(LogicalTapeSet *lts);

FWIW the "long" is not introduced by incremental sort - it used to be in
tuplesort_end, the incremental sort patch just moved it to a different
function. It's a bit confusing that tuplesort_updatemax has this:

   int64   spaceUsed;

But I'd argue this is actually wrong, and should be "long" instead. (And
this actually comes from the incremental sort patch, by me.)


FWIW while looking at what the other places calling LogicalTapeSetBlocks
do, and I noticed this:

   uint64  disk_used = LogicalTapeSetBlocks(...);

in the disk-based hashagg patch. So that's a third data type ...



IMHO this should simply switch the current int64 variable to long, as it
was before. Not sure about about the hashagg uint64 variable.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Use of "long" in incremental sort code

2020-11-02 Thread Tomas Vondra

Hi,

I took another look at this, and 99% of the patch (the fixes to sort
debug messages) seems fine to me.  Attached is the part I plan to get
committed, including commit message etc.


The one change I decided to remove is this change in tuplesort_free:

-   longspaceUsed;
+   int64   spaceUsed;

The reason why I think this variable should be 'long' is that we're
using it for this:

spaceUsed = LogicalTapeSetBlocks(state->tapeset);

and LogicalTapeSetBlocks is defined like this:

extern long LogicalTapeSetBlocks(LogicalTapeSet *lts);

FWIW the "long" is not introduced by incremental sort - it used to be in
tuplesort_end, the incremental sort patch just moved it to a different
function. It's a bit confusing that tuplesort_updatemax has this:

int64   spaceUsed;

But I'd argue this is actually wrong, and should be "long" instead. (And
this actually comes from the incremental sort patch, by me.)


FWIW while looking at what the other places calling LogicalTapeSetBlocks
do, and I noticed this:

uint64  disk_used = LogicalTapeSetBlocks(...);

in the disk-based hashagg patch. So that's a third data type ...


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 
>From 43545ab6e54821a1d459cde2ff9dea58ece2c237 Mon Sep 17 00:00:00 2001
From: tanghy 
Date: Fri, 16 Oct 2020 14:50:38 +0900
Subject: [PATCH] Use INT64_FORMAT to print int64 variables in sort debug

Commit 6ee3b5fb99 cleaned up most of the long/int64 confusion related to
incremental sort, but the sort debug messages were still using %ld for
int64 variables. So fix that.

Author: Haiying Tang
Backpatch-through: 13
Discussion: 
https://postgr.es/m/4250be9d350c4992abb722a76e288aef%40G08CNEXMBPEKD05.g08.fujitsu.local
---
 src/backend/executor/nodeIncrementalSort.c | 28 +++---
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/backend/executor/nodeIncrementalSort.c 
b/src/backend/executor/nodeIncrementalSort.c
index eb6cc19a60..eb1c1326de 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -333,7 +333,7 @@ switchToPresortedPrefixMode(PlanState *pstate)
 */
if (node->bounded)
{
-   SO1_printf("Setting bound on presorted prefix tuplesort to: 
%ld\n",
+   SO1_printf("Setting bound on presorted prefix tuplesort to: " 
INT64_FORMAT "\n",
   node->bound - node->bound_Done);
tuplesort_set_bound(node->prefixsort_state,
node->bound - 
node->bound_Done);
@@ -417,9 +417,9 @@ switchToPresortedPrefixMode(PlanState *pstate)
 * remaining in the large single prefix key group we think we've
 * encountered.
 */
-   SO1_printf("Moving %ld tuples to presorted prefix tuplesort\n", 
nTuples);
+   SO1_printf("Moving " INT64_FORMAT " tuples to presorted prefix 
tuplesort\n", nTuples);
node->n_fullsort_remaining -= nTuples;
-   SO1_printf("Setting n_fullsort_remaining to %ld\n", 
node->n_fullsort_remaining);
+   SO1_printf("Setting n_fullsort_remaining to " INT64_FORMAT "\n", 
node->n_fullsort_remaining);
 
if (lastTuple)
{
@@ -449,7 +449,7 @@ switchToPresortedPrefixMode(PlanState *pstate)
 * out all of those tuples, and then come back around to find 
another
 * batch.
 */
-   SO1_printf("Sorting presorted prefix tuplesort with %ld 
tuples\n", nTuples);
+   SO1_printf("Sorting presorted prefix tuplesort with " 
INT64_FORMAT " tuples\n", nTuples);
tuplesort_performsort(node->prefixsort_state);
 
INSTRUMENT_SORT_GROUP(node, prefixsort);
@@ -462,7 +462,7 @@ switchToPresortedPrefixMode(PlanState *pstate)
 * - n), so store the current number of processed 
tuples for use
 * in configuring sorting bound.
 */
-   SO2_printf("Changing bound_Done from %ld to %ld\n",
+   SO2_printf("Changing bound_Done from " INT64_FORMAT " 
to " INT64_FORMAT "\n",
   Min(node->bound, node->bound_Done + 
nTuples), node->bound_Done);
node->bound_Done = Min(node->bound, node->bound_Done + 
nTuples);
}
@@ -574,7 +574,7 @@ ExecIncrementalSort(PlanState *pstate)
 * need to re-execute the prefix mode transition 
function to pull
 * out the next prefix key group.
 */
-   SO1_printf("Re-calling switchToPresortedPrefixMode() 
because n_fullsort_remaining is > 0 (%ld)\n",
+   SO1_printf("Re-calling switchToPresortedPrefixMode() 
because n_fullsort_remaining is > 0 (" INT64_FO

Re: Use of "long" in incremental sort code

2020-10-21 Thread Tomas Vondra

On Wed, Oct 21, 2020 at 06:06:52AM +, Tang, Haiying wrote:

Hi


Found one more place needed to be changed(long -> int64).

Also changed the output for int64 data(Debug mode on & define EXEC_SORTDEBUG )

And, maybe there's a typo in " src\backend\executor\nodeIncrementalSort.c" as 
below.
Obviously, the ">=" is meaningless, right?

And, maybe there's a typo in " src\backend\executor\nodeIncrementalSort.c" as 
below.
Obviously, the ">=" is meaningless, right?

-   SO1_printf("Sorting presorted prefix tuplesort with >= %ld 
tuples\n", nTuples);
+   SO1_printf("Sorting presorted prefix tuplesort with %ld 
tuples\n", nTuples);

Please take a check at the attached patch file.


I have added it to commit fest.
https://commitfest.postgresql.org/30/2772/



Thanks, the changes seem fine to me. I'll do a bit more review and get
it pushed.


regards
Tomas




RE: Use of "long" in incremental sort code

2020-10-20 Thread Tang, Haiying
Hi

>Found one more place needed to be changed(long -> int64).
>
>Also changed the output for int64 data(Debug mode on & define EXEC_SORTDEBUG )
>
>And, maybe there's a typo in " src\backend\executor\nodeIncrementalSort.c" as 
>below.
>Obviously, the ">=" is meaningless, right?
>
>And, maybe there's a typo in " src\backend\executor\nodeIncrementalSort.c" as 
>below.
>Obviously, the ">=" is meaningless, right?
>
>-  SO1_printf("Sorting presorted prefix tuplesort with >= %ld 
>tuples\n", nTuples);
>+  SO1_printf("Sorting presorted prefix tuplesort with %ld 
>tuples\n", nTuples);
>
>Please take a check at the attached patch file.

I have added it to commit fest.
https://commitfest.postgresql.org/30/2772/

Best regards
Tang

-Original Message-
From: Tang, Haiying  
Sent: Monday, October 19, 2020 12:57 PM
To: David Rowley ; James Coleman 
Cc: pgsql-hack...@postgresql.org
Subject: RE: Use of "long" in incremental sort code

Hi

Found one more place needed to be changed(long -> int64).

Also changed the output for int64 data(Debug mode on & define EXEC_SORTDEBUG )

And, maybe there's a typo in " src\backend\executor\nodeIncrementalSort.c" as 
below.
Obviously, the ">=" is meaningless, right?

-   SO1_printf("Sorting presorted prefix tuplesort with >= %ld 
tuples\n", nTuples);
+   SO1_printf("Sorting presorted prefix tuplesort with %ld 
tuples\n", nTuples);

Please take a check at the attached patch file.

Previous disscution:
https://www.postgresql.org/message-id/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com

Best regards
Tang







RE: Use of "long" in incremental sort code

2020-10-18 Thread Tang, Haiying
Hi

Found one more place needed to be changed(long -> int64).

Also changed the output for int64 data(Debug mode on & define EXEC_SORTDEBUG )

And, maybe there's a typo in " src\backend\executor\nodeIncrementalSort.c" as 
below.
Obviously, the ">=" is meaningless, right?

-   SO1_printf("Sorting presorted prefix tuplesort with >= %ld 
tuples\n", nTuples);
+   SO1_printf("Sorting presorted prefix tuplesort with %ld 
tuples\n", nTuples);

Please take a check at the attached patch file.

Previous disscution:
https://www.postgresql.org/message-id/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com

Best regards
Tang





0001-Use-int64-instead-of-long-for-space-used-variables-a.patch
Description: 0001-Use-int64-instead-of-long-for-space-used-variables-a.patch


Re: Use of "long" in incremental sort code

2020-08-01 Thread David Rowley
On Sat, 1 Aug 2020 at 02:02, James Coleman  wrote:
> I'd previously attached a patch [1], and there seemed to be agreement
> it was reasonable (lightly so, but I also didn't see any
> disagreement); would someone be able to either commit the change or
> provide some additional feedback?

It looks fine to me. Pushed.

David

> [1]: 
> https://www.postgresql.org/message-id/CAAaqYe_Y5zwCTFCJeso7p34yJgf4khR8EaKeJtGd%3DQPudOad6A%40mail.gmail.com




Re: Use of "long" in incremental sort code

2020-07-31 Thread James Coleman
On Thu, Jul 30, 2020 at 10:12 PM David Rowley  wrote:
>
> On Fri, 3 Jul 2020 at 07:47, James Coleman  wrote:
> > Patch using int64 attached.
>
> I added this to the open items list for PG13.
>
> David

I'd previously attached a patch [1], and there seemed to be agreement
it was reasonable (lightly so, but I also didn't see any
disagreement); would someone be able to either commit the change or
provide some additional feedback?

Thanks,
James

[1]: 
https://www.postgresql.org/message-id/CAAaqYe_Y5zwCTFCJeso7p34yJgf4khR8EaKeJtGd%3DQPudOad6A%40mail.gmail.com




Re: Use of "long" in incremental sort code

2020-07-30 Thread David Rowley
On Fri, 3 Jul 2020 at 07:47, James Coleman  wrote:
> Patch using int64 attached.

I added this to the open items list for PG13.

David




Re: Use of "long" in incremental sort code

2020-07-02 Thread Peter Geoghegan
On Thu, Jul 2, 2020 at 12:47 PM James Coleman  wrote:
> But wouldn't that mean we'd get int on 32-bit systems, and since we're
> accumulating data we could go over that value in both memory and disk?
>
> My assumption is that it's preferable to have the "this run value" and
> the "total used across multiple runs" and both of those for disk and
> memory to be the same. In that case it seems we want to guarantee
> 64-bits.

I agree. There seems to be little reason to accommodate platform level
conventions, beyond making sure that everything works on less popular
or obsolete platforms.

I suppose that it's a little idiosyncratic to use int64 like this. But
it makes sense, and isn't nearly as ugly as the long thing, so I don't
think that it should really matter.

-- 
Peter Geoghegan




Re: Use of "long" in incremental sort code

2020-07-02 Thread Tom Lane
James Coleman  writes:
> On Thu, Jul 2, 2020 at 3:39 PM Tom Lane  wrote:
>> mumble ssize_t mumble

> But wouldn't that mean we'd get int on 32-bit systems, and since we're
> accumulating data we could go over that value in both memory and disk?

Certainly, a number that's meant to represent the amount of data *on disk*
shouldn't use ssize_t.  But I think it's appropriate if you want to
represent in-memory quantities while also allowing negative values.

I guess if you're expecting in-memory sizes exceeding 2GB, you might worry
that ssize_t could overflow.  I'm dubious that a 32-bit machine could get
to that, though, seeing that it's going to have other demands on its
address space.

> My assumption is that it's preferable to have the "this run value" and
> the "total used across multiple runs" and both of those for disk and
> memory to be the same. In that case it seems we want to guarantee
> 64-bits.

If you're not going to distinguish in-memory from not-in-memory, agreed.

regards, tom lane




Re: Use of "long" in incremental sort code

2020-07-02 Thread Peter Geoghegan
On Thu, Jul 2, 2020 at 12:44 PM Tom Lane  wrote:
> > That's from POSIX, though. I imagine MSVC won't be happy (surprise!).
>
> We've got quite a few uses of it already, so apparently it's fine.

Oh, looks like we have a compatibility hack for MSVC within
win32_port.h, where ssize_t is typedef'd to __int64. I didn't realize
that it was okay to use ssize_t.

-- 
Peter Geoghegan




Re: Use of "long" in incremental sort code

2020-07-02 Thread James Coleman
On Thu, Jul 2, 2020 at 3:39 PM Tom Lane  wrote:
>
> Peter Geoghegan  writes:
> > On Thu, Jul 2, 2020 at 10:53 AM James Coleman  wrote:
> >> Do you think it's reasonable to use int64 across the board for memory
> >> and disk space numbers then? If so, I can update the patch.
>
> > Using int64 as a replacement for long is the safest general strategy,
>
> mumble ssize_t mumble

But wouldn't that mean we'd get int on 32-bit systems, and since we're
accumulating data we could go over that value in both memory and disk?

My assumption is that it's preferable to have the "this run value" and
the "total used across multiple runs" and both of those for disk and
memory to be the same. In that case it seems we want to guarantee
64-bits.

Patch using int64 attached.

James


v2-0001-Use-int64-instead-of-long-for-space-used-variable.patch
Description: Binary data


Re: Use of "long" in incremental sort code

2020-07-02 Thread Tom Lane
Peter Geoghegan  writes:
> On Thu, Jul 2, 2020 at 12:39 PM Tom Lane  wrote:
>> mumble ssize_t mumble

> That's from POSIX, though. I imagine MSVC won't be happy (surprise!).

We've got quite a few uses of it already, so apparently it's fine.

regards, tom lane




Re: Use of "long" in incremental sort code

2020-07-02 Thread Peter Geoghegan
On Thu, Jul 2, 2020 at 12:39 PM Tom Lane  wrote:
> mumble ssize_t mumble

That's from POSIX, though. I imagine MSVC won't be happy (surprise!).

-- 
Peter Geoghegan




Re: Use of "long" in incremental sort code

2020-07-02 Thread Tom Lane
Peter Geoghegan  writes:
> On Thu, Jul 2, 2020 at 10:53 AM James Coleman  wrote:
>> Do you think it's reasonable to use int64 across the board for memory
>> and disk space numbers then? If so, I can update the patch.

> Using int64 as a replacement for long is the safest general strategy,

mumble ssize_t mumble

regards, tom lane




Re: Use of "long" in incremental sort code

2020-07-02 Thread Peter Geoghegan
On Thu, Jul 2, 2020 at 10:53 AM James Coleman  wrote:
> Do you think it's reasonable to use int64 across the board for memory
> and disk space numbers then? If so, I can update the patch.

Using int64 as a replacement for long is the safest general strategy,
and so ISTM that it might be worth doing that even in cases where it
isn't clearly necessary. After all, any code that uses long must have
been written with the assumption that that was the same thing as
int64, at least on most platforms.

There is nothing wrong with using Size/size_t, and doing so is often
slightly clearer. But it's no drop-in replacement for long.

-- 
Peter Geoghegan




Re: Use of "long" in incremental sort code

2020-07-02 Thread James Coleman
On Thu, Jul 2, 2020 at 1:36 PM Peter Geoghegan  wrote:
>
> On Mon, Jun 29, 2020 at 9:13 PM David Rowley  wrote:
> > I noticed the incremental sort code makes use of the long datatype a
> > few times, e.g in TuplesortInstrumentation and
> > IncrementalSortGroupInfo.
>
> I agree that long is terrible, and should generally be avoided.
>
> > Maybe Size would be better for the in-memory fields and uint64 for the
> > on-disk fields?
>
> FWIW we have to use int64 for the in-memory tuplesort.c fields. This
> is because it must be possible for the fields to have negative values
> in the context of tuplesort. If there is going to be a general rule
> for in-memory fields, then ISTM that it'll have to be "use int64".
>
> logtape.c uses long for on-disk fields. It also relies on negative
> values, albeit to a fairly limited degree (it uses -1 as a magic
> value).

Do you think it's reasonable to use int64 across the board for memory
and disk space numbers then? If so, I can update the patch.

James




Re: Use of "long" in incremental sort code

2020-07-02 Thread Peter Geoghegan
On Mon, Jun 29, 2020 at 9:13 PM David Rowley  wrote:
> I noticed the incremental sort code makes use of the long datatype a
> few times, e.g in TuplesortInstrumentation and
> IncrementalSortGroupInfo.

I agree that long is terrible, and should generally be avoided.

> Maybe Size would be better for the in-memory fields and uint64 for the
> on-disk fields?

FWIW we have to use int64 for the in-memory tuplesort.c fields. This
is because it must be possible for the fields to have negative values
in the context of tuplesort. If there is going to be a general rule
for in-memory fields, then ISTM that it'll have to be "use int64".

logtape.c uses long for on-disk fields. It also relies on negative
values, albeit to a fairly limited degree (it uses -1 as a magic
value).

-- 
Peter Geoghegan




Re: Use of "long" in incremental sort code

2020-07-02 Thread James Coleman
On Tue, Jun 30, 2020 at 7:21 AM Peter Eisentraut
 wrote:
>
> On 2020-06-30 06:24, David Rowley wrote:
> > On Tue, 30 Jun 2020 at 16:20, Tom Lane  wrote:
> >> There is a fairly widespread issue that memory-size-related GUCs and
> >> suchlike variables are limited to represent sizes that fit in a "long".
> >> Although Win64 is the *only* platform where that's an issue, maybe
> >> it's worth doing something about.  But we shouldn't just fix the sort
> >> code, if we do do something.
> >>
> >> (IOW, I don't agree with doing a fix that doesn't also fix work_mem.)
> >
> > I raised it mostly because this new-to-PG13-code is making the problem 
> > worse.
>
> Yeah, we recently got rid of a bunch of inappropriate use of long, so it
> seems reasonable to make this new code follow that.

I've attached a patch to make this change but with one tweak: I
decided to use unint64 for both memory and disk (rather than Size in
some cases) since we aggregated across multiple runs and have shared
code that deals with both values.

James


v1-0001-Use-unint64-instead-of-long-for-space-used-variab.patch
Description: Binary data


Re: Use of "long" in incremental sort code

2020-06-30 Thread Peter Eisentraut

On 2020-06-30 06:24, David Rowley wrote:

On Tue, 30 Jun 2020 at 16:20, Tom Lane  wrote:

There is a fairly widespread issue that memory-size-related GUCs and
suchlike variables are limited to represent sizes that fit in a "long".
Although Win64 is the *only* platform where that's an issue, maybe
it's worth doing something about.  But we shouldn't just fix the sort
code, if we do do something.

(IOW, I don't agree with doing a fix that doesn't also fix work_mem.)


I raised it mostly because this new-to-PG13-code is making the problem worse.


Yeah, we recently got rid of a bunch of inappropriate use of long, so it 
seems reasonable to make this new code follow that.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Use of "long" in incremental sort code

2020-06-29 Thread David Rowley
On Tue, 30 Jun 2020 at 16:20, Tom Lane  wrote:
> There is a fairly widespread issue that memory-size-related GUCs and
> suchlike variables are limited to represent sizes that fit in a "long".
> Although Win64 is the *only* platform where that's an issue, maybe
> it's worth doing something about.  But we shouldn't just fix the sort
> code, if we do do something.
>
> (IOW, I don't agree with doing a fix that doesn't also fix work_mem.)

I raised it mostly because this new-to-PG13-code is making the problem worse.

If we're not going to change the in-memory fields, then shouldn't we
at least change the ones for disk space tracking?

David




Re: Use of "long" in incremental sort code

2020-06-29 Thread Tom Lane
David Rowley  writes:
> I noticed the incremental sort code makes use of the long datatype a
> few times, e.g in TuplesortInstrumentation and
> IncrementalSortGroupInfo.  (64-bit windows machines have sizeof(long)
> == 4).  I understand that the values are in kilobytes and it would
> take 2TB to cause them to wrap. Never-the-less, I think it would be
> better to choose a better-suited type. work_mem is still limited to
> 2GB on 64-bit Windows machines, so perhaps there's some argument that
> it does not matter about fields that related to in-memory stuff, but
> the on-disk fields are wrong.  The in-memory fields likely raise the
> bar further for fixing the 2GB work_mem limit on Windows.

> Maybe Size would be better for the in-memory fields and uint64 for the
> on-disk fields?

There is a fairly widespread issue that memory-size-related GUCs and
suchlike variables are limited to represent sizes that fit in a "long".
Although Win64 is the *only* platform where that's an issue, maybe
it's worth doing something about.  But we shouldn't just fix the sort
code, if we do do something.

(IOW, I don't agree with doing a fix that doesn't also fix work_mem.)

regards, tom lane