Re: Fix formatting of Interval output
Joseph Koshakow writes: > On Sat, Feb 19, 2022 at 12:00 PM Tom Lane wrote: >> I think that messing with struct pg_tm might have too many side-effects. >> However, pg_tm isn't all that well adapted to intervals in the first >> place, so it'd make sense to make a new struct specifically for interval >> decoding. > Yeah that's a good idea, pg_tm is used all over the place. Is there a > reason we need an intermediate structure to convert to and from? > We could just convert directly to an Interval in DecodeInterval and > print directly from an Interval in EncodeInterval. Yeah, I thought about that too, but if you look at the other callers of interval2tm, you'll see this same set of issues. I'm inclined to keep the current code structure and just fix the data structure. Although interval2tm isn't *that* big, I don't think four copies of its logic would be an improvement. > Also I originally created a separate thread and patch because I > thought this wouldn't be directly related to my other patch, > https://commitfest.postgresql.org/37/3541/. However I think with these > discussed changes it's directly related. So I think it's best to close > this thread and patch and copy this conversion to the other thread. Agreed. regards, tom lane
Re: Fix formatting of Interval output
On Sat, Feb 19, 2022 at 12:00 PM Tom Lane wrote: > I think that messing with struct pg_tm might have too many side-effects. > However, pg_tm isn't all that well adapted to intervals in the first > place, so it'd make sense to make a new struct specifically for interval > decoding. Yeah that's a good idea, pg_tm is used all over the place. Is there a reason we need an intermediate structure to convert to and from? We could just convert directly to an Interval in DecodeInterval and print directly from an Interval in EncodeInterval. Also I originally created a separate thread and patch because I thought this wouldn't be directly related to my other patch, https://commitfest.postgresql.org/37/3541/. However I think with these discussed changes it's directly related. So I think it's best to close this thread and patch and copy this conversion to the other thread. Sorry for fragmenting the conversation. - Joe Koshakow
Re: Fix formatting of Interval output
Joseph Koshakow writes: > On Fri, Feb 18, 2022 at 11:44 PM Tom Lane wrote: >> I wonder if the most reasonable fix would be to start using int64 >> instead of int arithmetic for the values that are potentially large. >> I doubt that we'd be taking much of a performance hit on modern >> hardware. > That's an interesting idea. I've always assumed that the range of the > time fields of Intervals was 2147483647 hours 59 minutes > 59.99 seconds to -2147483648 hours -59 minutes > -59.99 seconds. However the only reason that we can't support > the full range of int64 microseconds is because the struct pg_tm fields > are only ints. If we increase those fields to int64 then we'd be able to > support the full int64 range for microseconds as well as implicitly fix > some of the overflow issues in DecodeInterval and EncodeInterval. I think that messing with struct pg_tm might have too many side-effects. However, pg_tm isn't all that well adapted to intervals in the first place, so it'd make sense to make a new struct specifically for interval decoding. regards, tom lane
Re: Fix formatting of Interval output
On Fri, Feb 18, 2022 at 11:44 PM Tom Lane wrote: > > Joseph Koshakow writes: > > When formatting the output of an Interval, we call abs() on the hours > > field of the Interval. Calling abs(INT_MIN) returns back INT_MIN > > causing the output to contain two '-' characters. The attached patch > > fixes that issue by special casing INT_MIN hours. > > Good catch, but it looks to me like three out of the four formats in > EncodeInterval have variants of this problem --- there are assumptions > throughout that code that we can compute "-x" or "abs(x)" without > fear. Not much point in fixing only one symptom. > > Also, I notice that there's an overflow hazard upstream of here, > in interval2tm: > > regression=# select interval '214748364 hours' * 11; > ERROR: interval out of range > regression=# \errverbose > ERROR: 22008: interval out of range > LOCATION: interval2tm, timestamp.c:1982 > > There's no good excuse for not being able to print a value that > we computed successfully. > > I wonder if the most reasonable fix would be to start using int64 > instead of int arithmetic for the values that are potentially large. > I doubt that we'd be taking much of a performance hit on modern > hardware. > > regards, tom lane That's an interesting idea. I've always assumed that the range of the time fields of Intervals was 2147483647 hours 59 minutes 59.99 seconds to -2147483648 hours -59 minutes -59.99 seconds. However the only reason that we can't support the full range of int64 microseconds is because the struct pg_tm fields are only ints. If we increase those fields to int64 then we'd be able to support the full int64 range for microseconds as well as implicitly fix some of the overflow issues in DecodeInterval and EncodeInterval. - Joe Koshakow
Re: Fix formatting of Interval output
Joseph Koshakow writes: > When formatting the output of an Interval, we call abs() on the hours > field of the Interval. Calling abs(INT_MIN) returns back INT_MIN > causing the output to contain two '-' characters. The attached patch > fixes that issue by special casing INT_MIN hours. Good catch, but it looks to me like three out of the four formats in EncodeInterval have variants of this problem --- there are assumptions throughout that code that we can compute "-x" or "abs(x)" without fear. Not much point in fixing only one symptom. Also, I notice that there's an overflow hazard upstream of here, in interval2tm: regression=# select interval '214748364 hours' * 11; ERROR: interval out of range regression=# \errverbose ERROR: 22008: interval out of range LOCATION: interval2tm, timestamp.c:1982 There's no good excuse for not being able to print a value that we computed successfully. I wonder if the most reasonable fix would be to start using int64 instead of int arithmetic for the values that are potentially large. I doubt that we'd be taking much of a performance hit on modern hardware. regards, tom lane
Fix formatting of Interval output
When formatting the output of an Interval, we call abs() on the hours field of the Interval. Calling abs(INT_MIN) returns back INT_MIN causing the output to contain two '-' characters. The attached patch fixes that issue by special casing INT_MIN hours. Here is an example of the issue: postgres=# SELECT INTERVAL '-2147483648 hrs'; interval --2147483648:00:00 (1 row) Cheers, Joe Koshakow From 720f0903c71a428695672aa866fc8fb0146a472e Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Thu, 17 Feb 2022 21:57:48 -0500 Subject: [PATCH] Fix Interval formatting with INT_MIN hours When formatting the output of an Interval, we call abs() on the hours field of the Interval. Calling abs(INT_MIN) returns back INT_MIN causing the output to contain two '-' characters. This commit fixes that issue by special casing INT_MIN hours. --- src/backend/utils/adt/datetime.c | 7 +-- src/test/regress/expected/interval.out | 8 src/test/regress/sql/interval.sql | 4 3 files changed, 17 insertions(+), 2 deletions(-) diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c index 7926258c06..368df715e6 100644 --- a/src/backend/utils/adt/datetime.c +++ b/src/backend/utils/adt/datetime.c @@ -4376,11 +4376,14 @@ EncodeInterval(struct pg_tm *tm, fsec_t fsec, int style, char *str) if (is_zero || hour != 0 || min != 0 || sec != 0 || fsec != 0) { bool minus = (hour < 0 || min < 0 || sec < 0 || fsec < 0); +// need to special case hours being INT_MIN because you can't call abs(INT_MIN) +bool int_min_hours = (hour == INT_MIN); sprintf(cp, "%s%s%02d:%02d:", is_zero ? "" : " ", - (minus ? "-" : (is_before ? "+" : "")), - abs(hour), abs(min)); + (minus ? (int_min_hours ? "" : "-") : (is_before ? "+" : "")), + (int_min_hours ? hour : abs(hour)), + abs(min)); cp += strlen(cp); cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true); *cp = '\0'; diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out index accd4a7d90..bb0d59ccfa 100644 --- a/src/test/regress/expected/interval.out +++ b/src/test/regress/expected/interval.out @@ -1043,3 +1043,11 @@ SELECT extract(epoch from interval '10 days'); 864000.00 (1 row) +-- test that INT_MIN number of hours is formatted properly +SET IntervalStyle to postgres; +SELECT INTERVAL '-2147483648 hrs'; + interval +--- + -2147483648:00:00 +(1 row) + diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql index 6d532398bd..f1fe0a5ea0 100644 --- a/src/test/regress/sql/interval.sql +++ b/src/test/regress/sql/interval.sql @@ -355,3 +355,7 @@ SELECT f1, -- internal overflow test case SELECT extract(epoch from interval '10 days'); + +-- test that INT_MIN number of hours is formatted properly +SET IntervalStyle to postgres; +SELECT INTERVAL '-2147483648 hrs'; -- 2.25.1