Re: [HACKERS] API bug in DetermineTimeZoneOffset()

2013-11-01 Thread Robert Haas
On Fri, Nov 1, 2013 at 10:50 AM, Tom Lane  wrote:
> I wrote:
>> The second attached patch, to be applied after the first, removes the
>> existing checks of HasCTZSet in the backend.  The only visible effect of
>> this, AFAICT, is that to_char's TZ format spec now delivers something
>> useful instead of an empty string when a brute-force timezone is in use.
>> I could be persuaded either way as to whether to back-patch this part.
>> From one standpoint, this to_char behavioral change is clearly a bug fix
>> --- but it's barely possible that somebody out there thought that
>> returning an empty string for TZ was actually the intended/desirable
>> behavior.
>
> Any opinions about whether to back-patch this part or not?  It seems
> like a bug fix, but on the other hand, I don't recall any complaints
> from the field about to_char's TZ spec not working with brute-force zones.
> So maybe the prudent thing is to leave well enough alone.

I vote for leaving it alone until somebody complains.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API bug in DetermineTimeZoneOffset()

2013-11-01 Thread Tom Lane
I wrote:
> The second attached patch, to be applied after the first, removes the
> existing checks of HasCTZSet in the backend.  The only visible effect of
> this, AFAICT, is that to_char's TZ format spec now delivers something
> useful instead of an empty string when a brute-force timezone is in use.
> I could be persuaded either way as to whether to back-patch this part.
> From one standpoint, this to_char behavioral change is clearly a bug fix
> --- but it's barely possible that somebody out there thought that
> returning an empty string for TZ was actually the intended/desirable
> behavior.

Any opinions about whether to back-patch this part or not?  It seems
like a bug fix, but on the other hand, I don't recall any complaints
from the field about to_char's TZ spec not working with brute-force zones.
So maybe the prudent thing is to leave well enough alone.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API bug in DetermineTimeZoneOffset()

2013-10-31 Thread Tom Lane
I wrote:
> So what I'm thinking we should do is internally translate SET TIMEZONE
> with an interval value into a POSIX-style zone name in the above format,
> and then just flush HasCTZSet/CTimeZone and all the special case logic
> around them.

Attached is a set of proposed patches for this.

> 1. The zone abbreviation printed by timeofday(), to_char(), etc would now
> become "GMT+-hh[:mm[:ss]]" (or whatever we choose to stick in the angle
> brackets, but that's what I'm thinking).

After experimentation it seems that the best idea is to make the displayed
abbreviation be just "+-hh[:mm[:ss]]", using ISO sign convention.  This
avoids changes in behavior in places where the code already prints
something reasonable for the timezone name.

The first attached patch just causes session_timezone to point to a pg_tz
struct defined along these lines whenever HasCTZSet is true, plus removing
the very bogus lookaside check in DetermineTimeZoneOffset().  As exhibited
in the added regression tests, this fixes the behavior complained of in
bug #8572.  It also makes timeofday() return self-consistent data when a
brute-force timezone is in use.  AFAICT it doesn't have any other visible
effects, and it shouldn't break any third-party modules.  So I propose
back-patching this to all active branches.

The second attached patch, to be applied after the first, removes the
existing checks of HasCTZSet in the backend.  The only visible effect of
this, AFAICT, is that to_char's TZ format spec now delivers something
useful instead of an empty string when a brute-force timezone is in use.
I could be persuaded either way as to whether to back-patch this part.
>From one standpoint, this to_char behavioral change is clearly a bug fix
--- but it's barely possible that somebody out there thought that
returning an empty string for TZ was actually the intended/desirable
behavior.

The third patch, to be applied last, nukes HasCTZSet/CTimeZone entirely.
The only consequence I can see at SQL level is that SHOW TIMEZONE will
return a POSIX zone spec instead of an interval value for a brute-force
zone setting (cf change in regression test output).  Since the bare
interval value isn't actually something that SET TIMEZONE will accept,
this is clearly a step forward in self-consistency, but it's not something
I would propose to back-patch.  In any case we probably don't want to
remove these variables in back branches, on the off chance that some
third-party code is looking at them.

Comments?

regards, tom lane

diff --git a/src/backend/commands/variable.c b/src/backend/commands/variable.c
index e647ac0..b6af6e7 100644
*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
*** check_timezone(char **newval, void **ext
*** 327,332 
--- 327,333 
  #else
  		myextra.CTimeZone = -interval->time;
  #endif
+ 		myextra.session_timezone = pg_tzset_offset(myextra.CTimeZone);
  		myextra.HasCTZSet = true;
  
  		pfree(interval);
*** check_timezone(char **newval, void **ext
*** 341,346 
--- 342,348 
  		{
  			/* Here we change from SQL to Unix sign convention */
  			myextra.CTimeZone = -hours * SECS_PER_HOUR;
+ 			myextra.session_timezone = pg_tzset_offset(myextra.CTimeZone);
  			myextra.HasCTZSet = true;
  		}
  		else
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 48bf3db..1b8f109 100644
*** a/src/backend/utils/adt/datetime.c
--- b/src/backend/utils/adt/datetime.c
*** DetermineTimeZoneOffset(struct pg_tm * t
*** 1465,1476 
  after_isdst;
  	int			res;
  
- 	if (tzp == session_timezone && HasCTZSet)
- 	{
- 		tm->tm_isdst = 0;		/* for lack of a better idea */
- 		return CTimeZone;
- 	}
- 
  	/*
  	 * First, generate the pg_time_t value corresponding to the given
  	 * y/m/d/h/m/s taken as GMT time.  If this overflows, punt and decide the
--- 1465,1470 
diff --git a/src/include/pgtime.h b/src/include/pgtime.h
index 57af51e..73a0ed2 100644
*** a/src/include/pgtime.h
--- b/src/include/pgtime.h
*** extern pg_tz *log_timezone;
*** 68,73 
--- 68,74 
  
  extern void pg_timezone_initialize(void);
  extern pg_tz *pg_tzset(const char *tzname);
+ extern pg_tz *pg_tzset_offset(long gmtoffset);
  
  extern pg_tzenum *pg_tzenumerate_start(void);
  extern pg_tz *pg_tzenumerate_next(pg_tzenum *dir);
diff --git a/src/test/regress/expected/horology.out b/src/test/regress/expected/horology.out
index 553a158..2666dee 100644
*** a/src/test/regress/expected/horology.out
--- b/src/test/regress/expected/horology.out
*** DETAIL:  Value must be an integer.
*** 2935,2937 
--- 2935,2967 
  SELECT to_timestamp('100', 'FM');
  ERROR:  value for "" in source string is out of range
  DETAIL:  Value must be in the range -2147483648 to 2147483647.
+ --
+ -- Check behavior with SQL-style fixed-GMT-offset time zone (cf bug #8572)
+ --
+ SET TIME ZONE 'America/New_York';
+ SET TIME ZONE

Re: [HACKERS] API bug in DetermineTimeZoneOffset()

2013-10-31 Thread Robert Haas
On Thu, Oct 31, 2013 at 2:54 PM, Tom Lane  wrote:
> I wrote:
>> Actually, it strikes me there might be another way to do this, which is to
>> get rid of HasCTZSet/CTimeZone entirely in favor of consing up some pseudo
>> pg_tz structure that represents the desired semantics when we want a
>> "brute force" setting.
>
> After some study of the pgtz code, it turns out this is absolutely trivial
> to do by using the POSIX syntax for time zone names.  You can do something
> like
>
> set timezone to '-01:30';
>
> where the stuff between angle brackets is pretty much arbitrary and is
> used as the zone abbreviation for printout purposes.  So for example,
> after the above I get
>
> regression=# select now(), timeofday();
>now| timeofday
> --+---
>  2013-10-31 20:02:54.130828+01:30 | Thu Oct 31 20:02:54.130988 2013 GMT+01:30
> (1 row)
>
> (It's worth noting that timeofday() is another operation that doesn't
> currently give sane results when HasCTZSet is true --- it prints a time
> that matches the brute force zone, with a zone abbreviation that doesn't.)
>
> So what I'm thinking we should do is internally translate SET TIMEZONE
> with an interval value into a POSIX-style zone name in the above format,
> and then just flush HasCTZSet/CTimeZone and all the special case logic
> around them.
>
> This would create a couple of user-visible incompatibilities:
>
> 1. The zone abbreviation printed by timeofday(), to_char(), etc would now
> become "GMT+-hh[:mm[:ss]]" (or whatever we choose to stick in the angle
> brackets, but that's what I'm thinking).  However, since those
> abbreviation printouts were just completely wrong before, this doesn't
> seem like something people could complain about.
>
> 2. The value printed by SHOW TIMEZONE would change format.  Now you
> get
>
> regression=# set time zone '-1.5';
> SET
> regression=# show time zone;
>  TimeZone
> ---
>  -01:30:00
> (1 row)
>
> and what I'm proposing is to let it print the POSIX zone name, which
> in this case would be +01:30 (note the sign incompatibility
> between POSIX and ISO ...).  If anybody is sufficiently bothered by this
> then we could add a kludge to show_timezone to replicate the old
> printout, but I doubt it's a big deal.  Again, we know that very darn
> few people are using the brute-force zone feature at all (else we'd have
> heard complaints sooner), so how many apps are likely to care about the
> exact format of SHOW TIME ZONE output for this case?
>
> An intermediate position would be to include the printout kludge in
> the back-branch patches and then take it out in HEAD, so that the
> change in printout behavior only appears as of 9.4.

I think it's pretty important to avoid user-visible changes in the
back-branches, except to the minimum extent necessary to fix overtly
wrong behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API bug in DetermineTimeZoneOffset()

2013-10-31 Thread Tom Lane
I wrote:
> Actually, it strikes me there might be another way to do this, which is to
> get rid of HasCTZSet/CTimeZone entirely in favor of consing up some pseudo
> pg_tz structure that represents the desired semantics when we want a
> "brute force" setting.

After some study of the pgtz code, it turns out this is absolutely trivial
to do by using the POSIX syntax for time zone names.  You can do something
like

set timezone to '-01:30';

where the stuff between angle brackets is pretty much arbitrary and is
used as the zone abbreviation for printout purposes.  So for example,
after the above I get

regression=# select now(), timeofday();
   now| timeofday 
--+---
 2013-10-31 20:02:54.130828+01:30 | Thu Oct 31 20:02:54.130988 2013 GMT+01:30
(1 row)

(It's worth noting that timeofday() is another operation that doesn't
currently give sane results when HasCTZSet is true --- it prints a time
that matches the brute force zone, with a zone abbreviation that doesn't.)

So what I'm thinking we should do is internally translate SET TIMEZONE
with an interval value into a POSIX-style zone name in the above format,
and then just flush HasCTZSet/CTimeZone and all the special case logic
around them.

This would create a couple of user-visible incompatibilities:

1. The zone abbreviation printed by timeofday(), to_char(), etc would now
become "GMT+-hh[:mm[:ss]]" (or whatever we choose to stick in the angle
brackets, but that's what I'm thinking).  However, since those
abbreviation printouts were just completely wrong before, this doesn't
seem like something people could complain about.

2. The value printed by SHOW TIMEZONE would change format.  Now you
get

regression=# set time zone '-1.5';
SET
regression=# show time zone;
 TimeZone  
---
 -01:30:00
(1 row)

and what I'm proposing is to let it print the POSIX zone name, which
in this case would be +01:30 (note the sign incompatibility
between POSIX and ISO ...).  If anybody is sufficiently bothered by this
then we could add a kludge to show_timezone to replicate the old
printout, but I doubt it's a big deal.  Again, we know that very darn
few people are using the brute-force zone feature at all (else we'd have
heard complaints sooner), so how many apps are likely to care about the
exact format of SHOW TIME ZONE output for this case?

An intermediate position would be to include the printout kludge in
the back-branch patches and then take it out in HEAD, so that the
change in printout behavior only appears as of 9.4.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API bug in DetermineTimeZoneOffset()

2013-10-31 Thread Tom Lane
Alvaro Herrera  writes:
> ... So, perhaps, instead of having the code
> check session_timezone explicitely, have the caller pass it down.

> This consideration probably shouldn't drive a backpatchable fix,
> however.

Well, it's impossible to do that in a back-patchable way, I'm afraid,
because wouldn't you have to pass all three of
session_timezone/HasCTZSet/CTimeZone?

Actually, it strikes me there might be another way to do this, which is to
get rid of HasCTZSet/CTimeZone entirely in favor of consing up some pseudo
pg_tz structure that represents the desired semantics when we want a
"brute force" setting.  I think this code is all leftover from a time when
we used the libc timezone routines and did not have a cozy relationship
with the tz representation --- but that's ancient history now.  Let me
go look at that idea.  It might break external code that looks directly
at HasCTZSet/CTimeZone, but I'll bet there isn't any.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API bug in DetermineTimeZoneOffset()

2013-10-31 Thread Alvaro Herrera
Tom Lane wrote:

> I think that we should change this function to follow the API convention
> used by timestamp2tm(), namely that one passes a NULL pointer if one
> would like session_timezone/HasCTZSet/CTimeZone to control the result.
> A non-null pointer should mean to use that zone specification, period.

It would be most useful if the API of these functions did not rely on
global variables in this way, at least not in GUC variables.  (I think
the HasCTZSet/CTimeZone business is not as bad as session_timezone).
Doing that would ease future work to move all this code to src/common/
from where the ecpg implementation could also pick it up.  (AFAIR I
recently noticed that there's also frontend code that wants to do
datetime processing, pg_basebackup maybe?).  Last I checked, those
global variables were problematic, and the GUC variables were the
hardest to handle of the bunch.  So, perhaps, instead of having the code
check session_timezone explicitely, have the caller pass it down.

This consideration probably shouldn't drive a backpatchable fix,
however.

> This bug is of long standing, so I'm inclined to back-patch the fix.
> Now, that's possibly problematic if there are any third-party modules
> calling DetermineTimeZoneOffset and passing session_timezone, because
> it would mean that they'd stop honoring "brute force" zone settings.
> However, I suspect that this feature is practically unused in the field,
> else we'd have heard complaints before now.

Yep, sounds plausible.

Thanks,

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers