[I sent this (off-list by accident) earlier.   Apologies for any formatting
screwups this retransmit might contain.]

    Date:        Tue, 3 Mar 2026 10:51:11 -0800
    From:        Paul Eggert via tz <[email protected]>
    Message-ID:  <[email protected]>


   | No matter what abbreviation we use, there will be trouble. The obvious 
  | abbreviation PT does not conform to the POSIX standard

Sorry, but that is utter nonsense.

The (current) POSIX standard (the first to admit the existence of tzdb
at all, despite it being the dominant timezone database for decades now)
provides 3 formats for the TZ environment variable (XBD 8.3), which is
the only place any of this is specified at all, as best I can tell:

The first, where the first character is ':' and everything else is
unspecified.

The second, which is in the old SysV type TZ env var format, and where:

   std and dst  Indicate no less than three, nor more than {TZNAME_MAX},
                bytes that are the designation for the standard (std) or the
                Daylight Saving (dst) timezone.

where indeed PT would not be usable, but we have no reason to care about
that -  those who have ancient systems and need to use that format would
care, but that is set and used on a system by system basis, and each
system owner (each user in fact) can decide for themselves what to use.
(PDT, PST, CPT, BCT, ...)

And the third, which POSIX calls (informally anyway)

        "A format specifying a geographical timezone or a special timezone."

which allows use of tzdata (or something more or less equivalent, POSIX
doesn't specify) which applies when neither of the previous two apply
(when the TZ variable neither starts with ':' nor is in the particular
format defined for the second variant, which would be the case if, for
example, we had "TZ=PT7").
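
As a rough illustration of how a TZ value falls into one of those three
formats, here is a minimal sketch in C.   The names tz_format and
classify_tz are made up for this example, and the check is only a
heuristic: it looks at the leading "std offset" part of the string, not at
the whole second-format grammar, and it ignores the {TZNAME_MAX} upper
bound.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical labels for the three TZ formats of XBD 8.3. */
enum tz_format {
    TZ_IMPL_DEFINED = 1,   /* starts with ':' */
    TZ_POSIX_RULE   = 2,   /* old SysV style "std offset [dst ...]" */
    TZ_GEOGRAPHICAL = 3    /* anything else */
};

/* Heuristic only: inspects the leading designation and the character
 * that follows it.  It does not validate the rest of the string. */
static enum tz_format classify_tz(const char *tz)
{
    size_t len = 0;
    const char *p = tz;

    if (*p == ':')
        return TZ_IMPL_DEFINED;

    if (*p == '<') {
        /* Quoted designation: alphanumerics, '+' and '-' inside <...>. */
        p++;
        while (isalnum((unsigned char)*p) || *p == '+' || *p == '-') {
            p++;
            len++;
        }
        if (*p != '>')
            return TZ_GEOGRAPHICAL;
        p++;
    } else {
        /* Unquoted designation: alphabetic characters only. */
        while (isalpha((unsigned char)*p)) {
            p++;
            len++;
        }
    }

    /* The second format needs at least three bytes of designation ... */
    if (len < 3)
        return TZ_GEOGRAPHICAL;

    /* ... followed by an offset: an optional sign, then a digit. */
    if (*p == '+' || *p == '-')
        p++;
    return isdigit((unsigned char)*p) ? TZ_POSIX_RULE : TZ_GEOGRAPHICAL;
}
```

With this sketch "PST8PDT" is classified as the second format, while "PT7"
and "America/Los_Angeles" both land in the third.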

To use that, several requirements are imposed:

  The data for each geographical timezone shall include:

     * The offset from Coordinated Universal Time of the timezone's standard
       time.

     * If Daylight Saving Time (DST) is, or has historically been, observed: a
       method to discover the dates and times of transitions to and from DST
       and the offset from Coordinated Universal Time during periods when
       DST was, is, or is predicted to be, in effect.

     * The timezone names for standard time (std) and, if observed, for DST
       (dst) to be used by tzset( ). These shall each contain no more than
       {TZNAME_MAX} bytes.

Note the final sentence, "These shall each contain no more than {TZNAME_MAX}
bytes".   No mention of a minimum length at all, they could be 0 bytes if
no name for the timezone is appropriate (which it actually could be in much
of the world, where all there is is "the time").

The restriction to no less than 3 bytes in the 2nd format is simply to
avoid breaking parsers for that string, which have existed with that
assumption for years.   But no-one much uses that format any more, and
certainly no-one we care about.   Even the use of a POSIX TZ lookalike
string to specify future transitions off into infinity, at the end of the
known (or anticipated) summer time changes (or lack thereof), is irrelevant
to that: that string isn't the value of the TZ variable, and (like the rest
of what is in a tzdata zone file) isn't specified by POSIX at all, just as
long as the data provides the functionality above.

XBD 3 (Definitions) doesn't, as best I can tell, say anything at all
about timezones or abbreviations for any names thereof.

XBD 7.3.5 (locales, and LC_TIME) also says nothing I can see about
timezone names, or any length restrictions on any abbreviations that
might exist.

The spec in XBD 14 for <time.h> adds tm_zone, it is a char *, and is a
"Timezone abbreviation", and (aside from some specifics about the lifetime
of its value) that's all that is said about it.   It could be anything, no
length restrictions at all, no syntax requirements either.

Similarly, the specs in XSH 3 for localtime and asctime (and ctime which is
just asctime plus a bit) say nothing about the tm_zone field (asctime nothing
at all, it isn't used there) - localtime just says of tm_zone:

     If the tm structure member tm_zone is accessed after the value
     of TZ is subsequently modified, the behaviour is undefined.

That's it.   The tm struct is required to have the relationship with the
input time_t as defined in XBD 4.19 - which says nothing at all about what
tm_zone should be set to.

The spec for gmtime() says nothing about the tm_zone field, however for
gmtime_r():

        Upon successful completion, gmtime_r( ) shall return the address
        of the structure pointed to by the argument result. The structure's
        tm_zone member shall be set to a pointer to the string "UTC",
        which shall have static storage duration.

which is the most specific definition of tm_zone's value I can find anywhere,
but is certainly not relevant here.

The mktime() function takes a struct tm, but doesn't use the tm_zone field
at all, except to (eventually, in the case of a successful conversion) set
it (and all the rest of the members of the struct tm) to what localtime()
would return given the result returned from mktime().

The strftime() format %Z accesses the tm_zone field

        Z Replaced by the timezone name or abbreviation, or by no bytes
          if no timezone information exists. [tm_isdst, tm_zone ]

That says nothing about any length requirement, and even explicitly allows
for the zone name to be absent (no bytes).
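
Because %Z may legitimately produce no bytes, a caller cannot tell an empty
abbreviation from a failure by looking at strftime()'s return value alone
(it is 0 in both cases).   A minimal sketch of one common workaround
follows: put a sentinel character in front of %Z and strip it afterwards.
The name zone_abbrev and the 256 byte scratch buffer are assumptions made
for this example, nothing in POSIX or tzcode.

```c
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy the %Z expansion for *tm into buf.
 * Returns the abbreviation's length (0 for an empty abbreviation),
 * or -1 if strftime() failed or buf is too small. */
static int zone_abbrev(const struct tm *tm, char *buf, size_t size)
{
    char tmp[256];   /* assumed to be large enough for any abbreviation */
    size_t n = strftime(tmp, sizeof tmp, "x%Z", tm);

    /* The leading 'x' makes every successful result at least 1 byte
     * long, so n == 0 can only mean that strftime() failed. */
    if (n == 0 || n > size)
        return -1;

    /* Skip the sentinel; copy n - 1 bytes plus the terminating NUL. */
    memcpy(buf, tmp + 1, n);
    return (int)(n - 1);
}
```

Nothing in it depends on the abbreviation having any particular length.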

The strptime() function %Z conversion is specified as:

        Z The timezone name. If this name matches the name pointed to by
          tzname[1], and the names pointed to by tzname[0] and tzname[1]
          differ, then the tm_isdst member of the tm structure pointed to by
          tm shall be set to 1. Otherwise, if this name matches the name
          pointed to by tzname[0] then the tm_isdst member of the tm
          structure pointed to by tm shall be set to 0. The tm_zone and
          tm_gmtoff members of the structure may also be set in an unspecified
          manner. Members other than tm_isdst, tm_zone, and tm_gmtoff may be
          affected if an s conversion is also performed but shall otherwise
          not be affected.

Nothing at all about length requirements, nor syntax for that matter, all
that is more about setting tm_isdst than tm_zone.

The date command just specifies use of the strftime() conversions, such
that the default format is:

        date "+%a %b %e %H:%M:%S %Z %Y"

where the tm_zone field is entirely governed by what strftime() specifies
for %Z (which as seen above, is more or less "anything goes").
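
For illustration, the same default format can be fed straight to strftime()
from C.   This is only a sketch: format_like_date is a made-up name, and
the output depends on the current locale and on whatever TZ is in effect.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format t the way date(1) does by default.
 * Returns the number of bytes written (excluding the NUL), or 0 if the
 * conversion failed or the result did not fit in buf. */
static size_t format_like_date(time_t t, char *buf, size_t size)
{
    struct tm tm;

    if (localtime_r(&t, &tm) == NULL)
        return 0;

    /* Whatever localtime_r() associated with this time is what %Z
     * prints; no length check on the abbreviation happens here. */
    return strftime(buf, size, "%a %b %e %H:%M:%S %Z %Y", &tm);
}
```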

The only wording which can even half be interpreted to imply what you're
suggesting, is in the XSH 3 page for tzset():


   The tzset( ) function shall set the external variable tzname as follows:

   tzname[0] = "std";
   tzname[1] = "dst";

   where std and dst are as described in XBD Chapter 8 (on page 167).

except that XBD 8 doesn't define std or dst at all for the first of the
three formats, it does require at least 3 chars (and no more than
{TZNAME_MAX}, and all alphanumeric or + or -) when the 2nd format for TZ
is used, and basically
nothing at all (just the max length) when the 3rd variant (the one we all
use) is used.   It all doesn't matter much, the idea that a timezone has
a constant offset from UTC (for the "timezone" variable) and either one or
two names (no more) for timezone name abbreviations is all obsolete garbage,
and ought to be completely forgotten.    But if you feel that the above
requires tzname[0] to be set to a 3 letter abbrev, by all means set it to
one, but nothing says that tm_zone has to be set to the same string.


Please stop spreading nonsense about what POSIX requires.   There might
be people who believe what you say.   If you believe I'm wrong about this,
then please cite the text from the standard which supports your view.


  | or to TZDB guidelines,

If those actually require 3 (or more) chars (letters perhaps, but then +07
wouldn't work) then change the guidelines.   It is that simple.   There is
no reason for such a restriction - and what's more, we should be able to
create zones which reflect the "military" zone names (A B C ... Z, missing
just J I think it is, but it has been a while) which are all 1 char long
(but are unable to represent offsets that aren't a whole number of hours).


  | I installed into the development repository the proposed 
  | patch, which uses a "-07" placeholder that also should work but is 
  | jarring in a North American context.

It is jarring in all contexts, and I fail to see any reason why North
America should be given any kind of special treatment in this international
issue.   Of course, the ideal would either be to do away with timezone
abbreviations entirely ("-07" can be trivially derived from tm_gmtoff
should anyone really desire to output something like that), or go back to
supplying rational abbreviations for all zones (regardless of whether they
have any "official" status or not).   Removing the abbreviations would also
avoid all the issues related to the lifetime of the tm_zone field (it
would simply be NULL, always).
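
A minimal sketch of that derivation, for anyone who does want the numeric
form: it takes an offset in seconds east of UTC, as tm_gmtoff supplies it,
and produces strings in the style of tzdb's numeric abbreviations.   The
function name format_utc_offset is made up for this example, and any
leftover seconds in the offset are simply dropped.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format gmtoff (seconds east of UTC) as "-07",
 * "+0530", "+00", ...  Minutes are shown only when they are non-zero.
 * Returns buf. */
static char *format_utc_offset(long gmtoff, char *buf, size_t size)
{
    char sign = gmtoff < 0 ? '-' : '+';
    long abs_off = labs(gmtoff);
    long hours = abs_off / 3600;
    long minutes = (abs_off % 3600) / 60;

    if (minutes == 0)
        snprintf(buf, size, "%c%02ld", sign, hours);
    else
        snprintf(buf, size, "%c%02ld%02ld", sign, hours, minutes);
    return buf;
}
```

Given a struct tm filled in by localtime_r(), the call would be
format_utc_offset(tm.tm_gmtoff, buf, sizeof buf).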

kre
