On Thu, 2 May 2024 at 12:24, Jonathan Wakely <jwak...@redhat.com> wrote: > > I found some issues in the std::chrono::tzdb parser by testing the > tzdata "vanguard" format, which uses new features that aren't enabled in > the "main" and "rearguard" data formats. > > Since 2024a the keyword "minimum" is no longer valid for the FROM and TO > fields in a Rule line, which means that "m" is now a valid abbreviation > for "maximum". Previously we expected either "mi" or "ma". For backwards > compatibility, a FROM field beginning with "mi" is still supported and > is treated as 1900. The "maximum" keyword is only allowed in TO now, > because it makes no sense in FROM. To support these changes the > minmax_year and minmax_year2 classes for parsing FROM and TO are > replaced with a single years_from_to class that reads both fields. > > The vanguard format makes use of %z in Zone FORMAT fields, which caused > an exception to be thrown from ZoneInfo::set_abbrev because no % or / > characters were expected when a Zone doesn't use a named Rule. The > ZoneInfo::to(sys_info&) function now uses format_abbrev_str to replace > any %z with the current offset. Although format_abbrev_str also checks > for %s and STD/DST formats, those only make sense when a named Rule is > in effect, so won't occur when ZoneInfo::to(sys_info&) is used. > > This change also implements a feature that has always been missing from > time_zone::_M_get_sys_info: finding the Rule that is active before the > specified time point, so that we can correctly handle %s in the FORMAT > for the first new sys_info that gets created. This requires implementing > a poorly documented feature of zic, to get the LETTERS field from a > later transition, as described at > https://mm.icann.org/pipermail/tz/2024-April/058891.html > In order for this to work we need to be able to distinguish an empty > letters field (as used by CE%sT where the variable part is either empty > or "S") from "the letters field is not known for this transition". The > tzdata file uses "-" for an empty letters field, which libstdc++ was > previously replacing with "" when the Rule was parsed. Instead, we now > preserve the "-" in the Rule object, so that "" can be used for the case > where we don't know the letters (and so need to decide it). > > libstdc++-v3/ChangeLog: > > * src/c++20/tzdb.cc (minmax_year, minmax_year2): Remove. > (years_from_to): New class replacing minmax_year and > minmax_year2. > (format_abbrev_str, select_std_or_dst_abbrev): Move earlier in > the file. Handle "-" for letters. > (ZoneInfo::to): Use format_abbrev_str to expand %z. > (ZoneInfo::set_abbrev): Remove exception. Change parameter from > reference to value. > (operator>>(istream&, Rule&)): Do not clear letters when it > contains "-". > (time_zone::_M_get_sys_info): Add missing logic to find the Rule > in effect before the time point. > * testsuite/std/time/tzdb/1.cc: Adjust for vanguard format using > "GMT" as the Zone name, not as a Link to "Etc/GMT". > * testsuite/std/time/time_zone/sys_info_abbrev.cc: New test. > --- > libstdc++-v3/src/c++20/tzdb.cc | 265 +++++++++++------- > .../std/time/time_zone/sys_info_abbrev.cc | 106 +++++++ > libstdc++-v3/testsuite/std/time/tzdb/1.cc | 6 +- > 3 files changed, 274 insertions(+), 103 deletions(-) > create mode 100644 > libstdc++-v3/testsuite/std/time/time_zone/sys_info_abbrev.cc > > diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc > index c7c7cc9deee..32a3a92075d 100644 > --- a/libstdc++-v3/src/c++20/tzdb.cc > +++ b/libstdc++-v3/src/c++20/tzdb.cc
[...] > @@ -557,7 +605,7 @@ namespace std::chrono > if (save_time.indicator != at_time::Wall) > { > // We don't actually store the save_time.indicator, because we > - // assume that it's always deducable from the actual offset value. > + // assume that it's always deducable from the offset value. That should be "deducible", I've fixed that locally.