On Mon, Jun 29, 2026 at 12:32 PM Jonathan Wakely <[email protected]> wrote:
> On Mon, 29 Jun 2026 at 10:21, Tomasz Kaminski <[email protected]> wrote: > > > > > > > > On Mon, Jun 29, 2026 at 10:34 AM Jonathan Wakely <[email protected]> > wrote: > >> > >> > >> > >> On Mon, 29 Jun 2026, 07:23 Tomasz Kaminski, <[email protected]> > wrote: > >>> > >>> > >>> > >>> On Fri, Jun 26, 2026 at 7:02 PM Jonathan Wakely <[email protected]> > wrote: > >>>> > >>>> Although the systemd docs say that /etc/localtime should be a symlink > to > >>>> one of the zoneinfo files, some systems make it a symlink to another > >>>> path, where that second path is a symlink to a zoneinfo file (e.g. if > >>>> /etc is mounted read-only then /etc/localtime can be a symlink to > >>>> another symlink on a writable disk, so that the system timezone can be > >>>> altered by re-pointing the symlink on the writable disk). > >>>> > >>>> In that case, using readlink would only tell us the location of the > >>>> second symlink, not which zoneinfo file it points to. Therefore, we > >>>> would not be able to extract a valid time zone name from the path, and > >>>> chrono::current_zone() would fail. > >>>> > >>>> To support multiple symlinks we could recursively keep resolving > >>>> symlinks with readlink until we reach a path from which we can > extract a > >>>> zone name. Alternatively, we can just use realpath to resolve all > >>>> symlinks to a physical file, which will be . This means we only need > one system call and > >>>> don't need the extra complexity of calling readlink in a loop. > >>>> > >>>> The realpath system call also removes redunant slashes, so we can > remove > >>>> the code that did that manually. > >>>> > >>>> The possible downsides of this approach that I'm aware of are: > >>>> > >>>> - When /etc/localtime is a symlink to /invalid/Europe/London but that > >>>> file doesn't exist. With the previous implementation we would have > >>>> resolved that symlink to the zone "Europe/London" as long as that > name > >>>> is known to the current chrono::tzdb object. With this change, we > >>>> won't get a valid zone name and current_zone() will fail. I'm not > sure > >>>> how realistic this case is. It might be plausible if libstdc++ is > >>>> using the embedded static copy of tzdata.zi and there are no > zoneinfo > >>>> files on disk at all. In that case the system might still use > >>>> /etc/localtime to name a zone, even though the symlink is dangling. > >>>> Maybe we could fall back to filesystem::weakly_canonical for this > >>>> case? > >>> > >>> I would be concerned about doing that silently, as this could lead to > >>> non-local errors in the deployed environment are extremely hard to > debug. > >>> What I mean is the situation where tzdata should be mounted. > >>> If the mounting fails, we will resolve the zone without issue, > >> > >> > >> Will we? We might resolve a name, but it still needs to be looked up in > tzdb to find a time_zone. If there's no tzdata that will fail (unless it's > one of the zones that is always present like UTC and GMT). > > > > From my reading, we fail first to the embeded static tzdb data if > opening tzdb file fails: > > tzdata_stream() : istream(nullptr) > > { > > if (string path = zoneinfo_file(tzdata_file); !path.empty()) > > { > > filebuf fbuf; > > if (fbuf.open(path, std::ios::in)) > > { > > std::construct_at(&fb, std::move(fbuf)); > > this->init(&fb); > > return; > > } > > } > > std::construct_at(&sb); > > this->init(&sb); > > } > > We can successfully look up the zone if we know about it, even if we > cannot read the zone database. > > Right - but that seems orthogonal to chrono::current_zone(), and I > don't think it is an argument against using weakly_canonical to deal > with dangling /etc/localtime symlinks. > Imagine that after removing the time zone files in a container, they are intended to mounted on the filesystem in some location (otherwise empty), effectively sharing them. If that process fails, with the weakly canonical option, here would be no indication of the problem: * current_zone name will give expected value (as we failed for dangling symlink), instead of fallback to UTC * the corresponding zone data can be loaded from static tzdb data The failure will actually appear if the application starts to depend on data from update of the this shared time zone information. The timespan between the mount failure, and the actual issue may be far removed. These are related in my mind, as they result from silently ignoring the above. both failure to load time zone database file, and dangling link in current zone. > > The semantics of current_zone are to decide on a name, then use > locate_zone(str) to look that name up (and maybe retry if the lookup > fails and we can try a different name). The presence or absence of a > valid tzdb object affects the locate_zone(str) step but I don't think > it should affect the process for deciding on a name. With the code on > trunk today, we can still extract a name from a dangling symlink (but > we can't extract a name from a chain of two or more symlinks). With my > proposed patch we would be able to extract a name from a chain of > symlinks, but would not be able to get anything from dangling > symlinks. > > > Maybe I am missing something here. > > > >> > >> > >>> and then only > >>> If the data requires updating, the update will fail. From that > perspective I much > >>> more prefer loud failure. > >> > >> > >> A dangling symlink to the Etc/GMT or UTC zone would always work OK even > if there is no tzdata, and maybe that's the expected behaviour for some > system. > > > > This system would be better served by having TZDB disabled, if that is > the consistent behavior. > > That requires a custom build of libstdc++.so whereas a minimal > container could use the system libstdc++.so and remove all the tzdata > files to save disk space. > > >> > >> > >>> > >>> As the fact that this works is libstdc++ specific > >> > >> > >> Which part? > > > > Not using the binary format. > >> > >> > >>> (other libraries and clients > >>> may use the content of the file), > >> > >> > >> They should not do that on Linux. Maybe on other systems, and for > ancient versions of Linux where /etc/localtime is a real file containing a > copy of a Zif file. But there nothing we can do in that case, we can't > resolve it to a chrino::time_zone object without comparing the file with > every file under the zoneinfo directory. > > > > Why, if there use binary format, and they do not allow lookups for other > zones, you may simply load the content > > of the file pointed out by current_zone. At least as far as I understand. > > > > For example in such case, the ancient version of linux could be > supported, by loading the content > > of the /etc/localtime with time zone database, and storing it under > distinct name in tzdb. > > Yes, that's true. There's no guarantee that current_zone() returns one > of the values in get_tzdb()->zones. And even if it does, it could be a > special entry that is created by the /etc/localtime file and not one > of the files under /usr/share/zoneinfo. But it wouldn't return > anything useful for current_zone()->name() in that case, because > there's no name in the binary TZif file. You would have to gTive it a > fake name like "localtime". > > But even in implementations which use the binary TZif files (libc++ > and HowardHinnant/date) they don't support /etc/localtime being a TZif > file, as Mark de Wever said in > https://github.com/llvm/llvm-project/issues/105634#issuecomment-2660976683 > > Libc++ and date both assume a hardcoded path such as > /usr/share/zoneinfo and then simply remove that prefix from the target > of /etc/localtime to get the zone name. Libc++ uses > filesystem::read_symlink optionally followed by filesystem::canonical > (I don't know why they don't just use filesystem::canonical in the > first place) and then filesystem::relative to remove the prefix. > Howard uses realpath, like my proposed patch, and then removes > "*zoneinfo/" from the result, and so that also doesn't support > dangling symlinks. > > I'll just add a comment noting that we don't support dangling > symlinks, and then as you suggested we can postpone supporting them > until somebody complains. > > > > > >>> I think this case (if appears) may be better > >>> served by a dedicated enviroment variable. > >>>> > >>>> > >>>> - When /etc/localtime is a symlink to /usr/share/zoneinfo/Foo/Bar > where > >>>> "Foo/Bar" is a valid zone in the chrono::tzdb object, but the Bar > file > >>>> is another symlink to ./Baz where "Foo/Bar" is also a valid zone. > >>>> With the previous implementation current_zone() would have returned > >>>> the "Foo/Bar" zone. With this change it would return "Foo/Baz". I > >>>> don't think it's realistic to have two zones which are distinct > zones > >>>> (not a Zone and a Link to it) > >>> > >>> Yes, and the link name is not accessible to the program: zone->name > >>> returns the target zone, so it is not observable if we located > Asia/Istanbul > >>> (that was symlink to Europe/Istanbul) or Asia/Istanbul. I think that > is OK. > >>>> > >>>> but where one of them is defined on-disk > >>>> using a symlink to the other. > >>>> > >>>> libstdc++-v3/ChangeLog: > >>>> > >>>> PR libstdc++/125467 > >>> > >>> This PR is created by you. Does it come from real user complain, or > this > >>> was found by you? > >> > >> > >> It was reported directly to me by a friend, as this behaviour is > preventing their codebase from replacing Howard's date library with C++20 > chrono. > >> > >> > >>> > >>>> > >>>> * src/c++20/tzdb.cc (tzdb::current_zone): Use realpath to > >>>> resolve the /etc/localtime symlink instead of readlink. > >>>> --- > >>>> > >>>> Tested x86_64-linux. > >>>> > >>>> Should we handle the dangling symlink case, maybe by using > >>>> filesystem::weakly_canonical? > >>> > >>> I would wait to see if anybody is actually impacted by it in non-error > case. > >>> I would got with this with only small modification suggested bellow. > >>>> > >>>> > >>>> > >>>> > >>>> libstdc++-v3/src/c++20/tzdb.cc | 65 > +++++++++++----------------------- > >>>> 1 file changed, 20 insertions(+), 45 deletions(-) > >>>> > >>>> diff --git a/libstdc++-v3/src/c++20/tzdb.cc > b/libstdc++-v3/src/c++20/tzdb.cc > >>>> index 0158659f79e1..acfc57f76437 100644 > >>>> --- a/libstdc++-v3/src/c++20/tzdb.cc > >>>> +++ b/libstdc++-v3/src/c++20/tzdb.cc > >>>> @@ -41,8 +41,13 @@ > >>>> # include <ext/concurrence.h> // __gnu_cxx::__mutex > >>>> #endif > >>>> > >>>> -#if defined(_GLIBCXX_HAVE_READLINK) && > defined(_GLIBCXX_HAVE_UNISTD_H) > >>>> -# include <unistd.h> // readlink > >>>> +#ifdef _GLIBCXX_HAVE_UNISTD_H > >>>> +# include <unistd.h> // _XOPEN_VERSION > >>>> +#endif > >>>> +#if defined _GLIBCXX_USE_REALPATH && _XOPEN_VERSION >= 700 > >>>> +# include <stdlib.h> // malloc, free, realpath > >>>> +#else > >>>> +# include <filesystem> // filesystem::canonicalize > >>>> #endif > >>>> > >>>> #ifdef _AIX > >>>> @@ -2094,58 +2099,28 @@ constinit tzdb_list::_Node::NumLeapSeconds > tzdb_list::_Node::num_leap_seconds; > >>>> // to have a way to force a re-read. > >>>> > >>>> #if !defined(_AIX) && !defined(_GLIBCXX_HAVE_WINDOWS_H) > >>>> -#if defined(_GLIBCXX_HAVE_READLINK) && > defined(_GLIBCXX_HAVE_UNISTD_H) > >>>> - string_view str; > >>>> - char buf[128]; // strlen("../usr/share/zoneinfo/...") is usually > < 55 > >>>> - string dynbuf; > >>>> // /etc/localtime should be a symlink that ends with a zone name, > >>>> // e.g. /etc/localtime -> /usr/share/zoneinfo/Europe/London > >>>> // > https://www.freedesktop.org/software/systemd/man/latest/localtime.html > >>>> // This should work on GNU/Linux, macOS, NetBSD, and OpenBSD. > >>>> // Some FreeBSD systems also use a symlink for /etc/localtime. > >>>> - // Use readlink directly to avoid std::filesystem overhead. > >>>> - if (auto n = ::readlink("/etc/localtime", buf, sizeof(buf)); n > > 0) > >>>> + > >>>> +#if defined _GLIBCXX_USE_REALPATH && _XOPEN_VERSION >= 700 > >>>> + unique_ptr<char[], void(*)(void*)> buf{ nullptr, &::free }; > >>>> + string_view str; > >>> > >>> A slight stylistic note: I would prefer `str` to always be string_view > (substr will do > >>> different thing otherwise, that may bite us), and declared before the > #if. > >> > >> > >> Will do. > >> > >>> > >>>> > >>>> + // Use realpath directly to avoid std::filesystem overhead. > >>>> + // We use realpath not readlink to resolve multiple levels of > symlinks. > >>>> + if (char* p = ::realpath("/etc/localtime", nullptr)) > >>>> { > >>>> - if (static_cast<size_t>(n) < sizeof(buf)) > >>>> - str = string_view(buf, n); > >>>> - else [[unlikely]] > >>>> - { > >>>> - // We read the symlink but it didn't fit in buf[], use > dynbuf. > >>>> - do > >>>> - { > >>>> - n *= 2; > >>>> - dynbuf.__resize_and_overwrite(n, [](char* p, size_t > len) { > >>>> - auto n2 = ::readlink("/etc/localtime", p, len); > >>>> - if (n2 == -1) // symlink removed or replaced by > file?! > >>>> - __throw_runtime_error("tzdb: error reading > /etc/localtime"); > >>>> - const size_t r = n2; > >>>> - return r < len ? r : 0; > >>>> - }); > >>>> - } > >>>> - while (dynbuf.empty()); > >>>> - str = dynbuf; > >>>> - } > >>>> + buf.reset(p); > >>>> + str = p; > >>>> } > >>>> +#else > >>>> + string str = > std::filesystem::canonical("/etc/localtime").string(); > >>> > >>> And here we would have: > >>> string buf = > std::filesystem::canonical("/etc/localtime").string(); > >>> str = buf; > >>>> > >>>> +#endif > >>>> > >>>> if (!str.empty()) > > I'm going to also check str != "/etc/localtime" here, so that if it's > not a symlink we don't bother trying to extract a name from it. > > > >>>> { > >>>> - // Remove any redundant slashes so we can match zone names. > >>>> - // e.g. /usr/share/zoneinfo/Europe//London is a valid symlink, > >>>> - // but won't match against "Europe/London". > >>>> - if (auto pos = str.rfind("//"); pos != str.npos) [[unlikely]] > >>>> - { > >>>> - if (str.data() != dynbuf.data()) > >>>> - dynbuf = str; > >>>> - string::size_type spos = pos; > >>>> - do > >>>> - { > >>>> - dynbuf.erase(spos, 1); > >>>> - spos = dynbuf.rfind("//", spos); > >>>> - } > >>>> - while (spos != dynbuf.npos); > >>>> - str = dynbuf; > >>>> - } > >>>> - > >>>> // Check the trailing components of the path against known > zone names. > >>>> // Valid IANA times zones can have one, two, or three parts, > e.g. > >>>> // "UTC", "Europe/London", and "America/Indiana/Indianapolis". > >>>> @@ -2171,7 +2146,7 @@ constinit tzdb_list::_Node::NumLeapSeconds > tzdb_list::_Node::num_leap_seconds; > >>>> str.substr(pos + 1))) > >>>> return tz; > >>>> } > >>>> -#endif > >>>> + > >>>> // Otherwise, look for a file naming the time zone. > >>>> string_view files[] { > >>>> "/etc/timezone", // Debian derivates > >>>> -- > >>>> 2.54.0 > >>>> > >
