Re: [gentoo-dev] Defining TZ in the base system profile?

2023-02-14 Thread Haelwenn (lanodan) Monnier

[2023-01-18 20:48:56-0500] Joshua Kinard:

> So is adding a default definition of TZ to our base system /etc/profile
> something we want to look at?  I haven't tried any other methods of
> benchmarking to see if not making those additional syscalls is just
> placebo or if there are actual impacts.  Given how long this oddity has
> been around, I can't tell if it's a genuine bug in glibc, an unoptimized
> corner case, or just a big nothingburger.


I would take it as a glibc bug / lack of optimisation. At the very least,
the fault lies with glibc, given that you showed another libc to be more
optimized.

And given that POSIX leaves a TZ value starting with a colon, such as
":/etc/localtime", implementation-defined[1], I think we should avoid it;
glibc isn't alone in dealing with timezones.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03



Re: [gentoo-dev] Defining TZ in the base system profile?

2023-01-19 Thread Michael Orlitzky
On Wed, 2023-01-18 at 20:48 -0500, Joshua Kinard wrote:
> 
> So is adding a default definition of TZ to our base system
> /etc/profile something we want to look at?  I 
> haven't tried any other methods of benchmarking to see if not making
> those additional syscalls is just placebo 
> or if there are actual impacts.  Given how long this oddity has been
> around, I can't tell if it's a genuine 
> bug in glibc, an unoptimized corner case, or just a big
> nothingburger.
> 

I thought about doing this on my laptop, and talked myself out of it.
The main counter-arguments are:

  1. ICU doesn't handle the :/etc/localtime format at the moment:

     * https://unicode-org.atlassian.net/browse/ICU-13694
     * https://github.com/nodejs/node/issues/37271

     You could readlink() it or whatever at boot, but that will cause
     changes to /etc/localtime to be mysteriously ignored.

  2. The stat() calls are there for a "good" reason, namely to let
     glibc know if the timezone has changed on the fly.

The first one is only a temporary deal-breaker, but the second is a
tradeoff involving how often your timezone changes (user-dependent) and
what the real performance impact is (probably not much).
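For illustration, the readlink() idea from point 1 could look roughly like
the Python sketch below. It assumes /etc/localtime is a symlink whose target
contains a "zoneinfo/" component (commonly /usr/share/zoneinfo, but that
layout is an assumption), and the name zone_from_localtime is made up for
this example. It also makes the drawback visible: the name is resolved once,
so a later change to /etc/localtime goes unnoticed until it is resolved again.

```python
import os

# Assumed layout: the link target looks like ".../zoneinfo/<Zone/Name>".
ZONEINFO_MARKER = "zoneinfo/"


def zone_from_localtime(path="/etc/localtime"):
    """Return a zone name such as 'Europe/Paris' if path is a symlink into a
    zoneinfo directory, otherwise None (for example when it is a plain file)."""
    if not os.path.islink(path):
        return None
    target = os.readlink(path)
    # Relative targets such as "../usr/share/zoneinfo/UTC" work too,
    # because only the part after "zoneinfo/" is used.
    _, marker, zone = target.partition(ZONEINFO_MARKER)
    return zone if marker and zone else None
```

A boot-time script could export the returned name as TZ and fall back to
":/etc/localtime" when the function returns None.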




Re: [gentoo-dev] Defining TZ in the base system profile?

2023-01-19 Thread Arsen Arsenović

Michał Górny  writes:

> On Wed, 2023-01-18 at 20:48 -0500, Joshua Kinard wrote:
>> So this article[1] from 2017 popped up again on the tech radar via 
>> hackernews[2] and a few other sites[3].  It 
>> annotates how if the envvar TZ is undefined on a Linux system, it causes 
>> glibc to generate a number of 
>> additional syscalls, mainly stat-related calls (in my tests, newfstatat()).  
>> If defined to an actual value, 
>> such as ":/etc/localtime" (or even an empty string), glibc will instead 
>> generate far fewer, if any at all, of 
>> these stat-related syscalls.
>> 
>> [...]
>> So is adding a default definition of TZ to our base system /etc/profile 
>> something we want to look at?  I 
>> haven't tried any other methods of benchmarking to see if not making those 
>> additional syscalls is just placebo 
>> or if there are actual impacts.  Given how long this oddity has been around, 
>> I can't tell if it's a genuine 
>> bug in glibc, an unoptimized corner case, or just a big nothingburger.
>> 
>
> Am I correct that there's no real difference between setting it to
> ":/etc/localtime" and the actual timezone?
>
> I suppose it would make sense to default it.

Correct, from ``(libc)TZ Variable'':

   If the ‘TZ’ environment variable does not have a value, the operation
chooses a time zone by default.  In the GNU C Library, the default time
zone is like the specification ‘TZ=:/etc/localtime’ (or
‘TZ=:/usr/local/etc/localtime’, depending on how the GNU C Library was
configured; *note Installation::).  Other C libraries use their own rule
for choosing the default time zone, so there is little we can say about
them.

I don't suspect any downside to this approach.
-- 
Arsen Arsenović




Re: [gentoo-dev] Defining TZ in the base system profile?

2023-01-18 Thread Ionen Wolkens
On Wed, Jan 18, 2023 at 08:48:56PM -0500, Joshua Kinard wrote:
> 
> So this article[1] from 2017 popped up again on the tech radar via 
> hackernews[2] and a few other sites[3].  It 
> annotates how if the envvar TZ is undefined on a Linux system, it causes 
> glibc to generate a number of 
> additional syscalls, mainly stat-related calls (in my tests, newfstatat()).  
> If defined to an actual value, 
> such as ":/etc/localtime" (or even an empty string), glibc will instead 
> generate far fewer, if any at all, of 
> these stat-related syscalls.
[...]
> 
> Thoughts?

Sounds good to me from the little I know of it, although I do imagine it
could raise issues with some packages that try to use/handle TZ
themselves, and there's no telling what obscure thing this is going to
break.

exa[1][2] is one example that sam mentioned, but I imagine there are
more to find.

Personally, I've added it to /etc/env.d locally anyway and will see what
comes of it for the things I use, not that this covers much at all :)

[1] https://github.com/ogham/exa/issues/856
[2] https://github.com/ogham/exa/pull/867
-- 
ionen




Re: [gentoo-dev] Defining TZ in the base system profile?

2023-01-18 Thread Michał Górny
On Wed, 2023-01-18 at 20:48 -0500, Joshua Kinard wrote:
> So this article[1] from 2017 popped up again on the tech radar via 
> hackernews[2] and a few other sites[3].  It 
> annotates how if the envvar TZ is undefined on a Linux system, it causes 
> glibc to generate a number of 
> additional syscalls, mainly stat-related calls (in my tests, newfstatat()).  
> If defined to an actual value, 
> such as ":/etc/localtime" (or even an empty string), glibc will instead 
> generate far fewer, if any at all, of 
> these stat-related syscalls.
> 
> [...]
> So is adding a default definition of TZ to our base system /etc/profile 
> something we want to look at?  I 
> haven't tried any other methods of benchmarking to see if not making those 
> additional syscalls is just placebo 
> or if there are actual impacts.  Given how long this oddity has been around, 
> I can't tell if it's a genuine 
> bug in glibc, an unoptimized corner case, or just a big nothingburger.
> 

Am I correct that there's no real difference between setting it to
":/etc/localtime" and the actual timezone?

I suppose it would make sense to default it.

-- 
Best regards,
Michał Górny




[gentoo-dev] Defining TZ in the base system profile?

2023-01-18 Thread Joshua Kinard



So this article[1] from 2017 popped up again on the tech radar via hackernews[2] and a few other sites[3].  It
describes how, if the TZ environment variable is undefined on a Linux system, glibc generates a number of
additional syscalls, mainly stat-related calls (in my tests, newfstatat()).  If TZ is defined to an actual value,
such as ":/etc/localtime" (or even an empty string), glibc will instead generate far fewer of these stat-related
syscalls, if any at all.


Apparently, TZ is accessed quite frequently, so according to the article this has a compound effect: glibc ends up
making thousands of unnecessary stat-related syscalls on /etc/localtime (which must be hard-coded somewhere in
glibc for this case).  Given the article's age (five years old), I tested its example C program, and the
behavior still appears to be accurate on a modern glibc-based system.  When TZ is undefined, I get exactly nine
newfstatat calls on /etc/localtime.  If I define TZ to ":/etc/localtime", I do not get any of these newfstatat
calls, and if I set TZ to an empty string, glibc will call openat() against "/usr/share/zoneinfo/Universal"
and then generate exactly two newfstatat syscalls on that handle to read it.
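One way to check counts like these without eyeballing the trace is to tally
the stat-family calls per path from saved strace output. The Python sketch
below does that; the function name count_stat_calls and the regular
expression are assumptions based on the line format shown in this thread,
not something strace itself provides.

```python
import re
from collections import Counter

# Matches lines such as:
#   newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=...}, 0) = 0
# Only calls that name a non-empty path in quotes are counted, so
# descriptor-based calls like newfstatat(3, "", ...) are skipped.
STAT_CALL = re.compile(
    r'^(?:newfstatat|fstatat64|stat|lstat)\((?:AT_FDCWD, )?"([^"]+)"'
)


def count_stat_calls(trace_text):
    """Return a Counter mapping path -> number of stat-family calls on it."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = STAT_CALL.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts
```

With a trace saved via "strace -o trace.txt ./tz_test", calling
count_stat_calls(open("trace.txt").read())["/etc/localtime"] gives the
number to compare between the TZ-unset and TZ-set runs.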


I ran strace against the undefined TZ case and the ":/etc/localtime" case, normalized the hex addresses to
get a clean diff, and this is what it looks like:


--- a   2023-01-18 20:30:36.826805343 -0500
+++ b   2023-01-18 20:30:45.106983600 -0500
@@ -1,4 +1,4 @@
-# strace ./tz_test
+# TZ=":/etc/localtime" strace ./tz_test
 execve("./tz_test", ["./tz_test"], 0x /* XX vars */) = 0
 brk(NULL)   = 0x
 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x
@@ -61,15 +61,6 @@ read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0
 lseek(3, -2260, SEEK_CUR)   = 1292
 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\6\0\0\0\6\0\0\0\0"..., 3584) = 2260
 close(3)= 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
-newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3552, ...}, 0) = 0
 write(1, "Godspeed, dear friend!\n", 23Godspeed, dear friend!
 ) = 23
 exit_group(0)   = ?
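
The normalization step used to get a clean diff can be done with a small
filter. Here is a sketch in Python, assuming the only run-to-run noise is
hexadecimal addresses and the environment variable count in the execve
line. The name normalize_trace and the placeholders (a bare "0x" and "XX")
are simply modelled on the diff in this message.

```python
import re

HEX_ADDR = re.compile(r"0x[0-9a-fA-F]+")
VAR_COUNT = re.compile(r"/\* \d+ vars \*/")


def normalize_trace(trace_text):
    """Blank out values that differ between otherwise identical strace runs."""
    text = HEX_ADDR.sub("0x", trace_text)
    return VAR_COUNT.sub("/* XX vars */", text)
```

Running both traces through this function before diffing leaves only the
lines that actually differ in which syscalls were made.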

For comparison, I tested the same program on FreeBSD and it does not exhibit this behavior at all, regardless 
of whether TZ is undefined, a value, or an empty string.  I have yet to make a similar test on a mips/musl 
chroot to see how musl handles this.


There is a rather old (2010) StackOverflow question[4] about it as well, and someone left an answer in March 
of last year about the specific code in glibc that handles TZ if it is set or is an empty string.


So is adding a default definition of TZ to our base system /etc/profile something we want to look at?  I 
haven't tried any other methods of benchmarking to see if not making those additional syscalls is just placebo 
or if there are actual impacts.  Given how long this oddity has been around, I can't tell if it's a genuine 
bug in glibc, an unoptimized corner case, or just a big nothingburger.
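
To make the proposal concrete: the effect of a profile-level default is
that processes start with TZ already set unless the user chose something
else. A Python sketch of that rule follows. env_with_default_tz is a
hypothetical helper, and ":/etc/localtime" is only the value discussed
above, not a settled choice.

```python
import os

DEFAULT_TZ = ":/etc/localtime"  # value discussed in this thread


def env_with_default_tz(env=None):
    """Return a copy of env (default: os.environ) with TZ set only if absent.

    An existing TZ is left alone, even an empty one, because an empty
    string already selects a different zone (Universal) in the tests above.
    """
    new_env = dict(os.environ if env is None else env)
    new_env.setdefault("TZ", DEFAULT_TZ)
    return new_env
```

The returned mapping can be passed as the env argument of subprocess.run()
to compare a program's behavior with and without the default.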



1. https://blog.packagecloud.io/set-environment-variable-save-thousands-of-system-calls/
2. https://news.ycombinator.com/item?id=34346346
3. https://vermaden.wordpress.com/posts/
4. https://stackoverflow.com/questions/4554271/how-to-avoid-excessive-stat-etc-localtime-calls-in-strftime-on-linux



Thoughts?

--
Joshua Kinard
Gentoo/MIPS
ku...@gentoo.org
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And our lives slip away, moment by 
moment, lost in that vast, terrible in-between."


--Emperor Turhan, Centauri Republic