On 2025-09-30 09:28, Óscar García Amor wrote:
> It is the last line of `build()`, `./build/telegraf config > telegraf.conf`.
> It is at this point that the execution stops completely if I build the
> package with `extra-x86_64-build`.
> Can you think of any reason why this might happen or how it could be
> fixed?

I've reproduced the issue and narrowed it down to a likely interaction
between systemd-nspawn and Go's threading/process handling.
extra-x86_64-build implicitly uses systemd-nspawn to containerize the
build. The `telegraf config` call does produce its output, but then hangs
on termination in an epoll_pwait/futex wait loop, as described
in [issue 55120](https://github.com/golang/go/issues/55120).

I reduced the reproduction to running the build once until it hangs,
aborting it with Ctrl-C, and then running `telegraf config` manually in a
minimal systemd-nspawn container. It hangs every time.

I also installed strace in the container root to see exactly where
the process hangs, which is how I found the unresolved Go issue 55120:
```
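# Run the built binary directly in the build chroot; it hangs on exit.
sudo systemd-nspawn -D /var/lib/archbuild/extra-x86_64/gyroplast \
    /build/telegraf/src/telegraf-1.36.2/build/telegraf config

# Same call under strace (installed in the container root beforehand).
sudo systemd-nspawn -D /var/lib/archbuild/extra-x86_64/gyroplast \
    strace -x -y -v -ff /build/telegraf/src/telegraf-1.36.2/build/telegraf config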
```
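
To check for the hang without waiting and pressing Ctrl-C each time, the call can be wrapped in `timeout` from GNU coreutils, which exits with status 124 when it had to stop the command. This is only a sketch: `hangs` is a hypothetical helper name, and a command that itself exits with 124 would be misreported as a hang.

```shell
#!/bin/sh
# hangs LIMIT CMD...: hypothetical helper, not part of any package.
# Succeeds (returns 0) if CMD was still running after LIMIT seconds,
# which timeout(1) signals with exit status 124; fails otherwise.
hangs() {
    limit=$1
    shift
    status=0
    # Discard the command's output; only the exit status matters here.
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    [ "$status" -eq 124 ]
}
```

From a shell inside the container this could be used as `hangs 10 /build/telegraf/src/telegraf-1.36.2/build/telegraf config && echo "still running after 10 s"`; the 10-second limit is an arbitrary choice.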

Unfortunately I'm running out of time to look into this further, but
since it is easily reproducible, someone with better knowledge of Go
and/or systemd-nspawn peculiarities may be able to pick it up from here.
My gut says this may be a systemd-nspawn configuration issue that affects
Go's threading or signal handling (there's a SIGURG passed between
processes) and hinders Go's process cleanup on termination.

Using `taskset -a -c 1 ./build/telegraf config` does NOT work around the
issue; the binary still hangs in the same way.
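
A cruder stopgap, which I have not tested against this package, would be to bound the call with `timeout` and accept a timeout only if the config file was actually written. It is a sketch, not a fix: `run_bounded` is a hypothetical helper name, and it assumes the hung process still reacts to the SIGTERM that `timeout` sends and that the output is complete by the time the limit expires.

```shell
#!/bin/sh
# run_bounded LIMIT OUTFILE CMD...: hypothetical helper for illustration.
# Runs CMD with stdout redirected to OUTFILE, for at most LIMIT seconds.
# Succeeds if CMD exits cleanly, or if it had to be stopped by timeout(1)
# (exit status 124) but OUTFILE is non-empty. Any other failure is passed on.
run_bounded() {
    limit=$1
    out=$2
    shift 2
    status=0
    timeout "$limit" "$@" > "$out" || status=$?
    if [ "$status" -eq 0 ]; then
        return 0
    fi
    [ "$status" -eq 124 ] && [ -s "$out" ]
}
```

In `build()` the last line would then read something like `run_bounded 30 telegraf.conf ./build/telegraf config`, with 30 seconds being an arbitrary limit.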

Here is an excerpt from my `strace -x -y -v -ff` output showing one period
of the periodically repeating hang loop:
<unfinished ...>
[pid 5] <... epoll_pwait resumed>, [], 128, 999, NULL, 0) = 0
[pid 4] <... futex resumed>) = -1 ETIMEDOUT (Connection timed out)
[pid 5] futex(0x5f2c52e08130, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4] sched_yield( <unfinished ...>
[pid 5] <... futex resumed>) = 0
[pid 4] <... sched_yield resumed>) = 0
[pid 5] openat(AT_FDCWD</>, "/proc/3/stat", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 4] sched_getaffinity(0, 8192 <unfinished ...>
[pid 5] <... openat resumed>) = 7</proc/3/stat>
[pid 4] <... sched_getaffinity resumed>, [1]) = 8
[pid 5] fcntl(7</proc/3/stat>, F_GETFL <unfinished ...>
[pid 4] pread64(3</sys/fs/cgroup/cpu.max> <unfinished ...>
[pid 5] <... fcntl resumed>) = 0x8000 (flags O_RDONLY|O_LARGEFILE)
[pid 4] <... pread64 resumed>, "max 100000\n", 64, 0) = 11
[pid 5] fcntl(7</proc/3/stat>, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE <unfinished ...>
[pid 4] futex(0x5f2c52dfdc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 5] <... fcntl resumed>) = 0
[pid 3] <... futex resumed>) = 0
[pid 4] <... futex resumed>) = 1
[pid 3] epoll_ctl(5<anon_inode:[eventpoll]>, EPOLL_CTL_ADD, 7</proc/3/stat>, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data=0x7ddb66bb6c000015} <unfinished ...>
[pid 5] futex(0xc0000e3158, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 3] <... epoll_ctl resumed>) = -1 EPERM (Operation not permitted)
[pid 4] nanosleep({tv_sec=0, tv_nsec=20000} <unfinished ...>
[pid 3] fcntl(7</proc/3/stat>, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
[pid 4] <... nanosleep resumed>, NULL) = 0
[pid 3] fcntl(7</proc/3/stat>, F_SETFL, O_RDONLY|O_LARGEFILE <unfinished ...>
[pid 4] nanosleep({tv_sec=0, tv_nsec=20000} <unfinished ...>
[pid 3] <... fcntl resumed>) = 0
[pid 3] fstat(7</proc/3/stat> <unfinished ...>
[pid 4] <... nanosleep resumed>, NULL) = 0
[pid 3] <... fstat resumed>, {st_dev=makedev(0, 0x6c), st_ino=267663, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=1024, st_blocks=0, st_size=0, st_atime=1759226607 /* 2025-09-30T12:03:27.583355914+0200 */, st_atime_nsec=583355914, st_mtime=1759226607 /* 2025-09-30T12:03:27.583355914+0200 */, st_mtime_nsec=583355914, st_ctime=1759226607 /* 2025-09-30T12:03:27.583355914+0200 */, st_ctime_nsec=583355914}) = 0
[pid 4] nanosleep({tv_sec=0, tv_nsec=20000} <unfinished ...>
[pid 3] read(7</proc/3/stat>, "3 (telegraf) R 1 1 1 34816 1 419"..., 512) = 317
[pid 4] <... nanosleep resumed>, NULL) = 0
[pid 3] read(7</proc/3/stat> <unfinished ...>
[pid 4] nanosleep({tv_sec=0, tv_nsec=20000} <unfinished ...>
[pid 3] <... read resumed>, "", 579) = 0
[pid 3] close(7</proc/3/stat> <unfinished ...>
[pid 4] <... nanosleep resumed>, NULL) = 0
[pid 3] <... close resumed>) = 0
[pid 4] nanosleep({tv_sec=0, tv_nsec=20000} <unfinished ...>
[pid 3] sysinfo({uptime=8492, loads=[26912, 310112, 222848], totalram=33252589568, freeram=13363355648, sharedram=1062043648, bufferram=482197504, totalswap=17178816512, freeswap=17178816512, procs=1240, totalhigh=0, freehigh=0, mem_unit=1}) = 0
[pid 4] <... nanosleep resumed>, NULL) = 0
[pid 3] epoll_pwait(5<anon_inode:[eventpoll]> <unfinished ...>
[pid 4] nanosleep({tv_sec=0, tv_nsec=20000} <unfinished ...>
[pid 3] <... epoll_pwait resumed>, [], 128, 0, NULL, 0) = 0
[pid 3] epoll_pwait(5<anon_inode:[eventpoll]> <unfinished ...>
[pid 4] <... nanosleep resumed>, NULL) = 0
[pid 4] futex(0x5f2c52e08130, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=999750431}
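
Two values are re-read in every period of the trace: the process's own `/proc/3/stat` line, whose third field is the process state (`R` here), and the cgroup v2 `cpu.max` file, whose `max 100000` means no CPU quota with a period of 100000 µs. For anyone who wants to watch these from outside while the binary hangs, a small sketch for decoding both follows. The helper names `stat_state` and `cpu_max_cpus` are made up for illustration.

```shell
#!/bin/sh
# Both helpers below are hypothetical names, for illustration only.

# stat_state LINE: print the state field of a /proc/PID/stat line.
# The comm field is parenthesised and may itself contain spaces or ')',
# so everything up to the last ") " is stripped before taking one word.
stat_state() {
    rest=${1##*) }
    printf '%s\n' "${rest%% *}"
}

# cpu_max_cpus LINE: interpret a cgroup v2 cpu.max line "QUOTA PERIOD".
# Prints "unlimited" when QUOTA is the literal "max", otherwise the
# number of CPUs the quota amounts to (QUOTA / PERIOD).
cpu_max_cpus() {
    # Deliberately unquoted: split the line into QUOTA and PERIOD.
    set -- $1
    if [ "$1" = "max" ]; then
        echo unlimited
    else
        awk -v q="$1" -v p="$2" 'BEGIN { printf "%.2f\n", q / p }'
    fi
}
```

Typical use would be `stat_state "$(cat /proc/PID/stat)"` and `cpu_max_cpus "$(cat /sys/fs/cgroup/cpu.max)"`, with PID replaced by the process ID as seen from wherever the command is run.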
Good luck, anyone else!
— Dennis