Hi Chris,

On Mon, 2021-02-15 at 18:28 +0000, Chris Lamb wrote:
> > Ah, indeed, the failure mode means that the log never made it to
> > buildd.d.o.
> 
> Curious, not heard of that failure mode — is there someplace I can
> learn about that? No worries if not.

I'm not sure if it's documented, but in this case I think enough of the
system was unresponsive or killed to make the connection back to
buildd.d.o fail.

> > I've attached a copy of the log from zani.
> 
> Ah, thanks. Unfortunately, it does not point us straight to the
> solution. I note that you titled this bug "package OOMs" — I point
> this out because the "OOM" text in the log is actually the name of the
> test. As in, here is tests/integration/corrupt-dump.tcl:
> 
[...]
> Do we have confirmation somewhere that the build is actually OOMing,
> rather than it just timing out on a test that was designed to test
> *for* an OOM condition? This OOM-related bug *should* be fixed by
> virtue of them adding the test to begin with (!) but if we can show
> that it is still OOMing, I suspect that upstream will be able to
> address it quickly.

I don't know how much context would be needed, but the machine
definitely OOMed:

Feb  3 20:45:22 zani/zani kernel: redis-server invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
Feb  3 20:45:22 zani/zani kernel: redis-server cpuset=/ mems_allowed=0
Feb  3 20:45:22 zani/zani kernel: CPU: 0 PID: 45952 Comm: redis-server Not tainted 4.19.0-14-s390x #1 Debian 4.19.171-2
Feb  3 20:45:22 zani/zani kernel: Hardware name: IBM 8561 LT1 400 (z/VM 7.1.0)
Feb  3 20:45:22 zani/zani kernel: Call Trace:
Feb  3 20:45:22 zani/zani kernel: ([<0000000000113f2a>] show_stack+0x5a/0x78)
Feb  3 20:45:22 zani/zani kernel:  [<0000000000802d1a>] dump_stack+0x8a/0xb8
Feb  3 20:45:22 zani/zani kernel:  [<0000000000800962>] dump_header+0x82/0x2c0
Feb  3 20:45:22 zani/zani kernel:  [<00000000002b46fe>] oom_kill_process+0xde/0x380
Feb  3 20:45:22 zani/zani kernel:  [<00000000002b550c>] out_of_memory+0x24c/0x3b8
Feb  3 21:07:50 zani/zani kernel:  [<00000000002bd032>] __alloc_pages_nodemask+0x10b2/0x1160
Feb  3 21:07:50 zani/zani kernel:  [<000000000012b0c6>] page_table_alloc+0x15e/0x2c8
Feb  3 21:07:50 zani/zani kernel:  [<00000000002f8b76>] __pte_alloc+0x2e/0xf8
Feb  3 21:07:50 zani/zani kernel:  [<00000000002ff258>] __handle_mm_fault+0xfc0/0x11c0
Feb  3 21:07:50 zani/zani kernel:  [<00000000002ff584>] handle_mm_fault+0x12c/0x298
Feb  3 21:07:50 zani/zani kernel:  [<0000000000123a12>] do_dat_exception+0x182/0x440
Feb  3 21:07:50 zani/zani kernel:  [<000000000080d9d4>] pgm_check_handler+0x190/0x1e4
...
Feb  3 21:07:50 zani/zani kernel: sshd invoked oom-killer: gfp_mask=0x7080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), nodemask=(null), order=2, oom_score_adj=-1000
Feb  3 21:07:50 zani/zani kernel: sshd cpuset=/ mems_allowed=0
Feb  3 21:07:50 zani/zani kernel: CPU: 0 PID: 1463 Comm: sshd Not tainted 4.19.0-14-s390x #1 Debian 4.19.171-2
Feb  3 21:07:50 zani/zani kernel: Hardware name: IBM 8561 LT1 400 (z/VM 7.1.0)
Feb  3 21:07:50 zani/zani kernel: Call Trace:
Feb  3 21:07:50 zani/zani kernel: ([<0000000000113f2a>] show_stack+0x5a/0x78)
Feb  3 21:07:50 zani/zani kernel:  [<0000000000802d1a>] dump_stack+0x8a/0xb8
Feb  3 21:07:50 zani/zani kernel:  [<0000000000800962>] dump_header+0x82/0x2c0
Feb  3 21:07:50 zani/zani kernel:  [<00000000002b46fe>] oom_kill_process+0xde/0x380
Feb  3 21:07:50 zani/zani kernel:  [<00000000002b550c>] out_of_memory+0x24c/0x3b8
Feb  3 21:07:50 zani/zani kernel:  [<00000000002bd032>] __alloc_pages_nodemask+0x10b2/0x1160
Feb  3 21:07:50 zani/zani kernel:  [<000000000013e414>] copy_process.part.4+0x24c/0x1fb0
Feb  3 21:07:50 zani/zani kernel:  [<0000000000140550>] _do_fork+0xf0/0x430
Feb  3 21:07:50 zani/zani kernel:  [<00000000001409ce>] sys_clone+0x3e/0x50
Feb  3 21:07:50 zani/zani kernel:  [<000000000080d630>] system_call+0xd8/0x2bc
...
Feb  3 21:07:50 zani/zani kernel: oom_reaper: reaped process 45952 (redis-server), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
...
Feb  3 21:07:50 zani/zani kernel: sshd invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
...
Feb  3 21:07:50 zani/zani kernel: munin-node invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
...
Feb  3 21:07:50 zani/zani kernel: oom_reaper: reaped process 36654 (schroot), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
...
Feb  3 21:07:50 zani/zani kernel: oom_reaper: reaped process 34994 (sbuild), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
...
Feb  3 21:07:50 zani/zani kernel: oom_reaper: reaped process 1508 (syslog-ng), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
...
Feb  3 21:07:50 zani/zani kernel: oom_reaper: reaped process 1863 (samhain), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
...
Feb  3 21:07:50 zani/zani kernel: dpkg-buildpackage invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
...
Feb  3 21:07:50 zani/zani kernel: oom_reaper: reaped process 36655 (dpkg-buildpacka), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
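
For checking other buildd syslogs for the same pattern, a minimal Python
sketch along these lines may help. It assumes only the two kernel message
shapes quoted above ("invoked oom-killer" and "oom_reaper: reaped process");
the names summarize_oom, INVOKED, REAPED and SAMPLE are made up for
illustration and are not part of any existing tool.

```python
import re

# Assumed message shapes, taken from the excerpt above.
INVOKED = re.compile(r"kernel: (\S+) invoked oom-killer")
REAPED = re.compile(r"oom_reaper: reaped process (\d+)\s+\(([^)]+)\)")


def summarize_oom(text):
    """Return (invokers, reaped) found in a syslog excerpt.

    invokers: process names that triggered the OOM killer, in order.
    reaped:   (pid, name) pairs reported by oom_reaper, in order.
    """
    # Collapse all whitespace so messages wrapped by a mail client still match.
    flat = " ".join(text.split())
    invokers = INVOKED.findall(flat)
    reaped = [(int(pid), name) for pid, name in REAPED.findall(flat)]
    return invokers, reaped


# A few lines copied from the log above, including the mail line wrapping.
SAMPLE = """\
Feb  3 20:45:22 zani/zani kernel: redis-server invoked oom-killer:
gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
Feb  3 21:07:50 zani/zani kernel: oom_reaper: reaped process 45952
(redis-server), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Feb  3 21:07:50 zani/zani kernel: oom_reaper: reaped process 34994 (sbuild),
now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
"""

if __name__ == "__main__":
    invokers, reaped = summarize_oom(SAMPLE)
    print("invoked oom-killer:", invokers)
    print("reaped:", reaped)
```

If sbuild or schroot appear among the reaped processes, as they do here, the
build was killed by the kernel rather than failing or timing out in the test
suite. Process names containing spaces would not be matched by the first
pattern.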

> If it helps, this test was added in this commit:
> 
>   https://github.com/antirez/redis/commit/7ca00d694d44be13a3ff9ff1c96b49222ac9463b
> 
> ... which was in:
> 
>   $ git tag --contains 7ca00d694d44be13a3ff9ff1c96b49222ac9463b
>   6.2-rc1
>   6.2-rc2
>   6.2-rc3
> 
> Not sure if previous s390x builds were failing, which might be
> another route to fixing this.
> 

The most recent s390x log on 
https://buildd.debian.org/status/logs.php?pkg=redis&arch=s390x is for
5:6.2~rc1-3

Looking back, the 6.2~rc2-1 build ends with:

*** [err]: Slave is able to detect timeout during handshake in tests/integration/replication.tcl

The 6.2~rc2-2 build on zandonai ends with similar OOM logs in syslog as
those from zani above, as does the -3 build.

Regards,

Adam
