On Sun, Apr 08, 2018 at 04:01:11AM +0300, Thanos Baloukas wrote:
> On 07/04/2018 11:57 PM, Ken Moffat wrote:
> > 
> > Every time I run the tests, within a couple of minutes I get two
> > traps for invalid opcodes, followed a few minutes later by two
> > segfaults (noted by reading syslog, the tests seem to continue as
> > normal).  But if I leave the tests running, about 20 to 30 minutes
> > later (before the end of the tests, according to what got written
> > to the log) the machine reboots.  These are all that got logged the
> > first time, nothing related to the reboot.
> > 
> > Apr  6 06:37:48 origin kernel: [ 3695.201558] traps: backtrace.stage[17714] 
> > trap invalid opcode ip:7f1e76696498 sp:7ffc6848af80 error:0 in 
> > libstd-23815cc482a70678.so[7f1e7663c000+155000]
> > Apr  6 06:37:48 origin kernel: [ 3695.231521] traps: backtrace.stage[17720] 
> > trap invalid opcode ip:7f76e0ff0498 sp:7ffc2bf46850 error:0 in 
> > libstd-23815cc482a70678.so[7f76e0f96000+155000]
> > Apr  6 06:41:17 origin kernel: [ 3903.999612] segfault-no-out[31250]: 
> > segfault at 0 ip 0000555aa6cd0b69 sp 00007ffeb0d96fa0 error 6 in 
> > segfault-no-out-of-stack.stage2-x86_64-unknown-linux-gnu[555aa6ccd000+5000]
> > Apr  6 06:41:23 origin kernel: [ 3910.633482] signal-exit-sta[32055]: 
> > segfault at 1 ip 000055afe3b44efc sp 00007fffc985f660 error 6 in 
> > signal-exit-status.stage2-x86_64-unknown-linux-gnu[55afe3b42000+4000]
> > Apr  6 06:41:26 origin kernel: [ 3913.231883] traps: simd-target-fea[32330] 
> > trap invalid opcode ip:55b72481e89c sp:7fff97757750 error:0 in 
> > simd-target-feature-mixup.stage2-x86_64-unknown-linux-gnu[55b72481b000+7000]

> I was going to get an 8 core 16 thread Ryzen, but after the problems
> they exhibited I decided to get a 4 core 4 thread Ryzen 3 1200 for now,
> and upgrade when Ryzen 2 comes out and has been tested and proven stable.
> 
> I knew that I needed a host with a recent gcc to build LFS, so I used Fedora
> 27 with gcc-7.3. Everything went smoothly, except for some core
> dumps during the gcc tests. I ran the tests only for the packages where
> the book recommends running them.
> 
> The only problem the machine had was freezing on idle. In the beginning
> I disabled the cool 'n quiet in the bios and that solved the problem.
> Then I found this bug report,
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=196683
> 

I've been running cool 'n quiet without any problems (apart from
this). Using lm_sensors I can see CPU frequencies at 3592 MHz on
all cores, or a bit above 3600 on only one core - for a nominal max
frequency of 3500 - and much lower frequencies when less busy.

To be clear, you turned off C6 if that is available as a bios
option, OR enabled CONFIG_RCU_NOCB_CPU and added a boot argument of
rcu_nocbs=0-3 ?
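
In case it helps to check which of these is in effect, here is a rough
sketch that looks for an rcu_nocbs= argument on a kernel command line.
The function name has_rcu_nocbs is made up for illustration, and it says
nothing about whether the kernel was built with the matching config
option, or about the C6 setting in the bios.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line
# contains an rcu_nocbs=... argument, and print its value.
# $1 = the whole command line as one string.
has_rcu_nocbs() {
    # Unquoted on purpose, so the string is split into arguments.
    for arg in $1; do
        case "$arg" in
            rcu_nocbs=*)
                printf '%s\n' "${arg#rcu_nocbs=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

On a running system the command line is in /proc/cmdline, so something
like this should do:

  has_rcu_nocbs "$(cat /proc/cmdline)" || echo "rcu_nocbs not set"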

Mine has a recent motherboard, the bios seems to be a few versions
on from what was initially used.

I've not had freezes while idle - even the start of the rust
testsuites is far from idle (some compiles) - but it's another
avenue to explore.

> enabled cool 'n quiet, tried the proposed RCU solution and it worked.
> Since then the system has been rock stable. CPU at stock speed, 32GB ram at
> rated 3200MHz. Ram speed is not that important for our use, but I
> favored that ram kit because it is one of the most compatible with
> Ryzen according to many testers' and users' reports.
> 

Yes, RAM variations seem to have caused problems (although recent
bios updates are supposed to help).  Mine is 2666 but only 2x4GB.

> On my svn installation I have rust-1.25.0 and used that to build
> firefox-59.0.2. When I read your mail I installed gdb and rebuilt
> to run the tests. No reboots. About 15 tests dumped core. Indicatively:
> 
> systemd-coredump[2185]: Process 2172 (issue-24313.sta) of user 1001 dumped
> core.
> 
> Stack trace of thread 2183:
> #0  0x00007f7f8e582e5b __GI_raise (libc.so.6)
> #1  0x00007f7f8e584211 __GI_abort (libc.so.6)
> #2  0x00007f7f8edb2339 _ZN3std3sys4unix14abort_internal17he99af7afd6098d88E
> (libstd-23815cc482a70678.so)
> #3  0x00007f7f8edc6a8b _ZN3std10sys_common4util5abort17h240ecd99fa33591bE
> (libstd-23815cc482a70678.so)
> #4  0x00007f7f8ed976a2 rust_panic (libstd-23815cc482a70678.so)
> #5  0x00007f7f8ed975c2
> _ZN3std9panicking20rust_panic_with_hook17hfcd2a02d3b761d28E
> (libstd-23815cc482a70678.so)
> #6  0x00005602b2dc24c4 n/a 
> (/build/tmp/rustc-1.25.0-src/build/x86_64-unknown-linux-gnu/test/run-pass/issue-24313.stage2-x86_64-unknown-linux-gnu)
> 
> 3 traps:
> 
> Απρ 08 02:17:50 kernel: traps: backtrace.stage[11039] trap invalid opcode
> ip:7fd12c93a488 sp:7ffc99eb6230 error:0 in
> libstd-23815cc482a70678.so[7fd12c8e0000+154000]
> 
> Απρ 08 02:17:50 kernel: traps: backtrace.stage[11079] trap invalid opcode
> ip:7f8652d97488 sp:7ffc5aae6000 error:0 in
> libstd-23815cc482a70678.so[7f8652d3d000+154000]
> 
> Απρ 08 02:22:03 kernel: traps: simd-target-fea[25911] trap invalid opcode
> ip:56325971a88c sp:7ffe1a3878d0 error:0 in
> simd-target-feature-mixup.stage2-x86_64-unknown-linux-gnu[563259717000+7000]
> 

For that, *many* thanks - it confirms the problem is only when
running the tests (or rather, there is no problem with building
or running firefox ;)

I suppose that the segfaults, and reboots, might be down to a more
machine-specific issue.
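
One way to compare the traps from the two machines is to subtract the
mapping base (the first number in the square brackets) from the ip,
which gives the offset into the object. A minimal sketch in shell
arithmetic, where trap_offset is just a name I made up:

```shell
#!/bin/sh
# Hypothetical helper: offset of a faulting ip within its mapping.
# $1 = ip, $2 = mapping base, both hex without a 0x prefix,
# as they appear in the kernel's "traps:" lines.
trap_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

# backtrace.stage in libstd, first from my log, then from yours:
trap_offset 7f1e76696498 7f1e7663c000    # prints 0x5a498
trap_offset 7fd12c93a488 7fd12c8e0000    # prints 0x5a488
```

If I have the arithmetic right, the two builds trap within 0x10 bytes
of each other in libstd, which would fit the same code being hit on
both machines.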

> Results:
> 
> grep 'running .* tests' ../rustc-testlog | awk '{ sum += $2 } END { print sum }'
> 15736
> 
> grep '^test result:' ../rustc-testlog | awk '{ sum += $6 } END { print sum }'
> 1
> 

That is the FAIL total, I think.  You have gdb, so those 120 (?)
gdb-related tests should pass (I didn't have that).  But I still had
more than one failure - the two thumb v6 tests, and I think something
else.  Unfortunately, I let mine overwrite the earlier log and for
the later builds I stopped after the segfaults.

> grep '^test result:' ../rustc-testlog | awk '{ sum += $4 } END { print sum }'
> 15208
> 
> grep '^test result:' ../rustc-testlog | awk '{ sum += $8 } END { print sum }'
> 726
> 
> grep '^test result:' ../rustc-testlog | awk '{ sum += $10 } END { print sum }'
> 0
> 
> grep '^test result:' ../rustc-testlog | awk '{ sum += $12 } END { print sum }'
> 0
> 

Maths seems to still not be a strong point for the overall total :)
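
For what it's worth, the separate greps can be folded into one awk pass
that also adds up the per-category sums for comparison with the
"running N tests" total. This is only a sketch: it assumes the field
positions used in the commands above ($4 passed, $6 failed, $8 ignored,
$10 measured, $12 filtered out), and summarise_tests is a made-up name.
Unlike the grep above, the pattern here also accepts a singular
"running 1 test" line, which I believe the harness prints but have not
checked against a log.

```shell
#!/bin/sh
# Hypothetical helper: total up a rustc test log in one pass.
# $1 = path to the log (../rustc-testlog in the commands above).
summarise_tests() {
    awk '
        # "running N tests", or "running 1 test" if that form occurs
        /running [0-9]+ tests?/ { run += $2 }
        /^test result:/ {
            pass += $4; fail += $6; ign += $8; meas += $10; filt += $12
        }
        END {
            printf "run %d\n", run
            printf "passed %d failed %d ignored %d measured %d filtered %d\n",
                   pass, fail, ign, meas, filt
            printf "sum of results %d\n", pass + fail + ign + meas + filt
        }
    ' "$1"
}
```

Run as: summarise_tests ../rustc-testlog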

I guess I'd better install gdb in case that makes a difference
(beyond passing more tests it seems unlikely, but the details of
rust builds from source seem to be poorly documented, or to have
changed since they were documented).

I'll also look at C6.  And I might have to leave it running even if
there are traps, to see if it reboots.

Thanks.

ĸen
-- 
In my seventh decade astride this planet, and as my own cells degrade,
there are some things I cannot do now: skydiving, marathon running,
calculus. I couldn't do them in my 20s either, so no big loss.
            -- Derek Smalls, formerly of Spinal Tap
