Re: [blfs-dev] Anyone building rustc on a ryzen ?

2018-04-09 Thread Thanos Baloukas

On 09/04/2018 06:36 PM, Ken Moffat wrote:
> Yes, I understood.  But in the absence of any other strategy I
> decided to try it just in case (primarily because this was the first
> time I'd seen 3.7GHz during this run and the load was fairly low).
>
> I took a look at the BIOS options after I wrote that - but this is
> the low-end A320 with no overclocking options for the CPU.  And the
> RAM seemed fine in memtest86.  Will have another look later, there
> was one BIOS option I need to research (something like 'use ASRock
> profile or AMD profile' - I left it on the ASRock default), and maybe
> RAM voltage can be adjusted.
>
> ĸen



I can't predict what Ryzen 2 will be like. The first generation, however,
was a lottery, so I thought that if I were unlucky and had to do battle
with a buggy CPU, I had better have a full-featured motherboard in my
arsenal. ASRock here too, but an X370 Taichi.

--
Thanos
--
http://lists.linuxfromscratch.org/listinfo/blfs-dev
FAQ: http://www.linuxfromscratch.org/blfs/faq.html
Unsubscribe: See the above information page


Re: [blfs-dev] Anyone building rustc on a ryzen ?

2018-04-09 Thread Ken Moffat
On Mon, Apr 09, 2018 at 10:47:44AM +0300, Thanos Baloukas wrote:
> On 09/04/2018 03:07 AM, Ken Moffat wrote:
> > 
> > Thanks.  Tried that, but it did not help.  I also tried logging
> > output from top and sensors every 5 seconds (that took two goes,
> > first time I fubar'd and emptied the log on every pass through the
> > loop).  I noticed that 5 seconds before the crash, one core was
> > running at just over 3.7 GHz (boosted) and the others were slowing
> > down.
> > 
> What I did is just a workaround for the freezing-on-idle problem, and I
> wrote it just to answer your question. It was in no way a suggestion
> to solve your problem. As I wrote in my previous reply, I would suspect
> not enough voltage fed to the CPU and/or the RAM, causing the reboots.
> 
Yes, I understood.  But in the absence of any other strategy I
decided to try it just in case (primarily because this was the first
time I'd seen 3.7GHz during this run and the load was fairly low).

I took a look at the BIOS options after I wrote that - but this is
the low-end A320 with no overclocking options for the CPU.  And the
RAM seemed fine in memtest86.  Will have another look later, there
was one BIOS option I need to research (something like 'use ASRock
profile or AMD profile' - I left it on the ASRock default), and maybe
RAM voltage can be adjusted.

ĸen
-- 
In my seventh decade astride this planet, and as my own cells degrade,
there are some things I cannot do now: skydiving, marathon running,
calculus. I couldn't do them in my 20s either, so no big loss.
-- Derek Smalls, formerly of Spinal Tap


Re: [blfs-dev] Anyone building rustc on a ryzen ?

2018-04-09 Thread Thanos Baloukas

On 09/04/2018 03:07 AM, Ken Moffat wrote:
> On Sun, Apr 08, 2018 at 12:11:39PM +0300, Thanos Baloukas wrote:
> > On 08/04/2018 06:30 AM, Ken Moffat wrote:
> > > On Sun, Apr 08, 2018 at 04:01:11AM +0300, Thanos Baloukas wrote:
> > > > On 07/04/2018 11:57 PM, Ken Moffat wrote:
> > > >
> > > > The only problem the machine had was freezing on idle. In the beginning
> > > > I disabled the cool 'n quiet in the bios and that solved the problem.
> > > > Then I found this bug report,
> > > >
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=196683
> > > >
> > >
> > > I've been running cool 'n quiet without any problems (apart from
> > > this) and using lm_sensors I can see CPU frequencies at 3592 MHz on
> > > all cores, or a bit above 3600 on only one core - for a nominal max
> > > frequency of 3500 - and much lower frequencies when less busy.
> > >
> > > To be clear, you turned off C6 if that is available as a bios
> > > option, OR added CONFIG_RCU_NOCB_CPU and added a boot argument of
> > > rcu_nocbs=0-3 ?
> > >


> > Didn't play with C6 at all. Using a kernel with
> >
> > # RCU Subsystem
> > CONFIG_TREE_RCU=y
> > # CONFIG_RCU_EXPERT is not set
> > CONFIG_SRCU=y
> > CONFIG_TREE_SRCU=y
> > CONFIG_TASKS_RCU=y
> > CONFIG_RCU_STALL_COMMON=y
> > CONFIG_RCU_NEED_SEGCBLIST=y
> > CONFIG_RCU_NOCB_CPU=y
> > CONFIG_HAVE_RCU_TABLE_FREE=y
> > # RCU Debugging
> > # CONFIG_PROVE_RCU is not set
> > # CONFIG_RCU_PERF_TEST is not set
> > CONFIG_RCU_TORTURE_TEST=m
> > CONFIG_RCU_CPU_STALL_TIMEOUT=60
> > # CONFIG_RCU_TRACE is not set
> > # CONFIG_RCU_EQS_DEBUG is not set
> >
> > and rcu_nocbs=0-3 on kernel command line was enough.



> Thanks.  Tried that, but it did not help.  I also tried logging
> output from top and sensors every 5 seconds (that took two goes,
> first time I fubar'd and emptied the log on every pass through the
> loop).  I noticed that 5 seconds before the crash, one core was
> running at just over 3.7 GHz (boosted) and the others were slowing
> down.


What I did is just a workaround for the freezing-on-idle problem, and I
wrote it just to answer your question. It was in no way a suggestion
to solve your problem. As I wrote in my previous reply, I would suspect
not enough voltage fed to the CPU and/or the RAM, causing the reboots.


> So I decided to try turning off C6 states anyway, using zenstates.py
> (it's on github with some other zen stuff) - bad mistake : the Xorg
> display went blank while I was watching, but the box did not reboot
> and neither of the front-panel switches (reboot, on/off) worked.
>
> For the moment I'm going to step away from this (I've got something
> else to install and test here).
>
> ĸen



--
Thanos


Re: [blfs-dev] Anyone building rustc on a ryzen ?

2018-04-08 Thread Ken Moffat
On Sun, Apr 08, 2018 at 12:11:39PM +0300, Thanos Baloukas wrote:
> On 08/04/2018 06:30 AM, Ken Moffat wrote:
> > On Sun, Apr 08, 2018 at 04:01:11AM +0300, Thanos Baloukas wrote:
> > > On 07/04/2018 11:57 PM, Ken Moffat wrote:
> > > > 
> > > 
> > > The only problem the machine had was freezing on idle. In the beginning
> > > I disabled the cool 'n quiet in the bios and that solved the problem.
> > > Then I found this bug report,
> > > 
> > > https://bugzilla.kernel.org/show_bug.cgi?id=196683
> > > 
> > 
> > I've been running cool 'n quiet without any problems (apart from
> > this) and using lm_sensors I can see CPU frequencies at 3592 MHz on
> > all cores, or a bit above 3600 on only one core - for a nominal max
> > frequency of 3500 - and much lower frequencies when less busy.
> > 
> > To be clear, you turned off C6 if that is available as a bios
> > option, OR added CONFIG_RCU_NOCB_CPU and added a boot argument of
> > rcu_nocbs=0-3 ?
> > 
> Didn't play with C6 at all. Using a kernel with
> 
> # RCU Subsystem
> CONFIG_TREE_RCU=y
> # CONFIG_RCU_EXPERT is not set
> CONFIG_SRCU=y
> CONFIG_TREE_SRCU=y
> CONFIG_TASKS_RCU=y
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_RCU_NEED_SEGCBLIST=y
> CONFIG_RCU_NOCB_CPU=y
> CONFIG_HAVE_RCU_TABLE_FREE=y
> # RCU Debugging
> # CONFIG_PROVE_RCU is not set
> # CONFIG_RCU_PERF_TEST is not set
> CONFIG_RCU_TORTURE_TEST=m
> CONFIG_RCU_CPU_STALL_TIMEOUT=60
> # CONFIG_RCU_TRACE is not set
> # CONFIG_RCU_EQS_DEBUG is not set
> 
> and rcu_nocbs=0-3 on kernel command line was enough.
> 

Thanks.  Tried that, but it did not help.  I also tried logging
output from top and sensors every 5 seconds (that took two goes,
first time I fubar'd and emptied the log on every pass through the
loop).  I noticed that 5 seconds before the crash, one core was
running at just over 3.7 GHz (boosted) and the others were slowing
down.
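
A minimal sketch of a logging loop of this kind, in Python. It assumes
top (procps) and sensors (lm_sensors) are on the PATH; the function names,
the log path in the usage note and the fsync call are illustrative choices,
not what was actually run here. The log is opened in append mode so that
earlier samples survive each pass, and each sample is forced to disk so the
last one before a reboot is less likely to be lost.

```python
import os
import subprocess
import time
from datetime import datetime


def snapshot(commands):
    """Run each command once and return the collected output as one text block."""
    parts = [f"=== {datetime.now().isoformat()} ==="]
    for cmd in commands:
        try:
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
            parts.append(f"$ {' '.join(cmd)}\n{result.stdout}")
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing or hung tool should not stop the logging loop.
            parts.append(f"$ {' '.join(cmd)}\nfailed: {exc}")
    return "\n".join(parts) + "\n"


def append_snapshot(path, text):
    """Append to the log (never truncate it) and push the data to disk."""
    with open(path, "a") as log:  # "a", not "w": "w" would empty the log on every pass
        log.write(text)
        log.flush()
        os.fsync(log.fileno())


def log_forever(path, interval=5.0):
    """Sample top and sensors every `interval` seconds until interrupted."""
    # Assumed commands: one batch-mode iteration of top, and plain sensors output.
    commands = [["top", "-b", "-n", "1"], ["sensors"]]
    while True:
        append_snapshot(path, snapshot(commands))
        time.sleep(interval)
```

Calling log_forever("/var/tmp/ryzen-watch.log") (a hypothetical path) would
then leave a timestamped record whose final entries show the state shortly
before a crash.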

So I decided to try turning off C6 states anyway, using zenstates.py
(it's on github with some other zen stuff) - bad mistake : the Xorg
display went blank while I was watching, but the box did not reboot
and neither of the front-panel switches (reboot, on/off) worked.

For the moment I'm going to step away from this (I've got something
else to install and test here).

ĸen


Re: [blfs-dev] Anyone building rustc on a ryzen ?

2018-04-08 Thread Thanos Baloukas

On 08/04/2018 06:30 AM, Ken Moffat wrote:
> On Sun, Apr 08, 2018 at 04:01:11AM +0300, Thanos Baloukas wrote:
> > On 07/04/2018 11:57 PM, Ken Moffat wrote:
> > >
> >
> > The only problem the machine had was freezing on idle. In the beginning
> > I disabled the cool 'n quiet in the bios and that solved the problem.
> > Then I found this bug report,
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=196683
> >
>
> I've been running cool 'n quiet without any problems (apart from
> this) and using lm_sensors I can see CPU frequencies at 3592 MHz on
> all cores, or a bit above 3600 on only one core - for a nominal max
> frequency of 3500 - and much lower frequencies when less busy.
>
> To be clear, you turned off C6 if that is available as a bios
> option, OR added CONFIG_RCU_NOCB_CPU and added a boot argument of
> rcu_nocbs=0-3 ?
>


Didn't play with C6 at all. Using a kernel with

# RCU Subsystem
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_RCU_NOCB_CPU=y
CONFIG_HAVE_RCU_TABLE_FREE=y
# RCU Debugging
# CONFIG_PROVE_RCU is not set
# CONFIG_RCU_PERF_TEST is not set
CONFIG_RCU_TORTURE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set

and rcu_nocbs=0-3 on kernel command line was enough.
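
One way to double-check that both halves of this workaround are in place is
sketched below in Python. The helper names are hypothetical, and where the
config text comes from (for example a saved .config from the kernel build
tree) is an assumption; the running command line can be read from
/proc/cmdline.

```python
def config_enabled(config_text, option):
    """Return True if a kernel .config text sets `option` to y."""
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments and are skipped.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name == option:
            return value == "y"
    return False


def cmdline_value(cmdline, key):
    """Return the value of key=value on a kernel command line, or None."""
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if name == key and sep:
            return value
    return None


def expand_cpu_list(spec):
    """Expand a CPU list such as '0-3' or '0,2-3' into a sorted list of ints."""
    cpus = set()
    for part in spec.split(","):
        first, sep, last = part.partition("-")
        cpus.update(range(int(first), int(last if sep else first) + 1))
    return sorted(cpus)
```

On a 4-core, 4-thread CPU, expand_cpu_list(cmdline_value(text, "rcu_nocbs"))
should list all four CPUs, 0 to 3, if the boot argument took effect.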


> Mine has a recent motherboard, the bios seems to be a few versions
> on from what was initially used.

There is a new BIOS version for Ryzen 2 for my motherboard, but I
thought it would be better not to update now, since the system is in a
good state and the update could make things worse. Ryzen 2 comes out in
a few weeks and I'll do the update if and when I get the new CPU.

> Yes, RAM variations seem to have caused problems (although recent
> bios updates are supposed to help).  Mine is 2666 but only 2x4GB.

With 2x16, while compiling with all cores I can do everything with no
lag at all.

> For that, *many* thanks - it confirms the problem is only when
> running the tests (or rather, there is no problem with building
> or running firefox ;)

Yes, firefox-59.0.2 built with rust-1.25 runs fine.

> I suppose that the segfaults, and reboots, might be down to a more
> machine-specific issue.

Possibly related to not enough voltage being fed to the CPU, which can be
addressed with BIOS updates. There are reports that overclocking the CPU
or increasing the voltages slightly improves things, but as I said I
intend to upgrade soon, so I have not tested anything for the moment.

> Maths seems to still not be a strong point for the overall total :)

I noticed that too, but did not search further.

> I'll also look at C6.  And I might have to leave it running even if
> there are traps, to see if it reboots.
>
> Thanks.

Thank you for your valuable contribution to LFS.

--
Thanos


Re: [blfs-dev] Anyone building rustc on a ryzen ?

2018-04-07 Thread Ken Moffat
On Sun, Apr 08, 2018 at 04:01:11AM +0300, Thanos Baloukas wrote:
> On 07/04/2018 11:57 PM, Ken Moffat wrote:
> > 
> > Every time I run the tests, within a couple of minutes I get two
> > traps for invalid opcodes, followed a few minutes later by two
> > segfaults (noted by reading syslog, the tests seem to continue as
> > normal).  But if I leave the tests running, about 20 to 30 minutes
> > later (before the end of the tests, according to what got written
> > to the log) the machine reboots.  These are all that got logged the
> > first time, nothing related to the reboot.
> > 
> > Apr  6 06:37:48 origin kernel: [ 3695.201558] traps: backtrace.stage[17714] 
> > trap invalid opcode ip:7f1e76696498 sp:7ffc6848af80 error:0 in 
> > libstd-23815cc482a70678.so[7f1e7663c000+155000]
> > Apr  6 06:37:48 origin kernel: [ 3695.231521] traps: backtrace.stage[17720] 
> > trap invalid opcode ip:7f76e0ff0498 sp:7ffc2bf46850 error:0 in 
> > libstd-23815cc482a70678.so[7f76e0f96000+155000]
> > Apr  6 06:41:17 origin kernel: [ 3903.999612] segfault-no-out[31250]: 
> > segfault at 0 ip 555aa6cd0b69 sp 7ffeb0d96fa0 error 6 in 
> > segfault-no-out-of-stack.stage2-x86_64-unknown-linux-gnu[555aa6ccd000+5000]
> > Apr  6 06:41:23 origin kernel: [ 3910.633482] signal-exit-sta[32055]: 
> > segfault at 1 ip 55afe3b44efc sp 7fffc985f660 error 6 in 
> > signal-exit-status.stage2-x86_64-unknown-linux-gnu[55afe3b42000+4000]
> > Apr  6 06:41:26 origin kernel: [ 3913.231883] traps: simd-target-fea[32330] 
> > trap invalid opcode ip:55b72481e89c sp:7fff97757750 error:0 in 
> > simd-target-feature-mixup.stage2-x86_64-unknown-linux-gnu[55b72481b000+7000]
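
The "ip" and "[base+size]" fields in these trap lines can be turned into an
offset inside the mapped object by subtracting the base from the ip. A small
Python sketch of that arithmetic follows; the regular expression and function
name are illustrative, and it assumes each wrapped log entry has first been
joined back into a single line.

```python
import re

# Matches the "ip:<hex> sp:<hex> error:<n> in <object>[<base>+<size>]" part
# of a kernel "trap invalid opcode" line.
TRAP_RE = re.compile(
    r"trap invalid opcode\s+ip:(?P<ip>[0-9a-f]+)\s+sp:[0-9a-f]+\s+error:\d+\s+in\s+"
    r"(?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def trap_offset(line):
    """Return (object name, offset of the faulting ip within it), or None."""
    match = TRAP_RE.search(line)
    if match is None:
        return None
    offset = int(match.group("ip"), 16) - int(match.group("base"), 16)
    return match.group("obj"), offset
```

For the two backtrace.stage entries logged above this gives 0x5a498 in both
cases (7f1e76696498 - 7f1e7663c000 and 7f76e0ff0498 - 7f76e0f96000), i.e. the
same location in libstd in two different processes. That looks more like one
particular instruction being reached than like random corruption, although
the log alone does not prove it.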

> I was going to get an 8-core, 16-thread Ryzen, but after the problems
> they exhibited I decided to get a 4-core, 4-thread Ryzen 3 1200 for now,
> and upgrade once Ryzen 2 is out and has been tested and proven stable.
> 
> I knew that I needed a host with a recent gcc to build LFS, so I used Fedora
> 27 with gcc-7.3. Everything went smoothly, except for some core
> dumps during the gcc tests. I ran the tests only for the packages where
> the book recommends it.
> 
> The only problem the machine had was freezing on idle. In the beginning
> I disabled the cool 'n quiet in the bios and that solved the problem.
> Then I found this bug report,
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=196683
> 

I've been running cool 'n quiet without any problems (apart from
this) and using lm_sensors I can see CPU frequencies at 3592 MHz on
all cores, or a bit above 3600 on only one core - for a nominal max
frequency of 3500 - and much lower frequencies when less busy.

To be clear, you turned off C6 if that is available as a bios
option, OR added CONFIG_RCU_NOCB_CPU and added a boot argument of
rcu_nocbs=0-3 ?

Mine has a recent motherboard, the bios seems to be a few versions
on from what was initially used.

I've not had freezes while idle - even the start of the rust
testsuites is far from idle (some compiles) - but it's another
avenue to explore.

> re-enabled cool 'n quiet, tried the proposed RCU solution and it worked.
> Since then the system has been rock stable: CPU at stock speed, 32GB RAM at
> its rated 3200MHz. RAM speed is not that important for our use, but I
> favored that RAM kit because it is one of the most compatible with
> Ryzen according to many testers' and users' reports.
> 

Yes, RAM variations seem to have caused problems (although recent
bios updates are supposed to help).  Mine is 2666 but only 2x4GB.

> On my svn installation I have rust-1.25.0 and used that to build
> firefox-59.0.2. When I read your mail I installed gdb and rebuilt
> to run the tests. No reboots. About 15 tests dumped core. Indicatively:
> 
> systemd-coredump[2185]: Process 2172 (issue-24313.sta) of user 1001 dumped
> core.
> 
> Stack trace of thread 2183:
> #0  0x7f7f8e582e5b __GI_raise (libc.so.6)
> #1  0x7f7f8e584211 __GI_abort (libc.so.6)
> #2  0x7f7f8edb2339 _ZN3std3sys4unix14abort_internal17he99af7afd6098d88E
> (libstd-23815cc482a70678.so)
> #3  0x7f7f8edc6a8b _ZN3std10sys_common4util5abort17h240ecd99fa33591bE
> (libstd-23815cc482a70678.so)
> #4  0x7f7f8ed976a2 rust_panic (libstd-23815cc482a70678.so)
> #5  0x7f7f8ed975c2
> _ZN3std9panicking20rust_panic_with_hook17hfcd2a02d3b761d28E
> (libstd-23815cc482a70678.so)
> #6  0x5602b2dc24c4 n/a 
> (/build/tmp/rustc-1.25.0-src/build/x86_64-unknown-linux-gnu/test/run-pass/issue-24313.stage2-x86_64-unknown-linux-gnu)
> 
> 3 traps:
> 
> Apr 08 02:17:50 kernel: traps: backtrace.stage[11039] trap invalid opcode
> ip:7fd12c93a488 sp:7ffc99eb6230 error:0 in
> libstd-23815cc482a70678.so[7fd12c8e+154000]
> 
> Apr 08 02:17:50 kernel: traps: backtrace.stage[11079] trap invalid opcode
> ip:7f8652d97488 sp:7ffc5aae6000 error:0 in
> libstd-23815cc482a70678.so[7f8652d3d000+154000]
> 
> Apr 08 02:22:03 kernel: traps: simd-target-fea[25911] trap invalid opcode
> ip:56325971a88c sp:7ffe1a3878d0 error:0 in
> 

Re: [blfs-dev] Anyone building rustc on a ryzen ?

2018-04-07 Thread Thanos Baloukas

On 07/04/2018 11:57 PM, Ken Moffat wrote:
> Does anyone here use a ryzen ?  I got a 1300X recently, and building
> current rustc - and using that to build firefox - seems to work
> fine.  BUT - I had not bothered to run the rustc testsuite because
> it takes so long.
>
> But those initial builds of rustc were to install it in /opt -
> seemed good enough, so I decided to install it in /usr and to see
> how the test results looked.  Disaster.
>
> Every time I run the tests, within a couple of minutes I get two
> traps for invalid opcodes, followed a few minutes later by two
> segfaults (noted by reading syslog, the tests seem to continue as
> normal).  But if I leave the tests running, about 20 to 30 minutes
> later (before the end of the tests, according to what got written
> to the log) the machine reboots.  These are all that got logged the
> first time, nothing related to the reboot.

> Apr  6 06:37:48 origin kernel: [ 3695.201558] traps: backtrace.stage[17714] 
> trap invalid opcode ip:7f1e76696498 sp:7ffc6848af80 error:0 in 
> libstd-23815cc482a70678.so[7f1e7663c000+155000]
> Apr  6 06:37:48 origin kernel: [ 3695.231521] traps: backtrace.stage[17720] 
> trap invalid opcode ip:7f76e0ff0498 sp:7ffc2bf46850 error:0 in 
> libstd-23815cc482a70678.so[7f76e0f96000+155000]
> Apr  6 06:41:17 origin kernel: [ 3903.999612] segfault-no-out[31250]: segfault 
> at 0 ip 555aa6cd0b69 sp 7ffeb0d96fa0 error 6 in 
> segfault-no-out-of-stack.stage2-x86_64-unknown-linux-gnu[555aa6ccd000+5000]
> Apr  6 06:41:23 origin kernel: [ 3910.633482] signal-exit-sta[32055]: segfault 
> at 1 ip 55afe3b44efc sp 7fffc985f660 error 6 in 
> signal-exit-status.stage2-x86_64-unknown-linux-gnu[55afe3b42000+4000]
> Apr  6 06:41:26 origin kernel: [ 3913.231883] traps: simd-target-fea[32330] 
> trap invalid opcode ip:55b72481e89c sp:7fff97757750 error:0 in 
> simd-target-feature-mixup.stage2-x86_64-unknown-linux-gnu[55b72481b000+7000]

> Now that I know it will reboot, I've spent some hours trying to work
> around this (I'm assuming it uses opcodes not valid on Ryzen, but
> for something which is only exercised by the testsuite) - failed
> completely - and eventually raised an issue.
>
> However, I've now looked at rustc-1.22.1 and that too gets the same
> two traps.  So, I'm thinking that I should put a big warning in the
> book to not run the tests on a Ryzen.
>
> But confirmation that the tests segfault on somebody else's ryzen
> would be conclusive, which is why I'm asking: has anyone already run
> the rustc tests on a ryzen ?  If not, is anybody willing to give it
> a go on a ryzen ?  Waiting for it to reboot is not required, just
> keep an eye on the system log or journal and stop the tests if the
> segfaults occur.
>
> My machine otherwise seems stable (I replaced the nvidia graphics
> card with a low-end radeon because nouveau frequently locked up X;
> since then it has seemed fine apart from this).  And no, I'm not
> overclocking, the CPU seems to run cool (maximum I've seen was low
> 50s degrees).


I was going to get an 8-core, 16-thread Ryzen, but after the problems
they exhibited I decided to get a 4-core, 4-thread Ryzen 3 1200 for now,
and upgrade once Ryzen 2 is out and has been tested and proven stable.

I knew that I needed a host with a recent gcc to build LFS, so I used
Fedora 27 with gcc-7.3. Everything went smoothly, except for some core
dumps during the gcc tests. I ran the tests only for the packages where
the book recommends it.

The only problem the machine had was freezing on idle. In the beginning
I disabled the cool 'n quiet in the bios and that solved the problem.
Then I found this bug report,

https://bugzilla.kernel.org/show_bug.cgi?id=196683

re-enabled cool 'n quiet, tried the proposed RCU solution and it worked.
Since then the system has been rock stable: CPU at stock speed, 32GB RAM at
its rated 3200MHz. RAM speed is not that important for our use, but I
favored that RAM kit because it is one of the most compatible with
Ryzen according to many testers' and users' reports.

On my svn installation I have rust-1.25.0 and used that to build
firefox-59.0.2. When I read your mail I installed gdb and rebuilt
to run the tests. No reboots. About 15 tests dumped core. Indicatively:

systemd-coredump[2185]: Process 2172 (issue-24313.sta) of user 1001 dumped core.

Stack trace of thread 2183:
#0  0x7f7f8e582e5b __GI_raise (libc.so.6)
#1  0x7f7f8e584211 __GI_abort (libc.so.6)
#2  0x7f7f8edb2339 _ZN3std3sys4unix14abort_internal17he99af7afd6098d88E (libstd-23815cc482a70678.so)
#3  0x7f7f8edc6a8b _ZN3std10sys_common4util5abort17h240ecd99fa33591bE (libstd-23815cc482a70678.so)
#4  0x7f7f8ed976a2 rust_panic (libstd-23815cc482a70678.so)
#5  0x7f7f8ed975c2 _ZN3std9panicking20rust_panic_with_hook17hfcd2a02d3b761d28E (libstd-23815cc482a70678.so)
#6  0x5602b2dc24c4 n/a (/build/tmp/rustc-1.25.0-src/build/x86_64-unknown-linux-gnu/test/run-pass/issue-24313.stage2-x86_64-unknown-linux-gnu)


3 traps:

Apr 08 02:17:50 kernel: traps: backtrace.stage[11039] trap invalid 
opcode ip:7fd12c93a488 sp:7ffc99eb6230