Re: package building performance (was: Re: FreeBSD on AMD Epyc boards)

2018-02-17 Thread Rainer Duffner


> On 17.02.2018 at 10:09, Don Lewis wrote:
> 
> It is unfortunate that there don't seem to be any server-grade Ryzen
> motherboards.  They all seem to be gamer boards with a lot of
> unnecessary bling.



That’s because few people use servers to build packages.

Increasingly, everything else about a server matters more (fast memory, fast
networking, fast I/O), and because those parts (SSDs, 40Gb switch ports, rack
space, etc.) cost the same regardless, it’s simply not economical to skimp
on the CPU.


___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: package building performance (was: Re: FreeBSD on AMD Epyc boards)

2018-02-17 Thread Don Lewis
On 14 Feb, Mark Linimon wrote:
> On Wed, Feb 14, 2018 at 09:15:53AM +0100, Kurt Jaeger wrote:
>> On the plus side: 16+16 cores, on the minus: a low CPU clock of 2.2 GHz.
>> Would a box like this be better for a package build host instead of 4+4 cores
>> with 3.x GHz ?
> 
> In my experience, "it depends".
> 
> I think that above a certain number of cores, I/O will dominate.  I _think_;
> I have never done any metrics on any of this.
> 
> The dominant term of the equation is, as you might guess, RAM.  Previous
> experience suggests that you need at least 2GB per build.  By default,
> nbuilds is set equal to ncores.  Less than 2GB-per and you're going to be
> unhappy.
> 
> (It's true that for modern systems, where large amounts of RAM are standard,
> this is probably no longer a concern.)
> 
> Put it this way: with 4 cores and 16GB and netbooting (7GB of which was
> devoted to md(4)), I was having lots of problems on powerpc64.  The same
> machine with 64GB gives me no problems.
> 
> My guess is that after RAM, there is I/O, ncores, and speed.  But I'm just
> speculating.

I've been configuring 4 GB per builder, so on my 8-core 16-thread Ryzen
machine, that means 64 GB of RAM.  I also set USE_TMPFS to "wrkdir data
localbase" in poudriere.conf, so I'm leaning pretty heavily on RAM.  I do
figure that zfs clone is more efficient than tmpfs for the builder
jails.  With this configuration, building my default set of ports is
pretty much CPU-bound.  When it starts building the larger ports
that need a lot of space for WRKDIR, like openoffice-4,
openoffice-devel, libreoffice, chromium, etc. the machine does end up
using a lot of swap space, but it is mostly dead data from the wrkdirs,
so generally there isn't a lot of paging activity.  I also have
ALLOW_MAKE_JOBS=yes to load up the CPUs a bit more, though I did get
the best results with MAKE_JOBS_NUMBER=7 building my default port set on
this machine.  The hard drive is a fairly old WD Green that I removed
from one of my other machines, and it is plenty fast enough to keep CPU
idle % at or near zero most of the time during the build run.
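
The sizing arithmetic in the paragraph above can be sketched as follows.
This is only an illustration: the function name is made up, and it assumes
one builder per hardware thread, which is how the 64 GB figure comes out.

```python
# Rough RAM budget for poudriere builders; a sketch, not a poudriere feature.
# Assumption: one builder per hardware thread.

def builder_ram_gb(hw_threads: int, gb_per_builder: int) -> int:
    """Return the total RAM (in GB) to budget for the builders."""
    return hw_threads * gb_per_builder

ryzen_threads = 16  # 8 cores x 2 threads, as described above
print(builder_ram_gb(ryzen_threads, 4))
```

The same helper with 2 GB per builder gives the lower bound quoted earlier
in the thread.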

I did just try out "poudriere bulk -a" on this machine to build ports
for 11.1-RELEASE amd64 and got these results:

[111amd64-default] [2018-02-14_23h40m24s] [committing:] Queued: 29787 Built: 29277 Failed: 59    Skipped: 112   Ignored: 339   Tobuild: 0  Time: 47:39:48

I did notice some periods of high idle CPU during this run, but a lot
of that was due to a bunch of the builders in the fetch state at the
same time.  Without that, the runtime would have been lower.  On the
other hand, some ports failed due to a gmake issue, and others looked
like they failed due to having problems with ALLOW_MAKE_JOBS=yes.  The
runtime would have been higher without those problems.
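
For a rough sense of throughput, the figures in the status line above can
be turned into packages per hour. This is a back-of-the-envelope sketch
only; the variable names are illustrative, and the average hides the
fetch stalls and failures mentioned above.

```python
# Figures copied from the poudriere status line above.
queued, built, failed, skipped, ignored = 29787, 29277, 59, 112, 339
hours, minutes, seconds = 47, 39, 48

elapsed_s = hours * 3600 + minutes * 60 + seconds
pkgs_per_hour = built / (elapsed_s / 3600)

# Sanity check: every queued port should land in exactly one bucket.
accounted = built + failed + skipped + ignored

print(f"elapsed: {elapsed_s} s")
print(f"throughput: {pkgs_per_hour:.0f} packages/hour")
print(f"accounted for: {accounted} of {queued}")
```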

As far as Epyc goes, I think the larger core count would win.  A lot
depends on how effective cache is for this workload, so it would be
interesting to plot poudriere run time vs. clock speed.  If cache misses
dominate execution time, then lowering the clock speed would not hurt
that much.  Something important to keep in mind with Threadripper and
Epyc is NUMA.  For best results, all of the memory channels should be
used and the work should be distributed so that the processes on each
core primarily access RAM local to that CPU die.  If this isn't the case
then the Infinity Fabric that connects all of the CPU dies will be the
bottleneck.  The lower core clock speed on Epyc lessens that penalty,
but it is still something to be avoided if possible.

Something else to consider is price/performance.  If you want to build
packages for four OS/arch combinations, then doing it in parallel on
four Ryzen machines is likely to be both cheaper and faster than doing
the same builds sequentially on an Epyc machine with 4x the core count
and RAM.

It is unfortunate that there don't seem to be any server-grade Ryzen
motherboards.  They all seem to be gamer boards with a lot of
unnecessary bling.



Re: FreeBSD on AMD Epyc boards

2018-02-14 Thread Mike Tancsa
On 2/14/2018 3:15 AM, Kurt Jaeger wrote:
> 
> On the plus side: 16+16 cores, on the minus: a low CPU clock of 2.2 GHz.
> Would a box like this be better for a package build host instead of 4+4 cores
> with 3.x GHz ?

jail server.  Lots of processes

---Mike

> 


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


package building performance (was: Re: FreeBSD on AMD Epyc boards)

2018-02-14 Thread Mark Linimon
On Wed, Feb 14, 2018 at 09:15:53AM +0100, Kurt Jaeger wrote:
> On the plus side: 16+16 cores, on the minus: a low CPU clock of 2.2 GHz.
> Would a box like this be better for a package build host instead of 4+4 cores
> with 3.x GHz ?

In my experience, "it depends".

I think that above a certain number of cores, I/O will dominate.  I _think_;
I have never done any metrics on any of this.

The dominant term of the equation is, as you might guess, RAM.  Previous
experience suggests that you need at least 2GB per build.  By default,
nbuilds is set equal to ncores.  Less than 2GB-per and you're going to be
unhappy.

(It's true that for modern systems, where large amounts of RAM are standard,
this is probably no longer a concern.)

Put it this way: with 4 cores and 16GB and netbooting (7GB of which was
devoted to md(4)), I was having lots of problems on powerpc64.  The same
machine with 64GB gives me no problems.
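
The per-build share of RAM in the two configurations above can be worked
out with a short sketch. It assumes nbuilds equals ncores (the stated
default), that the md(4) reservation is the only RAM set aside before the
builds share the rest, and that the same 7GB reservation applies to the
64GB setup; the last point is an assumption, not something stated above.
The helper name is made up for illustration.

```python
# Per-build RAM on the powerpc64 box described above; illustrative only.
# Assumptions: nbuilds == ncores, and the md(4) reservation is the only
# RAM removed before the builds share what is left.

def gb_per_build(total_gb: float, reserved_gb: float, nbuilds: int) -> float:
    """Return the RAM (in GB) available to each concurrent build."""
    return (total_gb - reserved_gb) / nbuilds

print(gb_per_build(16, 7, 4))  # the configuration that had problems
print(gb_per_build(64, 7, 4))  # assumes the same 7GB md(4) reservation
```

Under these assumptions the 16GB setup leaves 2.25GB per build, only just
above the 2GB floor, before the kernel and everything else take their share.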

My guess is that after RAM, there is I/O, ncores, and speed.  But I'm just
speculating.

mcl


Re: FreeBSD on AMD Epyc boards

2018-02-14 Thread Kurt Jaeger
Hi!

> As a partial workaround for the Intel Meltdown bug (yes, not
> Spectre), I wanted to try out some AMD-based CPUs.  So far so good using
> a SuperMicro H11SSL-i, a decent server board using an Epyc CPU.  It has all
> the things you need and expect from a server-grade MB.

On the plus side: 16+16 cores, on the minus: a low CPU clock of 2.2 GHz.
Would a box like this be better for a package build host instead of 4+4 cores
with 3.x GHz ?

-- 
p...@opsec.eu  +49 171 3101372  2 years to go !


FreeBSD on AMD Epyc boards

2018-02-13 Thread Mike Tancsa
As a partial workaround for the Intel Meltdown bug (yes, not
Spectre), I wanted to try out some AMD-based CPUs.  So far so good using
a SuperMicro H11SSL-i, a decent server board using an Epyc CPU.  It has all
the things you need and expect from a server-grade MB:

ipmi to provide remote management (SoL to BIOS and OS) and hardware
info.  The ipmi.ko driver works great with RELENG11.  This allows for
hardware watchdog support too.

amdtemp from CURRENT also works to provide cpu temp info, although
ipmitool does this as well.

Seems to sit at about 50W when idle and tops out at about 180W with 2
SSD drives and all cores busy.


Note, the bug in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584
and discussed in
https://reviews.freebsd.org/D14347
manifests itself fairly readily.  However, the patch fixes the problem.

Attached is some dmesg info for the curious.  Seems like a decent board
for FreeBSD and we are going to start deploying in a couple of spots
once we do some more burn in and testing.

---Mike





-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-STABLE #0 r329241M: Tue Feb 13 16:20:58 EST 2018
mdtan...@epyc-bsd.sentex.ca:/usr/obj/usr/src/sys/server amd64
FreeBSD clang version 5.0.1 (tags/RELEASE_501/final 320880) (based on LLVM 5.0.1)
SRAT: No memory found for CPU 0
VT(vga): resolution 640x480
CPU: AMD EPYC 7281 16-Core Processor (2100.05-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f12  Family=0x17  Model=0x1  Stepping=2
  Features=0x178bfbff
  Features2=0x7ed8320b
  AMD Features=0x2e500800
  AMD Features2=0x35c233ff
  Structured Extended Features=0x209c01a9
  XSAVE Features=0xf
  AMD Extended Feature Extensions ID EBX=0x7
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics
real memory  = 34359738368 (32768 MB)
avail memory = 33215496192 (31676 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: < >
FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs
FreeBSD/SMP: 1 package(s) x 16 core(s) x 2 hardware threads
random: unblocking device.
ioapic0: Changing APIC ID to 128
ioapic1: Changing APIC ID to 129
ioapic2: Changing APIC ID to 130
ioapic3: Changing APIC ID to 131
ioapic4: Changing APIC ID to 132
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-55 on motherboard
ioapic2  irqs 56-87 on motherboard
ioapic3  irqs 88-119 on motherboard
ioapic4  irqs 120-151 on motherboard
SMP: AP CPU #27 Launched!
SMP: AP CPU #19 Launched!
SMP: AP CPU #11 Launched!
SMP: AP CPU #9 Launched!
SMP: AP CPU #13 Launched!
SMP: AP CPU #29 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #26 Launched!
SMP: AP CPU #28 Launched!
SMP: AP CPU #31 Launched!
SMP: AP CPU #8 Launched!
SMP: AP CPU #23 Launched!
SMP: AP CPU #21 Launched!
SMP: AP CPU #22 Launched!
SMP: AP CPU #20 Launched!
SMP: AP CPU #17 Launched!
SMP: AP CPU #18 Launched!
SMP: AP CPU #16 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #30 Launched!
SMP: AP CPU #10 Launched!
SMP: AP CPU #12 Launched!
SMP: AP CPU #25 Launched!
SMP: AP CPU #24 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #15 Launched!
SMP: AP CPU #14 Launched!
Timecounter "TSC" frequency 2100049917 Hz quality 1000
random: entropy device external interface
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0x80cf13f0, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
kbd1 at kbdmux0
nexus0
vtvga0:  on motherboard
cryptosoft0:  on motherboard
aesni0:  on motherboard
acpi0:  on motherboard
acpi0: Power Button (fixed)
cpu0:  on acpi0
cpu1:  on acpi0
cpu2:  on acpi0
cpu3:  on acpi0
cpu4:  on acpi0
cpu5:  on acpi0
cpu6:  on acpi0
cpu7:  on acpi0
cpu8:  on acpi0
cpu9:  on acpi0
cpu10:  on acpi0
cpu11:  on acpi0
cpu12:  on acpi0
cpu13:  on acpi0
cpu14:  on acpi0
cpu15:  on acpi0
cpu16:  on acpi0
cpu17:  on acpi0
cpu18:  on acpi0
cpu19:  on acpi0
cpu20:  on acpi0
cpu21:  on acpi0
cpu22:  on acpi0
cpu23:  on acpi0
cpu24:  on acpi0
cpu25:  on acpi0
cpu26:  on acpi0