Re: Intel vs AMD DragonFly 2.11 parallel kernel build tests

2011-05-12 Thread Matthew Dillon
Here is a fun statistic.  For a server running 24x7, how many days
do you have to run the Intel i7 vs the Phenom II to make up for the
$100 difference in the price tag?

Using a generous 65W for the AMD and 33W for the Intel, assuming
a mostly idle server, and $0.25/kWh, you get $0.192/day savings
with the i7 (32W x 24h = 0.768 kWh/day).  With a $100 difference in
price that comes to 520 days.

So if you are running a server 24x7 that is mostly idle, the Intel-i7
pays for its higher price in 520 days (a bit under 1.5 years).
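
For anyone who wants to plug in their own wattages or electricity rate, the arithmetic above can be sketched roughly like this.  The function name and its parameters are made up for illustration; the inputs are just the figures quoted above.

```python
def break_even_days(watts_a, watts_b, price_per_kwh, price_diff):
    """Days of 24x7 operation before the lower-power box recoups its price premium.

    Hypothetical helper for illustration; assumes constant power draw.
    """
    # Power difference in kW, times 24 hours, gives kWh saved per day.
    kwh_saved_per_day = (watts_a - watts_b) / 1000.0 * 24.0
    dollars_saved_per_day = kwh_saved_per_day * price_per_kwh
    return price_diff / dollars_saved_per_day, dollars_saved_per_day


# Figures from the post: 65W vs 33W idle, $0.25/kWh, $100 price difference.
days, per_day = break_even_days(watts_a=65, watts_b=33,
                                price_per_kwh=0.25, price_diff=100)
print(f"${per_day:.3f}/day saved, break-even after about {days:.1f} days")
```

With these inputs the result is about 520.8 days, which the post rounds down to 520.  A higher wattage gap (e.g. under load) shrinks the break-even time proportionally.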

If you are running under load the i7 will pay for its higher price
tag more quickly.  This is ignoring the i7's lack of ECC, though,
and a Xeon system will be more expensive.

-Matt


Intel vs AMD DragonFly 2.11 parallel kernel build tests

2011-05-12 Thread Matthew Dillon
PhenomIIx6 1090T 3.2GHz w/turbo, not overclocked (6 cores)
Intel-i7 2600K   3.4GHz w/turbo, not overclocked (4 cores, 2x HT)

Tests done with 64-bit kernel, sources fully cached in tmpfs
(i.e. no disk or network activity worth mentioning during the tests)

AMD             Intel           Test

71 seconds      50 seconds      Buildkernel -j12 KERNCONF=X86_64_GENERIC \
                                NO_MODULES=YES
183 seconds     144 seconds     Buildkernel -j1 w/same parameters
unavail         33W             Watt meter idle (note1)
unavail         92W             Watt meter full load (buildkernel -j12) (note1)
247648K         569115K         Openssl speed -elapsed -evp aes-128-cbc (note2)
108567K         110798K         Openssl speed -elapsed -evp aes-128-ecb (note2)
unavail         6322 Mbits/s    Cryptotest -a aes 102400 8192
135ns           184ns           System call overhead getuid()

note 1: The i7 box I just built has a Seasonic gold (87%+ efficiency)
400W power supply in it and my PhenomIIx6 has a generic PSU
in it that's probably more around 75-80%.  The PhenomII box
eats around 50-60W idle but I don't know how much better it
would be with a good PSU in it, so grain-of-salt.

Still, 33W idle for a high-end Intel consumer box is very
impressive.

note 2: aes-128-cbc on the Intel uses the AESNI instructions available
on the SandyBridge.  The -ecb test does not.  The Phenom II does
not have these instructions, so we can see that cpu-bound core
logic loops are actually fairly close between the two cpus.
These tests were for 8192 byte buffers.

For the cpu tests I build the kernel core without modules, which is a
fully parallel build (building modules is not), so -j12 saturates all
available cpus and tests the turbo fallback.

The -j1 test effectively tests single-core performance for the same
workload, and being just one core it will presumably run at max-turbo
(both the AMD and Intel cpus in this test implement core turbo).
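
As a rough cross-check, the two buildkernel rows in the table can be turned into percentages and ratios with a few lines of Python.  This is only a sketch: the dictionary layout and the helper name are invented for illustration, and the numbers are copied straight from the table.

```python
# Build times in seconds, copied from the results table.
build_times = {
    "AMD":   {"j12": 71, "j1": 183},
    "Intel": {"j12": 50, "j1": 144},
}


def time_saved_pct(slow, fast):
    """Percent less wall-clock time the faster run needs (illustrative helper)."""
    return (slow - fast) / slow * 100.0


for job in ("j12", "j1"):
    amd = build_times["AMD"][job]
    intel = build_times["Intel"][job]
    print(f"-{job}: Intel needs {time_saved_pct(amd, intel):.1f}% less time "
          f"({amd / intel:.2f}x the build rate)")

for cpu, t in build_times.items():
    # How much faster the -j12 build finishes than the -j1 build on each box.
    print(f"{cpu}: -j12 is {t['j1'] / t['j12']:.2f}x faster than -j1")
```

Measured as time saved, the Intel comes out just under 30% ahead at -j12 and about 21% ahead at -j1.  Measured as a ratio of build rates, the same -j12 numbers work out to 1.42x.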


AMD             Intel           Simple memory bw test (/usr/src/test/mbwtest.c)
14 GByte/s      18 GByte/s      L2
8.1 GByte/s     14 GByte/s      L3
5.2 GByte/s     11 GByte/s      Main memory

Memory is DDR3-1333.  Memory timings don't appear to make much of a
difference at all, even going from 9-9-9 to 7-7-7 on the i7 box.

The PhenomIIx6 box is also running w/ECC memory.  There is no ECC option
available for Intel, but I don't think the difference would be all that
great and we already knew that Intel's memory bandwidth was very
impressive on the SandyBridge chips.

Conclusions

* The Intel-i7 2600K crushes the PhenomIIx6 1090T under full parallel
  load (4 cores x 2 hyperthreads each vs 6 cores) by upwards of 30%.

* The Intel also beats the 1090T on the single-core load by 21%.

* The Intel Sandybridge cpus have AESNI crypto instructions.  The
  first crypto test (aes-128-cbc) uses those instructions, the second
  does not.  Without the instructions the instruction loops running the
  crypto logic are fairly close between AMD and Intel, and with the
  instructions Intel is 2.3x faster.

  Also note that this is per-cpu core, so we are talking approximately
  6.3 GBits/sec x 4 (at least) for crypted disks, since DragonFly will
  use multiple cores for the crypto.

* Sandybridge likely edges out AMD on power savings now, certainly the
  33W idle consumption is very good.  I don't have any good comparison
  available there because my PSUs are different.  With a crappy PSU the
  AMD test box eats ~50-60W idle.  But even if we give the PSU another
  10% efficiency we are still talking 45W-54W.  Intel is gonna beat it.

* The simple system call overhead test and the non-accelerated crypto
  test show that AMD does do well in some areas, but the crushing they
  take in the compiler test shows the limitations of on-die caches.

* There is no point running I/O tests.  AMD actually has better support
  for 6GBit/sec SATA-III than Intel on their lower-end offerings from
  a price standpoint.  Either way today's modern cpus have no trouble
  saturating even several SATA-III ports.
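
The crypto and power figures quoted in the conclusions can be reproduced from the table with the sketch below.  The variable names are invented for illustration, and two things are assumptions rather than measurements: that crypto throughput scales linearly across the 4 physical cores, and that a 10% better PSU simply scales the wall draw by 0.9.

```python
# Figures copied from the results table (units as reported there).
aes_cbc = {"AMD": 247648, "Intel": 569115}   # Intel run uses AESNI
aes_ecb = {"AMD": 108567, "Intel": 110798}   # no AESNI on either side

cbc_ratio = aes_cbc["Intel"] / aes_cbc["AMD"]
ecb_ratio = aes_ecb["Intel"] / aes_ecb["AMD"]
print(f"aes-128-cbc: Intel is {cbc_ratio:.2f}x AMD")
print(f"aes-128-ecb: Intel is {ecb_ratio:.2f}x AMD")

# cryptotest result on the i7, in Mbits/s for a single core.
per_core_mbits = 6322
cores = 4  # assumption: linear scaling over physical cores, hyperthreads ignored
aggregate_gbits = per_core_mbits * cores / 1000.0
print(f"aggregate crypto estimate: {aggregate_gbits:.1f} Gbits/s")

# AMD idle draw rescaled as if the PSU wasted 10% less power (assumption).
amd_idle_measured = (50, 60)
amd_idle_adjusted = tuple(w * 0.9 for w in amd_idle_measured)
print(f"AMD idle with a better PSU: {amd_idle_adjusted[0]:.0f}W-"
      f"{amd_idle_adjusted[1]:.0f}W vs 33W for the i7")
```

The cbc ratio rounds to the 2.3x quoted above, while the ecb ratio stays within a few percent of 1.0, which is the "fairly close" case without the AESNI instructions.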

There is just one downside to the Intel-i7.  Well, two if you count the
price.  The downside that really gets my goat though is the lack of ECC
memory support on their consumer cpu line.  I mean, COME ON INTEL!  When
you stuff 16G of ram into a consumer box having ECC is probably going to
be a good idea.  Gamers might not care, and most 'consumers' might not
notice, but anyone who car