Shachar,

There's no single answer to your question; in order to give you a better answer I'll need some more information about your software.
Here are a few points that you might find interesting (I mostly do kernel-level network streaming/filtering work, so YMMV):

* The AMD Opteron *is* the King of the Hill. I found "my" HP 385 (Opteron 248/250) and the older IBM e326 (Opteron 246/248) able to outperform a similarly configured (and priced) Xeon 2.8/3.4/3.6 (DL 380, IBM e345) hands down. Highly memory- and I/O-intensive applications like my own (which spends days btree searching and memXXX-ing itself to death) seem to *greatly* favor the Opteron's on-die memory controller (compared to the Xeon's traditional north-bridge design). I'm still looking for ways to use the Opteron's NUMA support; I *assume* that xxx_alloc_node will further improve performance.

* The dual-core option is a true winner. Even the relatively cheap (?!?!) Opteron 265 machine can run circles around a quad Xeon MP machine. (The shared-bus design never really favored configurations with more than 2 CPUs.) At less than $1000 per 265 CPU, building a dual dual-core workstation/server is pretty inexpensive. (I plan on upgrading my private dual Opteron workstation to dual core once I find someone who's willing to buy my left kidney...)

* GCC's x86-64 (AMD64) optimizations favor the Opteron greatly. Only when we compiled our code with -march=nocona did we manage to level the playing field a *bit*. Somehow Intel seems to have skimped a little when they duplicated AMD64 (s/EM64T/AMD64/g).
As far as I remember, the Debian AMD64 port is using -march=nocona to help the Xeon save face. (Same goes for my FC4/x86-64 machines.)

* The Xeon might close the gap if you have highly hyper-threadable code (little or no I/O [including memory I/O] with a lot of integer calculations). In such a (remote?) case, you might actually see a 10-15% gain per socket, maybe even slightly outperforming the Opteron. However, if you plan on using two or more sockets (dual and above), a shared 400/533MHz bus doesn't play nice with Hyper-Threading enabled. In general I'd steer clear of Hyper-Threading on dual-or-above machines.

* Might sound weird, but while working on my previous project we saw instances where an older 2.8GHz 533MHz (Prestonia?) Xeon was able to outperform the 3.0GHz 800MHz Nocona Xeons. Go figure.

* The Itanium (1.4GHz, Madison core?) has lousy integer performance and memory performance. Don't touch it. (Or you'll burn... literally...)

In general I find the Opteron to be the superior platform. But again, we conducted our tests with our own software, so YMMV (greatly).
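
If you want to run this kind of comparison on your own candidates, here's a minimal sketch of two micro-kernels: a register-bound integer loop, and a pointer chase that is bound by memory latency (roughly what a btree search does). This is *not* the code we actually ran; all the names (int_kernel, make_ring, chase_kernel, now_sec) are made up for illustration, and the xorshift constants are only there to give a deterministic shuffle.

```c
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Compute-bound kernel (hypothetical): xorshift mixing that stays in registers. */
uint64_t int_kernel(uint64_t iters)
{
    uint64_t x = 88172645463325252ULL;
    for (uint64_t i = 0; i < iters; i++) {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
    }
    return x;
}

/* Build a random single-cycle permutation of 0..n-1 (Sattolo's algorithm),
 * so that following ring[pos] visits every slot before returning to 0. */
size_t *make_ring(size_t n)
{
    assert(n > 0);
    size_t *ring = malloc(n * sizeof *ring);
    if (ring == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        ring[i] = i;

    uint64_t s = 0x9E3779B97F4A7C15ULL;   /* arbitrary non-zero seed */
    for (size_t i = n - 1; i > 0; i--) {
        s ^= s << 13;
        s ^= s >> 7;
        s ^= s << 17;
        size_t j = (size_t)(s % i);       /* j in [0, i), never i itself */
        size_t tmp = ring[i];
        ring[i] = ring[j];
        ring[j] = tmp;
    }
    return ring;
}

/* Memory-bound kernel (hypothetical): every load depends on the previous one. */
size_t chase_kernel(const size_t *ring, uint64_t steps)
{
    size_t pos = 0;
    for (uint64_t i = 0; i < steps; i++)
        pos = ring[pos];
    return pos;
}

/* Monotonic wall-clock time in seconds. */
double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

Call now_sec() before and after each kernel and print the kernels' return values, so the compiler can't throw the work away. Make the ring much larger than the CPU cache, and build the same source once per -march setting you want to compare. (Older glibc versions need -lrt for clock_gettime.) Treat the numbers as a rough indication only; your real workload is the benchmark that counts.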

Hope it helps,
Gilboa

On Sun, 2005-08-07 at 14:52 +0300, Shachar Shemesh wrote:
Hi all,

I'm looking into buying a computation server for a client. They are
looking for the platform that will give them optimal INTEGER
performance. I'm choosing between the 64-bit options: PowerPC, Itanium
and the EM64T/AMD64 technologies. I am also interested in more specific
knowledge ("Xeon is better than Athlon" etc.).

Thoughts? Ideas?

Any solution picked will be running Debian Linux (Sarge), and the 
program will likely be compiled with gcc (whatever version will work best).

Thanks,

          Shachar
