Title: Re: [PERFORM] Hardware/OS recommendations for large databases (

Last point should be worded: “All CPUs on all machines used by a parallel database”

- Luke

On 11/16/05 9:47 AM, "Luke Lonergan" <[EMAIL PROTECTED]> wrote:


On 11/16/05 9:09 AM, "Scott Marlowe" <[EMAIL PROTECTED]> wrote:

The biggest gain is going from 1 to 2 CPUs (real cpus, like the DC
Opterons or genuine dual CPU mobo, not "hyperthreaded").  Part of the
issue isn't just raw CPU processing power.  The second CPU allows the
machine to be more responsive because it doesn't have to context switch
as much.

While I've seen plenty of single CPU servers start to bog under load
running one big query, the dual CPU machines always seem more than just
twice as snappy under similar loads.

I agree, 2 CPUs are better than one in most cases.

The discussion was kicked off by the suggestion to get 8 dual core CPUs to process a large database with postgres.  Say your decision support query takes 15 minutes to run with one CPU.  Add another and it still takes 15 minutes.  Add 15 and the same ...

OLTP is so different from Business intelligence and Decision Support that very little of this thread’s discussion is relevant IMO.

The job is to design a system that can process sequential scan as fast as possible and uses all resources (CPUs, mem, disk channels) on each query.  Sequential scan is 100x more important than random seeks.

Here are the facts so far:
  • Postgres can only use 1 CPU on each query
  • Postgres I/O for sequential scan is CPU limited to 110-120 MB/s on the fastest modern CPUs
  • Postgres disk-based sort speed is 1/10 or more slower than commercial databases and memory doesn’t improve it (much)

These are the conclusions that follow about decision support / BI system architecture for normal Postgres:
  • I/O hardware with more than 110MB/s of read bandwidth is not useful
  • More than 1 CPU is not useful
  • More RAM than a nominal amount for small table caching is not useful

In other words, big SMP doesn’t address the problem at all.  By contrast, having all CPUs on multiple machines, or even on a big SMP with lots of I/O channels, solves all of the above issues.


- Luke

Reply via email to