> I realize the excessive-context-switching-on-xeon issue has been
> discussed at length in the past, but I wanted to follow up and verify my
> conclusion from those discussions:

First off, the good news: Gavin Sherry and OSDL may have made some progress 
on this.   We'll be testing as soon as OSDL gets the Scalable Test Platform 
running again.   If you have the CS problem (which I don't think you do, see 
below) and a test box, I'd be thrilled to have you test it.

> On a 2-way or 4-way Xeon box, there is no way to avoid excessive
> (30,000-60,000 per second) context switches when using PostgreSQL 7.4.5
> to query a data set small enough to fit into main memory under a
> significant load.

Hmmm ... some clarification:
1) I don't really consider a CS rate of 30,000 to 60,000 per second on Xeon 
to be excessive.  People demonstrating the problem on dual or quad Xeon 
reported CS levels of 150,000 or more.    So you probably don't have this 
issue at all -- depending on the load, your level could be considered "normal".

2) The problem is not limited to Xeon, Linux, or x86 architecture.    It has 
been demonstrated, for example, on 8-way Solaris machines.    It's just worse 
(and thus more noticeable) on Xeon.
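
If you want to check which range you're actually in, the "cs" column of 
vmstat is the usual tool.  The same number can be sampled directly; below is 
a minimal sketch, assuming a Linux box where /proc/stat carries a "ctxt" line 
with the cumulative context-switch count since boot.  The function names and 
the 5-second interval are my own illustrative choices, not anything from 
PostgreSQL or OSDL.

```python
import time


def parse_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat contents."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The line of interest looks like: "ctxt 123456789"
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def cs_per_second(count_before, count_after, seconds):
    """Convert two cumulative counts into an average rate per second."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return (count_after - count_before) / seconds


def sample_cs_rate(interval=5.0, path="/proc/stat"):
    """Sample the system-wide context-switch rate over `interval` seconds.

    Assumes Linux; `path` is a parameter only so the function can be
    pointed at a saved copy of /proc/stat for testing.
    """
    with open(path) as f:
        before = parse_ctxt(f.read())
    start = time.monotonic()
    time.sleep(interval)
    with open(path) as f:
        after = parse_ctxt(f.read())
    elapsed = time.monotonic() - start
    return cs_per_second(before, after, elapsed)
```

Call sample_cs_rate() while the database is under its normal load and compare 
the result against the figures above.  Note that this is a system-wide number, 
so anything else running on the box is counted too.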

> I am experiencing said symptom on two different dual-Xeon boxes, both
> Dells with ServerWorks chipsets, running the latest RH9 and RHEL3
> kernels, respectively. The databases are 90% read, 10% write, and are
> small enough to fit entirely into main memory, between pg shared buffers
> and kernel buffers.

Ah.  Well, you do have the worst possible architecture for PostgreSQL-SMP 
performance.   The ServerWorks chipset is badly flawed (the company is now, I 
believe, bankrupt from recalled products) and, judging by tests published 
online, Xeons have several performance issues with database workloads.

> We recently invested in an solid-state storage device
> ( to help write
> performance. Our entire pg data directory is stored on it. Regrettably
> (and in retrospect, unsurprisingly) we found that opening up the I/O
> bottleneck does little for write performance when the server is under
> load, due to the bottleneck created by excessive context switching. 

Well, if you're CPU-bound, improved I/O won't help you, no.

> Is 
> the only solution then to move to a different SMP architecture such as
> Itanium 2 or Opteron? If so, should we expect to see an additional
> benefit from running PostgreSQL on a 64-bit architecture, versus 32-bit,
> context switching aside? 

Your performance will almost certainly be better for a variety of reasons on 
Opteron/Itanium.    However, I'm still not convinced that you have the CS 
bug.

> Alternatively, are there good 32-bit SMP 
> architectures to consider other than Xeon, given the high cost of
> Itanium 2 and Opteron systems?

Athlon MP appears to be less susceptible to the CS bug than Xeon, and the 
effect of the bug is not as severe.   However, a quad-Opteron box can be 
built for less than $6000; what's your standard for "expensive"?   If you 
don't have that much money, then you may be stuck for options.

> More generally, how have others scaled "up" their PostgreSQL
> environments? We will eventually have to invent some "outward"
> scalability within the logic of our application (e.g. do read-only
> transactions against a pool of Slony-I subscribers), but in the short
> term we still have an urgent need to scale upward. Thoughts? General
> wisdom?

As long as you're on x86, scaling outward is the way to go.   If you want to 
continue to scale upwards, ask Andrew Sullivan about his experiences running 
PostgreSQL on big IBM boxes.   But if you consider a quad-Opteron server 
expensive, I don't think that's an option for you.

Overall, though, I'm not convinced that you have the CS bug and I think it's 
more likely that you have a few "bad queries" which are dragging down the 
whole system.    Troubleshoot those and your CPU-bound problems may go away.

Josh Berkus
Aglio Database Solutions
San Francisco
