Donald,

> This whole issue of data caching is a troubling one with PostgreSQL,
> in that even if you ran PostgreSQL on a 64-bit address space
> with a larger number of CPUs you won't see much of a scale-up,
> and possibly even a drop.

Since when?  Barring the context-switch bug, you're not going to get a 
drop with more processors or more RAM.

You may fail to get any gain, though.  If your database is only 100MB in 
size, having 11GB of cache space isn't going to help you much over having 
only 1GB.
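
To put rough numbers on that: with 8.x-era settings, something like the 
fragment below -- values invented for the example, not a tuning 
recommendation -- already caches the entire 100MB database, so raising 
it further buys you nothing:

# illustrative postgresql.conf fragment; in this era shared_buffers is
# a count of 8KB buffers, and the number here is made up
shared_buffers = 16384    # 16384 * 8KB = 128MB -- bigger than the whole 100MB database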

> I am not alone in having the *expectation* 
> that a database should have some cache size parameter and
> the option to skip the file system.  

Sure, because that's the conventional wisdom, as writ by Oracle.  However, 
this comes with substantial code maintenance costs and portability 
limitations which have to be measured against any gain in performance.  
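
To make "skip the file system" concrete: on Linux that generally means 
opening the data files with O_DIRECT and managing your own aligned 
buffers, along the lines of the sketch below (the file name and block 
size are made up for illustration).  The alignment fussiness and 
platform-specific behavior are exactly the kind of maintenance cost I 
mean:

#define _GNU_SOURCE           /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE 8192

int main(void)
{
    void *buf;
    int   fd;

    /* O_DIRECT transfers require an aligned buffer. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return 1;

    fd = open("mydb.data", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* This read bypasses the kernel page cache entirely; the
       application is now responsible for all caching itself. */
    if (read(fd, buf, BLOCK_SIZE) < 0)
        perror("read");

    close(fd);
    free(buf);
    return 0;
}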

> If I use Oracle, Sybase, MySQL,
> and MaxDB, they all have the ability to size a data cache and move
> to 64 bits.

And yet, we regularly outperform Sybase and MySQL on heavy OLTP loads on 
commodity x86 hardware.    So apparently DB caching isn't everything.  ;-)

I'm not saying that it's not worth testing larger database caches -- even 
taking over most of RAM -- on high-speed systems.  In fact, I'm working 
on doing that kind of test now.  However, until we have test results, we 
can't assume that taking over RAM and the FS cache would yield a 
substantial performance benefit; that remains to be shown.

The other thing is that we've had, and continue to have, low-hanging 
fruit that has a clear, measurable effect on performance and can be fixed 
without bloating the PG code.  Some of these issues (COPY path, context 
switching, locks, GiST concurrency, some aggregates) have been addressed 
in the 8.1 code; some remain to be addressed (sorts, disk spill, 64-bit 
sort mem, other aggregates, index-only access, etc.).  Why tackle a huge, 
250-hour project that could fail when a 20-hour patch is more likely to 
provide the same performance benefit?

We have the same discussion (annually) about mmap.  Using mmap *might* 
provide us with a huge performance boost.  However, it would *definitely* 
require 300 hours (or more) of programmer time to test properly, and 
might not benefit us at all.
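
To show what's tempting about it: a toy like the one below (made-up 
file name; not a proposal for how the storage manager would really do 
it) lets the kernel page cache serve as the buffer cache directly, with 
no read() copies.  The toy is easy; write ordering and portable flush 
semantics are where the 300 hours go:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat st;
    int         fd = open("mydb.data", O_RDONLY);

    if (fd < 0 || fstat(fd, &st) < 0) { perror("open/fstat"); return 1; }

    /* Map the whole file; pages come straight from the kernel cache on
       first touch, with no intermediate read() into a private buffer. */
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Touch the first byte of each 8KB page just to fault them in. */
    long sum = 0;
    for (off_t off = 0; off < st.st_size; off += 8192)
        sum += p[off];

    printf("touched pages, sum = %ld\n", sum);
    munmap(p, st.st_size);
    close(fd);
    return 0;
}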

Of course, if *you* want to work on large database cache improvements, be 
my guest ... it's an open source project!  Submit your patches!  I'll be 
happy to test them.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
