On Mon, Apr 14, 2014 at 2:46 PM, Nick Eubank <nickeub...@gmail.com> wrote:

> Any rules of thumb for work_mem, maintenance_work_mem, shared_buffer,
> etc. for a database that DOESN'T anticipate concurrent connections and that
> is doing lots of aggregate functions on large tables? All the advice I
> can find online on tuning (this
> <http://wiki.postgresql.org/wiki/Performance_Optimization>, this
> <http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf>,
> this <http://www.revsys.com/writings/postgresql-performance.html>, etc.)
> is written for people anticipating lots of concurrent connections.
>
> I'm a social scientist looking to use Postgres not as a database to be
> shared by multiple users, but rather as my own tool for manipulating a
> massive data set (I have 5 billion transaction records (600gb in csv) and
> want to pull out unique user pairs, estimate aggregates for individual
> users, etc.). This also means almost no writing, except for the creation
> of new tables based on selections from the main table.
>
> I'm on a Windows 8 VM with 16gb ram, SCSI VMware HD, and 3 cores if that's
> important.
>

I'd go with a small shared_buffers, like 128MB, and let the OS cache as
much as possible.  This minimizes the amount of double buffering.
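
In postgresql.conf that would be something like this (the value is just the
ballpark above, and shared_buffers needs a server restart to take effect):

    shared_buffers = 128MB    # keep this small; let the OS page cache do the caching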

And set work_mem to about 6GB, then bump it up if that doesn't seem to
cause problems.
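
Since it is just you, the easiest way to experiment is per session rather
than editing postgresql.conf, something along the lines of:

    SET work_mem = '6GB';    -- starting point from above; raise or lower and re-run

work_mem is per sort/hash operation rather than per connection, which is why
the usual multi-user advice keeps it so much lower.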

In the scenario you describe, it is probably no big deal if you guess too
high.  Monitor the process; if it starts to go nuts, just kill it and
start again with a lower work_mem.  If it is a single-user system, you can
afford to be adventurous.

If you need to build indexes, you should bump up maintenance_work_mem, but
I would just do that in the local session, not system-wide.
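
For example (the 2GB and the table/column names here are just made up for
illustration):

    SET maintenance_work_mem = '2GB';          -- lasts only for this session
    CREATE INDEX ON transactions (user_id);    -- whatever index you actually need

so the big value never ends up in postgresql.conf.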

Cheers,

Jeff
