Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-15 Thread Nick Eubank
On Mon, Apr 14, 2014 at 5:19 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Mon, Apr 14, 2014 at 2:46 PM, Nick Eubank nickeub...@gmail.com wrote:

 Any rules of thumb for work_mem, maintenance_work_mem, shared_buffers,
 etc. for a database that DOESN'T anticipate concurrent connections and that
 is doing lots of aggregate functions on large tables? All the advice I
 can find online on tuning
 (this: http://wiki.postgresql.org/wiki/Performance_Optimization,
 this: http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf,
 this: http://www.revsys.com/writings/postgresql-performance.html, etc.)
 is written for people anticipating lots of concurrent connections.

 I'm a social scientist looking to use Postgres not as a database to be
 shared by multiple users, but rather as my own tool for manipulating a
 massive data set (I have 5 billion transaction records, 600 GB in CSV, and
 want to pull out unique user pairs, estimate aggregates for individual
 users, etc.). This also means almost no writing, except for the creation of
 new tables based on selections from the main table.

 I'm on a Windows 8 VM with 16 GB RAM, a SCSI VMware HD, and 3 cores, if
 that's important.


 I'd go with a small shared_buffers, like 128MB, and let the OS cache as
 much as possible.  This minimizes the amount of double buffering.

 And set work_mem to about 6GB, then bump it up if that doesn't seem to
 cause problems.

 In the scenario you describe, it is probably no big deal if you guess too
 high.  Monitor the process; if it starts to go nuts, just kill it and
 start again with a lower work_mem.  If it is a single-user system, you can
 afford to be adventurous.

 If you need to build indexes, you should bump up maintenance_work_mem, but
 I would just do that in the local session, not system-wide.

 Cheers,

 Jeff



Quick followup, Jeff: it seems that I can't set work_mem above about 1GB
(I can't get to 2GB; when I update the config, the values just don't change
in SHOW ALL -- an integer constraint?). Is there a workaround, or should I
tweak something else accordingly?

Thanks!

Nick

(Properly bottom posted this time?)


Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-15 Thread Jeff Janes
On Tue, Apr 15, 2014 at 9:12 AM, Nick Eubank nickeub...@gmail.com wrote:

 On Mon, Apr 14, 2014 at 5:19 PM, Jeff Janes jeff.ja...@gmail.com wrote:


 I'd go with a small shared_buffers, like 128MB, and let the OS cache as
 much as possible.  This minimizes the amount of double buffering.

 And set work_mem to about 6GB, then bump it up if that doesn't seem to
 cause problems.

 In the scenario you describe, it is probably no big deal if you guess too
 high.  Monitor the process; if it starts to go nuts, just kill it and
 start again with a lower work_mem.  If it is a single-user system, you can
 afford to be adventurous.

 If you need to build indexes, you should bump up maintenance_work_mem,
 but I would just do that in the local session, not system-wide.

 Cheers,

 Jeff



 Quick followup, Jeff: it seems that I can't set work_mem above about 1GB
 (I can't get to 2GB; when I update the config, the values just don't change
 in SHOW ALL -- an integer constraint?). Is there a workaround, or should I
 tweak something else accordingly?



What version are you using?  What is the exact line you put in your config
file?  Did you get any errors when using that config file?  Are you sure
you actually reloaded the server, so that it reread the config file, rather
than just changing the file and then not applying the change?

I usually set work_mem within a psql connection, in which case you need to
quote the setting if you use units:

set work_mem = '3GB';

But if you set it system-wide in the config file, the quotes should not be
needed.
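
For example, in a psql session (the 3GB figure is only an example):

    SET work_mem = '3GB';
    SHOW work_mem;   -- confirm the value actually took effect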



 Thanks!

 Nick

 (Properly bottom posted this time?)


Looked good to me.

Cheers,

Jeff


Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-15 Thread Schnabel, Robert D.


From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Nick Eubank
Sent: Tuesday, April 15, 2014 11:12 AM
To: Jeff Janes
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Tuning Postgres for Single connection use



On Mon, Apr 14, 2014 at 5:19 PM, Jeff Janes jeff.ja...@gmail.com wrote:
On Mon, Apr 14, 2014 at 2:46 PM, Nick Eubank nickeub...@gmail.com wrote:
Any rules of thumb for work_mem, maintenance_work_mem, shared_buffers, etc. for
a database that DOESN'T anticipate concurrent connections and that is doing
lots of aggregate functions on large tables? All the advice I can find online
on tuning (this, this, this, etc.) is written for people anticipating lots of
concurrent connections.
I'm a social scientist looking to use Postgres not as a database to be shared
by multiple users, but rather as my own tool for manipulating a massive data
set (I have 5 billion transaction records, 600 GB in CSV, and want to pull out
unique user pairs, estimate aggregates for individual users, etc.). This also
means almost no writing, except for the creation of new tables based on
selections from the main table.
I'm on a Windows 8 VM with 16 GB RAM, a SCSI VMware HD, and 3 cores, if that's
important.

I'd go with a small shared_buffers, like 128MB, and let the OS cache as much as
possible.  This minimizes the amount of double buffering.

And set work_mem to about 6GB, then bump it up if that doesn't seem to cause
problems.

In the scenario you describe, it is probably no big deal if you guess too high.
Monitor the process; if it starts to go nuts, just kill it and start again
with a lower work_mem.  If it is a single-user system, you can afford to be
adventurous.

If you need to build indexes, you should bump up maintenance_work_mem, but I
would just do that in the local session, not system-wide.

Cheers,

Jeff
 


Quick followup, Jeff: it seems that I can't set work_mem above about 1GB (I
can't get to 2GB; when I update the config, the values just don't change in
SHOW ALL -- an integer constraint?). Is there a workaround, or should I tweak
something else accordingly?

Thanks!

Nick

(Properly bottom posted this time?) 

[Schnabel, Robert D.] 

Nick,

I asked the same question a while ago about work_mem on Windows.  See this 
thread:
http://www.postgresql.org/message-id/17895.1315869...@sss.pgh.pa.us

Bob




Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-15 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 On Tue, Apr 15, 2014 at 9:12 AM, Nick Eubank nickeub...@gmail.com wrote:
 Quick followup, Jeff: it seems that I can't set work_mem above about 1GB
 (I can't get to 2GB; when I update the config, the values just don't change
 in SHOW ALL -- an integer constraint?). Is there a workaround, or should I
 tweak something else accordingly?

 What version are you using?  What is the exact line you put in your config
 file?  Did you get any errors when using that config file?  Are you sure
 you actually reloaded the server, so that it reread the config file, rather
 than just changing the file and then not applying the change?

 I usually set work_mem within a psql connection, in which case you need to
 quote the setting if you use units:
 set work_mem = '3GB';

FWIW, it's generally considered a seriously *bad* idea to set work_mem as
high as 1GB in postgresql.conf: you're promising that each query running
on the server can use 1GB per sort or hash step.  You probably don't have
the machine resources to honor that promise.  (If you do, I'd like to have
your IT budget ;-))  Recommended practice is to keep the global setting
conservatively small, and bump it up locally in your session (with SET)
for individual queries that need the very large value.
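
For example (a sketch; the 64MB default and the table/column names are
illustrative):

    -- postgresql.conf keeps a conservative default, e.g. work_mem = 64MB.
    -- Then raise it only for the query that needs it:
    BEGIN;
    SET LOCAL work_mem = '2GB';   -- reverts automatically at COMMIT
    SELECT sender, receiver, count(*)
      FROM transactions
     GROUP BY sender, receiver;
    COMMIT;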

But having said that, Postgres doesn't try to enforce any such practice.
My money is on what Jeff is evidently thinking: you forgot to do pg_ctl
reload, or else the setting is too large for your platform, in which case
there should have been a complaint in the postmaster log.  As noted
elsewhere, the limit for Windows is a hair under 2GB even if it's 64-bit
Windows.
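
To rule out the reload issue, a quick check (the data directory path in
the comment is hypothetical):

    SELECT pg_reload_conf();  -- reread postgresql.conf without a restart
    SHOW work_mem;            -- the value now in effect for this session
    -- (equivalently, from a shell: pg_ctl reload -D "C:\pgdata")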

regards, tom lane





Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-14 Thread Gavin Flower

On 15/04/14 09:46, Nick Eubank wrote:


Any rules of thumb for work_mem, maintenance_work_mem,
shared_buffers, etc. for a database that DOESN'T anticipate
concurrent connections and that is doing lots of aggregate functions
on large tables? All the advice I can find online on tuning
(this: http://wiki.postgresql.org/wiki/Performance_Optimization,
this: http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf,
this: http://www.revsys.com/writings/postgresql-performance.html, etc.) is
written for people anticipating lots of concurrent connections.


I'm a social scientist looking to use Postgres not as a database to be
shared by multiple users, but rather as my own tool for manipulating a
massive data set (I have 5 billion transaction records, 600 GB in CSV,
and want to pull out unique user pairs, estimate aggregates for
individual users, etc.). This also means almost no writing, except for
the creation of new tables based on selections from the main table.

I'm on a Windows 8 VM with 16 GB RAM, a SCSI VMware HD, and 3 cores, if
that's important.


Thanks!

Well, for serious database work, I suggest upgrading to Linux: you will
get better performance out of the same hardware, and probably more scope
for tuning (a year or so ago I noticed that some tuning options did not
apply to Microsoft OSes, but I don't recall the details; these options
may, or may not, apply to your situation).  Apart from anything else, your
processing will not be slowed down by having to run anti-virus software!


Note that in Linux you have a wide choice of distributions and desktop
environments: I chose MATE (http://mate-desktop.org), some people prefer
Xfce (http://www.xfce.org), and I used to use GNOME 2.



Cheers,
Gavin


Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-14 Thread Nick Eubank
Thanks, Gavin -- would LOVE to. Sadly, I'm in a weird situation where my
hardware is not under my control, so I'm stuck making the best of what I
have. Next time, though! :)

On Monday, April 14, 2014, Gavin Flower gavinflo...@archidevsys.co.nz
wrote:

  On 15/04/14 09:46, Nick Eubank wrote:

  Any rules of thumb for work_mem, maintenance_work_mem, shared_buffers,
 etc. for a database that DOESN'T anticipate concurrent connections and that
 is doing lots of aggregate functions on large tables? All the advice I
 can find online on tuning
 (this: http://wiki.postgresql.org/wiki/Performance_Optimization,
 this: http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf,
 this: http://www.revsys.com/writings/postgresql-performance.html, etc.)
 is written for people anticipating lots of concurrent connections.

 I'm a social scientist looking to use Postgres not as a database to be
 shared by multiple users, but rather as my own tool for manipulating a
 massive data set (I have 5 billion transaction records, 600 GB in CSV, and
 want to pull out unique user pairs, estimate aggregates for individual
 users, etc.). This also means almost no writing, except for the creation of
 new tables based on selections from the main table.

 I'm on a Windows 8 VM with 16 GB RAM, a SCSI VMware HD, and 3 cores, if
 that's important.

 Thanks!

 Well, for serious database work, I suggest upgrading to Linux: you will
 get better performance out of the same hardware, and probably more scope
 for tuning (a year or so ago I noticed that some tuning options did not
 apply to Microsoft OSes, but I don't recall the details; these options may,
 or may not, apply to your situation).  Apart from anything else, your
 processing will not be slowed down by having to run anti-virus software!

 Note that in Linux you have a wide choice of distributions and desktop
 environments: I chose MATE (http://mate-desktop.org), some people prefer
 Xfce (http://www.xfce.org), and I used to use GNOME 2.


 Cheers,
 Gavin



Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-14 Thread Andrew Dunstan


On 04/14/2014 05:46 PM, Nick Eubank wrote:


Any rules of thumb for work_mem, maintenance_work_mem,
shared_buffers, etc. for a database that DOESN'T anticipate
concurrent connections and that is doing lots of aggregate functions
on large tables? All the advice I can find online on tuning
(this: http://wiki.postgresql.org/wiki/Performance_Optimization,
this: http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf,
this: http://www.revsys.com/writings/postgresql-performance.html, etc.) is
written for people anticipating lots of concurrent connections.


I'm a social scientist looking to use Postgres not as a database to be
shared by multiple users, but rather as my own tool for manipulating a
massive data set (I have 5 billion transaction records, 600 GB in CSV,
and want to pull out unique user pairs, estimate aggregates for
individual users, etc.). This also means almost no writing, except for
the creation of new tables based on selections from the main table.

I'm on a Windows 8 VM with 16 GB RAM, a SCSI VMware HD, and 3 cores, if
that's important.






First up would probably be: don't run on Windows. shared_buffers above
512MB causes performance to degrade on Windows, while that threshold is
much higher on *nix.
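
In conf-file terms (a sketch; Jeff's 128MB suggestion elsewhere in this
thread stays comfortably under that threshold):

    shared_buffers = 128MB   # keep well below ~512MB on Windows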


cheers

andrew





Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-14 Thread Jeff Janes
On Mon, Apr 14, 2014 at 2:46 PM, Nick Eubank nickeub...@gmail.com wrote:

 Any rules of thumb for work_mem, maintenance_work_mem, shared_buffers,
 etc. for a database that DOESN'T anticipate concurrent connections and that
 is doing lots of aggregate functions on large tables? All the advice I
 can find online on tuning
 (this: http://wiki.postgresql.org/wiki/Performance_Optimization,
 this: http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf,
 this: http://www.revsys.com/writings/postgresql-performance.html, etc.)
 is written for people anticipating lots of concurrent connections.

 I'm a social scientist looking to use Postgres not as a database to be
 shared by multiple users, but rather as my own tool for manipulating a
 massive data set (I have 5 billion transaction records, 600 GB in CSV, and
 want to pull out unique user pairs, estimate aggregates for individual
 users, etc.). This also means almost no writing, except for the creation of
 new tables based on selections from the main table.

 I'm on a Windows 8 VM with 16 GB RAM, a SCSI VMware HD, and 3 cores, if
 that's important.


I'd go with a small shared_buffers, like 128MB, and let the OS cache as
much as possible.  This minimizes the amount of double buffering.

And set work_mem to about 6GB, then bump it up if that doesn't seem to
cause problems.

In the scenario you describe, it is probably no big deal if you guess too
high.  Monitor the process; if it starts to go nuts, just kill it and
start again with a lower work_mem.  If it is a single-user system, you can
afford to be adventurous.

If you need to build indexes, you should bump up maintenance_work_mem, but
I would just do that in the local session, not system-wide.
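
As a concrete sketch of the above (shared_buffers and work_mem are the
values suggested in this thread; the maintenance_work_mem figure and the
table/index names are hypothetical):

    # postgresql.conf
    shared_buffers = 128MB   # small; let the OS page cache do the work
    work_mem = 6GB           # single session only; note Windows caps this
                             # just under 2GB, per Tom Lane's note

    -- per session, when building an index:
    SET maintenance_work_mem = '2GB';
    CREATE INDEX idx_transactions_user ON transactions (user_id);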

Cheers,

Jeff


Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-14 Thread Gavin Flower

On this list, please bottom-post!

I've added potentially useful advice below.

On 15/04/14 11:39, Nick Eubank wrote:
Thanks, Gavin -- would LOVE to. Sadly, I'm in a weird situation where my
hardware is not under my control, so I'm stuck making the best of what I
have. Next time, though! :)


On Monday, April 14, 2014, Gavin Flower gavinflo...@archidevsys.co.nz wrote:


On 15/04/14 09:46, Nick Eubank wrote:


Any rules of thumb for work_mem, maintenance_work_mem,
shared_buffers, etc. for a database that DOESN'T anticipate
concurrent connections and that is doing lots of aggregate
functions on large tables? All the advice I can find online on
tuning (this: http://wiki.postgresql.org/wiki/Performance_Optimization,
this: http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf,
this: http://www.revsys.com/writings/postgresql-performance.html, etc.) is
written for people anticipating lots of concurrent connections.

I'm a social scientist looking to use Postgres not as a database
to be shared by multiple users, but rather as my own tool for
manipulating a massive data set (I have 5 billion transaction
records, 600 GB in CSV, and want to pull out unique user pairs,
estimate aggregates for individual users, etc.). This also means
almost no writing, except for the creation of new tables based on
selections from the main table.

I'm on a Windows 8 VM with 16 GB RAM, a SCSI VMware HD, and 3 cores,
if that's important.

Thanks!


Well, for serious database work, I suggest upgrading to Linux: you
will get better performance out of the same hardware, and probably
more scope for tuning (a year or so ago I noticed that some tuning
options did not apply to Microsoft OSes, but I don't recall the
details; these options may, or may not, apply to your situation).
Apart from anything else, your processing will not be slowed down
by having to run anti-virus software!

Note that in Linux you have a wide choice of distributions and
desktop environments: I chose MATE (http://mate-desktop.org), some
people prefer Xfce (http://www.xfce.org), and I used to use GNOME 2.


Cheers,
Gavin


Yeah, I know the feeling!

I have a client that uses MySQL (ugh!), but I won't even bother 
mentioning PostgreSQL!


Hopefully, someone more knowledgeable will give you some good advice 
specific to your O/S.


For tables that don't change, consider a packing density of 100%.

Take care in how you design your tables, and the column types.

Consider carefully the queries you are likely to use, so you can design 
appropriate indexes.


Some advice will depend on the schema you plan to use, and the type of 
queries.
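
In PostgreSQL terms, packing density is controlled by fillfactor (assuming
that is what is meant here; 100 is already the default for tables, so in
practice this matters mostly for indexes). A sketch with hypothetical names:

    -- read-only data: pack index pages completely (the index default is 90)
    CREATE INDEX idx_transactions_user ON transactions (user_id)
        WITH (fillfactor = 100);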



Cheers,
Gavin




Re: [PERFORM] Tuning Postgres for Single connection use

2014-04-14 Thread Nick Eubank
Terrific -- thanks, Gavin and Jeff! That's incredibly helpful for a n00b
like me!


On Mon, Apr 14, 2014 at 5:29 PM, Gavin Flower gavinflo...@archidevsys.co.nz
 wrote:

  On this list, please bottom-post!

 I've added potentially useful advice below.


 On 15/04/14 11:39, Nick Eubank wrote:

 Thanks, Gavin -- would LOVE to. Sadly, I'm in a weird situation where my
 hardware is not under my control, so I'm stuck making the best of what I
 have. Next time, though! :)

 On Monday, April 14, 2014, Gavin Flower gavinflo...@archidevsys.co.nz
 wrote:

  On 15/04/14 09:46, Nick Eubank wrote:

  Any rules of thumb for work_mem, maintenance_work_mem, shared_buffers,
 etc. for a database that DOESN'T anticipate concurrent connections and that
 is doing lots of aggregate functions on large tables? All the advice I
 can find online on tuning
 (this: http://wiki.postgresql.org/wiki/Performance_Optimization,
 this: http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf,
 this: http://www.revsys.com/writings/postgresql-performance.html, etc.)
 is written for people anticipating lots of concurrent connections.

 I'm a social scientist looking to use Postgres not as a database to be
 shared by multiple users, but rather as my own tool for manipulating a
 massive data set (I have 5 billion transaction records, 600 GB in CSV, and
 want to pull out unique user pairs, estimate aggregates for individual
 users, etc.). This also means almost no writing, except for the creation of
 new tables based on selections from the main table.

 I'm on a Windows 8 VM with 16 GB RAM, a SCSI VMware HD, and 3 cores, if
 that's important.

 Thanks!

 Well, for serious database work, I suggest upgrading to Linux: you will
 get better performance out of the same hardware, and probably more scope
 for tuning (a year or so ago I noticed that some tuning options did not
 apply to Microsoft OSes, but I don't recall the details; these options may,
 or may not, apply to your situation).  Apart from anything else, your
 processing will not be slowed down by having to run anti-virus software!

 Note that in Linux you have a wide choice of distributions and desktop
 environments: I chose MATE (http://mate-desktop.org), some people prefer
 Xfce (http://www.xfce.org), and I used to use GNOME 2.


 Cheers,
 Gavin

  Yeah, I know the feeling!

 I have a client that uses MySQL (ugh!), but I won't even bother mentioning
 PostgreSQL!

 Hopefully, someone more knowledgeable will give you some good advice
 specific to your O/S.

 For tables that don't change, consider a packing density of 100%.

 Take care in how you design your tables, and the column types.

 Consider carefully the queries you are likely to use, so you can design
 appropriate indexes.

 Some advice will depend on the schema you plan to use, and the type of
 queries.


 Cheers,
 Gavin