Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Tobias Brox wrote:

[Madison Kelly - Thu at 10:25:07AM -0500]
Will the priority of the script pass down to the pgsql queries it calls? 
I figured (likely incorrectly) that because the queries were executed by 
the psql server the queries ran with the server's priority. 


I think you are right, and in any case, I don't think the niceness
value won't help much if the bottleneck is iowait.

In our application, I've made a special function for doing
low-priority transactions which I believe is quite smart - though maybe
not always.  Before introducing this logic, we observed we had a tipping
point, too many queries, and the database wouldn't swallow them fast
enough, and the database server just jammed up, trying to work at too
many queries at once, yielding the results far too slow.

In the config file, I now have those two flags set:

 stats_start_collector = on
 stats_command_string = on

This will unfortunately cause some CPU-load, but the benefit is great
- one can actually check what the server is working with at any time:

  select * from pg_stat_activity

with those, it is possible to check a special view pg_stat_activity -
it will contain all the queries the database is working on right now.
My idea is to peek into this table - if there is no active queries,
the database is idle, and it's safe to start our low-priority
transaction.  If this view is full of stuff, one should certainly not
run any low-priority transactions, rather sleep a bit and try again
later.

 select count(*) from pg_stat_activity where not current_query like
 'IDLE%' and query_start+?now()

The algorithm takes four parameters, the time value to put in above,
the maximum number of queries allowed to run, the sleep time between
each attempt, and the amount of attempts to try before giving up.


So here are the cons and drawbacks:

 con: Given small queries and small transactions, one can tune this in
  such a way that the low priority queries (almost) never causes
  significant delay for the higher priority queries.

 con: can be used to block users of an interactive query
  application to cause disturbances on the production database.

 con: can be used for pausing low-priority batch jobs to execute only
  when the server is idle.

 drawback: unsuitable for long-running queries and transactions 


 drawback: with fixed values in the parameters above, one risks that
   the queries never gets run if the server is sufficiently stressed.

 drawback: the stats collection requires some CPU

 drawback: the select * from pg_stats_activity query requires some CPU

 drawback: the pg_stats_activity-view is constant within the
   transaction, so one has to roll back if there is activity
   (this is however not a really bad thing, because one
   certainly shouldn't live an idle transaction around if the
   database is stressed).


I can see how this would be very useful (and may make use of it later!). 
For the current job at hand though, at full tilt it can take a few hours 
to run, which puts it into your drawback section. The server in 
question is also almost under load of some sort, too.


A great tip and one I am sure to make use of later, thanks!

Madi

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Scott Marlowe wrote:

nope, the priorities don't pass down.  you connect via a client lib to
the server, which spawns a backend process that does the work for you. 
The backend process inherits its priority from the postmaster that

spawns it, and they all run at the same priority.


Shoot, but figured. :)


Thanks for the tip, too, it's something I will try.


Sometimes it's the simple solutions that work best.  :)  Welcome to the
world of pgsql, btw...


Heh, if only I was new to pgsql I wouldn't feel silly for asking so many 
questions :P. In the same right though, I enjoy PgSQL/Linux/FOSS in 
general *because* there seems to never be a shortage of things to learn.


Thanks!

Madi

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Andreas Kostyrka wrote:

Am Donnerstag, den 02.11.2006, 09:41 -0600 schrieb Scott Marlowe:

Sometimes it's the simple solutions that work best.  :)  Welcome to the
world of pgsql, btw...


OTOH, there are also non-simple solutions to this, which might make
sense anyway: Install slony, and run your queries against a readonly
replica of your data.


Bingo! This seems like exactly what we can/should do, and it will likely 
help with other jobs we run, too.


I feel a little silly for not having thought of this myself... Guess I 
was too focused on niceness :). Thanks!


Madi

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Setting nice values

2006-11-06 Thread Tobias Brox
[Madison Kelly - Mon at 08:10:12AM -0500]
 to run, which puts it into your drawback section. The server in 
 question is also almost under load of some sort, too.
 
 A great tip and one I am sure to make use of later, thanks!

I must have been sleepy, listing up cons vs drawbacks ;-)

Anyway, the central question is not the size of the job, but the size of
the transactions within the job - if the job consists of many
transactions, my test can be run before every transaction.  Having
transactions lasting for hours is a very bad thing to do, anyway.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Tobias Brox wrote:

[Madison Kelly - Mon at 08:10:12AM -0500]
to run, which puts it into your drawback section. The server in 
question is also almost under load of some sort, too.


A great tip and one I am sure to make use of later, thanks!


I must have been sleepy, listing up cons vs drawbacks ;-)


:) I noticed but figured what you meant (I certainly do similar flubs!).


Anyway, the central question is not the size of the job, but the size of
the transactions within the job - if the job consists of many
transactions, my test can be run before every transaction.  Having
transactions lasting for hours is a very bad thing to do, anyway.


Ah, sorry, long single queries is what you meant. I have inherited this 
code so I am not sure how long a given query takes, though they do use a 
lot of joins and such, so I suspect it isn't quick; indexes aside. When 
I get some time (and get the backup server running) I plan to play with 
this. Currently the DB is on a production server so I am hesitant to 
poke around just now. Once I get the backup server though, I will play 
with your suggestions. I am quite curious to see how it will work out.


Thanks again!

Madi


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Setting nice values

2006-11-06 Thread Tobias Brox
[Madison Kelly - Mon at 08:48:19AM -0500]
 Ah, sorry, long single queries is what you meant. 

No - long running single transactions :-)  If it's only read-only
queries, one will probably benefit by having one transaction for every
query.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Tobias Brox wrote:

[Madison Kelly - Mon at 08:48:19AM -0500]
Ah, sorry, long single queries is what you meant. 


No - long running single transactions :-)  If it's only read-only
queries, one will probably benefit by having one transaction for every
query.



In this case, what happens is one kinda ugly big transaction is read 
into a hash, and then looped through (usually ~10,000 rows). On each 
loop another, slightly less ugly query is performed based on the first 
query's values now in the hash (these queries being where throttling 
might help). Then after the second query is parsed a PDF file is created 
(also a big source of slowness). It isn't entirely read-only though 
because as the PDFs are created a flag is updated in the given record's 
row. So yeah, need to experiment some. :)


Madi

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Setting nice values

2006-11-05 Thread Jim Nasby

On Nov 2, 2006, at 9:14 AM, Madison Kelly wrote:

  I've got a script (perl, in case it matters) that I need to run once
a month to prepare statements. This script queries and updates the
database a *lot*. I am not concerned with the performance of the SQL
calls so much as I am about the impact it has on the server's load.

  Is there a way to limit queries speed (ie: set a low 'nice' value on
a query)? This might be an odd question, or I could be asking the
question the wrong way, but hopefully you the idea. :)


The BizGres folks have been working on resource queuing, which will  
eventually do what you want. Take a look at the BizGres mailing list  
archives for more info.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Setting nice values

2006-11-03 Thread Andreas Kostyrka
Am Donnerstag, den 02.11.2006, 09:41 -0600 schrieb Scott Marlowe:
 Sometimes it's the simple solutions that work best.  :)  Welcome to the
 world of pgsql, btw...

OTOH, there are also non-simple solutions to this, which might make
sense anyway: Install slony, and run your queries against a readonly
replica of your data.

Andreas



signature.asc
Description: Dies ist ein digital signierter Nachrichtenteil


[PERFORM] Setting nice values

2006-11-02 Thread Madison Kelly

Hi all,

  I've got a script (perl, in case it matters) that I need to run once
a month to prepare statements. This script queries and updates the
database a *lot*. I am not concerned with the performance of the SQL
calls so much as I am about the impact it has on the server's load.

  Is there a way to limit queries speed (ie: set a low 'nice' value on
a query)? This might be an odd question, or I could be asking the
question the wrong way, but hopefully you the idea. :)

Thanks!

Madi


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Setting nice values

2006-11-02 Thread Scott Marlowe
On Thu, 2006-11-02 at 09:14, Madison Kelly wrote:
 Hi all,
 
I've got a script (perl, in case it matters) that I need to run once
 a month to prepare statements. This script queries and updates the
 database a *lot*. I am not concerned with the performance of the SQL
 calls so much as I am about the impact it has on the server's load.
 
Is there a way to limit queries speed (ie: set a low 'nice' value on
 a query)? This might be an odd question, or I could be asking the
 question the wrong way, but hopefully you the idea. :)

While you can safely set the priority lower on the calling perl script,
setting db backend priorities lower can result in problems caused by
priority inversion  Look up that phrase on the pgsql admin, perform,
general, or hackers lists for an explanation, or go here:

http://en.wikipedia.org/wiki/Priority_inversion

I have a simple script that grabs raw data from an oracle db and shoves
it into a postgresql database for reporting purposes.  Every 100 rows I
put into postgresql, I usleep 10 or so and the load caused by that
script on both systems is minimal.  You might try something like that.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Setting nice values

2006-11-02 Thread Madison Kelly

Scott Marlowe wrote:

On Thu, 2006-11-02 at 09:14, Madison Kelly wrote:

Hi all,

   I've got a script (perl, in case it matters) that I need to run once
a month to prepare statements. This script queries and updates the
database a *lot*. I am not concerned with the performance of the SQL
calls so much as I am about the impact it has on the server's load.

   Is there a way to limit queries speed (ie: set a low 'nice' value on
a query)? This might be an odd question, or I could be asking the
question the wrong way, but hopefully you the idea. :)


While you can safely set the priority lower on the calling perl script,
setting db backend priorities lower can result in problems caused by
priority inversion  Look up that phrase on the pgsql admin, perform,
general, or hackers lists for an explanation, or go here:

http://en.wikipedia.org/wiki/Priority_inversion

I have a simple script that grabs raw data from an oracle db and shoves
it into a postgresql database for reporting purposes.  Every 100 rows I
put into postgresql, I usleep 10 or so and the load caused by that
script on both systems is minimal.  You might try something like that.


Will the priority of the script pass down to the pgsql queries it calls? 
I figured (likely incorrectly) that because the queries were executed by 
the psql server the queries ran with the server's priority. If this 
isn't the case, then perfect. :)


Thanks for the tip, too, it's something I will try.

Madi

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Setting nice values

2006-11-02 Thread Scott Marlowe
On Thu, 2006-11-02 at 09:25, Madison Kelly wrote:
 Scott Marlowe wrote:
  On Thu, 2006-11-02 at 09:14, Madison Kelly wrote:
  Hi all,
 
 I've got a script (perl, in case it matters) that I need to run once
  a month to prepare statements. This script queries and updates the
  database a *lot*. I am not concerned with the performance of the SQL
  calls so much as I am about the impact it has on the server's load.
 
 Is there a way to limit queries speed (ie: set a low 'nice' value on
  a query)? This might be an odd question, or I could be asking the
  question the wrong way, but hopefully you the idea. :)
  
  While you can safely set the priority lower on the calling perl script,
  setting db backend priorities lower can result in problems caused by
  priority inversion  Look up that phrase on the pgsql admin, perform,
  general, or hackers lists for an explanation, or go here:
  
  http://en.wikipedia.org/wiki/Priority_inversion
  
  I have a simple script that grabs raw data from an oracle db and shoves
  it into a postgresql database for reporting purposes.  Every 100 rows I
  put into postgresql, I usleep 10 or so and the load caused by that
  script on both systems is minimal.  You might try something like that.
 
 Will the priority of the script pass down to the pgsql queries it calls? 
 I figured (likely incorrectly) that because the queries were executed by 
 the psql server the queries ran with the server's priority. If this 
 isn't the case, then perfect. :)

nope, the priorities don't pass down.  you connect via a client lib to
the server, which spawns a backend process that does the work for you. 
The backend process inherits its priority from the postmaster that
spawns it, and they all run at the same priority.

 Thanks for the tip, too, it's something I will try.

Sometimes it's the simple solutions that work best.  :)  Welcome to the
world of pgsql, btw...

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Setting nice values

2006-11-02 Thread Tobias Brox
[Madison Kelly - Thu at 10:25:07AM -0500]
 Will the priority of the script pass down to the pgsql queries it calls? 
 I figured (likely incorrectly) that because the queries were executed by 
 the psql server the queries ran with the server's priority. 

I think you are right, and in any case, I don't think the niceness
value won't help much if the bottleneck is iowait.

In our application, I've made a special function for doing
low-priority transactions which I believe is quite smart - though maybe
not always.  Before introducing this logic, we observed we had a tipping
point, too many queries, and the database wouldn't swallow them fast
enough, and the database server just jammed up, trying to work at too
many queries at once, yielding the results far too slow.

In the config file, I now have those two flags set:

 stats_start_collector = on
 stats_command_string = on

This will unfortunately cause some CPU-load, but the benefit is great
- one can actually check what the server is working with at any time:

  select * from pg_stat_activity

with those, it is possible to check a special view pg_stat_activity -
it will contain all the queries the database is working on right now.
My idea is to peek into this table - if there is no active queries,
the database is idle, and it's safe to start our low-priority
transaction.  If this view is full of stuff, one should certainly not
run any low-priority transactions, rather sleep a bit and try again
later.

 select count(*) from pg_stat_activity where not current_query like
 'IDLE%' and query_start+?now()

The algorithm takes four parameters, the time value to put in above,
the maximum number of queries allowed to run, the sleep time between
each attempt, and the amount of attempts to try before giving up.


So here are the cons and drawbacks:

 con: Given small queries and small transactions, one can tune this in
  such a way that the low priority queries (almost) never causes
  significant delay for the higher priority queries.

 con: can be used to block users of an interactive query
  application to cause disturbances on the production database.

 con: can be used for pausing low-priority batch jobs to execute only
  when the server is idle.

 drawback: unsuitable for long-running queries and transactions 

 drawback: with fixed values in the parameters above, one risks that
   the queries never gets run if the server is sufficiently stressed.

 drawback: the stats collection requires some CPU

 drawback: the select * from pg_stats_activity query requires some CPU

 drawback: the pg_stats_activity-view is constant within the
   transaction, so one has to roll back if there is activity
   (this is however not a really bad thing, because one
   certainly shouldn't live an idle transaction around if the
   database is stressed).

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster