Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Tobias Brox wrote:

[Madison Kelly - Thu at 10:25:07AM -0500]
Will the priority of the script pass down to the pgsql queries it calls? 
I figured (likely incorrectly) that because the queries were executed by 
the pgsql server, the queries ran with the server's priority. 


I think you are right, and in any case, I don't think the niceness
value will help much if the bottleneck is iowait.

In our application, I've made a special function for doing
low-priority transactions which I believe is quite smart - though maybe
not always.  Before introducing this logic, we observed a tipping
point: once too many queries came in, the database couldn't swallow
them fast enough, the server just jammed up trying to work on too many
queries at once, and the results came back far too slowly.

In the config file, I now have these two flags set:

 stats_start_collector = on
 stats_command_string = on

This will unfortunately cause some CPU load, but the benefit is great
- one can actually check what the server is working on at any time:

  select * from pg_stat_activity

With those set, it is possible to check the special view pg_stat_activity -
it contains all the queries the database is working on right now.
My idea is to peek into this view - if there are no active queries,
the database is idle, and it's safe to start our low-priority
transaction.  If the view is full of stuff, one should certainly not
run any low-priority transactions, but rather sleep a bit and try again
later.

 select count(*) from pg_stat_activity where not current_query like
 '<IDLE>%' and query_start + ? < now()

The algorithm takes four parameters: the time value to put in above,
the maximum number of queries allowed to run, the sleep time between
each attempt, and the number of attempts to make before giving up.
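
A minimal sketch of one such attempt (the time value of 5 seconds is
purely illustrative; the surrounding loop - sleep and retry up to the
configured number of attempts - lives in the application):

  -- one attempt: count active queries that have already been running
  -- longer than the chosen time value
  select count(*)
    from pg_stat_activity
   where current_query not like '<IDLE>%'
     and query_start + interval '5 seconds' < now();
  -- application side: if the count is below the allowed maximum, start the
  -- low-priority transaction; otherwise sleep the configured time and retry,
  -- giving up after the configured number of attempts.  Each attempt should
  -- be its own short transaction, since pg_stat_activity does not change
  -- within a transaction (see the drawbacks below).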


So here are the cons and drawbacks:

 con: Given small queries and small transactions, one can tune this in
  such a way that the low priority queries (almost) never cause
  significant delay for the higher priority queries.

 con: can be used to block users of an interactive query
  application from causing disturbances on the production database.

 con: can be used to pause low-priority batch jobs so they execute only
  when the server is idle.

 drawback: unsuitable for long-running queries and transactions 


 drawback: with fixed values in the parameters above, one risks that
   the queries never get run if the server is sufficiently stressed.

 drawback: the stats collection requires some CPU

 drawback: the select * from pg_stat_activity query requires some CPU

 drawback: the pg_stat_activity view is constant within the
   transaction, so one has to roll back if there is activity
   (this is, however, not a really bad thing, because one
   certainly shouldn't leave an idle transaction around if the
   database is stressed).


I can see how this would be very useful (and may make use of it later!). 
For the current job at hand though, at full tilt it can take a few hours 
to run, which puts it into your drawback section. The server in 
question is also almost always under load of some sort, too.


A great tip and one I am sure to make use of later, thanks!

Madi

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Scott Marlowe wrote:

nope, the priorities don't pass down.  you connect via a client lib to
the server, which spawns a backend process that does the work for you. 
The backend process inherits its priority from the postmaster that
spawns it, and they all run at the same priority.
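
For what it's worth, a backend's priority can still be lowered by hand
after it has been spawned - a rough sketch, assuming shell access to the
database server (and remember that renice only affects CPU priority,
not I/O):

  -- on 8.x the process id column is procpid (it became pid in later releases)
  select procpid, usename, current_query
    from pg_stat_activity
   where current_query not like '<IDLE>%';
  -- then, from a shell on the server, e.g.:  renice 10 -p <procpid>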


Shoot, but figured. :)


Thanks for the tip, too, it's something I will try.


Sometimes it's the simple solutions that work best.  :)  Welcome to the
world of pgsql, btw...


Heh, if only I was new to pgsql I wouldn't feel silly for asking so many 
questions :P. By the same token though, I enjoy PgSQL/Linux/FOSS in 
general *because* there never seems to be a shortage of things to learn.


Thanks!

Madi

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Andreas Kostyrka wrote:

On Thursday, 02.11.2006 at 09:41 -0600, Scott Marlowe wrote:

Sometimes it's the simple solutions that work best.  :)  Welcome to the
world of pgsql, btw...


OTOH, there are also non-simple solutions to this, which might make
sense anyway: Install slony, and run your queries against a readonly
replica of your data.


Bingo! This seems like exactly what we can/should do, and it will likely 
help with other jobs we run, too.


I feel a little silly for not having thought of this myself... Guess I 
was too focused on niceness :). Thanks!


Madi

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Setting nice values

2006-11-06 Thread Tobias Brox
[Madison Kelly - Mon at 08:10:12AM -0500]
 to run, which puts it into your drawback section. The server in 
 question is also almost always under load of some sort, too.
 
 A great tip and one I am sure to make use of later, thanks!

I must have been sleepy, listing up cons vs drawbacks ;-)

Anyway, the central question is not the size of the job, but the size of
the transactions within the job - if the job consists of many
transactions, my test can be run before every transaction.  Having
transactions lasting for hours is a very bad thing to do, anyway.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Tobias Brox wrote:

[Madison Kelly - Mon at 08:10:12AM -0500]
to run, which puts it into your drawback section. The server in 
question is also almost always under load of some sort, too.


A great tip and one I am sure to make use of later, thanks!


I must have been sleepy, listing up cons vs drawbacks ;-)


:) I noticed but figured out what you meant (I certainly make similar flubs!).


Anyway, the central question is not the size of the job, but the size of
the transactions within the job - if the job consists of many
transactions, my test can be run before every transaction.  Having
transactions lasting for hours is a very bad thing to do, anyway.


Ah, sorry, long single queries is what you meant. I have inherited this 
code so I am not sure how long a given query takes, though they do use a 
lot of joins and such, so I suspect it isn't quick; indexes aside. When 
I get some time (and get the backup server running) I plan to play with 
this. Currently the DB is on a production server so I am hesitant to 
poke around just now. Once I get the backup server though, I will play 
with your suggestions. I am quite curious to see how it will work out.


Thanks again!

Madi


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[PERFORM] Yet another question on LIMIT performance :/

2006-11-06 Thread Hannes Dorbath

Though I've read recent threads, I'm unsure if any matches my case.

We have 2 tables: revisions and revisions_active. revisions contains 
117707 rows, revisions_active 17827 rows.


DDL: http://hannes.imos.net/ddl.sql.txt

Joining the 2 tables without an additional condition seems ok to me 
(given our outdated hardware): http://hannes.imos.net/query_1.sql.txt


What worries me is the performance when limiting the recordset:
http://hannes.imos.net/query_2.sql.txt

Though it should only have to join a few rows, it seems to scan all rows. 
From experience I thought that adding an ORDER BY on the index columns 
would speed it up. But no effect: http://hannes.imos.net/query_3.sql.txt


I'm on 8.1.5, statistics (ANALYZE) are up to date, the tables have each 
been CLUSTERed by PK, and the statistics target for the join columns has 
been set to 100 (without any effect).
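
For reference, the maintenance steps described above look roughly like
this (the PK index names appear later in the thread; the column names are
placeholders, the real DDL is behind the link above):

  CLUSTER pk_revisions ON revisions;
  CLUSTER pk_revisions_active ON revisions_active;
  -- 'id' is a guess at the join column's name
  ALTER TABLE revisions ALTER COLUMN id SET STATISTICS 100;
  ALTER TABLE revisions_active ALTER COLUMN id SET STATISTICS 100;
  ANALYZE revisions;
  ANALYZE revisions_active;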



Thanks in advance!


--
Regards,
Hannes Dorbath

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Setting nice values

2006-11-06 Thread Tobias Brox
[Madison Kelly - Mon at 08:48:19AM -0500]
 Ah, sorry, long single queries is what you meant. 

No - long-running single transactions :-)  If it's only read-only
queries, one will probably benefit by having one transaction for every
query.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Yet another question on LIMIT performance :/

2006-11-06 Thread Heikki Linnakangas

Hannes Dorbath wrote:
Though it should only have to join a few rows it seems to scan all rows. 


What makes you think that's the case?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Yet another question on LIMIT performance :/

2006-11-06 Thread Hannes Dorbath

On 06.11.2006 15:13, Heikki Linnakangas wrote:

Hannes Dorbath wrote:
Though it should only have to join a few rows it seems to scan all rows. 


What makes you think that's the case?


Sorry, not all rows, but 80753. It's not clear to me why this number is 
so high with LIMIT 10.



--
Regards,
Hannes Dorbath

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Setting nice values

2006-11-06 Thread Madison Kelly

Tobias Brox wrote:

[Madison Kelly - Mon at 08:48:19AM -0500]
Ah, sorry, long single queries is what you meant. 


No - long-running single transactions :-)  If it's only read-only
queries, one will probably benefit by having one transaction for every
query.



In this case, what happens is that the result of one kinda ugly big query 
is read into a hash, and then looped through (usually ~10,000 rows). On each 
loop another, slightly less ugly query is performed based on the first 
query's values now in the hash (these queries being where throttling 
might help). Then after the second query is parsed, a PDF file is created 
(also a big source of slowness). It isn't entirely read-only though, 
because as the PDFs are created a flag is updated in the given record's 
row. So yeah, need to experiment some. :)


Madi

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Yet another question on LIMIT performance :/

2006-11-06 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Hannes Dorbath wrote:
 Though it should only have to join a few rows it seems to scan all rows. 

 What makes you think that's the case?

What it looks like to me is that the range of keys present in
pk_revisions_active corresponds to just the upper end of the range of
keys present in pk_revisions (somehow not too surprising).  So the
mergejoin isn't the most effective plan possible for this case --- it
has to scan through much of pk_revisions before it starts getting
matches.  The planner doesn't have any model for that though, and is
costing the plan on the assumption of uniformly-distributed matches.

A nestloop plan would be faster for this specific case, but much
slower if a large number of rows were requested.
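
One quick way to check whether a nestloop actually wins here is to turn
the merge join off for a single session and rerun the plan (the query
shape and column names below are guesses; the real query is in
query_2.sql.txt):

  SET enable_mergejoin = off;
  EXPLAIN ANALYZE
  SELECT r.*
    FROM revisions_active a
    JOIN revisions r ON r.id = a.id
   ORDER BY r.id
   LIMIT 10;
  RESET enable_mergejoin;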

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] profiling PL/pgSQL?

2006-11-06 Thread korryd

 On Fri, 03.11.2006 at 3:12:14 -0800, Drew Wilson wrote the following:
 I have 700 lines of non-performant pgSQL code that I'd like to  
 profile to see what's going on.

 What's the best way to profile stored procedures?
 
 RAISE NOTICE; you can output the actual time within a transaction with
 timeofday()
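
A minimal illustration of that approach (function and table names are made
up; timeofday() keeps advancing within a transaction, unlike now()):

  CREATE OR REPLACE FUNCTION timing_example() RETURNS void AS $$
  BEGIN
      RAISE NOTICE 'start: %', timeofday();
      PERFORM count(*) FROM some_table;   -- stand-in for the slow step
      RAISE NOTICE 'after step 1: %', timeofday();
  END;
  $$ LANGUAGE plpgsql;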

Of course you only have very small values of "best" available with 
plpgsql debugging.

There's a GUI debugger from EnterpriseDB I believe, but I've no idea how 
good it is. Any users/company bods care to let us know?


It's an excellent debugger (of course, I'm a bit biased).

We are working on open-sourcing it now - we needed some of the plugin features in 8.2.

As Jonah pointed out, we also have a PL/pgSQL profiler (already open-sourced, but a bit tricky to build). The profiler tells you how much CPU time you spent at each line of PL/pgSQL code, how many times each line was executed, and how much I/O each line caused (number of scans, blocks fetched, blocks hit, tuples returned, tuples fetched, tuples inserted, tuples updated, tuples deleted).

It's been a while since I looked at it, but I seem to remember that it spits out an XML report that you can coax into a nice HTML page via XSLT.

The plugin_profiler needs to be converted over to the plugin architecture in 8.2, but that's not a lot of work.

 -- Korry

--
 Korry Douglas [EMAIL PROTECTED]
 EnterpriseDB http://www.enterprisedb.com


RES: [PERFORM] Context switching

2006-11-06 Thread Carlos H. Reimer



Hi,

Sorry, but this message was already posted some days before!

Thank you!

Carlos

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Carlos H. Reimer
Sent: Wednesday, 1 November 2006 03:23
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Context switching

  Hi,

  We've migrated one of our servers from pg 7.4 to 8.1, and from time to
  time (every 4 hours) the server starts doing a lot of context switching
  and all transactions become very slow.

  The average context switching for this server as shown by vmstat is 1,
  but when the problem occurs it goes to 25.

  CPU and memory usage are ok.

  What can produce these context switching storms?

  It is a box with 12GB RAM and 4 processors running RedHat Enterprise
  Linux AS.

  Thank you in advance!

  Reimer
  [EMAIL PROTECTED]
  OpenDB - PostgreSQL and DB2 Services and Training
  Phone: 47 3327-0878  Cell: 47 9602-0151
  www.opendb.com.br


[PERFORM] Easy read-heavy benchmark kicking around?

2006-11-06 Thread Brian Hurt
I'm having a spot of a problem with our storage device vendor.  Read 
performance (as measured by both bonnie++ and hdparm -t) is abysmal 
(~14Mbyte/sec), and we're trying to get them to fix it.  Unfortunately, 
they're using the fact that bonnie++ is an open source benchmark to 
weasel out of doing anything - they can't fix it unless I can show an 
impact in PostgreSQL.


So the question is: is there an easy-to-install-and-run, read-heavy 
benchmark out there that I can wave at them to get them to fix the 
problem?  I have a second database running on a single SATA drive, so I 
can use that as a comparison point - look, we're getting 1/3rd the read 
speed of a single SATA drive - this sucks!


Any advice?

Brian


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Easy read-heavy benchmark kicking around?

2006-11-06 Thread Luke Lonergan
Select count(*) from table-twice-size-of-ram

Divide the number of pages in the table times the page size (normally 8KB)
by the query time and you have your net disk rate.
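
As a rough sketch of that measurement (the table name is a placeholder;
8192 bytes is the default page size):

  \timing
  SELECT count(*) FROM bigtable;
  -- bytes actually scanned, from the planner's page count:
  SELECT relpages * 8192 AS approx_bytes FROM pg_class WHERE relname = 'bigtable';
  -- net disk rate ~= approx_bytes / elapsed query time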

- Luke

Msg is shrt cuz m on ma treo

 -Original Message-
From:   Brian Hurt [mailto:[EMAIL PROTECTED]
Sent:   Monday, November 06, 2006 03:49 PM Eastern Standard Time
To: pgsql-performance@postgresql.org
Subject:[PERFORM] Easy read-heavy benchmark kicking around?

I'm having a spot of a problem with our storage device vendor.  Read 
performance (as measured by both bonnie++ and hdparm -t) is abysmal 
(~14Mbyte/sec), and we're trying to get them to fix it.  Unfortunately, 
they're using the fact that bonnie++ is an open source benchmark to 
weasel out of doing anything - they can't fix it unless I can show an 
impact in PostgreSQL.

So the question is: is there an easy to install and run, read-heavy 
benchmark out there that I can wave at them to get them to fix the 
problem?  I have a second database running on a single SATA drive, so I 
can use that as a comparison point- look, we're getting 1/3rd the read 
speed of a single SATA drive- this sucks!

Any advice?

Brian


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Easy read-heavy benchmark kicking around?

2006-11-06 Thread Merlin Moncure

On 11/6/06, Brian Hurt [EMAIL PROTECTED] wrote:

I'm having a spot of a problem with our storage device vendor.  Read
performance (as measured by both bonnie++ and hdparm -t) is abysmal
(~14Mbyte/sec), and we're trying to get them to fix it.  Unfortunately,
they're using the fact that bonnie++ is an open source benchmark to
weasel out of doing anything - they can't fix it unless I can show an
impact in PostgreSQL.

So the question is: is there an easy to install and run, read-heavy
benchmark out there that I can wave at them to get them to fix the
problem?  I have a second database running on a single SATA drive, so I
can use that as a comparison point- look, we're getting 1/3rd the read
speed of a single SATA drive- this sucks!


hitachi?

my experience with storage vendors is that when they say things like
that, they know full well their device completely sucks and are just
stalling so that you give up.

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Easy read-heavy benchmark kicking around?

2006-11-06 Thread Scott Marlowe
On Mon, 2006-11-06 at 15:09, Merlin Moncure wrote:
 On 11/6/06, Brian Hurt [EMAIL PROTECTED] wrote:
  I'm having a spot of a problem with our storage device vendor.  Read
  performance (as measured by both bonnie++ and hdparm -t) is abysmal
  (~14Mbyte/sec), and we're trying to get them to fix it.  Unfortunately,
  they're using the fact that bonnie++ is an open source benchmark to
  weasel out of doing anything - they can't fix it unless I can show an
  impact in PostgreSQL.
 
  So the question is: is there an easy to install and run, read-heavy
  benchmark out there that I can wave at them to get them to fix the
  problem?  I have a second database running on a single SATA drive, so I
  can use that as a comparison point- look, we're getting 1/3rd the read
  speed of a single SATA drive- this sucks!
 
 hitachi?
 
 my experience with storage vendors is when they say things like that
 they know full well their device completely sucks and are just
 stalling so that you give up.

Man, if I were the OP I'd be naming names, and letting the idiots at
INSERT MAJOR VENDOR HERE know that I was naming names to the whole of
the postgresql community and open source as well to make the point that
if they look down on open source so much, then open source should look
down on them.

PostgreSQL is open source software, BSD and Linux are open source / free
software.  bonnie++'s licensing shouldn't matter one nit, and I'd let
everyone know how shittily I was being treated by this vendor until
they fixed their crap or took it back.

Note that if you're using fibre channel etc... the problem might well be
in your own hardware / device drivers.  There are a lot of real crap FC
and related cards out there.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Context switch storm

2006-11-06 Thread Cosimo Streppone

Andreas Kostyrka wrote:


The solution for us has been twofold:

upgrade to the newest PG version available at the time while we waited
for our new Opteron-based DB hardware to arrive.


Do you remember the exact Pg version?

--
Cosimo


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Help w/speeding up range queries?

2006-11-06 Thread Jim Nasby

On Oct 31, 2006, at 8:29 PM, Tom Lane wrote:

John Major [EMAIL PROTECTED] writes:

My problem is, I often need to execute searches of tables like these
which find all features within a range.
Ie: select FeatureID from SIMPLE_TABLE where FeatureChromosomeName like
'chrX' and StartPosition < 1000500 and EndPosition > 200;


A standard btree index is just going to suck for these types of queries;
you need something that's actually designed for spatial range queries.
You might look at the contrib/seg module --- if you can store your
ranges as seg datatype then the seg overlap operator expresses what
you need to do, and searches on an overlap operator can be handled
well by a GIST index.

Also, there's the PostGIS stuff, though it might be overkill for what
you want.
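
A rough sketch of the contrib/seg idea, reusing the column names from the
query above (the expression and index details are illustrative, not a
tested recipe):

  -- after installing contrib/seg into the database:
  ALTER TABLE simple_table ADD COLUMN span seg;
  UPDATE simple_table
     SET span = (startposition::text || ' .. ' || endposition::text)::seg;
  CREATE INDEX simple_table_span_gist ON simple_table USING gist (span);

  -- "all features within a range" then becomes an indexable overlap test:
  SELECT featureid
    FROM simple_table
   WHERE featurechromosomename = 'chrX'
     AND span && '200 .. 1000500'::seg;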


Another possibility (I think Tom has suggested it in the past) is to
define Start and End as a box, and then use the geometric functions
built into plain PostgreSQL (though perhaps that's what he meant by
the PostGIS stuff).

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Easy read-heavy benchmark kicking around?

2006-11-06 Thread Mark Kirkwood

Brian Hurt wrote:
I'm having a spot of a problem with our storage device vendor.  Read 
performance (as measured by both bonnie++ and hdparm -t) is abysmal 
(~14Mbyte/sec), and we're trying to get them to fix it.  Unfortunately, 
they're using the fact that bonnie++ is an open source benchmark to 
weasel out of doing anything - they can't fix it unless I can show an 
impact in PostgreSQL.


So the question is: is there an easy to install and run, read-heavy 
benchmark out there that I can wave at them to get them to fix the 
problem?  I have a second database running on a single SATA drive, so I 
can use that as a comparison point- look, we're getting 1/3rd the read 
speed of a single SATA drive- this sucks!




You could use the lineitem table from the TPC-H dataset 
(http://www.tpc.org/tpch/default.asp).


Generate the dataset for a scale factor that makes lineitem about 2x 
your RAM, load the table, and do:


SELECT count(*) FROM lineitem

vmstat or iostat while this is happening should display your meager 
throughput well enough to get your vendor's attention (I'm checking this 
on a fairly old 4 disk system of mine as I type this - I'm seeing about 
90Mb/s...)
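
A sketch of the load step, assuming the standard TPC-H lineitem DDL and a
dbgen-produced flat file (the path and the exact dbgen invocation depend
on the kit you use):

  -- load the generated flat file; dbgen emits '|'-delimited rows
  -- (the trailing '|' on each line may need to be stripped first)
  COPY lineitem FROM '/path/to/lineitem.tbl' WITH DELIMITER '|';
  ANALYZE lineitem;
  -- then run the count(*) above while watching vmstat/iostat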


best wishes

Mark

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate