Re: [PERFORM] Setting nice values
Tobias Brox wrote: [Madison Kelly - Thu at 10:25:07AM -0500] Will the priority of the script pass down to the pgsql queries it calls? I figured (likely incorrectly) that because the queries were executed by the psql server the queries ran with the server's priority. I think you are right, and in any case, I don't think the niceness value won't help much if the bottleneck is iowait. In our application, I've made a special function for doing low-priority transactions which I believe is quite smart - though maybe not always. Before introducing this logic, we observed we had a tipping point, too many queries, and the database wouldn't swallow them fast enough, and the database server just jammed up, trying to work at too many queries at once, yielding the results far too slow. In the config file, I now have those two flags set: stats_start_collector = on stats_command_string = on This will unfortunately cause some CPU-load, but the benefit is great - one can actually check what the server is working with at any time: select * from pg_stat_activity with those, it is possible to check a special view pg_stat_activity - it will contain all the queries the database is working on right now. My idea is to peek into this table - if there is no active queries, the database is idle, and it's safe to start our low-priority transaction. If this view is full of stuff, one should certainly not run any low-priority transactions, rather sleep a bit and try again later. select count(*) from pg_stat_activity where not current_query like 'IDLE%' and query_start+?now() The algorithm takes four parameters, the time value to put in above, the maximum number of queries allowed to run, the sleep time between each attempt, and the amount of attempts to try before giving up. So here are the cons and drawbacks: con: Given small queries and small transactions, one can tune this in such a way that the low priority queries (almost) never causes significant delay for the higher priority queries. con: can be used to block users of an interactive query application to cause disturbances on the production database. con: can be used for pausing low-priority batch jobs to execute only when the server is idle. drawback: unsuitable for long-running queries and transactions drawback: with fixed values in the parameters above, one risks that the queries never gets run if the server is sufficiently stressed. drawback: the stats collection requires some CPU drawback: the select * from pg_stats_activity query requires some CPU drawback: the pg_stats_activity-view is constant within the transaction, so one has to roll back if there is activity (this is however not a really bad thing, because one certainly shouldn't live an idle transaction around if the database is stressed). I can see how this would be very useful (and may make use of it later!). For the current job at hand though, at full tilt it can take a few hours to run, which puts it into your drawback section. The server in question is also almost under load of some sort, too. A great tip and one I am sure to make use of later, thanks! Madi ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Setting nice values
Scott Marlowe wrote: nope, the priorities don't pass down. you connect via a client lib to the server, which spawns a backend process that does the work for you. The backend process inherits its priority from the postmaster that spawns it, and they all run at the same priority. Shoot, but figured. :) Thanks for the tip, too, it's something I will try. Sometimes it's the simple solutions that work best. :) Welcome to the world of pgsql, btw... Heh, if only I was new to pgsql I wouldn't feel silly for asking so many questions :P. In the same right though, I enjoy PgSQL/Linux/FOSS in general *because* there seems to never be a shortage of things to learn. Thanks! Madi ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Setting nice values
Andreas Kostyrka wrote: Am Donnerstag, den 02.11.2006, 09:41 -0600 schrieb Scott Marlowe: Sometimes it's the simple solutions that work best. :) Welcome to the world of pgsql, btw... OTOH, there are also non-simple solutions to this, which might make sense anyway: Install slony, and run your queries against a readonly replica of your data. Bingo! This seems like exactly what we can/should do, and it will likely help with other jobs we run, too. I feel a little silly for not having thought of this myself... Guess I was too focused on niceness :). Thanks! Madi ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Setting nice values
[Madison Kelly - Mon at 08:10:12AM -0500] to run, which puts it into your drawback section. The server in question is also almost under load of some sort, too. A great tip and one I am sure to make use of later, thanks! I must have been sleepy, listing up cons vs drawbacks ;-) Anyway, the central question is not the size of the job, but the size of the transactions within the job - if the job consists of many transactions, my test can be run before every transaction. Having transactions lasting for hours is a very bad thing to do, anyway. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Setting nice values
Tobias Brox wrote: [Madison Kelly - Mon at 08:10:12AM -0500] to run, which puts it into your drawback section. The server in question is also almost under load of some sort, too. A great tip and one I am sure to make use of later, thanks! I must have been sleepy, listing up cons vs drawbacks ;-) :) I noticed but figured what you meant (I certainly do similar flubs!). Anyway, the central question is not the size of the job, but the size of the transactions within the job - if the job consists of many transactions, my test can be run before every transaction. Having transactions lasting for hours is a very bad thing to do, anyway. Ah, sorry, long single queries is what you meant. I have inherited this code so I am not sure how long a given query takes, though they do use a lot of joins and such, so I suspect it isn't quick; indexes aside. When I get some time (and get the backup server running) I plan to play with this. Currently the DB is on a production server so I am hesitant to poke around just now. Once I get the backup server though, I will play with your suggestions. I am quite curious to see how it will work out. Thanks again! Madi ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] Yet another question on LIMIT performance :/
Though I've read recent threads, I'm unsure if any matches my case. We have 2 tables: revisions and revisions_active. revisions contains 117707 rows, revisions_active 17827 rows. DDL: http://hannes.imos.net/ddl.sql.txt Joining the 2 tables without an additional condition seems ok for me (given our outdated hardware): http://hannes.imos.net/query_1.sql.txt What worries me is the performance when limiting the recordset: http://hannes.imos.net/query_2.sql.txt Though it should only have to join a few rows it seems to scan all rows. From experience I thought that adding an ORDER BY on the index columns should speed it up. But no effect: http://hannes.imos.net/query_3.sql.txt I'm on 8.1.5, statistics (ANALYZE) are up to date, the tables have each been CLUSTERed by PK, statistic target for the join columns has been set to 100 (without any effect). Thanks in advance! -- Regards, Hannes Dorbath ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Setting nice values
[Madison Kelly - Mon at 08:48:19AM -0500] Ah, sorry, long single queries is what you meant. No - long running single transactions :-) If it's only read-only queries, one will probably benefit by having one transaction for every query. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Yet another question on LIMIT performance :/
Hannes Dorbath wrote: Though it should only have to join a few rows it seems to scan all rows. What makes you think that's the case? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Yet another question on LIMIT performance :/
On 06.11.2006 15:13, Heikki Linnakangas wrote: Hannes Dorbath wrote: Though it should only have to join a few rows it seems to scan all rows. What makes you think that's the case? Sorry, not all rows, but 80753. It's not clear to me why this number is so high with LIMIT 10. -- Regards, Hannes Dorbath ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Setting nice values
Tobias Brox wrote: [Madison Kelly - Mon at 08:48:19AM -0500] Ah, sorry, long single queries is what you meant. No - long running single transactions :-) If it's only read-only queries, one will probably benefit by having one transaction for every query. In this case, what happens is one kinda ugly big transaction is read into a hash, and then looped through (usually ~10,000 rows). On each loop another, slightly less ugly query is performed based on the first query's values now in the hash (these queries being where throttling might help). Then after the second query is parsed a PDF file is created (also a big source of slowness). It isn't entirely read-only though because as the PDFs are created a flag is updated in the given record's row. So yeah, need to experiment some. :) Madi ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Yet another question on LIMIT performance :/
Heikki Linnakangas [EMAIL PROTECTED] writes: Hannes Dorbath wrote: Though it should only have to join a few rows it seems to scan all rows. What makes you think that's the case? What it looks like to me is that the range of keys present in pk_revisions_active corresponds to just the upper end of the range of keys present in pk_revisions (somehow not too surprising). So the mergejoin isn't the most effective plan possible for this case --- it has to scan through much of pk_revisions before it starts getting matches. The planner doesn't have any model for that though, and is costing the plan on the assumption of uniformly-distributed matches. A nestloop plan would be faster for this specific case, but much slower if a large number of rows were requested. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] profiling PL/pgSQL?
am Fri, dem 03.11.2006, um 3:12:14 -0800 mailte Drew Wilson folgendes: I have 700 lines of non-performant pgSQL code that I'd like to profile to see what's going on. What's the best way to profile stored procedures? RAISE NOTICE, you can raise the aktual time within a transaction with timeofday() Of course you only have very small values of best available with plpgsql debugging. There's a GUI debugger from EnterpriseDB I believe, but I've no idea how good it is. Any users/company bods care to let us know? It's an excellent debugger (of course, I'm a bit biased). We are working on open-sourcing it now - we needed some of the plugin features in 8.2. As Jonah pointed out, we also have a PL/pgSQL profiler (already open-sourced but a bit tricky to build). The profiler tells you how much CPU time you spent at each line of PL/pgSQL code, how many times you executed each line of code, and how much I/O was caused by each line (number of scans, blocks fetched, blocks hit, tuples returned, tuples fetched, tuples inserted, tuples updated, tuples deleted). It's been a while since I looked at it, but I seem to remember that it spits out an XML report that you can coax into a nice HTML page via the XSLT. The plugin_profiler needs to be converted over to the plugin architecture in 8.2, but that's not a lot of work. -- Korry -- Korry Douglas [EMAIL PROTECTED] EnterpriseDB http://www.enterprisedb.com
RES: [PERFORM] Context switching
Hi, Sorry, but this message was already post some days before! Thank you! Carlos -Mensagem original-De: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]Em nome de Carlos H. ReimerEnviada em: quarta-feira, 1 de novembro de 2006 03:23Para: pgsql-performance@postgresql.orgAssunto: [PERFORM] Context switching Hi, We've migrated one of our servers from pg 7.4 to 8.1 and from times to times (4 hours) the server start doing a lot of context switching and all transactions become very slow. The average context switching for this server as vmstat shows is 1 but when the problem occurs it goes to 25. CPU and memory usage are ok. What can produce this context switching storms? It is a box with 12GB RAM and 4 processors running RedHat Enterprise Linux AS. Thank you in advance! Reimer[EMAIL PROTECTED]OpenDB Serviços e Treinamentos PostgreSQL e DB2Fone: 47 3327-0878 Cel: 47 9602-0151www.opendb.com.br
[PERFORM] Easy read-heavy benchmark kicking around?
I'm having a spot of problem with out storage device vendor. Read performance (as measured by both bonnie++ and hdparm -t) is abysmal (~14Mbyte/sec), and we're trying to get them to fix it. Unfortunately, they're using the fact that bonnie++ is an open source benchmark to weasle out of doing anything- they can't fix it unless I can show an impact in Postgresql. So the question is: is there an easy to install and run, read-heavy benchmark out there that I can wave at them to get them to fix the problem? I have a second database running on a single SATA drive, so I can use that as a comparison point- look, we're getting 1/3rd the read speed of a single SATA drive- this sucks! Any advice? Brian ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Easy read-heavy benchmark kicking around?
Select count(*) from table-twice-size-of-ram Divide the query time by the number of pages in the table times the pagesize (normally 8KB) and you have your net disk rate. - Luke Msg is shrt cuz m on ma treo -Original Message- From: Brian Hurt [mailto:[EMAIL PROTECTED] Sent: Monday, November 06, 2006 03:49 PM Eastern Standard Time To: pgsql-performance@postgresql.org Subject:[PERFORM] Easy read-heavy benchmark kicking around? I'm having a spot of problem with out storage device vendor. Read performance (as measured by both bonnie++ and hdparm -t) is abysmal (~14Mbyte/sec), and we're trying to get them to fix it. Unfortunately, they're using the fact that bonnie++ is an open source benchmark to weasle out of doing anything- they can't fix it unless I can show an impact in Postgresql. So the question is: is there an easy to install and run, read-heavy benchmark out there that I can wave at them to get them to fix the problem? I have a second database running on a single SATA drive, so I can use that as a comparison point- look, we're getting 1/3rd the read speed of a single SATA drive- this sucks! Any advice? Brian ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Easy read-heavy benchmark kicking around?
On 11/6/06, Brian Hurt [EMAIL PROTECTED] wrote: I'm having a spot of problem with out storage device vendor. Read performance (as measured by both bonnie++ and hdparm -t) is abysmal (~14Mbyte/sec), and we're trying to get them to fix it. Unfortunately, they're using the fact that bonnie++ is an open source benchmark to weasle out of doing anything- they can't fix it unless I can show an impact in Postgresql. So the question is: is there an easy to install and run, read-heavy benchmark out there that I can wave at them to get them to fix the problem? I have a second database running on a single SATA drive, so I can use that as a comparison point- look, we're getting 1/3rd the read speed of a single SATA drive- this sucks! hitachi? my experience with storage vendors is when they say things like that they know full well their device completely sucks and are just stalling so that you give up. merlin ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Easy read-heavy benchmark kicking around?
On Mon, 2006-11-06 at 15:09, Merlin Moncure wrote: On 11/6/06, Brian Hurt [EMAIL PROTECTED] wrote: I'm having a spot of problem with out storage device vendor. Read performance (as measured by both bonnie++ and hdparm -t) is abysmal (~14Mbyte/sec), and we're trying to get them to fix it. Unfortunately, they're using the fact that bonnie++ is an open source benchmark to weasle out of doing anything- they can't fix it unless I can show an impact in Postgresql. So the question is: is there an easy to install and run, read-heavy benchmark out there that I can wave at them to get them to fix the problem? I have a second database running on a single SATA drive, so I can use that as a comparison point- look, we're getting 1/3rd the read speed of a single SATA drive- this sucks! hitachi? my experience with storage vendors is when they say things like that they know full well their device completely sucks and are just stalling so that you give up. Man, if I were the OP I'd be naming names, and letting the idiots at INSERT MAJOR VENDOR HERE know that I was naming names to the whole of the postgresql community and open source as well to make the point that if they look down on open source so much, then open source should look down on them. PostgreSQL is open source software, BSD and Linux are open source / free software. bonnie++'s licensing shouldn't matter one nit, and I'd let everyone know how shittily I was being treated by this vendor until their fixed their crap or took it back. Note that if you're using fibre channel etc... the problem might well be in your own hardware / device drivers. There are a lot of real crap FC and relative cards out there. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Context switch storm
Andreas Kostyrka wrote: The solution for us has been twofold: upgrade to the newest PG version available at the time while we waited for our new Opteron-based DB hardware to arrive. Do you remember the exact Pg version? -- Cosimo ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Help w/speeding up range queries?
On Oct 31, 2006, at 8:29 PM, Tom Lane wrote: John Major [EMAIL PROTECTED] writes: My problem is, I often need to execute searches of tables like these which find All features within a range. Ie: select FeatureID from SIMPLE_TABLE where FeatureChromosomeName like 'chrX' and StartPosition 1000500 and EndPosition 200; A standard btree index is just going to suck for these types of queries; you need something that's actually designed for spatial range queries. You might look at the contrib/seg module --- if you can store your ranges as seg datatype then the seg overlap operator expresses what you need to do, and searches on an overlap operator can be handled well by a GIST index. Also, there's the PostGIS stuff, though it might be overkill for what you want. Another possibility (think Tom has suggested in the past) is to define Start and End as a box, and then use the geometric functions built into plain PostgreSQL (though perhaps that's what he meant by PostGIS stuff). -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Easy read-heavy benchmark kicking around?
Brian Hurt wrote: I'm having a spot of problem with out storage device vendor. Read performance (as measured by both bonnie++ and hdparm -t) is abysmal (~14Mbyte/sec), and we're trying to get them to fix it. Unfortunately, they're using the fact that bonnie++ is an open source benchmark to weasle out of doing anything- they can't fix it unless I can show an impact in Postgresql. So the question is: is there an easy to install and run, read-heavy benchmark out there that I can wave at them to get them to fix the problem? I have a second database running on a single SATA drive, so I can use that as a comparison point- look, we're getting 1/3rd the read speed of a single SATA drive- this sucks! You could use the lineitem table from the TPC-H dataset (http://www.tpc.org/tpch/default.asp). Generate the dataset for a scale factor that makes lineitem about 2x your ram, load the table and do: SELECT count(*) FROM lineitem vmstat or iostat while this is happening should display your meager throughput well enough to get your vendors attention (I'm checking this on a fairly old 4 disk system of mine as I type this - I'm seeing about 90Mb/s...) best wishes Mark ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate