[PERFORM] Temporary Table
Does creating a temporary table in a function and NOT dropping it affect the performance of the database?
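For what it's worth, one pattern that avoids any build-up is to let the temporary table drop itself at commit; this is a generic sketch (table and column names are made up, not from the thread):

```sql
BEGIN;
-- ON COMMIT DROP removes the table automatically at transaction end,
-- so nothing accumulates in the system catalogs between calls
CREATE TEMPORARY TABLE scratch (id integer, val text) ON COMMIT DROP;
-- ... work with scratch here ...
COMMIT;
```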
FW: [PERFORM] Used Memory
Here is the configuration of our database server:

port = 5432
max_connections = 300
superuser_reserved_connections = 10
authentication_timeout = 60
shared_buffers = 48000
sort_mem = 32168
sync = false

Do you think this is enough, or can you recommend a better configuration for my server? The server is also running PHP and Apache, but we're not using them extensively; it's for development purposes only. The database slowdown occurs most of the time (when free memory is low). I don't think it has anything to do with vacuum. We only run a full server vacuum once a day. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
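One thing worth flagging about the settings above: sort_mem is allocated per sort, per backend, so 32168 kB combined with max_connections = 300 can exhaust memory under concurrent load, which matches the "slow when free memory is low" symptom. A more conservative sketch is below; the absolute numbers assume a machine with roughly 2 GB of RAM, which is an assumption, not something stated in the thread:

```
# postgresql.conf sketch -- illustrative values only
max_connections = 100      # raise only if genuinely needed
shared_buffers = 25000     # ~200 MB at 8 kB per buffer
sort_mem = 4096            # per sort, per backend: keep this modest
```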
[PERFORM] Why Index is not working on date columns.
Hi All, Thank you very much for your help in configuring the database. Can you guys please take a look at the following query and let me know why the index is not considered in the plan?

Here is the extract of the condition string of the query that is taking the transaction_date in the index condition:

where (account.id = search_engine.account_fk)
  and (account.status = 't'
       and account.id = '40288a820726362f0107263c55d3')
  and (search_engine.id = conversion.search_engine_fk
       and conversion.event_type = 'daily_spend'
       and conversion.tactic = 'PPC'
       and conversion.transaction_date between '2005-01-01 00:00:00' and '2005-10-31 23:59:59')
group by account.id;

Plan:

 ->  Index Scan using conversion_all on "conversion" (cost=0.00..6.02 rows=1 width=98)
       Index Cond: (((tactic)::text = 'PPC'::text) AND ((event_type)::text = 'daily_spend'::text) AND (transaction_date >= '2005-01-01 00:00:00'::timestamp without time zone) AND (transaction_date <= '2005-10-31 23:59:59'::timestamp without time zon (..)

Here is the extract of the condition string of the query that is not taking the transaction_date in the index condition:

where (account.status = 't'
       and account.id = search_engine.account_fk
       and account.id = '40288a820726362f0107263c55d3')
  and (search_engine.id = conversion.search_engine_fk
       and conversion.tactic = 'PPC'
       and conversion.transaction_date >= '2005-01-01 00:00:00'
       and conversion.transaction_date <= '2005-10-31 23:59:59')
group by account.id;

Plan:

 ->  Index Scan using conv_evnt_tact_trans_date_sefk on "conversion" (cost=0.00..6.02 rows=1 width=132)
       Index Cond: ((("outer".id)::text = ("conversion".search_engine_fk)::text) AND (("conversion".tactic)::text = 'PPC'::text))
       Filter: ((transaction_date >= '2005-01-01 00:00:00'::timestamp without time zone) AND (transaction_date <= '2005-10-31 23:59:59'::timestamp without time zone))

I have the following indexes defined on the columns:

conv_evnt_tact_trans_date_sefk: (search_engine_fk, tactic, event_type, transaction_date)
conv_all: (tactic, event_type, transaction_date)

I am really confused when I saw this plan. In both queries, I am using the same columns in the where condition, but the optimizer is taking different indexes in these two cases. Second, even though I have the transaction_date column specified in the second instance, why is it not taking the constraint as an index condition? Thanks in advance. Thank you, Kishore.
[PERFORM] Strange planner decision on quite simple select
Hello! I've got a table BOARD_MESSAGE (message_id int8, thread_id int8, ...) with a pk on message_id and a non-unique not-null index on thread_id. A count(*) on BOARD_MESSAGE currently yields a total of 1231171 rows; the planner estimated a total of 1232530 rows in this table. I've got pg_autovacuum running on the database and run an additional VACUUM ANALYZE over it every night. I've got a few queries of the following type:

select * from PUBLIC.BOARD_MESSAGE where THREAD_ID = 3354253 order by MESSAGE_ID asc limit 20 offset 0;

There are currently roughly 4500 rows with this thread_id in BOARD_MESSAGE. Explain output is like so:

 Limit (cost=0.00..3927.22 rows=20 width=1148)
   ->  Index Scan using pk_board_message on board_message (cost=0.00..1100800.55 rows=5606 width=1148)
         Filter: (thread_id = 3354253)
(3 rows)

I didn't have the patience to actually complete an explain analyze on that one - I cancelled the query on several attempts after more than 40 minutes runtime. Now I fiddled a little with this statement and tried nudging the planner in the right direction like so:

explain analyze select * from (select * from PUBLIC.BOARD_MESSAGE where THREAD_ID = 3354253 order by MESSAGE_ID asc) as foo limit 20 offset 0;

 Limit (cost=8083.59..8083.84 rows=20 width=464) (actual time=1497.455..1498.466 rows=20 loops=1)
   ->  Subquery Scan foo (cost=8083.59..8153.67 rows=5606 width=464) (actual time=1497.447..1498.408 rows=20 loops=1)
         ->  Sort (cost=8083.59..8097.61 rows=5606 width=1148) (actual time=1497.326..1497.353 rows=20 loops=1)
               Sort Key: message_id
               ->  Index Scan using nidx_bm_thread_id on board_message (cost=0.00..7734.54 rows=5606 width=1148) (actual time=0.283..1431.752 rows=4215 loops=1)
                     Index Cond: (thread_id = 3354253)
 Total runtime: 1502.138 ms

Now this is much more like it.
As far as I interpret the explain output, in the former case the planner decides to just walk the whole table with its 1.2m rows in primary-key order on message_id and filters out the few thousand rows matching the requested thread_id. In the latter case, it selects the few thousand rows with the matching thread_id _first_ and _then_ sorts them according to their message_id. The former attempt involves reading through more than a million rows and filtering the result; the latter just uses the index to retrieve a few thousand rows and sorts those, which is much more efficient. What's more puzzling is that the results vary somewhat depending on the overall load situation. When using the first approach without the subselect, sometimes the planner chooses exactly the same plan as it does with the second approach - with equally satisfying results in regard to total execution time; sometimes it does use the first plan and does complete with a very acceptable execution time, too. But sometimes (when overall load is sufficiently high, I presume) it just runs and runs for minutes on end - I've had this thing running for more than one hour on several occasions until I made some changes to my app which limit the maximum execution time for a query to no more than 55 seconds. With this IMHO quite ugly subselect workaround, performance is reproducibly stable and sufficiently good under either load, so I chose to stick with it for the time being - but I'd still like to know if I could have done anything to have the planner choose the evidently better plan for the first query without such a workaround? Kind regards Markus
Re: [PERFORM] Strange planner decision on quite simple select
Markus Wollny wrote:
> I've got a table BOARD_MESSAGE (message_id int8, thread_id int8, ...) with a pk on message_id and a non-unique not-null index on thread_id. [...] select * from PUBLIC.BOARD_MESSAGE where THREAD_ID = 3354253 order by MESSAGE_ID asc limit 20 offset 0; [...] I didn't have the patience to actually complete an explain analyze on that one - I cancelled the query on several attempts after more than 40 minutes runtime.

Hmm - it shouldn't take that long. If I'm reading this right, it's expecting to have to fetch 5606 rows matching thread_id=3354253 to find the 20 you've asked for. Now, what it probably doesn't know is that thread_id is correlated with message_id quite highly (actually, I don't know that, I'm guessing). So - it starts at message_id=1 and works along, but I'm figuring that it needs to reach message_id's in the 3-4 million range to see any of the required thread. Suggestions: 1. Try "ORDER BY thread_id, message_id" and see if that nudges things your way. 2. Keep #1 and try replacing the index on (thread_id) with (thread_id, message_id) -- Richard Huxton Archonet Ltd
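Richard's two suggestions, written out as SQL against the table described above (the index name is illustrative):

```sql
-- 2. composite index so the ORDER BY below can walk a single index range
CREATE INDEX nidx_bm_thread_msg ON board_message (thread_id, message_id);

-- 1. order by the leading index column first, then message_id
SELECT *
FROM public.board_message
WHERE thread_id = 3354253
ORDER BY thread_id, message_id
LIMIT 20 OFFSET 0;
```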
[PERFORM] Outer join query plans and performance
I tried on pgsql-general but got no reply; re-posting here as it's probably the best place to ask. I'm having some significant performance problems with left join. Can anyone give me any pointers as to why the following 2 query plans are so different?

EXPLAIN SELECT *
FROM tokens.ta_tokens t
     LEFT JOIN tokens.ta_tokenhist h1 ON t.token_id = h1.token_id
     LEFT JOIN tokens.ta_tokenhist h2 ON t.token_id = h2.token_id
WHERE h1.histdate = 'now';

 Nested Loop Left Join (cost=0.00..68778.43 rows=2215 width=1402)
   ->  Nested Loop (cost=0.00..55505.62 rows=2215 width=714)
         ->  Index Scan using idx_tokenhist__histdate on ta_tokenhist h1 (cost=0.00..22970.70 rows=5752 width=688)
               Index Cond: (histdate = '2005-10-24 13:28:38.411844'::timestamp without time zone)
         ->  Index Scan using ta_tokens_pkey on ta_tokens t (cost=0.00..5.64 rows=1 width=26)
               Index Cond: ((t.token_id)::integer = ("outer".token_id)::integer)
   ->  Index Scan using fkx_tokenhist__tokens on ta_tokenhist h2 (cost=0.00..5.98 rows=1 width=688)
         Index Cond: (("outer".token_id)::integer = (h2.token_id)::integer)

Performance is fine for this one and the plan is pretty much as I'd expect. This is where I hit a problem.
EXPLAIN SELECT *
FROM tokens.ta_tokens t
     LEFT JOIN tokens.ta_tokenhist h1 ON t.token_id = h1.token_id
     LEFT JOIN tokens.ta_tokenhist h2 ON t.token_id = h2.token_id
WHERE h2.histdate = 'now';

 Hash Join (cost=1249148.59..9000709.22 rows=2215 width=1402)
   Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer)
   ->  Hash Left Join (cost=1225660.51..8181263.40 rows=4045106 width=714)
         Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer)
         ->  Seq Scan on ta_tokens t (cost=0.00..71828.06 rows=4045106 width=26)
         ->  Hash (cost=281243.21..281243.21 rows=10504921 width=688)
               ->  Seq Scan on ta_tokenhist h1 (cost=0.00..281243.21 rows=10504921 width=688)
   ->  Hash (cost=22970.70..22970.70 rows=5752 width=688)
         ->  Index Scan using idx_tokenhist__histdate on ta_tokenhist h2 (cost=0.00..22970.70 rows=5752 width=688)
               Index Cond: (histdate = '2005-10-24 13:34:51.371905'::timestamp without time zone)

I would understand if h2 was joined on h1, but it isn't; it only joins on t. Can anyone give any tips on improving the performance of the second query (aside from changing the join order manually)?

select version();
PostgreSQL 8.0.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)

Thanks -- - Rich Doughty
Re: [PERFORM] Why Index is not working on date columns.
Kishore B <[EMAIL PROTECTED]> writes: > Can you guys please take a look at the following query and let me know why > the index is not considered in the plan? "Considered" and "used" are two different things. The two examples you give have the same estimated cost (within two decimal places) so the planner sees no particular reason to choose one over the other. I surmise that you are testing on toy tables and extrapolating to what will happen on larger tables. This is an unjustified assumption. Create a realistic test data set, ANALYZE it, and then see if the planner chooses indexes you like. regards, tom lane
Re: [PERFORM] Strange planner decision on quite simple select
Hi!

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On behalf of Richard Huxton
> Sent: Tuesday, 25 October 2005 12:07
> To: Markus Wollny
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Strange planner decision on quite simple select
>
> Hmm - it shouldn't take that long. If I'm reading this right,
> it's expecting to have to fetch 5606 rows to match
> thread_id=3354253 the 20 times you've asked for. Now, what it
> probably doesn't know is that thread_id is correlated with
> message_id quite highly (actually, I don't know that, I'm
> guessing). So - it starts at message_id=1 and works along,
> but I'm figuring that it needs to reach message_id's in the
> 3-4 million range to see any of the required thread.

Reading this I tried adding "AND MESSAGE_ID >= THREAD_ID" to the WHERE clause; as you've guessed quite correctly, both message_id and thread_id are derived from the same sequence, and thread_id equals the lowest message_id in a thread. This alone did quite a lot to improve things - I got stable execution times down from an average 12 seconds to a mere 2 seconds - just about the same as with the subselect.

> Suggestions:
> 1. Try "ORDER BY thread_id,message_id" and see if that nudges
> things your way.
> 2. Keep #1 and try replacing the index on (thread_id) with
> (thread_id,message_id)

Did both (though adding such an index during ordinary workload took some time, as did the VACUUM ANALYZE afterwards) and that worked like a charm - I've got execution times down to as little as a few milliseconds - wow! Thank you very much for providing such insightful hints! Kind regards Markus
[PERFORM] insertion of bytea
Hi, I have the following test setup:

* PG 8.0.4 on Linux (CentOS 4) compiled from source.
* DB schema: essentially one table with a few int columns and one bytea column that stores blobs of 52000 bytes each, with a primary key on one of the int columns.
* A test client was written in C using libpq to see what rate can be reached (inserting records). The client uses a prepared statement and bundles n inserts into a single transaction (n is variable for testing).
* Hardware: different setups tested, in particular a single-Opteron box with a built-in SATA disk and also an array of SATA disks connected via FC.

From the test run it appears that the insert rate here is essentially CPU bound. I'm getting about 11 MB/s net transfer, regardless of whether I use the built-in disk or the much faster array, and regardless of various settings (like n, shared_mem). vmstat says that disk bo is about 30 MB/s (the array can do much better, I tried with dd and sync!) while the CPU is maxed out at about 90% us and 10% sy. The client accounts for just 2% CPU; most goes into the postmaster. The client inserts random data. I found out that I can improve things by 35% if I use random sequences of bytes that are in the printable range vs. the full range.

Question 1: Am I correct in assuming that even though I'm passing my 52000 bytes as a (char *) to PQexecPrepared(), encoding/decoding is happening (think 0 -> \000) somewhere in the transfer?

Question 2: Is there a better, faster way to do these inserts? I'm unsure about large objects. I'm planning to use some custom server-side functions to do computations on the bytes in these records, and the large objects API doesn't appear to be well suited for this.

Sidequestion: I've tried to profile the server using CFLAGS="-p -DLINUX_PROFILE". I'm getting profiling output, but when I look at it using "gprof bin-somewhere/postgres $PGDATA/gmon.out" I'm only seeing what I think are the calls for the server startup. How can I profile the (forked) process that actually performs all the work on my connection? Sorry for the long post :) Bye, Chris.
[PERFORM] Reindex - Is this necessary after a vacuum?
We are reindexing frequently, and I'm wondering if this is really necessary, given that it appears to take an exclusive lock on the table. Our table in question is vacuumed every 4 minutes, and we are reindexing after each one. I'm not a fan of locking this table that frequently, even if it is only for 5 - 10 seconds depending on load. The vacuum is a standard vacuum. Nightly we do a vacuum analyze. Thanks for any tips, ...Markus
Re: [PERFORM] insertion of bytea
On Tue, Oct 25, 2005 at 03:44:36PM +0200, Chris Mair wrote: Is there a better, faster way to do these inserts? COPY is generally the fastest way to do bulk inserts (see PQputCopyData). Mike Stone
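For illustration, the COPY route looks like this at the SQL level (the table and column names are made up); from a libpq client the same statement is driven by issuing the COPY with PQexec and then streaming the rows with PQputCopyData and finishing with PQputCopyEnd:

```sql
-- binary COPY sidesteps the text escaping of bytea values entirely
COPY blobs (id, payload) FROM STDIN WITH BINARY;
```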
Re: [PERFORM] Outer join query plans and performance
Rich Doughty <[EMAIL PROTECTED]> writes: > EXPLAIN SELECT * > FROM > tokens.ta_tokens t LEFT JOIN > tokens.ta_tokenhist h1 ON t.token_id = h1.token_id LEFT JOIN > tokens.ta_tokenhist h2 ON t.token_id = h2.token_id > WHERE > h1.histdate = 'now'; > EXPLAIN SELECT * > FROM > tokens.ta_tokens t LEFT JOIN > tokens.ta_tokenhist h1 ON t.token_id = h1.token_id LEFT JOIN > tokens.ta_tokenhist h2 ON t.token_id = h2.token_id > WHERE > h2.histdate = 'now'; The reason these are different is that the second case constrains only the last-to-be-joined table, so the full cartesian product of t and h1 has to be formed. If this wasn't what you had in mind, you might be able to rearrange the order of the LEFT JOINs, but bear in mind that in general, changing outer-join ordering changes the results. (This is why the planner won't fix it for you.) regards, tom lane
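Tom's suggestion to rearrange the LEFT JOINs, applied to the slow query, would look like the sketch below: the constrained table is joined first, so the h2 filter can cut the row count before the second join. Since both outer joins here hang directly off ta_tokens this should return the same rows, but as Tom notes, that equivalence has to be checked case by case when reordering outer joins:

```sql
SELECT *
FROM tokens.ta_tokens t
     LEFT JOIN tokens.ta_tokenhist h2 ON t.token_id = h2.token_id
     LEFT JOIN tokens.ta_tokenhist h1 ON t.token_id = h1.token_id
WHERE h2.histdate = 'now';
```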
Re: [PERFORM] Reindex - Is this necessary after a vacuum?
Markus Benne wrote:
> We are reindexing frequently, and I'm wondering if this is really necessary, given that it appears to take an exclusive lock on the table. Our table in question is vacuumed every 4 minutes, and we are reindexing after each one. [...]

At most I'd do a nightly reindex. And in fact, I'd probably drop the index, full vacuum, recreate index. But you only need to reindex at all if you have a specific problem with the index bloating. Are you seeing this? -- Richard Huxton Archonet Ltd
Re: [PERFORM] insertion of bytea
"Chris Mair" <[EMAIL PROTECTED]> writes: > Am I correct in assuming that even though I'm passing my 52000 > bytes as a (char *) to PQexecPrepared(), encoding/decoding is > happening (think 0 -> \000) somewhere in the transfer? Are you specifying it as a text or binary parameter? Have you looked to see if the stored data is what you expect? regards, tom lane
Re: [PERFORM] Reindex - Is this necessary after a vacuum?
Markus Benne <[EMAIL PROTECTED]> writes: > Our table in question is vacuumed every 4 minutes, and > we are reindexing after each one. That's pretty silly. You might need a reindex once in awhile, but not every time you vacuum. The draft 8.1 docs contain some discussion of possible reasons for periodic reindexing: http://developer.postgresql.org/docs/postgres/routine-reindex.html but none of these reasons justify once-per-vacuum reindexes. regards, tom lane
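A quick way to tell whether an index actually has a bloat problem, rather than reindexing on a schedule, is to watch its size in pg_class over time; the index name below is a placeholder:

```sql
-- relpages should stay roughly stable if plain VACUUM is keeping up;
-- steady growth without corresponding data growth suggests bloat
SELECT relname, relpages
FROM pg_class
WHERE relname = 'my_index_name';
```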
Re: [PERFORM] Why Index is not working on date columns.
Hi Tom, Thank you for your response.

> I surmise that you are testing on toy tables and extrapolating to what will happen on larger tables.

The tables participating here contain more than 8 million records as of now, and every day 200K records are added to them. Thank you, Kishore.

On 10/25/05, Tom Lane <[EMAIL PROTECTED]> wrote:
> Kishore B <[EMAIL PROTECTED]> writes:
> > Can you guys please take a look at the following query and let me know why the index is not considered in the plan?
> "Considered" and "used" are two different things. The two examples you give have the same estimated cost (within two decimal places) so the planner sees no particular reason to choose one over the other. I surmise that you are testing on toy tables and extrapolating to what will happen on larger tables. This is an unjustified assumption. Create a realistic test data set, ANALYZE it, and then see if the planner chooses indexes you like. regards, tom lane
Re: [PERFORM] Inefficient escape codes.
Ok, thanks for the limits info, but I have that in the manual. Thanks. But what I really want to know is this: 1) All large objects of all tables inside one DATABASE are kept in only one table. True or false? Thanks =o) Rodrigo

On 10/25/05, Nörder-Tuitje, Marcus <[EMAIL PROTECTED]> wrote:
> oh, btw, no harm, but: having 5000 tables only to gain access via city name is a major design flaw. you might consider putting all into one table, working with a distributed index over your table (city, loc_texdt, blobfield); creating a partitioned index over city. best regards
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Rodrigo Madera
> Sent: Monday, 24 October 2005 21:12
> To: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Inefficient escape codes.
>
> Now this interests me a lot. Please clarify this: I have 5000 tables, one for each city: City1_Photos, City2_Photos, ... City5000_Photos. Each of these tables is: CREATE TABLE CityN_Photos (location text, lo_id largeobectypeiforgot). So, what's the limit for these large objects? I heard I could only have 4 billion records for the whole database (not for each table). Is this true? If this isn't true, then would postgres manage to create all the large objects I ask him to? Also, this would be a performance penalty, wouldn't it? Much thanks for the knowledge shared, Rodrigo
Re: [PERFORM] impact of stats_command_string
On Thu, Oct 20, 2005 at 01:33:07PM -0700, [EMAIL PROTECTED] wrote: > If I turn on stats_command_string, how much impact would it have on > PostgreSQL server's performance during a period of massive data > INSERTs? Do you really need to be doing "massive data INSERTs"? Can you use COPY, which is much more efficient for bulk loads? http://www.postgresql.org/docs/8.0/interactive/populate.html -- Michael Fuhr
Re: [PERFORM] impact of stats_command_string
> If I turn on stats_command_string, how much impact would it have on > PostgreSQL server's performance during a period of massive data > INSERTs? I know that the answer to the question I'm asking will > largely depend upon different factors so I would like to know in which > situations it would be negligible or would have a significant impact. First of all, we have to assume your writes are buffered in some way or you are using transactions, or you will likely be i/o bound (or you have a super fast disk setup). Assuming that, I can tell you from experience on win32 that stats_command_string can be fairly expensive for certain types of access patterns. What patterns? 1. If your ratio of queries to records involved is low. 2. If you are accessing data in a very quick way, for example via prepared statements over a LAN. 3. Your volume of queries is very high. In these cases, the cost is high. stats_command_string can add a fractional millisecond ( ~.2 in my setup ) to statement latency and as much as double cpu time in extreme cases...you are warned. You may want to turn it off before doing bulk loads or lengthy record iterations. Merlin
Re: [PERFORM] Why Index is not working on date columns.
Kishore B <[EMAIL PROTECTED]> writes: >> I surmise that you are testing on toy tables and extrapolating to what >> will happen on larger tables. >> > These tables participating here contain more than 8 million records as of > now, and on every day, 200K records, will add to them. In that case, have you ANALYZEd the tables lately? The planner's cost estimates correspond to awfully small tables ... regards, tom lane
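To illustrate Tom's point, refreshing the statistics is a one-liner per table and then the plan can be re-checked; the table names below are taken from the earlier query in this thread:

```sql
ANALYZE account;
ANALYZE search_engine;
ANALYZE conversion;
-- then re-run EXPLAIN on the query and see whether the row and
-- cost estimates now reflect the real table sizes
```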
Re: [PERFORM] Outer join query plans and performance
Tom Lane wrote:
> Rich Doughty <[EMAIL PROTECTED]> writes: [queries snipped]
> The reason these are different is that the second case constrains only the last-to-be-joined table, so the full cartesian product of t and h1 has to be formed. If this wasn't what you had in mind, you might be able to rearrange the order of the LEFT JOINs, but bear in mind that in general, changing outer-join ordering changes the results. (This is why the planner won't fix it for you.)

FWIW mysql 4.1 (and I'm no fan at all of mysql) completes both these queries in approximately 3 seconds. postgres does the first in 6 seconds and the second in a lot longer (eventually abandoned). -- - Rich Doughty
[PERFORM] Why different execution times for different instances for the same query?
Hi All, We are executing a single query that returned very fast on the first instance. But when I executed the same query multiple times, it is giving strange results: it is not coming back. When I checked the processes running in the system, I observed that multiple instances of postmaster are running and all of them are consuming very high amounts of memory. I could also observe that they are sharing the memory in a uniform distribution across them. Please let me know if anybody has experienced the same and how they resolved it. Thank you, Kishore.
Re: [PERFORM] Why different execution times for different
On Tue, 2005-10-25 at 14:46, Kishore B wrote: > Hi All, > > We are executing a single query that returned very fast on the first > instance. But when I executed the same query for multiple times, it is > giving strange results. It is not coming back. > > When I checked with the processes running in the system, I observed > that multiple instances of postmaster are running and all of them are > consuming very high amounts of memory. I could also observe that they > are sharing the memory in a uniform distribution across them. > > Please let me know if any body has experienced the same and how do > they resolved it. You may or may not have an actual problem. For one, if they're each using 128 megs, but sharing 120 megs of that, then that's not too bad. If they're each using 512 meg and sharing 100 meg of that, then you've got a problem. What is your sort_mem set to? Going too high can cause memory starvation and other problems. Also, when you run top, how much memory is being used for cache and buffer? If you've still got a fair amount used for cache then you're probably ok there. What are your settings in postgresql.conf that aren't default? How's the behaviour as you run 1, then 2, then 3, then 4 queries and so on? Where's the "knee" in this behaviour, and what are you running out of: disk IO, memory, or memory bandwidth? Are you using iostat/vmstat/top/free/ipcs to check resource usage under load?
Re: [PERFORM] insertion of bytea
>>Is there a better, faster way to do these inserts? > > COPY is generally the fastest way to do bulk inserts (see > PQputCopyData). Thanks :) I'll give that a try and report the results here later. Bye, Chris.
Re: [PERFORM] insertion of bytea
>> Am I correct in assuming that even though I'm passing my 52000 >> bytes as a (char *) to PQexecPrepared(), encoding/decoding is >> happening (think 0 -> \000) somewhere in the transfer? > > Are you specifying it as a text or binary parameter? Have you looked to > see if the stored data is what you expect? I'm specifying it as binary (i.e. 1's in PQexecPrepared's format parameter). The stored data is correct. I'll try "copy from stdin with binary" tomorrow and see what I get... Thanks & Bye, Chris.
Re: [PERFORM] Outer join query plans and performance
Rich Doughty <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> The reason these are different is that the second case constrains only >> the last-to-be-joined table, so the full cartesian product of t and h1 >> has to be formed. If this wasn't what you had in mind, you might be >> able to rearrange the order of the LEFT JOINs, but bear in mind that >> in general, changing outer-join ordering changes the results. (This >> is why the planner won't fix it for you.) > FWIW mysql 4.1 (and i'm no fan at all of mysql) completes both these queries > in approximately 3 seconds. Does mysql get the correct answer, though? It's hard to see how they do this fast unless they (a) are playing fast and loose with the semantics, or (b) have very substantially more analysis logic for OUTER JOIN semantics than we do. Perhaps mysql 5.x is better about this sort of thing, but for 4.x I'd definitely find theory (a) more plausible than (b). The cases that would be interesting are those where rearranging the outer join order actually does change the correct answer --- it may not in this particular case, I haven't thought hard about it. It seems fairly likely to me that they are rearranging the join order here, and I'm just wondering whether they have the logic needed to verify that such a transformation is correct. regards, tom lane
Re: [PERFORM] Outer join query plans and performance
In this particular case both outer joins are to the same table, and the where clause is applied to one or the other, so it's pretty easy to prove that they should generate identical results. I'll grant that this is not generally very useful; but then, simple test cases often don't look very useful. We've had mixed results with PostgreSQL and queries with multiple outer joins when the WHERE clause limits the results based on columns from the optional tables. In at least one case which performs very well, we have enough tables to cause the "genetic" optimizer to kick in. (So I suppose there is a chance that sometimes it won't perform well, although we haven't seen that happen yet.) I can't speak to MySQL, but both Sybase and MaxDB handled such cases accurately, and chose a plan with very fast execution. Sybase, however, spent 5 to 10 seconds in the optimizer finding the sub-second plan. -Kevin >>> Tom Lane <[EMAIL PROTECTED]> >>> Rich Doughty <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> The reason these are different is that the second case constrains only >> the last-to-be-joined table, so the full cartesian product of t and h1 >> has to be formed. If this wasn't what you had in mind, you might be >> able to rearrange the order of the LEFT JOINs, but bear in mind that >> in general, changing outer-join ordering changes the results. (This >> is why the planner won't fix it for you.) > FWIW mysql 4.1 (and i'm no fan at all of mysql) completes both these queries > in approximately 3 seconds. Does mysql get the correct answer, though? It's hard to see how they do this fast unless they (a) are playing fast and loose with the semantics, or (b) have very substantially more analysis logic for OUTER JOIN semantics than we do. Perhaps mysql 5.x is better about this sort of thing, but for 4.x I'd definitely find theory (a) more plausible than (b). 
Re: [PERFORM] Temporary Table
I am creating a temporary table in every function that I execute, which I think comes to about 100,000 temporary tables a day. What is the command for vacuuming these three tables? Also, I read about PostgreSQL's autovacuum. How can I run this autovacuum, and what are its settings in the configuration?

-Original Message-
From: Alvaro Nunes Melo [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 25, 2005 10:58 AM
To: Christian Paul B. Cosinas
Subject: Re: [PERFORM] Temporary Table

Hi Christian,

Christian Paul B. Cosinas wrote:
> Does Creating Temporary table in a function and NOT dropping them
> affects the performance of the database?

I believe it will depend on how many temporary tables you create on a daily basis. We had a performance problem caused by them, and by not monitoring the database size properly. The pg_attribute, pg_class and pg_depend tables grow a lot. When I found out that this was the problem I saw some messages in the list archive, and now the overall performance is great. What I do is run VACUUM FULL and REINDEX daily on these three tables.

Alvaro
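[Editorial note: the daily maintenance Alvaro describes amounts to something like the following, run as a superuser. On 8.0, autovacuum is the separate contrib/pg_autovacuum daemon; 8.1 builds it in and enables it with `autovacuum = on` in postgresql.conf.]

```sql
-- Reclaim catalog space bloated by heavy temporary-table churn,
-- then rebuild the catalogs' indexes.
VACUUM FULL pg_attribute;
VACUUM FULL pg_class;
VACUUM FULL pg_depend;

REINDEX TABLE pg_attribute;
REINDEX TABLE pg_class;
REINDEX TABLE pg_depend;
```

Run this during a quiet window: VACUUM FULL and REINDEX take exclusive locks, and on these system catalogs that briefly blocks most other activity.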
[PERFORM] blue prints please
Where can I find best practices for tuning PostgreSQL?
[PERFORM] zero performance on query
What happened to PostgreSQL 8.1b4 performance on this query? Please help me!

Look at this:

select count(*) from fotos where archivo not in (select archivo from archivos)

Aggregate  (cost=4899037992.36..4899037992.37 rows=1 width=0)
  ->  Seq Scan on fotos  (cost=22598.78..4899037338.07 rows=261716 width=0)
        Filter: (NOT (subplan))
        SubPlan
          ->  Materialize  (cost=22598.78..39304.22 rows=805344 width=58)
                ->  Seq Scan on archivos  (cost=0.00..13141.44 rows=805344 width=58)

I WILL DIE WAITING FOR QUERY RESPONSE !!!

--
CREATE TABLE archivos (archivo varchar(20)) WITHOUT OIDS;
CREATE INDEX archivos_archivo_idx ON archivos USING btree (archivo);
~80 rows
--
CREATE TABLE fotos
(
  cedula varchar(20),
  nombre varchar(100),
  apellido1 varchar(100),
  apellido2 varchar(100),
  archivo varchar(20)
) WITHOUT OIDS;
CREATE INDEX fotos_archivo_idx ON fotos USING btree (archivo);
CREATE INDEX fotos_cedula_idx ON fotos USING btree (cedula);
~50 rows
Re: [PERFORM] zero performance on query
That seems like a pretty horrible way to do that query, given the table sizes. What about something like:

SELECT count(*)
FROM fotos f
LEFT JOIN archivos a USING (archivo)
WHERE a.archivo IS NULL;

Incidentally, can someone explain what the "Materialize" subplan does? Is this new in 8.1?

Dmitri
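[Editorial note: another standard rewrite is NOT EXISTS; and the original NOT IN form can itself go fast if work_mem is large enough for the subquery result to fit in a hashed subplan. A sketch, using the tables from the post; the work_mem value is an illustrative guess, not a recommendation.]

```sql
-- Correlated NOT EXISTS: counts fotos rows with no matching archivos row.
SELECT count(*)
FROM fotos f
WHERE NOT EXISTS (
    SELECT 1 FROM archivos a WHERE a.archivo = f.archivo
);

-- Alternatively, give the NOT IN subplan room to build an in-memory
-- hash table (work_mem is in KB on 8.0/8.1; setting is session-local):
SET work_mem = 65536;
SELECT count(*) FROM fotos WHERE archivo NOT IN (SELECT archivo FROM archivos);
```

One caveat: NOT IN and NOT EXISTS are not equivalent if archivos.archivo can be NULL — a NULL in the subquery makes NOT IN return no rows at all, while NOT EXISTS simply ignores it.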
[PERFORM] Configuration Suggestion
Hi! Here are the specifications of my server. I would really appreciate the best configuration of postgresql.conf for my server. I have tried so many values for the parameters, but it seems that I cannot get the speed I want.

OS: Red Hat Linux
CPU: Dual Xeon
Memory: 6 gigabytes
PostgreSQL version: 8.0

Most of my queries have ORDER BY and GROUP BY clauses, and create temporary tables. The biggest table is about 3-5 million rows, which I query almost every 5 seconds.

I'm just wondering, is it normal to have this result in my memory usage:

             total       used       free     shared    buffers     cached
Mem:       6192460    6172488      19972          0      39904    5890824
-/+ buffers/cache:     241760    5950700
Swap:      2096472          0    2096472

What does this mean?
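[Editorial note: the `free` output above is less alarming than it looks — Linux counts page-cache and buffer memory as "used" but reclaims it on demand. A quick sanity check on the quoted figures (all in KB):]

```python
# Figures (in KB) taken from the `free` output quoted above.
total, used, free = 6192460, 6172488, 19972
buffers, cached = 39904, 5890824

# Memory the kernel can hand back to applications on demand:
available = free + buffers + cached
print(available)      # matches the "-/+ buffers/cache" free column

# Memory genuinely held by processes:
really_used = used - buffers - cached
print(really_used)    # matches the "-/+ buffers/cache" used column
```

So roughly 5.7 GB of the 6 GB is page cache rather than application memory; a low "free" number on its own does not indicate memory pressure.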