Re: [PERFORM] When are index scans used over seq scans?

2005-04-21 Thread Richard van den Berg
Tom Lane wrote: The explain shows no such thing. What is the *actual* runtime of each plan per EXPLAIN ANALYZE, please? Ok, it took 3.5 hours to complete. :-/ This is with the default cpu_tuple_cost = 0.01: Nested Loop (cost=252.80..233010147.16 rows=1035480320 width=98) (actual

[PERFORM] How can an index be larger than a table

2005-04-21 Thread David Roussel
Hi, I have a series of tables with identical structure. Some contain a few thousand rows and some contain 3,000,000 rows. Another applicate writes the rows and my applicate reads then just by selecting where pk last_seen_pk limit 2000. I've found that one of the tables, when selecting from it

[PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Shoaib Burq (VPAC)
Hi everybody, One of our clients was using SQL-Server and decided to switch to PostgreSQL 8.0.1. Hardware: Dual processor Intel(R) Xeon(TM) CPU 3.40GHz OS: Enterprise Linux with 2.6.9-5 SMP kernel Filesystem: ext3 SHMMAX: $ cat /proc/sys/kernel/shmmax 6442450944 --- beleive that's ~6.5 GB,

Re: [PERFORM] When are index scans used over seq scans?

2005-04-21 Thread Richard van den Berg
Thanks a lot John for the correct search terms. :-) The suggestion in http://archives.postgresql.org/pgsql-performance/2005-04/msg00029.php to add a constraint that checks (finishtime = starttime) does not make a difference for me. Still seq scans are used. The width solution explained in

Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Jeff
On Apr 21, 2005, at 7:49 AM, Shoaib Burq (VPAC) wrote: Now I have not touch the $PGDATA/postgresql.conf (As I know very little about memory tuning) Have run VACCUM ANALYZE. You should really, really bump up shared_buffers and given you have 8GB of ram this query would likely benefit from more

Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Gavin Sherry
On Thu, 21 Apr 2005, Jeff wrote: On Apr 21, 2005, at 7:49 AM, Shoaib Burq (VPAC) wrote: Now I have not touch the $PGDATA/postgresql.conf (As I know very little about memory tuning) Have run VACCUM ANALYZE. You should really, really bump up shared_buffers and given you have 8GB of ram

Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Shoaib Burq (VPAC)
here's explain sorry about the mess: I can attach it as text-file if you like. ausclimate=# explain ANALYZE select count(*) from getfutureausclimate; QUERY PLAN

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Merlin Moncure
John A Meinel [EMAIL PROTECTED] writes: Joel Fradkin wrote: Postgres was on the second run Total query runtime: 17109 ms. Data retrieval runtime: 72188 ms. 331640 rows retrieved. How were you measuring data retrieval time? I suspect he's using pgadmin. We've seen reports before

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Andreas Pflug
Joel Fradkin wrote: I did think of something similar just loading the data tables with junk records and I may visit that idea with Josh. I did just do some comparisons on timing of a plain select * from tbl where indexed column = x and it was considerably slower then both MSSQL and MYSQL, so I am

Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Shoaib Burq (VPAC)
here are some i/o stats with the unchanged postgresql.conf. Gonna change it now and have another go. [EMAIL PROTECTED] MultiCPU_test]$ vmstat 10 procs ---memory-- ---swap-- -io --system-- cpu r b swpd free buff cache si sobibo incs us

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Joel Fradkin
Why is MYSQL returning 360,000 rows, while Postgres is only returning 330,000? This may not be important at all, though. I also assume you are selecting from a plain table, not a view. Yes plain table. Difference in rows is one of the datasets had sears data in it. It (speed differences found) is

Re: [PERFORM] When are index scans used over seq scans?

2005-04-21 Thread Tom Lane
Richard van den Berg [EMAIL PROTECTED] writes: This is with the default cpu_tuple_cost = 0.01: Nested Loop (cost=252.80..233010147.16 rows=1035480320 width=98) (actual time=0.369..12672213.137 rows=6171334 loops=1) Join Filter: ((outer.starttimetrunc = inner.ts) AND

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Mohan, Ross
FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. Maybe one of the ODBC cognoscenti here can chime in more concretely -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joel Fradkin Sent: Thursday, April 21, 2005 10:36 AM

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Joel Fradkin
I suspect he's using pgadmin. Yup I was, but I did try running on the linux box in psql, but it was running to the screen and took forever because of that. The real issue is returning to my app using ODBC is very slow (Have not tested the ODBC for MYSQL, MSSQL is ok (the two proc dell is

Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Shoaib Burq (VPAC)
Is this an IO intensive query? If running both in parellel results in 2x the run time and you have sufficient cpus it would (to me) indicate you don't have enough IO bandwidth to satisfy the query. any tips on how to verify this? ---(end of

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Joel Fradkin
Here is the connect string I am using. It could be horrid as I cut it from ODBC program. Session(StringConn) = DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=; PWD=;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Mohan, Ross
Joel, thanks. A couple of things jump out there for me, not a problem for a routine ODBC connection, but perhaps in the lotsa stuff context of your current explorations, it might be relevant? I am completely shooting from the hip, here, but...if it were my goose to cook, I'd be investigating

Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Shoaib Burq (VPAC)
Just tried it with the following changes: shared_buffers = 10600 work_mem = 102400 enable_seqscan = false still no improvement Ok here's the Plan with the enable_seqscan = false: ausclimate=# explain ANALYZE select count(*) from getfutureausclimate;

Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread John A Meinel
Shoaib Burq (VPAC) wrote: Just tried it with the following changes: shared_buffers = 10600 work_mem = 102400 enable_seqscan = false still no improvement Ok here's the Plan with the enable_seqscan = false: ausclimate=# explain ANALYZE select count(*) from getfutureausclimate; Actually, you

Re: [PERFORM] When are index scans used over seq scans?

2005-04-21 Thread Richard van den Berg
John A Meinel wrote: You might try doing: ALTER TABLE us ALTER COLUMN starttimetrunc SET STATISTICS 200; ALTER TABLE us ALTER COLUMN finishtimetrunc SET STATISTICS 200; VACUUM ANALYZE us; I've been looking into that. While increasing the statistics makes the planner use the index for simple

Re: [PERFORM] When are index scans used over seq scans?

2005-04-21 Thread Richard van den Berg
Tom Lane wrote: which is almost 80% of the entire runtime. Which is enormous. What are those column datatypes exactly? Table richard.sessions Column |Type | Modifiers

Re: [PERFORM] When are index scans used over seq scans?

2005-04-21 Thread Tom Lane
Richard van den Berg [EMAIL PROTECTED] writes: Tom Lane wrote: Perhaps you are incurring a datatype conversion cost? Not that I can tell. No, apparently not. Hmm ... timestamp_cmp_internal is just a couple of isnan() checks and one or two floating-point compares. Should be pretty dang

[PERFORM] Index bloat problem?

2005-04-21 Thread Bill Chandler
All, Running PostgreSQL 7.4.2, Solaris. Client is reporting that the size of an index is greater than the number of rows in the table (1.9 million vs. 1.5 million). Index was automatically created from a 'bigserial unique' column. Database contains several tables with exactly the same columns

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Bill, What about if an out-of-the-ordinary number of rows were deleted (say 75% of rows in the table, as opposed to normal 5%) followed by a 'VACUUM ANALYZE'?  Could things get out of whack because of that situation? Yes. You'd want to run REINDEX after and event like that. As you should

Re: [PERFORM] How can an index be larger than a table

2005-04-21 Thread Josh Berkus
David, What also seems weird to me is that the control table has some unique indexes created on it, but the data_upate_events table just has a unique constraint.  Will postgres use an index in the background to enforce this constraint? If you somehow have a unique constraint without a unique

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Bill Chandler
--- Josh Berkus josh@agliodbs.com wrote: Bill, What about if an out-of-the-ordinary number of rows were deleted (say 75% of rows in the table, as opposed to normal 5%) followed by a 'VACUUM ANALYZE'?  Could things get out of whack because of that situation? Yes. You'd want to

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Bill, Honestly, this seems like an inordinate amount of babysitting for a production application. I'm not sure if the client will be willing to accept it. Well, then, tell them not to delete 75% of the rows in a table at once. I imagine that operation brought processing to a halt, too. If

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Bill, Honestly, this seems like an inordinate amount of babysitting for a production application. I'm not sure if the client will be willing to accept it. Well, then, tell them not to delete 75% of the rows in a table at once. I imagine that operation brought processing to a halt, too.

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Alex Turner
Is: REINDEX DATABASE blah supposed to rebuild all indices in the database, or must you specify each table individualy? (I'm asking because I just tried it and it only did system tables) Alex Turner netEconomist On 4/21/05, Josh Berkus josh@agliodbs.com wrote: Bill, What about if an

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Alex, REINDEX DATABASE blah supposed to rebuild all indices in the database, or must you specify each table individualy? (I'm asking because I just tried it and it only did system tables) DATABASE Recreate all system indexes of a specified database. Indexes on user tables are not

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Chris Browne
josh@agliodbs.com (Josh Berkus) writes: Bill, What about if an out-of-the-ordinary number of rows were deleted (say 75% of rows in the table, as opposed to normal 5%) followed by a 'VACUUM ANALYZE'?  Could things get out of whack because of that situation? Yes. You'd want to run REINDEX

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Michael Guerin
Is this a common issue among all RDBMSs or is it something that is PostgreSQL specific? Speaking from experience, this sort of thing affects MSSQL as well, although the maintenance routines are different. Yes, this is true with MSSQL too, however sql server implements a defrag index

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Michael, Every five minutes, DBCC INDEXDEFRAG will report to the user an estimated percentage completed. DBCC INDEXDEFRAG can be terminated at any point in the process, and *any completed work is retained.* Keen. Sounds like something for our TODO list. -- Josh Berkus Aglio Database

Re: [PERFORM] Disk filling, CPU filling, renegade inserts and deletes?

2005-04-21 Thread Richard Plotkin
More info on what is bloating: It's only in one database (the one that's most used), and after running oid2name on the bloated files, the result is (mysteriously) empty. Here's the run on the three enormous files: $ /usr/local/bin/oid2name -d smt -o 160779 From database smt: Filenode Table

[PERFORM] index not used

2005-04-21 Thread Enrico Weigelt
Hi folks, I'm doing a simple lookup in a small table by an unique id, and I'm wondering, why explains tells me seqscan is used instead the key. The table looks like: id bigint primary key, a varchar, b varchar, c varchar and I'm quering: select * from foo

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Bill Chandler
--- [EMAIL PROTECTED] wrote: I gather you mean, out-of-the-ordinary for most apps, but not for this client? Actually, no. The normal activity is to delete 3-5% of the rows per day, followed by a VACUUM ANALYZE. Then over the course of the day (in multiple transactions) about the same amount

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Alex Turner
Same thing happens in Oracle ALTER INDEX blah rebuild To force a rebuild. It will mark the free blocks as 'free' below the PCTFREE value for the tablespace. Basically If you build an index with entries. and each entry is 1/4 of a block, the database will write 2500 blocks to the disk.

Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-21 Thread Enrico Weigelt
* Jaime Casanova [EMAIL PROTECTED] wrote: snip Even if your data never changes it *can* change so the function should be at most stable not immutable. okay, the planner sees that the table could potentionally change. but - as the dba - I'd like to tell him, this table *never* changes in

Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-21 Thread Enrico Weigelt
* Tom Lane [EMAIL PROTECTED] wrote: snip Yeah, I was actually thinking about a two-step process: inline the function to produce somethig equivalent to a handwritten scalar sub-SELECT, and then try to convert sub-SELECTs into joins. ... back to my original question ... What kind of query

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Dave Chapeskie
On Thu, Apr 21, 2005 at 11:28:43AM -0700, Josh Berkus wrote: Michael, Every five minutes, DBCC INDEXDEFRAG will report to the user an estimated percentage completed. DBCC INDEXDEFRAG can be terminated at any point in the process, and *any completed work is retained.* Keen. Sounds

Re: [PERFORM] index not used

2005-04-21 Thread Litao Wu
If id is PK, the query shoudl return 1 row only... --- Enrico Weigelt [EMAIL PROTECTED] wrote: Hi folks, I'm doing a simple lookup in a small table by an unique id, and I'm wondering, why explains tells me seqscan is used instead the key. The table looks like: idbigint

Re: [PERFORM] Disk filling, CPU filling, renegade inserts and deletes?

2005-04-21 Thread Alvaro Herrera
On Thu, Apr 21, 2005 at 11:38:22AM -0700, Richard Plotkin wrote: More info on what is bloating: It's only in one database (the one that's most used), and after running oid2name on the bloated files, the result is (mysteriously) empty. Here's the run on the three enormous files: $

Re: [PERFORM] Disk filling, CPU filling, renegade inserts and deletes?

2005-04-21 Thread Richard Plotkin
That returned the same result. I also tried oid2name -d smt -x -i -S and, separately -s, and also separately, -d with all other databases, and none of the databases turned up any listing, in either oid or filenode, for any of these three bloated files. One thing I've noticed is that these

Re: [PERFORM] index not used

2005-04-21 Thread Stephan Szabo
On Thu, 21 Apr 2005, Enrico Weigelt wrote: I'm doing a simple lookup in a small table by an unique id, and I'm wondering, why explains tells me seqscan is used instead the key. The table looks like: idbigint primary key, a varchar, b varchar, c

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Mischa Sandberg
Quoting Bill Chandler [EMAIL PROTECTED]: ... The normal activity is to delete 3-5% of the rows per day, followed by a VACUUM ANALYZE. ... However, on occasion, deleting 75% of rows is a legitimate action for the client to take. In case nobody else has asked: is your max_fsm_pages big

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Bill Chandler
Mischa, Thanks. Yes, I understand that not having a large enough max_fsm_pages is a problem and I think that it is most likely the case for the client. What I wasn't sure of was if the index bloat we're seeing is the result of the bleeding you're talking about or something else. If I deleted

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Bill, If I deleted 75% of the rows but had a max_fsm_pages setting that still exceeded the pages required (as indicated in VACUUM output), would that solve my indexing problem or would I still need to REINDEX after such a purge? Depends on the performance you're expecting.The FSM

Re: [PERFORM] Index bloat problem?

2005-04-21 Thread John A Meinel
Bill Chandler wrote: Mischa, Thanks. Yes, I understand that not having a large enough max_fsm_pages is a problem and I think that it is most likely the case for the client. What I wasn't sure of was if the index bloat we're seeing is the result of the bleeding you're talking about or something

[PERFORM] foreign key performance

2005-04-21 Thread Enrico Weigelt
Hi folks, do foreign keys have any influence on performance (besides slowing down huge inserts) ? do they bring any performance improvement ? thx -- - Enrico Weigelt== metux IT service phone: +49 36207 519931

Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Gavin Sherry
On Fri, 22 Apr 2005, Shoaib Burq (VPAC) wrote: Please see attached the output from explain analyse. This is with the shared_buffers = 10600 work_mem = 102400 enable_seqscan = true BTW I guess should mention that I am doing the select count(*) on a View. Ran the Explain