[PERFORM] slow queue-like empty table

2006-09-28 Thread Tobias Brox
I have a query which really should be lightning fast (limit 1 from an index), but which isn't. I've checked the pg_locks table; there are no locks on the table. The database is not under heavy load at the moment, but the query seems to draw CPU power. I checked the pg_locks view, but found nothing

Re: [PERFORM] slow queue-like empty table

2006-09-28 Thread Tobias Brox
[Tobias Brox - Thu at 08:56:31AM +0200] It really seems like some transaction is still viewing the queue, since VACUUM found 38k non-removable rows ... but how do I find the pid of the transaction viewing the queue? As said, pg_locks didn't give me any hints ... Dropping the table and
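
A minimal sketch of the usual next step, assuming the 8.1-era pg_stat_activity view (column names changed in later releases, and current_query is only populated when stats_command_string is on): a session sitting "<IDLE> in transaction" keeps its snapshot open and prevents VACUUM from removing dead queue rows, even when pg_locks shows nothing against the table.

    SELECT procpid, usename, backend_start, query_start
    FROM pg_stat_activity
    WHERE current_query = '<IDLE> in transaction'
    ORDER BY query_start;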

Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-28 Thread Edoardo Ceccarelli
True, but autovacuum could check the load -before- and -during- its execution and adjust itself automatically to perform more or less aggressively depending on the difference between those two values. Maybe with a parameter like maximum-autovacuum-load=0.2, which would mean:
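
For reference, the knob proposed here doesn't exist; the closest thing 8.1 actually ships is cost-based vacuum throttling. A hedged postgresql.conf sketch (the values are purely illustrative, not recommendations):

    autovacuum = on
    autovacuum_vacuum_cost_delay = 20   # ms slept after each cost_limit worth of work
    autovacuum_vacuum_cost_limit = 200  # lower = gentler on the system, slower vacuums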

Re: [PERFORM] slow queue-like empty table

2006-09-28 Thread Csaba Nagy
On Thu, 2006-09-28 at 09:36, Tobias Brox wrote: [Tobias Brox - Thu at 08:56:31AM +0200] It really seems like some transaction is still viewing the queue, since VACUUM found 38k non-removable rows ... but how do I find the pid of the transaction viewing the queue? As said, the pg_locks

Re: [PERFORM] slow i/o

2006-09-28 Thread Jignesh K. Shah
Using segmapsize will increase the memory available for the file system cache, and ufs:freebehind=0 helps in caching bigger files in memory. It's worth a try if you are using default UFS (without the forcedirectio mount option). In your case it seems the writes are having the problems
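
For anyone trying this, the tunables above go in /etc/system on Solaris. A hedged sketch (the 512 MB value is purely illustrative, and segmapsize is the x86 name; SPARC systems use segmap_percent instead):

    * /etc/system -- lines starting with an asterisk are comments
    set segmapsize=0x20000000
    set ufs:freebehind=0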

Re: [PERFORM] Problems with inconsistent query performance.

2006-09-28 Thread Matthew Schumacher
Marcin Mank wrote: So the question is why, on a relatively simple proc, am I getting a query performance delta between 3549 ms and 7 ms? What version of PG is it? I had such problems in a pseudo-realtime app I use here with Postgres, and they went away when I moved to 8.1 (from 7.4). I guess

Re: [PERFORM] Problems with inconsistent query performance.

2006-09-28 Thread Bill Moran
In response to Matthew Schumacher [EMAIL PROTECTED]: What I really need is a way to profile my proc when it runs slow so that I can resolve which of the queries is really slow. Anyone with an idea on how to do this? You could turn on statement logging and duration logging. This would give
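
A sketch of the settings involved (8.1-era postgresql.conf; the threshold is illustrative). Note that log_statement only captures the top-level call, not the individual statements inside a PL/pgSQL function:

    log_statement = 'all'              # log every top-level statement
    log_min_duration_statement = 500   # also log any statement slower than 500 ms, with its duration
    log_duration = on                  # log the duration of every completed statement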

[PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
I am a software developer who is acting in a (temporary) DBA role for a project. I had recommended PostgreSQL be brought in to replace the proposed MySQL DB - I chose PostgreSQL because of its reputation as a more stable solution than MySQL. At this early stage in the project, we are

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Joshua D. Drake
The import is slow - and degrades as the tables grow. With even more millions of rows in dozens of import tables to come, the imports will take forever. My ability to analyse the queries is limited; because of the nature of the import process, the SQL queries are mutable, every imported

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Merlin Moncure
On 9/28/06, Carlo Stonebanks [EMAIL PROTECTED] wrote: We urgently need a major performance improvement. We are running PostgreSQL 8.1.4 on Windows 2003 x64 Server on a dual-processor, dual-core 3.2 GHz Xeon box with 4 GB RAM and a RAID (sorry, I don't know what type) disc subsystem. Sorry

Re: [PERFORM] Problems with inconsistent query performance.

2006-09-28 Thread Bill Moran
In response to Jim C. Nasby [EMAIL PROTECTED]: On Thu, Sep 28, 2006 at 11:28:43AM -0400, Bill Moran wrote: In response to Matthew Schumacher [EMAIL PROTECTED]: What I really need is a way to profile my proc when it runs slow so that I can resolve which of the queries is really slow.

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
How are you loading the tables? Copy? Insert? Once the data is transformed, it is inserted. I don't have stats, but the program's visual feedback does not spend a lot of time on the inserting data message. Then again, if there is an asynchronous component to an insert, perhaps I am not seeing

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
are you using the 'copy' interface? Straightforward inserts - the import data has to be transformed, normalised, and de-duped by the import program. I imagine the copy interface is for more straightforward data importing. These are - by necessity - single-row inserts. That's a tough question.
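
For comparison, a minimal COPY sketch (table, columns, and file path are hypothetical; WITH CSV is available from 8.0 on). Several replies in this thread suggest staging raw data with COPY and doing the transformation inside the database precisely because bulk loading is much faster than row-at-a-time inserts:

    COPY staging_contacts (name, address, city)
    FROM '/tmp/batch_20060928.csv' WITH CSV;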

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Dave Dutcher
-Original Message- From: [EMAIL PROTECTED] On Behalf Of Carlo Stonebanks, Subject: [PERFORM] Performance Optimization for Dummies: At this early stage in the project, we are initializing our portal's database with millions of rows of imported data in over 50

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Jim C. Nasby
On Thu, Sep 28, 2006 at 10:11:31AM -0700, Joshua D. Drake wrote: 4) Can anyone recommend any commercial PostgreSQL service providers that may be able to swiftly come in and assist us with our performance issues? http://www.commandprompt.com/ (disclaimer, I am an employee) You forgot

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Jim C. Nasby
On Thu, Sep 28, 2006 at 01:47:44PM -0400, Carlo Stonebanks wrote: How are you loading the tables? Copy? Insert? Once the data is transformed, it is inserted. I don't have stats, but the program's visual feedback does not spend a lot of time on the inserting data message. Then again, if

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Jim C. Nasby
On Thu, Sep 28, 2006 at 01:53:22PM -0400, Carlo Stonebanks wrote: are you using the 'copy' interface? Straightforward inserts - the import data has to be transformed, normalised and de-duped by the import program. I imagine the copy interface is for more straightforward data importing. These

[PERFORM]

2006-09-28 Thread Jim C. Nasby
(Raw headers of a reply to "RES: [PERFORM] Performance Optimization for Dummies", sent to pgsql-jdbc@postgresql.org and Bcc'd to pgsql-performance@postgresql.org.) Please start a new

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Matthew Nuzum
Carlo Stonebanks wrote: are you using the 'copy' interface? Straightforward inserts - the import data has to be transformed, normalised and de-duped by the import program. I imagine the copy interface is for more straightforward data importing. These are - by necessity - single row

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
Are you wrapping all this in a transaction? Yes, the transactions can typically wrap 1 to 10 single-table, single-row inserts and updates. You're doing some dynamically generated selects as part of the de-duping process? They're probably the expensive bit. What do those queries tend to
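
For what it's worth, a sketch of how those per-record transactions look in SQL (tables and columns are hypothetical): committing once per batch of records rather than once per record amortizes the per-commit fsync, which matters a lot on a busy import.

    BEGIN;
    INSERT INTO contact (name, office_id) VALUES ('Jane Doe', 42);
    UPDATE office SET contact_count = contact_count + 1 WHERE id = 42;
    -- ... repeat for the rest of the batch ...
    COMMIT;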

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
So your program first transforms the data and then inserts it? And it is the transforming process, which is running select statements, that is slow? There are cross-referencing and deduplication processes. Does this person have an office at this exact address? In a similarly named building in

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Merlin Moncure
On 9/28/06, Carlo Stonebanks [EMAIL PROTECTED] wrote: are you using the 'copy' interface? Straightforward inserts - the import data has to be transformed, normalised and de-duped by the import program. I imagine the copy interface is for more straightforward data importing. These are - by

[PERFORM] archive wal's failure and load increase.

2006-09-28 Thread Cedric Boudin
Honorable members of the list, I would like to share with you a side effect that I discovered today on our PostgreSQL 8.1 server. We've been running this instance with PITR for two months now without any problems. The WALs are copied to a remote machine with the archive_command and locally to
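
For context, the setup described boils down to something like this in postgresql.conf (paths are hypothetical; the parameter is archive_command, where %p expands to the WAL segment's path and %f to its file name):

    archive_command = 'cp %p /var/lib/pgsql/wal_archive/%f && scp %p backup:/wal_archive/%f'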

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
The deduplication process requires so many programmed procedures that it runs on the client. Most of the de-dupe lookups are not straight lookups, but calculated ones employing fuzzy logic. This is because we cannot dictate the format of our input data and must deduplicate with what we get.
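
A hedged sketch of what a server-side fuzzy lookup could look like with contrib/fuzzystrmatch, which ships with 8.1 (table, column, and the distance threshold are hypothetical):

    SELECT id, company_name
    FROM company
    WHERE soundex(company_name) = soundex('Acme Widgets Inc.')
      AND levenshtein(lower(company_name), lower('Acme Widgets Inc.')) <= 3;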

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Merlin Moncure
On 9/28/06, Carlo Stonebanks [EMAIL PROTECTED] wrote: The deduplication process requires so many programmed procedures that it runs on the client. Most of the de-dupe lookups are not straight lookups, but calculated ones employing fuzzy logic. This is because we cannot dictate the format of our

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
Lots of great info here, I will see what applies to my situation. However, I don't see bulk inserts of the tables working, because all of the tables need to be refreshed, as the values to deduplicate and match against will change with every row added. In order for this to work, I would have to write queries

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Steve Atkins
On Sep 28, 2006, at 12:10 PM, Carlo Stonebanks wrote: Are you wrapping all this in a transaction? Yes, the transactions can typically wrap 1 to 10 single-table, single-row inserts and updates. You're doing some dynamically generated selects as part of the de-duping process? They're

Re: [PERFORM] archive wal's failure and load increase.

2006-09-28 Thread Tom Lane
Cedric Boudin [EMAIL PROTECTED] writes: So the question is: is there a correlation between not getting the WALs archived and this massive load growth? Shouldn't be. Do you want to force the condition again and try to see *where* the cycles are going? High load factor alone is a singularly

[PERFORM] any hope for my big query?

2006-09-28 Thread Ben
Hey guys, I've got a query that is inherently expensive, because it has to do some joins against some large tables. But it's currently *very* expensive (at least for a web app), and I've been struggling in vain all day to knock the cost down. Annoyingly, the least costly version I've come up

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Jim C. Nasby
On Thu, Sep 28, 2006 at 02:04:21PM -0700, Steve Atkins wrote: I think you're confusing explain and analyze. Explain gives you human readable output as to what the planner decided to do with the query you give it. Don't forget about EXPLAIN ANALYZE, which is related to EXPLAIN but has nothing
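
To make the distinction concrete (the orders table is hypothetical):

    EXPLAIN SELECT * FROM orders WHERE placed > now() - interval '1 day';
    -- plan only; the query is not executed

    EXPLAIN ANALYZE SELECT * FROM orders WHERE placed > now() - interval '1 day';
    -- executes the query and reports actual times and row counts per plan node

    ANALYZE orders;
    -- the unrelated ANALYZE command: refreshes the planner's statistics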

Re: [PERFORM] any hope for my big query?

2006-09-28 Thread Jim C. Nasby
Instead of the IN, see if this is better: AND (SELECT count(*) FROM albumjoin aj WHERE aj.album = album.id) BETWEEN 15 AND 25. From a design standpoint, it probably makes sense to have a track_count field in the album table that is kept up-to-date by triggers on albumjoin. And some nits. :) I
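
A hedged sketch of the trigger-maintained counter (assuming albumjoin.album references album.id, as the suggestion implies; everything else is hypothetical). The expensive correlated subquery then becomes a plain, indexable album.track_count BETWEEN 15 AND 25:

    ALTER TABLE album ADD COLUMN track_count integer NOT NULL DEFAULT 0;

    CREATE OR REPLACE FUNCTION maintain_track_count() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            UPDATE album SET track_count = track_count + 1 WHERE id = NEW.album;
        ELSIF TG_OP = 'DELETE' THEN
            UPDATE album SET track_count = track_count - 1 WHERE id = OLD.album;
        END IF;
        RETURN NULL;  -- AFTER triggers ignore the return value
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER albumjoin_track_count
        AFTER INSERT OR DELETE ON albumjoin
        FOR EACH ROW EXECUTE PROCEDURE maintain_track_count();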

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Matthew Nuzum
Carlo Stonebanks wrote: Lots of great info here, I will see what applies to my situation. However, I don't see bulk inserts of the tables working, because all of the tables need to be refreshed, as the values to deduplicate and match against will change with every row added. In order for this to work, I

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
Something else to consider... databases love doing bulk operations. It might be useful to load prospective data into a temporary table, and then do as many operations as you can locally (i.e., within the database) on that table, hopefully eliminating as many candidate rows as possible along the
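
A sketch of the staging-table pattern being suggested (all names are hypothetical; CREATE TABLE ... (LIKE ...) is available from 7.4 on):

    CREATE TEMP TABLE staging (LIKE contact);
    COPY staging FROM '/tmp/batch.csv' WITH CSV;

    -- eliminate rows that already exist, set-wise, before touching live data
    DELETE FROM staging
    WHERE EXISTS (SELECT 1 FROM contact c
                  WHERE c.natural_key = staging.natural_key);

    INSERT INTO contact SELECT * FROM staging;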

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
IMO, the key to high-performance big data movements in PostgreSQL is mastering SQL and PL/pgSQL, especially the latter. Once you get good at it, your net time of copy+plpgsql is going to be less than insert+tcl. If this implies bulk inserts, I'm afraid I have to consider something else. Any

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
By the way, stats_command_string is a known performance killer that IIRC was improved in 8.2, just FYI. This is a handy fact, I will get on this right away. Bulk load denormalized tables into scratch tables in the PostgreSQL database; create indexes appropriate to the normalization process
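
The setting in question, for reference (postgresql.conf; note that turning it off also blanks current_query in pg_stat_activity, so it trades monitoring visibility for speed):

    stats_command_string = off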

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Carlo Stonebanks
Don't forget about EXPLAIN ANALYZE, which is related to EXPLAIN but has nothing to do with the ANALYZE command. Ah, hence my confusion. Thanks for the clarification... I never knew about ANALYZE as a separate command.

Re: [PERFORM] Performance Optimization for Dummies

2006-09-28 Thread Tom Lane
Carlo Stonebanks [EMAIL PROTECTED] writes: But is there any way to create an index expression that will help with: where foo like '%bar%'? If you are concerned about that, what you are probably really looking for is full-text indexing. See contrib/tsearch2 for our current best answer to that.
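
A heavily hedged sketch of the contrib/tsearch2 route (8.1-era API; the table and column are hypothetical, while 'default' is the stock tsearch2 configuration name). Note that full-text search matches words, not arbitrary substrings, so it replaces most practical uses of like '%bar%' rather than being an exact equivalent:

    ALTER TABLE doc ADD COLUMN fti tsvector;
    UPDATE doc SET fti = to_tsvector('default', body);
    CREATE INDEX doc_fti_idx ON doc USING gist (fti);

    SELECT title FROM doc WHERE fti @@ to_tsquery('default', 'bar');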