Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.

2005-07-28 Thread Tom Lane
Matthew Schumacher [EMAIL PROTECTED] writes: After playing with various indexes and what not I simply am unable to make this procedure perform any better. Perhaps someone on the list can spot the bottleneck and reveal why this procedure isn't performing that well or ways to make it better.

Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-28 Thread Karim Nassar
On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote: I put the rest of the schema up at http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone needs to see it too. Do you have sample data too? -- Karim Nassar Collaborative Computing Lab of NAU Office: (928) 523 5868 -=-

[PERFORM] Finding bottleneck

2005-07-28 Thread Kari Lavikka
Hello, we recently upgraded our dual Xeon Dell to a brand new Sun v40z with 4 opterons, 16GB of memory and MegaRAID with enough disks. OS is Debian Sarge amd64, PostgreSQL is 8.0.3. Size of database is something like 80GB and our website performs about 600 selects and several updates/inserts

Re: [PERFORM] Finding bottleneck

2005-07-28 Thread Gavin Sherry
Hi, On Thu, 28 Jul 2005, Kari Lavikka wrote: -8 Relevant rows from postgresql.conf 8- shared_buffers = 15000 # min 16, at least max_connections*2, 8KB each work_mem = 1536 # min 64, size in KB As an aside, I'd increase work_mem -- but

Re: [PERFORM] Left joining against two empty tables makes a query SLOW

2005-07-28 Thread Gnanavel S
On 7/28/05, Chris Travers [EMAIL PROTECTED] wrote: Hi all;I have a customer who currently uses an application which had becomeslow.After doing some digging, I found the slow query:SELECT c.accno, c.description, c.link, c.category, ac.project_id,p.projectnumber ,a.department_id, d.description AS

Re: [PERFORM] Finding bottleneck

2005-07-28 Thread Claus Guttesen
effective_cache_size = 100 # typically 8KB each I have this setting on postgresql 7.4.8 on FreeBSD with 4 GB RAM: effective_cache_size = 27462 So eventhough your machine runs Debian and you have four times as much RAM as mine your effective_cache_size is 36 times larger. You could try

Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.

2005-07-28 Thread Merlin Moncure
I'm not sure how much this has been discussed on the list, but wasn't able to find anything relevant in the archives. The new Spamassassin is due out pretty soon. They are currently testing 3.1.0pre4. One of the things I hope to get out of this release is bayes word stats moved to a real

Re: [PERFORM] Finding bottleneck

2005-07-28 Thread Merlin Moncure
Kari Lavikka wrote: shared_buffers = 15000 you can play around with this one but in my experience it doesn't make much difference anymore (it used to). work_mem = 1536 # min 64, size in KB this seems low. are you sure you are not getting sorts swapped to disk? fsync

[PERFORM] Fwd: Help with view performance problem

2005-07-28 Thread Chris Hoover
Does anyone have any suggestions on this? I did not get any response from the admin list. Thanks, Chris -- Forwarded message -- From: Chris Hoover [EMAIL PROTECTED] Date: Jul 27, 2005 12:29 PM Subject: Re: Help with view performance problem To: pgsql-admin@postgresql.org I

Re: [PERFORM] Fwd: Help with view performance problem

2005-07-28 Thread Dan Harris
On Jul 28, 2005, at 8:38 AM, Chris Hoover wrote: I did some more testing, and ran the explain analyze on the problem. In my session I did a set enable_hashjoin = false and then ran the analyze. This caused it to use the indexes as I have been expecting it to do. Now, how can I get it to use

[PERFORM] Unable to explain DB error

2005-07-28 Thread Steven Rosenstein
Postgres V7.3.9-2. While executing a query in psql, the following error was generated: vsa=# select * from vsa.dtbl_logged_event_20050318 where id=2689472; PANIC: open of /vsa/db/pg_clog/0FC0 failed: No such file or directory server closed the connection unexpectedly This probably

Re: [PERFORM] Left joining against two empty tables makes a query

2005-07-28 Thread Chris Travers
Gnanavel S wrote: vacuum reindex the department and project table as the planner expects there are 1060 rows but actually returning nothing. I guess I should have mentioned that I have been vacuuming and reindexing at least once a week, and I did so just before running this test.

Re: [PERFORM] Fwd: Help with view performance problem

2005-07-28 Thread Chris Hoover
I'm alreading running at 1.5. It looks like if I drop the random_page_cost t0 1.39, it starts using the indexes. Are there any unseen issues with dropping the random_page_cost this low? Thanks, Chris On 7/28/05, Dan Harris [EMAIL PROTECTED] wrote: On Jul 28, 2005, at 8:38 AM, Chris Hoover

Re: [PERFORM] Left joining against two empty tables makes a query

2005-07-28 Thread Tom Lane
Chris Travers [EMAIL PROTECTED] writes: Secondly, the project table has *never* had anything in it. So where are these numbers coming from? The planner is designed to assume a certain minimum size (10 pages) when it sees that a table is of zero physical length. The reason for this is that

Re: [PERFORM] Unable to explain DB error

2005-07-28 Thread Tom Lane
Steven Rosenstein [EMAIL PROTECTED] writes: Any idea why Postgres would be looking for a clog file name 0FC0 when the most recent filename is 04CE? Corrupt data --- specifically a bad transaction number in a tuple header. (In practice, this is the first field looked at in which we can readily

Re: [PERFORM] Finding bottleneck

2005-07-28 Thread Luke Lonergan
On 7/28/05 2:21 AM, Kari Lavikka [EMAIL PROTECTED] wrote: There's a new profiling tool called oprofile: http://oprofile.sourceforge.net/download/ that can be run without instrumenting the binaries beforehand. To actually find out what the code is doing during these stalls, oprofile can show

[PERFORM] Two queries are better than one?

2005-07-28 Thread Karim Nassar
I ran into a situation today maintaining someone else's code where the sum time running 2 queries seems to be faster than 1. The original code was split into two queries. I thought about joining them, but considering the intelligence of my predecessor, I wanted to test it. The question is, which

Re: [PERFORM] [PATCHES] COPY FROM performance improvements

2005-07-28 Thread Mark Wong
On Fri, 22 Jul 2005 12:28:43 -0700 Luke Lonergan [EMAIL PROTECTED] wrote: Joshua, On 7/22/05 10:11 AM, Joshua D. Drake [EMAIL PROTECTED] wrote: The database server is a PE (Power Edge) 6600 Database Server IO: [EMAIL PROTECTED] root]# /sbin/hdparm -tT /dev/sda /dev/sda:

Re: [PERFORM] [PATCHES] COPY FROM performance improvements

2005-07-28 Thread Joshua D. Drake
[EMAIL PROTECTED] root]# /sbin/hdparm -tT /dev/sda /dev/sda: Timing buffer-cache reads: 1816 MB in 2.00 seconds = 908.00 MB/sec Timing buffered disk reads: 26 MB in 3.11 seconds = 8.36 MB/sec [EMAIL PROTECTED] root]# Can you post the time dd if=/dev/zero of=bigfile bs=8k

Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-28 Thread Matthew Schumacher
Karim Nassar wrote: On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote: I put the rest of the schema up at http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone needs to see it too. Do you have sample data too? Ok, I finally got some test data together so that

Re: [PERFORM] Two queries are better than one?

2005-07-28 Thread John A Meinel
Karim Nassar wrote: I ran into a situation today maintaining someone else's code where the sum time running 2 queries seems to be faster than 1. The original code was split into two queries. I thought about joining them, but considering the intelligence of my predecessor, I wanted to test it.

Re: [PERFORM] Two queries are better than one?

2005-07-28 Thread Michael Fuhr
On Thu, Jul 28, 2005 at 04:04:25PM -0700, Karim Nassar wrote: I ran into a situation today maintaining someone else's code where the sum time running 2 queries seems to be faster than 1. The original code was split into two queries. I thought about joining them, but considering the

Re: [PERFORM] [PATCHES] COPY FROM performance improvements

2005-07-28 Thread Luke Lonergan
Can you post the time dd if=/dev/zero of=bigfile bs=8k count=50 results? Also do the reverse (read the file) with time dd if=bigfile of=/dev/null bs=8k. I didn't see this come across before... here ya go: time dd if=/dev/zero of=bigfile bs=8k count=50 50+0 records in

Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-28 Thread Gavin Sherry
On Thu, 28 Jul 2005, Matthew Schumacher wrote: Karim Nassar wrote: On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote: I put the rest of the schema up at http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone needs to see it too. Do you have sample data too?

Re: [PERFORM] [PATCHES] COPY FROM performance improvements

2005-07-28 Thread Luke Lonergan
Mark, On 7/28/05 4:43 PM, Mark Wong [EMAIL PROTECTED] wrote: Are there any recommendations for Qlogic controllers on Linux, scsi or fiber channel? I might be able to my hands on some. I have pci-x slots for AMD, Itanium, or POWER5 if the architecture makes a difference. I don't have a

Re: [PERFORM] Left joining against two empty tables makes a query SLOW

2005-07-28 Thread Gnanavel S
On 7/28/05, Chris Travers [EMAIL PROTECTED] wrote: Gnanavel S wrote: vacuum reindex the department and project table as the planner expects there are 1060 rows but actually returning nothing.I guess I should have mentioned that I have been vacuuming and reindexing at least once a week, and I did

Re: [PERFORM] Two queries are better than one?

2005-07-28 Thread Gnanavel S
On 7/29/05, Karim Nassar [EMAIL PROTECTED] wrote: I ran into a situation today maintaining someone else's code where thesum time running 2 queries seems to be faster than 1. The original codewas split into two queries. I thought about joining them, butconsidering the intelligence of my

Re: [PERFORM] Left joining against two empty tables makes a query

2005-07-28 Thread Neil Conway
Gnanavel S wrote: reindex the tables separately. Reindexing should not affect this problem, anyway. -Neil ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining

Re: [PERFORM] Two queries are better than one?

2005-07-28 Thread Karim Nassar
On Fri, 2005-07-29 at 09:41 +0530, Gnanavel S wrote: Joined: test= explain analyze test-SELECT cli_name,order.* test- FROM order test- JOIN client ON (ord_client = cli_code) test-

Re: [PERFORM] Two queries are better than one?

2005-07-28 Thread Gnanavel S
On 7/29/05, Karim Nassar [EMAIL PROTECTED] wrote: On Fri, 2005-07-29 at 09:41 +0530, Gnanavel S wrote: Joined: test= explain analyze test-SELECT cli_name,order.* test- FROM order test- JOIN client ON (ord_client = cli_code) test-WHERE ord_batch='343B' AND ord_id='12-645'; where is the cli_code

Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-28 Thread Matthew Schumacher
Gavin Sherry wrote: I had a look at your data -- thanks. I have a question though: put_token() is invoked 120596 times in your benchmark... for 616 messages. That's nearly 200 queries (not even counting the 1-8 (??) inside the function itself) per message. Something doesn't seem right

Re: [PERFORM] Left joining against two empty tables makes a query

2005-07-28 Thread Chris Travers
Secondly, the project table has *never* had anything in it. So where are these numbers coming from? pg_statistics I very much doubt that. I was unable to locate any rows in pg_statistic where the pg_class.oid for either table matched any row's starelid. Tom's argument that

Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-28 Thread Andrew McMillan
On Thu, 2005-07-28 at 16:13 -0800, Matthew Schumacher wrote: Ok, I finally got some test data together so that others can test without installing SA. The schema and test dataset is over at http://www.aptalaska.net/~matt.s/bayes/bayesBenchmark.tar.gz I have a pretty fast machine with a