Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-16 Thread Simon Riggs

Josh Berkus
  Treating the optimizer as a black box is something I'm very
 used to from
  other RDBMS. My question is, how can you explicitly
 re-write a query now
  to improve it? If there's no way of manipulating queries without
  actually re-writing the optimizer, we're now in a position where we
  aren't able to diagnose when the optimizer isn't working
 effectively.

 Well, there is ... all of the various query cost parameters.

They are very blunt instruments for such a delicate task.

Surely someone of your experience might have benefit from something
more?

My feeling is, I would, though I want those tools as *a developer*
rather than for tuning specific queries for people, which is always so
sensitive to upgrades etc.

 But, ultimately, improvements on the planner are still
 bottlenecked by having
 only one developer actually hacking the changes.


Do we have a clear list of optimizations we'd like to be working on?

The TODO items aren't very related to specific optimizations...

The only ones I was aware of was deferred subselect evaluation for
DBT-3.



...sounds like there's more to discuss here, so I'll duck out now and
get back to my current project...

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-15 Thread Bruno Wolff III
On Wed, Apr 14, 2004 at 21:12:18 +0100,
  Simon Riggs [EMAIL PROTECTED] wrote:
 
 I guess what I'm saying is it's not how many people you've got working
 on the optimizer, its how many accurate field reports of less-than
 perfect optimization reach them. In that case, PostgreSQL is likely in a
 better position than Microsoft, since the accessibility of the pg
 discussion lists makes such cases much more likely to get aired.
 
 Any thoughts?

I have seen exactly this happen a number of times over the last several
years. However there is still only one Tom Lane implementing the
improvements.

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-15 Thread Simon Riggs
 Bruno Wolff
   Simon Riggs [EMAIL PROTECTED] wrote:
 
  I guess what I'm saying is it's not how many people you've
 got working
  on the optimizer, its how many accurate field reports of less-than
  perfect optimization reach them. In that case, PostgreSQL
 is likely in a
  better position than Microsoft, since the accessibility of the pg
  discussion lists makes such cases much more likely to get aired.
 
  Any thoughts?

 I have seen exactly this happen a number of times over the
 last several
 years. However there is still only one Tom Lane implementing the
 improvements.


...and very few Mr.Microsofts too.

[I'm uncomfortable with, and it was not my intent, to discuss such an
issue with direct reference to particular individuals. There is no
intent to critiscise or malign anybody named]

Regards, Simon


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-15 Thread Greg Stark

Bruno Wolff III [EMAIL PROTECTED] writes:

 I have seen exactly this happen a number of times over the last several
 years. However there is still only one Tom Lane implementing the
 improvements.

Ob: Well clearly the problem is we need more Tom Lanes.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-15 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Ob: Well clearly the problem is we need more Tom Lanes.

ObHHGReference: Haven't you heard?  I come in six-packs!

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-15 Thread Geoffrey
Greg Stark wrote:
Bruno Wolff III [EMAIL PROTECTED] writes:


I have seen exactly this happen a number of times over the last several
years. However there is still only one Tom Lane implementing the
improvements.


Ob: Well clearly the problem is we need more Tom Lanes.

my $pgGuru = Tom Lane; my @morepgGurus; my $howmany = 10;

while($howmany--) { push @morepgGurus, $pgGuru; }

--
Until later, Geoffrey Registered Linux User #108567
Building secure systems in spite of Microsoft
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-15 Thread Joshua D. Drake

my $pgGuru = Tom Lane; my @morepgGurus; my $howmany = 10;

while($howmany--) { push @morepgGurus, $pgGuru; }

This is just wrong...

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-15 Thread Greg Stark

Joshua D. Drake [EMAIL PROTECTED] writes:

  while($howmany--) { push @morepgGurus, $pgGuru; }

 This is just wrong...

yeah, it would have been much clearer written as:
 push @morepgGurus, ($pgGuru)x$howmany;

Or at least the perlish:
 for (1..$howmany)
instead of C style while syntax.

Ok. I stop now.

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-14 Thread Simon Riggs
 Josh Berkus wrote:
 Unfortunately, these days only Tom and Neil seem to be
 seriously working on
 the query planner (beg pardon in advance if I've missed
 someone) so I think
 the real answer is that we need another person interested in
 this kind of
 optimization before it's going to get much better.


H. Interesting line of thought.

Is the problem a person interested or is there another issue there?

I was thinking the other day that maybe removing the ability to control
join order through explicitly manipulating the FROM clause might
actually be counter productive, in terms of longer term improvement of
the optimizer.

Treating the optimizer as a black box is something I'm very used to from
other RDBMS. My question is, how can you explicitly re-write a query now
to improve it? If there's no way of manipulating queries without
actually re-writing the optimizer, we're now in a position where we
aren't able to diagnose when the optimizer isn't working effectively.

For my mind, all the people on this list are potential optimizer
developers in the sense that we can all look at queries and see whether
there is a problem with particular join plans. Providing good cases of
poor optimization is just what's needed to assist those few that do
understand the internals to continue improving things.

I guess what I'm saying is it's not how many people you've got working
on the optimizer, its how many accurate field reports of less-than
perfect optimization reach them. In that case, PostgreSQL is likely in a
better position than Microsoft, since the accessibility of the pg
discussion lists makes such cases much more likely to get aired.

Any thoughts?

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-07 Thread Neil Conway
On Mon, 2004-04-05 at 11:36, Josh Berkus wrote:
 Unfortunately, these days only Tom and Neil seem to be seriously working on 
 the query planner (beg pardon in advance if I've missed someone)

Actually, Tom is the only person actively working on the planner --
while I hope to contribute to it in the future, I haven't done so yet.

-Neil



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-05 Thread Josh Berkus
Mike,

 I think it is also possible that Microsoft has more programmers working
 on tuning issues for SQL Server than PostgreSQL has working on the
 whole project.

Ah, but quantity != quality.Or they wouldn't be trolling our mailing lists 
trying to hire PostgreSQL programmers for the SQL Server project (really!).   
And we had nearly 200 contributors between 7.3 and 7.4 ... a respectable 
development staff for even a large corporation.

Point taken, though, SQL Server has done a better job in opitimizing for 
dumb queries.   This is something that PostgreSQL needs to work on, as is 
self-referential updates for large tables, which also tend to be really slow.   
Mind you, in SQL Server 7 I used to be able to crash the server with a big 
self-referential update, so this is a common database problem.

Unfortunately, these days only Tom and Neil seem to be seriously working on 
the query planner (beg pardon in advance if I've missed someone) so I think 
the real answer is that we need another person interested in this kind of 
optimization before it's going to get much better.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-05 Thread markw
On  4 Apr, Cott Lang wrote:
 On Sun, 2004-04-04 at 01:56, Gary Doades wrote:
 Unfortunately I don't understand the question!
 
 My background is the primarily Win32. The last time I used a *nix OS 
 was about 20 years ago apart from occasional dips into the linux OS 
 over the past few years. If you can tell be how to find out what you want 
 I will gladly give you the information.
 
 There are two available io schedulers in 2.6 (new feature), deadline and
 anticipatory. It should show be listed in the boot messages:
 
 dmesg | grep scheduler
 
 I've seen people arguing for each of the two schedulers, saying one is
 better than the other for databases. I'm curious which one you're
 using.  :)

Our database tests (TPC fair use implementations) show that the deadline
scheduler has an edge on the anticipatory scheduler.  Depending on the
current state of the AS scheduler, it can be within a few percent to 10%
or so.

I have some data with one of our tests here:
http://developer.osdl.org/markw/fs/dbt2_project_results.html

Mark

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-05 Thread Magnus Naeslund(t)
Gary Doades wrote:

Has anyone else done any comparative testing with the 2.6 kernel?

I know for a fact that certain stuff is recognized differently between 
2.2, 2.4 and 2.6 kernels.
For example i have one box that i installed debian stable on that used a 
2.2 kernel which automatically tuned on DMA on the harddrive, didn't do 
it on a 2.4 kernel, but on 2.6 one it saw it as DMA able.
Such things can dramatically affect performance, so make sure to compare 
what capabilities the kernel thinks your hardware has between the 
kernels first...

But i'll grant that the 2.6 kernel is a great deal faster on some of our 
test servers.

Regards
Magnus
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-05 Thread Gary Doades
On 5 Apr 2004 at 8:36, Josh Berkus wrote:

 
 Point taken, though, SQL Server has done a better job in opitimizing for 
 dumb queries.   This is something that PostgreSQL needs to work on, as is 
 self-referential updates for large tables, which also tend to be really slow.   
 Mind you, in SQL Server 7 I used to be able to crash the server with a big 
 self-referential update, so this is a common database problem.
 

I agree about the dumb queries (I'm not mine are *that* dumb :) )

When you can write SQL that looks right, feels right, gives the right 
answers during testing and SQLServer runs them really fast, you stop 
there and tend not to tinker with the SQL further.

You *can* (I certainly do) achieve comparable performance with 
PostgreSQL, but you just have to work harder for it. Now that I have 
learned the characteristics of both servers I can write SQL that is pretty 
good on both. I suspect that there are people who evaluate PostgreSQL 
by executing their favorite SQLSever queries against it, see that it is 
slower and never bother to go further.

Cheers,
Gary.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-04 Thread Gary Doades
Possibly.

A lot of my queries show comparable performance, some a little slower 
and a few a little faster. There are a few, however, that really grind on 
PostgreSQL. I am leaning patterns from these to try and and target the 
most likely performance problems to come and hand tune these types 
of SQL.

I'm not complaining about PostgreSQL or saying that SQLServer is 
better, in most cases it is not. SQLServer seems to be more predictable 
and forgiving in performance which tends to make for lazy SQL 
programming. It also has implications when the SQL is dynamically 
created based on user input, there are more chances of PostgreSQL 
hitting a performance problem than SQLServer.

Overall I'm still very impressed with PostgreSQL. Given the $7000 per 
processor licence for SQLServer makes the case for PostgreSQL even 
stronger!

Cheers,
Gary.

On 3 Apr 2004 at 17:43, Aaron Werman wrote:

Almost any cross dbms migration shows a drop in performance. The engine
effectively trains developers and administrators in what works and what
doesn't. The initial migration thus compares a tuned to an untuned version.

/Aaron

- Original Message - 
From: Josh Berkus [EMAIL PROTECTED]
To: Gary Doades [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Saturday, April 03, 2004 1:59 PM
Subject: Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.


 Gary,

  There are no indexes on the columns involved in the update, they are
  not required for my usual select statements. This is an attempt to
  slightly denormalise the design to get the performance up comparable
  to SQL Server 2000. We hope to move some of our databases over to
  PostgreSQL later in the year and this is part of the ongoing testing.
  SQLServer's query optimiser is a bit smarter that PostgreSQL's (yet)
  so I am hand optimising some of the more frequently used
  SQL and/or tweaking the database design slightly.

 Hmmm ... that hasn't been my general experience on complex queries.
However,
 it may be due to a difference in ANALYZE statistics.   I'd love to see you
 increase your default_stats_target, re-analyze, and see if PostgreSQL gets
 smarter.

 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco


 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


-- 
Incoming mail is certified Virus Free.
Checked by AVG Anti-Virus (http://www.grisoft.com).
Version: 7.0.230 / Virus Database: 262.6.5 - Release Date: 31/03/2004


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-04 Thread Gary Doades
Unfortunately I don't understand the question!

My background is the primarily Win32. The last time I used a *nix OS 
was about 20 years ago apart from occasional dips into the linux OS 
over the past few years. If you can tell be how to find out what you want 
I will gladly give you the information.

Regards,
Gary.

On 3 Apr 2004 at 16:52, Cott Lang wrote:

 On Sat, 2004-04-03 at 03:50, Gary Doades wrote:
  On 2 Apr 2004 at 22:36, [EMAIL PROTECTED] wrote:
  
  OK, some more detail: 
  
  Before wiping 2.4 off my test box for the second time: 
 
 Perhaps I missed it, but which io scheduler are you using under 2.6?
 
 
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly
 
 
 -- 
 Incoming mail is certified Virus Free.
 Checked by AVG Anti-Virus (http://www.grisoft.com).
 Version: 7.0.230 / Virus Database: 262.6.5 - Release Date: 31/03/2004
 



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-04 Thread Gary Doades
Unfortunately I have to try and keep both SQLServer and PostgreSQL 
compatibilty. Our main web application is currently SQLServer, but we 
want to migrate customers who don't care what the DB server is over to 
PostgreSQL. Some of our larger customers demand SQLServer, you 
know how it is!

I don't want to maintain two sets of code or SQL, so I am trying to find 
common ground. The code is not a problem, but the SQL sometimes is.

Cheers,
Gary.


On 3 Apr 2004 at 17:43, Aaron Werman wrote:

 Almost any cross dbms migration shows a drop in performance. The engine
 effectively trains developers and administrators in what works and what
 doesn't. The initial migration thus compares a tuned to an untuned version.
 
 /Aaron
 
 - Original Message - 
 From: Josh Berkus [EMAIL PROTECTED]
 To: Gary Doades [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Sent: Saturday, April 03, 2004 1:59 PM
 Subject: Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
 
 
  Gary,
 
   There are no indexes on the columns involved in the update, they are
   not required for my usual select statements. This is an attempt to
   slightly denormalise the design to get the performance up comparable
   to SQL Server 2000. We hope to move some of our databases over to
   PostgreSQL later in the year and this is part of the ongoing testing.
   SQLServer's query optimiser is a bit smarter that PostgreSQL's (yet)
   so I am hand optimising some of the more frequently used
   SQL and/or tweaking the database design slightly.
 
  Hmmm ... that hasn't been my general experience on complex queries.
 However,
  it may be due to a difference in ANALYZE statistics.   I'd love to see you
  increase your default_stats_target, re-analyze, and see if PostgreSQL gets
  smarter.
 
  -- 
  -Josh Berkus
   Aglio Database Solutions
   San Francisco
 
 
  ---(end of broadcast)---
  TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
 
 
 
 -- 
 Incoming mail is certified Virus Free.
 Checked by AVG Anti-Virus (http://www.grisoft.com).
 Version: 7.0.230 / Virus Database: 262.6.5 - Release Date: 31/03/2004
 



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-04 Thread Paul Thomas
On 04/04/2004 09:56 Gary Doades wrote:
Unfortunately I don't understand the question!

My background is the primarily Win32. The last time I used a *nix OS
was about 20 years ago apart from occasional dips into the linux OS
over the past few years. If you can tell be how to find out what you want
I will gladly give you the information.


Googling threw up

http://spider.tm/apr2004/cstory2.html

Interesting and possibly relevant quote:

Benchmarks have shown that in certain conditions the anticipatory 
algorithm is almost 10 times faster than what 2.4 kernel supports.

HTH

--
Paul Thomas
+--+-+
| Thomas Micro Systems Limited | Software Solutions for 
Business |
| Computer Consultants | 
http://www.thomas-micro-systems-ltd.co.uk   |
+--+-+

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-04 Thread Cott Lang
On Sun, 2004-04-04 at 01:56, Gary Doades wrote:
 Unfortunately I don't understand the question!
 
 My background is the primarily Win32. The last time I used a *nix OS 
 was about 20 years ago apart from occasional dips into the linux OS 
 over the past few years. If you can tell be how to find out what you want 
 I will gladly give you the information.

There are two available io schedulers in 2.6 (new feature), deadline and
anticipatory. It should show be listed in the boot messages:

dmesg | grep scheduler

I've seen people arguing for each of the two schedulers, saying one is
better than the other for databases. I'm curious which one you're
using.  :)




---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-04 Thread Gary Doades
On 3 Apr 2004 at 21:23, Mike Nolan wrote:

  Almost any cross dbms migration shows a drop in performance. The engine
  effectively trains developers and administrators in what works and what
  doesn't. The initial migration thus compares a tuned to an untuned version.
 
 I think it is also possible that Microsoft has more programmers working
 on tuning issues for SQL Server than PostgreSQL has working on the 
 whole project.
 --
 Mike Nolan
 

Agreed. Also considering the high price of SQLServer it is in their 
interests to spend a lot of resources on tuning/performance to give it a 
commercial edge over it rivals and in silly benchmark scores.

Cheers,
Gary.
 

 -- 
 Incoming mail is certified Virus Free.
 Checked by AVG Anti-Virus (http://www.grisoft.com).
 Version: 7.0.230 / Virus Database: 262.6.5 - Release Date: 31/03/2004
 



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-03 Thread Josh Berkus
Gary,

 There are no indexes on the columns involved in the update, they are  
 not required for my usual select statements. This is an attempt to  
 slightly denormalise the design to get the performance up comparable  
 to SQL Server 2000. We hope to move some of our databases over to  
 PostgreSQL later in the year and this is part of the ongoing testing.  
 SQLServer's query optimiser is a bit smarter that PostgreSQL's (yet) 
 so I am hand optimising some of the more frequently used  
 SQL and/or tweaking the database design slightly. 

Hmmm ... that hasn't been my general experience on complex queries.   However, 
it may be due to a difference in ANALYZE statistics.   I'd love to see you 
increase your default_stats_target, re-analyze, and see if PostgreSQL gets 
smarter.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-03 Thread Aaron Werman
Almost any cross dbms migration shows a drop in performance. The engine
effectively trains developers and administrators in what works and what
doesn't. The initial migration thus compares a tuned to an untuned version.

/Aaron

- Original Message - 
From: Josh Berkus [EMAIL PROTECTED]
To: Gary Doades [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Saturday, April 03, 2004 1:59 PM
Subject: Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.


 Gary,

  There are no indexes on the columns involved in the update, they are
  not required for my usual select statements. This is an attempt to
  slightly denormalise the design to get the performance up comparable
  to SQL Server 2000. We hope to move some of our databases over to
  PostgreSQL later in the year and this is part of the ongoing testing.
  SQLServer's query optimiser is a bit smarter that PostgreSQL's (yet)
  so I am hand optimising some of the more frequently used
  SQL and/or tweaking the database design slightly.

 Hmmm ... that hasn't been my general experience on complex queries.
However,
 it may be due to a difference in ANALYZE statistics.   I'd love to see you
 increase your default_stats_target, re-analyze, and see if PostgreSQL gets
 smarter.

 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco


 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-03 Thread Gary Doades
Actually it hasn't been my experience either. Most of my queries against 
the database, large and small are either a little quicker or no real 
difference. I have only really noticed big differences under stress when 
memory (RAM) is being squeezed. The main winner on 2.6 seems to be 
write performance and memory management.

Unfortunately I only have one test machine and I can't really keep 
switching between 2.4 and 2.6 to do the comparisons. I had written 
down 27 timings from a set of SQL of varying complexity using the 2.4 
kernel. Each SQL statement was executed 10 times and the average of 
the last 5 was used. I can only really compare those timings against the 
new installation on 2.6. I know that this is not ideal real world testing, 
but it is good enough for me at the moment. Unless anyone has 
contradictory indications then I will proceed with 2.6.

I did increase the default stats target from 10 to 50 and re-analysed. 
The explain numbers are slightly different, but the time to run was 
almost the same. Not surprising since the plan was the same.

QUERY PLAN 
Merge Join  (cost=0.00..192636.20 rows=2845920 width=92) 
  Merge Cond: (outer.reqt_id = inner.reqt_id) 
  -  Index Scan using order_reqt_pkey on order_reqt r  (cost=0.00..52662.40 
rows=2206291 width=6) 
  -  Index Scan using staff_book_idx2 on staff_booking  (cost=0.00..102529.28 
rows=2845920 width=90) 


On 3 Apr 2004 at 10:59, Josh Berkus wrote:

Gary,

 There are no indexes on the columns involved in the update, they are  
 not required for my usual select statements. This is an attempt to  
 slightly denormalise the design to get the performance up comparable  
 to SQL Server 2000. We hope to move some of our databases over to  
 PostgreSQL later in the year and this is part of the ongoing testing.  
 SQLServer's query optimiser is a bit smarter that PostgreSQL's (yet) 
 so I am hand optimising some of the more frequently used  
 SQL and/or tweaking the database design slightly. 

Hmmm ... that hasn't been my general experience on complex queries.   However, 
it may be due to a difference in ANALYZE statistics.   I'd love to see you 
increase your default_stats_target, re-analyze, and see if PostgreSQL gets 
smarter.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


-- 
Incoming mail is certified Virus Free.
Checked by AVG Anti-Virus (http://www.grisoft.com).
Version: 7.0.230 / Virus Database: 262.6.5 - Release Date: 31/03/2004


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-03 Thread Gary Doades
Thanks,

I know about set showplan_text, but it is only the equivalent of explain, 
not explain analyze. The graphical plan gives full statistics, runtime, 
percentage cost, loop execution counts etc. which is much more useful. 
I don't know of a way of getting the graphical plan content in text form.

Cheers,
Gary.

On 3 Apr 2004 at 6:50, @g v t c wrote:

Use Set Show_Plan or something of the sort in Query Analyzer.  Then 
run your SQL.  This will change the graphical plan to a text plan 
similar to Postgresql or at least something close to readable.

Gary Doades wrote:

On 2 Apr 2004 at 22:36, [EMAIL PROTECTED] wrote:

OK, some more detail: 

Before wiping 2.4 off my test box for the second time: 

SQL Statement for update: 
update staff_booking set time_from = r.time_from from order_reqt r where r.reqt_id = 
staff_booking.reqt_id; 

Explain: (on 2.4) 
QUERY PLAN 
Merge Join  (cost=0.00..185731.30 rows=2845920 width=92) 
  Merge Cond: (outer.reqt_id = inner.reqt_id) 
  -  Index Scan using order_reqt_pkey on order_reqt r  (cost=0.00..53068.20 
rows=2206291 width=6) 
  -  Index Scan using staff_book_idx2 on staff_booking  (cost=0.00..99579.21 
rows=2845920 width=90) 

Total execution time: 18 hours 12 minutes 

vacuum full analyze: total time 3 hours 22 minutes 

Wait 2 hours for re-install 2.6, set params etc.  
restore database.  

Same SQL Statement 
Explain: (on 2.6) 
QUERY PLAN 
Merge Join  (cost=0.00..209740.24 rows=2845920 width=92) 
  Merge Cond: (outer.reqt_id = inner.reqt_id) 
  -  Index Scan using order_reqt_pkey on order_reqt r  (cost=0.00..50734.20 
rows=2206291 width=6) 
  -  Index Scan using staff_book_idx2 on staff_booking  (cost=0.00..117921.92 
rows=2845920 width=90) 

Total execution time: 2 hours 53 minutes 

vacuum full analyze: total time 1 hours 6 minutes 

Table definitions for the two tables involved: 
CREATE TABLE ORDER_REQT 
( 
   REQT_ID SERIAL, 
   ORDER_IDinteger NOT NULL, 
   DAYOFWEEK   smallint NOT NULL CHECK (DAYOFWEEK  
BETWEEN 0 AND 6), 
   TIME_FROM   smallint NOT NULL CHECK (TIME_FROM  
BETWEEN 0 AND 1439), 
   DURATIONsmallint NOT NULL CHECK (DURATION  
BETWEEN 0 AND 1439), 
   PRODUCT_ID  integer NOT NULL, 
   NUMBER_REQT smallint NOT NULL DEFAULT (1), 
   WROPTIONS   integer NOT NULL DEFAULT 0, 
   UID_REF integer NOT NULL, 
   DT_STAMPtimestamp NOT NULL DEFAULT  
current_timestamp, 
   Sentinel_Priority   integer NOT NULL DEFAULT 0, 
   PERIOD  smallint NOT NULL DEFAULT 1 CHECK  
(PERIOD BETWEEN -2 AND 4), 
   FREQUENCY   smallint NOT NULL DEFAULT 1, 
   PRIMARY KEY (REQT_ID) 
); 

CREATE TABLE STAFF_BOOKING 
( 
   BOOKING_ID  SERIAL, 
   REQT_ID integer NOT NULL, 
   ENTITY_TYPE smallint NOT NULL DEFAULT 3  
check(ENTITY_TYPE in(3,4)), 
   STAFF_IDinteger NOT NULL, 
   CONTRACT_ID integer NOT NULL, 
   TIME_FROM   smallint NOT NULL CHECK (TIME_FROM  
BETWEEN 0 AND 1439), 
   DURATIONsmallint NOT NULL CHECK (DURATION  
BETWEEN 0 AND 1439), 
   PERIOD  smallint NOT NULL DEFAULT 1 CHECK  
(PERIOD BETWEEN -2 AND 4), 
   FREQUENCY   smallint NOT NULL DEFAULT 1, 
   TRAVEL_TO   smallint NOT NULL DEFAULT 0, 
   UID_REF integer NOT NULL, 
   DT_STAMPtimestamp NOT NULL DEFAULT  
current_timestamp, 
   SELL_PRICE  numeric(10,4) NOT NULL DEFAULT 0, 
   COST_PRICE  numeric(10,4) NOT NULL DEFAULT 0, 
   MIN_SELL_PRICE  numeric(10,4) NOT NULL DEFAULT 0, 
   MIN_COST_PRICE  numeric(10,4) NOT NULL DEFAULT 0, 
   Sentinel_Priority   integer NOT NULL DEFAULT 0, 
   CHECK_INTERVAL  smallint NOT NULL DEFAULT 0, 
  STATUS   smallint NOT NULL DEFAULT 0, 
   WROPTIONS   integer NOT NULL DEFAULT 0, 
   PRIMARY KEY (BOOKING_ID) 
); 

Foreign keys: 

ALTER TABLE ORDER_REQT ADD  
FOREIGN KEY  
   ( 
   ORDER_ID 
   ) REFERENCES MAIN_ORDER ( 
   ORDER_ID 
   ) ON DELETE CASCADE; 

ALTER TABLE ORDER_REQT ADD  
FOREIGN KEY  
   ( 
   PRODUCT_ID 
   ) REFERENCES PRODUCT ( 
   PRODUCT_ID 
   ); 

ALTER TABLE STAFF_BOOKING ADD  
FOREIGN KEY  
   ( 
   CONTRACT_ID 
   ) REFERENCES STAFF_CONTRACT ( 
   CONTRACT_ID 
   ); 

ALTER TABLE STAFF_BOOKING ADD  
FOREIGN KEY  
   ( 
   STAFF_ID 
   ) REFERENCES STAFF ( 
   STAFF_ID 
   ); 


Indexes: 

CREATE INDEX FK_IDX_ORDER_REQT  
ON ORDER_REQT  
   ( 
   ORDER_ID 
   

Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-03 Thread Gary Doades
Following on from Josh's response and my previous reply on SQLServer planning.

The main problem query is this one:

SELECT VS.*,VL.TEL1,SC.CONTRACT_ID,SC.CONTRACT_REF, SC.MAX_HOURS, 
SC.MIN_HOURS, 
 (SELECT COUNT(*) FROM TIMESHEET_DETAIL JOIN MAIN_ORDER ON 
(MAIN_ORDER.ORDER_ID = TIMESHEET_DETAIL.ORDER_ID AND 
MAIN_ORDER.CLIENT_ID = 6) WHERE TIMESHEET_DETAIL.CONTRACT_ID = 
SC.CONTRACT_ID) AS VISITS,
(SELECT (SUM(R.DURATION+1))/60.0 FROM ORDER_REQT R
 JOIN STAFF_BOOKING B ON (B.REQT_ID = R.REQT_ID)
 JOIN BOOKING_PLAN BP ON (BP.BOOKING_ID = B.BOOKING_ID) WHERE 
B.CONTRACT_ID = SC.CONTRACT_ID 
 AND BP.BOOKING_DATE BETWEEN '2004-06-12' AND '2004-06-18') AS RHOURS 
FROM VSTAFF VS
JOIN STAFF_CONTRACT SC ON (SC.STAFF_ID = VS.STAFF_ID)
JOIN VLOCATION VL ON (VL.LOCATION_ID = VS.LOCATION_ID)
JOIN SEARCH_REQT_RESULT SR ON (SR.STAFF_ID = VS.STAFF_ID)
WHERE SR.SEARCH_ID = 1 AND SC.CONTRACT_ID IN
(SELECT C.CONTRACT_ID FROM STAFF_PRODUCT P,STAFF_CONTRACT C  
WHERE P.CONTRACT_ID=C.CONTRACT_ID AND C.STAFF_ID = VS.STAFF_ID AND 
P.PRODUCT_ID IN (SELECT PRODUCT_ID FROM SEARCH_ORDER_REQT WHERE 
SEARCH_ID = 1)  AND C.AVAIL_DATE_FROM = '2004-06-12' AND 
C.AVAIL_DATE_TO = '2004-06-18'  GROUP BY C.CONTRACT_ID
 HAVING (COUNT(C.CONTRACT_ID) = (SELECT COUNT(DISTINCT PRODUCT_ID) 
FROM SEARCH_ORDER_REQT WHERE SEARCH_ID = 1)))

The explain analyze is:
QUERY PLAN
Nested Loop  (cost=101.54..1572059.57 rows=135 width=152) (actual 
time=13749.100..1304586.501 rows=429 loops=1)
  InitPlan
-  Index Scan using fk_idx_wruserarea on wruserarea  (cost=3.26..6.52 rows=1 
width=4) (actual time=0.944..0.944 rows=1 loops=1)
  Index Cond: (area_id = 1)
  Filter: (uid = $4)
  InitPlan
-  Seq Scan on wruser  (cost=0.00..3.26 rows=1 width=4) (actual 
time=0.686..0.691 rows=1 loops=1)
  Filter: ((username)::name = current_user())
  -  Hash Join  (cost=95.02..3701.21 rows=215 width=138) (actual 
time=100.476..1337.392 rows=429 loops=1)
Hash Cond: (outer.staff_id = inner.staff_id)
Join Filter: (subplan)
-  Seq Scan on staff_contract sc  (cost=0.00..33.24 rows=1024 width=37) 
(actual 
time=0.114..245.366 rows=1024 loops=1)
-  Hash  (cost=93.95..93.95 rows=430 width=109) (actual time=38.563..38.563 
rows=0 loops=1)
  -  Hash Join  (cost=47.47..93.95 rows=430 width=109) (actual 
time=15.502..36.627 rows=429 loops=1)
Hash Cond: (outer.staff_id = inner.staff_id)
-  Seq Scan on staff  (cost=34.61..66.48 rows=1030 width=105) 
(actual 
time=9.655..15.264 rows=1030 loops=1)
  Filter: ((hashed subplan) OR $5)
  SubPlan
-  Seq Scan on staff_area  (cost=10.73..33.38 rows=493 
width=4) 
(actual time=8.452..8.452 rows=0 loops=1)
  Filter: ((hashed subplan) OR (area_id = 1))
  SubPlan
-  Seq Scan on wruserarea  (cost=3.26..10.72 
rows=5 width=4) 
(actual time=0.977..1.952 rows=1 loops=1)
  Filter: (uid = $1)
  InitPlan
-  Seq Scan on wruser  (cost=0.00..3.26 
rows=1 width=4) 
(actual time=0.921..0.926 rows=1 loops=1)
  Filter: ((username)::name = 
current_user())
-  Hash  (cost=11.79..11.79 rows=430 width=4) (actual 
time=5.705..5.705 
rows=0 loops=1)
  -  Index Scan using fk_idx_search_reqt_result on 
search_reqt_result 
sr  (cost=0.00..11.79 rows=430 width=4) (actual time=0.470..4.482 rows=429 loops=1)
Index Cond: (search_id = 1)
SubPlan
  -  HashAggregate  (cost=8.32..8.32 rows=1 width=4) (actual 
time=2.157..2.157 
rows=1 loops=429)
Filter: (count(contract_id) = $9)
InitPlan
  -  Aggregate  (cost=1.04..1.04 rows=1 width=4) (actual 
time=0.172..0.173 
rows=1 loops=1)
-  Seq Scan on search_order_reqt  (cost=0.00..1.04 rows=1 
width=4) 
(actual time=0.022..0.038 rows=1 loops=1)
  Filter: (search_id = 1)
-  Hash IN Join  (cost=1.04..7.27 rows=1 width=4) (actual 
time=2.064..2.117 
rows=1 loops=429)
  Hash Cond: (outer.product_id = inner.product_id)
  -  Nested Loop  (cost=0.00..6.19 rows=7 width=8) (actual 
time=1.112..2.081 rows=8 loops=429)
-  Index Scan using fk_idx_staff_contract_2 on 
staff_contract c  
(cost=0.00..3.03 rows=1 width=4) (actual time=0.206..0.245 rows=1 loops=429)
  Index Cond: (staff_id = $8)
  Filter: ((avail_date_from = '2004-06-12'::date) AND 
(avail_date_to 
= '2004-06-18'::date))
-  Index Scan using 

Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-01 Thread Tom Lane
Gary Doades [EMAIL PROTECTED] writes:
 As a test in PosgreSQL I issued a statement to update a single column
 of a table containing 2.8 million rows with the values of a column in
 a table with similar rowcount.  Using the above spec I had to stop the
 server after 17 hours. The poor thing was thrashing the hard disk and
 doing more swapping than useful work.

This statement is pretty much content-free, since you did not show us
the table schemas, the query, or the EXPLAIN output for the query.
(I'll forgive you the lack of EXPLAIN ANALYZE, but you could easily
have provided all the other hard facts.)  There's really no way to tell
where the bottleneck is.  Maybe it's a kernel-level issue, but I would
not bet on that without more evidence.  I'd definitely not bet on it
without direct confirmation that the same query plan was used in both
setups.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-01 Thread Gary Doades
The post was not intended to be content-rich, just my initial feedback 
after only just switching to 2.6. Since I had largely given up on this 
particular line of attack using 2.4 I didn't think to do a detailed analysis 
at this time. I was also hoping that others would add to the discussion. 

As this could become important I will be doing more analysis, but due to 
the nature of the issue and trying to keep as many factors constant as 
possible, this may take some time.

Cheers,
Gary.

On 2 Apr 2004 at 1:32, Tom Lane wrote:

 Gary Doades [EMAIL PROTECTED] writes:
  As a test in PosgreSQL I issued a statement to update a single column
  of a table containing 2.8 million rows with the values of a column in
  a table with similar rowcount.  Using the above spec I had to stop the
  server after 17 hours. The poor thing was thrashing the hard disk and
  doing more swapping than useful work.
 
 This statement is pretty much content-free, since you did not show us
 the table schemas, the query, or the EXPLAIN output for the query.
 (I'll forgive you the lack of EXPLAIN ANALYZE, but you could easily
 have provided all the other hard facts.)  There's really no way to tell
 where the bottleneck is.  Maybe it's a kernel-level issue, but I would
 not bet on that without more evidence.  I'd definitely not bet on it
 without direct confirmation that the same query plan was used in both
 setups.
 
   regards, tom lane
 
 
 -- 
 Incoming mail is certified Virus Free.
 Checked by AVG Anti-Virus (http://www.grisoft.com).
 Version: 7.0.230 / Virus Database: 262.6.5 - Release Date: 31/03/2004
 



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match