Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Christopher Kings-Lynne



> select count(*) from mtable where day='Mon'
>
> Results:
>
> 1. P3 600 512MB RAM MSSQL. It takes about 4-5 secs to
> run. If I run a few queries and everything is cached,
> it is sometimes just 1 second.
>
> 2. Athlon 1.3 Ghz 1GB RAM. PostgreSQL takes 7 seconds.
> I have played with the buffers setting and currently
> have it at 7500. At 2 it took over 20 seconds to
> run.
>
> 5 seconds vs 7 isn't that big of a deal, but 1 second
> vs 7 seconds is. Also, the slower performance is with
> much lesser hardware.


Post the result of this for us:

explain analyze select count(*) from mtable where day='Mon';

On both machines.

Chris

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread mark durrant
> Post the result of this for us:
>
> explain analyze select count(*) from mtable where
> day='Mon';
>
> On both machines.

Hi Chris --

PostgreSQL Machine:
Aggregate  (cost=140122.56..140122.56 rows=1 width=0)
(actual time=24516.000..24516.000 rows=1 loops=1)
  ->  Index Scan using day on mtable
(cost=0.00..140035.06 rows=35000 width=0) (actual
time=47.000..21841.000 rows=1166025 loops=1)
      Index Cond: (day = 'Mon'::bpchar)
Total runtime: 24516.000 ms
(Note this took 24 seconds after fresh reboot, next
execution was 11, and execution without explain
analyze was 6.7 seconds)

MSSQL Machine:
That Explain Analyze command doesn't work for MSSQL,
but I did view the query plan. 97% of it was scanning
a particular range of rows from a nonclustered index.

Thanks for your help --Mark


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Neil Conway

mark durrant wrote:

> PostgreSQL Machine:
> Aggregate  (cost=140122.56..140122.56 rows=1 width=0)
> (actual time=24516.000..24516.000 rows=1 loops=1)
>   ->  Index Scan using day on mtable
> (cost=0.00..140035.06 rows=35000 width=0) (actual
> time=47.000..21841.000 rows=1166025 loops=1)
>       Index Cond: (day = 'Mon'::bpchar)
> Total runtime: 24516.000 ms


Have you run ANALYZE?

Clustering the table on the day index (via the CLUSTER command) would 
be worth trying.
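
A minimal sketch of both suggestions, using the table and index names from
the plan above (CLUSTER takes an exclusive lock and rewrites the table, so
it is best run in a quiet window; the syntax shown is the 8.0-era
indexname-first form):

```sql
-- Refresh the planner's statistics for the table
ANALYZE mtable;

-- Physically reorder the heap by the "day" index, so an index scan
-- on day='Mon' touches far fewer heap pages
CLUSTER day ON mtable;

-- Re-check the plan afterwards
EXPLAIN ANALYZE SELECT count(*) FROM mtable WHERE day = 'Mon';
```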


-Neil

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Harald Lau (Sector-X)
Mark,

> MSSQL Machine:
> That Explain Analyze command doesn't work for MSSQL,

try this:
set showplan_all on
go
select ...
go

Harald

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Amit V Shah
Hi all,

From whatever reading and surfing I have done, I have found that postgres is
good. Actually I myself am a fan of postgres as compared to mysql. However I
want to have some frank opinions before I decide something.  Following are
some of the aspects of my schema, and our concerns --

- We have around 150 tables in the DB
- We have a lot of foreign keys between the tables
- A couple of tables are going to have a few hundred million records
(300 million right now, and growing). A few of these tables are fairly
wide, with around 32 columns, 3-4 of which are foreign keys referring
to other tables
- Most of the DB usage is selects. We would have some inserts, but that
would be a nightly or a monthly process


Our only concern with going with postgres is speed. I haven't done a speed
test yet so I can't speak from experience. But the major concern is that the
selects and inserts are going to be much slower on postgres than on mysql. I
don't know how true this is. I know this is a postgres forum so everyone will
say postgres is better, but I am just looking for some help and advice I
guess !!!

I am not trying to start a mysql vs postgres war, so please don't
misunderstand me. I tried to look around for mysql vs postgres articles,
but most of them said mysql is better in speed. However those articles were
very old, so I don't know about the current state. Please comment !!!

Thanks,
Amit


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Joshua D. Drake


> I am not trying to start a mysql vs postgres war so please dont
> misunderstand me  I tried to look around for mysql vs postgres articles,
> but most of them said mysql is better in speed. However those articles were
> very old so I dont know about recent stage. Please comment !!!


It is my experience that MySQL is faster under smaller load scenarios:
say 5-10 connections only doing simple SELECTs, e.g. a dynamic website.

It is also my experience that PostgreSQL is faster and more stable under
consistent and heavy load. I have customers who regularly use up
to 500 connections.

Note that a lot of this depends on how your database is designed. Foreign
keys slow things down.

I think it would be important for you to look at your overall goal of
migration. MySQL is really not a bad product IF you are willing to
work within its limitations.

PostgreSQL is a real RDBMS; it is like Oracle or DB2 and comes with a
comparable feature set. Only you can decide if that is what you need.


Sincerely,

Joshua D. Drake
Command Prompt, Inc.


--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Amit V Shah
Hi Josh,

Thanks for the prompt reply !! Actually migration is inevitable. We have a
totally messed up schema, not normalized and stuff like that. So the goal of
the migration is to get a new and better normalized schema. That part is
done already. Now the decision point is, should we go with postgres or
mysql. 

Thanks,
Amit
 


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Bruno Wolff III
On Tue, May 24, 2005 at 08:36:36 -0700,
  mark durrant [EMAIL PROTECTED] wrote:
 
> --MSSQL's ability to hit the index only and not having
> to go to the table itself results in a _big_
> performance/efficiency gain. If someone who's in
> development wants to pass this along, it would be a
> nice addition to PostgreSQL sometime in the future.
> I'd suspect that as well as making one query faster,
> it would make everything else faster/more scalable as
> the server load is so much less.

This gets brought up a lot. The problem is that the index doesn't include
information about whether the current transaction can see the referenced
row. Putting this information in the index will add significant overhead
to every update and the opinion of the developers is that this would be
a net loss overall.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Joel Fradkin
I took an unbiased look and did some tests.
Objectively, for me MySQL was not an improvement in speed.
I had read the benchmarks in PC Magazine from a while back as well.

I did some tests using ODBC and .NET connections, and also used Aqua Studios
(which hooks up to both databases), and found postgres a bit faster.

I did spend more time getting a feel for setup on postgres, but I was at
a point of desperation as some queries were still too slow on postgres.

I ended up re-engineering my app to use simpler (flattened) data sets.
I still have a few I am working through, but all in all it is running better
than when I was on MSSQL, and MySQL was just slower on the tests I did.

I loaded both MySQL and postgres on both my 4-processor Dell running Red Hat
AS3 and on Windows XP on an Optiplex.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004 WAZAGUA, Inc. All rights reserved.
 

 


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Josh Berkus
Amit,

> - We have lot of foreign keys between the tables

Do you need these keys to be enforced?   Last I checked, MySQL was still
having trouble with foreign keys.

> - Most of the DB usage is Selects. We would have some inserts but that
> would be like a nightly or a monthly process

So transaction integrity is not a real concern?   This sounds like a data
warehouse; wanna try Bizgres?  (www.bizgres.org)

> Our only concern with going with postgres is speed. I haven't done a speed
> test yet so I can't speak. But the major concern is that the selects and
> inserts are going to be much much slower on postgres than on mysql. I dont
> know how true this is. I know this is a postgres forum so everyone will say
> postgres is better but I am just looking for some help and advise I guess
Well, the relative speed depends on what you're doing.   You want slow, try a 
transaction rollback on a large InnoDB table ;-)   PostgreSQL/Bizgres will 
also be implementing bitmapped indexes and table partitioning very soon, so 
we're liable to pull way ahead of MySQL on very large databases.

> I am not trying to start a mysql vs postgres war so please dont
> misunderstand me  I tried to look around for mysql vs postgres
> articles, but most of them said mysql is better in speed.

Also, I'll bet most of those articles were based on either website use or
single-threaded simple-SQL tests.   Not a real data warehousing situation.

It's been my personal experience that MySQL does not scale well beyond about 
75GB without extensive support from MySQL AB.   PostgreSQL more easily scales 
up to 200GB, and to as much as 1TB with tuning expertise.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Amit V Shah
>> - Most of the DB usage is Selects. We would have some inserts but that
>> would be like a nightly or a monthly process
>
> So transaction integrity is not a real concern?   This sounds like a data
> warehouse; wanna try Bizgres?  (www.bizgres.org)

I took a look at this. I have a few concerns with Bizgres though -- I am
using the Jetspeed portal engine and Hibernate as my O/R mapping layer. I
know for sure that they don't support Bizgres. Now the question is what
difference is there between Bizgres and postgres ... I guess I will try to
look around the website more and find out, but if there is something you
would like to comment on, that would be very helpful ...

Thanks,
Amit


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Steinar H. Gunderson
On Tue, May 24, 2005 at 01:56:54PM -0400, Amit V Shah wrote:
> I took a look at this. I have a few concerns with bizgres though -- I am
> using jetspeed portal engine and Hibernate as my O/R Mapping layer.

If you have problems with performance, you might want to look into using JDBC
directly instead of using Hibernate. I know groups of people who are rather
less-than-happy with it performance-wise :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Josh Berkus
Amit,

> I took a look at this. I have a few concerns with bizgres though -- I am
> using jetspeed portal engine and Hibernate as my O/R Mapping layer. I know
> for sure that they dont support bizgres. Now the question is what
> difference is there between bizgres and postgres ... I guess I will try to
> look around the website more and find out, but if there is something you
> would like to comment, that would be very helpful ...

Bizgres is PostgreSQL.   Just a different packaging of it, with some patches 
which are not yet in the main PostgreSQL.   Also, it's currently beta.

--Josh

-- 
__Aglio Database Solutions___
Josh Berkus                        Consultant
josh@agliodbs.com                  www.agliodbs.com
Ph: 415-752-2500                   Fax: 415-752-2387
2166 Hayes Suite 200               San Francisco, CA

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread mark durrant
I'm far from an expert, so this may be off-base... but
perhaps a suggestion would be to allow a hint to be
sent to the optimizer if the user doesn't care that
the result is approximate; maybe then this wouldn't
require adding more overhead to the indexes.

MSSQL has something like this with (nolock) 
i.e. select count(*) from customers (nolock) where
name like 'Mark%' 

Regardless, I'm very impressed with PostgreSQL and I
think we're moving ahead with it.

Mark

--- Bruno Wolff III [EMAIL PROTECTED] wrote:
> On Tue, May 24, 2005 at 08:36:36 -0700,
>   mark durrant [EMAIL PROTECTED] wrote:
> >
> > --MSSQL's ability to hit the index only and not having
> > to go to the table itself results in a _big_
> > performance/efficiency gain. If someone who's in
> > development wants to pass this along, it would be a
> > nice addition to PostgreSQL sometime in the future.
> > I'd suspect that as well as making one query faster,
> > it would make everything else faster/more scalable as
> > the server load is so much less.
>
> This gets brought up a lot. The problem is that the
> index doesn't include information about whether the
> current transaction can see the referenced row.
> Putting this information in the index will add
> significant overhead to every update and the opinion
> of the developers is that this would be a net loss
> overall.




---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread PFC


	It's common knowledge, it seems, that MySQL without transactions will be
a lot faster than Postgres on INSERTs. And on UPDATEs too, that is, unless
you have more than a few concurrent connections, at which point
the MySQL full table lock will just kill everything. And you don't have
transactions, of course, and if something goes wrong, bye bye data, or
funky stuff happens, like half-committed transactions if a constraint is
violated in an INSERT SELECT, or you get '0 January' or '31 February',
etc.
	I heard it said that MySQL with transactions (InnoDB) is slower than
postgres. I'd believe it... and you still get 0000-00-00 as a date for
free.
	But from your use case postgres doesn't sound like a problem; yours
sounds like a few big batched COPYs, which are really really fast.


	And about SELECTs, this is really from an experience I had a few months
ago, on an e-commerce site... well, to put it nicely, MySQL's planner
doesn't know shit when it comes to doing anything a bit complicated. I had
this query to show the "also purchased" products on a page, and also a few
other queries ("best buys in this category", "related products", etc.),
nothing very complicated really, at worst they were 4-table joins... and
with 50K products MySQL planned it horrendously and it took half a second!
Seq scans every time... I had to split the query in two, one to get the
product ids, another one to get the products.
	I took the SQL, put it in postgres with the usual changes (type names,
etc.) but same indexes, same data... the query took half a millisecond.
Well... what can I say?


	Also when you sit in front of the psql or mysql command line, it's an  
entirely different experience. One is a pleasure to work with... the other  
one is just a pain.



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Joshua D. Drake

Amit V Shah wrote:
> Hi Josh,
>
> Thanks for the prompt reply !! Actually migration is inevitable. We have a
> totally messed up schema, not normalized and stuff like that. So the goal of
> the migration is to get a new and better normalized schema. That part is
> done already. Now the decision point is, should we go with postgres or
> mysql.

O.k., then I would ask myself this:

Would I trust my brand new data, that I have put all this effort into and
that finally looks the way I want it to look, to a database that
truncates information?

PostgreSQL is truly ACID-compliant. Even if it is a little slower (which
under normal use I don't find to be the case), wouldn't the reliability
of PostgreSQL make up for, say, a 10% net difference in performance?

Sincerely,

Joshua D. Drake






---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Alex Turner
Until you start worrying about MVCC -- we have had problems with the
MSSQL implementation of read consistency because of this 'feature'.

Alex Turner
NetEconomist

On 5/24/05, Bruno Wolff III [EMAIL PROTECTED] wrote:
> On Tue, May 24, 2005 at 08:36:36 -0700,
>   mark durrant [EMAIL PROTECTED] wrote:
> > --MSSQL's ability to hit the index only and not having
> > to go to the table itself results in a _big_
> > performance/efficiency gain. If someone who's in
> > development wants to pass this along, it would be a
> > nice addition to PostgreSQL sometime in the future.
> > I'd suspect that as well as making one query faster,
> > it would make everything else faster/more scalable as
> > the server load is so much less.
>
> This gets brought up a lot. The problem is that the index doesn't include
> information about whether the current transaction can see the referenced
> row. Putting this information in the index will add significant overhead
> to every update and the opinion of the developers is that this would be
> a net loss overall.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Josh Berkus
Folks,

> > This gets brought up a lot. The problem is that the
> > index doesn't include information about whether the
> > current transaction can see the referenced row.
> > Putting this information in the index will add
> > significant overhead to every update and the opinion
> > of the developers is that this would be a net loss
> > overall.

Pretty much.  There has been discussion about allowing index-only access to
frozen tables, i.e. archive partitions.  But it all sort of hinges on
someone implementing it and testing ...

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread PFC


> Pretty much.  There has been discussion about allowing index-only access
> to frozen tables, i.e. archive partitions.  But it all sort of hinges on
> someone implementing it and testing ...


	Would be interesting as a parameter to set at index creation (i.e. if you
know this table will have a lot of reads and few writes)... like creating an
index on columns X,Y while keeping data on columns X,Y and Z...

But in this case do you still need the table ?
	Or even create a table type where the table and the index are one, like  
an auto-clustered table...

I don't know if it would be used that often, though ;)


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Michael Stone

On Tue, May 24, 2005 at 04:35:14PM -0700, Josh Berkus wrote:
Pretty much.  There has been discussion about allowing index-only access to 
frozen tables, i.e. archive partitions.  But it all sort of hinges on 
someone implementing it and testing 


Is there any way to expose the planner estimate? For some purposes it's
enough to just give a rough ballpark (e.g., a Google-esque "results 1-10
of approximately 1000") so a user knows whether it's worth even
starting to page through.

Mike Stone

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread John A Meinel

Michael Stone wrote:
> On Tue, May 24, 2005 at 04:35:14PM -0700, Josh Berkus wrote:
> > Pretty much.  There has been discussion about allowing index-only
> > access to frozen tables, i.e. archive partitions.  But it all sort
> > of hinges on someone implementing it and testing ...
>
> Is there any way to expose the planner estimate? For some purposes it's
> enough to just give a rough ballpark (e.g., a google-esque results 1-10
> of approximately 1000) so a user knows whether its worth even
> starting to page through.
>
> Mike Stone


Well, you could always do:

EXPLAIN SELECT ...

And then parse out the rows= in the first line.

John
=:-





Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Christopher Kings-Lynne

> --As Chris pointed out, how real-world is this test?
> His point is valid. The database we're planning will
> have a lot of rows and require a lot of summarization
> (hence my attempt at a test), but we shouldn't be
> pulling a million rows at a time.

If you want to do lots of aggregate analysis, I suggest you create a
separate summary table, and create triggers on the main table to
maintain your summaries in the other table...
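
A rough sketch of that summary-table approach (all names are illustrative,
not from Mark's schema; it assumes plpgsql is installed, and it ignores
UPDATEs of the day column and contention on the counter rows):

```sql
-- Hypothetical summary table keeping a running count per day
CREATE TABLE mtable_counts (
    day char(3) PRIMARY KEY,
    cnt bigint NOT NULL DEFAULT 0
);

-- Trigger function keeping the count in sync on INSERT/DELETE
CREATE OR REPLACE FUNCTION maintain_mtable_counts() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE mtable_counts SET cnt = cnt + 1 WHERE day = NEW.day;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE mtable_counts SET cnt = cnt - 1 WHERE day = OLD.day;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mtable_counts_trig
    AFTER INSERT OR DELETE ON mtable
    FOR EACH ROW EXECUTE PROCEDURE maintain_mtable_counts();

-- The expensive count then becomes a single-row lookup:
SELECT cnt FROM mtable_counts WHERE day = 'Mon';
```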



> --MSSQL's ability to hit the index only and not having
> to go to the table itself results in a _big_
> performance/efficiency gain. If someone who's in
> development wants to pass this along, it would be a
> nice addition to PostgreSQL sometime in the future.
> I'd suspect that as well as making one query faster,
> it would make everything else faster/more scalable as
> the server load is so much less.


This is well-known and many databases do it.  However, due to MVCC 
considerations in PostgreSQL, it's not feasible for us to implement it...


Chris

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread Christopher Kings-Lynne
> If I pg_dump that database then create a new database (e.g. tempdb)
> and upload the dump file (thus making a duplicate) then the same query
> only takes 190ms !!
> Vacuum, vacuum analyse, and vacuum full analyse do not seem to have an
> impact on these times.

Damn, for some reason I didn't read that you had already tried vacuum
full.  In that case, I can't explain it, except perhaps you aren't
vacuuming properly, or the right thing, or it's a disk cache thing.


Chris

---(end of broadcast)---
TIP 6: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread John A Meinel

SpaceBallOne wrote:
> Wondering if someone could explain a pecularity for me:
>
> We have a database which takes 1000ms to perform a certain query on.
>
> If I pg_dump that database then create a new database (e.g. tempdb)
> and upload the dump file (thus making a duplicate) then the same query
> only takes 190ms !!
> Vacuum, vacuum analyse, and vacuum full analyse does not seem to have
> an impact on these times.
>
> Can anyone explain why this may be occurring and how I might be able
> to keep the original database running at the same speed as tempdb?
>
> Thanks in advance,
>
> Dave.


What version of postgres?

There are a few possibilities. If you are having a lot of updates to the
table, you can get index bloat. And vacuum doesn't fix indexes. You have
to REINDEX to do that. Though REINDEX has the same lock that VACUUM
FULL has, so you need to be a little careful with it.

Probably better is to do CLUSTER, as it does a REINDEX and a sort, so
your table ends up nicer when you are done.
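
A sketch of those two commands (table and index names are illustrative;
both take exclusive locks on the table, so schedule them in a quiet window):

```sql
-- Rebuild all indexes on the table, reclaiming bloated index space
REINDEX TABLE mytable;

-- Or rebuild and physically sort the heap by one index in a single pass
-- (8.0-era syntax: CLUSTER indexname ON tablename)
CLUSTER mytable_pkey ON mytable;

-- Afterwards, refresh the planner's statistics
ANALYZE mytable;
```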

Also, older versions of postgres had a worse time with index bloat. One
thing that caused a lot of problems is a table that you insert into over
time, so that all the values are incrementing. If you are deleting older
entries, that area won't be re-used because it falls at the back end. I
believe newer versions have fixed this.

By the way, I think doing:

CREATE DATABASE tempdb WITH TEMPLATE = originaldb;

Is a much faster way of doing dump and load. I *think* it would recreate
indexes, etc. If it just does a copy it may not show the dump/restore
improvement.

John
=:-





Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread SpaceBallOne

> What version of postgres?

8.0.2 ... but I think I've seen this before on 7.3 ...


> There are a few possibilities. If you are having a lot of updates to the
> table, you can get index bloat. And vacuum doesn't fix indexes. You have
> to REINDEX to do that. Though REINDEX has the same lock that VACUUM
> FULL has, so you need to be a little careful with it.
>
> Probably better is to do CLUSTER, as it does a REINDEX and a sort, so
> your table ends up nicer when you are done.


Thanks, will try those next time this problem crops up (i just deleted / 
recreated the database to speed things for its users in the office ... 
probably should have held off to see if I could find a solution first!).


Yes, the database / table-in-question does have a lot of updates, deletes, 
and new rows (relatively speaking for a small business).


Would CLUSTER / REINDEX still have an effect if our queries were done via 
sequential scan? This is an old database (as in, built by me when I was just 
starting to learn unix / postgres), so the database design is pretty horrible 
(little normalisation, no indexes).


Have taken Chris's advice on board too and set up cron to do a vacuumdb hourly 
instead of my weekly vacuum.
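For reference, an hourly vacuumdb like this can be a one-line crontab
entry (the database name is a placeholder, and on 8.0 the contrib
pg_autovacuum daemon is an alternative worth a look):

```
# m h dom mon dow  command
0 * * * *  vacuumdb --analyze --quiet mydb
```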


Cheers,

Dave.




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread Christopher Kings-Lynne
Would CLUSTER / REINDEX still have an effect if our queries were done 
via sequential scan? 


SELECTs don't write to the database, so they have no effect at all on 
vacuuming/analyzing.  You only need to worry about that with writes.


This is an old database (as in, built by me when I 
was just starting to learn unix / postgres), so the database design is 
pretty horrible (little normalisation, no indexes).


No indexes?  Bloody hell :D

Use EXPLAIN ANALYZE SELECT ... ; on all of your selects to see where 
they are slow and where you can add indexes...
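As a hedged illustration of that workflow (the table, column, and index
names here are invented): run the query under EXPLAIN ANALYZE, and if
the plan shows a Seq Scan filtering on a selective column, try an index
on that column and compare:

```sql
EXPLAIN ANALYZE SELECT * FROM customers WHERE surname = 'Smith';
-- If the plan shows "Seq Scan on customers" with a Filter on surname,
-- an index may help:
CREATE INDEX customers_surname_idx ON customers (surname);
ANALYZE customers;  -- refresh statistics so the planner considers it
EXPLAIN ANALYZE SELECT * FROM customers WHERE surname = 'Smith';
```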


Chris

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 If I pg_dump that database then create a new database (e.g. tempdb) 
 and upload the dump file (thus making a duplicate) then the same query 
 only takes 190ms !!
 Vacuum, vacuum analyse, and vacuum full analyse does not seem to have an 
 impact on these times.

 Damn, for some reason I didn't read that you had already tried vacuum 
 full.

I'm thinking index bloat, and a PG version too old for vacuum full to
recover any index space.  But without any information about PG version
or EXPLAIN ANALYZE results, we're all just guessing.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread John A Meinel

SpaceBallOne wrote:


What version of postgres?



8.0.2 ... but I think I've seen this before on 7.3 ...


There are a few possibilities. If you are having a lot of updates to the
table, you can get index bloat. And vacuum doesn't fix indexes. You have
to REINDEX to do that. Though REINDEX has the same lock that VACUUM
FULL has, so you need to be a little careful with it.




Probably better is to do CLUSTER, as it does a REINDEX and a sort, so
your table ends up nicer when you are done.



Thanks, will try those next time this problem crops up (I just deleted
/ recreated the database to speed things up for its users in the office
... probably should have held off to see if I could find a solution
first!).

Yes, the database / table-in-question does have a lot of updates,
deletes, and new rows (relatively speaking for a small business).

Would CLUSTER / REINDEX still have an effect if our queries were done
via sequential scan? This is an old database (as in, built by me when I
was just starting to learn unix / postgres), so the database design is
pretty horrible (little normalisation, no indexes).


Well, my first recommendation is to put in some indexes. :) They are
relatively easy to set up and can drastically improve select performance.

What version of postgres are you using?
What does it say at the end of VACUUM FULL ANALYZE VERBOSE? That
should tell you how many free pages were reclaimed and how big your free
space map should be.
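A sketch of checking this (the free-space-map settings live in
postgresql.conf; the commands are real 8.0-era ones, but the advice to
compare them against the VACUUM output is the point):

```sql
VACUUM FULL ANALYZE VERBOSE;
-- The last lines of the output report the total free pages found
-- versus the free space map's capacity.

SHOW max_fsm_pages;      -- FSM settings as of 8.x; raise them in
SHOW max_fsm_relations;  -- postgresql.conf (and restart) if the
                         -- VACUUM output says they are too low
```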

If you only did one VACUUM FULL, you might try another, as it sounds like
your tables weren't properly compacted. I'm pretty sure plain vacuum only
removes empty pages and marks locations in the free space map so they can
be re-used, while vacuum full will move entries around to create free pages.

It sounds like it didn't do it properly.

But even so, CLUSTER is still your friend, as it allows you to presort
the rows in your tables.



Have taken Chris's advice on board too and set up cron to do a vacuumdb
hourly instead of my weekly vacuum.

Cheers,

Dave.



John
=:-





Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread Bruno Wolff III
On Tue, May 24, 2005 at 21:39:15 -0500,
  John A Meinel [EMAIL PROTECTED] wrote:
 
 By the way, I think doing:
 
 CREATE DATABASE tempdb WITH TEMPLATE = originaldb;
 
 Is a much faster way of doing dump and load. I *think* it would recreate
 indexes, etc. If it just does a copy it may not show the dump/restore
 improvement.

You need to be careful when doing this. See section 18.3 of the 8.0 docs
for caveats.
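For the record, a sketch of that copy; the main caveat the docs give is
that no other sessions may be connected to the source database while the
command runs:

```sql
-- Fails with "source database ... is being accessed by other users"
-- if anyone else is connected to originaldb at the time
CREATE DATABASE tempdb WITH TEMPLATE = originaldb;
```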

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster