Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-07 Thread Heiko Kehlenbrink
hi shridhar,


> Heiko Kehlenbrink wrote:
>
>> hi list,
>>
>> i want to convince people to use postgresql instead of ms-sql server, so i
>> set up a kind of comparission insert data / select data from postgresql /
>> ms-sql server
>>
>> the table i use was pretty basic,
>>
>> id   bigserial
>> dist  float8
>> x float8
>> y float8
>> z float8
>>
>> i filled the table with a function which filled x,y,z with incremental
increasing values (1,2,3,4,5,6...) and computing from that the dist
value
>> for every tupel (sqrt((x*x)+(y*y)+(z*z))).
>>
>> this works fine for both dbms
>>
>> postgresql needs 13:37 min for 10.000.000 tupel,
>> ms-sql needs 1:01:27 h for 10.000.000 tupel.
>>
>> so far so good.
>>
>> i attached an index on the dist row and started to query the dbs with
scripts which select a serial row of 100.000,200.000,500.000 tupels
based
>> on the dist row.
>> i randomizly compute the start and the end distance and made a "select
avg(dist) from table where dist > startdist and dist < enddist"
>
> Some basics to check quickly.
>
> 1. vacuum analyze the table before you start selecting.

was done,

> 2. for slow running queries, check explain analyze output and find out
who takes
> maximum time.

[EMAIL PROTECTED]:~> psql -d test -c 'explain analyse select avg(dist)
from massive2 where dist > (100*sqrt(3.0))::float8 and dist <
(150*sqrt(3.0))::float8;'
NOTICE:  QUERY PLAN:

Aggregate  (cost=14884.61..14884.61 rows=1 width=8) (actual
time=3133.24..3133.24 rows=1 loops=1)
  ->  Index Scan using massive2_dist on massive2  (cost=0.00..13648.17
rows=494573 width=8) (actual time=0.11..2061.38 rows=49 loops=1) Total
runtime: 3133.79 msec

EXPLAIN

seems to me that most time was needed for the index scanning...

> 3. Check for typecasting. You need to typecast the query correctly e.g.
>
> select avg(dist) from table where dist >startdist::float8 and
> dist
> This might still end up with sequential scan depending upon the plan.
but if
> index scan is picked up, it might be plenty fast..
>
nope, the dist row is float8 and the query-borders are float8 too, also
the explain says that an index scann was done.

> Post explain analyze for the queries if things don't improve.
>
see above..

>   HTH
>
>   Shridhar
>
best regards
heiko


>
>




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-06 Thread Kaarel
Heiko Kehlenbrink wrote:

i want to convince people to use postgresql instead of ms-sql server, so i
set up a kind of comparission insert data / select data from postgresql /
ms-sql server
 

[...]

do you have any hints like compiler-flags and so on to get the answering
time from postgresql equal to ms-sql?
(btw both dbms were running on exactly the same hardware)

i use suse 8.1
 postgresql 7.2 compiled from the rpms for using postgis, but that is
anothe story...
 1.5 gig ram
 1.8 mhz intel cpu
every help welcome
 

Suse 8.1 comes with 2.4 series kernel I suppose. Many have witnessed a 
speed increase when using 2.6 series kernel. Might consider this too 
besides the newer PostgreSQL version already suggested. 2.6 has some 
scheduling options that are not enabled by default but may enhance 
database performance 
(http://story.news.yahoo.com/news?tmpl=story&cid=75&e=2&u=/nf/20040405/tc_nf/23603).

Kaarel

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-06 Thread Shridhar Daithankar
Heiko Kehlenbrink wrote:
Hmm... I would suggest if you are testing, you should try 7.4.2. 7.4 has
some
good optimisation for hash agregates though I am not sure if it apply to
averaging.
would be the last option till we are runing other applications on that 7.2
system
I can understand..

Also try forcing a seq. scan by turning off index scan. I guess index scan
for
so many rows is not exactly good thing even if tuple size if pretty small.
a sequential scann gives me the following result:

[EMAIL PROTECTED]:~> time psql -d test -c 'explain analyse select
avg(dist) from massive2 where dist > 100*sqrt(3.0)::float8 and dist <
150*sqrt(3.0)::float8 ;'
NOTICE:  QUERY PLAN:
Aggregate  (cost=1193714.43..1193714.43 rows=1 width=8) (actual
time=166718.54..166718.54 rows=1 loops=1)
  ->  Seq Scan on massive2  (cost=0.00..1192478.00 rows=494573 width=8)
(actual time=3233.22..165576.40 rows=49 loops=1)
Total runtime: 166733.73 msec
Certainly bad and not an option.. I can't think of anything offhand to speed 
this up..

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-06 Thread Shridhar Daithankar
Heiko Kehlenbrink wrote:

[EMAIL PROTECTED]:~> psql -d test -c 'explain analyse select avg(dist)
from massive2 where dist > (100*sqrt(3.0))::float8 and dist <
(150*sqrt(3.0))::float8;'
NOTICE:  QUERY PLAN:
Aggregate  (cost=14884.61..14884.61 rows=1 width=8) (actual
time=3133.24..3133.24 rows=1 loops=1)
  ->  Index Scan using massive2_dist on massive2  (cost=0.00..13648.17
rows=494573 width=8) (actual time=0.11..2061.38 rows=49 loops=1)
Total runtime: 3133.79 msec
EXPLAIN

seems to me that most time was needed for the index scanning...
Hmm... I would suggest if you are testing, you should try 7.4.2. 7.4 has some 
good optimisation for hash agregates though I am not sure if it apply to averaging.

Also try forcing a seq. scan by turning off index scan. I guess index scan for 
so many rows is not exactly good thing even if tuple size if pretty small.

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-05 Thread Matthew T. O'Connor
Heiko Kehlenbrink wrote:

i use suse 8.1
 postgresql 7.2 compiled from the rpms for using postgis, but that is
 

Try v7.4, there are many performance improvements.  It may not make up 
all the differences but it should help.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-05 Thread Tom Lane
"Heiko Kehlenbrink" <[EMAIL PROTECTED]> writes:
> i use suse 8.1
>   postgresql 7.2 compiled from the rpms for using postgis, but that is
> anothe story...

7.4 might be a little quicker; but in any case you should be doing this
sort of comparison using the current release, no?

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-05 Thread Shridhar Daithankar
Heiko Kehlenbrink wrote:

hi list,

i want to convince people to use postgresql instead of ms-sql server, so i
set up a kind of comparission insert data / select data from postgresql /
ms-sql server
the table i use was pretty basic,

id   bigserial
dist  float8
x float8
y float8
z float8
i filled the table with a function which filled x,y,z with incremental
increasing values (1,2,3,4,5,6...) and computing from that the dist value
for every tupel (sqrt((x*x)+(y*y)+(z*z))).
this works fine for both dbms

postgresql needs 13:37 min for 10.000.000 tupel,
ms-sql needs 1:01:27 h for 10.000.000 tupel.
so far so good.

i attached an index on the dist row and started to query the dbs with
scripts which select a serial row of 100.000,200.000,500.000 tupels based
on the dist row.
i randomizly compute the start and the end distance and made a "select
avg(dist) from table where dist > startdist and dist < enddist"
Some basics to check quickly.

1. vacuum analyze the table before you start selecting.
2. for slow running queries, check explain analyze output and find out who takes 
maximum time.
3. Check for typecasting. You need to typecast the query correctly e.g.

select avg(dist) from table where dist >startdist::float8 and dist

This might still end up with sequential scan depending upon the plan. but if 
index scan is picked up, it might be plenty fast..

Post explain analyze for the queries if things don't improve.

 HTH

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] performance comparission postgresql/ms-sql server

2004-04-05 Thread Heiko Kehlenbrink
hi list,

i want to convince people to use postgresql instead of ms-sql server, so i
set up a kind of comparission insert data / select data from postgresql /
ms-sql server

the table i use was pretty basic,

id   bigserial
dist  float8
x float8
y float8
z float8

i filled the table with a function which filled x,y,z with incremental
increasing values (1,2,3,4,5,6...) and computing from that the dist value
for every tupel (sqrt((x*x)+(y*y)+(z*z))).

this works fine for both dbms

postgresql needs 13:37 min for 10.000.000 tupel,
ms-sql needs 1:01:27 h for 10.000.000 tupel.

so far so good.

i attached an index on the dist row and started to query the dbs with
scripts which select a serial row of 100.000,200.000,500.000 tupels based
on the dist row.
i randomizly compute the start and the end distance and made a "select
avg(dist) from table where dist > startdist and dist < enddist"

Did the same with a table with 50.000.000 tupel in ms-sql and postgres.

the outcome so far:

100.000 from 50.000.000:

postgres: 0.88 sec
ms-sql: 0.38 sec

200.000 from 50.000.000:

postgres: 1.57 sec
ms-sql: 0.54 sec

500.000 from 50.000.000:

postgres: 3.66 sec
ms-sql: 1.18 sec

i try a lot of changes to the postgresql.conf regarding  "Tuning
PostgreSQL for performance"
by
Shridhar Daithankar, Josh Berkus

which did not make a big diffrence to the answering times from postgresql.

i'm pretty fine with the insert time...

do you have any hints like compiler-flags and so on to get the answering
time from postgresql equal to ms-sql?

(btw both dbms were running on exactly the same hardware)

i use suse 8.1
  postgresql 7.2 compiled from the rpms for using postgis, but that is
anothe story...
  1.5 gig ram
  1.8 mhz intel cpu


every help welcome

best regards heiko




---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-05 Thread Josh Berkus
Heiko,

> 100.000 from 50.000.000:
>
> postgres: 0.88 sec
> ms-sql: 0.38 sec
>
> 200.000 from 50.000.000:
>
> postgres: 1.57 sec
> ms-sql: 0.54 sec
>
> 500.000 from 50.000.000:
>
> postgres: 3.66 sec
> ms-sql: 1.18 sec

Questions:

1. Is this the time to return *all rows* or just the first row?   Given the 
different way that PostgreSQL fetches rows to the client from MSSQL, it makes 
a difference.

2. What are your sort-mem and shared-mem settings?

3. Have you tried clustering the table?

4. Have you done a comparison of selecting random or scattered, instead of 
serial rows?   MSSQL has a tendency to physically store rows in "order" which 
gives it a certain advantage in this kind of query.


-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings