[PERFORM] how postgresql request the computer resources

2005-10-27 Thread Sidar López Cruz
Is there something that tells postgres to take the resorces from computer 
(RAM, HDD, SWAP on linux) as it need, not modifying variables on 
postgresql.conf and other operating system things?


A days ago i am trying to show that postgres is better than mssql but when 
execute a simple query like:


(1)
select count(*) from
(
select archivo from fotos
except
select archivo from archivos
) x;
Aggregate  (cost=182162.83..182162.84 rows=1 width=0) (actual 
time=133974.495..133974.498 rows=1 loops=1)
 -  Subquery Scan x  (cost=173857.98..181830.63 rows=132878 width=0) 
(actual time=109148.158..15.279 rows=169672 loops=1)
   -  SetOp Except  (cost=173857.98..180501.86 rows=132878 width=58) 
(actual time=109148.144..132094.382 rows=169672 loops=1)
 -  Sort  (cost=173857.98..177179.92 rows=1328775 width=58) 
(actual time=109147.656..113870.975 rows=1328775 loops=1)

   Sort Key: archivo
   -  Append  (cost=0.00..38710.50 rows=1328775 width=58) 
(actual time=27.062..29891.075 rows=1328775 loops=1)
 -  Subquery Scan *SELECT* 1  
(cost=0.00..17515.62 rows=523431 width=58) (actual time=27.052..9560.719 
rows=523431 loops=1)
   -  Seq Scan on fotos  (cost=0.00..12281.31 
rows=523431 width=58) (actual time=27.038..5390.238 rows=523431 loops=1)
 -  Subquery Scan *SELECT* 2  
(cost=0.00..21194.88 rows=805344 width=58) (actual time=10.803..12117.788 
rows=805344 loops=1)
   -  Seq Scan on archivos  
(cost=0.00..13141.44 rows=805344 width=58) (actual time=10.784..5420.164 
rows=805344 loops=1)

Total runtime: 134552.325 ms


(2)
select count(*) from fotos where archivo not in (select archivo from 
archivos)
Aggregate  (cost=29398.98..29398.99 rows=1 width=0) (actual 
time=26660.565..26660.569 rows=1 loops=1)
 -  Seq Scan on fotos  (cost=15154.80..28744.69 rows=261716 width=0) 
(actual time=13930.060..25859.340 rows=169799 loops=1)

   Filter: (NOT (hashed subplan))
   SubPlan
 -  Seq Scan on archivos  (cost=0.00..13141.44 rows=805344 
width=58) (actual time=0.319..5647.043 rows=805344 loops=1)

Total runtime: 26747.236 ms


(3)
select count(1) from fotos f where not exists (select a.archivo from 
archivos a where a.archivo=f.archivo)
Aggregate  (cost=1761354.08..1761354.09 rows=1 width=0) (actual 
time=89765.384..89765.387 rows=1 loops=1)
 -  Seq Scan on fotos f  (cost=0.00..1760699.79 rows=261716 width=0) 
(actual time=75.556..0.234 rows=169799 loops=1)

   Filter: (NOT (subplan))
   SubPlan
 -  Index Scan using archivos_archivo_idx on archivos a  
(cost=0.00..13451.40 rows=4027 width=58) (actual time=0.147..0.147 rows=1 
loops=523431)

   Index Cond: ((archivo)::text = ($0)::text)
Total runtime: 89765.714 ms



(4)
SELECT count(*)
FROM fotos f
LEFT JOIN archivos a USING(archivo)
WHERE a.archivo IS NULL
Aggregate  (cost=31798758.40..31798758.41 rows=1 width=0) (actual 
time=114267.337..114267.341 rows=1 loops=1)
 -  Merge Left Join  (cost=154143.73..31772412.02 rows=10538550 width=0) 
(actual time=85472.696..113392.399 rows=169799 loops=1)

   Merge Cond: (outer.?column2? = inner.?column2?)
   Filter: (inner.archivo IS NULL)
   -  Sort  (cost=62001.08..63309.66 rows=523431 width=58) (actual 
time=38018.343..39998.201 rows=523431 loops=1)

 Sort Key: (f.archivo)::text
 -  Seq Scan on fotos f  (cost=0.00..12281.31 rows=523431 
width=58) (actual time=0.158..4904.410 rows=523431 loops=1)
   -  Sort  (cost=92142.65..94156.01 rows=805344 width=58) (actual 
time=47453.790..50811.216 rows=805701 loops=1)

 Sort Key: (a.archivo)::text
 -  Seq Scan on archivos a  (cost=0.00..13141.44 rows=805344 
width=58) (actual time=0.206..7160.148 rows=805344 loops=1)

Total runtime: 114893.116 ms




WITH ANY OF THIS QUERIES MSSQL TAKES NOT MUCH OF 7 SECONDS


PLEASE HELP ME

_
Consigue aquí las mejores y mas recientes ofertas de trabajo en América 
Latina y USA: http://latam.msn.com/empleos/



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] how postgresql request the computer resources

2005-10-27 Thread Richard Huxton

Sidar López Cruz wrote:
Is there something that tells postgres to take the resorces from 
computer (RAM, HDD, SWAP on linux) as it need, not modifying variables 
on postgresql.conf and other operating system things?


Ah, and how is it to know what to share with other processes?

A days ago i am trying to show that postgres is better than mssql but 
when execute a simple query like:


(1)
select count(*) from
Total runtime: 134552.325 ms

(2)
select count(*) from fotos where archivo not in (select archivo from 
Total runtime: 26747.236 ms


(3)
select count(1) from fotos f where not exists (select a.archivo from 
Total runtime: 89765.714 ms


(4)
SELECT count(*)
Total runtime: 114893.116 ms



WITH ANY OF THIS QUERIES MSSQL TAKES NOT MUCH OF 7 SECONDS


In which case they make a bad choice for showing PostgreSQL is faster 
than MSSQL. Is this the only query you have, or are others giving you 
problems too?


I think count(*) is about the weakest point in PG, but I don't think 
there'll be a general solution available soon. As I'm sure someone has 
mentioned, whatever else, PG needs to check the row for its visibility 
information.


From the start of your email, you seem to suspect your configuration 
needs some work. Once you are happy that your settings in general are 
good, you can override some by issuing set statements before your query. 
For example:

SET work_mem = 1;
might well improve example #2 where you had a hash.

--
  Richard Huxton
  Archonet Ltd


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] how postgresql request the computer resources

2005-10-27 Thread Michael Best

Richard Huxton wrote:

WITH ANY OF THIS QUERIES MSSQL TAKES NOT MUCH OF 7 SECONDS



In which case they make a bad choice for showing PostgreSQL is faster 
than MSSQL. Is this the only query you have, or are others giving you 
problems too?


I think count(*) is about the weakest point in PG, but I don't think 
there'll be a general solution available soon. As I'm sure someone has 
mentioned, whatever else, PG needs to check the row for its visibility 
information.


 From the start of your email, you seem to suspect your configuration 
needs some work. Once you are happy that your settings in general are 
good, you can override some by issuing set statements before your query. 
For example:

SET work_mem = 1;
might well improve example #2 where you had a hash.

--
  Richard Huxton
  Archonet Ltd


Someone had suggested keeping a vector table with +1 and -1 for row 
insertion and deletion and then running a cron to sum the vectors and 
update a table so that you could select from that table to get the row 
count.  Perhaps some sort of SUM() on a column function.


Since this seems like a reasonable approach (or perhaps there may be yet 
another better mechanism), cannot someone add this sort of functionality 
to Postgresql to do behind the scenes?


-Mike

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] how postgresql request the computer resources

2005-10-27 Thread Jim C. Nasby
On Thu, Oct 27, 2005 at 03:58:55PM -0600, Michael Best wrote:
 Richard Huxton wrote:
 WITH ANY OF THIS QUERIES MSSQL TAKES NOT MUCH OF 7 SECONDS
 
 
 In which case they make a bad choice for showing PostgreSQL is faster 
 than MSSQL. Is this the only query you have, or are others giving you 
 problems too?
 
 I think count(*) is about the weakest point in PG, but I don't think 
 there'll be a general solution available soon. As I'm sure someone has 
 mentioned, whatever else, PG needs to check the row for its visibility 
 information.
 
  From the start of your email, you seem to suspect your configuration 
 needs some work. Once you are happy that your settings in general are 
 good, you can override some by issuing set statements before your query. 
 For example:
 SET work_mem = 1;
 might well improve example #2 where you had a hash.
 
 -- 
   Richard Huxton
   Archonet Ltd
 
 Someone had suggested keeping a vector table with +1 and -1 for row 
 insertion and deletion and then running a cron to sum the vectors and 
 update a table so that you could select from that table to get the row 
 count.  Perhaps some sort of SUM() on a column function.
 
 Since this seems like a reasonable approach (or perhaps there may be yet 
 another better mechanism), cannot someone add this sort of functionality 
 to Postgresql to do behind the scenes?

There's all kinds of things that could be added; the issue is
ascertaining what the performance trade-offs are (there's no such thing
as a free lunch) and if the additional code complexity is worth it.

Note that your suggestion probably wouldn't work in this case because
the user isn't doing a simple SELECT count(*) FROM table;. I'd bet that
MSSQL is using index covering to answer his queries so quickly,
something that currently just isn't possible with PostgreSQL. But if you
search the -hackers archives, you'll find a discussion on adding limited
heap tuple visibility information to indexes. That would allow for
partial index covering in many cases, which would probably be a huge win
for the queries the user was asking about.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster