Re: [PERFORM] Major differences between oracle and postgres performance

2004-06-18 Thread pginfo




Hi,
I have a similar problem, and found that the cause is the pg sort.
It is extremely slow in my case.

In my case, too, I was migrating a db from Oracle to pg.

To work around this I dynamically set sort_mem to some big value.
The sort then runs in RAM and is relatively fast.
You can try this; remember that sort_mem is per sort, not per connection.
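For example, something like this inside the session (a sketch; the value
is in KB and only an illustration):

    -- raise sort_mem for this connection only, run the sort-heavy
    -- query, then drop back to the server default
    SET sort_mem = 131072;
    SELECT DISTINCT version FROM vers WHERE version IS NOT NULL;
    RESET sort_mem;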

In my migration I found that the only remaining advantage of Oracle was its very good sort.

regards,
ivan.

Gary Cowell wrote:

  --- [EMAIL PROTECTED] wrote:

  You can roughly estimate the time spent just scanning the table using
  something like this:

	select sum(version) from ... where version is not null

  and just

	select sum(version) from ...

  The results would be interesting to compare.

To answer (I hope) everyone's questions at once:

1) Oracle and postmaster were not running at the same time.
2) The queries were run once, to cache as much as possible, then run
   again to get the timing.

3) Distinct vs. no distinct (i.e. sort performance).

select length(version) from vers where version is not null;

Time: 9748.174 ms

select distinct(version) from vers where version is not null;

Time: 67988.972 ms

So about an extra 60 seconds with the distinct on.

Here is the explain analyze output from psql:

# explain analyze select distinct version from vers where version is not null;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=117865.77..120574.48 rows=142 width=132) (actual time=63623.428..68269.111 rows=536 loops=1)
   ->  Sort  (cost=117865.77..119220.13 rows=541741 width=132) (actual time=63623.417..66127.641 rows=541741 loops=1)
         Sort Key: "version"
         ->  Seq Scan on vers  (cost=0.00..21367.41 rows=541741 width=132) (actual time=0.218..7214.903 rows=541741 loops=1)
               Filter: ("version" IS NOT NULL)
 Total runtime: 68324.215 ms
(6 rows)

Time: 68326.062 ms


And the non-default .conf parameters:

tcpip_socket = true
max_connections = 100
password_encryption = true
shared_buffers = 2000
sort_mem = 16384 
vacuum_mem = 8192
effective_cache_size = 4000
syslog = 2 

PostgreSQL version is 7.4.3, compiled with GCC 3.3.2 on sun4u architecture.

Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread pginfo
Hi,

Dual Xeon P4 2.8
linux RedHat AS 3
kernel 2.4.21-4-EL-smp
2 GB ram

I can see the same problem:

procs                      memory      swap          io     system          cpu
 r  b   swpd   free   buff   cache   si   so    bi    bo   in     cs us sy id wa
 1  0      0  96212  61056 1720240    0    0     0     0  101     11 25  0 75  0
 1  0      0  96212  61056 1720240    0    0     0     0  108    139 25  0 75  0
 1  0      0  96212  61056 1720240    0    0     0     0  104    173 25  0 75  0
 1  0      0  96212  61056 1720240    0    0     0     0  102     11 25  0 75  0
 1  0      0  96212  61056 1720240    0    0     0     0  101     11 25  0 75  0
 2  0      0  96204  61056 1720240    0    0     0     0  110  53866 31  4 65  0
 2  0      0  96204  61056 1720240    0    0     0     0  101  83176 41  5 54  0
 2  0      0  96204  61056 1720240    0    0     0     0  102  86050 39  6 55  0
 2  0      0  96204  61056 1720240    0    0     0    49  113  73642 41  5 54  0
 2  0      0  96204  61056 1720240    0    0     0     0  102  84211 40  5 55  0
 2  0      0  96204  61056 1720240    0    0     0     0  101 105165 39  7 54  0
 2  0      0  96204  61056 1720240    0    0     0     0  103  97754 38  6 56  0
 2  0      0  96204  61056 1720240    0    0     0     0  103 113668 36  7 57  0
 2  0      0  96204  61056 1720240    0    0     0     0  103 112003 37  7 56  0

regards,
ivan.




Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already

2004-04-15 Thread pginfo
Hi,
I am not sure, but I remember the same problem.
It was on a 7.3.x version, and I needed to reindex the table.

I think that since 7.4 vacuum also handles the indexes correctly,
but I am not sure.
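If you are still on 7.3.x, a rebuild may be worth a try. A sketch, with a
stand-in table name:

    -- rebuild all indexes on the bloated table
    REINDEX TABLE mytable;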

regards,
ivan.

Rajesh Kumar Mallah wrote:

 Hi,

 The problem was solved by reloading the table.
 The query now takes only 3 seconds. But that is
 not a solution.

 The problem is that such a phenomenon obscures the
 judgement we use in optimising queries and the database.

 If a query runs slowly, we really can't tell whether it is a problem
 with the query itself, the hardware, or dead rows.

 I already did vacuum full on the table, but it still did not
 have that effect on performance.
 In fact the last figures were after doing a vacuum full.

 Can there be a more elegant solution to this problem?

 Regds
 Mallah.

 Richard Huxton wrote:

  On Thursday 15 April 2004 08:10, Rajesh Kumar Mallah wrote:

   The problem is that I want to know whether I need a hardware upgrade
   at the moment.

   E.g. I have another table rfis which contains ~0.6 million records.

   SELECT count(*) from rfis where sender_uid > 0;

   Time: 117560.635 ms

   That is approximately 4804 records per second. Is it acceptable
   performance on the hardware below:

   RAM: 2 GB
   DISKS: ultra160, 10 K, 18 GB
   Processor: 2 * 2.0 GHz Xeon

  Hmm - doesn't seem good, does it? If you run it again, is it much faster
  (since the data should be cached then)? What does vmstat 10 show while
  you're running the query?

  One thing you should have done is read the performance tuning guide at:
    http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
  The default values are very conservative, and you will need to change them.

   What kind of upgrades should be put on the server for it to become
   reasonably fast?

  If you've only got one disk, then a second disk for OS/logging. Difficult to
  say more without knowing numbers of users/activity etc.



[PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi,

I am running pg 7.4.1 on a linux box.
I have a middle-sized DB with many updates, and after them I try to run
vacuum full analyze.
It takes about 2 h.
If I dump and reload the DB instead, it takes 20 min.

How can I improve the vacuum full analyze time?

My configuration:

shared_buffers = 15000      # min 16, at least max_connections*2, 8KB each
sort_mem = 1                # min 64, size in KB
vacuum_mem = 32000          # min 1024, size in KB
effective_cache_size = 4    # typically 8KB each
#max_fsm_pages = 2          # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000   # min 100, ~50 bytes each


regards,
ivan.




Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi Bill,
I am vacuuming every 24 h; I have a cron script for it.
But if I make a massive update (for example, one affecting 1 M rows) and then
start vacuum, it takes those 2 h.
Also I will note that this massive update runs in one transaction (I cannot
update 100 K rows and start vacuum after it).
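If the massive update touches only a few tables, vacuuming just those
right after the transaction, instead of the whole DB, may already help.
A sketch, with a stand-in table name:

    VACUUM FULL ANALYZE big_table;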

regards,
ivan.

Bill Moran wrote:

 pginfo wrote:
  Hi,
 
  I am running pg 7.4.1 on a linux box.
  I have a middle-sized DB with many updates, and after them I try to run
  vacuum full analyze.
  It takes about 2 h.
  If I dump and reload the DB instead, it takes 20 min.
 
  How can I improve the vacuum full analyze time?

 How often are you vacuuming?  If you've gone a LONG time since the last vacuum,
 it can take quite a while, to the point where a dump/restore is faster.

 A recent realization that I'd had some misconceptions about vacuuming led me
 to re-read section 8.2 of the admin guide (on vacuuming) ... I highly suggest
 a review of these 3 pages of the admin manual, as they contain an excellent
 description of why databases need vacuuming, which one can use to determine
 how often vacuuming is necessary.

 --
 Bill Moran
 Potential Technologies
 http://www.potentialtech.com






Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi,

scott.marlowe wrote:

 On Wed, 24 Mar 2004, pginfo wrote:

  Hi,
 
  I am running pg 7.4.1 on a linux box.
  I have a middle-sized DB with many updates, and after them I try to run
  vacuum full analyze.

 Is there a reason to not use just regular vacuum / analyze (i.e. NOT
 full)?


Yes: when I make massive updates (only in my case, of course), for example
2 M rows, I do not expect to get 2 M new rows in the next 180 days. That is
the reason for running vacuum full.
My idea was to free the unneeded space and so have a faster system.
It is possible that I am wrong.


  It takes about 2 h.

 Full vacuums, by their nature, tend to be a bit slow.  It's better to let
 the database achieve a kind of steady state with regards to number of
 dead tuples, and use regular vacuums to reclaim said space rather than a
 full vacuum.

  How can I improve the vacuum full analyze time?
 
  My configuration:
 
  shared_buffers = 15000  # min 16, at least max_connections*2, 8KB each
  sort_mem = 1            # min 64, size in KB

 You might want to look at dropping sort_mem.  It would appear you've been
 going through the postgresql.conf file and bumping up numbers to see what
 works and what doesn't.  While most of the settings aren't too dangerous
 to crank up a little high, sort_mem is quite dangerous to crank up high,
 should you have a lot of people connected who are all sorting.  Note that
 sort_mem is a limit PER SORT, not per backend, or per database, or per
 user, or even per table, but per sort.  If a query needs to run three or
 four sorts, it can use 3 or 4x sort_mem.  If a hundred users do this at
 once, they can then use 300 or 400x sort_mem.  You can see where I'm
 heading.

 Note that for individual sorts in batch files, like import processes, you
 can bump up sort_mem with the set command, so you don't have to have a
 large setting in postgresql.conf to use a lot of sort mem when you need
 to, you can just grab it during that one session.


I know. In my case we are using many IDs declared as varchar/name (I know it
is a bad idea, but we are migrating this system from Oracle), and pg has very
poor performance with varchar/name indexes.
The only solution I found was to increase the sort_mem.
But I will try to decrease it and see the result.

  vacuum_mem = 32000  # min 1024, size in KB

 If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg
 range and see what happens.


I will try it today. It is a good idea, and I hope it will help.
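Something like this per session, if I understand the suggestion right
(the value is in KB and only an illustration):

    -- give this one vacuum run a lot of memory, then drop back
    SET vacuum_mem = 256000;
    VACUUM FULL ANALYZE;
    RESET vacuum_mem;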

 For a good tuning guide, go here:

 http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

  I know it. It is the best one I have found, and so is the site.

Thanks for the help.
ivan.




Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo


scott.marlowe wrote:

 On Wed, 24 Mar 2004, pginfo wrote:

  Hi,
 
  scott.marlowe wrote:
 
   On Wed, 24 Mar 2004, pginfo wrote:
  
Hi,
   
I am running pg 7.4.1 on a linux box.
I have a middle-sized DB with many updates, and after them I try to run
vacuum full analyze.
  
   Is there a reason to not use just regular vacuum / analyze (i.e. NOT
   full)?
  
 
  Yes: when I make massive updates (only in my case, of course), for example
  2 M rows, I do not expect to get 2 M new rows in the next 180 days. That is
  the reason for running vacuum full.
  My idea was to free the unneeded space and so have a faster system.
  It is possible that I am wrong.

 It's all about percentages.  If you've got an average of 5% dead tuples
 with regular vacuuming, then full vacuums won't gain you much, if
 anything.  If you've got 20 dead tuples for each live one, then a full
 vacuum is pretty much a necessity.  The generally accepted best
 performance comes with 5 to 50% or so dead tuples.  Keep in mind, having a
 few dead tuples is actually a good thing, as your database won't grow and
 shrink the file all the time, but will keep a steady size.

Thanks for the good analysis.
ivan.





Re: [PERFORM] Trigger performance

2004-01-22 Thread pginfo
Hi,

thanks for the answer.
It is very interesting, because I have read many times that a trigger
written in C will work faster.
In which case will this trigger work faster if I write it in C?
All my triggers contain a select ..., an insert into mytable select ...,
or an update mytable set ... where ...
I need this info because I have a table with ~1.5 M rows, and if I start to
update 300 K of these rows it takes ~2 h.
If I remove the trigger from this table, the total time is ~1 min.

regards,
ivan.

Tom Lane wrote:

 pginfo [EMAIL PROTECTED] writes:
  I was surprised that the pgsql trigger takes ~8 sec. to insert these rows
  and the C trigger takes ~17 sec.

 The reason is that plpgsql caches the plan for the invoked SELECT,
 whereas the way you coded the C function, it's re-planning that SELECT
 on every call.

 regards, tom lane



Re: [PERFORM] Trigger performance

2004-01-22 Thread pginfo
Ok, thanks.
I will do it.

regards,
ivan.

Tom Lane wrote:

 pginfo [EMAIL PROTECTED] writes:
  In which case will this trigger work faster if I write it in C?

 Given that the dominant part of the time will be spent down inside SPI
 in either case, I doubt you will be able to see much difference.  You
 need to think about how to optimize the invoked query, not waste your
 time recoding the wrapper around it.

 regards, tom lane
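A sketch of what I will try, with stand-in names: run the statement from
the trigger body by hand, with representative values, and look at its
plan:

    BEGIN;
    EXPLAIN ANALYZE
    UPDATE mytable SET amount = amount + 1 WHERE id = '42';
    ROLLBACK;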






[PERFORM] Trigger question

2004-01-15 Thread pginfo
Hi,

I am using pg 7.4.1 and have created a trigger on a table with 3 M rows.

If I start a massive update on this table, pg executes this trigger on
every row and dramatically slows the system down.

Is there any way in pg to define the trigger to execute only if some
fields have changed?

For example, I am able to declare this in Oracle.

My trigger is written in PL/pgSQL.
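The closest I have found is to check the fields inside the trigger
function and return early, roughly like this (a sketch with stand-in
names, not my real trigger):

    CREATE OR REPLACE FUNCTION my_trigger_fn() RETURNS trigger AS '
    BEGIN
        -- do the expensive work only when the interesting field changed
        IF NEW.price IS DISTINCT FROM OLD.price THEN
            UPDATE mysummary SET total = total + (NEW.price - OLD.price)
                WHERE id = NEW.group_id;
        END IF;
        RETURN NEW;
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER my_trigger BEFORE UPDATE ON mytable
        FOR EACH ROW EXECUTE PROCEDURE my_trigger_fn();

But the trigger itself is still fired for every row, which is the
overhead I would like to avoid.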

regards,
ivan.

