Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-12 Thread Merlin Moncure
KC wrote:
 
 So I guess it all comes back to the basic question:
 
 For the query:
   select distinct on (PlayerID) * from Player a
   where PlayerID = '0' order by PlayerID desc, AtDate desc;
 can the optimizer recognise that the query is selecting by the primary
 key (PlayerID, AtDate), so it can skip the remaining rows for that
 PlayerID, as if LIMIT 1 were implied?
 
 Best regards, KC.

Hi KC, have you tried:
select * from player where playerid = '0' and atdate > 99
order by playerid desc, atdate desc limit 1;

??
Merlin



Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-12 Thread K C Lau

Dear Merlin and all,

That direct SQL returns in 0 ms. The problem only appears when a view is used.

What we've done to work around this problem is to modify the table to add a
field DataStatus, which is set to 1 for the latest record for each player
and reset to 0 when it is superseded.


A partial index is then created as:
CREATE INDEX IDX_CurPlayer on Player (PlayerID) where DataStatus = 1;

The VCurPlayer view is changed to:
CREATE or REPLACE VIEW VCurPlayer as select * from Player where DataStatus = 1;
and it now returns in 0 ms.
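
The superseding logic amounts to something like this (a minimal sketch;
the other columns are omitted, and the player ID and timestamp values are
illustrative):

BEGIN;
-- reset the player's previous current record
UPDATE Player SET DataStatus = 0
 WHERE PlayerID = '0' AND DataStatus = 1;
-- insert the new current record with DataStatus = 1
INSERT INTO Player (PlayerID, AtDate, DataStatus)
 VALUES ('0', current_timestamp, 1);
COMMIT;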

This is not the ideal solution, but until the original problem is fixed (if
ever), we have not found a better workaround.


The good news is that even with the additional overhead of maintaining an
extra index and the problem of vacuuming, pg 8.0.3 still performs
significantly faster on Windows than MS SQL Server 2000 in our OLTP
application testing so far.


Thanks to all for your help.

Best regards,
KC.

At 20:14 05/10/12, you wrote:

KC wrote:

 So I guess it all comes back to the basic question:

 For the query:
   select distinct on (PlayerID) * from Player a
   where PlayerID = '0' order by PlayerID desc, AtDate desc;
 can the optimizer recognise that the query is selecting by the primary
 key (PlayerID, AtDate), so it can skip the remaining rows for that
 PlayerID, as if LIMIT 1 were implied?

 Best regards, KC.

Hi KC, have you tried:
select * from player where playerid = '0' and atdate > 99
order by playerid desc, atdate desc limit 1;

??
Merlin





[PERFORM] Help tuning postgres

2005-10-12 Thread Csaba Nagy
Hi all,

After a long time of reading the general list it's time to subscribe to
this one...

We have adapted our application (originally written for Oracle) to
Postgres, and switched part of our business to a Postgres database.

The main tables hold around 150 million rows; the whole data set took
~30 GB after the initial migration. After about a month of usage it
bloated to ~100 GB. We installed autovacuum after about two weeks.

The main table is heavily updated during the active periods of usage, which
come in bursts.

Now Oracle on the same hardware has no problem handling the load, but
Postgres slows to a crawl. Examining the pg_stat_activity view, I see the
updates on the main table as the biggest problem; they are very slow. The
table has a few indexes on it, and I wonder if they too are updated on
every UPDATE, even though the indexed fields do not change. In any case,
I can't explain why the updates are so much slower on Postgres.
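
One quick check for whether the table or its indexes are bloating is their
on-disk page counts (a rough sketch; relpages is only refreshed by VACUUM
or ANALYZE, so the figures may lag):

-- largest relations, tables ('r') and indexes ('i'), by pages on disk
SELECT relname, relkind, relpages
  FROM pg_class
 ORDER BY relpages DESC
 LIMIT 10;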

Sorry for being a bit fuzzy; I spent quite some time figuring out what I
could do, and now I have to give up and ask for help.

The machine running the DB is a debian linux, details:

$ cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 11
model name  : Intel(R) Pentium(R) III CPU family  1266MHz
stepping: 1
cpu MHz : 1263.122
cache size  : 512 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 2
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips: 2490.36
 
processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model   : 11
model name  : Intel(R) Pentium(R) III CPU family  1266MHz
stepping: 1
cpu MHz : 1263.122
cache size  : 512 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 2
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips: 2514.94
 

$ uname -a
Linux *** 2.6.12.3 #1 SMP Tue Oct 11 13:13:00 CEST 2005 i686 GNU/Linux


$ cat /proc/meminfo
MemTotal:  4091012 kB
MemFree:118072 kB
Buffers: 18464 kB
Cached:3393436 kB
SwapCached:  0 kB
Active: 947508 kB
Inactive:  2875644 kB
HighTotal: 3211264 kB
HighFree:  868 kB
LowTotal:   879748 kB
LowFree:117204 kB
SwapTotal:   0 kB
SwapFree:0 kB
Dirty:   13252 kB
Writeback:   0 kB
Mapped: 829300 kB
Slab:64632 kB
CommitLimit:   2045504 kB
Committed_AS:  1148064 kB
PageTables:  75916 kB
VmallocTotal:   114680 kB
VmallocUsed:96 kB
VmallocChunk:   114568 kB


The disk used for the data is an external RAID array; I don't know much
about it right now, except that I think it is some relatively fast IDE
hardware. In any case the operations should be cache friendly; we don't
scan over and over the big tables...

The postgres server configuration is attached.

I have looked at the Postgres statistics tables, and it looks like most of
the needed data is always cached: for the most accessed tables the
disk-read/buffer-hit ratio is mostly something like 1/100, or at least 1/30.
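
For reference, those figures come from something like the following
(pg_statio_user_tables tracks disk reads vs. buffer hits per table):

-- blocks read from disk vs. found in the buffer cache, per table
SELECT relname, heap_blks_read, heap_blks_hit
  FROM pg_statio_user_tables
 ORDER BY heap_blks_hit DESC;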


Is anything in the attached config badly wrong for the given machine, and
what else should I investigate further? If I can't make this fly, the
obvious solution will be to move back to Oracle, cash out the license, and
forget about Postgres forever...

TIA,
Csaba.

# -
# PostgreSQL configuration file
# -
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have 
# to SIGHUP the postmaster for the changes to take effect, or use 
# pg_ctl reload. Some settings, such as listen_addresses, require
# a postmaster shutdown and restart to take effect.


#---
# FILE LOCATIONS
#---

# The default values of these variables 

Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Emil Briggs
 Hi all,

 After a long time of reading the general list it's time to subscribe to
 this one...

 We have adapted our application (originally written for Oracle) to
 Postgres, and switched part of our business to a Postgres database.

 The main tables hold around 150 million rows; the whole data set took
 ~30 GB after the initial migration. After about a month of usage it
 bloated to ~100 GB. We installed autovacuum after about two weeks.


Have you tried reindexing your active tables?
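
A single table can be rebuilt with (table name illustrative):

-- rebuilds all indexes on the table; writes to it block until it finishes
REINDEX TABLE big_table;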

Emil



Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Csaba Nagy
[snip]
 Have you tried reindexing your active tables?
 
Not yet; the DB is in production use and I would have to plan downtime for
that... or does it not impact activity on the table?

 Emil
 


Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Emil Briggs
 [snip]

  Have you tried reindexing your active tables?

 Not yet; the DB is in production use and I would have to plan downtime for
 that... or does it not impact activity on the table?


It will cause some performance hit while you are doing it. It sounds like
something is bloating rapidly on your system, and the indexes are one
possible place where that could be happening.



Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Merlin Moncure
 The disk used for the data is an external RAID array; I don't know much
 about it right now, except that I think it is some relatively fast IDE
 hardware. In any case the operations should be cache friendly; we don't
 scan over and over the big tables...

Maybe you are I/O bound.  Do you know if your RAID array is caching your
writes?  An easy way to check is to run with fsync off and look for obvious
performance differences.  Playing with the sync method could also help here.
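
For the test, something like this in postgresql.conf on a test box
(fsync = off is unsafe for production, since a crash can corrupt data):

fsync = false              # test only: removes the sync wait on commit
#wal_sync_method = fsync   # alternatives: fdatasync, open_sync,
                           # open_datasync; the default varies by platform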

Merlin




Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Tom Lane
Emil Briggs [EMAIL PROTECTED] writes:
 Not yet; the DB is in production use and I would have to plan downtime for
 that... or does it not impact activity on the table?

 It will cause some performance hit while you are doing it.

It'll also lock out writes on the table until the index is rebuilt,
so he does need to schedule downtime.

regards, tom lane



Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Steve Poe

Would it not be faster to do a dump/reload of the table than a reindex, or
is it about the same?

Steve Poe

On Wed, 2005-10-12 at 13:21 -0400, Tom Lane wrote:
 Emil Briggs [EMAIL PROTECTED] writes:
  Not yet; the DB is in production use and I would have to plan downtime
  for that... or does it not impact activity on the table?
 
  It will cause some performance hit while you are doing it.
 
 It'll also lock out writes on the table until the index is rebuilt,
 so he does need to schedule downtime.
 
   regards, tom lane
 


Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Merlin Moncure
 
 Would it not be faster to do a dump/reload of the table than a reindex,
 or is it about the same?
 
reindex is probably faster, but that's not the point. you can reindex a
running system, whereas dump/restore requires downtime unless you work
everything into a transaction, which is a headache, and dangerous.

reindex locking is very granular, in that it only acquires an exclusive
lock on one index at a time, and while it holds it reading is still
possible (writes will wait).

in 8.1 we get a fire-and-forget "reindex database xyz" which is about as
good as it gets without a dump/load or full vacuum.
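
i.e. something like:

-- 8.1: rebuild every index in the database in one command
REINDEX DATABASE xyz;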

Merlin



Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Andrew Sullivan
On Wed, Oct 12, 2005 at 06:55:30PM +0200, Csaba Nagy wrote:
 Ok, that was the first thing I've done, checking out the explain of the
 query. I don't really need the analyze part, as the plan is going for
 the index, which is the right decision. The updates are simple one-row

How do you know?  You _do_ need the ANALYSE, because it'll tell you
what the query _actually did_ as opposed to what the planner thought
it was going to do.  

Note that EXPLAIN ANALYSE actually performs the work, so you had better
do it in a transaction and ROLLBACK if it's a production system.
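
For example (table and column names are illustrative):

BEGIN;
EXPLAIN ANALYSE UPDATE big_table SET some_col = 0 WHERE id = 42;
ROLLBACK;  -- the UPDATE really ran, so undo its effects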

 Actually I've done an iostat run in the meantime (just learned how to use
 it), and it looks like the disk is 100% saturated. So it clearly is a disk
 issue in this case. And it turns out the Oracle hardware has an

Yes, but it could be a disk issue because you're doing more work than
you need to.  If your UPDATEs are chasing down a lot of dead tuples,
for instance, you'll peg your I/O even though you ought to have I/O
to burn.
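
One way to see how much dead-tuple baggage a table is carrying (table name
illustrative):

-- reports live and dead (removable/nonremovable) row versions per table
VACUUM VERBOSE big_table;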

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell
