Re: [PERFORM] [ADMIN] Index not used. WHY?

2003-12-05 Thread Andrei Bintintan
There are around 700 rows in this table.
If I set enable_seqscan=off then the index is used, and I have also run
VACUUM ANALYZE recently.

I find it strange because the values of id_user and id_modull have roughly
the same distribution, yet when I search the table the id_user index is
used but the id_modull index is not.

Does PostgreSQL somehow know that a seq scan runs faster in this case than an
index scan? Should I drop this index?
I have to say that the data in this table is not changed often, but there
are a LOT of joins made with this table.

Best regards.
Andy.


- Original Message -
From: "Stephan Szabo" <[EMAIL PROTECTED]>
To: "Andrei Bintintan" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Thursday, December 04, 2003 5:19 PM
Subject: Re: [ADMIN] Index not used. WHY?


>
> On Thu, 4 Dec 2003, Andrei Bintintan wrote:
>
> > Hi,
> >
> > I have the following table:
> > CREATE TABLE public.rights (
> > id int4 DEFAULT nextval('"rights_id_seq"'::text) NOT NULL,
> > id_user int4 NOT NULL,
> > id_modull int4 NOT NULL,
> > CONSTRAINT rights_pkey PRIMARY KEY (id)
> > )
> >
> > and I created the following indexes:
> >
> > CREATE INDEX right_id_modull_idx ON rights USING btree (id_modull);
> > CREATE INDEX right_id_user_idx ON rights USING btree (id_user);
> >
> > Now the problem:
> >
> > EXPLAIN SELECT * FROM rights r WHERE r.id_modull = 15
> > returns:
> > Seq Scan on rights r (cost=0.00..12.30 rows=42 width=12)
> > Filter: (id_modull = 15)
> >
> > EXPLAIN SELECT * FROM rights r WHERE r.id_user = 15
> > returns:
> > Index Scan using right_id_user_idx on rights r (cost=0.00..8.35 rows=11 width=12)
> > Index Cond: (id_user = 15)
> >
> > Question: Why is right_id_modull_idx NOT USED for the first query,
> > while the second query uses the right_id_user_idx index?
>
> As a note, pgsql-performance is a better list for these questions.
>
> So, standard questions:
>
> How many rows are in the table, what does EXPLAIN ANALYZE show for the
> queries, if you force index usage (set enable_seqscan=off) on the first
> what does EXPLAIN ANALYZE show then, have you used ANALYZE/VACUUM ANALYZE
> recently?
>




Re: [PERFORM] [ADMIN] Index not used. WHY?

2003-12-05 Thread Shridhar Daithankar
Andrei Bintintan wrote:

> There are around 700 rows in this table.
> If I set enable_seqscan=off then the index is used, and I have also run
> VACUUM ANALYZE recently.

For 700 rows I think a seq scan would work best.

> I find it strange because the values of id_user and id_modull have
> roughly the same distribution, yet when I search the table the id_user
> index is used but the id_modull index is not.
>
> Does PostgreSQL somehow know that a seq scan runs faster in this case
> than an index scan? Should I drop this index?
> I have to say that the data in this table is not changed often, but there
> are a LOT of joins made with this table.

If the table is cached then it does not matter. Unless it grows substantially, say
to around a hundred thousand rows (note your table is small), the index wouldn't be
that useful.
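
A quick way to see the numbers the planner is working from (a sketch in psql,
using the table from the original post):

    -- Size of the table in 8K pages, and the planner's row estimate.
    SELECT relpages, reltuples FROM pg_class WHERE relname = 'rights';

    -- Compare the estimated costs with and without the seq-scan option.
    EXPLAIN SELECT * FROM rights r WHERE r.id_modull = 15;
    SET enable_seqscan = off;
    EXPLAIN SELECT * FROM rights r WHERE r.id_modull = 15;
    SET enable_seqscan = on;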

 Shridhar



Re: [PERFORM] tuning questions

2003-12-05 Thread Thierry Missimilly


Jack Coates wrote:

>
> latest changes:
> shared_buffers = 35642
> max_fsm_relations = 1000
> max_fsm_pages = 1
> wal_buffers = 64
> sort_mem = 32768
> vacuum_mem = 32768
> effective_cache_size = 1
>
> /proc/sys/kernel/shmmax = 5
>
> IO is active, but hardly saturated. CPU load is hefty though, load
> average is at 4 now.
>
> procs                      memory      swap          io     system         cpu
>  r  b  w   swpd   free   buff    cache  si  so    bi    bo   in    cs  us  sy  id
>  0  2  1   2808  11436  39616  1902988   0   0   240   896  765   469   2  11  87
>  0  2  1   2808  11432  39616  1902988   0   0   244   848  768   540   4   3  93
>  0  2  1   2808  11432  39616  1902984   0   0   204   876  788   507   3   4  93
>  0  2  1   2808  11432  39616  1902984   0   0   360   416  715   495   4   1  96
>  0  2  1   2808  11432  39616  1902984   0   0   376   328  689   441   2   1  97
>  0  2  0   2808  11428  39616  1902976   0   0   464   360  705   479   2   1  97
>  0  2  1   2808  11428  39616  1902976   0   0   432   380  718   547   3   1  97
>  0  2  1   2808  11428  39616  1902972   0   0   440   372  742   512   1   3  96
>  0  2  1   2808  11428  39616  1902972   0   0   416   364  711   504   3   1  96
>  0  2  1   2808  11424  39616  1902972   0   0   456   492  743   592   2   1  97
>  0  2  1   2808  11424  39616  1902972   0   0   440   352  707   494   2   1  97
>  0  2  1   2808  11424  39616  1902972   0   0   456   360  709   494   2   2  97
>  0  2  1   2808  11436  39616  1902968   0   0   536   516  807   708   3   2  94
>

Hi Jack,

As shown by vmstat, your operating system is spending 96% of its time idle. On
RedHat 8.0 IA32, "idle" covers both true idle and wait on I/O.
In your case, I think it is mostly wait on I/O, as you are working on a 2.8 GB DB
with only 2 GB RAM, but it should be around 30%.
Your performance will increase only if user CPU increases; otherwise, for example
if your system swaps, only system CPU will increase and your application will stay
slow.

You can check your I/O in more detail with: iostat 3 1000, and verify that the max
tps are on the database filesystem.

So, all the Postgres tuning you have tried does not change a lot, as the bottleneck
is your I/O throughput.
But one thing you can check is which parts of Postgres need a lot of I/O.
To do that, after shutting down PG, move your database to another disk (the OS
disk?), for example /mypg/data, and create a symbolic link pointing $PGDATA/base
to /mypg/data/.

Restart PG, and while you execute your application, check with iostat which disk
has the max tps. I bet it is the disk where the WAL is logged.

One more thing about I/O: for an IDE disk, the maximum number of write blocks plus
read blocks per second is about 10,000, assuming an I/O block size of 1 KB. That
means 10 MB/s. If you need more, you can try striped SCSI disks or a RAID0 disk
subsystem.

Thierry Missimilly

>
> --
> Jack Coates, Lyris Technologies Applications Engineer
> 510-549-4350 x148, [EMAIL PROTECTED]
> "Interoperability is the keyword, uniformity is a dead end."
> --Olivier Fourdan
>



Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Richard Huxton
On Friday 05 December 2003 02:07, Ivar Zarans wrote:
> I have played around with explain and explain analyze and noticed one
> interesting oddity:
[snip]
> Why does the first example, where recid is given as a numeric constant, use a
> sequential scan, while the second example, where recid is given as a string
> constant, uses an index scan, as expected? The third example shows that the
> numeric constant must be typecast in order to function properly.
>
> Is this normal behaviour of fields with bigint type?

As Christopher says, normal (albeit irritating). Not sure it applies here - 
all the examples you've shown me are using the index.

Well - I must admit I'm stumped. Unless you have a *lot* of indexes and 
foreign keys to check, I can't see why it would take so long to update a 
single row. Can you post the schema for the table?
-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Ivar Zarans
On Fri, Dec 05, 2003 at 10:08:20AM +0000, Richard Huxton wrote:

> > numeric constant must be typecasted in order to function properly.
> >
> > Is this normal behaviour of fields with bigint type?
> 
> As Christopher says, normal (albeit irritating). Not sure it applies here - 
> all the examples you've shown me are using the index.

I guess I have solved this mystery. The problem appears to be exactly this -
numeric constant representation in the query.

I am using PyPgSQL for PostgreSQL access and making update queries like this:

qry = "UPDATE table1 SET status = %s WHERE recid = %s"
cursor.execute(qry, status, recid)

The execute method of the cursor object is supposed to merge the "status" and
"recid" values into "qry", using proper quoting. When I started to play
around with debug information I noticed that this query used a sequential
scan for "recid". Then I also noticed that the query sent to the server looked
like this:
"UPDATE table1 SET status = 'SKIP' WHERE recid = 199901"

Sure enough, when I used psql and EXPLAIN on this query, I got a query
plan with a sequential scan. Using the recid value as a string or as a typecast
integer gave correct results with an index scan. I wrote about this in my
previous message.
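
For reference, the difference shows up directly under EXPLAIN - a sketch of the
three variants, using the query above (recid is the bigint column):

    -- Bare numeric literal: typed int4, so the index on recid is skipped.
    EXPLAIN UPDATE table1 SET status = 'SKIP' WHERE recid = 199901;

    -- Quoted literal or explicit cast: the index on recid is used.
    EXPLAIN UPDATE table1 SET status = 'SKIP' WHERE recid = '199901';
    EXPLAIN UPDATE table1 SET status = 'SKIP' WHERE recid = 199901::int8;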

It seems that PyPgSQL query quoting is not aware of this performance
problem (to which Christopher referred), and the final query sent to the server
is correct SQL, but not correct considering PostgreSQL bugs.

One more explanation - previously I posted some logs showing a correct
query using an index scan, but still taking 0.29 seconds. The reason for this
delay is the logging itself - it generates enough IO traffic to have an impact
on query speed. With logging disabled, this query takes around 0.0022
seconds, which is perfectly normal.

Finally - what would be the correct solution to this problem? Upgrading to
7.5 CVS is not an option :) One possibility is not to use PyPgSQL
variable substitution and to create every query "by hand" - not a very nice
solution, since variable substitution and quoting are quite convenient.

A second (and better) possibility is to ask the PyPgSQL developers to take care
of PostgreSQL oddities.

Any other suggestions?

-- 
Ivar




Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Shridhar Daithankar
Ivar Zarans wrote:
> It seems that PyPgSQL query quoting is not aware of this performance
> problem (to which Christopher referred) and the final query, sent to the server,
> is correct SQL, but not correct, considering PostgreSQL bugs.

Personally I don't consider it a bug, but anyway... you are the one facing the
problem, so I understand.

> Finally - what would be the correct solution to this problem? Upgrading to
> 7.5 CVS is not an option :) One possibility is not to use PyPgSQL
> variable substitution and create every query "by hand" - not a very nice
> solution, since variable substitution and quoting are quite convenient.
> A second (and better) possibility is to ask the PyPgSQL developers to take care
> of PostgreSQL oddities.
> Any other suggestions?

I know zero in Python, but just guessing...

Will the following help?

qry = "UPDATE table1 SET status = %s WHERE recid = '%s'"
cursor.execute(qry, status, recid)

Just a thought...

 Shridhar



Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Ivar Zarans
On Fri, Dec 05, 2003 at 06:19:46PM +0530, Shridhar Daithankar wrote:

> > is correct SQL, but not correct, considering PostgreSQL bugs.
>
> Personally I don't consider it a bug, but anyway... you are the one facing
> the problem, so I understand.

Well, if this is not a bug, then what is the consideration behind this
behaviour? BTW, according to Christopher it is fixed in 7.5 CVS.
Why fix it if it is not a bug? :))

One more question - is this "feature" related only to "bigint" fields,
or are other datatypes affected as well?

> Will following help?
> 
> qry = "UPDATE table1 SET status = %s WHERE recid = '%s'"
> cursor.execute(qry, status, recid)

Yes, this helps. But then it sort of obsoletes PyPgSQL's own quoting
logic. I would prefer to take care of this all by myself or trust some
underlying code to do this for me. And PyPgSQL is quite nice - it
checks the datatype and acts accordingly.

-- 
Ivar




Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Richard Huxton
On Friday 05 December 2003 12:49, Shridhar Daithankar wrote:
> Ivar Zarans wrote:
> > It seems that PyPgSQL query quoting is not aware of this performance
> > problem (to which Christopher referred) and the final query, sent to the server,
> > is correct SQL, but not correct, considering PostgreSQL bugs.

>
> Will following help?
>
> qry = "UPDATE table1 SET status = %s WHERE recid = '%s'"
> cursor.execute(qry, status, recid)

Better IMHO would be: "UPDATE table1 SET status = %s WHERE recid = %s::int8"

PG is very strict regarding types - normally a good thing, but it can hit you
unexpectedly in this scenario. The reason is that the literal number is
treated as int4, whereas quoted it is marked as type unknown. Unknown gets
cast to int8, whereas int4 gets left as-is. If you want to know why int4
doesn't get promoted to int8 automatically, browse the hackers list for the
last couple of years.
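
Another way to pin down the parameter type, from 7.3 on, is a prepared statement
that declares the parameter as int8 up front (a sketch, reusing the names from the
examples above; it assumes status is a text column):

    PREPARE upd_status (text, int8) AS
        UPDATE table1 SET status = $1 WHERE recid = $2;

    EXECUTE upd_status ('SKIP', 199901);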

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Shridhar Daithankar
Ivar Zarans wrote:
On Fri, Dec 05, 2003 at 06:19:46PM +0530, Shridhar Daithankar wrote:


>>> is correct SQL, but not correct, considering PostgreSQL bugs.
>>
>> Personally I don't consider it a bug, but anyway... you are the one facing
>> the problem, so I understand.
>
> Well, if this is not a bug, then what is the consideration behind this
> behaviour? BTW, according to Christopher it is fixed in 7.5 CVS.
> Why fix it if it is not a bug? :))
This is not a bug. It is just that people find it confusing when the postgresql
planner considers seemingly identical types as different, e.g. treating int8 as
different from int4. The obvious thinking is that they should be the same. But
given postgresql's flexibility with CREATE TYPE, it is difficult to promote types
automatically.

AFAIK, the fix in CVS is to make indexes work with seemingly compatible
types. That does not change the fact that postgresql cannot upgrade data types
on its own.

Write good queries which adhere to strict data typing. It is better for
understanding anyway.

> One more question - is this "feature" related only to "bigint" fields,
> or are other datatypes affected as well?

Every data type is affected. int2 will not use an int4 index, and so on.

>> Will the following help?
>>
>> qry = "UPDATE table1 SET status = %s WHERE recid = '%s'"
>> cursor.execute(qry, status, recid)
>
> Yes, this helps. But then it sort of obsoletes PyPgSQL's own quoting
> logic. I would prefer to take care of this all by myself or trust some
> underlying code to do this for me. And PyPgSQL is quite nice - it
> checks the datatype and acts accordingly.

Well, then pypgsql should be upgraded to query the pg catalogs to find the exact
type of the column. But that would be too cumbersome, I guess.
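
For what it's worth, finding a column's exact type from the catalogs is only a
couple of joins (a sketch; "table1" as in the examples above):

    SELECT a.attname, t.typname
      FROM pg_class c, pg_attribute a, pg_type t
     WHERE c.relname = 'table1'
       AND a.attrelid = c.oid
       AND a.atttypid = t.oid
       AND a.attnum > 0;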

 Shridhar



Re: [PERFORM] [ADMIN] Index not used. WHY?

2003-12-05 Thread Stephan Szabo
On Fri, 5 Dec 2003, Andrei Bintintan wrote:

> There are around 700 rows in this table.
> If I set enable_seqscan=off then the index is used, and I have also run
> VACUUM ANALYZE recently.
>
> I find it strange because the values of id_user and id_modull have
> roughly the same distribution, yet when I search the table the id_user
> index is used but the id_modull index is not.

It was guessing that one would return 11 rows and the other 42, which is
why one used the index and the other wouldn't.  If those numbers aren't
realistic, you may want to raise the statistics target for the columns
(see ALTER TABLE) and re-run analyze.

> Does PostgreSQL somehow know that a seq scan runs faster in this case
> than an index scan? Should I drop this index?

It's making an educated guess.  When you're doing an index scan, it needs
to read through the index and then get matching rows from the table.
However, because those reads from the table are in a potentially random
order, there's usually a higher cost associated with those reads than if
the table was read in order (barring cases where you know your database
should always stay cached in disk cache, etc...).  If there are say 50 pages
in the entire table, a sequential scan does 50 sequential page reads and
checks all those tuples.  If you're getting say 42 rows through an
index, you're first reading through the index, and then getting N pages
in a random order from the table, where N depends on the distribution of
values throughout the table.  There's a variable in the configuration,
random_page_cost, which controls the ratio of cost between a sequential
read and a random one (defaulting to 4).
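
Concretely, both knobs can be tried from psql (a sketch using the table from the
original post; the statistics target of 100 is an arbitrary choice):

    -- Gather finer-grained statistics for the column, then re-analyze.
    ALTER TABLE rights ALTER COLUMN id_modull SET STATISTICS 100;
    ANALYZE rights;

    -- Try a lower random-vs-sequential cost ratio (default 4) and see
    -- whether the plan and the actual times change.
    SET random_page_cost = 2;
    EXPLAIN ANALYZE SELECT * FROM rights r WHERE r.id_modull = 15;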




Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Ivar Zarans
On Fri, Dec 05, 2003 at 01:23:43PM +0000, Richard Huxton wrote:

> Better IMHO would be: "UPDATE table1 SET status = %s WHERE recid = %s::int8"

Thanks for the hint!

> unexpectedly in this scenario. The reason is that the literal number is 
> treated as int4, whereas quoted it is marked as type unknown. Unknown gets
> cast to int8, whereas int4 gets left as-is.

This explains a lot. Thanks!
BTW, is this mentioned somewhere in the PostgreSQL documentation? I can't
remember anything on this subject. Maybe I just somehow skipped it...

-- 
Ivar




Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Ivar Zarans
On Fri, Dec 05, 2003 at 07:21:38PM +0530, Shridhar Daithankar wrote:

> planner considers seemingly identical types as different, e.g. treating int8 as
> different from int4. The obvious thinking is that they should be the same. But
> given postgresql's flexibility with CREATE TYPE, it is difficult to promote
> types automatically.

OK, this makes sense and explains a lot. Thanks!

> Well, then pypgsql should be upgraded to query the pg catalogs to find the
> exact type of the column. But that would be too cumbersome, I guess.

Yes, so it seems. Time to rewrite my queries :)
Thanks again for help and explanations!

-- 
Ivar




Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Mike C. Fletcher
I just spent 2 days tracking this error down in my own code, actually.
What I wound up doing is having the two places where I generate the
queries (everything in my system goes through those two points, as I'm
using a middleware layer) check values used as identifying fields for
the presence of a bigint, and if one exists, replace it with a wrapper
that does the coerced-string representation:

    class Wrap:
        def __init__( self, value ):
            self.value = value
        def __str__( self ):
            return "'%s'::bigint" % (self.value,)
        __repr__ = __str__

    value = Wrap(value)
Just doing that for the indexing/identifying values ATM.  pyPgSQL will
fall back to using simple repr for the object (rather than raising an
error as it would if you were using a formatted string), but will
otherwise treat it as a regular value for quoting and the like, so no
other modifications to the code are required.

By no means an elegant fix, but since your post (well, the resulting
thread) managed to solve my problem, I figured I should at least tell
everyone thanks and how I worked around the problem.  You wouldn't want
this kind of hack down at the pyPgSQL level, I would think, as it's
DB-version specific.  I suppose you could alter the __repr__ of the
PgInt8 class/type to always use the string or coerced form, but it seems
wrong to me.  I'm actually hesitant to include it in our own middleware
layer, but oh well, it does seem to be necessary for even somewhat
reasonable performance.

BTW, my case was a largish (88,000 record) table with a non-unique
bigint key; EXPLAIN on an UPDATE shows a sequential scan, while with
'int'::bigint it goes to an index scan.  Using pyPgSQL as the interface
to 7.3.4 and 7.3.3.

Enjoy,
Mike
Ivar Zarans wrote:

> On Fri, Dec 05, 2003 at 10:08:20AM +0000, Richard Huxton wrote:

...

> I am using PyPgSQL for PostgreSQL access and making update queries like this:

...

> It seems that PyPgSQL query quoting is not aware of this performance
> problem (to which Christopher referred) and the final query, sent to the server,
> is correct SQL, but not correct, considering PostgreSQL bugs.

...

> Finally - what would be the correct solution to this problem? Upgrading to
> 7.5 CVS is not an option :) One possibility is not to use PyPgSQL
> variable substitution and create every query "by hand" - not a very nice
> solution, since variable substitution and quoting are quite convenient.
> A second (and better) possibility is to ask the PyPgSQL developers to take care
> of PostgreSQL oddities.
> Any other suggestions?

___
 Mike C. Fletcher
 Designer, VR Plumber, Coder
 http://members.rogers.com/mcfletch/




Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Mike C. Fletcher
Ivar Zarans wrote:

> On Fri, Dec 05, 2003 at 01:23:43PM +0000, Richard Huxton wrote:
>
>> Better IMHO would be: "UPDATE table1 SET status = %s WHERE recid = %s::int8"
>
> Thanks for the hint!

Which makes the wrapper class need:

    def __str__( self ):
        return "%s::int8" % (self.value,)
Enjoy,
Mike
___
 Mike C. Fletcher
 Designer, VR Plumber, Coder
 http://members.rogers.com/mcfletch/




Re: [PERFORM] tuning questions

2003-12-05 Thread Josh Berkus
Jack,

> The frustrating thing is, we also have a UP P3-500 with 512M RAM and two
> IDE drives with the same PG install which is doing okay with this load
> -- still half the speed of MS-SQL2K, but usable. I'm at a loss.

Overall, I'm really getting the feeling that this procedure was optimized for 
Oracle and/or MSSQL and is hitting some things that aren't such a good idea 
for PostgreSQL.   I highly suggest that you try using log_duration and 
log_statement (and in 7.4 log_min_duration_statement) to try to locate which 
particular statements are taking the longest.
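
For reference, the equivalent per-session settings look something like this (a
sketch; these are superuser-only, the same names work in postgresql.conf, and
log_min_duration_statement exists only from 7.4 on - the 1000 ms threshold is
made up):

    SET log_statement = true;                 -- log every statement as it runs
    SET log_duration = true;                  -- log each statement's elapsed time
    SET log_min_duration_statement = 1000;    -- 7.4: log only statements over 1000 ms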

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-05 Thread Greg Stark
Ivar Zarans <[EMAIL PROTECTED]> writes:

> > qry = "UPDATE table1 SET status = %s WHERE recid = '%s'"
> > cursor.execute(qry, status, recid)
> 
> Yes, this helps. But then it sort of obsoletes PyPgSQL-s own quoting
> logic. I would prefer to take care of this all by myself or trust some
> underlying code to do this for me. And PyPgSQL is quite nice - it
> checks datatype and acts accordingly.

You should tell the PyPgSQL folk to use the new binary protocol for parameters
so that there are no quoting issues at all.

But if it's going to interpolate strings into the query then pyPgSQL really
ought to be doing '%s' as above even for numbers. This lets postgres decide
what the optimal datatype is based on what you're comparing it to. Skipping
the quotes will only cause headaches. 
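
That behaviour is easy to see in plain SQL (a sketch; the literal values are
made up):

    -- A bare numeric literal is typed int4 immediately:
    SELECT 1 WHERE 199901::int8 = 199901;     -- int8 = int4 comparison
    -- A quoted literal starts as "unknown" and adapts to the other side:
    SELECT 1 WHERE 199901::int8 = '199901';   -- '199901' resolves to int8
    SELECT 1 WHERE 15::int2 = '15';           -- '15' resolves to int2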

-- 
greg




[PERFORM] Slightly OT -- Xeon versus Opteron Comparison

2003-12-05 Thread William Yu
Ace's Hardware has put together a fairly comprehensive comparison
between Xeon & Opteron platforms running server apps. Unfortunately,
there are only MySQL "data mining" benchmarks, as the review crew doesn't
have that much experience with OLTP-type systems, but I'm gonna try to
convince them to add the OSDL DB benchmarks, assuming they work fairly
well with Postgres.

Read up the goodies here:

http://www.aceshardware.com/read.jsp?id=6275



[PERFORM] query using cpu nearly 100%, why?

2003-12-05 Thread LIANHE SHAO
Hello,
I use PHP as a front-end to query our database. When I use System Monitor to check
the usage of CPU and memory, I noticed that the CPU very easily gets up to 100%.
Is that normal? If not, could someone point out possible reasons?

I am using Linux 7.3, pgsql 7.3.4, 1G memory and a 2GHz CPU.

Regards,
William




[PERFORM] Help tracking down problem with inserts slowing down...

2003-12-05 Thread Steve Wampler

I need some help tracking down a sudden, massive slowdown
in inserts in one of our databases.

PG: 7.2.3  (RedHat 8.0)

Background.  We currently run nearly identical systems
at two sites: Site A is a 'lab' site used for development,
Site B is a production site.

The databases in question have identical structure:

  A simple table with 4 columns with a trigger function
 on inserts (which checks to see if the entry already
 exists, and if so, changes the insert into an update...)
  A simple view with 4 columns into the above table.

All access is through jdbc (JDK 1.3.1, jdbc 7.1-1.3),
postgresql.conf's are identical.

The two sites were performing at comparable speeds until
a few days ago, when we deleted several million records
from each database and then did a vacuum full; analyze
on both.  Now inserts at Site B are several orders of
magnitude slower than at Site A.  The odd thing is that
Site B's DB now has only 60,000 records while Site A's is
up around 3 million.  Inserts at A average 63ms, inserts
at B are now up at 4.5 seconds!

EXPLAIN doesn't show any difference between the two.

Can someone suggest ways to track this down?  I don't know
much about postgresql internals/configuration.

Thanks!
Steve
-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.



Re: [PERFORM] Help tracking down problem with inserts slowing down...

2003-12-05 Thread Neil Conway
Steve Wampler <[EMAIL PROTECTED]> writes:
> PG: 7.2.3  (RedHat 8.0)

You're using PG 7.2.3 with the PG 7.1 JDBC driver; FWIW, upgrading to
newer software is highly recommended.

> The two sites were performing at comparable speeds until a few days
> ago, when we deleted several million records from each database and
> then did a vacuum full; analyze on both.  Now inserts at Site B are
> several orders of magnitude slower than at Site A.

Two thoughts:

(1) Can you confirm that the VACUUM FULL on site B actually
removed all the tuples you intended it to remove? Concurrent
transactions can limit the amount of data that VACUUM FULL is
able to reclaim. If you run contrib/pgstattuple (or compare
the database's disk consumption with the number of live rows
in it), you should be able to tell.

(2) Look at the EXPLAIN for the SELECTs generated by the ON INSERT
trigger -- is there any difference between site A and B?
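
Both checks can be run from psql (a sketch; "items" and "id" stand in for the
real table and column names, and pgstattuple must first be loaded from contrib):

    -- (1) Physical size vs. live rows, per pg_class:
    SELECT relpages, reltuples FROM pg_class WHERE relname = 'items';
    -- Finer-grained dead-tuple stats, if contrib/pgstattuple is installed:
    SELECT pgstattuple('items');

    -- (2) The SELECT the insert trigger runs, as the planner sees it
    -- (hypothetical column and value):
    EXPLAIN SELECT * FROM items WHERE id = 42;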

-Neil




Re: [PERFORM] tuning questions

2003-12-05 Thread Jack Coates
On Fri, 2003-12-05 at 09:26, Josh Berkus wrote:
> Jack,
> 
> > The frustrating thing is, we also have a UP P3-500 with 512M RAM and two
> > IDE drives with the same PG install which is doing okay with this load
> > -- still half the speed of MS-SQL2K, but usable. I'm at a loss.
> 
> Overall, I'm really getting the feeling that this procedure was optimized for 
> Oracle and/or MSSQL and is hitting some things that aren't such a good idea 
> for PostgreSQL.   I highly suggest that you try using log_duration and 
> log_statement (and in 7.4 log_min_duration_statement) to try to locate which 
> particular statements are taking the longest.

I'll definitely buy that as round two of optimization, but round one is
still "it's faster on the slower server."

hdparm -I is identical between the boxes, filesystem structure layout is
identical, disk organization isn't identical, but far worse: the UP low
ram box has PG on /dev/hdb, ew. Predictably, vmstat shows low numbers...
but steady numbers.

dev is the box which goes fast, and I was wrong, it's actually a 2GHz
P4. rufus is the box which goes slow. During the big fetch:
dev bi sits around 2000 blocks for twenty seconds while bo is around 50
blocks, then bo jumps to 800 or so while the data is returned, then
we're done.

rufus bi starts at 16000 blocks, then drops steadily while bo climbs.
After a minute or so, bi stabilizes at 4096 blocks, then bo bursts to
return the data. Then the next fetch starts, and it's bi of 500, bo of
300 for several minutes.

These observations certainly all point to Eric and Thierry's
recommendations to better organize the filesystem and get faster disks..
except that the dev box gets acceptable performance.

So, I've dug into postgresql.conf on dev and rufus, and here's what I
found:

RUFUS

how much RAM do you have?
75% of that, converted to 8K pages, for effective_cache_size
15% of that or 512M, whichever is larger, converted to 8K pages, for shared_buffers
15% of that, converted to 8K pages, for vacuum_mem
how many messages will you send between vacuums?
divide that by 2 and divide by 6 for max_fsm_pages


DEV

how much RAM do you have?
48% of that, converted to 8K pages, for effective_cache_size
6.5% of that or 512M, whichever is larger, converted to 8K pages, for shared_buffers
52% of that, converted to 8K pages, for vacuum_mem

max_fsm_pages untouched on this box.

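As a worked example of the RUFUS recipe, assuming 2 GB of RAM: 75% of 2 GB is
1.5 GB, roughly 196,000 8K pages for effective_cache_size; 15% of 2 GB is about
300 MB, so shared_buffers gets the larger value of 512 MB, about 65,536 pages.
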
I adjusted rufus's configuration to match those percentages, but left
max_fsm_pages dialed up to 50. Now Rufus's vmstat shows much better
behavior: bi 12000 blocks gradually sloping down to 3000 during the big
select, bo steady until it's ready to return. As more jobs come in, we
see overlap areas where bi is 600-ish and bo is 200-ish, but they only
last a few tens of seconds.

The big selects are still a lot slower than they are on the smaller
database and overall performance is still unacceptable. Next I dialed
max_fsm_pages back down to 1 -- no change. Hm, maybe it's been too
long since the last vacuumdb --analyze, let's give it another.

hdparm -Tt shows that disk performance is crappo on rufus, half what it
is on dev -- and freaking dev is using 16 bit IO! This is a motherboard
IDE controller issue.

South Bridge:   VIA vt8233
Revision:   ISA 0x0 IDE 0x6

That's it, I'm throwing out this whole test series and starting over
with different hardware. Database server is now a dual 2GHz Xeon with
2GB RAM & 2940UW SCSI, OS and PG's logs on 36G drive, PG data on 9GB
drive. Data is importing now and I'll restart the tests tonight.
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
"Interoperability is the keyword, uniformity is a dead end."
--Olivier Fourdan





Re: [PERFORM] Help tracking down problem with inserts slowing down...

2003-12-05 Thread Robert Treat
On Friday 05 December 2003 16:51, Steve Wampler wrote:
> I need some help tracking down a sudden, massive slowdown
> in inserts in one of our databases.
>
> PG: 7.2.3  (RedHat 8.0)
>
> Background.  We currently run nearly identical systems
> at two sites: Site A is a 'lab' site used for development,
> Site B is a production site.
>
> The databases in question have identical structure:
>
>   A simple table with 4 columns with a trigger function
>  on inserts (which checks to see if the entry already
>  exists, and if so, changes the insert into an update...)
>   A simple view with 4 columns into the above table.
>
> All access is through jdbc (JDK 1.3.1, jdbc 7.1-1.3),
> postgresql.conf's are identical.
>
> The two sites were performing at comparable speeds until
> a few days ago, when we deleted several million records
> from each database and then did a vacuum full; analyze
> on both.  Now inserts at Site B are several orders of
> magnitude slower than at Site A.  The odd thing is that
> Site B's DB now has only 60,000 records while Site A's is
> up around 3 million.  Inserts at A average 63ms, inserts
> at B are now up at 4.5 seconds!
>
> EXPLAIN doesn't show any difference between the two.
>
> Can someone suggest ways to track this down?  I don't know
> much about postgresql internals/configuration.
>

What does explain analyze show for the insert query?

Are there FKs and/or indexes involved here? Did you reindex?
A VACUUM VERBOSE could give you a good indication of whether you need to reindex;
compare the # of pages in the index with the # in the table.
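
In SQL terms, something like this (a sketch; "items" and "items_pkey" are
stand-ins for the real table and index names):

    -- VACUUM VERBOSE reports page counts for the table and each of its indexes:
    VACUUM VERBOSE items;

    -- The same numbers straight from the catalog:
    SELECT relname, relpages FROM pg_class
    WHERE relname IN ('items', 'items_pkey');

    -- A badly bloated index can be rebuilt:
    REINDEX TABLE items;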

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
