Re: [NOVICE] [PERFORM] Extreme high load averages
Dennis Björklund wrote:
> On Sun, 6 Jul 2003, Martin Foster wrote:
>> The processor seems to be purposely sitting there twiddling its
>> thumbs. Which leads me to believe that perhaps the nice levels have
>> to be changed on the server itself?
>
> It could also be all the usual things that affect performance. Are
> your queries using indexes where they should? Do you vacuum analyze
> after you have updated/inserted a lot of data? It could be that some
> of your queries are not as efficient as they should be, like doing a
> sequential scan over a table instead of an index scan. That translates
> into more IO needed and slower response times. Especially when you
> have more connections fighting for the available IO.

I actually gained a bit more respect for PostgreSQL tonight. It seems that one of my scripts was not committing changes after maintenance was conducted, meaning that rows that would normally be removed after offline archiving completed were in fact still around.

Normally this table grows by about 50K rows during a day, is archived that night, and then loses the rows that are no longer needed. This process is what allowed MySQL to maintain any stability, as the size of this table can balloon significantly.

PostgreSQL, with tweaking, was handling a table with nearly 300K rows. That size alone would have dragged the MySQL system down to a near grind, and since most of those rows are not needed, one can imagine that queries were needlessly processing rows that should be outright ignored. This probably explains why row-numbering-based searches greatly accelerated the overall process.

By fixing the script and doing the appropriate full vacuum and re-index, the system is behaving much more like it should, even if the process may seem a bit odd to some. The reason for removing rows on a daily basis is the perishable nature of the information: since this is a chat site, posts over a day old are rarely needed for any reason.
Which is why they are archived into dumps, in case we really need to retrieve the information itself; this gives us the added bonus of smaller backup sizes and smaller database sizes.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
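[Editor's note: the nightly maintenance described above can be sketched as a short SQL script. Table and column names here are invented for illustration; the thread does not show the real schema. The point is that the DELETE must be committed before a vacuum can reclaim the dead rows, which is exactly the step the broken script was missing.]

```sql
-- Hypothetical nightly maintenance for a perishable chat-posts table.
-- "posts" and "posted_at" are placeholder names.
BEGIN;
DELETE FROM posts WHERE posted_at < CURRENT_DATE - 1;  -- perishable rows
COMMIT;  -- without this commit, the dead rows linger, as in the bug above
VACUUM FULL ANALYZE posts;   -- reclaim space and refresh statistics
REINDEX TABLE posts;         -- shrink indexes bloated by the churn
```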
[PERFORM] plpgsql vs. SQL performance (again)
About a month ago I asked the general list about plpgsql functions that occasionally significantly underperform their straight SQL equivalents. Tom noted that a different query plan was almost certainly being chosen by the plpgsql function:

http://archives.postgresql.org/pgsql-general/2003-05/msg00966.php
http://archives.postgresql.org/pgsql-general/2003-05/msg00998.php

Tom suggested checking for sloppy datatype declarations in the plpgsql functions. Double-checked, a-ok. Tom also suggested that indexscans might not get picked by the plpgsql function if I have some very skewed statistics.

Is there a way to verify the plpgsql function's planner choices? My casual observations are that this problem occurs with aggregates, and that the big performance hit is not consistent. I'd like advice on more formal troubleshooting. I can provide examples (my latest problem function is currently taking over 4 seconds vs. 0.04 seconds for its straight SQL equivalent), table schema, explain output for the straight SQL, etc., if anyone cares to work through this with me.

thanks,
michael
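[Editor's note: one way to approximate the plan a plpgsql function will use is to PREPARE the query with the same parameter types the function declares, since plpgsql runs its queries as prepared statements planned without seeing the actual parameter values. This is a sketch: the table and column names are invented, and it assumes a PostgreSQL version that supports EXPLAIN EXECUTE (7.3 or later).]

```sql
-- Mimic the generic plan plpgsql would use for a parameterized query.
-- "orders" and "customer_id" are hypothetical names.
PREPARE probe(integer) AS
    SELECT count(*) FROM orders WHERE customer_id = $1;
EXPLAIN EXECUTE probe(42);   -- plan chosen without a concrete constant

-- Compare with the plan chosen when the planner can see the value:
EXPLAIN SELECT count(*) FROM orders WHERE customer_id = 42;
```

A large gap between the two plans would point at the skewed-statistics problem Tom described.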
[PERFORM] Can you help me?
Hello all!

I'm new to PostgreSQL; I have never used it before. I am having an issue with configuring the postgresql.conf file. The machine itself is a 2.66GHz P4 with 2G of memory. Would you mind sending me an example postgresql.conf? Maybe you can tell me how to configure these parameters.

Thanks

Sincerely,
Chris.Wu
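[Editor's note: the thread does not include a sample answer, but a hedged starting point for a 2 GB machine of that era (PostgreSQL 7.3, 8 kB pages) might look like the fragment below. The values are illustrative guesses, not recommendations from this list, and would need benchmarking against the actual workload.]

```conf
# postgresql.conf -- illustrative starting values for a 2 GB machine
shared_buffers = 8192          # 64 MB of shared buffer cache
sort_mem = 8192                # 8 MB per sort, per backend
effective_cache_size = 131072  # ~1 GB of expected OS cache, in 8 kB pages
max_connections = 100
```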
Re: [PERFORM] Moving postgresql.conf tunables into 2003...
Sean Chittenden wrote:
> I looked through src/doc/runtime.sgml for a good place to stick this
> and couldn't find a place that seemed appropriate, but on FreeBSD,
> this can be determined with a great deal of precision in a
> programmatic manner:
>
>   echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"
>
> The same OID is available via C too. It'd be slick if PostgreSQL could
> tune itself (on FreeBSD) at initdb time with the above code. If Linux
> exports this info via /proc and can whip out the appropriate magic,
> even better. A good guess that shows up uncommented in postgresql.conf
> would be stellar and quite possible with the use of sed.
>
> Maybe an initdb switch could be added to have initdb tune the config
> it generates? If a -n is added, have it generate a config and toss it
> to stdout?
>
>   case `uname` in
>   "FreeBSD")
>       echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"
>       ;;
>   *)
>       echo "Unable to automatically determine the effective cache size" >> /dev/stderr
>       ;;
>   esac
>
> -sc

The simplest way may be to create an 'auto-tune' directory with scripts for configured platforms. When postgres installs the databases, it would check for 'tune.xxx' and, if found, use that to generate the configuration itself. This would allow for defaults on platforms that do not have such a script, and optimization for those that do.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] Moving postgresql.conf tunables into 2003...
> I don't have much to add because I'm pretty new to Postgres and have
> been soliciting advice here recently, but I totally agree with
> everything you said. I don't mind if it's in the postgres.conf file
> or in a faq that is easy to find, I just would like it to be in one
> place. A good example of the need for this is when I was tuning
> "effective_cache" I thought that was creating a cache for Postgres
> when in fact, as it was pointed out to me, it's just hinting to
> postgres the size of the OS cache. Lots of ways for people to get
> really confused here.

I looked through src/doc/runtime.sgml for a good place to stick this and couldn't find a place that seemed appropriate, but on FreeBSD, this can be determined with a great deal of precision in a programmatic manner:

  echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"

The same OID is available via C too. It'd be slick if PostgreSQL could tune itself (on FreeBSD) at initdb time with the above code. If Linux exports this info via /proc and can whip out the appropriate magic, even better. A good guess that shows up uncommented in postgresql.conf would be stellar and quite possible with the use of sed.

Maybe an initdb switch could be added to have initdb tune the config it generates? If a -n is added, have it generate a config and toss it to stdout?

  case `uname` in
  "FreeBSD")
      echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"
      ;;
  *)
      echo "Unable to automatically determine the effective cache size" >> /dev/stderr
      ;;
  esac

-sc

--
Sean Chittenden
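[Editor's note: the arithmetic behind Sean's one-liner can be seen in isolation by substituting a hard-coded byte count for the sysctl output. This sketch runs on any platform; on FreeBSD the sample value would instead come from `sysctl -n vfs.hibufspace`.]

```shell
# hibufspace -> effective_cache_size, with the sysctl output replaced by
# a sample value so the computation is visible anywhere.
# On FreeBSD: hibufspace=$(sysctl -n vfs.hibufspace)
hibufspace=209715200                 # sample: 200 MB of buffer space
pages=$(( hibufspace / 8192 ))       # effective_cache_size counts 8 kB pages
echo "effective_cache_size = $pages"
```

With the sample value this prints `effective_cache_size = 25600`, i.e. 25600 pages of 8 kB.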
Re: [PERFORM] Moving postgresql.conf tunables into 2003...
Scott Marlowe wrote:
> It would be nice to have a program that could run on any OS postgresql
> runs on and could report on the current limits of the kernel, and make
> recommendations for changes the admin might want to make. One could
> probably make a good stab at effective cache size during install.
> Anything reasonably close would probably help. Report what % of said
> resources could be consumed by postgresql under various
> circumstances...

One of the issues that automating the process would encounter is kernel limits that are too low for PostgreSQL to handle. The BSDs come to mind, where values need to be manually increased in the kernel before you can reach a reasonable maximum connection count.

Another example is how OpenBSD will outright crash when trying to test the database during install time. It seems that most of the tests fail because the maximum number of processes allowed is too low for the tests to succeed, while FreeBSD will work just fine on those same tests.

If PostgreSQL automates the configuration, that would be a plus. But it should also detect the platform and inform the person what changes should be made to the kernel, sysctl or whatever, in order for that configuration to run. Perl may be useful here for a few reasons: it is portable enough to run on multiple Unix variants, and its tools are fairly standard, so the code would require fewer considerations for more exotic implementations.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
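[Editor's note: the per-platform 'auto-tune' dispatch being proposed could be sketched as below. The tuning hints are invented placeholders; a real script would run each platform's probe (sysctl, /proc, etc.) rather than just printing advice.]

```shell
# Sketch of a per-platform auto-tune dispatch. Hints are illustrative.
tune_hint() {
    case "$1" in
        FreeBSD) echo 'effective_cache_size = $(($(sysctl -n vfs.hibufspace) / 8192))' ;;
        Linux)   echo 'derive effective_cache_size from Cached in /proc/meminfo' ;;
        *)       echo "no auto-tune rule for $1; edit postgresql.conf by hand" ;;
    esac
}

tune_hint "$(uname)"   # dispatch on the running platform
```

Unknown platforms fall through to the default branch, which matches the thread's point that the tool should at least tell the admin what it could not determine.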
Re: [PERFORM] Moving postgresql.conf tunables into 2003...
On Wed, 9 Jul 2003, Kaarel wrote:
> >> Are you willing to say that the PostgreSQL database system should
> >> only be used by DBAs? I believe that Postgres is such a good and
> >> useful tool that anyone should be able to start using it with
> >> little or no barrier to entry.
> >
> > I quite agree. But there is a difference between saying "you should
> > get decent performance with no effort" and "you should get optimal
> > performance with no effort". I think we can get to the first with
> > relatively little trouble (like boosting the default shared_buffers
> > to 1000), but the second is an impractical goal.
>
> Just wanted to repeat some of the thoughts already expressed. There is
> no reason why PostgreSQL shouldn't be reasonably well configured for a
> particular platform out of the box: not for maximum performance, but
> for good enough performance. The many complaints by new users about
> PostgreSQL being surprisingly slow, and all the standard answers
> (vacuum, pump up the memory settings), imho prove that the default
> installation can be improved. Already mentioned on the mailing lists:
> using multiple standard conf files, querying system info and
> dynamically generating all or some parts of the conf file, automating
> the vacuum process...

It would be nice to have a program that could run on any OS postgresql runs on and could report on the current limits of the kernel, and make recommendations for changes the admin might want to make. One could probably make a good stab at effective cache size during install. Anything reasonably close would probably help. Report what % of said resources could be consumed by postgresql under various circumstances...
Re: [PERFORM] Moving postgresql.conf tunables into 2003...
>> Are you willing to say that the PostgreSQL database system should
>> only be used by DBAs? I believe that Postgres is such a good and
>> useful tool that anyone should be able to start using it with little
>> or no barrier to entry.
>
> I quite agree. But there is a difference between saying "you should
> get decent performance with no effort" and "you should get optimal
> performance with no effort". I think we can get to the first with
> relatively little trouble (like boosting the default shared_buffers
> to 1000), but the second is an impractical goal.

Just wanted to repeat some of the thoughts already expressed. There is no reason why PostgreSQL shouldn't be reasonably well configured for a particular platform out of the box: not for maximum performance, but for good enough performance. The many complaints by new users about PostgreSQL being surprisingly slow, and all the standard answers (vacuum, pump up the memory settings), imho prove that the default installation can be improved. Already mentioned on the mailing lists: using multiple standard conf files, querying system info and dynamically generating all or some parts of the conf file, automating the vacuum process...

Kaarel
Re: [PERFORM] Some very weird behaviour....
> To give you some perspective on the size of the dataset and the
> performance level we are hitting, here are some "good" results based
> on some explains:

Before Tom jumps in taking all the fun out of trying to solve it...

The estimates in the slow queries seem perfectly reasonable. In fact, the cost estimates of both the slow and fast queries are the same, which is what would be expected if all of the data were distributed evenly throughout the table. Given that it's a date, I would guess that the data is generally inserted into the table in an order following the date, but for some reason those 'high' dates have their data distributed more evenly throughout the table.

Clustered data will have fewer disk seeks and deal with fewer pages of information in general, which makes for a much faster query. Distributed data will have to pull significantly more information from the disk, throwing most of it away.

I would guess that sometime on 2003-05-25 someone did a bit of data cleaning (deleting records). The next day the free space map had entries available in various locations within the table, and inserts used them rather than appending to the end. With 89 million records where the date is significant, I'm guessing there aren't very many modifications or deletes on it.

So, how to solve the problem? If this is the type of query that occurs most often, you do primarily inserts, and the inserts generally follow the date, cluster the table by index "some_table_ix_0". The clustering won't degrade very much, since that is how you naturally insert the data.
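[Editor's note: the clustering suggestion above translates to the commands below. This is a sketch using the anonymized names from the explain output in this thread, and the syntax shown is for PostgreSQL 7.3, where CLUSTER takes the index name first.]

```sql
-- Physically reorder the table to match the date index, then refresh
-- statistics so the planner knows about the improved correlation.
CLUSTER some_table_ix_0 ON some_table;
ANALYZE some_table;
```

Note that on a 26 GB table CLUSTER rewrites the whole heap, so it needs roughly the table's size in free disk space and an exclusive lock for the duration.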
[PERFORM] Some very weird behaviour....
Hi All,

I'm sure some of you know me from previous questions on other lists, but this one has myself and Marc completely stumped. We've got a database that has about 89 million rows, under PostgreSQL 7.3.3 on a dual PIII 1.2 with 4 GBytes of RAM on a 5 disk RAID 5 array. The dataset itself is about 26+ GBytes in size, all of it in the one table.

To give you some perspective on the size of the dataset and the performance level we are hitting, here are some "good" results based on some explains:

jnlstats=# explain analyze select count(*) from some_table where some_time::date='2003-05-21';
                                  QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=1794562.35..1794562.35 rows=1 width=0) (actual time=3013.55..3013.55 rows=1 loops=1)
   ->  Index Scan using some_table_ix_0 on some_table  (cost=0.00..1793446.02 rows=446531 width=0) (actual time=48.40..2721.26 rows=249837 loops=1)
         Index Cond: ((some_time)::date = '2003-05-21'::date)
 Total runtime: 3015.02 msec
(4 rows)

jnlstats=# explain analyze select count(*) from stats_raw where some_time::date='2003-05-21';
                                  QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=1794562.35..1794562.35 rows=1 width=0) (actual time=1401.23..1401.23 rows=1 loops=1)
   ->  Index Scan using some_table_ix_0 on some_table  (cost=0.00..1793446.02 rows=446531 width=0) (actual time=0.50..1118.92 rows=249837 loops=1)
         Index Cond: ((some_time)::date = '2003-05-21'::date)
 Total runtime: 1401.42 msec

There are about 249837 items that the query identifies as valid results, and the results range between 1 and 1.4 seconds over ten runs, with the initial query taking 3 seconds. This average is how 90% of the queries respond, but we've got several peaks that we cannot explain in any way.
For instance:

jnlstats=# explain analyze select count(*) from some_table where some_time::date='2003-05-26';
                                  QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=1794562.35..1794562.35 rows=1 width=0) (actual time=307025.65..307025.65 rows=1 loops=1)
   ->  Index Scan using some_table_ix_0 on some_table  (cost=0.00..1793446.02 rows=446531 width=0) (actual time=51.05..306256.93 rows=374540 loops=1)
         Index Cond: ((some_time)::date = '2003-05-26'::date)
 Total runtime: 307025.81 msec
(4 rows)

jnlstats=# explain analyze select count(*) from some_table where some_time::date='2003-05-26';
                                  QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=1794562.35..1794562.35 rows=1 width=0) (actual time=10837.86..10837.86 rows=1 loops=1)
   ->  Index Scan using some_table_ix_0 on some_table  (cost=0.00..1793446.02 rows=446531 width=0) (actual time=1.01..10304.78 rows=374540 loops=1)
         Index Cond: ((some_time)::date = '2003-05-26'::date)
 Total runtime: 10838.04 msec

The total number of items counted is 374540, so not too many more than the previous query, but the 300 second runtime was unexpected (we were expecting ~4-5 seconds for the first run and then ~1-2 seconds for the cached results). I have 5 other dates that all exhibit this behaviour, but it's ONLY those dates that run that slow, and the one presented above is the largest of them all.

The database server is configured with a 5 MByte shared memory buffer, but even a larger shared memory buffer does not help (we have had it set to 800 MBytes before). The disk is getting hit the heaviest during that last query, with iostat results being:

      tty             da0              da1              da2             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   25 12.22   1  0.02   7.68   0  0.00   7.68   0  0.00   0  0  0  0 99
   4 3758  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   6  0 38  0 56
   0  151  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   9  0 43  0 48
   0  148  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00  10  0 40  0 49
   0  153  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00  10  0 40  0 49
   0  152  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00