[PERFORM] Vacuum Analyze problem

2007-09-04 Thread rafael
Hello everyone:

   I wanted to ask how VACUUM ANALYZE works. Is it possible for
something to happen that resets its effects, forcing me to run the
VACUUM ANALYZE command again? I am asking because I am struggling with
a query that runs fine after a VACUUM ANALYZE but then suddenly starts
to take forever, until I run another VACUUM ANALYZE, and so on...
   I'd like to point out that I am a novice when it comes to non-basic
PostgreSQL performance-related matters.

Thank you all in advance

Rafael





Re: [PERFORM] Vacuum Analyze problem

2007-09-04 Thread Mikko Partio
On 9/4/07, [EMAIL PROTECTED] wrote:

 Hello everyone:

    I wanted to ask how VACUUM ANALYZE works. Is it possible for
 something to happen that resets its effects, forcing me to run the
 VACUUM ANALYZE command again?



Yes: lots of modifications (INSERT, UPDATE, DELETE) to the table in question.
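
For instance, a minimal sketch with a hypothetical "orders" table:

    -- a full-table UPDATE leaves a dead version of every row behind
    UPDATE orders SET status = 'processed';
    -- VACUUM reclaims that dead space; ANALYZE refreshes the stats
    VACUUM ANALYZE orders;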

Regards

MP


Re: [PERFORM] Vacuum Analyze problem

2007-09-04 Thread rafael
 On 9/4/07, [EMAIL PROTECTED] wrote:

 Hello everyone:

    I wanted to ask how VACUUM ANALYZE works. Is it possible for
 something to happen that resets its effects, forcing me to run the
 VACUUM ANALYZE command again?



 Yes: lots of modifications (INSERT, UPDATE, DELETE) to the table in
 question.

 Regards

 MP

I knew that in the long run the VACUUM ANALYZE command has to be
executed again. My question is whether something can happen overnight
that creates the need for a new VACUUM ANALYZE (regenerating indexes or
something else related to performance).

Thanks for your reply.

Rafael





Re: [PERFORM] Vacuum Analyze problem

2007-09-04 Thread Bill Moran
In response to [EMAIL PROTECTED]:

 Hello everyone:

    I wanted to ask how VACUUM ANALYZE works. Is it possible for
 something to happen that resets its effects, forcing me to run the
 VACUUM ANALYZE command again? I am asking because I am struggling with
 a query that runs fine after a VACUUM ANALYZE but then suddenly starts
 to take forever, until I run another VACUUM ANALYZE, and so on...
    I'd like to point out that I am a novice when it comes to non-basic
 PostgreSQL performance-related matters.
 
 Thank you all in advance

To add to Mikko's comments:

Periodic vacuuming and analyzing is a mandatory part of running a
PostgreSQL database server.  You'll probably be best served by
configuring the autovacuum daemon to handle this for you.  See the
postgresql.conf config file.
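
A minimal sketch of the relevant postgresql.conf entries on 8.2
(autovacuum depends on row-level statistics collection):

    stats_start_collector = on
    stats_row_level = on
    autovacuum = on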

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

[EMAIL PROTECTED]
Phone: 412-422-3463x4023



[PERFORM] About autovacuum

2007-09-04 Thread Jean-David Beyer
I have this turned on, and if I look at the log, it runs once a minute,
which is fine.

But what does it do? I.e., it runs VACUUM, but does it also run ANALYZE?

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey  http://counter.li.org
 ^^-^^ 14:35:01 up 26 days, 17:57, 2 users, load average: 4.31, 4.40, 4.78



Re: [PERFORM] Vacuum Analyze problem

2007-09-04 Thread rafael
 On Tuesday 04 September 2007 11:27:07 [EMAIL PROTECTED] wrote:
 Hello everyone:

    I wanted to ask how VACUUM ANALYZE works. Is it possible for
 something to happen that resets its effects, forcing me to run the
 VACUUM ANALYZE command again? I am asking because I am struggling with
 a query that runs fine after a VACUUM ANALYZE but then suddenly starts
 to take forever, until I run another VACUUM ANALYZE, and so on...
    I'd like to point out that I am a novice when it comes to non-basic
 PostgreSQL performance-related matters.

 Thank you all in advance

 Rafael




 Rafael;

 VACUUM ANALYZE performs two tasks at once.

 1) Vacuum - scans the table pages and marks dead row space (rows from
 old updates or deletes that cannot be needed by any existing
 transaction) so that the database can re-use (over-write) that space.

 2) Analyze - like an Oracle compute statistics, updates the system
 catalogs with current statistics for the table.

 The vacuum improves queries because the dead space can be re-used. If
 the table you are having issues with is a high-volume table, the
 solution is generally to run vacuum more often - I've seen tables that
 needed a vacuum every 5 minutes due to significant sustained churn.

 The analyze, of course, is key for the planner. If the table is
 growing rapidly, then running analyze more often will help. If,
 however, there is lots of churn but little change in the data (i.e.
 lots of inserts followed by deletes of the same rows), then a straight
 vacuum is probably what you need. If the data is changing rapidly,
 bumping up the statistics target may help - you can raise it for
 individual columns with ALTER TABLE ... ALTER COLUMN ... SET
 STATISTICS, and per-table autovacuum thresholds live in the
 pg_autovacuum system catalog.
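
 A minimal sketch of those commands (table and column names are
 hypothetical):

     VACUUM mytable;            -- reclaim dead row space only
     ANALYZE mytable;           -- refresh planner statistics only
     VACUUM ANALYZE mytable;    -- both in one pass
     -- raise the statistics sample for one column (default is 10)
     ALTER TABLE mytable ALTER COLUMN somecol SET STATISTICS 100;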

 Hope this helps...

 /Kevin



Thank you all for the information. I'll get to work on it and see what
happens.
Thanks again

Rafael


Re: [PERFORM] Vacuum Analyze problem

2007-09-04 Thread Richard Broersma Jr
--- [EMAIL PROTECTED] wrote:
 Thank you all for the information. I'll get to work on it and see what
 happens.
 Thanks again
 
 Rafael

I'll chime in with one last pointer to some excellent resources on vacuuming:

http://www.postgresql.org/docs/8.2/static/sql-vacuum.html
http://www.postgresql.org/docs/8.2/static/sql-analyze.html
http://www.postgresql.org/docs/8.2/static/routine-vacuuming.html

Regards,
Richard Broersma Jr.



Re: [PERFORM] About autovacuum

2007-09-04 Thread Bill Moran
In response to Jean-David Beyer [EMAIL PROTECTED]:

 I have this turned on, and if I look at the log, it runs once a minute,
 which is fine.
 
 But what does it do? I.e., it runs VACUUM, but does it also run ANALYZE?

Yes.  If you turn up the debugging level, you'll see detailed log
messages about its activities.

There were discussions on other lists about improving autovacuum's log
messages; I'm pretty sure it will log more helpful information in 8.3.
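
For example, something like this in postgresql.conf should surface
autovacuum's per-table activity in the server log (a sketch; the exact
debug level needed may vary):

    log_min_messages = debug2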

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

[EMAIL PROTECTED]
Phone: 412-422-3463x4023



Re: [PERFORM] join tables vs. denormalization by trigger

2007-09-04 Thread Mark Lewis
On Tue, 2007-09-04 at 20:53 +0200, Walter Mauritz wrote:
 Hi,
 
 I wonder about differences in performance between two scenarios:
 
 Background:
 Table A, ~50,000 records
 Table B, ~3,000,000 records (~20 cols)
 Table C, ~30,000,000 records (~10 cols)
 
 a query every 3sec. with limit 10
 
 Table C depends on Table B, which depends on Table A; int8 foreign
 key, btree index
 
 * consider it a read-only scenario (data loaded only at night, with
 time for a daily vacuum analyze)
 * I'm required to show records from Table C, but also with some
 (~5 cols) of info from Table B
 * the where clause always contains the foreign key to Table A
 * the where clause may contain a further 1-10 search parameters
 
 
 Scenario A)
 simply inner join Table B + C
 
 Scenario B)
 with the use of a trigger on insert/update I could push the required
 information from table B down to table C
 - so I would only need to select from table C (see the sketch below).
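
 A sketch of the two query shapes, with hypothetical table and column
 names:

     -- Scenario A: join at query time
     SELECT c.*, b.col1, b.col2        -- the ~5 columns needed from B
     FROM table_c c
     JOIN table_b b ON b.id = c.b_id
     WHERE c.a_id = 42                 -- FK to Table A, always present
     LIMIT 10;

     -- Scenario B: trigger-maintained copies of B's columns on C
     SELECT c.*                        -- B's columns already live on C
     FROM table_c c
     WHERE c.a_id = 42
     LIMIT 10;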
 
 
 My question:
 1) From your experience ... approximately how much faster, in percent,
 would you expect Scenario B to be than A?

You're assuming that B is always going to be faster than A, which
certainly isn't a foregone conclusion.  Let's say that you average 10
bytes per column.  In scenario A, the total data size is then roughly
3,000,000 * 20 * 10 + 30,000,000 * 10 * 10 = 3.6 GiB.  In scenario B due
to your denormalization, the total data size is more like 30,000,000 *
30 * 10 = 9 GiB, or 2.5 times more raw data.

That's a lot of extra disk IO, unless your database will always fit in
memory in both scenarios.
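
If you want to measure the real footprint rather than estimate it,
something along these lines works on 8.1 and later (hypothetical table
names):

    SELECT pg_size_pretty(pg_total_relation_size('table_b')) AS b_size,
           pg_size_pretty(pg_total_relation_size('table_c')) AS c_size;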

Although you didn't provide enough data to answer with certainty, I
would go on the assumption that A is going to be faster than B.  But
even if it weren't, remember that premature optimization is the root of
all evil.  If you try A and it doesn't perform fast enough, then you can
always try B later to see if it works any better.

-- Mark Lewis



[PERFORM] Performance on 8 CPUs and 32GB of RAM

2007-09-04 Thread Carlo Stonebanks
A client is moving their PostgreSQL db to a brand new Windows 2003 x64
server with 2 quad cores and 32GB of RAM. It is a dedicated server,
running 8.2.4.


The server will typically have fewer than 10 users. The primary use of
this server is to host a database that is continuously being updated by
data consolidation and matching software that hits the server very
hard. There are typically eight such processes running at any one time.
The software makes extensive use of PostgreSQL's native fuzzy string
matching. The SQL is dynamically generated by the software and consists
of large, complex joins (the structure of the joins changes as the
software adapts its matching strategies).


I would like to favour the needs of the data matching software; the
server is almost exclusively dedicated to PostgreSQL.


I have made some tentative modifications to the default
postgresql.conf file (see below), but I don't think I've scratched the
surface of what this new system is capable of. Can I ask - given my
client's needs, this new, powerful server, and the fact that the server
typically has a small number of extremely busy processes - which
numbers you would change, and what your recommendations would be?


Thanks!

Carlo

max_connections = 100
shared_buffers = 10
work_mem = 100
max_fsm_pages = 204800
max_fsm_relations = 1500
vacuum_cost_delay = 40
bgwriter_lru_maxpages = 100
bgwriter_all_maxpages = 100
checkpoint_segments = 64
checkpoint_warning = 290
effective_cache_size = 375000
stats_command_string = on
stats_start_collector = on
stats_row_level = on
autovacuum = on
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1




Re: [PERFORM] join tables vs. denormalization by trigger

2007-09-04 Thread Marc Mamin
Hello,

I had a similar issue and - after testing - decided to merge tables B
and C into a single table.
In my case the resulting table contains a large proportion of NULLs,
which limits the size increase...
You'll have to do some testing with your data to evaluate the
performance gain.

Hope this helps,

Marc  

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Walter
Mauritz
Sent: Tuesday, September 04, 2007 8:53 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] join tables vs. denormalization by trigger

Hi,

I wonder about differences in performance between two scenarios:

Background:
Table A, ~50,000 records
Table B, ~3,000,000 records (~20 cols)
Table C, ~30,000,000 records (~10 cols)

a query every 3sec. with limit 10

Table C depends on Table B, which depends on Table A; int8 foreign key,
btree index

* consider it a read-only scenario (data loaded only at night, with
time for a daily vacuum analyze)
* I'm required to show records from Table C, but also with some
(~5 cols) of info from Table B
* the where clause always contains the foreign key to Table A
* the where clause may contain a further 1-10 search parameters


Scenario A)
simply inner join Table B + C

Scenario B)
with the use of a trigger on insert/update I could push the required
information from table B down to table C
- so I would only need to select from table C.


My question:
1) From your experience ... approximately how much faster, in percent,
would you expect Scenario B to be than A?

2) any other tips for such a read only scenario

Thx for any attention :-)
Walter



Re: [PERFORM] Performance on 8 CPUs and 32GB of RAM

2007-09-04 Thread Scott Marlowe
On 9/4/07, Carlo Stonebanks [EMAIL PROTECTED] wrote:
 A client is moving their PostgreSQL db to a brand new Windows 2003 x64
 server with 2 quad cores and 32GB of RAM. It is a dedicated server,
 running 8.2.4.

And what does the drive subsystem look like?  All that horsepower
isn't going to help if all your data is sitting on an inferior drive
subsystem.

 The server will typically have fewer than 10 users. The primary use of
 this server is to host a database that is continuously being updated
 by data consolidation and matching software that hits the server very
 hard. There are typically eight such processes running at any one
 time. The software makes extensive use of PostgreSQL's native fuzzy
 string matching. The SQL is dynamically generated by the software and
 consists of large, complex joins (the structure of the joins changes
 as the software adapts its matching strategies).

 I would like to favour the needs of the data matching software; the
 server is almost exclusively dedicated to PostgreSQL.

 I have made some tentative modifications to the default
 postgresql.conf file (see below), but I don't think I've scratched the
 surface of what this new system is capable of. Can I ask - given my
 client's needs, this new, powerful server, and the fact that the
 server typically has a small number of extremely busy processes -
 which numbers you would change, and what your recommendations would
 be?

 Thanks!

 Carlo

 max_connections = 100
 shared_buffers = 10
 work_mem = 100

Even with only 10 users, 1 gig of work_mem is extremely high (without
a unit, work_mem is set in kB on 8.2.x).  A much smaller value would be
more reasonable.

OTOH, shared_buffers, at 10, is only setting it to 100 meg.  That's
pretty small on a machine with 32 gig.  Also, I recommend setting
values in a more readable form, like 500MB, in postgresql.conf.  Much
easier to read than a raw page count...

 effective_cache_size = 375000

This seems low by an order of magnitude or two.
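
Putting those observations together, something like this (illustrative
values only, not a tuned recommendation - 8.2 accepts unit suffixes in
postgresql.conf):

    shared_buffers = 500MB        # instead of a raw count of 8kB pages
    work_mem = 16MB               # per sort/hash, so keep it modest
    effective_cache_size = 24GB   # roughly what the OS can cache of 32GB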

But the most important thing is what you've left out: what kind of I/O
does this machine have?  It's really important for something that
sounds like an OLAP server.



Re: [PERFORM] Performance on 8 CPUs and 32GB of RAM

2007-09-04 Thread Alvaro Herrera
Carlo Stonebanks wrote:
 A client is moving their PostgreSQL db to a brand new Windows 2003 x64
 server with 2 quad cores and 32GB of RAM. It is a dedicated server,
 running 8.2.4.

Large shared_buffers and Windows do not mix.  Perhaps you should leave
the shmem config low, so that the kernel can cache the file pages.

 The server will typically have fewer than 10 users. The primary use of
 this server is to host a database that is continuously being updated
 by data consolidation and matching software that hits the server very
 hard. There are typically eight such processes running at any one
 time. The software makes extensive use of PostgreSQL's native fuzzy
 string matching. The SQL is dynamically generated by the software and
 consists of large, complex joins (the structure of the joins changes
 as the software adapts its matching strategies).

It sounds like you will need quite a lot of vacuuming effort to keep
up.  Maybe you should lower the autovacuum scale factors so that your
tables are visited more frequently.  A vacuum_cost_delay of 40 sounds
like too much, though.
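
A sketch of what that might look like (illustrative values, to be tuned
against the actual churn; the posted config had 0.2, 0.1 and 40):

    autovacuum_vacuum_scale_factor = 0.05
    autovacuum_analyze_scale_factor = 0.02
    vacuum_cost_delay = 10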

Since you didn't describe your disk configuration, it is most likely not
really prepared to handle high I/O load.  Maybe you should fix that.

-- 
Alvaro Herrera                      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Performance on 8 CPUs and 32GB of RAM

2007-09-04 Thread Scott Marlowe
On 9/4/07, Alvaro Herrera [EMAIL PROTECTED] wrote:
 Carlo Stonebanks wrote:
  A client is moving their PostgreSQL db to a brand new Windows 2003 x64
  server with 2 quad cores and 32GB of RAM. It is a dedicated server,
  running 8.2.4.

 Large shared_buffers and Windows do not mix.  Perhaps you should leave
 the shmem config low, so that the kernel can cache the file pages.

Egads, I'd completely missed the word Windows up there.

I would highly recommend building the PostgreSQL server on a unixish
OS.  Even with minimal tuning, I'd expect the same box running Linux
or FreeBSD to stomp Windows pretty heavily in the performance
department.

But yeah, the I/O, that's the big one.  If it's just a single IDE
drive or a couple of them, it's not gonna be able to handle much load.



Re: [PERFORM] Vacuum Analyze problem

2007-09-04 Thread Gregory Stark
[EMAIL PROTECTED] writes:

 I knew that in the long run the VACUUM ANALYZE command has to be
 executed again. My question is whether something can happen overnight
 that creates the need for a new VACUUM ANALYZE (regenerating indexes
 or something else related to performance).

The answer to your question is possibly yes, for two reasons:

1) If you're running an autovacuum daemon, it might decide it's time to
vacuum the table and kick off a vacuum. That sounds most like what
you're describing.

2) If the size of the table changes substantially, becoming much larger
(or smaller, though that wouldn't happen just due to deletes unless you
run vacuum), then recent versions of Postgres will notice, even if you
don't run analyze, and take that into account.
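
You can see the size figures the planner starts from; a minimal sketch
(hypothetical table name):

    -- relpages/reltuples are the per-table figures ANALYZE maintains
    -- and the planner scales against the table's current physical size
    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname = 'mytable';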

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
