Re: [PERFORM] Limit + group + join

2005-08-25 Thread Tobias Brox
[Mark Kirkwood - Fri at 03:01:01PM +1200]
> Tobias,
> Interesting example:
> 
> The 'desc' seems to be the guy triggering the sort, e.g:

Oh; really an accident that I didn't notice myself. I was actually going to
remove all instances of "desc" in my simplification, but it seems I forgot.

> However being a bit brutal:
> 
> set enable_mergejoin=false;
> set enable_hashjoin=false;

:-) maybe I can use that in production.  I'll check.
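
If those switches do go into production, one way to keep them from affecting
the whole session (a sketch, not from the thread; SET LOCAL exists in the
7.4/8.0 versions discussed here) is to scope them to a single transaction:

  begin;
  set local enable_mergejoin = false;  -- reverts automatically at commit/rollback
  set local enable_hashjoin = false;
  select c.id from c join b on c_id = c.id
  group by c.id order by c.id desc limit 5;
  commit;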

-- 
Notice of Confidentiality: This information may be confidential, and
blah-blah-blah - so please keep your eyes closed.  Please delete and destroy
this email.  Failure to comply will cause my lawyer to yawn.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] postmaster memory keep going up????

2005-08-25 Thread Chun Yit(Chronos)



>I have a pl/pgsql function that uses a temp table to perform searching logic.
>We have one server running postgresql-7.4.5 on Red Hat 9.0 with 512MB of RAM.
>The problem is that the pl/pgsql function I created increases postmaster
>memory usage as calls to the function become more frequent.  I did a test by
>taking out all the logic inside the function, leaving only the create
>temporary table statement and the drop of that temporary table (at the end of
>the function), and I monitored the %mem for postmaster using the Linux
>command: ps -eo pid,comm,user,%mem | grep postmaster.
>When I start the postmaster, %mem shows only about 2.0, but after I run the
>function more than 1000 times, %mem goes up to about 10.0.
>My question is: does postmaster have a memory leak problem?
>I hope someone can give me some help, and best of all, how to identify
>whether the problem comes from postgresql.
>
>thanks
>regards
>ivan
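
A minimal sketch of the kind of function described above (the names are made
up, and the searching logic is stripped out exactly as in the reported test):

  create or replace function test_temp_churn() returns integer as '
  begin
      create temporary table tmp_search (id integer);
      -- searching logic removed for the test
      drop table tmp_search;
      return 0;
  end;
  ' language plpgsql;

  -- call this 1000+ times from the client, watching memory with:
  --   ps -eo pid,comm,user,%mem | grep postmaster
  select test_temp_churn();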


Re: [PERFORM] Limit + group + join

2005-08-25 Thread Mark Kirkwood

Tobias,
Interesting example:

The 'desc' seems to be the guy triggering the sort, e.g:

explain select c.id from c join b on c_id=c.id  group by c.id order by 
c.id limit 5;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..0.28 rows=5 width=4)
   ->  Group  (cost=0.00..4476.00 rows=8 width=4)
         ->  Merge Join  (cost=0.00..4276.00 rows=8 width=4)
               Merge Cond: ("outer".id = "inner".c_id)
               ->  Index Scan using c_pkey on c  (cost=0.00..1518.00 rows=8 width=4)
               ->  Index Scan using b_on_c on b  (cost=0.00..1558.00 rows=8 width=4)
(6 rows)

Whereas with it back in again:

explain select c.id from c join b on c_id=c.id  group by c.id order by 
c.id desc limit 5;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=10741.08..10741.11 rows=5 width=4)
   ->  Group  (cost=10741.08..11141.08 rows=8 width=4)
         ->  Sort  (cost=10741.08..10941.08 rows=8 width=4)
               Sort Key: c.id
               ->  Hash Join  (cost=1393.00..4226.00 rows=8 width=4)
                     Hash Cond: ("outer".c_id = "inner".id)
                     ->  Seq Scan on b  (cost=0.00..1233.00 rows=8 width=4)
                     ->  Hash  (cost=1193.00..1193.00 rows=8 width=4)
                           ->  Seq Scan on c  (cost=0.00..1193.00 rows=8 width=4)
(9 rows)


However being a bit brutal:

set enable_mergejoin=false;
set enable_hashjoin=false;

explain select c.id from c join b on c_id=c.id  group by c.id order by 
c.id desc limit 5;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..15.24 rows=5 width=4)
   ->  Group  (cost=0.00..243798.00 rows=8 width=4)
         ->  Nested Loop  (cost=0.00..243598.00 rows=8 width=4)
               ->  Index Scan Backward using c_pkey on c  (cost=0.00..1518.00 rows=8 width=4)
               ->  Index Scan using b_on_c on b  (cost=0.00..3.01 rows=1 width=4)
                     Index Cond: (b.c_id = "outer".id)
(6 rows)

What is interesting is why this plan is being rejected...

Cheers

Mark

Tobias Brox wrote:

Consider this setup - which is a gross simplification of parts of our
production system ;-)

  create table c (id integer primary key);
  create table b (id integer primary key, c_id integer);
  create index b_on_c on b(c_id)

  insert into c (select ... lots of IDs ...);
  insert into b (select id, id from c); /* keep it simple :-) */
  
Now, I'm just interested in a few rows.

All of those give good plans:

explain select c.id from c order by c.id limit 1;
explain select c.id from c group by c.id order by c.id limit 1;
explain select c.id from c join b on c_id=c.id order by c.id limit 1;

... BUT ... combining join, group and limit wreaks havoc:

explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
desc limit 5;



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Limit + group + join

2005-08-25 Thread Jeffrey W. Baker
On Thu, 2005-08-25 at 18:56 -0700, Jeffrey W. Baker wrote:
> On Fri, 2005-08-26 at 02:27 +0200, Tobias Brox wrote:
> > Consider this setup - which is a gross simplification of parts of our
> > production system ;-)
> > 
> >   create table c (id integer primary key);
> >   create table b (id integer primary key, c_id integer);
> >   create index b_on_c on b(c_id)
> > 
> >   insert into c (select ... lots of IDs ...);
> >   insert into b (select id, id from c); /* keep it simple :-) */
> >   
> > Now, I'm just interested in a few rows.
> > 
> > All of those give good plans:
> > 
> > explain select c.id from c order by c.id limit 1;
> > explain select c.id from c group by c.id order by c.id limit 1;
> > explain select c.id from c join b on c_id=c.id order by c.id limit 1;
> > 
> > ... BUT ... combining join, group and limit wreaks havoc:
> > 
> > explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
> > desc limit 5;
> 
> Where's b in this join clause?  It looks like a cartesian product to me.

Never mind.  I read c_id as c.id.

-jwb


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Limit + group + join

2005-08-25 Thread Tobias Brox
[Jeffrey W. Baker - Thu at 06:56:59PM -0700]
> > explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
> > desc limit 5;
> 
> Where's b in this join clause?

"join b on c_id=c.id"

It's just a funny way of writing:

select c.id from c,b where c_id=c.id  group by c.id order by c.id desc limit 5;

> It looks like a cartesian product to me.

No.  The query will return exactly the same as the simplest query:

  select c.id from c order by c.id  desc limit 5;   

As said, this is a gross oversimplification of the production environment.
In the production environment, I really need to use join, group and limit
together.  I tested a bit with subqueries; it was not a good solution
(it ended up selecting a lot of rows and aggregates from many of the tables).

The next idea is to hack it up by manually finding out where the "limit"
will cut, and placing a restriction in the where-part of the query.
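
A sketch of that workaround with the toy schema from this thread (the cutoff
value below is hypothetical and would have to be looked up first):

  -- step 1: find where the limit would cut, cheaply, via the index on c
  select id from c order by id desc limit 1 offset 4;

  -- step 2: repeat the real query restricted to rows at or beyond that cutoff
  select c.id from c join b on c_id = c.id
  where c.id >= 42          -- value returned by step 1 (hypothetical)
  group by c.id
  order by c.id desc limit 5;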

-- 
Notice of Confidentiality: This information may be confidential, and
blah-blah-blah - so please keep your eyes closed.  Please delete and destroy
this email.  Failure to comply will cause my lawyer to yawn.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Limit + group + join

2005-08-25 Thread Jeffrey W. Baker
On Fri, 2005-08-26 at 02:27 +0200, Tobias Brox wrote:
> Consider this setup - which is a gross simplification of parts of our
> production system ;-)
> 
>   create table c (id integer primary key);
>   create table b (id integer primary key, c_id integer);
>   create index b_on_c on b(c_id)
> 
>   insert into c (select ... lots of IDs ...);
>   insert into b (select id, id from c); /* keep it simple :-) */
>   
> Now, I'm just interested in a few rows.
> 
> All of those give good plans:
> 
> explain select c.id from c order by c.id limit 1;
> explain select c.id from c group by c.id order by c.id limit 1;
> explain select c.id from c join b on c_id=c.id order by c.id limit 1;
> 
> ... BUT ... combining join, group and limit wreaks havoc:
> 
> explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
> desc limit 5;

Where's b in this join clause?  It looks like a cartesian product to me.

-jwb


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Limit + group + join

2005-08-25 Thread Tobias Brox
Consider this setup - which is a gross simplification of parts of our
production system ;-)

  create table c (id integer primary key);
  create table b (id integer primary key, c_id integer);
  create index b_on_c on b(c_id)

  insert into c (select ... lots of IDs ...);
  insert into b (select id, id from c); /* keep it simple :-) */
  
Now, I'm just interested in a few rows.

All of those give good plans:

explain select c.id from c order by c.id limit 1;
explain select c.id from c group by c.id order by c.id limit 1;
explain select c.id from c join b on c_id=c.id order by c.id limit 1;

... BUT ... combining join, group and limit wreaks havoc:

explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
desc limit 5;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=3809.65..3809.67 rows=5 width=4)
   ->  Group  (cost=3809.65..3940.59 rows=26187 width=4)
         ->  Sort  (cost=3809.65..3875.12 rows=26188 width=4)
               Sort Key: c.id
               ->  Hash Join  (cost=559.34..1887.89 rows=26188 width=4)
                     Hash Cond: ("outer".id = "inner".c_id)
                     ->  Seq Scan on c  (cost=0.00..403.87 rows=26187 width=4)
                     ->  Hash  (cost=403.87..403.87 rows=26187 width=4)
                           ->  Seq Scan on b  (cost=0.00..403.87 rows=26187 width=4)
(9 rows)

I get the same behaviour on pg 7.4.7 and pg 8.0.2.  Of course, I can
probably use subqueries instead of the join - though I would have wished the
planner could do better ;-)
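
For what it's worth, one shape the subquery rewrite could take (a sketch
only; whether the 7.4/8.0 planner actually chooses the backward index scan
for it would need testing):

  explain select id from c
  where exists (select 1 from b where b.c_id = c.id)
  order by id desc limit 5;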

-- 
Notice of Confidentiality: This information may be confidential, and
blah-blah-blah - so please keep your eyes closed.  Please delete and destroy
this email.  Failure to comply will cause my lawyer to yawn.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Read/Write block sizes

2005-08-25 Thread Ron

At 04:49 PM 8/25/2005, Chris Browne wrote:

> [EMAIL PROTECTED] (Ron) writes:
> > At 03:45 PM 8/25/2005, Josh Berkus wrote:
> >> > Ask me sometime about my replacement for GNU sort.  It uses the
> >> > same sorting algorithm, but it's an order of magnitude faster due
> >> > to better I/O strategy.  Someday, in my infinite spare time, I
> >> > hope to demonstrate that kind of improvement with a patch to pg.
> >>
> >> Since we desperately need some improvements in sort performance, I
> >> do hope you follow up on this.
> >
> > I'll generalize that.  IMO we desperately need any and all
> > improvements in IO performance.  Even more so than we need
> > improvements in sorting or sorting IO performance.
>
> That's frankly a step backwards.  Feel free to "specialise" that instead.

We can agree to disagree; I'm cool with that.

I'm well aware that a Systems Approach to SW Architecture is not always
popular in the Open Source world.  Nonetheless, my POV is that if we want to
be taken seriously and beat "the big boys", we have to do everything smarter
and faster, as well as cheaper, than they do.  You are not likely to be able
to do that consistently without using some of the "icky" stuff one is
required to study as part of formal training in the Comp Sci and SW
Engineering fields.

> A patch that improves some specific aspect of performance is a thousand
> times better than any sort of "desperate desire for any and all
> improvements in I/O performance."

Minor twisting of my words: substituting "desire" for "need".  The need is
provable.  Just put "the big 5" (SQL Server, Oracle, DB2, MySQL, and
PostgreSQL) through some realistic benches to see that.

Major twisting of my words: the apparent implication by you that I don't
appreciate improvements in the IO behavior of specific things like sorting
as much as I'd appreciate more "general" IO performance improvements.
Performance optimization is best done as an iterative improvement process
that starts with measuring where the need is greatest, then improving that
greatest need by the most you can, then repeating the whole cycle.  _Every_
improvement in such a process is a specific improvement, even if the
improvement is a decision to re-architect the entire product to solve the
current biggest issue.  Improving sorting IO is cool.  OTOH, if pg's biggest
IO problems are elsewhere, then the amount of overall benefit we will get
from improving sorting IO is going to be limited until we improve the bigger
problem(s).  Amdahl's Law.

> The "specialized patch" is also pointedly better in that a *confidently
> submitted* patch is likely to be way better than any sort of "desperate
> clutching at whatever may come to hand."

Another distortion of my statement and POV.  I never suggested nor implied
any sort of "desperate clutching...".  We have _measurable_ IO issues that
need to be addressed in order for pg to be a better competitor in the
marketplace, just as we do with sorting performance.

> Far too often, I see people trying to address performance problems via the
> "desperate clutching at whatever seems near to hand," and that generally
> turns out very badly as a particular result of the whole "desperate
> clutching" part.
>
> If you can get a sort improvement submitted, that's a concrete improvement...

As I said, I'm all in favor of concrete, measurable improvement.  I do not
think I ever stated I was in favor of anything else.

You evidently are mildly ranting because you've seen some examples of poor
SW Engineering Discipline/Practice by people with perhaps inadequate skills
for the issues they were trying to address.  We all have.  "90% of
everything is Jreck (e.g. of too low a quality)."

OTOH, I do not think I've given you any reason to think I lack such Clue,
nor do I think my post was advocating such thrashing.

My post was intended to say that we need an Overall Systems Approach to pg
optimization rather than just applying what compiler writers call "peephole
optimizations" to pg.  No more, no less.

I apologize if I somehow misled you,
Ron Peacetree



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Read/Write block sizes

2005-08-25 Thread Chris Browne
[EMAIL PROTECTED] (Ron) writes:
> At 03:45 PM 8/25/2005, Josh Berkus wrote:
>> > Ask me sometime about my replacement for GNU sort.  It uses the
>> > same sorting algorithm, but it's an order of magnitude faster due
>> > to better I/O strategy.  Someday, in my infinite spare time, I
>> > hope to demonstrate that kind of improvement with a patch to pg.
>>
>>Since we desperately need some improvements in sort performance, I
>>do hope you follow up on this.
>
> I'll generalize that.  IMO we desperately need any and all
> improvements in IO performance.  Even more so than we need
> improvements in sorting or sorting IO performance.

That's frankly a step backwards.

Feel free to "specialise" that instead.  

A patch that improves some specific aspect of performance is a
thousand times better than any sort of "desperate desire for any and
all improvements in I/O performance."

The latter is unlikely to provide any usable result.

The "specialized patch" is also pointedly better in that a
*confidently submitted* patch is likely to be way better than any sort
of "desperate clutching at whatever may come to hand."

Far too often, I see people trying to address performance problems via
the "desperate clutching at whatever seems near to hand," and that
generally turns out very badly as a particular result of the whole
"desperate clutching" part.

If you can get a sort improvement submitted, that's a concrete
improvement...
-- 
select 'cbbrowne' || '@' || 'ntlug.org';
http://www3.sympatico.ca/cbbrowne/lisp.html
Appendium to  the Rules  of the  Evil Overlord #1:  "I will  not build
excessively integrated  security-and-HVAC systems. They  may be Really
Cool, but are far too vulnerable to breakdowns."

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Read/Write block sizes

2005-08-25 Thread Ron

At 03:45 PM 8/25/2005, Josh Berkus wrote:

Jeff,

> Ask me sometime about my replacement for GNU sort.  It uses the same
> sorting algorithm, but it's an order of magnitude faster due to better
> I/O strategy.  Someday, in my infinite spare time, I hope to demonstrate
> that kind of improvement with a patch to pg.

Since we desperately need some improvements in sort performance, I do hope
you follow up on this.

--
--Josh


I'll generalize that.  IMO we desperately need any and all improvements in
IO performance.  Even more so than we need improvements in sorting or
sorting IO performance.


Ron



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] What *_mem to increase when running CLUSTER

2005-08-25 Thread Tom Lane
Andrew Lazarus <[EMAIL PROTECTED]> writes:
> Should I temporarily increase sort_mem, vacuum_mem, neither, or both 
> when doing a CLUSTER on a large (100 million row) table

The only part of that job that can use lots of memory is the index
rebuilds.  In recent PG versions maintenance_work_mem is the thing
to increase for an index build; previously sort_mem controlled it.
I forget when the changeover was; maybe 8.0.
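
A sketch of that, with an illustrative value and a hypothetical table and
index name (pre-8.3 CLUSTER syntax; on pre-8.0 servers set sort_mem instead):

  set maintenance_work_mem = 524288;   -- in kB, i.e. 512MB; session-local
  cluster big_table_pkey on big_table;
  reset maintenance_work_mem;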

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Read/Write block sizes

2005-08-25 Thread Josh Berkus
Jeff,

> Ask me sometime about my replacement for GNU sort.  It uses the same
> sorting algorithm, but it's an order of magnitude faster due to better
> I/O strategy.  Someday, in my infinite spare time, I hope to demonstrate
> that kind of improvement with a patch to pg.

Since we desperately need some improvements in sort performance, I do hope 
you follow up on this.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] What *_mem to increase when running CLUSTER

2005-08-25 Thread Steve Poe
Andrew,

On Thu, 2005-08-25 at 12:24 -0700, Andrew Lazarus wrote:
> Should I temporarily increase sort_mem, vacuum_mem, neither, or both 
> when doing a CLUSTER on a large (100 million row) table where as many as 
> half of the tuples are deadwood from UPDATEs or DELETEs? I have large 
> batch (10 million row) inserts, updates, and deletes so I'm not sure 
> frequent vacuuming would help.

You may need to experiment with both. What version of Postgres? What is
the size of your database? How many concurrent users? If you're seeing that
half of the tuples are dead, I would look at checking your max_fsm_pages and
max_fsm_relations after a full vacuum analyze before doing too much with
sort_mem.
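
A sketch of that check (the free space map summary shown here is printed at
the end of a database-wide VACUUM VERBOSE on servers of that era; the exact
wording varies by version):

  vacuum verbose analyze;
  -- the last few INFO lines report something like
  --   "free space map: N relations, M pages stored; P total pages needed"
  -- compare P against max_fsm_pages (and N against max_fsm_relations) in
  -- postgresql.conf; raising them requires a postmaster restart.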

Your mileage may vary.

Best of luck.

Steve Poe


> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PERFORM] What *_mem to increase when running CLUSTER

2005-08-25 Thread Andrew Lazarus
Should I temporarily increase sort_mem, vacuum_mem, neither, or both 
when doing a CLUSTER on a large (100 million row) table where as many as 
half of the tuples are deadwood from UPDATEs or DELETEs? I have large 
batch (10 million row) inserts, updates, and deletes so I'm not sure 
frequent vacuuming would help.




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Need for speed 2

2005-08-25 Thread Kelly Burkhart
On Thu, 2005-08-25 at 11:16 -0400, Ron wrote:
> ># - Settings -
> >
> >fsync = false   # turns forced synchronization on or off
> >#wal_sync_method = fsync# the default varies across platforms:
> > # fsync, fdatasync, open_sync, or
> 
> I hope you have a battery backed write buffer!

A battery-backed write buffer will do nothing here, because the OS is
taking its sweet time flushing to the controller's battery-backed write
buffer!

Isn't the reason for a battery-backed controller cache to make fsync()s
fast?

-K

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Need for speed 2

2005-08-25 Thread Merlin Moncure
> Putting pg_xlog on the IDE drives gave about 10% performance
> improvement. Would faster disks give more performance?
> 
> What my application does:
> 
> Every five minutes a new logfile will be imported. Depending on the
> source of the request it will be imported in one of three "raw click"
> tables. (data from two months back, to be able to verify customer
> complaints)
> For reporting I have a set of tables. These contain data from the last
> two years. My app deletes all entries from today and reinserts updated
> data calculated from the raw data tables.
> 
> The queries contain no joins, only aggregates. I have several indexes to
> speed different kinds of queries.
> 
> My problems occur when one user does a report that contains too much old
> data. In that case all cache mechanisms will fail and disc io is the
> limiting factor.

It seems like you are pushing the limit of what the server can handle.  This
means: 1. an expensive server upgrade, or
2. making the software more efficient.

Since you sound I/O bound, you can tackle 1. by a. adding more memory or
b. increasing i/o throughput.

Unfortunately, you already have a pretty decent server (for x86), so 1.
means a 64 bit platform and 2. means more expensive hard drives.  The
archives are full of information about this...

Is your data well normalized?  You can do tricks like:
if a table has fields a,b,c,d,e,f with a as the primary key, and d,e,f are
not frequently queried or often missing, move d,e,f to a separate table.

Well normalized structures are always more cache efficient.  Do you have
lots of repeating and/or empty data values in your tables?

Make your indexes and data as small as possible to reduce pressure on
the cache; here are just a few tricks:
1. use int2/int4 instead of numeric
2. know when to use char and varchar
3. use functional indexes to reduce index expression complexity.  This
can give extreme benefits if you can, for example, reduce a double field
index to a boolean one (see the sketch below).
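
A sketch of trick 3, with made-up table and column names:

  -- queries only ever ask whether "amount" (double precision) is positive,
  -- so index the boolean expression instead of the raw double
  create index orders_amount_positive on orders ((amount > 0));

  -- if only the positive rows are ever fetched, a partial index is smaller still
  create index orders_positive_only on orders (id) where amount > 0;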

Merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Need for speed 2

2005-08-25 Thread Ron

At 03:10 AM 8/25/2005, Ulrich Wisser wrote:


I realize I need to be much more specific. Here is a more detailed
description of my hardware and system design.


Pentium 4 2.4GHz
Memory 4x DIMM DDR 1GB PC3200 400MHZ CAS3, KVR
Motherboard chipset 'I865G', two IDE channels on board


First suggestion: Get better server HW.  AMD Opteron based dual 
processor board is the current best in terms of price/performance 
ratio, _particularly_ for DB applications like the one you have 
described.  Such mainboards cost ~$400-$500.  RAM will cost about 
$75-$150/GB.  Opteron 2xx are ~$200-$700 apiece.   So a 2P AMD system 
can be had for as little as ~$850 + the cost of the RAM you need.  In 
the worst case where you need 24GB of RAM (~$3600), the total comes 
in at ~$4450.  As you can see from the numbers, buying only what RAM 
you actually need can save you a great deal of money.


Given what little you said about how much of your DB is frequently 
accessed, I'd suggest buying a server based around the 2P 16 DIMM 
slot IWill DK88 mainboard (Tyan has announced a 16 DIMM slot 
mainboard, but I do not think it is actually being sold yet.).  Then 
fill it with the minimum amount of RAM that will allow the "working 
set" of the DB to be cached in RAM.  In the worst case where DB 
access is essentially uniform and essentially random, you will need 
24GB of RAM to hold the 22GB DB + OS + etc.  That worst case is 
_rare_.  Usually DB's have a working set that is smaller than the 
entire DB.  You want to keep that working set in RAM.  If you can't 
identify the working set, buy enough RAM to hold the entire DB.


In particular, you want to make sure that any frequently accessed 
read only tables or indexes are kept in RAM.  The "read only" part is 
very important.  Tables (and their indexes) that are frequently 
written to _have_ to access HD.  Therefore you get much less out of 
having them in RAM.  Read only tables and their indexes can be loaded 
into tmpfs at boot time thereby keeping out of the way of the file 
system buffer cache.  tmpfs does not save data if the host goes down 
so it is very important that you ONLY use this trick with read only 
tables.  The other half of the trick is to make sure that the file 
system buffer cache does _not_ cache whatever you have loaded into tmpfs.
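
One way to realize that on 8.0 or later is a tablespace on a tmpfs mount (a
sketch only; the mount point and table name are made up, and anything placed
there is gone after a reboot, so this is safe only for data that can be
reloaded from elsewhere at boot time):

  -- the mount itself happens outside the database, e.g.
  --   mount -t tmpfs -o size=2g tmpfs /mnt/pg_ramdisk
  create tablespace ram_space location '/mnt/pg_ramdisk';

  -- move a read-only lookup table onto it as part of the boot-time load
  alter table ro_lookup set tablespace ram_space;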




2x SEAGATE BARRACUDA 7200.7 80GB 7200RPM ATA/100
(software raid 1, system, swap, pg_xlog)
ADAPTEC SCSI RAID 2100S ULTRA160 32MB 1-CHANNEL
2x SEAGATE CHEETAH 15K.3 73GB ULTRA320 68-PIN WIDE
(raid 1, /var/lib/pgsql)


Second suggestion: you need a MUCH better IO subsystem.  In fact, 
given that you have described this system as being primarily OLTP 
like, this is more important that the above server HW.  Best would be 
to upgrade everything, but if you are strapped for cash, upgrade the 
IO subsystem first.


You need many more spindles and a decent RAID card or cards.  You 
want 15Krpm (best) or 10Krpm HDs.  As long as all of the HD's are at 
least 10Krpm, more spindles is more important than faster 
spindles.  If it's a choice between more 10Krpm discs or fewer 15Krpm 
discs, buy the 10Krpm discs.  Get the spindle count as high as you 
RAID cards can handle.


Whatever RAID cards you get should have as much battery backed write 
buffer as possible.  In the commodity market, presently the highest 
performance RAID cards I know of, and the ones that support the 
largest battery backed write buffer, are made by Areca.




Database size on disc is 22GB. (without pg_xlog)


Find out what the working set, ie the most frequently accessed 
portion, of this 22GB is and you will know how much RAM is worth 
having.  4GB is definitely too little!
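
A sketch of how to get a per-relation view of where that working set might
live, assuming a version with pg_relation_size() (8.1 and later; older
installs have similar functions in contrib/dbsize):

  select relname, relkind,
         pg_relation_size(oid) / (1024 * 1024) as size_mb
  from pg_class
  where relkind in ('r', 'i')
  order by pg_relation_size(oid) desc
  limit 20;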




Please find my postgresql.conf below.


Third suggestion:  make sure you are running a 2.6 based kernel and 
at least PG 8.0.3.  Helping beta test PG 8.1 might be an option for 
you as well.



Putting pg_xlog on the IDE drives gave about 10% performance 
improvement. Would faster disks give more performance?


What my application does:

Every five minutes a new logfile will be imported. Depending on the 
source of the request it will be imported in one of three "raw click"
tables. (data from two months back, to be able to verify customer 
complaints)  For reporting I have a set of tables. These contain data 
from the last two years. My app deletes all entries from today and 
reinserts updated data calculated from the raw data tables.


The raw data tables seem to be read only?  If so, you should buy 
enough RAM to load them into tmpfs at boot time and have them be 
completely RAM resident in addition to having enough RAM for the OS 
to cache an appropriate amount of the rest of the DB.



The queries contain no joins, only aggregates. I have several indexes 
to speed different kinds of queries.


My problems occur when one user does a report that contains too 
much old data. In that case all cache mechanisms will fail and disc 
io is the limiting factor.


If one query contains so much data, that a f

Re: [PERFORM] Need for speed 2

2005-08-25 Thread Frank Wiles
On Thu, 25 Aug 2005 09:10:37 +0200
Ulrich Wisser <[EMAIL PROTECTED]> wrote:

> Pentium 4 2.4GHz
> Memory 4x DIMM DDR 1GB PC3200 400MHZ CAS3, KVR
> Motherboard chipset 'I865G', two IDE channels on board
> 2x SEAGATE BARRACUDA 7200.7 80GB 7200RPM ATA/100
> (software raid 1, system, swap, pg_xlog)
> ADAPTEC SCSI RAID 2100S ULTRA160 32MB 1-CHANNEL
> 2x SEAGATE CHEETAH 15K.3 73GB ULTRA320 68-PIN WIDE
> (raid 1, /var/lib/pgsql)
> 
> Database size on disc is 22GB. (without pg_xlog)
> 
> Please find my postgresql.conf below.
> 
> Putting pg_xlog on the IDE drives gave about 10% performance
> improvement. Would faster disks give more performance?

  Faster as in RPM on your pg_xlog partition probably won't make
  much of a difference.  However, if you can get a drive with better
  overall write performance then it would be a benefit. 

  Another thing to consider on this setup is whether or not you're
  hitting swap often and/or logging to that same IDE RAID set.  For
  optimal insertion benefit you want the heads of your disks to 
  essentially be only used for pg_xlog.  If you're having to jump
  around the disk in the following manner: 

write to pg_xlog
read from swap
write syslog data
write to pg_xlog 
...
...

  You probably aren't getting anywhere near the benefit you could.  One
  thing you could easily try is to break your IDE RAID set and put 
  OS/swap on one disk and pg_xlog on the other. 

> If one query contains so much data, that a full table scan is needed,
> I  do not care if it takes two minutes to answer. But all other
> queries  with less data (at the same time) still have to be fast.
> 
> I can not stop users doing that kind of reporting. :(
> 
> I need more speed in orders of magnitude. Will more disks / more
> memory do that trick?

  More disk and more memory always helps out.  Since you say these
  queries are mostly on not-often-used data I would lean toward more
  disks in your SCSI RAID-1 setup than maxing out available RAM based
  on the size of your database. 

 -
   Frank Wiles <[EMAIL PROTECTED]>
   http://www.wiles.org
 -


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Some ideas for comment

2005-08-25 Thread Jens-Wolfhard Schicke
--On Mittwoch, August 24, 2005 16:26:40 -0400 Chris Hoover 
<[EMAIL PROTECTED]> wrote:



On 8/24/05, Merlin Moncure <[EMAIL PROTECTED]> wrote:

Linux does a pretty good job of deciding what to cache.  I don't think
this will help much.  You can always look at partial indexes too.


Yes, but won't this help by reducing what needs to be cached?  If I have
10,000,000 rows in a table, but only 4,000 are active, and I move those
4,000 to another table and link the tables via a view, should that not
help keep the 9,996,000 inactive rows out of the kernel cache (the majority
of the time at least)?
The kernel caches per page, not per file. It is likely that Linux only caches
those pages which contain active rows, as long as no statement does a
seq-scan on that table.

To optimize this, you could consider clustering by some index which sorts by
the "activity" of the rows first. That way pages with active rows are likely
to contain more than one active row, and so the cache is utilized better.

CLUSTER is rather slow, however, and tables need to be reclustered from time
to time.
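
A sketch of both ideas on a hypothetical table with a boolean "active" flag
(pre-8.3 CLUSTER syntax; newer versions spell it CLUSTER tablename USING
indexname):

  -- index that sorts the active rows together
  create index big_table_active_idx on big_table (active, id);

  -- rewrite the table in that order; repeat periodically as rows change
  cluster big_table_active_idx on big_table;
  analyze big_table;

  -- the partial index mentioned earlier in the thread is the lighter option:
  create index big_table_active_only on big_table (id) where active;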



Kind regards,
Jens Schicke
--
Jens Schicke  [EMAIL PROTECTED]
asco GmbH http://www.asco.de
Mittelweg 7   Tel 0531/3906-127
38106 BraunschweigFax 0531/3906-400

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] Need for speed 2

2005-08-25 Thread Ulrich Wisser

Hello,

I realize I need to be much more specific. Here is a more detailed
description of my hardware and system design.


Pentium 4 2.4GHz
Memory 4x DIMM DDR 1GB PC3200 400MHZ CAS3, KVR
Motherboard chipset 'I865G', two IDE channels on board
2x SEAGATE BARRACUDA 7200.7 80GB 7200RPM ATA/100
(software raid 1, system, swap, pg_xlog)
ADAPTEC SCSI RAID 2100S ULTRA160 32MB 1-CHANNEL
2x SEAGATE CHEETAH 15K.3 73GB ULTRA320 68-PIN WIDE
(raid 1, /var/lib/pgsql)

Database size on disc is 22GB. (without pg_xlog)

Please find my postgresql.conf below.

Putting pg_xlog on the IDE drives gave about 10% performance
improvement. Would faster disks give more performance?

What my application does:

Every five minutes a new logfile will be imported. Depending on the
source of the request it will be imported in one of three "raw click"
tables. (data from two months back, to be able to verify customer complaints)
For reporting I have a set of tables. These contain data from the last
two years. My app deletes all entries from today and reinserts updated
data calculated from the raw data tables.

The queries contain no joins, only aggregates. I have several indexes to 
speed different kinds of queries.


My problems occur when one user does a report that contains too much old
data. In that case all cache mechanisms will fail and disc io is the
limiting factor.

If one query contains so much data that a full table scan is needed, I 
do not care if it takes two minutes to answer. But all other queries 
with less data (at the same time) still have to be fast.


I can not stop users doing that kind of reporting. :(

I need more speed in orders of magnitude. Will more disks / more memory
do that trick?

Money is of course a limiting factor but it doesn't have to be real cheap.

Ulrich





# -
# PostgreSQL configuration file
# -
#---
# CONNECTIONS AND AUTHENTICATION
#---

# - Connection Settings -

tcpip_socket = true
max_connections = 100
# note: increasing max_connections costs about 500 bytes of shared
# memory per connection slot, in addition to costs from shared_buffers
# and max_locks_per_transaction.
#superuser_reserved_connections = 2
#port = 5432
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#virtual_host = ''  # what interface to listen on; defaults to any
#rendezvous_name = ''   # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60# 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false


#---
# RESOURCE USAGE (except WAL)
#---

# - Memory -

shared_buffers = 2  # min 16, at least max_connections*2, 8KB each

sort_mem = 4096 # min 64, size in KB
vacuum_mem = 8192   # min 1024, size in KB

# - Free Space Map -

max_fsm_pages = 20  # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1   # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
#preload_libraries = ''


#---
# WRITE AHEAD LOG
#---

# - Settings -

fsync = false   # turns forced synchronization on or off
#wal_sync_method = fsync# the default varies across platforms:
# fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 128   # min 4, 8KB each

# - Checkpoints -

checkpoint_segments = 16# in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300   # range 30-3600, in seconds
#checkpoint_warning = 30# 0 is off, in seconds
#commit_delay = 0   # range 0-10, in microseconds
#commit_siblings = 5# range 1-1000


#---
# QUERY TUNING
#---

# - Planner Method Enabling -

#enable_hashagg = true
#enable_hashjoin = true
#enable_indexscan = true
#enable_mergejoin = true
#enable_nestloop = true
#enable_seqscan = true
#enable_sort = true
#enable_tidscan = true

# - Planner Cost Constants -

#effective_cache_size = 1000# typically 8KB each
#random_page_cost = 4   # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01  # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025 # (same)

# - Genetic Query Optimizer -

#geqo = true
#geqo_threshold = 11
#geqo_effort = 1
#geqo_generations = 0
#g