Re: [PERFORM] Read/Write block sizes

2005-08-24 Thread PFC



of effort reinventing the wheel ... but our time will be repaid much
more if we work at levels that the OS cannot have knowledge of, such as
join planning and data statistics.


Considering a global budget of man-hours, which is the best ?

1- Spend it on reimplementing half of VFS in postgres, half of Windows in  
postgres, half of FreeBSD in postgres, half of Solaris in Postgres, only  
to discover you gain a meagre speed increase and a million and a half bugs,


2- Spend 5% of that time lowering the impedance between the OS and  
Postgres, and another 5% annoying Kernel people and helping them tweaking  
stuff for database use, and the rest on useful features that give useful  
speedups, like bitmap indexes, skip scans, and other features that enhance  
power and usability ?


If you're Oracle and have almost unlimited resources, maybe. But even  
Microsoft opted for option 2 : they implemented ReadFileScatter and  
WriteFileGather to lower the syscall overhead and that's it.


And point 2 will benefit many other apps, whereas 1 would benefit only  
postgres, and then only in certain cases.


I do believe there is something interesting to uncover with reiser4 though  
(it definitely fits point 2).


I'm happy that the pg team chose point 2 and that new versions keep coming  
with new features at an unbelievable rate these days. Do you guys sleep ?


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread Donald Courtney



Great discussion and illuminating for those of us who are still
learning the subtleties of Postgres.

William

To be clear -
I built PostgreSQL 8.1 64-bit on solaris 10 a few months ago
and side by side with the 32 bit PostgreSQL build saw no improvement. 
In fact the 64 bit result was slightly lower.


I used  the *same 64 bit S10 OS* for both versions.  I think your
experience makes sense since your change was from 32 to 64 bit Linux.
From my experiment I am surmising that there will not be any 
file/os/buffer-cache scale-up effect on the same OS with 64-bit 
PostgreSQL.


I was testing on a 4 core system in both cases.



William Yu wrote:


Donald Courtney wrote:


in that even if you ran postgreSQL on a 64 bit address space
with larger number of CPUs you won't see much of a scale up
and possibly even a drop.   I am not alone in having the *expectation*



What's your basis for believing this is the case? Why would 
PostgreSQL's dependence on the OS's caching/filesystem limit 
scalability? I know when I went from 32bit to 64bit Linux, I got 
*HUGE* increases in performance using the same amount of memory. And 
when I went from 2x1P to 2xDC, my average cpu usage % dropped almost 
in half.



that a database should have some cache size parameter and
the option to skip the file system.   If I use oracle, sybase, mysql
and maxdb they all have the ability to size a data cache and move
to 64 bits.



Josh Berkus has already mentioned this as conventional wisdom as 
written by Oracle. This may also be legacy wisdom. Oracle/Sybase/etc 
has been around for a long time; it was probably a clear performance 
win way back when. Nowadays with how far open-source OS's have 
advanced, I'd take it with a grain of salt and do my own performance 
analysis. I suspect the big vendors wouldn't change their stance even 
if they knew it was no longer true due to the support hassles.


My personal experience with PostgreSQL. Dropping shared buffers from 
2GB to 750MB improved performance on my OLTP DB a good 25%. Going down 
from 750MB to 150MB was another +10%.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Read/Write block sizes

2005-08-24 Thread Bruce Momjian

This thread covers several performance ideas.  First is the idea that
more parameters should be configurable.  While this seems like a noble
goal, we try to make parameters auto-tuning; if users do have to
configure a parameter, it should be useful to a significant number of
users.

In the commercial software world, if you can convince your boss that a
feature/knob is useful, it usually gets into the product. 
Unfortunately, this leads to the golden doorknob on a shack, where some
features are out of sync with the rest of the product in terms of
usefulness and utility.  With open source, if a feature cannot be
auto-tuned, or has significant overhead, the feature has to be
implemented and then proven to be a benefit.

In terms of adding async I/O, threading, and other things, it might make
sense to explore how these could be implemented in a way that fits the
above criteria.

---

Jignesh K. Shah wrote:
> Hi Jim,
> 
> | How many of these things are currently easy to change with a recompile?
> | I should be able to start testing some of these ideas in the near
> | future, if they only require minor code or configure changes.
> 
> 
> The following
> * Data File Size   1GB
> * WAL File Size of 16MB
> * Block Size  of 8K
> 
> are very easy to change with a recompile.  A tunable would be greatly 
> preferred, as it would allow one binary for different tunings.
> 
> * MultiBlock read/write
> 
> Is not available but would greatly help in reducing the number of system 
> calls, which will only increase as the size of the database increases, if 
> something is not done about it.
> 
> * Pregrown files... maybe not important at this point since TABLESPACE 
> can currently work around it a bit (just need to create a different file 
> system for each tablespace).
> 
> But if you really think hardware & OS  is the answer for all small 
> things.. I think we should now start to look at how to make Postgres 
> multi-threaded or multi-processed for each connection. With the influx 
> of  "Dual-Core" or "Multi-Core" being the fad, Postgres can have the 
> cutting edge if it can somehow exploit multiple cores.
> 
> Somebody mentioned that adding CPU to Postgres workload halved the 
> average CPU  usage...
> YEAH... PostgreSQL  uses only 1 CPU per connection (assuming 100% 
> usage)  so if you add another CPU it is idle anyway and the system will 
> report only 50%  :-) BUT the important thing to measure is whether the 
> query time was cut down or not. ( No flames, I am sure you were talking 
> about a multi-connection multi-user environment :-) ) But my point is 
> whether this approach is worth the ROI and the time and effort spent to 
> solve this problem.
> 
> I actually vote for a multi-threaded solution for each connection while 
> still maintaining a separate process for each connection... This way the 
> fundamental architecture of Postgres doesn't change, however a 
> multi-threaded connection can then start to exploit different cores. 
> (Maybe have tunables for the number of threads to read data files, who 
> knows.) If somebody is interested in actually working on a design, 
> contact me and I will be interested in assisting this work.
> 
> Regards,
> Jignesh
> 
> 
> Jim C. Nasby wrote:
> 
> >On Tue, Aug 23, 2005 at 06:09:09PM -0400, Chris Browne wrote:
> >  
> >
> >>[EMAIL PROTECTED] (Jignesh Shah) writes:
> >>
> >>
> Does that include increasing the size of read/write blocks? I've
> noticed that with a large enough table it takes a while to do a
> sequential scan, even if it's cached; I wonder if the fact that it
> takes a million read(2) calls to get through an 8G table is part of
> that.
> 
> 
> >>>Actually the OS already does some of that (readahead, etc.) if it does
> >>>some sort of throttling/clubbing of reads/writes. But it's not enough
> >>>for such types of workloads.
> >>>
> >>>Here is what I think will help:
> >>>
> >>>* Support for different Blocksize TABLESPACE without recompiling the
> >>>code.. (At least support for a different Blocksize for the whole
> >>>database without recompiling the code)
> >>>
> >>>* Support for bigger sizes of WAL files instead of 16MB files
> >>>WITHOUT recompiling the code.. Should be a tuneable if you ask me
> >>>(with checkpoint_segments at 256.. you have too many 16MB files in
> >>>the log directory) (This will help OLTP benchmarks more since now
> >>>they don't spend time rotating log files)
> >>>
> >>>* Introduce a multiblock or extent tunable variable where you can
> >>>define a multiple of 8K (or BlockSize tuneable) to read a bigger
> >>>chunk and store it in the bufferpool.. (Maybe writes too) (Most
> >>>devices now support up to 1MB chunks for reads and writes)
> >>>
> >>>*There should be a way to preallocate files for TABLES in
> >>>TABLESPACES, otherwise multiple table writes in the same
> >>>filesystem end up with fragmented files, which causes poor "READS" from
> >>>t

Re: [PERFORM] Caching by Postgres

2005-08-24 Thread Stephen Frost
* Donald Courtney ([EMAIL PROTECTED]) wrote:
> To be clear -
> I built PostgreSQL 8.1 64-bit on solaris 10 a few months ago
> and side by side with the 32 bit postgreSQL build saw no improvement. 
> In fact the 64 bit result was slightly lower.

That makes some sense actually.  It really depends on what you're doing
a lot of the time.  On a Sparc system you're not likely to get much of a
speed improvement by going to 64bit (unless, maybe, you're doing lots of
intensive 64bit math ops).  You'll have larger pointers and whatnot
though.

> I used  the *same 64 bit S10 OS* for both versions.  I think your
> experience makes sense since your change was from 32 to 64 bit Linux.

32bit to 64bit Linux on a Sparc platform really shouldn't affect
performance all that much (I'd expect it to be similar to 32bit to 64bit
under Solaris actually, at least in terms of the performance
difference).  32bit to 64bit Linux on an amd64 platform is another
matter entirely though, but not because of the number of bits involved.

Under amd64, a 32bit build means running in i386 mode, which has a
limited number of registers and whatnot.  Under amd64/64bit you get more
registers (and I think some other niceties) which will improve
performance.  That's not a 32bit vs. 64bit thing, that's i386 vs. native
amd64.  It's really mainly an oddity of the platform.  On a mips system
I'd expect the same kind of performance difference between 32bit and
64bit as you'd see on a sparc platform.

Enjoy,

Stephen




[PERFORM] performance drop on RAID5

2005-08-24 Thread Alexandre Barros

Hello,
i have a pg-8.0.3 running on Linux  kernel  2.6.8,  CPU  Sempron 2600+, 
1Gb RAM on IDE HD ( which could be called a "heavy desktop" ); measuring 
its performance with pgbench ( found in /contrib ) gave me an 
average ( after several runs ) of 170 transactions per second.


for the sake of experimentation ( actually, i'm scared this IDE drive 
could fail at any time, hence i'm looking for an alternative, more 
"robust", machine ), i've installed it on an aging Compaq Proliant server ( 
freshly compiled SMP kernel 2.6.12.5  with preemption ), dual Pentium 
III Xeon 500MHz, 512Mb RAM, (older) SCSI-2 80-pin drives, and re-tested. 
When the database was on a single SCSI drive, pgbench gave me an average 
of 90 transactions per second; but, and that scared me most, when the 
database was on a RAID-5 array ( four 9Gb disks, using linux software 
RAID mdadm and LVM2, with the default filesystem cluster size of 32Kb ), 
the performance dropped to about 55 transactions per second.


Despite the difference in RAM, neither machine seems to be swapping.
All filesystems ( on both machines ) are Reiserfs.
Both pg-8.0.3 were compiled with CFLAGS -O3 and -mtune for their 
respective architectures... and "gmake -j2" on the server.
Both machines have an original ( except by the pg and the kernel ) 
Mandrake 10.1 install.


I've googled a little, and maybe the cluster size might be one problem, 
but despite that, the performance drop when running on 
"server-class" hardware with RAID-5 SCSI-2 drives was way beyond my most 
delirious expectations... i need some help to figure out what is **so** 
wrong...


i wouldn't be so stunned if the newer machine was ( say ) twice as fast 
as the older server, but over three times as fast is disturbing.


the postgresql.conf of both machines is here:

max_connections = 50
shared_buffers = 1000   # min 16, at least max_connections*2, 8KB each

debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
log_statement = 'all'
log_parser_stats= false
log_planner_stats   = false
log_executor_stats  = false
log_statement_stats = false
lc_messages = 'en_US'   # locale for system error message strings
lc_monetary = 'en_US'   # locale for monetary formatting
lc_numeric = 'en_US'# locale for number formatting
lc_time = 'en_US'   # locale for time formatting

many thanks in advance !


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] performance drop on RAID5

2005-08-24 Thread Frank Wiles
On Wed, 24 Aug 2005 11:43:05 -0300
Alexandre Barros <[EMAIL PROTECTED]> wrote:

> I've googled a little, and maybe the cluster size might be one
> problem, but despite that, the performance drop when running on 
> "server-class" hardware with RAID-5 SCSI-2 drives was way beyond my
> most delirious expectations... i need some help to figure out what is
> **so** wrong...

  RAID-5 isn't great for databases in general.  What would be better
  would be to mirror the disks for redundancy or do RAID 1+0.  

  You could probably also increase your shared_buffers some, but 
  that alone most likely won't make up your speed difference. 

 -
   Frank Wiles <[EMAIL PROTECTED]>
   http://www.wiles.org
 -


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread PFC


Really?  Cool, I'd like to see that.  Could you follow up with Hans?  Or give 
me his e-mail?


You can subscribe to the Reiser mailing list at namesys.com, or :
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Performance for relative large DB

2005-08-24 Thread Chris Browne
"tobbe" <[EMAIL PROTECTED]> writes:
> Hi Chris.
>
> Thanks for the answer.
> Sorry that i was a bit unclear.
>
> 1) We update around 20.000 posts per night.

No surprise there; I would have been surprised to see 100/nite or
6M/nite...

> 2) What i meant was that we suspect that the DBMS called PervasiveSQL
> that we are using today is much too small. That's why we're looking for
> alternatives.
>
> Today we base our solution largely on query-specific tables created
> at night, so instead of doing queries directly on the "post" table (with
> 4-6M rows) at daytime, we have the data pre-aligned in several much
> smaller tables. This is just to make the current DBMS cope with our
> amount of data.
>
> What I am particularly interested in is whether we can expect to run all our
> select queries directly from the "post" table with PostgreSQL.

Given a decent set of indices, I'd expect that to work OK...  Whether
4M or 6M rows, that's pretty moderate in size.

If there are specific states that rows are in which are "of interest,"
then you can get big wins out of having partial indices...  Consider...

create index partial_post_status on posts (status) where status in ('Active', 
'Pending', 'Locked');
-- When processing of postings is completely finished, they wind up with 
-- 'Closed' status

We have some 'stateful' tables in our environment where the
interesting states are 'P' (where work is "pending") and 'C' (where
all the work has been completed and the records are never of interest
again except as ancient history); the partial index "where status =
'P'" winds up being incredibly helpful.

It's worth your while to dump data out from Pervasive and load it into
a PostgreSQL instance and to do some typical sorts of queries on the
PostgreSQL side.

Do "EXPLAIN ANALYZE [some select statement];" and you'll get a feel
for how PostgreSQL is running the queries.

Fiddling with indices to see how that affects things will also be a
big help.

You may find there are columns with large cardinalities (quite a lot
of unique values) where you want to improve the stats analysis via...

  alter table posts alter column [whatever] set statistics 100;  
   -- Default is 10 bins
  analyze posts;  
   -- then run ANALYZE to update statistics

> 3) How well does postgres work with load balancing environments? Is
> it built-in?

Load balancing means too many things.  Can you be more specific about
what you consider it to mean?

For Internet registry operations, we use replication (Slony-I) to
create replicas used to take particular sorts of load off the "master"
systems.

But you might be referring to something else...

For instance, connection pools, whether implemented inside
applications (everyone doing Java has one or more favorite Java
connection pool implementations) or in web servers (Apache has a DB
connection pool manager) or in an outside application (pgpool, a
C-based connection pool manager) are also sometimes used for load
balancing.
-- 
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/postgresql.html
In case you weren't aware, "ad homineum" is not latin for "the user of
this technique is a fine debater." -- Thomas F. Burdick

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Read/Write block sizes

2005-08-24 Thread Chris Browne
[EMAIL PROTECTED] (Steve Poe) writes:
> Chris,
>
> Unless I am wrong, you're making the assumption that the amount of time spent
> and ROI is known. Maybe those who've been down this path know how to get
> that additional 2-4% in 30 minutes or less? 
>
> While each person and business' performance gains (or not) could vary,
> someone spending the 50-100h to gain 2-4% over the course of a month for a
> 24x7 operation would seem worth the investment?  

What we *do* know is that adding these "knobs" would involve a
significant amount of effort, as the values are widely used throughout
the database engine.  Making them dynamic (e.g. - so they could be
tuned on a tablespace-by-tablespace basis) would undoubtedly require
rather a lot of development effort.  They are definitely NOT 30 minute
changes.

Moreover, knowing how to adjust them is almost certainly also NOT a 30
minute configuration change; significant benchmarking effort for the
individual application is almost sure to be needed.

It's not much different from the reason why PostgreSQL doesn't use
threading...

The problem with using threading is that introducing it to the code
base would require a pretty enormous amount of effort (I'll bet
multiple person-years), and it wouldn't provide *any* benefit until
you get rather a long ways down the road.

Everyone involved in development seems to me to have a reasonably keen
understanding as to what the potential benefits of threading are; the
payoff is that plenty of opportunities fall out to parallelize
the evaluation of portions of queries.  Alas, it wouldn't be until
*after* all the effort goes in that we would get any idea as to what
kinds of speedups this would provide.

In effect, there has to be a year invested in *breaking* PostgreSQL
(because this would initially break a lot, since thread programming is
a really tough skill) where you don't actually see any benefits.

> I would assume that dbt2 with STP helps minimize the amount of hours
> someone has to invest to determine performance gains with
> configurable options?

That's going to help in constructing a "default" knob value.  And if
we find an "optimal default," that encourages sticking with the
current approach, of using #define to apply that value...

>> If someone spends 100h working on one of these items, and gets a 2%
>> performance improvement, that's almost certain to be less desirable
>> than spending 50h on something else that gets a 4% improvement.
>> 
>> And we might discover that memory management improvements in Linux
>> 2.6.16 or FreeBSD 5.5 allow some OS kernels to provide some such
>> improvements "for free" behind our backs without *any* need to write
>> database code.  :-)
-- 
let name="cbbrowne" and tld="ntlug.org" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/lisp.html
"For those  of you who are  into writing programs that  are as obscure
and complicated  as possible, there are opportunities  for... real fun
here" -- Arthur Norman

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] performance drop on RAID5

2005-08-24 Thread Rosser Schwarz
On 8/24/05, Alexandre Barros <[EMAIL PROTECTED]> wrote:

> i wouldn't be so stunned if the newer machine was ( say ) twice as fast
> as the older server, but over three times as fast is disturbing.

RAID5 on so few spindles is a known losing case for PostgreSQL.  You'd
be far, far better off doing a pair of RAID1 sets or a single RAID10
set.

/rls

-- 
:wq

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Read/Write block sizes

2005-08-24 Thread Josh Berkus
Tom, Gavin,


> > To get decent I/O you need 1MB fundamental units all the way down the
> > stack.
>
> It would also be a good idea to have an application that isn't likely
> to change a single bit in a 1MB range and then expect you to record
> that change.  This pretty much lets Postgres out of the picture.

We're looking at this pretty much just for data warehousing, where you 
constantly have gigabytes of data which don't change from month to month or 
even year to year.   I agree that it would *not* be an optimization for OLTP 
systems.  Which is why a build-time option would be fine.

> Ummm... I don't see anything here which will be a win for Postgres. The
> transactional semantics we're interested in are fairly complex:
>
> 1) Modifications to multiple objects can become visible to the system
> atomically
> 2) On error, a series of modifications which had been grouped together
> within a transaction can be rolled back
> 3) Using object version information, determine which version of which
> object is visible to a given session
> 4) Using version information and locking, detect and resolve read/write
> and write/write conflicts

I wasn't thinking of database transactions.  I was thinking specifically of 
using Reiser4 transactions (and other transactional filesystems) to do things 
like eliminate the need for full page writes in the WAL.  Filesystems are 
low-level things which should take care of low-level needs, like making sure 
an 8K page got written to disk even in the event of a system failure.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] performance drop on RAID5

2005-08-24 Thread Joshua D. Drake

Alexandre Barros wrote:


Hello,
i have a pg-8.0.3 running on Linux  kernel  2.6.8,  CPU  Sempron 
2600+, 1Gb RAM on IDE HD ( which could be called a "heavy desktop" ); 
measuring its performance with pgbench ( found in /contrib ) gave 
me an average ( after several runs ) of 170 transactions per second.


That is going to be because IDE drives LIE about write completion because of 
the large cache.


for the sake of experimentation ( actually, i'm scared this IDE drive 
could fail at any time, hence i'm looking for an alternative, more 
"robust", machine ), i've installed it on an aging Compaq Proliant server 
( freshly compiled SMP kernel 2.6.12.5  with preemption ), dual 
Pentium III Xeon 500MHz, 512Mb RAM, (older) SCSI-2 80-pin drives, and 
re-tested. When the database was on a single SCSI drive, pgbench gave 
me an average of 90 transactions per second; but, and that scared me 
most, when the database was on a RAID-5 array ( four 9Gb disks, using 
linux software RAID mdadm and LVM2, with the default filesystem 
cluster size of 32Kb ), the performance dropped to about 55 
transactions per second.



That seems more reasonable and probably truthful. I would be curious 
what type of performance you would get with the exact same
setup EXCEPT remove LVM2. Just have the software RAID. In fact, since 
you have 4 drives you could do RAID 10.




i wouldn't be so stunned if the newer machine was ( say ) twice as fast 
as the older server, but over three times as fast is disturbing.


the postgresql.conf of both machines is here:

max_connections = 50
shared_buffers = 1000   # min 16, at least max_connections*2, 8KB each


You should look at the annotated conf:

http://www.powerpostgresql.com/Downloads/annotated_conf_80.html

Sincerely,

Joshua D. Drake




debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
log_statement = 'all'
log_parser_stats= false
log_planner_stats   = false
log_executor_stats  = false
log_statement_stats = false
lc_messages = 'en_US'   # locale for system error message strings
lc_monetary = 'en_US'   # locale for monetary formatting
lc_numeric = 'en_US'# locale for number formatting
lc_time = 'en_US'   # locale for time formatting

many thanks in advance !


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] performance drop on RAID5

2005-08-24 Thread Merlin Moncure
> Hello,
> i have a pg-8.0.3 running on Linux  kernel  2.6.8,  CPU  Sempron 2600+,
> 1Gb RAM on IDE HD ( which could be called a "heavy desktop" );
> measuring its performance with pgbench ( found in /contrib ) gave me an
> average ( after several runs ) of 170 transactions per second.

170 tps is not plausible on a single platter IDE disk without using
write caching of some kind.  For a 7200 rpm drive any result much over
100 tps is a little suspicious. (my 10k sata raptor can do about 120).
 
> for the sake of experimentation ( actually, i'm scared this IDE drive
> could fail at any time, hence i'm looking for an alternative, more
> "robust", machine ), i've installed it on an aging Compaq Proliant server
> ( freshly compiled SMP kernel 2.6.12.5  with preemption ), dual Pentium
> III Xeon 500MHz, 512Mb RAM, (older) SCSI-2 80-pin drives, and re-tested.
> When the database was on a single SCSI drive, pgbench gave me an average
> of 90 transactions per second; but, and that scared me most, when the
> database was on a RAID-5 array ( four 9Gb disks, using linux software
> RAID mdadm and LVM2, with the default filesystem cluster size of 32Kb ),
> the performance dropped to about 55 transactions per second.

It's natural to see a slight to moderate drop in write performance moving
to RAID 5.  The only RAID levels that are faster than a single disk
for writing are the ones with '0' in them, or setups with caching RAID
controllers.  Even for 0+1, expect modest gains in tps vs. a single disk
if not using write caching.

Merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Some ideas for comment

2005-08-24 Thread Chris Hoover
Ok, there is always a lot of talk about tuning PostgreSQL on linux and
how PostgreSQL uses the linux kernel cache to cache the tables and
indexes.

My question is, is there any way to see what files linux is caching at
this moment?

My reasoning behind this question is:

I have several database systems, each with 1 PostgreSQL cluster. 
However, each cluster has a large number of identical databases on
it. Since there can be a great amount of size disparity between the
databases, I am wondering if some of the slowness we see might be caused
by the kernel cache having to reload a lot of data that keeps getting
swapped out. (most systems have at least 100GB of data/indexes on them
with 8 or 12GB ram).

If this is the case, what sort of symptoms would you expect?

To help mitigate this potential, I have been researching the
following, and am thinking of proposing it to management.  Any
comments would be appreciated.

1.  Implement a partition type layout using views and rules - This
will allow me to have one table in each view with the "active" data,
and the inactive data stored by year in other tables.

So I would have the following (for each major table):

create view some_table as
select * from active_table
union all
select * from table_2005
union all
select * from table_2004
-- etc.

Each table would have identical indexes, however only the
"active_table" would have data that is actively being worked.  The
rules and a nightly job can keep the data correctly stored.
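Roughly, the insert rule routing new rows to the active table would look
something like this (an untested sketch; the table names are placeholders):

create rule some_table_ins as
    on insert to some_table
    do instead
    insert into active_table values (new.*);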

I am thinking that with this setup, the active table indexes should
almost always be in memory.  And, if they do happen to be pushed out,
they are much smaller than the indexes I have today (where all data is
in one table), so they should load faster with less i/o pressure.

From the testing I have done so far, I believe I can implement this
system without having to ask for developer time.  This is a "Good
Thing".

Also, the database is not normalized and is very ugly; by using the
view to partition and abstract the actual data, I will be in a better
position to start normalizing some of the tables w/o developer time
(once again, a "Good Thing")


2.  I am also thinking of recommending we collapse all databases in a
cluster into one "mega" database.  I can then use schemas and views
to control database access and ensure that no customer can see another
customer's data.

This would mean that there is only one set of indexes being loaded
into the cache.  While they would be larger, I think in combination
with the partitioning from idea 1, we would be ahead of the ball game. 
Since there would only be one set of indexes, everyone would be
sharing them, so they should always be in memory.

I don't have real numbers to give you, but we know that our systems
are hurting i/o wise and we are growing by about 2GB+ per week (net). 
We actually grow by about 5GB/week/server.  However, when I run my
weekly maintenance of vacuum full, reindex, and the vacuum analyze, we
end up getting about 3GB back.  Unfortunately, I do not have the i/o
bandwidth to vacuum during the day as it causes major slowdowns on our
system.  Each night, I do run a vacuum analyze across all db's to try
and help.  I also have my fsm parameters set high (800 fsm pages,
and 5000 fsm relations) to try and compensate.

I believe this is only hurting us, as any queries that choose to
tablescan are only getting slower and slower.  Also, obviously, our
indexes are continually growing.  The partitioning should help, as the
actual number of records being worked on in each table is a very small
percentage ( a site may have 1 million records, but only load and work
a few thousand each day).  The archive tables would be doing the most
growing while the active tables should stay small.  Most of the
queries that are tablescanning cannot be fixed, as the database
clusters have been initialized with a non-C locale and won't use
indexes on our queries that use LIKE with a wildcard.
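(As I understand it, 7.4 and later can also work around the LIKE/locale
problem with the pattern_ops operator classes.  Something like this
untested sketch, where the table and column names are placeholders:

create index customers_name_like_idx
    on customers (name varchar_pattern_ops);

With that opclass, a query using LIKE 'abc%' should be able to use the
index even in a non-C locale.)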


Right now, we are still on 7.3.4.  However, these ideas would be
implemented as part of an upgrade to 8.x (plus, we'll initialize the
new clusters with a C locale).

Anyway, I hope this makes sense, and any comments, ideas, and/or
suggestions would be appreciated.

Thanks,

Chris

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] performance drop on RAID5

2005-08-24 Thread Arjen van der Meijden

On 24-8-2005 16:43, Alexandre Barros wrote:

Hello,
i have a pg-8.0.3 running on Linux  kernel  2.6.8,  CPU  Sempron 2600+, 
1Gb RAM on IDE HD ( which could be called a "heavy desktop" ); measuring 
its performance with pgbench ( found in /contrib ) gave me an 
average ( after several runs ) of 170 transactions per second.


Nowadays you can call that a "light desktop", although the amount of RAM 
is a bit more than normal. ;)


for the sake of experimentation ( actually, i'm scared this IDE drive 
could fail at any time, hence i'm looking for an alternative, more 
"robust", machine ), i've installed it on an aging Compaq Proliant server ( 
freshly compiled SMP kernel 2.6.12.5  with preemption ), dual Pentium 


Preemption is afaik counter-productive for a server.

III Xeon 500MHz, 512Mb RAM, (older) SCSI-2 80-pin drives, and re-tested. 
When the database was on a single SCSI drive, pgbench gave me an average 
of 90 transactions per second; but, and that scared me most, when the 
database was on a RAID-5 array ( four 9Gb disks, using linux software 
RAID mdadm and LVM2, with the default filesystem cluster size of 32Kb ), 
the performance dropped to about 55 transactions per second.


The default disk io scheduler of the 2.6-series is designed for disks or 
controllers that have no command queueing (like most standard 
IDE-disks). Try changing the default "anticipatory" scheduler on the 
test-device to "deadline" or "cfq" (see the two *-iosched.txt files in 
/usr/src/linux/Documentation/block/ for more information).
Changing is simple with a 2.6.11+ kernel, just do "echo 'deadline' > 
/sys/block/*devicename*/queue/scheduler" at runtime.



Despite the difference in RAM, neither machine seems to be swapping.


But there is an extra 512MB of file cache, which can make a 
significant difference.



All filesystems ( on both machines ) are Reiserfs.
Both pg-8.0.3 were compiled with CFLAGS -O3 and -mtune for their 
respective architectures... and "gmake -j2" on the server.
Both machines have an original ( except by the pg and the kernel ) 
Mandrake 10.1 install.


I've googled a little, and maybe the cluster size might be one problem, 
but despite that, the performance drop when running on 
"server-class" hardware with RAID-5 SCSI-2 drives was way beyond my most 
delirious expectations... i need some help to figure out what is **so** 
wrong...


Did you consider that you may be overestimating the raid's performance and 
usage? If the benchmark was mostly run from memory, you're not going to see 
much gain in performance from a faster disk.
But even worse is that for sequential reads and writes, the performance 
of current (large) IDE drives is very good. It may actually outperform 
your RAID on that one.
Random access will probably still be slower, but may not be that much 
slower. And if the database resides in memory, that doesn't matter much 
anyway.


i wouldn't be so stunned if the newer machine was ( say ) twice as fast 
as the older server, but over three times as fast is disturbing.


I'm actually not surprised. Old scsi disks are not faster than new IDE 
ones anymore, although they still may be a bit faster on random access 
or under (very) high load.


Especially if:
- you only ran it with 1 client
- the database mostly or entirely fits in the desktop's memory
- the database did not fit entirely in the server's memory.

Even worse would be if the database does fit entirely in the desktop's 
memory, but not in the server's!


Please don't forget your server probably has much slower memory access; 
it will likely have 133MHz SDR RAM instead of your current DDR-2700 or so. 
The latter is much faster (in theory more than twice as fast).
Your desktop cpu will very likely, even when multiple processes exist, 
be faster, especially with the faster memory accesses. The Xeons 
probably only beat it on the amount of cache.


So please check if pgbench actually makes much use of the disk; if it 
does, check how large the test databases will be, etc, etc.


Btw, if you'd prefer to use your desktop, but are afraid of the 
IDE-drive dying on you, buy a "server class" SATA disk. Most 
manufacturers have those; Western Digital even has "scsi like" sata 
disks (the Raptor drives). They generally have 3 to 5 years warranty and 
higher class components.


Best regards,

Arjen

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread mark
On Wed, Aug 24, 2005 at 09:21:12AM -0400, Donald Courtney wrote:
> I built PostgreSQL 8.1 64-bit on solaris 10 a few months ago
> and side by side with the 32 bit postgreSQL build saw no improvement. 
> In fact the 64 bit result was slightly lower.

I've had this sort of argument with a friend of mine who works at a
retail computer sales company who always tries to pitch 64-bit
platforms to me (I don't have one yet).

There are a few issues in here that are hard to properly detach to
allow for a fair comparison.

The first, always to remember, is that the move from 32 bits to
64 bits doesn't come for free. In a real 64-bit system with a
64-bit operating system, and 64-bit applications, pointers are
now double their 32-bit size. This means more bytes to copy around
memory, and in an extreme case, has the potential to approach
doubling the memory traffic needed to access many such pointers from
RAM, and halving the effective amount of RAM. In real world cases,
not everything is a pointer, so this sort of performance degradation
is doubtful - but it is something to keep in mind.

In response to this, it appears that, at least on the Intel/AMD side
of things, they've increased the bandwidth on the motherboard, and
allowed for faster memory to be connected to the motherboard. They've
increased the complexity of the chip, to allow 64-bit register
operations to be equivalent in speed to 32-bit register operations.
I have no idea what else they've done... :-)

So, it may be difficult to properly compare a 32-bit system to a
64-bit system. Even if the Ghz on the chip appears equal, it isn't
the same chip, and unless it is the exact same make, product and
version of the motherboard, it may not be a fair comparison. Turning
support for 32-bit on or off, and using a kernel that is only 32-bit
may give good comparisons - but with the above explanation, I would
expect the 32-bit application + kernel to out-perform the 64-bit
application.

So then we move on to what 64-bit is really useful for. Obviously,
there is the arithmetic. If you were previously doing 64-bit
arithmetic through software, you will notice an immediate speed
improvement when doing it through hardware instead. If you have
a program that is scanning memory in any way, it may benefit from
64-bit instructions (for example - copying data 64-bit words at
a time instead of 32-bit words at a time). PostgreSQL might benefit
slightly from either of these, slightly balancing the performance
degradation of using more memory to store the pointers, and more
memory bandwidth to access the pointers.

The real benefit of 64-bit is address space. From the kernel
perspective, it means that more programs, or bigger programs can run
at once. From the application perspective, it means your application
can use more than 32-bits of address space. For programs that make
extensive use of mmap(), this can be a necessity. They are mapping
very large files into their own address space. This isn't a
performance boost, as much as it is a 'you can't do it', if the
files mmap()'ed at the same time will not fit within 32 bits of
address space. This also becomes, potentially, a performance
degradation, as the system is now having to manage applications
that have very large page tables. Page faults may become
expensive.

PostgreSQL uses read(), instead of mmap(), and uses <2 Gbyte files.
PostgreSQL doesn't require the additional address space for normal
operation.

If, however, you happen to have a very large amount of physical memory
- more memory than is supported by a 32-bit system, but is supported
by your 64-bit system, then the operating system should be able to use
this additional physical memory to cache file system data pages, which
will benefit PostgreSQL if used with tables that are larger than the
memory supported by your 32-bit system, and which have queries which
require more pages than the memory supported by your 32-bit system to
be frequently accessed. If you have a huge database, with many clients
accessing the data, this would be a definite yes. With anything less,
it is a maybe, or a probably not.

I've been looking at switching to 64-bit, mostly to benefit from the
better motherboard bandwidth, and just to play around. I'm not
expecting to require the 64-bit instructions.

Hope this helps,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread William Yu

Donald Courtney wrote:

I built PostgreSQL 8.1 64-bit on solaris 10 a few months ago
and side by side with the 32 bit PostgreSQL build saw no improvement. In 
fact the 64 bit result was slightly lower.


I'm not surprised 32-bit binaries running on a 64-bit OS would be faster 
than 64-bit/64-bit. 64-bit isn't some magical wand you wave and it's all 
ok. Programs compiled as 64-bit will only run faster if (1) you need 
64-bit address space and you've been using ugly hacks like PAE to get 
access to memory > 2GB or (2) you need native 64-bit data types and 
you've been using ugly hacks to piece 32-bit ints together (example, 
encryption/compression). In most cases, 64-bit will run slightly slower 
due to extra overhead of using larger datatypes.


Since PostgreSQL hands off the majority of memory management/data 
caching to the OS, only the OS needs to be 64-bit to reap the benefits 
of better memory management. Since Postgres *ALREADY* reaps the 64-bit 
benefit, I'm not sure how the argument for moving caching/mm/fs into 
Postgres would apply. Yes there's the point about possibly implementing 
better/smarter/more appropriate caching algorithms but that has nothing 
to do with 64-bit.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Some ideas for comment

2005-08-24 Thread Merlin Moncure
> Ok, there is always a lot of talk about tuning PostgreSQL on linux and
> how PostgreSQL uses the linux kernel cache to cache the tables and
> indexes.
[...]
> 
> 1.  Implement a partition type layout using views and rules - This
> will allow me to have one table in each view with the "active" data,
> and the inactive data stored by year in other tables.
> 
> So I would have the following (for each major table):
> 
> Table View as
> select * from active_table
> union all
> select * from table_2005
> union all
> select * from table_2004
> etc.

Linux does a pretty good job of deciding what to cache.  I don't think
this will help much.  You can always look at partial indexes too.
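For example, something like this (sketch only; the table and column names
are made up) keeps the hot part of an index small and cache-resident:

create index orders_active_idx on orders (order_date)
    where status = 'active';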

> 2.  I am also thinking of recommending we collapse all databases in a
> cluster into one "mega" database.  I can then use schema's and views
> to control database access and ensure that no customer can see another
> customers data.

hm. keep in mind views are tightly bound to the tables they are created
with (views can't 'float' over tables in different schemas).  pl/pgsql
functions can, though.  This is a more efficient use of server
resources, IMO, but not a windfall.
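For example (untested sketch): an unqualified table reference in a
pl/pgsql function is resolved against the session's search_path, so one
function definition can serve every customer schema:

-- 'orders' resolves via search_path; plans are cached per session,
-- so each connection binds to its own schema once
create function active_order_count() returns integer as '
declare
    n integer;
begin
    select count(*) into n from orders where status = ''active'';
    return n;
end;
' language plpgsql;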
 
> This would mean that there are only one set of indexes being loaded
> into the cache.  While they would be larger, I think in combination
> with the partition from idea 1, we would be ahead of the ball game.
> Since there would only be one set of indexes, everyone would be
> sharing them so they should always be in memory.

I would strongly consider adding more memory :).
 
> I don't have real numbers to give you, but we know that our systems
> are hurting i/o wise and we are growing by about 2GB+ per week (net).
> We actually grow by about 5GB/week/server.  However, when I run my
> weekly maintenance of vacuum full, reindex, and the vacuum analyze, we
> end up getting about 3GB back.  Unfortunately, I do not have the i/o
> bandwidth to vacuum during the day as it causes major slowdowns on our
> system.  Each night, I do run a vacuum analyze across all db's to try
> and help.  I also have my fsm parameters set high (800 fsm pages,
> and 5000 fsm relations) to try and compensate.

Generally, you can reduce data turnover for the same workload by
normalizing your database.  IOW, try and make your database more
efficient in the way it stores data.

> Right now, we are still on 7.3.4.  However, these ideas would be
> implemented as part of an upgrade to 8.x (plus, we'll initialize the
> new clusters with a C locale).

yes, do this!

Merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread Alan Stange

[EMAIL PROTECTED] wrote:

So then we move on to what 64-bit is really useful for. Obviously,
there is the arithmetic. If you were previously doing 64-bit
arithmetic through software, you will notice an immediate speed
improvement when doing it through hardware instead. If you have
a program that is scanning memory in any way, it may benefit from
64-bit instructions (for example - copying data 64-bit words at
a time instead of 32-bit words at a time). PostgreSQL might benefit
slightly from either of these, slightly balancing the performance
degradation of using more memory to store the pointers, and more
memory bandwidth the access the pointers.
  
At least on Sparc processors, v8 and newer, any double precision math 
(including longs) is performed with a single instruction, just like for 
a 32 bit datum.  Loads and stores of 8 byte datums are also handled via 
a single instruction.   The urban myth that 64bit math is 
different/better on a 64 bit processor is just that;  yes, some lower 
end processors would emulate/trap those instructions but that's an 
implementation detail, not architecture.  I believe that this is all 
true for other RISC processors as well.


The 64bit API on UltraSparcs does bring along some extra FP registers IIRC.


If, however, you happen to have a very large amount of physical memory
- more memory than is supported by a 32-bit system, but is supported
by your 64-bit system, then the operating system should be able to use
this additional physical memory to cache file system data pages, which
will benefit PostgreSQL if used with tables that are larger than the
memory supported by your 32-bit system, and which have queries which
require more pages than the memory supported by your 32-bit system to
be frequently accessed. If you have a huge database, with many clients
accessing the data, this would be a definate yes. With anything less,
it is a maybe, or a probably not.
  
Solaris, at least, provided support for far more than 4GB of physical 
memory on 32 bit kernels.  A newer 64 bit kernel might be more 
efficient, but that's just because the time was taken to support large 
page sizes and more efficient data structures.  It's nothing intrinsic 
to a 32 vs 64 bit kernel.


-- Alan

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Some ideas for comment

2005-08-24 Thread Alvaro Herrera
On Wed, Aug 24, 2005 at 12:56:54PM -0400, Chris Hoover wrote:

> I don't have real numbers to give you, but we know that our systems
> are hurting i/o wise and we are growing by about 2GB+ per week (net). 
> We actually grow by about 5GB/week/server.  However, when I run my
> weekly maintenance of vacuum full, reindex, and the vacuum analyze, we
> end up getting about 3GB back.  Unfortunately, I do not have the i/o
> bandwidth to vacuum during the day as it causes major slowdowns on our
> system.  Each night, I do run a vacuum analyze across all db's to try
> and help.  I also have my fsm parameters set high (800 fsm pages,
> and 5000 fsm relations) to try and compensate.

[...]

> Right now, we are still on 7.3.4.  However, these ideas would be
> implemented as part of an upgrade to 8.x (plus, we'll initialize the
> new clusters with a C locale).

If you were on a newer version, I'd suggest that you use the cost-based
vacuum delay, and vacuum at least some of the tables more often.  This
way you can reduce the continual growth of the data files without
affecting day-to-day performance, because you allow the VACUUM-inflicted
I/O to be interleaved by normal query execution.

Sadly (for you), I think the cost-based vacuum delay feature was only
introduced in 8.0.
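
For reference, the knobs in 8.0's postgresql.conf look like this (the
values here are only illustrative and need tuning per system):

vacuum_cost_delay = 10  # ms to sleep once the cost limit is reached (0 disables)
vacuum_cost_limit = 200 # accumulated page-cost budget before sleeping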

-- 
Alvaro Herrera ()
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Some ideas for comment

2005-08-24 Thread Tom Lane
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
>> Right now, we are still on 7.3.4.  However, these ideas would be
>> implemented as part of an upgrade to 8.x (plus, we'll initialize the
>> new clusters with a C locale).

> yes, do this!

Moving from 7.3 to 8.0 is alone likely to give you a noticeable
performance boost.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread mark
On Wed, Aug 24, 2005 at 02:47:09PM -0400, Alan Stange wrote:
> At least on Sparc processors, v8 and newer, any double precision math 
> (including longs) is performed with a single instruction, just like for 
> a 32 bit datum.  Loads and stores of 8 byte datums are also handled via 
> a single instruction.   The urban myth that 64bit math is 
> different/better on a 64 bit processor is just that;  yes, some lower 
> end processors would emulate/trap those instructions but that's an 
> implementation detail, not architecture.

It isn't an urban myth that 64-bit math on a 64-bit processor is
faster, at least if done using registers. It definitely is faster.

It may be an urban myth, though, that most applications perform
a sufficient amount of 64-bit arithmetic to warrant the upgrade.
The benefit may be lost in the noise for an application such as
PostgreSQL. It takes, effectively, infinitely longer to access
a disk page, than to increment a 64-bit integer in software.

For the lower end processors that emulate/trap these instructions,
they are being performed in software, along with the overhead of a
trap, and are therefore not a single instruction any more. We are
coming at this from different sides (which is good - perspective is
always good :-) ). From the Intel/AMD side of things, ALL non 64-bit
platforms are 'lower end processors', and don't emulate/trap the
instructions as they didn't exist (at least not yet - who knows what
clever and sufficiently motivated people will do :-) ).

> >If, however, you happen to have a very large amount of physical memory
> >- more memory than is supported by a 32-bit system, but is supported
> >by your 64-bit system, then the operating system should be able to use
> >this additional physical memory to cache file system data pages, which
> >will benefit PostgreSQL if used with tables that are larger than the
> >memory supported by your 32-bit system, and which have queries which
> >require more pages than the memory supported by your 32-bit system to
> >be frequently accessed. If you have a huge database, with many clients
> >accessing the data, this would be a definate yes. With anything less,
> >it is a maybe, or a probably not.
> Solaris, at least, provided support for far more than 4GB of physical 
> memory on 32 bit kernels.  A newer 64 bit kernel might be more 
> efficient, but that's just because the time was taken to support large 
> page sizes and more efficient data structures.  It's nothing intrinsic 
> to a 32 vs 64 bit kernel.

Hehe. That's why I was so careful to qualify my statements. :-)

But yeah, I agree. It's a lot of hype, for not much gain (and some
loss, depending on what it is being used for). I only want one because
they're built better, and because I want to play around.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] Performance indexing of a simple query

2005-08-24 Thread Mark Fox
I have a table called 'jobs' with several million rows, and the only
columns that are important to this discussion are 'start_time' and
'completion_time'.

The sort of queries I want to execute (among others) are like:

SELECT * FROM jobs
WHERE completion_time > SOMEDATE AND start_time < SOMEDATE;

In plain english:  All the jobs that were running at SOMEDATE.  The
result of the query is on the order of 500 rows.

I've got separate indexes on 'start_time' and 'completion_time'.

Now, if SOMEDATE is such that the number of rows with
completion_time > SOMEDATE is small (say 10s of thousands), the query
uses index scans and executes quickly.  If not, the query uses
sequential scans and is unacceptably slow (a couple of minutes).  I've
used EXPLAIN and EXPLAIN ANALYZE to confirm this.  This makes perfect
sense to me.

I've played with some of the memory settings for PostgreSQL, but none
has had a significant impact.

Any ideas on how to structure the query or add/change indexes in such
a way to improve its performance?  In desperation, I tried using a
subquery, but unsurprisingly it made no (positive) difference.  I feel
like there might be a way of using an index on both 'completion_time'
and 'start_time', but can't put a temporal lobe on the details.


Mark

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Some ideas for comment

2005-08-24 Thread Chris Hoover
On 8/24/05, Merlin Moncure <[EMAIL PROTECTED]> wrote:
> > Ok, there is always a lot of talk about tuning PostgreSQL on linux and
> > how PostgreSQL uses the linux kernel cache to cache the tables and
> > indexes.
> [...]
> >
> > 1.  Implement a partition type layout using views and rules - This
> > will allow me to have one table in each view with the "active" data,
> > and the inactive data stored by year in other tables.
> >
> > So I would have the following (for each major table):
> >
> > Table View as
> > select * from active_table
> > union all
> > select * from table_2005
> > union all
> > select * from table_2004
> > etc.
> 
> Linux does a pretty good job of deciding what to cache.  I don't think
> this will help much.  You can always look at partial indexes too.
> 
Yes, but won't this reduce the amount that needs to be cached?  If I have
10,000,000 rows in a table, but only 4,000 are active, and I move those
4,000 to another table and link the tables via a view, should that not
help keep the 9,996,000 inactive rows out of the kernel cache (the
majority of the time at least)?

This would mean I have more room for other objects and hopefully less
turnover in the cache, and less disk i/o.

Yes?
[...]
> I would strongly consider adding more memory :).
Unfortunately, it looks like 12GB is all our Dell servers can handle. :(

> 
> > I don't have real numbers to give you, but we know that our systems
> > are hurting i/o wise and we are growing by about 2GB+ per week (net).
> > We actually grow by about 5GB/week/server.  However, when I run my
> > weekly maintenance of vacuum full, reindex, and the vacuum analyze, we
> > end up getting about 3GB back.  Unfortunately, I do not have the i/o
> > bandwidth to vacuum during the day as it causes major slowdowns on our
> > system.  Each night, I do run a vacuum analyze across all db's to try
> > and help.  I also have my fsm parameters set high (800 fsm pages,
> > and 5000 fsm relations) to try and compensate.
> 
> Generally, you can reduce data turnover for the same workload by
> normalizing your database.  IOW, try and make your database more
> efficient in the way it stores data.
> 
That's the ultimate goal, but this database structure was developed
and released into production before I started work here.  I'm trying
to slowly change it into a better db, but it is a slow process.
Normalization does not make it to the top of the priority list,
unfortunately.

> > Right now, we are still on 7.3.4.  However, these ideas would be
> > implemented as part of an upgrade to 8.x (plus, we'll initialize the
> > new clusters with a C locale).
> >
> > 2.  I am also thinking of recommending we collapse all databases in a
> > cluster into one "mega" database.  I can then use schema's and views
> > to control database access and ensure that no customer can see another
> > customers data.
> 
> hm. keep in mind views are tightly bound to the tables they are created
> with (views can't 'float' over tables in different schemas).  pl/pgsql
> functions can, though.  This is a more efficient use of server
> resources, IMO, but not a windfall.

This I know.  Each schema would have to have a "custom" set of views
replacing the tables, with each view programmed to return only that
customer's data.

I was thinking of putting all of the tables in a my_tables schema, with
the views all querying the tables stored there.  I would add an
identifying column to each table so that I can differentiate the data.

Chris
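
A minimal sketch of that arrangement (the schema names, columns, ids,
and user are all illustrative):

-- Shared tables live in one schema, tagged with a customer identifier.
CREATE SCHEMA my_tables;
CREATE TABLE my_tables.invoices (
    customer_id integer NOT NULL,
    invoice_id  integer NOT NULL
    -- , ... remaining columns
);

-- Each customer gets a schema containing filtering views.
CREATE SCHEMA customer_a;
CREATE VIEW customer_a.invoices AS
    SELECT * FROM my_tables.invoices
    WHERE customer_id = 1;    -- customer_a's id (hypothetical)

-- Lock customers out of the base tables so the views are the only path.
CREATE USER customer_a_user;
REVOKE ALL ON SCHEMA my_tables FROM PUBLIC;
GRANT USAGE ON SCHEMA customer_a TO customer_a_user;
GRANT SELECT ON customer_a.invoices TO customer_a_user;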

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Read/Write block sizes

2005-08-24 Thread Jignesh Shah
Agreed!!!

But the knowledge needed to auto-tune an application comes from years of 
understanding how users actually use the so-called "knobs". But if the "knobs" 
are not there in the first place, how do you know what people are using?

The "so-called" big boys are also using their knowledge base of what works for 
the customer in their autonomic self healers and its based on the experience of 
all the settings possible and based on service requests on what had failed that 
they get the knowledge about avoiding what fails and tuning what works. 

Remember "recompiling" is a risk with upteem number of variables which not 
every production release engineer is happy about.

It's easy to change a knob back to its previous value, rather than trying to 
figure out how to get the old binaries back.


-Jignesh


- Original Message -
From: Lance Obermeyer <[EMAIL PROTECTED]>
Date: Wednesday, August 24, 2005 4:10 pm
Subject: RE: Read/Write block sizes

> Since Bruce referred to the "corporate software world" I'll chime 
> in...
> It has been a while since adding knobs and dials has been 
> considered a good idea.  Customers are almost always bad at tuning 
> their systems, which decreases customer satisfaction.  While many 
> people assume the corporate types don't care, that is actually far 
> from the truth.  Well run commercial software companies regularly 
> commission (expensive) customer satisfaction surveys.  These 
> numbers are the second most important numbers in all of the 
> enterprise, trailing only revenue in importance.  Results are 
> sliced and diced in every way imaginable.
> 
> The commercial world is trying to auto-tune their systems just as 
> much.  Examples are the work that many of the big boys are doing 
> towards "autonomic" computing.  While it is driven by naked self 
> interest of wanting to sell version upgrades, those efforts 
> increase customer satisfaction and decrease support costs.  Works 
> well for everyone...
> 
> 
> 
> -Original Message-
> From: Bruce Momjian [EMAIL PROTECTED]
> Sent: Wednesday, August 24, 2005 8:52 AM
> To: Jignesh K. Shah
> Cc: Jim Nasby; Chris Browne; pgsql-performance@postgresql.org
> Subject: Re: Read/Write block sizes
> 
> 
> 
> This thread covers several performance ideas.  First is the idea that
> more parameters should be configurable.  While this seems like a noble
> goal, we try to make parameters auto-tuning, or if users have to
> configure it, the parameter should be useful for a significant number
> of users.
> 
> In the commercial software world, if you can convince your boss that a
> feature/knob is useful, it usually gets into the product. 
> Unfortunately, this leads to the golden doorknob on a shack, where some
> features are out of sync with the rest of the product in terms of
> usefulness and utility.  With open source, if a feature cannot be
> auto-tuned, or has significant overhead, the feature has to be
> implemented and then proven to be a benefit.
> 
> In terms of adding async I/O, threading, and other things, it might
> make sense to explore how these could be implemented in a way that
> fits the above criteria.
> 
> 
> ---
> 
> Jignesh K. Shah wrote:
> > Hi Jim,
> > 
> > | How many of these things are currently easy to change with a recompile?
> > | I should be able to start testing some of these ideas in the near
> > | future, if they only require minor code or configure changes.
> > 
> > The following
> > * Data File Size   1GB
> > * WAL File Size of 16MB
> > * Block Size  of 8K
> > 
> > Are very easy to change with a recompile.. A tunable would be greatly
> > preferred as it will allow one binary for different tunings
> > 
> > * MultiBlock read/write
> > 
> > Is not available but will greatly help in reducing the number of
> > system calls, which will only increase as the size of the database
> > increases if something is not done about it.
> > 
> > * Pregrown files... maybe not important at this point since TABLESPACE
> > can currently work around it a bit (just need to create a different
> > file system for each tablespace).
> > 
> > But if you really think hardware & OS is the answer for all small
> > things.. I think we should now start to look at how to make Postgres
> > multi-threaded or multi-processed for each connection.  With the
> > influx of "Dual-Core" or "Multi-Core" being the fad, Postgres can
> > have the cutting edge if exploiting cores is somehow designed in.
> > 
> > Somebody mentioned that adding a CPU to a Postgres workload halved
> > the average CPU usage...
> > YEAH... PostgreSQL uses only 1 CPU per connection (assuming 100%
> > usage), so if you add another CPU it is idle anyway and the system
> > will report only 50%  :-)  BUT the important thing to measure is
> > whether the query time was cut down or not?  ( No flames, I am sure
> > you were talking about

Re: [PERFORM] Caching by Postgres

2005-08-24 Thread PFC



At least on Sparc processors, v8 and newer, any double precision math  
(including longs) is performed with a single instruction, just like for  
a 32 bit datum.  Loads and stores of 8 byte datums are also handled via  
a single instruction.   The urban myth that 64bit math is  
different/better on a 64 bit processor is just that;  yes, some lower  
end processors would emulate/trap those instructions but that's an  
implementation detail, not architecture.  I believe that this is all  
true for other RISC processors as well.


The 64bit API on UltraSparcs does bring along some extra FP registers  
IIRC.


It's very different on x86.
	64-bit x86 like the Opteron has more registers, which are very scarce on  
the base x86 (8 I think). This alone is very important. There are other  
factors as well.


Solaris, at least, provided support for far more than 4GB of physical  
memory on 32 bit kernels.  A newer 64 bit kernel might be more  
efficient, but that's just because the time was taken to support large  
page sizes and more efficient data structures.  It's nothing intrinsic  
to a 32 vs 64 bit kernel.


	Well, on a large working set, a processor which can directly address more  
than 4GB of memory will be a lot faster than one which can't, and has to  
play with the MMU and paging units !


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Performance indexing of a simple query

2005-08-24 Thread Jim C. Nasby
Try

CREATE INDEX start_complete ON jobs( start_time, completion_time );

Try also completion_time, start_time. One might work better than the
other. Or, depending on your data, you might want to keep both.

In 8.1 you'll be able to do bitmap-based index combination, which might
allow making use of the separate indexes.
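
To illustrate what that combination buys, the 8.1 planner can AND the
bitmaps of the two existing single-column indexes before touching the
heap. Roughly (a plan shape sketched from memory, not actual output;
the index names are hypothetical and the real plan depends on
statistics):

-- EXPLAIN SELECT * FROM jobs
--  WHERE completion_time > SOMEDATE AND start_time < SOMEDATE;
--
--  Bitmap Heap Scan on jobs
--    ->  BitmapAnd
--          ->  Bitmap Index Scan on completion_time_idx
--          ->  Bitmap Index Scan on start_time_idx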

On Wed, Aug 24, 2005 at 02:43:51PM -0600, Mark Fox wrote:
> I have a table called 'jobs' with several million rows, and the only
> columns that are important to this discussion are 'start_time' and
> 'completion_time'.
> 
> The sort of queries I want to execute (among others) are like:
> 
> SELECT * FROM jobs
> WHERE completion_time > SOMEDATE AND start_time < SOMEDATE;
> 
> In plain English:  All the jobs that were running at SOMEDATE.  The
> result of the query is on the order of 500 rows.
> 
> I've got separate indexes on 'start_time' and 'completion_time'.
> 
> Now, if SOMEDATE is such that the number of rows with completion_time
> > SOMEDATE is small (say 10s of thousands), the query uses index scans
> and executes quickly.  If not, the query uses sequential scans and is
> unacceptably slow (a couple of minutes).  I've used EXPLAIN and
> EXPLAIN ANALYZE to confirm this.  This makes perfect sense to me.
> 
> I've played with some of the memory settings for PostgreSQL, but none
> has had a significant impact.
> 
> Any ideas on how to structure the query or add/change indexes in such
> a way to improve its performance?  In desperation, I tried using a
> subquery, but unsurprisingly it made no (positive) difference.  I feel
> like there might be a way of using an index on both 'completion_time'
> and 'start_time', but can't put a temporal lobe on the details.
> 
> 
> Mark
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org
> 

-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Softwarehttp://pervasive.com512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread Alan Stange

[EMAIL PROTECTED] wrote:

On Wed, Aug 24, 2005 at 02:47:09PM -0400, Alan Stange wrote:
  
At least on Sparc processors, v8 and newer, any double precision math 
(including longs) is performed with a single instruction, just like for 
a 32 bit datum.  Loads and stores of 8 byte datums are also handled via 
a single instruction.   The urban myth that 64bit math is 
different/better on a 64 bit processor is just that;  yes, some lower 
end processors would emulate/trap those instructions but that's an 
implementation detail, not architecture.



It isn't an urban myth that 64-bit math on a 64-bit processor is
faster, at least if done using registers. It definately is faster.
  
The older 32bit RISC processors do have 64 bit registers, ALUs and 
datapaths, and they are marketed toward high end scientific computing, 
and you're claiming that such a processor is slower than one which 
merely adds 64 bit pointers?


As an example, an UltraSparc running a 32 bit kernel+application will 
have the same double precision floating point performance as one 
running  a 64bit kernel+application (except for the additional FP 
registers in the 64bit API).  For a function like daxpy, it's the exact 
same hardware running the exact same instructions!  So why do you think 
the performance would be different?


I believe the IBM Power processors also upped everything to double 
precision internally because of some details of the "multiply-add fused" 
instructions.  It's been a few years since I taught H&P to CS 
undergrads, but I'm fairly sure the details are all the same for MIPS 
processors as well. 


-- Alan




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread Michael Stone

On Wed, Aug 24, 2005 at 03:34:41PM -0400, [EMAIL PROTECTED] wrote:

It isn't an urban myth that 64-bit math on a 64-bit processor is
faster, at least if done using registers. It definately is faster.
It may be an urban myth, though, that most applications perform
a sufficient amount of 64-bit arithmetic to warrant the upgrade.


The major problem is that the definition of "64bit processor" is fuzzy.
The major slowdown of "64bitness" is the necessity of carting around 
64 bit pointers. It's not, however, necessary to do 64bit pointers to
get 64bit registers & fast 64 bit ops. E.g., SGI has "n32" & "n64" ABIs
which can access exactly the same instruction set & registers; the
difference between them is the size of pointers and whether a "long" is
the same as a "long long". Any discussion of "64 bit processors" is
doomed from the start because people tend to start making implementation
assumptions on top of an already vague concept. Current & future
discussions are tinged by the fact that amd64 *doubles* the number
of registers in 64 bit mode, potentially providing a major speedup--but
one that doesn't really have anything to do with being "64bit".
Pretty much any discussion of 64 bit mode really needs to be a
discussion of a particular ABI on a particular processor; talking about
"64 bit processors" abstractly is a waste of time.

Mike Stone

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Read/Write block sizes

2005-08-24 Thread Jim C. Nasby
On Wed, Aug 24, 2005 at 12:12:22PM -0400, Chris Browne wrote:
> Everyone involved in development seems to me to have a reasonably keen
> understanding as to what the potential benefits of threading are; the
> value is that there fall out plenty of opportunities to parallelize
> the evaluation of portions of queries.  Alas, it wouldn't be until
> *after* all the effort goes in that we would get any idea as to what
> kinds of speedups this would provide.

My understanding is that the original suggestion was to use threads
within individual backends to allow for parallel query execution, not
switching to a completely thread-based model.

In any case, there are other ways to enable parallelism without using
threads, such as handing actual query execution off to a set of
processes.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Softwarehttp://pervasive.com512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread mark
On Wed, Aug 24, 2005 at 05:09:04PM -0400, Alan Stange wrote:
> The older 32bit RISC processors do have 64 bit registers, ALUs and 
> datapaths, and they are marketed toward high end scientific computing, 
> and you're claiming that such a processor is slower than one which 
> merely adds 64 bit pointers?

No. I'm claiming that you are talking about a hybrid 64/32 processor,
and that this isn't fair to declare that 64-bit arithmetic units don't
provide benefit for 64-bit math. :-)

> As an example, an UltraSparc running a 32 bit kernel+application will 
> have the same double precision floating point performance as one 
> running  a 64bit kernel+application (except for the additional FP 
> registers in the 64bit API).  For a function like daxpy, it's the exact 
> same hardware running the exact same instructions!  So why do you think 
> the performance would be different?

Double precision floating point isn't 64-bit integer arithmetic. I think
this is all a little beside the point. If your point is that the SPARC
was designed well, I agree with you.

I won't agree that a SPARC with 64-bit registers should be considered
a 32-bit machine. The AMD 64-bit machines come in two forms as well -
the ones that allow you to use the 64-bit integer registers (not
floating point! those are already 80-bit!), and the ones that allow
you to address more memory. I wouldn't consider either to be a 32-bit
CPU, although they will allow 32-bit applications to run fine.

> I believe the IBM Power processors also upped everything to double 
> precision internally because of some details of the "multiply-add fused" 
> instructions.  It's been a few years since I taught H&P to CS 
> undergrads, but I'm fairly sure the details are all the same for MIPS 
> processors as well. 

Smart design, that obscures the difference - but doesn't make the
difference a myth. If it's already there, then it's already there,
and we can't talk as if it isn't.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Performance indexing of a simple query

2005-08-24 Thread Tom Lane
Mark Fox <[EMAIL PROTECTED]> writes:
> The sort of queries I want to execute (among others) are like:
> SELECT * FROM jobs
> WHERE completion_time > SOMEDATE AND start_time < SOMEDATE;
> In plain english:  All the jobs that were running at SOMEDATE.

AFAIK there is no good way to do this with btree indexes; the problem
is that it's fundamentally a 2-dimensional query and btrees are
1-dimensional.  There are various hacks you can try if you're willing
to constrain the problem (eg, if you can assume some not-very-large
maximum on the running time of jobs) but in full generality btrees are
just the Wrong Thing.
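
A sketch of that "constrain the problem" hack, assuming (purely for
illustration) that no job ever runs longer than 30 days; the extra
lower bound gives a plain btree on start_time a selective range to
scan:

SELECT * FROM jobs
WHERE completion_time > SOMEDATE
  AND start_time < SOMEDATE
  AND start_time > SOMEDATE - interval '30 days';  -- assumed maximum job duration

This returns wrong answers as soon as any job exceeds the assumed
bound, which is exactly the constraint being accepted.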

So what you want to look at is a non-btree index, ie, rtree or gist.
For example, the contrib/seg data type could pretty directly be adapted
to solve this problem, since it can index searches for overlapping
line segments.
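
A hedged sketch of how that adaptation might look, mapping each job to
a one-dimensional segment. Since seg stores single-precision floats,
the endpoints here are (hypothetically) days since 2000-01-01 rather
than raw epoch seconds, to stay within float4 precision:

-- Load contrib/seg into the database first
-- (e.g. psql -f contrib/seg.sql from the source tree's contrib area).
ALTER TABLE jobs ADD COLUMN running seg;

-- Populate "running" as 'start .. completion' in days since 2000-01-01;
-- on releases without a text-to-seg cast this takes a small helper
-- function or a client-side script (omitted here).

CREATE INDEX jobs_running_gist ON jobs USING gist (running);

-- "All the jobs running at SOMEDATE" becomes an overlap test against a
-- single point; day 2063 is roughly 2005-08-24:
SELECT * FROM jobs WHERE running && '2063'::seg;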

The main drawback of these index types in existing releases is that they
are bad on concurrent updates and don't have WAL support.  Both those
things are (allegedly) fixed for GIST in 8.1 ... are you interested in
trying out 8.1beta?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] RAID arrays (and vendors)

2005-08-24 Thread Jim C. Nasby
I'm looking for an external RAID array (with external controller);
either ~8 15kRPM SCSI drives or something with more SATA drives. This
will be used in a test environment and could get moved between machines,
so I'd like something with its own RAID controller. Support for a broad
range of OSes is important.

Can anyone recommend hardware as well as vendors? Feel free to reply
off-list.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Softwarehttp://pervasive.com512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Read/Write block sizes

2005-08-24 Thread Lance Obermeyer
Since Bruce referred to the "corporate software world" I'll chime in...

It has been a while since adding knobs and dials has been considered a good 
idea.  Customers are almost always bad at tuning their systems, which decreases 
customer satisfaction.  While many people assume the corporate types don't 
care, that is actually far from the truth.  Well run commercial software 
companies regularly commission (expensive) customer satisfaction surveys.  
These numbers are the second most important numbers in all of the enterprise, 
trailing only revenue in importance.  Results are sliced and diced in every way 
imaginable.

The commercial world is trying to auto-tune their systems just as much.  
Examples are the work that many of the big boys are doing towards "autonomic" 
computing.  While it is driven by naked self interest of wanting to sell 
version upgrades, those efforts increase customer satisfaction and decrease 
support costs.  Works well for everyone...



-Original Message-
From: Bruce Momjian [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 24, 2005 8:52 AM
To: Jignesh K. Shah
Cc: Jim Nasby; Chris Browne; pgsql-performance@postgresql.org
Subject: Re: Read/Write block sizes



This thread covers several performance ideas.  First is the idea that
more parameters should be configurable.   While this seems like a noble
goal, we try to make parameters auto-tuning, or if users have to
configure it, the parameter should be useful for a significant number of
users.

In the commercial software world, if you can convince your boss that a
feature/knob is useful, it usually gets into the product. 
Unfortunately, this leads to the golden doorknob on a shack, where some
features are out of sync with the rest of the product in terms of
usefulness and utility.  With open source, if a feature cannot be
auto-tuned, or has significant overhead, the feature has to be
implemented and then proven to be a benefit.

In terms of adding async I/O, threading, and other things, it might make
sense to explore how these could be implemented in a way that fits the
above criteria.

---

Jignesh K. Shah wrote:
> Hi Jim,
> 
> | How many of these things are currently easy to change with a recompile?
> | I should be able to start testing some of these ideas in the near
> | future, if they only require minor code or configure changes.
> 
> 
> The following
> * Data File Size   1GB
> * WAL File Size of 16MB
> * Block Size  of 8K
> 
> Are very easy to change with a recompile.. A tunable would be greatly 
> preferred as it will allow one binary for different tunings
> 
> * MultiBlock read/write
> 
> Is not available but will greatly help in reducing the number of system 
> calls, which will only increase as the size of the database increases if 
> something is not done about it.
> 
> * Pregrown files... maybe not important at this point since TABLESPACE 
> can currently work around it a bit (just need to create a different file 
> system for each tablespace).
> 
> But if you really think hardware & OS is the answer for all small 
> things.. I think we should now start to look at how to make Postgres 
> multi-threaded or multi-processed for each connection.  With the influx 
> of "Dual-Core" or "Multi-Core" being the fad, Postgres can have the 
> cutting edge if exploiting cores is somehow designed in.
> 
> Somebody mentioned that adding a CPU to a Postgres workload halved the 
> average CPU usage...
> YEAH... PostgreSQL uses only 1 CPU per connection (assuming 100% 
> usage), so if you add another CPU it is idle anyway and the system will 
> report only 50%  :-)  BUT the important thing to measure is whether the 
> query time was cut down or not?  ( No flames, I am sure you were talking 
> about multi-connection multi-user environments :-) )  But my point is 
> whether this approach is worth the ROI and the time and effort spent to 
> solve this problem.
> 
> I actually vote for a multi-threaded solution for each connection while 
> still maintaining a separate process for each connection... This way the 
> fundamental architecture of Postgres doesn't change, yet a 
> multi-threaded connection can start to exploit different cores.  (Maybe 
> have tunables for the number of threads to read data files, who 
> knows.)  If somebody is interested in actually working on a design, 
> contact me and I will be interested in assisting with this work.
> 
> Regards,
> Jignesh
> 
> 
> Jim C. Nasby wrote:
> 
> >On Tue, Aug 23, 2005 at 06:09:09PM -0400, Chris Browne wrote:
> >  
> >
> >>[EMAIL PROTECTED] (Jignesh Shah) writes:
> >>
> >>
> Does that include increasing the size of read/write blocks? I've
> noticed that with a large enough table it takes a while to do a
> sequential scan, even if it's cached; I wonder if the fact that it
> takes a million read(2) calls to get through an 8G table is part of
> that.
> 
> 
> >>>Actually some of that read