Re: [PERFORM] wal_buffers

2005-10-05 Thread Thomas F. O'Connell


On Oct 5, 2005, at 8:23 AM, Ian Westmacott wrote:


Can anyone tell me what precisely a WAL buffer contains,
so that I can compute an appropriate setting for
wal_buffers (in 8.0.3)?

I know the documentation suggests there is little
evidence that supports increasing wal_buffers, but we
are inserting a large amount of data that, I believe,
easily exceeds the default 64K in a single transaction.
We are also very sensitive to write latency.

As background, we are doing a sustained insert of 2.2
billion rows in 1.3 million transactions per day.  That's
about 1700 rows per transaction, at (roughly) 50 bytes
per row.


Ian,

The WAL Configuration chapter (25.2) has a pretty good discussion of  
how wal_buffers is used:


http://www.postgresql.org/docs/8.0/static/wal-configuration.html

You might also take a look at Josh Berkus' recent testing on this  
setting:


http://www.powerpostgresql.com/
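
For a rough back-of-the-envelope check against your numbers (my
arithmetic, and it ignores per-record WAL header overhead, which pushes
the real figure somewhat higher):

  1,700 rows/transaction * 50 bytes/row ~= 85KB of row data per transaction
  default wal_buffers = 8 buffers * 8KB WAL block size = 64KB

So a single transaction does look like it can overflow the default
buffer space, and trying wal_buffers at 16 or 32 would be a cheap
experiment.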

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-469-5150
615-469-5151 (fax)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Steinar H. Gunderson
On Wed, Oct 05, 2005 at 04:55:51PM -0700, Luke Lonergan wrote:
> In COPY, we found lots of libc functions like strlen() being called
> ridiculous numbers of times, in one case it was called on every
> timestamp/date attribute to get the length of TZ, which is constant.  That
> one function call was in the system category, and was responsible for
> several percent of the time.

What? strlen is definitely not in the kernel, and thus won't count as system
time.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Luke Lonergan
Michael,

On 10/5/05 8:33 AM, "Michael Stone" <[EMAIL PROTECTED]> wrote:

> real    0m8.889s
> user    0m0.877s
> sys     0m8.010s
> 
> it's not in disk wait state (in fact the whole read was cached) but it's
> only getting 1MB/s.

You've proven my point completely.  This process is bottlenecked in the CPU.
The only way to improve it would be to optimize the system (libc) functions
like "fread" where it is spending most of it's time.

In COPY, we found lots of libc functions like strlen() being called
ridiculous numbers of times, in one case it was called on every
timestamp/date attribute to get the length of TZ, which is constant.  That
one function call was in the system category, and was responsible for
several percent of the time.

By the way, system routines like fgetc/getc/strlen/atoi etc. don't appear in
gprof profiles of dynamically linked objects, nor by default in oprofile
results.

If the bottleneck is in I/O, you will see the time spent in disk wait, not
in system.

- Luke



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Ron Peacetree
I'm putting in as much time as I can afford thinking about pg related
performance issues.  I'm doing it because of a sincere desire to help
understand and solve them, not to annoy people.

If I didn't believe in pg, I wouldn't be posting thoughts about how to
make it better.

It's probably worth some review (suggestions marked with a "+"):

+I came to the table with a possibly better way to deal with external
sorts (that now has branched into 2 efforts: short term improvements
to the existing code, and the original from-the-ground-up idea).  That
suggestion was based on a great deal of prior thought and research,
despite what some others might think.

Then we were told that our IO limit was lower than I thought.

+I suggested that as a "Quick Fix" we try making sure we do IO
transfers in large enough chunks, based on the average access time
of the physical device in question, so as to achieve the device's
ASTR (ie at least 600KB per access for a 50MBps ASTR device with
a 12ms average access time, so that transfer time at least matches
seek time) whenever circumstances allow us.
As far as I know, this experiment hasn't been tried yet.

I asked some questions about physical layout and format translation
overhead being possibly suboptimal that seemed to be agreed to, but
specifics as to where we are taking the hit don't seem to have been
made explicit yet.

+I made the "from left field" suggestion that perhaps a pg native fs
format would be worth consideration.  This is a major project, so
the suggestion was to at least some extent tongue-in-cheek.

+I then made some suggestions about better code instrumentation
so that we can more accurately characterize where the bottlenecks are.

We were also told that evidently we are CPU bound far before one
would naively expect to be based on the performance specifications
of the components involved.

Double checking among the pg developer community led to some
differing opinions as to what the actual figures were and under what
circumstances they were achieved.  Further discussion seems to have
converged on both accurate values and a better understanding as to
the HW and SW needed; _and_ we've gotten some RW confirmation
as to what current reasonable expectations are within this problem
domain from outside the pg community.

+Others have made some good suggestions in this thread as well.
Since I seem to need to defend my tone here, I'm not detailing them
here.  That should not be construed as a lack of appreciation of them.

Now I've asked for the quickest path to detailed understanding of the
pg IO subsystem.  The goal being to get more up to speed on its
coding details.  Certainly not to annoy you or anyone else.

At least from my perspective, this for the most part seems to have
been a useful and reasonable engineering discussion that has
exposed a number of important things.
  
Regards,
Ron

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Indexes on ramdisk

2005-10-05 Thread Emil Briggs
> What kind of order of improvement do you need to see?
>

A lot, since the load on the system is expected to increase by up to 100%
over the next 6 months.

> What period are these number for? Were they collected over 1 hour, 1 day, 1
> month?
>

I thought I mentioned that in the earlier post, but it was from a 2-hour
period.  It's a busy system.

> How much Cache do you have on the controller?
>

64MB, but I don't think that's an issue.  As I mentioned in the first post,
the table that is the bottleneck has indexes on 15 columns and is seeing a
lot of inserts, deletes and updates.  The indexes are spread out over the 5
mirrors, but it's still a couple of writes per mirror for each operation.
I'm going to order an SSD, which should give us a lot more headroom than
trying to rearrange the RAID setup.
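
(When the SSD does arrive, and assuming 8.0's tablespace support, moving
the hottest indexes over is just a couple of commands -- the path and
index name below are invented for illustration:)

CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pg_indexes';
ALTER INDEX bottleneck_table_col1_idx SET TABLESPACE fast_ssd;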


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Is There Any Way ....

2005-10-05 Thread Douglas J. Trainor
A blast from the past is forwarded below.

douglas

Begin forwarded message:

From: Tom Lane <[EMAIL PROTECTED]>
Date: August 23, 2005 3:23:43 PM EDT
To: Donald Courtney <[EMAIL PROTECTED]>
Cc: pgsql-performance@postgresql.org, Frank Wiles <[EMAIL PROTECTED]>, gokulnathbabu manoharan <[EMAIL PROTECTED]>
Subject: Re: [PERFORM] Caching by Postgres 

Donald Courtney <[EMAIL PROTECTED]> writes:
> I am not alone in having the *expectation* that a database should have
> some cache size parameter and the option to skip the file system.  If
> I use oracle, sybase, mysql and maxdb they all have the ability to
> size a data cache and move to 64 bits.

And you're not alone in holding that opinion despite having no shred
of evidence that it's worthwhile expanding the cache that far.

However, since we've gotten tired of hearing this FUD over and over,
8.1 will have the ability to set shared_buffers as high as you want.
I expect next we'll be hearing from people complaining that they
set shared_buffers to use all of RAM and performance went into the
tank ...

regards, tom lane


On Oct 4, 2005, at 11:06 PM, Ron Peacetree wrote:

Unfortunately, no matter what I say or do, I'm not going to please
or convince anyone who has already made their mind up
to the extent that they post comments like Mr Trainor's below.
His response style pretty much proves my earlier point that this
is presently a religious issue within the pg community.

The absolute best proof would be to build a version of pg that does
what Oracle and DB2 have done and implement its own DB-
specific memory manager, and then compare the performance
between the two versions on the same HW, OS, and schema.

The second best proof would be to set up either DB2 or Oracle so
that they _don't_ use their memory managers and compare their
performance to a set up that _does_ use said memory managers
on the same HW, OS, and schema.

I don't currently have the resources for either experiment.

Some might even argue that IBM (where Codd and Date worked)
and Oracle just _might_ have had justification for the huge effort
they put into developing such infrastructure. 

Then there's the large library of research on caching strategies
in just about every HW and SW domain, including DB theory,
that points out that the more context dependent (ie application-
or domain-specific) a caching strategy's awareness is, the better
it is.

Maybe after we do all we can about physical IO and sorting
performance I'll take on the religious fanatics on this one.

One problem set at a time.
Ron   


Re: [PERFORM] Text/Varchar performance...

2005-10-05 Thread Josh Berkus
Cristian,

> Hello, just a little question: is it preferable to use text fields or
> varchar(255) fields in a table? Are there any performance differences in
> the use of either of them?

TEXT, VARCHAR, and CHAR use the same underlying storage mechanism.  This
means that TEXT is actually the "fastest" since it doesn't check length or
space-pad.  However, that's unlikely to affect you unless you have millions
of records; you should use the type which makes sense given your
application.

For "large text fields" I always use TEXT.  BTW, in PostgreSQL VARCHAR is 
not limited to 255; I think we support up to 1GB of text or something 
preposterous.
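
To illustrate (a minimal sketch -- table and column names invented; apart
from the length check on "b", these columns are stored and handled
identically):

CREATE TABLE string_demo (
    a text,          -- no length check
    b varchar(255),  -- same storage, plus a length check on insert/update
    c varchar        -- unconstrained varchar: effectively the same as text
);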

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Ultra-cheap NVRAM device

2005-10-05 Thread Merlin Moncure
Chris wrote:
> [EMAIL PROTECTED] (Dan Harris) writes:
> > On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote:
> >
> >> I thought this might be interesting, not the least due to the
> >> extremely low
> >> price ($150 + the price of regular DIMMs):
> >
> > Replying before my other post came through.. It looks like their
> > benchmarks are markedly improved since the last article I read on
> > this.  There may be more interest now..
> 
> It still needs a few more generations worth of improvement.
> 
> 1.  It's still limited to SATA speed
> 2.  It's not ECC smart

3. Another zero (or two) on the price tag :).  While it looks like a fun
toy to play with, for it to replace hard drives in server environments
they need to put more emphasis and effort into assuring people their
drive is reliable.

If they really wanted it to be adopted in server environments, it would
have been packaged in a 3.5" drive, not a PCI card, since that's what we
all hot swap (especially since it already uses the SATA interface).  They
would also have allowed use of 2 and 4 GB DIMMs, and put in a small hard
drive that the memory paged to when powered off, and completely isolated
the power supply...hard to pack all that into $60.

That said, we are in the last days of the hard disk.  I think it is only
a matter of months before we see a sub-$1000 part with zero latency
in the 20-40 GB range.  Once that happens, economies of scale will
kick in and hard drives will become basically a backup device.

Merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Text/Varchar performance...

2005-10-05 Thread Steinar H. Gunderson
On Wed, Oct 05, 2005 at 12:21:35PM -0600, Cristian Prieto wrote:
> Hello, just a little question: is it preferable to use text fields or
> varchar(255) fields in a table? Are there any performance differences in
> the use of either of them?

They are essentially the same. Note that you can have varchar without a
length (well, up to about a gigabyte or so after compression), and you can
have varchar with a length well above 255.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Jeffrey W. Baker
On Wed, 2005-10-05 at 12:14 -0400, Ron Peacetree wrote:
> I've now gotten verification from multiple working DBA's that DB2, Oracle, and
> SQL Server can achieve ~250MBps ASTR (with as much as ~500MBps ASTR in
> setups akin to Oracle RAC) when attached to a decent (not outrageous, but
> decent) HD subsystem...
> 
> I've not yet had any RW DBA verify Jeff Baker's supposition that ~1GBps
> ASTR is attainable.  Cache based bursts that high, yes.  ASTR, no.

I find your tone annoying.  That you do not have access to this level of
hardware proves nothing, other than pointing out that your repeated
emails on this list are based on supposition.

If you want 1GB/sec STR you need:

1) 1 or more Itanium CPUs
2) 24 or more disks
3) 2 or more SATA controllers
4) Linux

Have fun.

-jwb

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] Text/Varchar performance...

2005-10-05 Thread Cristian Prieto
Hello, just a little question: is it preferable to use text fields or
varchar(255) fields in a table? Are there any performance differences in
the use of either of them?

Thanks a lot for your answer!


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Ron Peacetree
First I wanted to verify that pg's IO rates were inferior to The Competition.
Now there's at least an indication that someone else has solved similar
problems.  Existence proofs make some things easier ;-)

Is there any detailed programmer-level architectural doc set for pg?  I know
"the best doc is the code", but the code in isolation is often the Slow Path
to understanding with systems as complex as a DBMS IO layer.

Ron
 

-Original Message-
From: "Joshua D. Drake" <[EMAIL PROTECTED]>
Sent: Oct 5, 2005 1:18 PM
Subject: Re: [HACKERS] [PERFORM] A Better External Sort?


The source is freely available for your perusal. Please feel free to
point us in specific directions in the code where you may see some
benefit. I am positive all of us that can would put resources into
fixing the issue had we a specific direction to attack.

Sincerely,

Joshua D. Drake

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Ultra-cheap NVRAM device

2005-10-05 Thread Chris Browne
[EMAIL PROTECTED] (Dan Harris) writes:
> On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote:
>
>> I thought this might be interesting, not the least due to the
>> extremely low
>> price ($150 + the price of regular DIMMs):
>
> Replying before my other post came through.. It looks like their
> benchmarks are markedly improved since the last article I read on
> this.  There may be more interest now..

It still needs a few more generations worth of improvement.

1.  It's still limited to SATA speed
2.  It's not ECC smart

What I'd love to see would be something that much smarter, or, at
least, that pushes the limits of SATA speed, and which has both a
battery on board and enough CF storage to cope with outages.
-- 
output = reverse("gro.mca" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/linuxxian.html
We  all live in  a yellow  subroutine, a  yellow subroutine,  a yellow
subroutine...

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Is There Any Way ....

2005-10-05 Thread Ron Peacetree
From: Kevin Grittner <[EMAIL PROTECTED]>
Sent: Oct 5, 2005 2:16 AM
Subject: Re: [PERFORM] Is There Any Way 

>First off, Mr. Trainor's response proves nothing about anyone or
>anything except Mr. Trainor.
>
Fair Enough.  I apologize for the inappropriately general statement.

 
>I'm going to offer an opinion on the caching topic.  I don't have
>any benchmarks; I'm offering a general sense of the issue based on
>decades of experience, so I'll give a short summary of that.
> 
>I've been earning my living by working with computers since 1972,
>
~1978 for me.  So to many on this list, I also would be an "old fart".



>
I've pretty much spent my entire career thinking about and making
advances in RW distributed computing and parallel processing as
first a programmer and then a systems architect. 
 

>Now on to the meat of it. 

>
I agree with your comments just about across the board.


I also agree with the poster(s) who noted that the "TLC factor" and the
2x-every-18-months pace of increasing HW performance and RAM capacity
make this stuff a moving target.

OTOH, there are some fundamentals that don't seem to change no
matter how far or fast the computing field evolves.

As usual, the proper answers involve finding a sometimes nontrivial
balance between building on known precedent and not being trapped
by doctrine.

Ron

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Joshua D. Drake

> We have to fix this.
> Ron  
> 


The source is freely available for your perusal. Please feel free to
point us in specific directions in the code where you may see some
benefit. I am positive all of us that can would put resources into
fixing the issue had we a specific direction to attack.

Sincerely,

Joshua D. Drake


-- 
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Ron Peacetree
I've now gotten verification from multiple working DBA's that DB2, Oracle, and
SQL Server can achieve ~250MBps ASTR (with as much as ~500MBps ASTR in
setups akin to Oracle RAC) when attached to a decent (not outrageous, but
decent) HD subsystem...

I've not yet had any RW DBA verify Jeff Baker's supposition that ~1GBps ASTR is
attainable.  Cache based bursts that high, yes.  ASTR, no.

The DBA's in question run RW installations that include Solaris, M$, and
Linux OS's for companies that just about everyone on these lists is likely
to recognize.

Also, the implication of these pg IO limits is that money spent on even
moderately priced 300MBps SATA II based RAID HW is wasted $'s.

In total, this situation is a recipe for driving potential pg users to
other DBMSs.

25MBps in and 15MBps out is =BAD=.

Have we instrumented the code in enough detail that we can tell _exactly_ where
the performance drainage is?

We have to fix this.
Ron  


-Original Message-
From: Luke Lonergan <[EMAIL PROTECTED]>
Sent: Oct 5, 2005 11:24 AM
To: Michael Stone <[EMAIL PROTECTED]>, Martijn van Oosterhout 

Cc: pgsql-hackers@postgresql.org, pgsql-performance@postgresql.org
Subject: Re: [HACKERS] [PERFORM] A Better External Sort?

Nope - it would be disk wait.

COPY is CPU bound on I/O subsystems faster than 50 MB/s on COPY (in) and about
15 MB/s (out).

- Luke

 -Original Message-
From:   Michael Stone [mailto:[EMAIL PROTECTED]
Sent:   Wed Oct 05 09:58:41 2005
To: Martijn van Oosterhout
Cc: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org
Subject:Re: [HACKERS] [PERFORM] A Better External Sort?

On Sat, Oct 01, 2005 at 06:19:41PM +0200, Martijn van Oosterhout wrote:
>COPY TO /dev/null WITH binary
>13MB/s  55% user 45% system  (ergo, CPU bound)
[snip]
>the most expensive. But it does point out that the whole process is
>probably CPU bound more than anything else.

Note that 45% of that cpu usage is system--which is where IO overhead
would end up being counted. Until you profile where your system time is
going it's premature to say it isn't an IO problem.

Mike Stone


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



---(end of broadcast)---
TIP 6: explain analyze is your friend




Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Michael Stone

On Wed, Oct 05, 2005 at 11:24:07AM -0400, Luke Lonergan wrote:

>Nope - it would be disk wait.


I said I/O overhead; i.e., it could be the overhead of calling the
kernel for I/O's. E.g., the following process is having I/O problems:

time dd if=/dev/sdc of=/dev/null bs=1 count=10000000
10000000+0 records in
10000000+0 records out
10000000 bytes transferred in 8.887845 seconds (1125132 bytes/sec)

real    0m8.889s
user    0m0.877s
sys     0m8.010s


it's not in disk wait state (in fact the whole read was cached) but it's
only getting 1MB/s. 


Mike Stone

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Is There Any Way ....

2005-10-05 Thread Frank Wiles
On Tue, 4 Oct 2005 23:06:54 -0400 (EDT)
Ron Peacetree <[EMAIL PROTECTED]> wrote:

> Then there's the large library of research on caching strategies
> in just about every HW and SW domain, including DB theory,
> that points out that the more context dependent (ie application-
> or domain-specific) a caching strategy's awareness is, the better
> it is.

  Isn't this also a very strong argument for putting your caching
  into your application and not at the database level? 
 
  As you say, the more "application or domain specific" it is, the better.
  I don't see how PostgreSQL is going to magically determine what
  is perfect for everyone's differing needs and implement it for you. 

  Even rudimentary controls such as "always keep this
  table/index/whatever in RAM" aren't as fine grained or specific
  enough to get the full benefit.

  My suggestion is to use something like memcached to store your
  data in, based on the particular needs of your application.  This
  puts all of the control in the hands of the programmer where, in
  my opinion, it belongs. 

  Just to clarify, I'm not entirely against the idea, but I certainly
  think there are other areas of PostgreSQL we should be focusing our
  efforts. 

 -
   Frank Wiles <[EMAIL PROTECTED]>
   http://www.wiles.org
 -


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Luke Lonergan
Nope - it would be disk wait.

COPY is CPU bound on I/O subsystems faster than 50 MB/s on COPY (in) and about
15 MB/s (out).

- Luke

 -Original Message-
From:   Michael Stone [mailto:[EMAIL PROTECTED]
Sent:   Wed Oct 05 09:58:41 2005
To: Martijn van Oosterhout
Cc: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org
Subject:Re: [HACKERS] [PERFORM] A Better External Sort?

On Sat, Oct 01, 2005 at 06:19:41PM +0200, Martijn van Oosterhout wrote:
>COPY TO /dev/null WITH binary
>13MB/s  55% user 45% system  (ergo, CPU bound)
[snip]
>the most expensive. But it does point out that the whole process is
>probably CPU bound more than anything else.

Note that 45% of that cpu usage is system--which is where IO overhead
would end up being counted. Until you profile where your system time is
going it's premature to say it isn't an IO problem.

Mike Stone


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Indexes on ramdisk

2005-10-05 Thread Alex Turner
What kind of order of improvement do you need to see?

What period are these number for?  Were they collected over 1 hour, 1 day, 1 month?

How much Cache do you have on the controller?

You can certainly get more speed by adding more disk, and possibly by
adding more controller RAM/a second controller.  10 disks isn't
really that many for a totally kick-ass DB server.  You can
achieve more block writes with RAID 10s than with RAID 1s.  Whether
it's cost effective is dependent on lots of factors like your chassis
and drive enclosures etc. vs SSD.  SSD will be faster, but last I
heard it was expensive, and I checked a few websites but couldn't get much
price info.  Normally when you can't get price info, that's a bad
sign ;).  If you are doing large chunks of writes to a small
number of tables, then you might be better off with a single large RAID
10 for your tablespace than with separate RAID 1s.  If you are
writing 5 to 1 more table data than index data, you are hurting
yourself by separating onto multiple RAID 1s instead of a single RAID
10, which could write at 2-3x for the 5 and 2-3x for the 1, and only
suffer a single seek penalty but get data onto disk two to three
times as fast (depending how many RAID 1s you join).  Try
unseparating the RAID 1s and combining them into a RAID 10 for indexes
and tablespaces.  The controller will re-sequence your writes/reads to
help with efficiency, and dbwriter is there to make things go easier.

You can at least get some idea by doing an iostat and seeing how many IOs
and how much throughput is happening.  That will rapidly help determine
if you are bound by IOs or by MB/sec.

Worst case I'm wrong, but IMHO it's worth a try.

Alex Turner
NetEconomist

On 10/4/05, Emil Briggs <[EMAIL PROTECTED]> wrote:
> > Talk about your IO system a bit. There might be obvious ways to improve.
> >
> > What System/Motherboard are you using?
> > What Controller Cards are you using?
> > What kind of Disks do you have (SATA, SCSI 7.6k 10k 15k)?
> > What denominations (9, 18, 36, 72, 143, 80, 160, 200, 240 Gig)?
> > What kind of RAIDs do you have setup (How many drives, what stripe sizes,
> > how many used for what)?
> > What levels of RAID are you using (0,1,10,5,50)?
>
> It's a quad opteron system. RAID controller is a 4 channel LSILogic
> Megaraid 320 connected to 10 15k 36.7G SCSI disks. The disks are
> configured in 5 mirrored partitions. The pg_xlog is on one mirror and the
> data and indexes are spread over the other 4 using tablespaces. These
> numbers from pg_stat_user_tables are from about 2 hours earlier today on
> this one table.
>
> idx_scan       20578690
> idx_tup_fetch  35866104841
> n_tup_ins      1940081
> n_tup_upd      1604041
> n_tup_del      1880424

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Indexes on ramdisk

2005-10-05 Thread Merlin Moncure
> It's a quad opteron system. RAID controller is a 4 channel LSILogic
> Megaraid 320 connected to 10 15k 36.7G SCSI disks. The disks are
> configured in 5 mirrored partitions. The pg_xlog is on one mirror and
> the data and indexes are spread over the other 4 using tablespaces.
> These numbers from pg_stat_user_tables are from about 2 hours earlier
> today on this one table.
>
> idx_scan       20578690
> idx_tup_fetch  35866104841
> n_tup_ins      1940081
> n_tup_upd      1604041
> n_tup_del      1880424

Is your RAID controller configured to buffer your writes?  How much RAM
are you packing?  Are you running 64 bit?

Merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Is There Any Way ....

2005-10-05 Thread Kevin Grittner
** Low Priority **

Human feedback from testers and users has proven pretty effective
at catching errors in the "human assisted" cache configuration.  When
people setting up the servers have missed the named cache configuration,
and all they had was the single general purpose cache, it has been caught
because of user complaints on performance.
 
There was an attempt made to simulate database queries -- hitting a
client side cache on some of the roughly 100 tables (out of 300 in the well
normalized schema) which fit this pattern of usage.  It didn't prove very
cost effective.  It just makes more sense to allow the DBAs to tweak
database performance through database configuration changes than to
jump through that many hoops in application code to try to achieve it
where it becomes an issue.
 
As far as I know, you can't use this technique in Microsoft SQL Server or
Oracle.  They are using Sybase Adaptive Server Enterprise (ASE).  I
believe named caches were added in version 12.0, long after Microsoft
split off with their separate code stream based on the Sybase effort.
 
-Kevin
 
 
>>> "Dario" <[EMAIL PROTECTED]> 10/05/05 6:16 AM >>>
I'm sure there will be cases when some human-assisted caching algorithm
will perform better than a mathematical, statistics-based design, but it
will also depend on the "human". And it probably will make things worse
when the workload changes and the human doesn't realize it. It must be
considered that, today, hardware cost is not the 90% of the budget that it
used to be. Throwing hardware at the system can be as expensive as
throwing certified "IT stuff".
(just think of the coffee budget! :-) )

If you need to improve "user perception", you can do other things. Like
caching a table in your client (with a trigger for any change on table X
updating a table called "timestamp_table_change", and a small select on this
table, you can easily know when you must update your client). If it is an
application server, serving http requests, then "user perception" will be
tied to bandwidth AND the application server (some of them have caches for
requests).

FYI, I don't recall a mechanism in MSSQL to cache a table in buffers. Oracle
has some structures to allow that. (you know) It uses its own buffer. Since
version 9i, you can set three different data buffers: one (recycle cache)
for low-usage tables (I mean tables with blocks which don't have much
chance of being queried again, like a very large historical table), one for
high-usage tables (keep cache), and the regular one (the difference is in
the algorithm). And you must also set a buffer cache size for tablespaces
with different block sizes. But there is no such thing as "create table x
keep entirely in buffer". And above all things, the Oracle docs always
state "first, tune design, then tune queries, then start tuning the engine".

greetings.




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] wal_buffers

2005-10-05 Thread Ian Westmacott
Can anyone tell me what precisely a WAL buffer contains,
so that I can compute an appropriate setting for
wal_buffers (in 8.0.3)?

I know the documentation suggests there is little
evidence that supports increasing wal_buffers, but we
are inserting a large amount of data that, I believe,
easily exceeds the default 64K in a single transaction.
We are also very sensitive to write latency.

As background, we are doing a sustained insert of 2.2
billion rows in 1.3 million transactions per day.  That's
about 1700 rows per transaction, at (roughly) 50 bytes
per row.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-05 Thread Merlin Moncure
[to K C:] sorry, was out on vacation all last week.  I was visualizing
the problem incorrectly anyway...

Jim wrote:
> That function is not immutable, it should be defined as stable.

That is 100% correct; however, now and then I declare stable functions as
immutable because the planner treats them differently, with no side
effects...this is a hack of course...see my earlier suggestion to try
both immutable and stable versions.  I can give a pretty good example of
when this can make a big difference.
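
Something along these lines (a contrived sketch -- table, function, and
column names are all invented for illustration):

CREATE FUNCTION lookup_price(int) RETURNS numeric AS $$
    SELECT price FROM prices WHERE item_id = $1;
$$ LANGUAGE sql STABLE;

-- Declared STABLE, the call is evaluated at execution time.  Redeclared
-- IMMUTABLE, the planner may fold lookup_price(1) to a constant at plan
-- time, which can turn this into a plain indexable comparison -- at the
-- risk of a stale answer if prices changes underneath you.
SELECT * FROM orders WHERE total > lookup_price(1);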
 
> PostgreSQL doesn't pre-compile functions, at least not until 8.1 (and
> I'm not sure how much those are pre-compiled, though they are
> syntax-checked at creation). Do you get the same result time when you
> run it a second time? What time do you get from running just the
> function versus the SQL in the function?

plpgsql functions are at least partially compiled (sql functions afaik
are not), in that an internal state is generated following the first
execution.  This is the cause of all those infernal 'invalid table oid'
errors.
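
A quick sketch of how that bites people (names invented; the exact error
text varies by version):

CREATE TABLE scratch (n int);

CREATE FUNCTION count_scratch() RETURNS bigint AS $$
BEGIN
    RETURN (SELECT count(*) FROM scratch);  -- plan cached on first call
END;
$$ LANGUAGE plpgsql;

SELECT count_scratch();   -- works; caches a plan holding scratch's OID
DROP TABLE scratch;
CREATE TABLE scratch (n int);
SELECT count_scratch();   -- same session: fails, the cached OID is gone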
 
> Also, remember that every layer you add to the cake means more work
for
> the database. If speed is that highly critical you'll probably want to
> not wrap things in functions, and possibly not use views either.

The overhead of the function/view is totally inconsequential next to the
planner choosing a suboptimal plan.  The purpose of the function is to
coerce the planner into choosing the correct plan.

> Also, keep in mind that getting below 1ms doesn't automatically mean
> you'll be able to scale to 1000TPS. Things will definitely change when
> you load the system down, so if performance is that critical you should
> start testing with the system under load if you're not already.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Is There Any Way ....

2005-10-05 Thread Dario
I'm sure there will be cases when some human-assisted caching algorithm
will perform better than a mathematical, statistics-based design, but it
will also depend on the "human". And it probably will make things worse
when the workload changes and the human doesn't realize it. It must be
considered that, today, hardware cost is not the 90% of the budget that it
used to be. Throwing hardware at the system can be as expensive as
throwing certified "IT stuff".
(just think of the coffee budget! :-) )

If you need to improve "user perception", you can do other things. Like
caching a table in your client (with a trigger for any change on table X
updating a table called "timestamp_table_change", and a small select on this
table, you can easily know when you must update your client). If it is an
application server, serving http requests, then "user perception" will be
tied to bandwidth AND the application server (some of them have caches for
requests).
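
A minimal sketch of that trigger idea (table and trigger names invented;
the client keeps its own copy of the timestamp and refreshes its cache
when the stored one is newer):

-- the table being watched (stand-in for the real one)
CREATE TABLE x (payload text);

CREATE TABLE timestamp_table_change (
    table_name  text PRIMARY KEY,
    last_change timestamptz NOT NULL DEFAULT now()
);
INSERT INTO timestamp_table_change VALUES ('x', now());

CREATE FUNCTION note_table_change() RETURNS trigger AS $$
BEGIN
    UPDATE timestamp_table_change
       SET last_change = now()
     WHERE table_name = TG_RELNAME;
    RETURN NULL;  -- AFTER trigger: return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER x_changed
    AFTER INSERT OR UPDATE OR DELETE ON x
    FOR EACH STATEMENT EXECUTE PROCEDURE note_table_change();

-- client side: cheap poll to decide whether to refresh the local cache
SELECT last_change FROM timestamp_table_change WHERE table_name = 'x';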

FYI, I don't recall a mechanism in MSSQL to cache a table in buffers. Oracle
has some structures to allow that. (you know) It uses its own buffer. Since
version 9i, you can set three different data buffers: one (recycle cache)
for low-usage tables (I mean tables with blocks which don't have much
chance of being queried again, like a very large historical table), one for
high-usage tables (keep cache), and the regular one (the difference is in
the algorithm). And you must also set a buffer cache size for tablespaces
with different block sizes. But there is no such thing as "create table x
keep entirely in buffer". And above all things, the Oracle docs always
state "first, tune design, then tune queries, then start tuning the engine".

greetings.




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Is There Any Way ....

2005-10-05 Thread Douglas J. Trainor


Hey, you can say what you want about my style, but you
still haven't pointed to even one article from the vast literature
that you claim supports your argument.  And I did include a
smiley.  Your original email saying that PostgreSQL is wrong and
that you are right led me to believe that you, like others making
such statements, would not post your references.  You remind
me of Ted Nelson, who wanted the computing center at
the University of Illinois at Chicago to change their systems
just for him.  BTW, I'm a scientist -- I haven't made my mind
up about anything.  I really am interested in what you say,
if there is any real work backing up your claims such that
it would impact average cases.

Any app designer can conceive of many ways to game the
server to their app's advantage -- I'm not interested in that
potboiler.

douglas

On Oct 4, 2005, at 11:06 PM, Ron Peacetree wrote:


Unfortunately, no matter what I say or do, I'm not going to please
or convince anyone who has already made their mind up
to the extent that they post comments like Mr Trainor's below.
His response style pretty much proves my earlier point that this
is presently a religious issue within the pg community.

The absolute best proof would be to build a version of pg that does
what Oracle and DB2 have done and implement its own DB-
specific memory manager, and then compare the performance
between the two versions on the same HW, OS, and schema.

The second best proof would be to set up either DB2 or Oracle so
that they _don't_ use their memory managers and compare their
performance to a set up that _does_ use said memory managers
on the same HW, OS, and schema.

I don't currently have the resources for either experiment.

Some might even argue that IBM (where Codd and Date worked)
and Oracle just _might_ have had justification for the huge effort
they put into developing such infrastructure.

Then there's the large library of research on caching strategies
in just about every HW and SW domain, including DB theory,
that points out that the more context dependent (ie application-
or domain-specific) a caching strategy's awareness is, the better
it is.

Maybe after we do all we can about physical IO and sorting
performance I'll take on the religious fanatics on this one.

One problem set at a time.
Ron



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Michael Stone

On Tue, Oct 04, 2005 at 12:43:10AM +0300, Hannu Krosing wrote:

>Just FYI, I ran a count(*) on a 15.6GB table on a lightly loaded db and
>it ran in 163 sec. (Dual opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in
>RAID10, reiserfs). A little less than 100MB/sec.


And none of that 15G table is in the 6G RAM?

Mike Stone

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Query in SQL statement

2005-10-05 Thread Richard Huxton

R, Rajesh (STSD) wrote:
 
> Am trying to port a mysql statement to postgres.
>
> Please help me in finding the error in this,


Can I recommend the reference section of the manuals for this sort of 
thing? There is an excellent section detailing the valid SQL for the 
CREATE TABLE command.


Also - the pgsql-hackers list is for discussion of database development, 
and the performance list is for performance problems. This would be 
better posted on pgsql-general or -sql or -novice.



> CREATE SEQUENCE ai_id;


This line is causing the first error:
> ERROR:  relation "ai_id" already exists

That's because you've already successfully created the sequence, so it 
already exists. Either drop it and recreate it, or stop trying to 
recreate it.



> CREATE TABLE badusers (
>   id int DEFAULT nextval('ai_id') NOT NULL,
>   UserName varchar(30),
>   Date  datetime DEFAULT '0000-00-00 00:00:00' NOT NULL,


Well, "Date" is a type-name, "datetime" isn't and even if it was 
"-00-00" isn't a valid date is it?



>   Reason varchar(200),
>   Admin varchar(30) DEFAULT '-',
>   PRIMARY KEY (id),
>   KEY UserName (UserName),
>   KEY Date (Date)


The word "KEY" isn't valid here either - are you trying to define an 
index? If so, see the "CREATE INDEX" section of the SQL reference.


http://www.postgresql.org/docs/8.0/static/sql-commands.html

If you reply to this message, please remove the pgsql-hackers CC:
--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Which one FreeBSD or Linux

2005-10-05 Thread Richard Huxton

ALİ ÇELİK wrote:
> FreeBSD or Linux, which system has better performance for PostgreSQL?


Depends on the underlying hardware and your experience. I'd recommend
going with whichever you are more familiar with, so long as it will
support the hardware you need to buy.


--
  Richard Huxton
  Archonet Ltd


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Michael Stone

On Sat, Oct 01, 2005 at 06:19:41PM +0200, Martijn van Oosterhout wrote:

>COPY TO /dev/null WITH binary
>13MB/s  55% user 45% system  (ergo, CPU bound)

[snip]

>the most expensive. But it does point out that the whole process is
>probably CPU bound more than anything else.


Note that 45% of that cpu usage is system--which is where IO overhead
would end up being counted. Until you profile where your system time is
going it's premature to say it isn't an IO problem.

Mike Stone


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Query in SQL statement

2005-10-05 Thread Richard Huxton

R, Rajesh (STSD) wrote:
> Thanks.
> I've already understood that
> I need to post it in another list.
>
> Sorry for wasting your precious time.


No time wasted. It was a perfectly reasonable question, just to the 
wrong lists.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Zeugswetter Andreas DAZ SD

> In my original example, a sequential scan of the 1TB of 2KB 
> or 4KB records, => 250M or 500M records of data, being sorted 
> on a binary value key will take ~1000x more time than reading 
> in the ~1GB Btree I described that used a Key+RID (plus node 
> pointers) representation of the data.

Imho you seem to ignore the final step your algorithm needs: collecting
the data rows. After you have sorted the keys, the collect step will
effectively access the tuples in random order (given a sufficiently
large key range).

This random access is bad. It effectively allows a competing algorithm
to read the whole data set at least 40 times sequentially, or write the
set 20 times sequentially.  (Those are the random/sequential ratios of
modern discs.)
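
To put rough, illustrative numbers on that, using the 1TB / 250M-record
case above: fetching 250M tuples in random order at even 5ms per access
costs on the order of 1.25M seconds, while a 50MB/s sequential scan of
the full 1TB takes about 20,000 seconds -- roughly a 60:1 gap, the same
order of magnitude as the 40x figure above.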

Andreas

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Query in SQL statement

2005-10-05 Thread R, Rajesh (STSD)
 
Am trying to port a mysql statement to postgres.

Please help me in finding the error in this,


CREATE SEQUENCE ai_id;
CREATE TABLE badusers (
  id int DEFAULT nextval('ai_id') NOT NULL,
  UserName varchar(30),
  Date  datetime DEFAULT '0000-00-00 00:00:00' NOT NULL,
  Reason varchar(200),
  Admin varchar(30) DEFAULT '-',
  PRIMARY KEY (id),
  KEY UserName (UserName),
  KEY Date (Date)
);


Am always getting the following errors:

ERROR:  relation "ai_id" already exists
ERROR:  syntax error at or near "(" at character 240

Thanks,
Rajesh R

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread David Fetter
On Thu, Sep 29, 2005 at 10:06:52AM -0700, Luke Lonergan wrote:
> Josh,
> 
> On 9/29/05 9:54 AM, "Josh Berkus"  wrote:
> 
> > Following an index creation, we see that 95% of the time required
> > is the external sort, which averages 2mb/s.  This is with separate
> > drives for the WAL, the pg_tmp, the table and the index.  I've
> > confirmed that increasing work_mem beyond a small minimum (around
> > 128mb) had no benefit on the overall index creation speed.
> 
> Yep!  That about sums it up - regardless of taking 1 or 2 passes
> through the heap being sorted, 1.5 - 2 MB/s is the wrong number.
> This is not necessarily an algorithmic problem, but is an
> optimization problem with Postgres that must be fixed before it can
> be competitive.
> 
> We read/write to/from disk at 240MB/s and so 2 passes would run at a
> net rate of 120MB/s through the sort set if it were that efficient.
> 
> Anyone interested in tackling the real performance issue? (flame
> bait, but for a worthy cause :-)

I'm not sure that it's flamebait, but what do I know?  Apart from the
nasty number (1.5-2 MB/s), what other observations do you have to
hand?  Any ideas about what things are not performing here?  Parts of
the code that could bear extra scrutiny?  Ideas on how to fix same in
a cross-platform way?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] [HACKERS] Query in SQL statement

2005-10-05 Thread Obe, Regina DND\\MIS
I think this question may be more appropriate for
[EMAIL PROTECTED]

At any rate, for the below: sounds like you may already have a table or
sequence called ai_id.

Try doing a DROP SEQUENCE ai_id; first.

Also, if you plan to use this sequence only for this table, it would be
better to use serial8, which will automatically create the sequence for
you.  Then you don't even need that first part.  Also, you should avoid
naming fields things like Date, which tend to be keywords in many kinds
of databases.

Try changing your logic to something like

CREATE TABLE badusers (
  id serial8,
  UserName varchar(30),
  Date  timestamp DEFAULT now() NOT NULL,
  Reason varchar(200),
  Admin varchar(30) DEFAULT '-',
  PRIMARY KEY (id)
);

CREATE INDEX badusers_username
  ON badusers
  USING btree
  (username);

CREATE INDEX badusers_date
  ON badusers
  USING btree
  (date);
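
For illustration, a typical insert then needs to supply only the
remaining columns (the values here are made up); id and date fill in
automatically from the sequence and the now() default:

INSERT INTO badusers (UserName, Reason)
VALUES ('jsmith', 'posting to the wrong list');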

-Original Message-
From: R, Rajesh (STSD) [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 29, 2005 9:05 AM
To: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org
Subject: [HACKERS] Query in SQL statement


 
Am trying to port a mysql statement to postgres.

Please help me in finding the error in this,


CREATE SEQUENCE ai_id;
CREATE TABLE badusers (
  id int DEFAULT nextval('ai_id') NOT NULL,
  UserName varchar(30),
  Date  datetime DEFAULT '0000-00-00 00:00:00' NOT NULL,
  Reason varchar(200),
  Admin varchar(30) DEFAULT '-',
  PRIMARY KEY (id),
  KEY UserName (UserName),
  KEY Date (Date)
);


Am always getting the following errors:

ERROR:  relation "ai_id" already exists
ERROR:  syntax error at or near "(" at character 240

Thanks,
Rajesh R

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Gregory Maxwell
On 9/30/05, Ron Peacetree <[EMAIL PROTECTED]> wrote:
> 4= I'm sure we are paying all sorts of nasty overhead for essentially
> emulating the pg "filesystem" inside another filesystem.  That means
> ~2x as much overhead to access a particular piece of data.
>
> The simplest solution is for us to implement a new VFS compatible
> filesystem tuned to exactly our needs: pgfs.
>
> We may be able to avoid that by some amount of hacking or
> modifying of the current FSs we use, but I suspect it would be more
> work for less ROI.

On this point, the Reiser4 fs already implements a number of things which
would be desirable for PostgreSQL. For example: write()s to reiser4
filesystems are atomic, so there is no risk of torn pages (this is
possible because reiser4 uses WAFL-like logging, where data is not
overwritten but rather relocated). The filesystem is modular and
extensible, so it should be easy to add whatever additional semantics
are needed.  I would imagine that all that would be needed is some
more atomicity operations (single writes are already atomic, but I'm
sure it would be useful to batch many writes into a transaction), some
layout and packing controls, and some flush controls.  A step further
would perhaps integrate multiversioning directly into the FS (the
wandering logging system provides the write side of multiversioning;
a little read-side work would be required). More importantly: the file
system was intended to be extensible for this sort of application.

It might make a good 'summer of code' project for someone next year,
... presumably reiser4 will have made it into the mainline kernel by
then. :)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster