Re: [PERFORM] LIKE search and performance

2007-05-25 Thread Richard Huxton

[EMAIL PROTECTED] wrote:

And since it's basically impossible to know the selectivity of this kind
of where condition, I doubt the planner would ever realistically want to
choose that plan anyway because of its poor worst-case behavior.


What is a real life example where an intelligent and researched
database application would issue a like or ilike query as their
primary condition in a situation where they expected very high
selectivity?

Avoiding a poor worst-case behaviour for a worst-case behaviour that
won't happen doesn't seem practical.


But if you are also filtering on e.g. date, and that has an index with 
good selectivity, you're never going to use the text index anyway are 
you? If you've only got a dozen rows to check against, might as well 
just read them in.


The only time it's worth considering the behaviour at all is *if* the 
worst-case is possible.


--
  Richard Huxton
  Archonet Ltd



[PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)

2007-05-25 Thread Bastian Voigt

Hi *,
for caching large autogenerated XML files, I have created a bytea table 
in my database so that the cached files can be used by multiple servers. 
There are about 500 rows and 10-20 Updates per minute on the table. The 
files stored in the bytea are anything from 10kB to 10MB. My PostgreSQL 
version is 8.0.13 on Gentoo Linux (x86) with PostGIS 1.2.0.


For vacuum I use the pg_autovacuum daemon. It decided to vacuum my cache 
table about every 3 hours, the vacuum process takes 20-30 minutes 
(oops!) every time.


Now my big big problem is that the database gets really really slow 
during these 20 minutes and after the vacuum process is running for a 
short time, many transactions show state "UPDATE waiting" in the process 
list. In my Java application server I sometimes get tons of deadlock 
Exceptions (waiting on ShareLock blahblah). The web frontend gets nearly 
unusable, logging in takes more than 60 seconds, etc. etc.


Under normal circumstances my application is really fast; vacuuming 
other tables is no problem, only the bytea table is really awkward.


I hope some of you performance experts can help me...


this is my table definition:

        Table "public.binary_cache"
  Column  |            Type             | Attributes
----------+-----------------------------+------------
 cache_id | bigint                      | not null
 date     | timestamp without time zone |
 data     | bytea                       |
Indexes:
    "binary_cache_pkey" PRIMARY KEY, btree (cache_id)


Thanks in advance for any hints!

--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359





[PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer

I set up pg to replace a plain gdbm database for my application.  But
even running to the same machine, via a unix socket

   * the pg database ran 100 times slower 

Across the net it was

  * about 500 to 1000 times slower than local gdbm
 
with no cpu use to speak of.

I'd heard that networked databases are slow.  I might have left it at
that if curiosity hadn't led me to write a network server for gdbm
databases, and talk to _that_ just to get a comparison.

Lo and behold and smack me with a corncob if it wasn't _slower_ than pg.

On a whim I mapped the network bandwidth per packet size with the NPtcp
suite, and got surprising answers ..  at 1500B, naturally, the bandwidth
was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little
local net.  At 100B the bandwidth available was only 25Kb/s.  At 10B,
you might as well use tin cans and taut string instead.

I also mapped the network flows using ntop, and yes, the average packet
size for both gdbm and pg in one direction was only about 100B or
so.  That's it!  Clearly there are a lot of short queries going out and
the answers were none too big either ( I had a LIMIT 1 in all my PG
queries).

About 75% of traffic was in the 64-128B range while my application was
running, with the peak bandwidth in that range being about 75-125Kb/s
(and I do mean bits, not bytes).

So ... I took a look at my implementation of remote gdbm, and did
a very little work to aggregate outgoing transmissions together into
lumps.  Three lines added in two places.  At the level of the protocol
where I could tell how long the immediate conversation segment would be,
I "corked" the tcp socket before starting the segment and "uncorked" it
after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in
linux).

Surprise, ...  I got a speed up of hundreds of times.  The same application
that crawled under my original rgdbm implementation and under PG now
maxed out the network bandwidth at close to a full 10Mb/s and 1200
pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz
server.

So

  * Is that what is holding up postgres over the net too?  Lots of tiny
packets?

And if so

  * can one fix it the way I fixed it for remote gdbm?

The speedup was hundreds of times. Can someone point me at the relevant
bits of pg code? A quick look seems to say that fe-*.c is 
interesting. I need to find where the actual read and write on the
conn->sock is done.

Very illuminating gnuplot outputs available on request.

Peter




[PERFORM] How PostgreSQL handles multiple DDBB instances?

2007-05-25 Thread Arnau

Hi all,

  I have a doubt/problem about how PostgreSQL handles multiple database 
(DDBB) instances running on the same server, and how I should design the 
architecture of an application.


  I have an application that works with multiple customers. With 
scalability in mind, we are considering the following approach:


  - Create a separate database instance for each customer.
  - We think each customer's DB will be quite small, about 200MB on 
average.

  - The number of clients, and therefore DDBB, can be significant (thousands).
  - Have as many customers as possible on the same server, so a single 
server could have more than 300 DDBB instances.



  Do you think this makes sense? Or, taking into account the expected 
DDBB size, would it be better to join several customers' DDBB into just 
one instance? What I'm worried about is whether PostgreSQL's performance 
would be worse with so many DDBB instances.


 I have been following the list, and one of the pieces of advice that 
appears most often is to keep your DB in memory; so if I have just one 
instance instead of "hundreds", will the performance be better?


Thank you very much
--
Arnau



[PERFORM] Big problem with sql update operation

2007-05-25 Thread Michal Szymanski
Recently I wrote a few PL/pgSQL procedures that generate invoices and 
store them in a Postgres table. A small test has shown that there is a 
performance problem. I thought that string operations in PL/pgSQL were 
not perfect, but it turned out that the program wastes 90% of its time 
on a very simple update.

Below is my simplified procedure:

CREATE TABLE group_fin_account_tst (
  group_fin_account_tst_id BIGSERIAL PRIMARY KEY,
  credit NUMERIC(8,2) DEFAULT 0.00 NOT NULL
);
ALTER TABLE group_fin_account_tst OWNER TO freeconetadm;

INSERT INTO group_fin_account_tst (credit) VALUES (4);

CREATE OR REPLACE FUNCTION test()
RETURNS void AS
$BODY$
DECLARE
BEGIN
    FOR v_i IN 1..4000 LOOP
        UPDATE group_fin_account_tst
        SET credit = v_i
        WHERE group_fin_account_tst_id = 1;  -- for real procedure I update different rows
    END LOOP;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;

ALTER FUNCTION test() OWNER TO freeconetadm;
SELECT test();

The strange thing is how the program behaves when I increase the number of 
iterations.

Below are my results (where u/s is the number of updates per second):

On Windows:
500   -   0.3s  (1666 u/s)
1000  -   0.7s  (1428 u/s)
2000  -   2.3s  (869 u/s)
4000  -   9s    (444 u/s)
8000  -  29s    (275 u/s)
16000 - 114s    (140 u/s)

On Linux:
500   -   0.5s  (1000 u/s)
1000  -   1.8s  (555 u/s)
2000  -   7.0s  (285 u/s)
4000  -  26s    (153 u/s)
8000  - 101s    (79 u/s)
16000 - 400s    (40 u/s)

On both systems the relation between the number of iterations and the time 
is strongly nonlinear! Do you know what the problem is? Is it possible to 
commit a transaction inside a PL/pgSQL procedure? I think that maybe the 
transaction is too long.


Regards
Michal Szymanski
http://blog.szymanskich.net



Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)

2007-05-25 Thread Richard Huxton

Bastian Voigt wrote:


OK, I'll give that a try. What about pg_autovacuum then? Is it a problem
when two processes try to vacuum the same table in parallel? Or do I
need to deactivate autovacuum altogether?


I was about to say that you can tune pg_autovacuum, but I just checked 
your original post and you're running 8.0.x - not sure about that one.


You'll have to check the documentation for that version to see if you 
can either:

1. exclude that table from pg_autovacuum
2. increase pg_autovacuum's sensitivity

If not, and this table is the most active, it might be simpler just to 
run your own vacuum-ing from a cron job.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Richard Huxton

Peter T. Breuer wrote:

I set up pg to replace a plain gdbm database for my application.  But
even running to the same machine, via a unix socket

   * the pg database ran 100 times slower 


For what operations? Bulk reads? 19-way joins?


Across the net it was

  * about 500 to 1000 times slower than local gdbm
 
with no cpu use to speak of.


Disk-intensive or memory intensive?


I'd heard that networked databases are slow.  I might have left it at
that if curiosity hadn't led me to write a network server for gdbm
databases, and talk to _that_ just to get a comparison.

Lo and behold and smack me with a corncob if it wasn't _slower_ than pg.

On a whim I mapped the network bandwidth per packet size with the NPtcp
suite, and got surprising answers ..  at 1500B, naturally, the bandwidth
was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little
local net.  At 100B the bandwidth available was only 25Kb/s.  At 10B,
you might as well use tin cans and taut string instead.


This sounds like you're testing a single connection. You would expect 
"dead time" to dominate in that scenario. What happens when you have 50 
simultaneous connections? Or do you think it's just packet overhead?



I also mapped the network flows using ntop, and yes, the average packet
size for both gdbm and pg in one direction was only about 100B or
so.  That's it!  Clearly there are a lot of short queries going out and
the answers were none too big either ( I had a LIMIT 1 in all my PG
queries).


I'm not sure that 100B query-results are usually the bottleneck.
Why would you have LIMIT 1 on all your queries?


About 75% of traffic was in the 64-128B range while my application was
running, with the peak bandwidth in that range being about 75-125Kb/s
(and I do mean bits, not bytes).


None of this sounds like typical database traffic to me. Yes, there are 
lots of small result-sets, but there are also typically larger (several 
kilobytes) to much larger (10s-100s KB).



So ... I took a look at my implementation of remote gdbm, and did
a very little work to aggregate outgoing transmissions together into
lumps.  Three lines added in two places.  At the level of the protocol
where I could tell how long the immediate conversation segment would be,
I "corked" the tcp socket before starting the segment and "uncorked" it
after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in
linux).


I'm a bit puzzled, because I'd have thought the standard Nagle algorithm 
would manage this gracefully enough for short-query cases. There's no 
way (that I know of) for a backend to handle more than one query at a time.



Surprise, ...  I got a speed up of hundreds of times.  The same application
that crawled under my original rgdbm implementation and under PG now
maxed out the network bandwidth at close to a full 10Mb/s and 1200
pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz
server.

So

  * Is that what is holding up postgres over the net too?  Lots of tiny
packets?


I'm not sure your setup is typical, interesting though the figures are. 
Google a bit for pg_bench perhaps and see if you can reproduce the 
effect with a more typical load. I'd be interested in being proved wrong.



And if so

  * can one fix it the way I fixed it for remote gdbm?

The speedup was hundreds of times. Can someone point me at the relevant
bits of pg code? A quick look seems to say that fe-*.c is 
interesting. I need to find where the actual read and write on the

conn->sock is done.


You'll want to look in backend/libpq and interfaces/libpq I think 
(although I'm not a developer).


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)

2007-05-25 Thread Bastian Voigt

Richard Huxton wrote:


I was about to say that you can tune pg_autovacuum, but I just checked 
your original post and you're running 8.0.x - not sure about that one.
The system catalog pg_autovacuum which allows finetuning autovacuum at 
table level was introduced in 8.1 :-(


You'll have to check the documentation for that version to see if you 
can either:

1. exclude that table from pg_autovacuum
2. increase pg_autovacuum's sensitivity
(1) seems to be impossible (correct me if I'm wrong..), so maybe I'll go 
for (2) ...


If not, and this table is the most active, it might be simpler just to 
run your own vacuum-ing from a cron job.
Well, it is one of the most active, but there are others. pg_autovacuum 
seems to do a very good job, apart from this one table...



--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359





Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)

2007-05-25 Thread Richard Huxton

Bastian Voigt wrote:

Hi *,
for caching large autogenerated XML files, I have created a bytea table 
in my database so that the cached files can be used by multiple servers. 
There are about 500 rows and 10-20 Updates per minute on the table. The 
files stored in the bytea are anything from 10kB to 10MB. My PostgreSQL 
version is 8.0.13 on Gentoo Linux (x86) with PostGIS 1.2.0.


For vacuum I use the pg_autovacuum daemon. It decided to vacuum my cache 
table about every 3 hours, the vacuum process takes 20-30 minutes 
(oops!) every time.


Try vacuuming every 3 minutes and see what happens.

--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)

2007-05-25 Thread Richard Huxton

Bastian Voigt wrote:

Richard Huxton wrote:


I was about to say that you can tune pg_autovacuum, but I just checked 
your original post and you're running 8.0.x - not sure about that one.
The system catalog pg_autovacuum which allows finetuning autovacuum at 
table level was introduced in 8.1 :-(


Hmm - thought it might have been :-(

You'll have to check the documentation for that version to see if you 
can either:

1. exclude that table from pg_autovacuum
2. increase pg_autovacuum's sensitivity
(1) seems to be impossible (correct me if I'm wrong..), so maybe I'll go 
for (2) ...


No, the per-table stuff was via the system table.

If not, and this table is the most active, it might be simpler just to 
run your own vacuum-ing from a cron job.
Well, it is one of the most active, but there are others. pg_autovacuum 
seems to do a very good job, apart from this one table...


Do you have any settings in your postgresql.conf? Failing that, you 
would have to poke around the source.


--
  Richard Huxton
  Archonet Ltd



My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Bastian Voigt
No, this did not help. The vacuum process is still running far too long 
and makes everything slow. It is even worse than before, because now the 
system is slow almost all the time, whereas when vacuuming only every 3 
hours it was only slow once every three hours.



I now did the following. Well, no comment.


Shellscript A:

while true
do
   psql -U $user -d $database -c "vacuum analyze verbose binary_cache"
   echo "Going to sleep"
   sleep 60
done


Shellscript B:

while true
do
   ps aux > $tempfile
   numwaiting=`grep UPDATE.waiting $tempfile | grep -c -v grep`
   echo "Number of waiting updates: $numwaiting"

   vacuumpid=`grep VACUUM $tempfile| grep -v grep | awk '{print $2}'`
   echo "PID of vacuum process: $vacuumpid"

   if [ $numwaiting -gt 5 ]
   then
       echo "Too many waiting transactions, killing vacuum process $vacuumpid..."
       kill $vacuumpid
   fi
   echo "Sleeping 30 Seconds"
   sleep 30
done

--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359





Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Steinar H. Gunderson
On Fri, May 25, 2007 at 10:50:58AM +0200, Peter T. Breuer wrote:
> I set up pg to replace a plain gdbm database for my application.

Postgres and gdbm are completely different. You want to rethink your queries
so each does more work, instead of running a zillion of them over the network.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Kristo Kaiv

you should first cluster the table on primary key.
The table is probably already bloated from the 3 hr delay it had before.
First
CLUSTER "primary key index name" ON group_fin_account_tst;
Then
vacuum it every 3 minutes.
NB! clustering takes an access exclusive lock on table
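
(A sketch of how that advice maps onto the cache table from this thread, 
assuming the primary-key index name shown in the original post, and using 
the pre-8.3 syntax "CLUSTER indexname ON tablename":)

-- One-off reorganisation of the bloated cache table; takes an
-- ACCESS EXCLUSIVE lock while it rewrites the table, so pick a quiet moment.
CLUSTER binary_cache_pkey ON binary_cache;
ANALYZE binary_cache;   -- refresh statistics after the rewrite

-- Afterwards, a frequent cheap vacuum keeps it compact:
VACUUM ANALYZE binary_cache;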

Kristo

On 25.05.2007, at 15:30, Bastian Voigt wrote:

No, this did not help. The vacuum process is still running far too  
long and makes everything slow. It is even worse than before, cause  
now the system is slow almost all the time while when vacuuming  
only every 3 hours it is only slow once every three hours.



I now did the following. Well, no comment.


Shellscript A:

while true
do
   psql -U $user -d $database -c "vacuum analyze verbose binary_cache"
   echo "Going to sleep"
   sleep 60
done


Shellscript B:

while true
do
   ps aux > $tempfile
   numwaiting=`grep UPDATE.waiting $tempfile | grep -c -v grep`
   echo "Number of waiting updates: $numwaiting"

   vacuumpid=`grep VACUUM $tempfile| grep -v grep | awk '{print  
$2}'`

   echo "PID of vacuum process: $vacuumpid"

   if [ $numwaiting -gt 5 ]
   then
   echo "Too many waiting transactions, killing vacuum  
process $vacuumpid..."

   kill $vacuumpid
   fi
   echo "Sleeping 30 Seconds"
   sleep 30
done

--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359








Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer
"Also sprach Kenneth Marshall:"
> > Surprise, ...  I got a speed up of hundreds of times.  The same application
> > that crawled under my original rgdbm implementation and under PG now
> > maxed out the network bandwidth at close to a full 10Mb/s and 1200
> > pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz
> > server.
> > 
> > So
> > 
> >   * Is that what is holding up postgres over the net too?  Lots of tiny
> > packets?
> 
> 
> This effect is very common, but you are in effect altering the query/

I imagined so, but no, I am not changing the behaviour - I believe you
are imagining something different here.  Let me explain.

It is usually the case that drivers and the network layer conspire to
emit packets when they are otherwise idle, since they have nothing
better to do.  That is, if the transmission unit is the normal 1500B and
there is 200B in the transmission buffer and nothing else is frisking
them about the chops, something along the line will shrug and say, OK,
I'll just send out a 200B fragment now, apologize, and send out another
fragment later if anything else comes along for me to chunter out.

It is also the case that drivers do the opposite .. that is, they do
NOT send out packets when the transmission buffer is full, even if they
have 1500B worth. Why? Well, on Ge for sure, and on 100BT most of the
time, it doesn't pay to send out individual packets because the space
required between packets is relatively too great to permit the network
to work at that speed  given the speed of light as it is, and the
spacing it implies between packets (I remember when I advised the
networking protocol people that Ge was a coming thing about 6 years
ago, they all protested and said it was _physically_ impossible. It is.
If you send packets one by one!).  An ethernet line is fundamentally
only electrical and only signals up or down (relative) and needs time to
quiesce. And then there's the busmastering .. a PCI bus is only about
33MHz, and 32 bits wide (well, or 16 on portables, or even 64, but
you're getting into heavy server equipment then).  That's 128MB/s in
one direction, and any time one releases the bus there's a re-setup time
that costs the earth and will easily lower bandwidth by 75%. So drivers
like to take the bus for a good few packets at a time. Even a single
packet (1500B) will take 400 multi-step bus cycles to get to the
card, and then it's a question of how much onboard memory it has or
whether one has to drive it synchronously. Most cards have something
like a 32-unit ring buffer, and I think each unit is considerable.

Now, if a driver KNOWS what's coming then it can alter its behavior in
order to mesh properly with the higher level layers. What I did was
_tell_ the driver and the protocol not to send any data until I well
and truly tell it to, and then told it to, when I was ready.  The result
is that a full communication unit (start, header, following data, and
stop codon) was sent in one blast.

That meant that there were NO tiny fragments blocking up the net, being
sent willy-nilly. And it also meant that the driver was NOT waiting for
more info to come in before getting bored and sending out what it had.
It did as I told it to.

The evidence from monitoring the PG network thruput is that 75% of its
packets are in the 64-128B range, including tcp header. That's hitting
the 100Kb/s (10KB/s) bandwidth regime on my network at the lower end.
It will be even _worse_ on a faster net, I think (feel free to send me a
faster net to compare with :). 

I also graphed latency, but I haven't taken into account the results as
the bandwidth measurements were so striking.

> response behavior of the database. Most applications expect an answer
> from the database after every query.

Well of course. Nothing else would work! (I imagine you have some kind
of async scheme, but I haven't investigated). I ask, the db replies. I
ask, the db replies. What I did was

  1) made the ASK go out as one lump.
  2) made the REPLY go out as one lump
  3) STOPPED the card waiting for several replies or asks to accumulate
 before sending out anything at all.

> If it could manage retrying failed
> queries later, you could use the typical sliding window/delayed ack
> that is so useful in improving the bandwidth utilization of many network

That is not what is going on (though that's not a bad idea). See
above for the explanation. One has to take into account the physical
hardware involved and its limitations, and arrange the communications
accordingly. All I did was send EACH query and EACH response as a
single unit, at the hardware level. 

One could do better still by managing _several_ threads communications
at once.

> programs. Maybe an option in libpq to tell it to use delayed "acks". I
> do not know what would be involved.

Nothing spectacular is required to see a considerable improvement, I
think, apart from a little direction from the high-level protocol down
to the driver about where 

Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Richard Huxton

Bastian Voigt wrote:
No, this did not help. The vacuum process is still running far too long 
and makes everything slow. It is even worse than before, cause now the 
system is slow almost all the time while when vacuuming only every 3 
hours it is only slow once every three hours.


Could you check the output of vacuum verbose on that table and see how 
much work it's doing? I'd have thought the actual bytea data would be 
TOASTed away to a separate table for storage, leaving the vacuum with 
very little work to do.
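
(For reference, a hedged way to find that separate TOAST relation and its 
size on the 8.x catalogs; relpages counts 8kB pages.)

-- Find the TOAST relation that actually holds the bytea data, and its size:
SELECT t.relname AS toast_table, t.relpages
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'binary_cache';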


It might well be your actual problem is your disk I/O is constantly 
saturated and the vacuum just pushes it over the edge. In which case 
you'll either need more/better disks or to find a quiet time once a day 
to vacuum and just do so then.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer
"Also sprach Richard Huxton:"
[Charset ISO-8859-1 unsupported, filtering to ASCII...]
> Peter T. Breuer wrote:
> > I set up pg to replace a plain gdbm database for my application.  But
> > even running to the same machine, via a unix socket
> > 
> >* the pg database ran 100 times slower 
> 
> For what operations? Bulk reads? 19-way joins?

The only operations being done are simple "find the row with this key",
or "update the row with this key". That's all. The queries are not an
issue (though why the PG thread chooses to max out the cpu when it gets the
chance to do so through a unix socket, I don't know).

> > Across the net it was
> > 
> >   * about 500 to 1000 times slower than local gdbm
> >  
> > with no cpu use to speak of.
> 
> Disk-intensive or memory intensive?

There is no disk as such...  it's running on a ramdisk at the server
end.  But assuming you mean i/o, i/o was completely stalled.  Everything
was idle, all waiting on the net.

> > On a whim I mapped the network bandwidth per packet size with the NPtcp
> > suite, and got surprising answers ..  at 1500B, naturally, the bandwidth
> > was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little
> > local net.  At 100B the bandwidth available was only 25Kb/s.  At 10B,
> > you might as well use tin cans and taut string instead.
> 
> This sounds like you're testing a single connection. You would expect 
> "dead time" to dominate in that scenario. What happens when you have 50 

Indeed, it is single, because that's my application. I don't have 
50 simultaneous connections.  The use of the database is as a permanent
storage area for the results of previous analyses (static analysis of
the linux kernel codes) from a single client.

Multiple threads accessing at the same time might help keep the network
drivers busier, which would help. They would always see their buffers
filling at an even rate and be able to send out groups of packets at
once.

> simultaneous connections? Or do you think it's just packet overhead?

It's not quite overhead in the sense of the logical layer. It's a
physical layer thing.  I replied in another mail on this thread, but in
summary, tcp behaves badly with small packets on ethernet, even on a
dedicated line (as this was). One needs to keep it on a tight rein.

> > I also mapped the network flows using ntop, and yes, the average packet
> > size for both gdbm and pg in one direction was only about 100B or
> > so.  That's it!  Clearly there are a lot of short queries going out and
> > the answers were none too big either ( I had a LIMIT 1 in all my PG
> > queries).
> 
> I'm not sure that 100B query-results are usually the bottleneck.
> Why would you have LIMIT 1 on all your queries?

Because there is always only one answer to the query, according to the
logic.  So I can always tell the database manager to stop looking after
one, which will always help it.

> > About 75% of traffic was in the 64-128B range while my application was
> > running, with the peak bandwidth in that range being about 75-125Kb/s
> > (and I do mean bits, not bytes).
> 
> None of this sounds like typical database traffic to me. Yes, there are 
> lots of small result-sets, but there are also typically larger (several 
> kilobytes) to much larger (10s-100s KB).

There's none here.

> > So ... I took a look at my implementation of remote gdbm, and did
> > a very little work to aggregate outgoing transmissions together into
> > lumps.  Three lines added in two places.  At the level of the protocol
> > where I could tell how long the immediate conversation segment would be,
> > I "corked" the tcp socket before starting the segment and "uncorked" it
> > after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in
> > linux).
> 
> I'm a bit puzzled, because I'd have thought the standard Nagle algorithm 
> would manage this gracefully enough for short-query cases. There's no 

On the contrary, Nagle is also often wrong here because it will delay
sending in order to accumulate more data into buffers when only a little
has arrived, then give up when no more data arrives to be sent out, then
send out the (short) packet anyway, late. There's no other traffic
apart from my (single thread) application.

What we want is to direct the sending exactly, in this situation saying
when not to send, and when to send.  Disable Nagle for a start, use
async read (noblock), and sync write, with sends from the socket blocked
from initiation of a message until the whole message is ready to be sent
out.  Sending the message piecemeal just hurts too.

> way (that I know of) for a backend to handle more than one query at a time.

That's not the scenario.

> > Surprise, ...  I got a speed up of hundreds of times.  The same application
> > that crawled under my original rgdbm implementation and under PG now
> > maxed out the network bandwidth at close to a full 10Mb/s and 1200
> > pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz
> > server.
> >

Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Richard Huxton

Peter T. Breuer wrote:


The only operations being done are simple "find the row with this key",
or "update the row with this key". That's all. The queries are not an
issue (though why the PG thread choose to max out cpu when it gets the
chance to do so through a unix socket, I don't know).



There is no disk as such...  it's running on a ramdisk at the server
end.  But assuming you mean i/o, i/o was completely stalled.  Everything
was idle, all waiting on the net.


Indeed, it is single, because that's my application. I don't have 
50 simultaneous connections.  The use of the database is as a permanent

storage area for the results of previous analyses (static analysis of
the linux kernel codes) from a single client.


I'm not sure your setup is typical, interesting though the figures are. 
Google a bit for pg_bench perhaps and see if you can reproduce the 
effect with a more typical load. I'd be interested in being proved wrong.


But the load is typical HERE. The application works well against gdbm
and I was hoping to see speedup from using a _real_ full-fledged DB
instead.


I'm not sure you really want a full RDBMS. If you only have a single 
connection and are making basic key-lookup queries then 90% of 
PostgreSQL's code is just getting in your way. Sounds to me like gdbm 
(or one of its alternatives) is a good match for you. Failing that, 
sqlite is probably the next lowest-overhead solution.


Of course, if you want to have multiple clients interacting and 
performing complex 19-way joins on gigabyte-sized tables with full-text 
indexing and full transaction control then you *do* want a RDBMS.


--
  Richard Huxton
  Archonet Ltd



Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Alvaro Herrera
Bastian Voigt wrote:
> No, this did not help. The vacuum process is still running far too long 
> and makes everything slow. It is even worse than before, cause now the 
> system is slow almost all the time while when vacuuming only every 3 
> hours it is only slow once every three hours.
> 
> 
> I now did the following. Well, no comment.

Killing the vacuum mid-process doesn't help you, because the table will
be in a sorrier state than it was when it started.

I think it would be better if you:

1. Revert pg_autovacuum changes so that it processes every 3 hours or
whatever, like you had at the start of this thread.  Or maybe less.
That one will take care of the _other_ tables.

2. Vacuum the bytea table manually more often, say every 10 minutes or
so (vacuum, sleep 10m, goto start).  Make sure this is done with an
appropriate vacuum_cost_delay setting (and related settings).

3. Raise max_fsm_pages so that a lot of pages with free space can be
recorded for that table


The point here is that vacuuming the bytea table can take a long time
due to vacuum_cost_delay, but it won't affect the rest of the system;
regular operation will continue to run at (almost) normal speed.  Having
a big number of free pages ensures that the free space in the table is
not "lost".

Also, you may want to reindex that table once, because with so many
killing vacuums you have probably screwed up the indexes big time (maybe
cluster it once instead of reindexing, because that will compact the
heap as well as the indexes).

Another recommendation is to upgrade to 8.2.4 which is faster and has
a better autovacuum.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer
"Also sprach Richard Huxton:"
> I'm not sure you really want a full RDBMS. If you only have a single 
> connection and are making basic key-lookup queries then 90% of 
> PostgreSQL's code is just getting in your way. Sounds to me like gdbm 

Yep - I could happily tell it not to try and compile a special lookup
scheme each time, for example!  (how that?). I could presumably also
help it by preloading the commands I will run and sending over the 
params only with a "do a no. 17 now!".

> (or one of its alternatives) is a good match for you. Failing that, 
> sqlite is probably the next lowest-overhead solution.

Not a bad idea. but PG _will_ be useful when folk come to analyse the
result of the analyses being done. What is slow is getting the data
into the database now via simple store, fetch and update.

> Of course, if you want to have multiple clients interacting and 
> performing complex 19-way joins on gigabyte-sized tables with full-text 

Well, the dbs are in the tens of MB from a single run over a single
file (i.e analysis of a single 30KLOC source).  The complete analysis
space is something like 4000 times that, for 4300 C files in the linux
kernel source. And then there is all the linux kernel versions. Then
there is godzilla and apache source ..

> indexing and full transaction control then you *do* want a RDBMS.

We want one anyway. The problem is filling the data and the simple
fetch and update queries on it.

I really think it would be worthwhile getting some developer to tell me
where the network send is done in PG.

Peter



Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)

2007-05-25 Thread Tom Lane
Bastian Voigt <[EMAIL PROTECTED]> writes:
> Now my big big problem is that the database gets really really slow 
> during these 20 minutes and after the vacuum process is running for a 
> short time, many transactions show state "UPDATE waiting" in the process 
> list. In my Java application server I sometimes get tons of deadlock 
> Exceptions (waiting on ShareLock blahblah). The web frontend gets nearly 
> unusable, logging in takes more than 60 seconds, etc. etc.

Hmm.  That's a bit weird --- what are they waiting on exactly?  Look in
pg_locks to see what the situation is.  A vacuum per se ought not be
blocking any updates.
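
(One hedged way to do that, joining pg_locks to pg_class so the blocked 
relation is visible; column names as in the 8.x catalogs.)

-- Show ungranted lock requests and which relation they are waiting on:
SELECT l.pid, l.mode, l.granted, c.relname
FROM pg_locks l
LEFT JOIN pg_class c ON c.oid = l.relation
WHERE NOT l.granted;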

Aside from the recommendation to make the vacuums happen more frequently
instead of less so, you should experiment with vacuum_cost_delay and
related parameters.  The idea is to reduce vacuum's I/O load so that it
doesn't hurt foreground response time.  This means any individual vacuum
will take longer, but you won't need to care.

regards, tom lane



Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Bastian Voigt

Richard Huxton wrote:
Could you check the output of vacuum verbose on that table and see how 
much work it's doing? I'd have thought the actual bytea data would be 
TOASTed away to a separate table for storage, leaving the vacuum with 
very little work to do.
I'm quite new to postgres (actually I just ported our running 
application from MySQL...), so I don't know what toast means. But I 
noticed that vacuum also tried to cleanup some "toast" relations or so. 
This was what took so long.


It might well be your actual problem is your disk I/O is constantly 
saturated and the vacuum just pushes it over the edge. In which case 
you'll either need more/better disks or to find a quiet time once a 
day to vacuum and just do so then.
Yes, that was definitely the case. But now everything runs smoothly 
again, so I don't think I need to buy new disks.


Regards
Bastian


--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359





Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Alvaro Herrera
Peter T. Breuer wrote:

> I really think it would be worthwhile getting some developer to tell me
> where the network send is done in PG.

See src/backend/libpq/pqcomm.c (particularly internal_flush()).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Bastian Voigt

Kristo Kaiv wrote:

you should first cluster the table on primary key.
The table is probably already bloated from the 3 hr delay it had before.
First
CLUSTER "primary key index name" ON group_fin_account_tst;
Then
vacuum it every 3 minutes.
NB! clustering takes an access exclusive lock on table

Kristo,
thanks a bunch!!
This was the solution...
The cluster operation took about 60sec, and after it was done the vacuum 
finished in only 10sec. or so, with no noticeable performance 
bottleneck. Now vacuum is running every 2-3 minutes and makes no problems.


Hhhh, now I can look forward to a laid-back weekend..

Richard, Kristo, Alvaro, thanks 1000 times for responding so quickly

:-)

--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359





Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Tom Lane
"Peter T. Breuer" <[EMAIL PROTECTED]> writes:
> So ... I took a look at my implementation of remote gdbm, and did
> a very little work to aggregate outgoing transmissions together into
> lumps.

We do that already --- for a simple query/response such as you are
describing, each query cycle will involve one physical client->server
message followed by one physical server->client message.  The only way
to aggregate more is for the application code to merge queries together.

Migrating a dbm-style application to a SQL database is often a real
pain, precisely because the application is designed to a mindset of
"fetch one record, manipulate it, update it", where "fetch" and "update"
are assumed to be too stupid to do any of the work for you.  The way
to get high performance with a SQL engine is to push as much of the work
as you can to the database side, and let the engine process multiple
records per query; and that can easily mean rewriting the app from the
ground up :-(
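
(Purely as an illustration, with hypothetical table and column names: the 
dbm mindset does one round trip per key, while the SQL engine is happiest 
handling the whole batch in one statement.)

-- One round trip per key (dbm style):
SELECT payload FROM results WHERE key = 17;

-- One round trip for many keys (let the engine do the work):
SELECT key, payload FROM results WHERE key IN (17, 18, 19, 20);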

regards, tom lane



Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Richard Huxton

Peter T. Breuer wrote:

"Also sprach Richard Huxton:"
I'm not sure you really want a full RDBMS. If you only have a single 
connection and are making basic key-lookup queries then 90% of 
PostgreSQL's code is just getting in your way. Sounds to me like gdbm 


Yep - I could happily tell it not to try and compile a special lookup
scheme each time, for example!  (how that?). I could presumably also
help it by preloading the commands I will run and sending over the 
params only with a "do a no. 17 now!".


PREPARE/EXECUTE (or the equivalent libpq functions).
Also - if you can have multiple connections to the DB you should be able 
to have several queries running at once.
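
(A sketch of that, with a made-up table name; libpq exposes the same 
mechanism through PQprepare/PQexecPrepared.)

-- Prepare the lookup once per connection...
PREPARE fetch_row (bigint) AS
    SELECT payload FROM analysis_results WHERE id = $1;

-- ...then each "do a no. 17 now!" is just:
EXECUTE fetch_row(17);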


(or one of its alternatives) is a good match for you. Failing that, 
sqlite is probably the next lowest-overhead solution.


Not a bad idea. but PG _will_ be useful when folk come to analyse the
result of the analyses being done. What is slow is getting the data
into the database now via simple store, fetch and update.


I'd have an hourly/daily bulk-load running from the simple system into 
PG. If you have to search all the data from your app that's not 
practical of course.


Of course, if you want to have multiple clients interacting and 
performing complex 19-way joins on gigabyte-sized tables with full-text 


Well, the dbs are in the tens of MB from a single run over a single
file (i.e analysis of a single 30KLOC source).  The complete analysis
space is something like 4000 times that, for 4300 C files in the linux
kernel source. And then there is all the linux kernel versions. Then
there is godzilla and apache source ..


If you're doing some sort of token analysis on source-code you probably 
want to look into how tsearch2 / trigram / Gist+GIN indexes work. It 
might be that you're doing work in your app that the DB can handle for you.
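
(A hedged sketch, assuming the contrib pg_trgm module has been loaded into 
the database and a hypothetical tokens table; the GiST opclass it provides 
makes similarity searches indexable.)

CREATE INDEX tokens_trgm_idx ON tokens USING gist (token gist_trgm_ops);

SELECT token FROM tokens WHERE token % 'lock_kernel';  -- '%' is pg_trgm's similarity operator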



indexing and full transaction control then you *do* want a RDBMS.


We want one anyway. The problem is filling the data and the simple
fetch and update queries on it.


OK


I really think it would be worthwhile getting some developer to tell me
where the network send is done in PG.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Big problem with sql update operation

2007-05-25 Thread Tom Lane
Michal Szymanski <[EMAIL PROTECTED]> writes:
> CREATE OR REPLACE FUNCTION test()
> RETURNS void AS
> $BODY$
> DECLARE
> BEGIN  
> FOR v_i IN 1..4000 LOOP
> UPDATE group_fin_account_tst SET
> credit = v_i
>  WHERE group_fin_account_tst_id = 1;   -- for real procedure I 
> update different rows

Does updating the *same* record 4000 times per transaction reflect the
real behavior of your application?  If not, this is not a good
benchmark.  If so, consider redesigning your app to avoid so many
redundant updates.

(For the record, the reason you see nonlinear degradation is the
accumulation of tentatively-dead versions of the row, each of which has
to be rechecked by each later update.)
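
(If the real procedure does touch different rows, a hedged sketch of the 
set-based alternative, using a hypothetical staging table new_credits: one 
statement instead of thousands of single-row UPDATEs.)

UPDATE group_fin_account_tst
SET credit = s.credit
FROM new_credits s
WHERE group_fin_account_tst.group_fin_account_tst_id = s.group_fin_account_tst_id;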

regards, tom lane



Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?

2007-05-25 Thread Tom Lane
Arnau <[EMAIL PROTECTED]> writes:
>I have an application that works with multiple customers. Thinking in 
> scalability we are thinking in applying the following approaches:

>- Create a separate database instance for each customer.
>- We think that customer's DB will be quite small, about 200MB as 
> average.
>- The number of clients, then DDBB, can be significant(thousands).
>- Have as many customers as possible on the same server, so a single 
> server could have more than 300 DDBB instances.

This is probably a bad idea, unless each customer's performance demands
are so low that you can afford to use very small shared-memory settings
for each instance.  But even small settings will probably eat ~10MB per
instance --- can you afford to build these machines with multiple GB of
RAM?

Can you instead run things with one postmaster per machine and one
database per customer within that instance?  From a performance
perspective this is likely to work much better.
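
(A minimal sketch of that layout, with illustrative names: one postmaster 
per machine, one database per customer inside it, each owned by a 
restricted account.)

CREATE USER customer_0001 WITH PASSWORD 'secret' NOCREATEDB NOCREATEUSER;
CREATE DATABASE customer_0001 OWNER customer_0001;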

If you desire to give the customers database-superuser capability then
this probably won't do, but if they are restricted users it might be OK.

regards, tom lane



Re: [PERFORM] LIKE search and performance

2007-05-25 Thread mark
On Fri, May 25, 2007 at 09:13:25AM +0100, Richard Huxton wrote:
> [EMAIL PROTECTED] wrote:
> >>And since it's basically impossible to know the selectivity of this kind
> >>of where condition, I doubt the planner would ever realistically want to
> >>choose that plan anyway because of its poor worst-case behavior.
> >What is a real life example where an intelligent and researched
> >database application would issue a like or ilike query as their
> >primary condition in a situation where they expected very high
> >selectivity?
> >Avoiding a poor worst-case behaviour for a worst-case behaviour that
> >won't happen doesn't seem practical.
> But if you are also filtering on e.g. date, and that has an index with 
> good selectivity, you're never going to use the text index anyway are 
> you? If you've only got a dozen rows to check against, might as well 
> just read them in.
> The only time it's worth considering the behaviour at all is *if* the 
> worst-case is possible.

I notice you did not provide a real life example as requested. :-)

This seems like an ivory tower restriction. Not allowing best performance
in a common situation vs not allowing worst performance in a not-so-common
situation.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer
"Also sprach Alvaro Herrera:"
> > I really think it would be worthwhile getting some developer to tell me
> > where the network send is done in PG.
> 
> See src/backend/libpq/pqcomm.c (particularly internal_flush()).

Yes. Thanks. That looks like it. It calls secure_write continually
until the buffer is empty.

Secure_write is located in be-secure.c, but I'm not using ssl, so the 
call reduces to just

n = send(port->sock, ptr, len, 0);

And definitely all those could be grouped if there are several to do.
But under normal circumstances the send will be pushing against a
little resistance (the copy to the driver/protocol stack buffer is faster
than the physical network send, by a ratio of GB/s to MB/s, or 1000 to
1), and thus all these sends will probably complete as a single unit
once they have been started.

It's worth a try.  I thought first this may be too low level, but it
looks as though internal_flush is only triggered when some other buffer
is full, or deliberately, so it may be useful to block until it fires.

I'll try it.


Peter



Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Tom Lane
"Peter T. Breuer" <[EMAIL PROTECTED]> writes:
> And definitely all those could be grouped if there are several to do.

Except that in the situation you're describing, there's only a hundred
or two bytes of response to each query, which means that only one send()
will occur anyway.  (The flush call comes only when we are done
responding to the current client query.)

It's possible that for bulk data transmission situations we could
optimize things a bit better --- in particular I've wondered whether we
can reliably find out the MTU of the connection and use that as the
output buffer size, instead of trusting the kernel to choose the best
message boundaries --- but for the situation you're worried about
there will be only one send.

regards, tom lane



Re: My quick and dirty "solution" (Re: [PERFORM] Performance P roblem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Andreas Kostyrka
TOASTed means storage outside of the main table. But AFAIK, only rows bigger 
than 2K are considered for toasting.

Andreas

-- Original message --
Subject: Re: My quick and dirty "solution" (Re: [PERFORM] Performance 
Problem with Vacuum of bytea table (PG 8.0.13))
From:    Bastian Voigt <[EMAIL PROTECTED]>
Date:    25.05.2007 14:13

Richard Huxton wrote:
> Could you check the output of vacuum verbose on that table and see how 
> much work it's doing? I'd have thought the actual bytea data would be 
> TOASTed away to a separate table for storage, leaving the vacuum with 
> very little work to do.
I'm quite new to postgres (actually I just ported our running 
application from MySQL...), so I don't know what toast means. But I 
noticed that vacuum also tried to cleanup some "toast" relations or so. 
This was what took so long.

> It might well be your actual problem is your disk I/O is constantly 
> saturated and the vacuum just pushes it over the edge. In which case 
> you'll either need more/better disks or to find a quiet time once a 
> day to vacuum and just do so then.
Yes, that was definitely the case. But now everything runs smoothly 
again, so I don't think I need to buy new disks.

Regards
Bastian


-- 
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359







Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer
"Also sprach Tom Lane:"
> "Peter T. Breuer" <[EMAIL PROTECTED]> writes:
> > And definitely all those could be grouped if there are several to do.
> 
> Except that in the situation you're describing, there's only a hundred
> or two bytes of response to each query, which means that only one send()
> will occur anyway.  (The flush call comes only when we are done
> responding to the current client query.)

It may still be useful. The kernel won't necessarily send data as you
push it down to the network protocols and driver. The driver may decide
to wait for more data to accumulate, particularly if it only has a
couple of hundred bytes to send so far and the medium is high speed and
medium latency (fast ethernet).  It'll get fed up with waiting for more
data eventually, and send it out, but it is essentially waiting on
_itself_ in that case, since the outgoing data is required at the other
side of the net as a response to be processed before another query can
be sent out, only then prompting the postmaster to start stuffing the
output buffer with more bytes.

Waiting on oneself is bad for us procrastinators. We need some whips.

I'll try and really force a send, and try some more tricks.
Unfortunately this isn't really quite the right level, so I have to use
some heuristics. Can you guarantee that internal_flush is not called
until (a) the internal buffer is full, OR (b) we have finished
composing a reply, AND (c) there is no other way to send out data?

I also need to find where we begin to compose a reply. That's somewhere
well before internal flush ever gets called. I want to block output at
that point. 

As it is, I can either unblock just before internal_flush and block
after, or block just before internal_flush and unblock after (:-)
that's not quite as daft as it sounds, but needs care). Really I want
to do

   query received
   *block output
   process query
   create response
   *unblock output
   send

Instead, I have here to do

   query received
   process query
   create response
   *unblock output
   send
   *block output

Which is not quite the same. It may work though, because the driver
will know nothing is going to go out while it is listening for the next
query, and it will not have sent anything prematurely or kept it back
inopportunely.

> It's possible that for bulk data transmission situations we could
> optimize things a bit better --- in particular I've wondered whether we
> can reliably find out the MTU of the connection and use that as the
> output buffer size, instead of trusting the kernel to choose the best
> message boundaries --- but for the situation you're worried about

Don't bother, I think.  MTU is often effectively only notional these
days at the hardware level in many media.

OTOH, on my little net, MTU really does mean something because it's
10BT.

> there will be only one send.

True.

Peter



Re: [PERFORM] LIKE search and performance

2007-05-25 Thread Richard Huxton

[EMAIL PROTECTED] wrote:

On Fri, May 25, 2007 at 09:13:25AM +0100, Richard Huxton wrote:

[EMAIL PROTECTED] wrote:

And since it's basically impossible to know the selectivity of this kind
of where condition, I doubt the planner would ever realistically want to
choose that plan anyway because of its poor worst-case behavior.

What is a real life example where an intelligent and researched
database application would issue a like or ilike query as their
primary condition in a situation where they expected very high
selectivity?
Avoiding a poor worst-case behaviour for a worst-case behaviour that
won't happen doesn't seem practical.
But if you are also filtering on e.g. date, and that has an index with 
good selectivity, you're never going to use the text index anyway are 
you? If you've only got a dozen rows to check against, might as well 
just read them in.
The only time it's worth considering the behaviour at all is *if* the 
worst-case is possible.


I notice you did not provide a real life example as requested. :-)


OK - any application that allows user-built queries: foo>  


Want another? Any application that has a "search by name" box - users 
can (and do) put one letter in and hit enter.


Unfortunately you don't always have control over the selectivity of 
queries issued.



This seems like an ivory tower restriction. Not allowing best performance
in a common situation vs not allowing worst performance in a not-so-common
situation.


What best performance plan are you thinking of? I'm assuming we're 
talking about trailing-wildcard matches here, rather than "contains" 
style matches.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Tom Lane
"Peter T. Breuer" <[EMAIL PROTECTED]> writes:
> "Also sprach Tom Lane:"
>> Except that in the situation you're describing, there's only a hundred
>> or two bytes of response to each query, which means that only one send()
>> will occur anyway.  (The flush call comes only when we are done
>> responding to the current client query.)

> It may still be useful. The kernel won't necessarily send data as you
> push it down to the network protocols and driver. The driver may decide
> to wait for more data to accumulate,

No, because we set TCP_NODELAY.  Once we've flushed a message to the
kernel, we don't want the kernel sitting on it --- any delay there adds
directly to the elapsed query time.  At least this is the case for the
final response to a query.  I'm not too clear on whether this means we
need to be careful about intermediate message boundaries when there's a
lot of data being sent.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread PFC



OK - any application that allows user-built queries: foo>  


Want another? Any application that has a "search by name" box - users  
can (and do) put one letter in and hit enter.


Unfortunately you don't always have control over the selectivity of  
queries issued.


-*- HOW TO MAKE A SEARCH FORM -*-

Imagine you have to code the search on IMDB.

This is what a smart developer would do

First, he uses AJAX autocompletion, so the thing is reactive.
	Then, he does not bother the user with a many-fields form. Instead of  
forcing the user to think (users HATE that), he writes smart code.
	Does Google Maps have separate fields for country, city, street, zipcode  
? No. Because Google is about as smart as it gets.


So, you parse the user query.

	If the user types, for instance, less than 3 letters (say, spi), he  
probably wants stuff that *begins* with those letters. There is no point  
in searching for the letter "a" in a million movie titles database.
	So, if the user types "spi", you query "name LIKE 'spi%'", which is  
indexed and very fast. And since you're smart, you use AJAX. And you display  
only the most popular results (ie. most clicked on).
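
	A minimal sketch of that anchored-prefix query (the table, column and  
index names here are hypothetical; the LIMIT keeps the autocompletion cheap):

    -- assumes a btree on title (text_pattern_ops if the locale is not C)
    SELECT title
    FROM movies
    WHERE title LIKE 'spi%'
    ORDER BY popularity DESC
    LIMIT 10;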


http://imdb.com/find?s=all&q=spi

	Since 99% of the time the user wanted "spiderman" or "spielberg", you're  
done and he's happy. Users like being happy.
	If the user just types "a", you display the first 10 things that start  
with "a", this is useless but the user will marvel at your AJAX skillz.  
Then he will probably type in a few other letters.


	Then, if the user uses his space bar and types "spi 1980" you'll  
recognize a year and display spielberg's movies in 1980.
	Converting your strings to phonetics is also a good idea since about 0.7%  
of the l33T teenagers can spell stuff especially spiElberg.


	Only the guy who wants to know who had sex with marilyn monroe on the  
17th day of the shooting of Basic Instinct will need to use the Advanced  
search.


	If you detect several words, then switch to a prefix-based fulltext  
search like Xapian which utterly rocks.
	Example : the user types "savin priv", you search for "savin*" NEAR  
"priv*" and you display "saving private ryan" before he has even finished  
typing the second word of his query. Users love that, they feel  
understood, they will click on your ads and buy your products.


	In all cases, search results should be limited to less than 100 to be  
easy on the database. The user doesn't care about a search returning more  
than 10-20 results, he will just rephrase the query, and the time taken to  
fetch those thousands of records with name LIKE '%a%' will have been  
utterly lost. Who goes to page 2 in google results ?


BOTTOM LINE : databases don't think, you do.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer
"Also sprach Tom Lane:"
> > It may still be useful. The kernel won't necessarily send data as you
> > push it down to the network protocols and driver. The driver may decide
> > to wait for more data to accumulate,
> 
> No, because we set TCP_NODELAY.  Once we've flushed a message to the

That just means "disable Nagle", which is indeed more or less the
correct thing to do ..  you don't want to sit around waiting  for more
data when we're sure there will be none, as you say.  Yet you also don't
want to send short data out prematurely, which disabling Nagle can
cause.

And disabling Nagle doesn't actually force data out immediately you want
it to be sent ...  it just disables extra waits imposed by the Nagle
algorithm/protocol.  It doesn't stop the driver from waiting around
because it feels taking the bus might be a bit premature right now,
for example.

> kernel, we don't want the kernel sitting on it --- any delay there adds
> directly to the elapsed query time.  At least this is the case for the
> final response to a query.  I'm not too clear on whether this means we
> need to be careful about intermediate message boundaries when there's a
> lot of data being sent.

It's unclear. But not my situation.


If I clear TCP_CORK all data is sent at that point. If I set TCP_CORK
data is held until I clear TCP_CORK, or 200ms have passed with no send.

Peter

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread mark
On Fri, May 25, 2007 at 04:35:22PM +0100, Richard Huxton wrote:
> >I notice you did not provide a real life example as requested. :-)
> OK - any application that allows user-built queries:  foo>  
> Want another? Any application that has a "search by name" box - users 
> can (and do) put one letter in and hit enter.
> Unfortunately you don't always have control over the selectivity of 
> queries issued.

The database has 10 million records. The user enters "bar" and it
translates to "%bar%". You are suggesting that we expect bar to match
1 million+ records? :-)

I hope not. I would define this as bad process. I would also use "LIMIT"
to something like "100".

> >This seems like an ivory tower restriction. Not allowing best performance
> >in a common situation vs not allowing worst performance in a not-so-common
> >situation.
> What best performance plan are you thinking of? I'm assuming we're 
> talking about trailing-wildcard matches here, rather than "contains" 
> style matches.

"Trailing-wildcard" already uses B-Tree index, does it not?

I am speaking of contains, as contains is the one that was said to
require a seqscan. I am questioning why it requires a seqscan. The
claim was made that with MVCC, the index is insufficient to check
for visibility and that the table would need to be accessed anyways,
therefore a seqscan is required. I question whether a like '%bar%'
should be considered a high selectivity query in the general case.
I question whether a worst case should be assumed.

Perhaps I question too much? :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)

2007-05-25 Thread Bastian Voigt



OK, I'll give that a try. What about pg_autovacuum then? Is it a problem
when two processes try to vacuum the same table in parallel? Or do I
need to deactivate autovacuum altogether?


Try vacuuming every 3 minutes and see what happens.
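
A minimal sketch of that approach, run e.g. from cron every few minutes
(the table name here is assumed; plain VACUUM, unlike VACUUM FULL, does
not take an exclusive lock):

   VACUUM ANALYZE cache_table;   -- hypothetical name for the bytea cache table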



(Sorry Richard, forgot to reply to the list!)
--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread Richard Huxton

PFC wrote:


OK - any application that allows user-built queries: foo>  


Want another? Any application that has a "search by name" box - users 
can (and do) put one letter in and hit enter.


Unfortunately you don't always have control over the selectivity of 
queries issued.


-*- HOW TO MAKE A SEARCH FORM -*-

Imagine you have to code the search on IMDB.

This is what a smart developer would do


All good domain-specific tips to provide users with a satisfying 
search-experience.


None of which address the question of what plan PG should produce for: 
SELECT * FROM bigtable WHERE foo LIKE 's%'


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?

2007-05-25 Thread Arnau

Hi Tom,


Arnau <[EMAIL PROTECTED]> writes:
   I have an application that works with multiple customers. Thinking about 
scalability, we are considering the following approaches:



   - Create a separate database instance for each customer.
   - We think that each customer's DB will be quite small, about 200MB on 
average.

   - The number of clients, and therefore databases (DDBB), can be significant (thousands).
   - Have as many customers as possible on the same server, so a single 
server could have more than 300 DDBB instances.


This is probably a bad idea, unless each customer's performance demands
are so low that you can afford to use very small shared-memory settings
for each instance.  But even small settings will probably eat ~10MB per
instance --- can you afford to build these machines with multiple GB of
RAM?

Can you instead run things with one postmaster per machine and one
database per customer within that instance?  From a performance
perspective this is likely to work much better.


  What I meant is just have only one postmaster per server and a lot of 
databases running in it. Something like that:


  template1=# \l
         List of databases
    Name    |  Owner   | Encoding
 -----------+----------+----------
  alertwdv2 | gguridi  | LATIN1
  postgres  | postgres | LATIN1
  template0 | postgres | LATIN1
  template1 | postgres | LATIN1
  voicexml  | root     | LATIN1
  wikidb    | root     | LATIN1
 (6 rows)

  Here I have just 6 databases. My doubt is how it would impact performance 
if, instead of 6 databases, there were 300-600 databases running on the same 
postmaster, e.g.


  template1=# \l
             List of databases
        Name        |  Owner   | Encoding
 -------------------+----------+----------
  template0         | postgres | LATIN1
  template1         | postgres | LATIN1
  customers_group_1 | root     | LATIN1
 (3 rows)

Instead of:

  template1=# \l
           List of databases
       Name      |  Owner   | Encoding
 ----------------+----------+----------
  template0      | postgres | LATIN1
  template1      | postgres | LATIN1
  customers_1    | root     | LATIN1
  customers_2    | root     | LATIN1
  customers_3    | root     | LATIN1
  ...
  customers_500  | root     | LATIN1
 (502 rows)



If you desire to give the customers database-superuser capability then
this probably won't do, but if they are restricted users it might be OK.


  The users won't have superuser access; they will just execute plain queries.

Thank you very much
--
Arnau

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread Joshua D. Drake

[EMAIL PROTECTED] wrote:


I am speaking of contains, as contains is the one that was said to
require a seqscan. I am questioning why it requires a seqscan. The
claim was made that with MVCC, the index is insufficient to check
for visibility and that the table would need to be accessed anyways,
therefore a seqscan is required. I question whether a like '%bar%'
should be considered a high selectivity query in the general case.
I question whether a worst case should be assumed.


If you are doing %bar% you should be using pg_trgm or tsearch2.

J




Perhaps I question too much? :-)

Cheers,
mark




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread Richard Huxton

[EMAIL PROTECTED] wrote:

On Fri, May 25, 2007 at 04:35:22PM +0100, Richard Huxton wrote:

I notice you did not provide a real life example as requested. :-)
OK - any application that allows user-built queries: foo>  
Want another? Any application that has a "search by name" box - users 
can (and do) put one letter in and hit enter.
Unfortunately you don't always have control over the selectivity of 
queries issued.


The database has 10 million records. The user enters "bar" and it
translates to "%bar%". You are suggesting that we expect bar to match
1 million+ records? :-)


I was saying that you don't know. At least, I don't know of any cheap 
way of gathering full substring stats or doing a full substring 
indexing. Even tsearch2 can't do that.



I hope not. I would define this as bad process. I would also use "LIMIT"
to something like "100".


Yes, but that's not the query we're talking about is it? If possible you 
don't do '%bar%' searches at all. If you do, you try to restrict it 
further or LIMIT the results. There's nothing to discuss in these cases.



This seems like an ivory tower restriction. Not allowing best performance
in a common situation vs not allowing worst performance in a not-so-common
situation.
What best performance plan are you thinking of? I'm assuming we're 
talking about trailing-wildcard matches here, rather than "contains" 
style matches.


"Trailing-wildcard" already uses B-Tree index, does it not?


Yes, it searches the btree and then checks the data for visibility. I 
thought that was what you felt could be worked around. It appears I was 
wrong.



I am speaking of contains, as contains is the one that was said to
require a seqscan. I am questioning why it requires a seqscan. 


Well, you seemed to be suggesting you had something better in mind. At 
least, that was my reading of your original post.


> The

claim was made that with MVCC, the index is insufficient to check
for visibility 


True, for PG's implementation of MVCC. You *could* have visibility in 
each index, but that obviously would take more space. For a table with 
many indexes, that could be a *lot* more space. You also have to update 
all that visibilty information too.


> and that the table would need to be accessed anyways,

therefore a seqscan is required. I question whether a like '%bar%'
should be considered a high selectivity query in the general case.
I question whether a worst case should be assumed.


Well, the general rule-of-thumb is only about 10% for the changeover 
between index & seq-scan. That is, once you are reading 10% of the rows 
on disk (to check visibility) you might as well read them all (since 
you'll be reading most of the blocks anyway if the rows are randomly 
distributed). If you are doing SELECT * from that table then you'll want 
all that data you read. If you are doing SELECT count(*) then you only 
wanted the visibility :-(


Now you and I can look at a substring and probably make a good guess how 
common it is (assuming we know the targets are British surnames or 
Japanese towns). PG needs one number - or rather, it picks one number 
for each length of search-string (afaik).



Perhaps I question too much? :-)


Not sure it's possible to question too much :-)
However, you need to provide answers occasionally too - what numbers 
would you pick? :-)


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread Richard Huxton

PFC wrote:
None of which address the question of what plan PG should produce for: 
SELECT * FROM bigtable WHERE foo LIKE 's%'


Ah, this one already uses the btree since the '%' is at the end.
My point is that a search like this will yield too many results to 
be useful to the user anyway, so optimizing its performance is a kind of 
red herring.


At the *application level* yes.
At the *query planner* level no.

At the query planner level I just want it to come up with the best plan 
it can. The original argument was that PG's estimate of the number of 
matching rows was too optimistic (or pessimistic) in the case where we 
are doing a contains substring-search.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread PFC
None of which address the question of what plan PG should produce for:  
SELECT * FROM bigtable WHERE foo LIKE 's%'


Ah, this one already uses the btree since the '%' is at the end.
	My point is that a search like this will yield too many results to be  
useful to the user anyway, so optimizing its performance is a kind of red  
herring.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread Gregory Stark
"Richard Huxton" <[EMAIL PROTECTED]> writes:

> Now you and I can look at a substring and probably make a good guess how 
> common
> it is (assuming we know the targets are British surnames or Japanese towns). 
> PG
> needs one number - or rather, it picks one number for each length of
> search-string (afaik).

I don't think that's true. Postgres calculates the lower and upper bound
implied by the search pattern and then uses the histogram to estimate how
selective that range is. It's sometimes surprisingly good but obviously it's
not perfect.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?

2007-05-25 Thread Tom Lane
Arnau <[EMAIL PROTECTED]> writes:
>> Can you instead run things with one postmaster per machine and one
>> database per customer within that instance?  From a performance
>> perspective this is likely to work much better.

>What I meant is just have only one postmaster per server and a lot of 
> databases running in it.

OK, we are on the same page then.  Should work fine.  I think I've heard
of people running installations with thousands of DBs in them.  You'll
want to test it a bit of course ...

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] LIKE search and performance

2007-05-25 Thread Richard Huxton

Gregory Stark wrote:

"Richard Huxton" <[EMAIL PROTECTED]> writes:


Now you and I can look at a substring and probably make a good guess how common
it is (assuming we know the targets are British surnames or Japanese towns). PG
needs one number - or rather, it picks one number for each length of
search-string (afaik).


I don't think that's true. Postgres calculates the lower and upper bound
implied by the search pattern and then uses the histogram to estimate how
selective that range is. It's sometimes surprisingly good but obviously it's
not perfect.


Sorry - I'm obviously picking my words badly today.

I meant for the "contains" substring match. It gives different (goes 
away and checks...yes) predictions based on string length. So it guesses 
that LIKE '%aaa%' will match more rows than LIKE '%aaaa%'. Of course, if we 
were matching surnames you and I could say that this is very unlikely, 
but without some big statistics table I guess there's not much more PG 
can do.


For a trailing wildcard LIKE 'aaa%' it can and does as you say convert 
this into something along the lines of (>= 'aaa' AND < 'aab'). Although 
IIRC that depends if your locale allows such (not sure, I don't really 
use non-C/non-English locales enough).
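
As a sketch of that rewrite (bigtable and foo are from the example above;
text_pattern_ops is only needed when the database locale is not C):

   CREATE INDEX bigtable_foo_pattern_idx ON bigtable (foo text_pattern_ops);

   EXPLAIN SELECT * FROM bigtable WHERE foo LIKE 'aaa%';
   -- the anchored pattern is planned roughly as the range:
   --   foo >= 'aaa' AND foo < 'aab'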


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?

2007-05-25 Thread Arnau

Tom Lane wrote:

Arnau <[EMAIL PROTECTED]> writes:

Can you instead run things with one postmaster per machine and one
database per customer within that instance?  From a performance
perspective this is likely to work much better.


   What I meant is just have only one postmaster per server and a lot of 
databases running in it.


OK, we are on the same page then.  Should work fine.  I think I've heard
of people running installations with thousands of DBs in them.  You'll
want to test it a bit of course ...


I'm worried about performance. I have done some tests and I already have 
more than 400 DBs on one server, so it is possible to run that many DBs in 
a single postmaster.


   The point I'm worried about is performance. Do you think performance 
would be better executing exactly the same queries, only adding an extra 
column (e.g. customer_id) to all the tables, than opening a connection to 
that one customer's DB and executing the query there?


   I don't know whether PostgreSQL's caching mechanism works as well when 
queries can go to any of 400 DBs as when they all go to just one DB.


Thank you very much for your help :)
--
Arnau

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?

2007-05-25 Thread Tom Lane
Arnau <[EMAIL PROTECTED]> writes:
>The point I'm worried is performance. Do you think the performance 
> would be better executing exactly the same queries only adding an extra 
> column to all the tables e.g. customer_id, than open a connection to the 
> only one customers DB and execute the query there?

[ shrug... ]  That's going to depend on enough factors that I don't
think anyone could give you a generic answer.  You'd have to test it for
yourself under your own application conditions.

However: doing it that way seems to me to create severe risks that the
customers might be able to look at each others' data.  You probably want
to go with separate databases just as a security matter.
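
A minimal sketch of that separation (role, database name and password are
hypothetical; the CONNECT privilege on databases exists from 8.2 on):

   CREATE ROLE customer_1 LOGIN PASSWORD 'secret'
       NOSUPERUSER NOCREATEDB NOCREATEROLE;
   CREATE DATABASE customer_1 OWNER customer_1;
   REVOKE CONNECT ON DATABASE customer_1 FROM PUBLIC;
   GRANT CONNECT ON DATABASE customer_1 TO customer_1;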

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[PERFORM] Performance problem on 8.2.4, but not 8.2.3

2007-05-25 Thread Dave Pirotte

Greetings,

We have two servers running pgsql -- an older server running 8.2.3,  
and a newer (far superior) one running 8.2.4.  One of our reporting  
queries is running painfully slowly on 8.2.4, but it executes in a  
reasonable timeframe on 8.2.3.  Below, I've included a contrived,  
stripped down query which still exhibits the same unintuitively poor  
performance, as well as its explain analyze output from both  
servers.  In particular, 8.2.4 opts for filters in a couple places  
where we would expect index conds.  Also, we've noticed that the  
8.2.4 box (in other similar queries) consistently underestimates  
costs, whereas the 8.2.3 box consistently overestimates.


All columns involved in this query are indexed (btrees), and there is  
a functional index on mm_date_trunc('day', created_at)...where  
mm_date_trunc is simply an immutable version of date_trunc (fine for  
our purposes).  The only configuration differences between the  
servers are various memory settings... work_mem and temp_buffers are  
8mb / 16mb, shared buffers 128mb / 512mb on the 8.2.3 and 8.2.4  
servers, respectively.  Stats targets are 10 on both, for  
consistency... but it is worth mentioning that performance was still  
abysmal under 8.2.4 with 250 as the target.


Any insight would be most appreciated, as we're a bit stumped.  Thanks!

Cheers,

Dave Pirotte
Director of Technology
Media Matters for America

===

select h.day, h.c as total,
(select count(*) as c
from hits h2
join uri_qstrings uq on (h2.uri_qstring_id = uq.id)
join referrer_paths rp on (h2.referrer_path_id = rp.id)
join referrer_domains rd on (rp.referrer_domain_id = rd.id)
where mm_date_trunc('day', created_at) = h.day
and site_id = 3
and uq.qstring = '?f=h_top'
and rd.domain = 'mediamatters.org'
) as h_top
from (
select mm_date_trunc('day', h.created_at) as day,
count(*) as c
from hits h
where created_at > date_trunc('day', now() - interval '2 days')
group by mm_date_trunc('day', h.created_at)
) h
order by h.day asc;


QUERY PLAN (8.2.4)
--------------------------------------------------------------------------------
Sort  (cost=204012.65..204012.66 rows=3 width=16) (actual time=83012.885..83012.885 rows=3 loops=1)
  Sort Key: "day"
  ->  Subquery Scan h  (cost=149811.02..204012.62 rows=3 width=16) (actual time=28875.251..83012.868 rows=3 loops=1)
        ->  HashAggregate  (cost=149811.02..149811.06 rows=3 width=8) (actual time=1602.787..1602.794 rows=3 loops=1)
              ->  Bitmap Heap Scan on hits h  (cost=6485.90..148079.18 rows=346368 width=8) (actual time=48.222..1358.196 rows=391026 loops=1)
                    Recheck Cond: (created_at > date_trunc('day'::text, (now() - '2 days'::interval)))
                    ->  Bitmap Index Scan on hits_created_idx  (cost=0.00..6399.31 rows=346368 width=0) (actual time=47.293..47.293 rows=391027 loops=1)
                          Index Cond: (created_at > date_trunc('day'::text, (now() - '2 days'::interval)))
        SubPlan
          ->  Aggregate  (cost=18067.17..18067.18 rows=1 width=0) (actual time=27136.681..27136.681 rows=1 loops=3)
                ->  Nested Loop  (cost=40.66..18067.16 rows=1 width=0) (actual time=1105.396..27135.496 rows=3394 loops=3)
                      ->  Nested Loop  (cost=40.66..18063.56 rows=9 width=8) (actual time=32.132..26837.394 rows=50537 loops=3)
                            ->  Nested Loop  (cost=40.66..5869.35 rows=47 width=8) (actual time=20.482..276.889 rows=121399 loops=3)
                                  ->  Index Scan using referrer_domains_domains_idx on referrer_domains rd  (cost=0.00..8.27 rows=1 width=8) (actual time=0.024..0.026 rows=1 loops=3)
                                        Index Cond: (("domain")::text = 'mediamatters.org'::text)
                                  ->  Bitmap Heap Scan on referrer_paths rp  (cost=40.66..5834.77 rows=2105 width=16) (actual time=20.402..210.440 rows=121399 loops=3)
                                        Recheck Cond: (rp.referrer_domain_id = rd.id)
                                        ->  Bitmap Index Scan on referrer_paths_domains_idx  (cost=0.00..40.13 rows=2105 width=0) (actual time=17.077..17.077 rows=121399 loops=3)
                                              Index Cond: (rp.referrer_domain_id = rd.id)
                            ->  Index Scan using hits_refer_idx on hits h2  (cost=0.00..257.59 rows=149 width=16) (actual time=0.167..0.218 rows=0 loops=364197)
                                  Index Cond: (h2.r

Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3

2007-05-25 Thread Kristo Kaiv

Lo!

referrer_paths seems to have totally wrong stats. Try a full analyze on
it.

How many records in total do you have in referrer_paths on the 8.2.4 server?
It might just be a problem of the usage pattern changing from the old system
to the new one (1 row vs. 121399 rows)?
It does not seem to be just a plan problem, as the data itself seems to
be quite different.
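
A quick way to follow that up (a sketch; the table and column names are
taken from the plan):

   VACUUM FULL ANALYZE referrer_paths;

   -- what the planner currently believes about the join column
   SELECT attname, n_distinct, most_common_vals
   FROM pg_stats
   WHERE tablename = 'referrer_paths';

   -- and what the table actually contains
   SELECT count(*) AS total_rows,
          count(DISTINCT referrer_domain_id) AS distinct_domains
   FROM referrer_paths;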


Kristo

On 25.05.2007, at 21:08, Dave Pirotte wrote:


Greetings,

We have two servers running pgsql -- an older server running 8.2.3,  
and a newer (far superior) one running 8.2.4.  One of our reporting  
queries is running painfully slowly on 8.2.4, but it executes in a  
reasonable timeframe on 8.2.3.  Below, I've included a contrived,  
stripped down query which still exhibits the same unintuitively  
poor performance, as well as its explain analyze output from both  
servers.  In particular, 8.2.4 opts for filters in a couple places  
where we would expect index conds.  Also, we've noticed that the  
8.2.4 box (in other similar queries) consistently underestimates  
costs, whereas the 8.2.3 box consistently overestimates.


All columns involved in this query are indexed (btrees), and there  
is a functional index on mm_date_trunc('day', created_at)...where  
mm_date_trunc is simply an immutable version of date_trunc (fine  
for our purposes).  The only configuration differences between the  
servers are various memory settings... work_mem and temp_buffers  
are 8mb / 16mb, shared buffers 128mb / 512mb on the 8.2.3 and 8.2.4  
servers, respectively.  Stats targets are 10 on both, for  
consistency... but it is worth mentioning that performance was  
still abysmal under 8.2.4 with 250 as the target.


Any insight would be most appreciated, as we're a bit stumped.   
Thanks!


Cheers,

Dave Pirotte
Director of Technology
Media Matters for America

===

select h.day, h.c as total,
(select count(*) as c
from hits h2
join uri_qstrings uq on (h2.uri_qstring_id = uq.id)
join referrer_paths rp on (h2.referrer_path_id = rp.id)
join referrer_domains rd on (rp.referrer_domain_id = rd.id)
where mm_date_trunc('day', created_at) = h.day
and site_id = 3
and uq.qstring = '?f=h_top'
and rd.domain = 'mediamatters.org'
) as h_top
from (
select mm_date_trunc('day', h.created_at) as day,
count(*) as c
from hits h
where created_at > date_trunc('day', now() - interval '2 days')
group by mm_date_trunc('day', h.created_at)
) h
order by h.day asc;

   
 QUERY PLAN (8.2.4)
-- 
-- 

Sort  (cost=204012.65..204012.66 rows=3 width=16) (actual  
time=83012.885..83012.885 rows=3 loops=1)
   Sort Key: "day"   ->  Subquery Scan h   
(cost=149811.02..204012.62 rows=3 width=16) (actual  
time=28875.251..83012.868 rows=3 loops=1)
 ->  HashAggregate  (cost=149811.02..149811.06 rows=3  
width=8) (actual time=1602.787..1602.794 rows=3 loops=1)
   ->  Bitmap Heap Scan on hits h   
(cost=6485.90..148079.18 rows=346368 width=8) (actual  
time=48.222..1358.196 rows=391026 loops=1)
 Recheck Cond: (created_at > date_trunc 
('day'::text, (now() - '2 days'::interval)))
 ->  Bitmap Index Scan on hits_created_idx   
(cost=0.00..6399.31 rows=346368 width=0) (actual  
time=47.293..47.293 rows=391027 loops=1)
   Index Cond: (created_at > date_trunc 
('day'::text, (now() - '2 days'::interval)))

 SubPlan
   ->  Aggregate  (cost=18067.17..18067.18 rows=1 width=0)  
(actual time=27136.681..27136.681 rows=1 loops=3)
 ->  Nested Loop  (cost=40.66..18067.16 rows=1  
width=0) (actual time=1105.396..27135.496 rows=3394 loops=3)
   ->  Nested Loop  (cost=40.66..18063.56  
rows=9 width=8) (actual time=32.132..26837.394 rows=50537 loops=3)
 ->  Nested Loop  (cost=40.66..5869.35  
rows=47 width=8) (actual time=20.482..276.889 rows=121399 loops=3)
   ->  Index Scan using  
referrer_domains_domains_idx on referrer_domains rd   
(cost=0.00..8.27 rows=1 width=8) (actual time=0.024..0.026 rows=1  
loops=3)
 Index Cond:  
(("domain")::text = 'mediamatters.org'::text)
   ->  Bitmap Heap Scan on  
referrer_paths rp  (cost=40.66..5834.77 rows=2105 width=16) (actual  
time=20.402..210.440 rows=121399 loops=3)
 Recheck Cond:  
(rp.referrer_domain_id = rd.id)
 ->  Bitmap Index Scan on  
referrer_paths_doma

Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer
"Also sprach Kenneth Marshall:"
> improvement from coalescing the packets. Good luck in your investigations.

While I am recompiling stuff, just some stats.

Typical network traffic analysis during the PG runs:

Total Packets Processed 493,499
Unicast 100.0%  493,417
Broadcast   0.0%82
Multicast   0.0%0
Packet size distribution chart
Shortest 42 bytes
Average Size192 bytes
Longest   1,514 bytes
<= 64 bytes 0.0%   158
64 to 128 bytes77.3%   381,532
129 to 256 bytes6.8%33,362
257 to 512 bytes8.6%42,535
513 to 1024 bytes   4.0%19,577
1025 to 1518 bytes  3.3%16,335
 
Typical application rusage stats:

   time ./c -timeout 12000 -database postgresql://pebbles/d /tmp/tty_io..c
   user   system elapsed cpu
   7.866u 6.038s 5:49.13 3.9%  0+0k 0+0io 0pf+0w
   
Those stats show the system lost in i/o. It's neither in kernel nor in
userspace. Presumably the other side plus networking was the holdup.

For comparison, against localhost via loopback ("fake" networking):

   time ./c -timeout 12000 -database postgresql://localhost/d /tmp/tty_io..c
   user   system elapsed cpu
   9.483u 5.321s 2:41.78 9.1%  0+0k 0+0io 0pf+0w

but in that case postmaster was doing about 54% cpu, so the overall
cpu for server + client is 63%.

I moved to a unix domain socket and postmaster alone went to 68%.


   time ./c -timeout 12000 -database postgresql://unix/var/run/postgresql/d 
/tmp/tty_io..c
   user   system elapsed cpu
   9.569u 3.698s 2:52.41 7.6%  0+0k 0+0io 0pf+0w

The elapsed time is not much different between unix and localhost. One can
see that there is some i/o holdup because the two threads ought to do 100%
between them if handover of info were costless; the difference is the i/o
wait (the system was quiescent otherwise, apart from the monitoring
software, which shows only a fraction of a percent loading). There were no
memory shortages and swap was disabled for the test (both sides).

For comparison, running against gdbm straight to disk

  time ./c -timeout 12000  /tmp/tty_io..c
  user   system elapsed cpu
  2.637u 0.735s 0:05.34 62.9% 0+0k 0+0io 0pf+0w

Through localhost:

  time ./c -timeout 12000 -database gdbm://localhost/ptb/c /tmp/tty_io..c
  user   system elapsed cpu
  2.746u 3.699s 0:16.00 40.1% 0+0k 0+0io 0pf+0w

(the server process was at 35% cpu, for 75% total).
 
Across the net:

  time ./c -timeout 12000 -database gdbm://pebbles/ptb/c /tmp/tty_io..c
  user   system elapsed cpu
  2.982u 4.430s 1:03.44 7.9%  0+0k 0+0io 0pf+0w

(the server was at 7% cpu)


Have to go shopping 
  
Peter

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3

2007-05-25 Thread Tom Lane
Dave Pirotte <[EMAIL PROTECTED]> writes:
> We have two servers running pgsql -- an older server running 8.2.3,  
> and a newer (far superior) one running 8.2.4.  One of our reporting  
> queries is running painfully slowly on 8.2.4, but it executes in a  
> reasonable timeframe on 8.2.3.

Are you sure you've analyzed all these tables in the 8.2.4 database?
Some of the rowcount estimates seem a bit far off.

I looked through the CVS logs and didn't find any planner changes
between 8.2.3 and 8.2.4 that seem likely to affect your query, so
I'm thinking it must be a statistical discrepancy.
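
If one column's estimates turn out to be the problem, its statistics target
can be raised individually and the table re-analyzed (a sketch using the
join column from the posted plan):

   ALTER TABLE referrer_paths ALTER COLUMN referrer_domain_id SET STATISTICS 100;
   ANALYZE referrer_paths;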

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3

2007-05-25 Thread Steinar H. Gunderson
On Fri, May 25, 2007 at 03:56:35PM -0400, Tom Lane wrote:
> I looked through the CVS logs and didn't find any planner changes
> between 8.2.3 and 8.2.4 that seem likely to affect your query, so
> I'm thinking it must be a statistical discrepancy.

It looks like the estimated cost is lower for 8.2.4 -- could it be that the
fact that he's giving it more memory lead to the planner picking a plan that
happens to be worse?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3

2007-05-25 Thread Tom Lane
"Steinar H. Gunderson" <[EMAIL PROTECTED]> writes:
> It looks like the estimated cost is lower for 8.2.4 -- could it be that the
> fact that he's giving it more memory lead to the planner picking a plan that
> happens to be worse?

Offhand I don't think so.  More work_mem might make a hash join look
cheaper (or a sort for a mergejoin), but the problem here seems to be
that it's switching away from a hash and to a nestloop.  Which is a
loser because there are many more outer-relation rows than it's
expecting.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3

2007-05-25 Thread Dave Pirotte
Thanks for the quick responses.  :-)  The data is almost identical  
between the two servers: 8.2.3 has 882198 records, 8.2.4 has 893121.   
For background, I pg_dump'ed the data into the 8.2.4 server  
yesterday, and analyzed with the stats target of 250, then reanalyzed  
with target 10.   So, the statistics should theoretically be ok.   
Running a vacuum full analyze on referrer_paths, per Kristo's  
suggestion, didn't affect the query plan.


We downgraded to 8.2.3 just to rule that out, upped stats target to  
100, analyzed, and are still experiencing the same behavior -- it's  
still coming up with the same bogus rowcount estimates.  Over the  
weekend I'll lower the memory and see if that does anything, just to  
rule that out...  Any other thoughts?  Thanks so much for your time  
and suggestions thus far.


Cheers,
Dave

On May 25, 2007, at 4:33 PM, Tom Lane wrote:


"Steinar H. Gunderson" <[EMAIL PROTECTED]> writes:
It looks like the estimated cost is lower for 8.2.4 -- could it be  
that the
fact that he's giving it more memory lead to the planner picking a  
plan that

happens to be worse?


Offhand I don't think so.  More work_mem might make a hash join look
cheaper (or a sort for a mergejoin), but the problem here seems to be
that it's switching away from a hash and to a nestloop.  Which is a
loser because there are many more outer-relation rows than it's
expecting.

regards, tom lane

---(end of  
broadcast)---

TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate



Dave Pirotte
Director of Technology
Media Matters for America
[EMAIL PROTECTED]
phone: 202-756-4122




[PERFORM] Adding disks/xlog & index

2007-05-25 Thread lists
I have a busy postgresql server running running on a raid1 of 2 15k rpm
scsi drives.

I have been running into the problem of maxed out IO bandwidth. I would
like to convert my raid1 into a raid10 but that would require a full
rebuild which is more downtime than I want so I am looking into other
alternatives.

The best one I have come up with is moving the xlog/wal (can someone
confirm whether these are the same thing?) to another physical drive. I
also think it may be beneficial to move some indexes to another drive as
well (same one as xlog).

Some questions on this:
1. Can the database survive loss/corruption of the xlog and indexes in a
recoverable way? To save money (and because I won't need the throughput as
much), I am thinking on making this index/wal/xlog drive a single cheap
sata drive (or maybe a non-raided 15k scsi for 60% more money). However
without the redundancy of a mirror I am concerned about drive failure.
Loss of several mins of recent transactions in a serious crash is
acceptable to me, but full/serious database corruption (the likes of fsync
off) is not.

2. Is there any point using a high performance (ie scsi) disk for this, or
would the mirror containing the majority of the data still be the major
bottleneck causing the disk usage to not exceed sata performance anyway?

3. Is there any easy way to move ALL indexes to another drive? Is this a
good performance idea or would they just bottleneck each other seriously?


Other info for reference
Running postgresql 8.2 on FreeBSD 6.1
server is a core2 with 4gb of ram. CPU usage is moderate.


Also, can anyone recommend a good shared_buffers size? The server is
dedicated to postgres except for half a gig used by memcached. Right now I
have it set at 51200 which may be too high (I've read varying suggestions
with this and I'm not sure how aggressive FreeBSD6's IO cache is).

And any suggestions on what effective_cache_size I should use on this
hardware and OS? I've been using 384MB but I don't know if this is optimal
or not.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Peter T. Breuer
"Also sprach Richard Huxton:"
> > scheme each time, for example!  (how that?). I could presumably also
> > help it by preloading the commands I will run and sending over the 
> > params only with a "do a no. 17 now!".
> 
> PREPARE/EXECUTE (or the equivalent libpq functions).

Yes, thank you. It seems to speed things up by a factor of 2.

But can I prepare a DECLARE x BINARY CURSOR FOR SELECT ...  statement?
The manual seems to say no.

Peter

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Big problem with sql update operation

2007-05-25 Thread Michal Szymanski

Tom Lane wrote:

Michal Szymanski <[EMAIL PROTECTED]> writes:
  

CREATE OR REPLACE FUNCTION test()
RETURNS void AS
$BODY$
DECLARE
BEGIN
    FOR v_i IN 1..4000 LOOP

        UPDATE group_fin_account_tst SET
            credit = v_i
        WHERE group_fin_account_tst_id = 1;   -- in the real procedure a
                                              -- different row is updated each time
    END LOOP;
END;
$BODY$ LANGUAGE plpgsql;



Does updating the *same* record 4000 times per transaction reflect the
real behavior of your application?  If not, this is not a good
benchmark.  If so, consider redesigning your app to avoid so many
redundant updates.

  

The real application modifies a different row each time.


(For the record, the reason you see nonlinear degradation is the
accumulation of tentatively-dead versions of the row, each of which has
to be rechecked by each later update.)

  
There is another strange thing. We have two versions of our test 
environment, one with a copy of the production DB and a second generated 
with a minimal data set, and it is odd that the update presented above 
executes in 170ms on the copy of production but takes 6s on the small DB.


Michal Szymanski
http://blog.szymanskich.net

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Big problem with sql update operation

2007-05-25 Thread Alvaro Herrera
Michal Szymanski wrote:
> Tom Lane wrote:

> >(For the record, the reason you see nonlinear degradation is the
> >accumulation of tentatively-dead versions of the row, each of which has
> >to be rechecked by each later update.)
> >  
> There is another  strange  thing. We  have two versions of our test 
> environment one with production DB copy and second genereated with 
> minimal data set and it is odd that update presented above on copy of 
> production is executing 170ms but on small DB it executing 6s 

How are you vacuuming the tables?

-- 
Alvaro Herrerahttp://www.advogato.org/person/alvherre
"El conflicto es el camino real hacia la unión"

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Adding disks/xlog & index

2007-05-25 Thread Tom Lane
[EMAIL PROTECTED] writes:
> The best one I have come up with is moving the xlog/wal (can someone
> confirm whether these are the same thing?) to another physical drive.

Yeah, two names for same thing.

> I also think it may be beneficial to move some indexes to another drive as
> well (same one as xlog).

Depends on how the I/O workload works out.  On systems that have fairly
heavy write traffic, the standard advice is that you want WAL on its own
dedicated spindle, because the less that head needs to move the faster
you can write WAL, and WAL output speed is going to determine how fast
you can perform updates.

If it's a read-mostly database then maybe you can ignore that advice and
worry more about separating indexes from tables.

> 1. Can the database survive loss/corruption of the xlog and indexes in a
> recoverable way? To save money (and because I won't need the throughput as
> much), I am thinking on making this index/wal/xlog drive a single cheap
> sata drive (or maybe a non-raided 15k scsi for 60% more money).

Do not go cheap on the WAL drive --- you lose WAL, you're in serious
trouble.  Indexes can always be rebuilt with REINDEX, so they're maybe
a bit more expendable.

> 3. Is there any easy way to move ALL indexes to another drive?

No, I think you have to move 'em one at a time :-(.  The standard advice
for this is to set up a plpgsql function that scans the catalogs and
issues the commands you want (ALTER INDEX SET TABLESPACE in this case).
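
A rough, untested sketch of such a function (the tablespace name and
location are hypothetical):

   CREATE TABLESPACE idxspace LOCATION '/mnt/disk2/pg_indexes';

   CREATE OR REPLACE FUNCTION move_all_indexes(target text) RETURNS void AS $$
   DECLARE
       idx record;
   BEGIN
       FOR idx IN
           SELECT n.nspname, c.relname
           FROM pg_class c
           JOIN pg_namespace n ON n.oid = c.relnamespace
           WHERE c.relkind = 'i'
             AND n.nspname NOT IN ('pg_catalog', 'pg_toast')
       LOOP
           -- move each user index to the target tablespace
           EXECUTE 'ALTER INDEX ' || quote_ident(idx.nspname) || '.'
                || quote_ident(idx.relname)
                || ' SET TABLESPACE ' || quote_ident(target);
       END LOOP;
   END;
   $$ LANGUAGE plpgsql;

   SELECT move_all_indexes('idxspace');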

> Is this a
> good performance idea or would they just bottleneck each other seriously?

Impossible to tell without a lot more details than you provided.  I'd
suggest you try it and see.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Tom Lane
"Peter T. Breuer" <[EMAIL PROTECTED]> writes:
> But can I prepare a DECLARE x BINARY CURSOR FOR SELECT ...  statement?
> The manual seems to say no.

No, you just prepare the SELECT.  At the protocol level, DECLARE CURSOR
is a tad useless.  You can still fetch the data in binary if you want...
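
At the SQL level the prepared round trip looks something like this (a sketch
with a hypothetical table; libpq's PQprepare/PQexecPrepared carry the same
messages over the protocol, and PQexecPrepared's result-format flag selects
binary output):

   PREPARE fetch_item (int) AS
       SELECT data FROM items WHERE id = $1;

   EXECUTE fetch_item(17);   -- "do a no. 17 now!"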

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Adding disks/xlog & index

2007-05-25 Thread Gregory Stark
<[EMAIL PROTECTED]> writes:

> Some questions on this:
> 1. Can the database survive loss/corruption of the xlog and indexes in a
> recoverable way? To save money (and because I won't need the throughput as
> much), I am thinking on making this index/wal/xlog drive a single cheap
> sata drive (or maybe a non-raided 15k scsi for 60% more money). However
> without the redundancy of a mirror I am concerned about drive failure.
> Loss of several mins of recent transactions in a serious crash is
> acceptable to be, but full/serious database corruption (the likes of fsync
> off) is not.

Losing any WAL that the database has fsynced is exactly like having fsync off.

> 2. Is there any point using a high performance (ie scsi) disk for this, or
> would the mirror containing the majority of the data still be the major
> bottleneck causing the disk usage to not exceed sata performance anyway?

Well that depends on your database traffic. In most databases the volume of
WAL traffic is substantially less than the i/o traffic to the data drives. So
you usually don't need to be able to sustain high i/o bandwidth to the WAL
drive.

However in some database loads the latency to the WAL drive does matter. This
is especially true if you're executing a lot of short transactions and
response time is critical. Especially if you aren't executing many such
transactions in parallel. So for example if you're processing a serial batch
of short transactions and committing each one as a separate transaction. In
that case you would want a drive that can fsync fast which either means a
battery backed cache or 15kRPM drive. It doesn't necessarily mean you need a
big raid array though.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[PERFORM] ECC RAM really needed?

2007-05-25 Thread Craig James

We're thinking of building some new servers.  We bought some a while back that 
have ECC (error correcting) RAM, which is absurdly expensive compared to the 
same amount of non-ECC RAM.  Does anyone have any real-life data about the 
error rate of non-ECC RAM, and whether it matters or not?  In my long career, 
I've never once had a computer that corrupted memory, or at least I never knew 
if it did.  ECC sound like a good idea, but is it solving a non-problem?

Thanks,
Craig

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] ECC RAM really needed?

2007-05-25 Thread Bruno Wolff III
On Fri, May 25, 2007 at 18:45:15 -0700,
  Craig James <[EMAIL PROTECTED]> wrote:
> We're thinking of building some new servers.  We bought some a while back 
> that have ECC (error correcting) RAM, which is absurdly expensive compared 
> to the same amount of non-ECC RAM.  Does anyone have any real-life data 
> about the error rate of non-ECC RAM, and whether it matters or not?  In my 
> long career, I've never once had a computer that corrupted memory, or at 
> least I never knew if it did.  ECC sound like a good idea, but is it 
> solving a non-problem?

In the past when I purchased ECC ram it wasn't that much more expensive
than nonECC ram.

Wikipedia suggests a rule of thumb of one error per month per gigabyte,
though suggests error rates vary widely. They reference a paper that should
provide you with more background.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] ECC RAM really needed?

2007-05-25 Thread Greg Smith

On Fri, 25 May 2007, Bruno Wolff III wrote:

Wikipedia suggests a rule of thumb of one error per month per gigabyte, 
though suggests error rates vary widely. They reference a paper that 
should provide you with more background.


The paper I would recommend is

http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf

which is a summary of many other people's papers, and quite informative. 
I know I had no idea before reading it how much error rates go up with 
increasing altitude.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] ECC RAM really needed?

2007-05-25 Thread Tom Lane
Greg Smith <[EMAIL PROTECTED]> writes:
> The paper I would recommend is
> http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf
> which is a summary of many other people's papers, and quite informative. 
> I know I had no idea before reading it how much error rates go up with 
> increasing altitude.

Not real surprising if you figure the problem is mostly cosmic rays.

Anyway, this paper says

> Even using a relatively conservative error rate (500 FIT/Mbit), a
> system with 1 GByte of RAM can expect an error every two weeks;

which should pretty much cure any idea that you want to run a server
with non-ECC memory.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings