Re: [PERFORM] Sort and index
On Sun, 24 Apr 2005 17:01:46 -0500, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote: >> >> Feel free to propose better cost equations. I did. More than once. >estimated index scan cost for (project_id, id, date) is >0.00..100117429.34 while the estimate for work_units is >0.00..103168408.62; almost no difference, ~3% > even though project_id correlation is .657 This is divided by the number of index columns, so the index correlation is estimated to be 0.219. > while work_units correlation is .116. So csquared is 0.048 and 0.013, respectively, and you get a result not far away from the upper bound in both cases. The cost estimations differ by only 3.5% of (max_IO_cost - min_IO_cost). >you'll see that the cost of the index scan is way overestimated. Looking >at the code, the runcost is calculated as > >run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost); > >where csquared is indexCorrelation^2. Why is indexCorrelation squared? >The comments say a linear interpolation between min_IO and max_IO is >used, but ISTM that if it was linear then instead of csquared, >indexCorrelation would just be used. In my tests I got much more plausible results with 1 - (1 - abs(correlation))^2 Jim, are you willing to experiment with one or two small patches of mine? What version of Postgres are you running? Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
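To make the interpolation concrete, here is a small sketch (not from the original mails; the I/O cost bounds are invented placeholders, only the correlations are the ones quoted above) comparing the current csquared formula with the proposed alternative:

/* cost_interp.c - sketch: csquared interpolation vs. 1 - (1 - |corr|)^2.
 * min_io/max_io are made-up bounds; the correlations are the 0.657/3 and
 * 0.116 estimates discussed in this thread. */
#include <stdio.h>
#include <math.h>

static double interp_csquared(double min_io, double max_io, double corr)
{
    double csquared = corr * corr;
    return max_io + csquared * (min_io - max_io);   /* current formula */
}

static double interp_proposed(double min_io, double max_io, double corr)
{
    double c = 1.0 - pow(1.0 - fabs(corr), 2.0);    /* proposed alternative */
    return max_io + c * (min_io - max_io);
}

int main(void)
{
    double min_io = 1000.0, max_io = 100000000.0;   /* placeholder bounds */
    double corrs[] = { 0.657 / 3.0, 0.116 };        /* per-index estimates */
    for (int i = 0; i < 2; i++)
        printf("corr=%.3f  csquared=%.0f  proposed=%.0f\n",
               corrs[i],
               interp_csquared(min_io, max_io, corrs[i]),
               interp_proposed(min_io, max_io, corrs[i]));
    return 0;
}

With both correlations so close to zero, the csquared results sit almost at max_IO_cost, while the proposed formula pulls them noticeably towards min_IO_cost.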
Re: [PERFORM] Sort and index
On Wed, 11 May 2005 16:15:16 -0500, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
>> This is divided by the number of index columns, so the index correlation
>> is estimated to be 0.219.
>
>That seems like a pretty bad assumption to make.

Any assumption we make without looking at entire index tuples has to be bad. A new GUC variable secondary_correlation introduced by my patch at least gives you a chance to manually control the effects of additional index columns.

>> In my tests I got much more plausible results with
>>
>> 1 - (1 - abs(correlation))^2
>
>What's the theory behind that?

The same as for csquared -- pure intuition. But the numbers presented in
http://archives.postgresql.org/pgsql-hackers/2002-10/msg00072.php
seem to imply that in this case my intuition is better ;-)

Actually the above formula was not proposed in that mail. AFAIR it gives results between p2 and p3.

>And I'd still like to know why correlation squared is used.

On Wed, 02 Oct 2002 18:48:49 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
|The indexCorrelation^2 algorithm was only a quick hack with no theory
|behind it :-(.

>It depends on the patches, since this is a production machine. Currently
>it's running 7.4.*mumble*,

The patch referenced in
http://archives.postgresql.org/pgsql-hackers/2003-08/msg00931.php
is still available. It doesn't touch too many places and should be easy to review. I have been using it and its predecessors in production for more than two years. Let me know if the 74b1 version does not apply cleanly to your source tree.

Servus
Manfred

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] slow queries, possibly disk io
>On 5/31/05, Martin Fandel <[EMAIL PROTECTED]> wrote:
>> In the documentation of
>> http://www.powerpostgresql.com/Downloads/annotated_conf_80.html
>> is the shared_buffers set to 1/3 of the available RAM.

Well, it says "you should never use more than 1/3 of your available RAM", which is not quite the same as "it is set to 1/3." I'd even say: never set it higher than 1/10 of your available RAM, unless you know what you're doing and why you're doing it.

Servus
Manfred

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Odd Locking Problem
On Thu, 11 Aug 2005 16:11:58 -0500, John A Meinel <[EMAIL PROTECTED]> wrote: >the insert is occurring into table 'a' not table 'b'. >'a' refers to other tables, but these should not be modified. So your "a" is Alvaro's "b", and one of your referenced tables is Alvaro's "a". This is further supported by the fact that the problem doesn't occur with 8.1. Servus Manfred ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Insert performance (OT?)
On Tue, 19 Jul 2005 11:51:51 +0100, Richard Huxton wrote: >You could get away with one query if you converted them to left-joins: >INSERT INTO ... >SELECT * FROM upload LEFT JOIN ... WHERE f3 IS NULL >UNION >SELECT * FROM upload LEFT JOIN ... WHERE f4 IS NULL For the archives: This won't work. Each of the two SELECTs eliminates rows violating one of the two constraints but includes rows violating the other constraint. After the UNION you are back to violating both constraints :-( Servus Manfred ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] LEFT JOIN optimization
On Mon, 12 Sep 2005 00:47:57 +0300, Ksenia Marasanova <[EMAIL PROTECTED]> wrote: > -> Seq Scan on user_ (cost=0.00..7430.63 rows=12763 width=245) >(actual time=360.431..1120.012 rows=12763 loops=1) If 12000 rows of the given size are stored in more than 7000 pages, then there is a lot of free space in these pages. Try VACUUM FULL ... Servus Manfred ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Queries taking ages in PG 8.1, have been much faster in PG<=8.0
On Mon, 05 Dec 2005 10:11:41 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
>> Correlation -0.0736492
>> Correlation -0.237136

>That has considerable impact on the
>estimated cost of an indexscan

The cost estimator uses correlation squared. So all correlations between -0.3 and +0.3 can be considered equal under the assumption that estimation errors of up to 10% are acceptable.

Servus
Manfred

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
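A quick numeric check of that claim (a sketch with invented cost bounds; only the two correlations come from the message above):

/* csquared_effect.c - sketch: how little |correlation| <= 0.3 moves the
 * interpolated index scan cost.  min_io/max_io are arbitrary. */
#include <stdio.h>

int main(void)
{
    double min_io = 1.0, max_io = 100.0;            /* illustrative bounds */
    double corrs[] = { -0.0736492, -0.237136, 0.3 };
    for (int i = 0; i < 3; i++) {
        double csq = corrs[i] * corrs[i];
        double cost = max_io + csq * (min_io - max_io);
        printf("correlation %+.6f -> csquared %.4f -> cost %.2f\n",
               corrs[i], csq, cost);
    }
    return 0;
}

Even at 0.3 the estimate moves less than 10% of the way from max_IO_cost towards min_IO_cost.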
Re: [PERFORM] Joining 2 tables with 300 million rows
On Thu, 8 Dec 2005 11:59:24 -0500 , Amit V Shah <[EMAIL PROTECTED]> wrote: > CONSTRAINT pk_runresult_has_catalogtable PRIMARY KEY >(runresult_id_runresult, catalogtable_id_catalogtable, value) >' -> Index Scan using runresult_has_catalogtable_id_runresult >on runresult_has_catalogtable runresult_has_catalogtable_1 >(cost=0.00..76.65 rows=41 width=8) (actual time=0.015..0.017 rows=1 >loops=30)' >'Index Cond: >(runresult_has_catalogtable_1.runresult_id_runresult = >"outer".runresult_id_runresult)' >'Filter: ((catalogtable_id_catalogtable = 54) AND (value >= 1))' If I were the planner, I'd use the primary key index. You seem to have a redundant(?) index on runresult_has_catalogtable(runresult_id_runresult). Dropping it might help, or it might make things much worse. But at this stage this is pure speculation. Give us more information first. Show us the complete definition (including *all* indices) of all tables occurring in your query. What Postgres version is this? And please post EXPLAIN ANALYSE output of a *slow* query. Servus Manfred ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] index v. seqscan for certain values
On Tue, 13 Apr 2004 13:55:49 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
>Possibly the
>nonuniform clumping of CID has something to do with the poor results.

It shouldn't. The sampling algorithm is designed to give each tuple the same chance of ending up in the sample, and tuples are selected independently. (IOW each one of the {N \choose n} possible samples has the same probability.) There are known problems with nonuniform distribution of dead vs. live and large vs. small tuples, but AFAICS the order of values does not matter.

Servus
Manfred

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] query slows down with more accurate stats
[Just a quick note here; a more thorough discussion of my test results will be posted to -hackers]

On Tue, 13 Apr 2004 15:18:42 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
>Well, the first problem is why is ANALYZE's estimate of the total row
>count so bad :-( ? I suspect you are running into the situation where
>the initial pages of the table are thinly populated and ANALYZE
>mistakenly assumes the rest are too. Manfred is working on a revised
>sampling method for ANALYZE that should fix this problem

The new method looks very promising with respect to row count estimation: I got estimation errors of +/- 1% where the old method was off by up to 60%. (My test methods might be a bit biased though :-))

My biggest concern at the moment is that the new sampling method violates the contract of returning each possible sample with the same probability: getting several tuples from the same page is more likely than with the old method.

Servus
Manfred

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] query slows down with more accurate stats
On Thu, 15 Apr 2004 20:18:49 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
>> getting several tuples from the same page is more likely
>> than with the old method.
>
>Hm, are you sure?

Almost sure. Let's look at a corner case: What is the probability of getting a sample with no two tuples from the same page? To simplify the problem assume that each page contains the same number of tuples c. If the number of pages is B and the sample size is n, a perfect sampling method collects a sample where all tuples come from different pages with probability (in OpenOffice.org syntax):

    p = prod from{i = 0} to{n - 1} {{c(B - i)} over {cB - i}}

or in C (keeping the arithmetic in floating point to avoid integer truncation):

    p = 1.0;
    for (i = 0; i < n; ++i)
        p *= (double) c * (B - i) / ((double) c * B - i);

This probability grows with increasing B.

>Also, I'm not at all sure that the old method satisfies that constraint
>completely in the presence of nonuniform numbers of tuples per page,
>so we'd not necessarily be going backwards anyhow ...

Yes, it boils down to a decision whether we want to replace one not quite perfect sampling method with another not quite perfect method. I'm still working on putting together the pros and cons ...

Servus
Manfred

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] query slows down with more accurate stats
On Fri, 16 Apr 2004 10:34:49 -0400, Tom Lane <[EMAIL PROTECTED]> wrote: >> p = prod from{i = 0} to{n - 1} {{c(B - i)} over {cB - i}} > >So? You haven't proven that either sampling method fails to do the >same. On the contrary, I believe that above formula is more or less valid for both methods. The point is in what I said next: | This probability grows with increasing B. For the one-stage sampling method B is the number of pages of the whole table. With two-stage sampling we have to use n instead of B and get a smaller probability (for n < B, of course). So this merely shows that the two sampling methods are not equivalent. >The desired property can also be phrased as "every tuple should be >equally likely to be included in the final sample". Only at first sight. You really expect more from random sampling. Otherwise I'd just put one random tuple and its n - 1 successors (modulo N) into the sample. This satisfies your condition but you wouldn't call it a random sample. Random sampling is more like "every possible sample is equally likely to be collected", and two-stage sampling doesn't satisfy this condition. But if in your opinion the difference is not significant, I'll stop complaining against my own idea. Is there anybody else who cares? >You could argue that a tuple on a heavily populated page is >statistically likely to see a higher T when it's part of the page sample >pool than a tuple on a near-empty page is likely to see, and therefore >there is some bias against selection of the former tuple. But given a >sample over a reasonably large number of pages, the contribution of any >one page to T should be fairly small and so this effect ought to be >small. It is even better: Storing a certain number of tuples on heavily populated pages takes less pages than to store them on sparsely populated pages (due to tuple size or to dead tuples). So heavily populated pages are less likely to be selected in stage one, and this exactly offsets the effect of increasing T. >So I think this method is effectively unbiased at the tuple level. Servus Manfred ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
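To see the difference numerically, a small sketch (the table geometry below is made up) that evaluates the product formula once with B pages, as for one-stage sampling, and once with n in place of B, as for two-stage sampling:

/* sample_distinct_pages.c - sketch: probability that a sample of n tuples
 * hits n distinct pages, one-stage vs. two-stage sampling.
 * c, B and n are invented example values. */
#include <stdio.h>

static double p_all_distinct(double c, long pages, long n)
{
    double p = 1.0;
    for (long i = 0; i < n; i++)
        p *= (c * (pages - i)) / (c * pages - i);
    return p;
}

int main(void)
{
    double c = 100.0;       /* tuples per page (assumed constant) */
    long   B = 100000;      /* pages in the table */
    long   n = 100;         /* sample size */

    printf("one-stage (B pages): %g\n", p_all_distinct(c, B, n));
    printf("two-stage (n pages): %g\n", p_all_distinct(c, n, n));
    return 0;
}

The one-stage figure is much closer to 1, illustrating that the probability grows with B and that the two methods are not equivalent.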
Re: [PERFORM] Setting Shared Buffers , Effective Cache, Sort Mem Parameters
On Thu, 22 Apr 2004 13:51:42 -0400, Pallav Kalva <[EMAIL PROTECTED]> wrote:
>I need some help with setting these parameters (shared buffers,
>effective cache, sort mem) in the pg_conf file.

It really depends on the kind of queries you intend to run, the number of concurrent active connections, the size of the working set (active part of the database), what else is running on the machine, and and and ...

Setting shared_buffers to 10000, effective_cache_size to 400000 (80% of installed RAM), and sort_mem to a few thousand might be a good start.

> Also can anyone explain
>the difference between shared buffers and effective cache , how these
>are allocated in the main memory (the docs are not clear on this).

Shared_buffers directly controls how many pages are allocated as internal cache. Effective_cache_size doesn't allocate anything, it is just a hint to the planner how much cache is available on the system level.

Servus
Manfred

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Setting Shared Buffers , Effective Cache, Sort Mem Parameters
On Fri, 23 Apr 2004 10:20:10 -0400, Pallav Kalva <[EMAIL PROTECTED]> wrote:
> the database sizes is around 2 - 4 gig and
>there are 5 of them. this machine is
> mainly for the databases and nothing is running on them.

Did I understand correctly that you run (or plan to run) five postmasters? Is there a special reason that you cannot put all your tables into one database?

>setting shared buffers to 10000 allocates (81Mb) and effective
>cache to 400000 would be around (3gig)
>does this mean that if all of the 81mb of the shared memory gets
>allocated it will use rest from the effective
>cache of (3g-81mb) ?

Simply said, if Postgres wants to access a block, it first looks whether this block is already in shared buffers, which should be the case if the block is one of the last 10000 blocks accessed. Otherwise the block has to be read in. If the OS has the block in its cache, reading it is just a (fast) memory operation, else it involves a (slow) physical disk read.

The number of database pages residing in the OS cache is totally out of control of Postgres. Effective_cache_size tells the query planner how many database pages can be *expected* to be present in the OS cache.

>increasing the shared buffers space to 2g

Setting shared_buffers to half your available memory is the worst thing you can do. You would end up caching exactly the same set of blocks in the internal buffers and in the OS cache, thus effectively making one of the caches useless. Better keep shared_buffers low and let the OS do its job.

Servus
Manfred

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
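A toy model of why it matters where a block is found (not PostgreSQL code; all timings and hit rates below are invented for illustration):

/* block_read_cost.c - toy model: expected cost of fetching a block from
 * shared buffers, the OS cache, or disk.  All numbers are invented. */
#include <stdio.h>

int main(void)
{
    double t_shared  = 0.001;   /* ms, block already in shared buffers */
    double t_oscache = 0.05;    /* ms, copied from the OS cache */
    double t_disk    = 8.0;     /* ms, physical random read */

    double p_shared = 0.70, p_oscache = 0.25, p_disk = 0.05;
    double expected = p_shared * t_shared + p_oscache * t_oscache + p_disk * t_disk;

    printf("expected per-block fetch time: %.3f ms\n", expected);
    return 0;
}

Even a small fraction of physical reads dominates the expected cost, which is why the planner wants a realistic effective_cache_size rather than a huge shared_buffers.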
Re: [PERFORM] Why will vacuum not end?
On Sat, 24 Apr 2004 10:45:40 -0400, "Shea,Dan [CIS]" <[EMAIL PROTECTED]> wrote:
>[...] 87 GB table with a 39 GB index?
>The vacuum keeps redoing the index, but there is no indication as to why it
>is doing this.

If VACUUM finds a dead tuple, it does not immediately remove index entries pointing to that tuple. It instead collects such tuple ids and later does a bulk delete, i.e. scans the whole index and removes all index items pointing to one of those tuples. The number of tuple ids that can be remembered is controlled by vacuum_mem: it is

    VacuumMem * 1024 / 6

Whenever this number of dead tuples has been found, VACUUM scans the index (which takes ca. 60000 seconds, more than 16 hours), empties the list and continues to scan the heap ...

From the number of dead tuples you can estimate how often your index will be scanned. If dead tuples are evenly distributed, expect there to be 15 index scans with your current vacuum_mem setting of 196608. So your VACUUM will run for 11 days :-(

OTOH this would mean that there are 500 million dead tuples. Do you think this is possible?

Servus
Manfred

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
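The arithmetic above as a small sketch (the dead tuple count and the per-scan time are the estimates discussed in this thread, not measurements):

/* vacuum_index_scans.c - sketch: how many bulk-delete index scans a VACUUM
 * needs and roughly how long they take, given the estimates above. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    long   vacuum_mem = 196608;                  /* in KB */
    double dead_tuples = 500e6;                  /* estimated dead tuples */
    double seconds_per_scan = 60000.0;           /* one full pass over the 39 GB index */

    double batch = vacuum_mem * 1024.0 / 6.0;    /* tuple ids remembered per pass */
    double scans = ceil(dead_tuples / batch);    /* bulk-delete passes */
    double days  = scans * seconds_per_scan / 86400.0;

    printf("batch size : %.1f M tuple ids\n", batch / 1e6);
    printf("index scans: %.0f\n", scans);
    printf("scan time  : about %.1f days\n", days);
    return 0;
}

This reproduces the ca. 33.5 M tuple-id batch, the 15 index scans, and the roughly 11-day runtime quoted above.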
Re: [PERFORM] Why will vacuum not end?
On Sat, 24 Apr 2004 15:48:19 -0400, "Shea,Dan [CIS]" <[EMAIL PROTECTED]> wrote:
>Manfred is indicating the reason it is taking so long is due to the number
>of dead tuples in my index and the vacuum_mem setting.

Not dead tuples in the index, but dead tuples in the table.

>The last delete that I did before starting a vacuum had 219,177,133
>deletions.

Ok, with vacuum_mem = 196608 the bulk delete batch size is ca. 33.5 M tuple ids. 219 M dead tuples will cause 7 index scans. The time for an index scan is more or less constant, 60000 seconds in your case. So yes, a larger vacuum_mem will help, but only if you really have as much *free* memory. Forcing the machine into swapping would make things worse.

BTW, VACUUM frees millions of index pages. Is your FSM large enough?

Servus
Manfred

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Why will vacuum not end?
On Sat, 24 Apr 2004 15:58:08 -0400, "Shea,Dan [CIS]" <[EMAIL PROTECTED]> wrote:
>There were definitely 219,177,133 deletions.
>The deletions are most likely from the beginning, it was based on the
>reception_time of the data.
>I would rather not use re-index, unless it is faster than using vacuum.

I don't know whether it would be faster. But if you decide to reindex, make sure sort_mem is *huge*!

>What do you think would be the best way to get around this?
>Increase vacuum_mem to a higher amount 1.5 to 2 GB or try a re-index (rather
>not re-index so that data can be queried without doing a seqscan).

Just out of curiosity: What kind of machine is this running on? And how long does a seq scan take?

>Once the index is cleaned up, how does vacuum handle the table?

If you are lucky VACUUM frees half the index pages. And if we assume that most of the time spent scanning an index goes into random page accesses, future VACUUMs will take "only" 30000 seconds per index scan.

Servus
Manfred

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
[PERFORM] Number of pages in a random sample (was: query slows down with more accurate stats)
On Mon, 19 Apr 2004 12:00:10 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
>A possible compromise is to limit the number of pages sampled to
>something a bit larger than n, perhaps 2n or 3n. I don't have a feeling
>for the shape of the different-pages probability function; would this
>make a significant difference, or would it just waste cycles?

I would have replied earlier if I had a good answer. What I have so far contains at least one, probably two flaws. Knowing not much more than the four basic arithmetic operations I was not able to improve my model. So I post what I have:

As usual we assume a constant number c of tuples per page. If we have a table of size B pages and want to collect a sample of n tuples, the number of possible samples is (again in OOo syntax)

    left( binom{cB}{n} right)

If we select an arbitrary page, the number of possible samples that do NOT contain any tuple from this page is

    left( binom {c (B-1)} {n} right)

Let's forget about our actual implementations of sampling methods and pretend we have a perfect random sampling method. So the probability Pnot(c, B, n) that a certain page is not represented in a random sample is

    left( binom {c (B-1)} {n} right) over left( binom{cB}{n} right)

which can be transformed into the more computing-friendly form

    prod from{i=0} to{n-1} {{cB-c - i} over {cB - i}}

Clearly the probability that a certain page *is* represented in a sample is

    Pyes(c, B, n) = 1 - Pnot(c, B, n)

The next step assumes that these probabilities are independent for different pages, which in reality they are not. We simply estimate the number of pages represented in a random sample as

    numPag(c, B, n) = B * Pyes(c, B, n)

Here are some results for n = 3000:

     B \ c |   10 |  100 |  200
    -------+------+------+------
       100 |  --- |  100 |  100
      1000 |  972 |  953 |  951
      2000 | 1606 | 1559 | 1556
      3000 | 1954 | 1902 | 1899
      6000 | 2408 | 2366 | 2363
      9000 | 2588 | 2555 | 2553
     20000 | 2805 | 2788 | 2787
     30000 | 2869 | 2856 | 2856
    100000 | 2960 | 2956 | 2956

This doesn't seem to depend heavily on the number of tuples per page, which sort of justifies the assumption that c is constant.

In the next step I tried to estimate the number of pages that contain exactly 1, 2, ... tuples of the sample. My naive procedure works as follows (I'm not sure whether it is even valid as a rough approximation, constructive criticism is very welcome): For c=100, B=3000, n=3000 we expect 1902 pages to contain at least 1 tuple of the sample. There are 1098 more tuples than pages; these tuples lie somewhere in those 1902 pages from the first step. numPag(99, 1902, 1098) = 836 pages contain at least a second tuple. So the number of pages containing exactly 1 tuple is 1902 - 836 = 1066. Repeating these steps we get 611 pages with 2 tuples, 192 with 3, 30 with 4, and 3 pages with 5 tuples.

Here are some more numbers for c = 100 and n = 3000:

         B | pages with 1, 2, ... tuples
    -------+--------------------------------------------------------
       100 | 1 to 24 tuples: 0, then 1, 2, 4, 10, 18, 26, 24, 11, 4
      1000 | 108, 201, 268, 229, 113, 29, 5
      2000 | 616, 555, 292, 83, 12, 1
      3000 | 1066, 611, 192, 30, 3
      6000 | 1809, 484, 68, 5
      9000 | 2146, 374, 32, 2
     20000 | 2584, 196, 8
     30000 | 2716, 138, 3
    100000 | 2912, 44

A small C program to experimentally confirm or refute these calculations is attached. Its results are fairly compatible with the above numbers, IMHO.
Servus
Manfred

/*
** samsim.c - sampling simulator
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>

typedef int bool;

#define MAX_RANDOM_VALUE (0x7FFFFFFF)

static void
initrandom()
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    srandom(tv.tv_sec ^ tv.tv_usec);
}/*initrandom*/

/* Select a random value R uniformly distributed in 0 < R < 1 */
static double
random_fract(void)
{
    long        z;

    /* random() can produce endpoint values, try again if so */
    do
    {
        z = random();
    } while (z <= 0 || z >= MAX_RANDOM_VALUE);
    return (double) z / (double) MAX_RANDOM_VALUE;
}

/*
** data structure for (modified) Algorithm S from Knuth 3.4.2
*/
typedef struct
{
    long        N;          /* number of tuples, known in advance */
    int         n;          /* sample size */
    long        t;          /* current tuple number */
    int         m;          /* tuples selected so far */
} SamplerData;
typedef SamplerData *Sampler;

static void Sampler_Init(Sampler bs, long N, int samplesize);
static bool Sampler_HasMore(Sampler bs);
static long Sampler_Next(Sampler bs);

/*
**
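As a cross-check against the simulator, a direct evaluation of the analytic estimate (this sketch is separate from the attached program; it reproduces the c = 100 column of the first table):

/* numpag.c - evaluate the analytic estimate numPag(c, B, n) from the
 * formulas above, for a few of the table sizes shown in the tables. */
#include <stdio.h>

/* probability that one given page contributes no tuple to the sample */
static double p_not(double c, double B, long n)
{
    double p = 1.0;
    for (long i = 0; i < n; i++)
        p *= (c * B - c - i) / (c * B - i);
    return p;
}

/* expected number of pages represented in the sample */
static double num_pag(double c, double B, long n)
{
    return B * (1.0 - p_not(c, B, n));
}

int main(void)
{
    long   n = 3000;
    double c = 100.0;
    double Bs[] = { 1000, 3000, 30000, 100000 };

    for (int i = 0; i < 4; i++)
        printf("B = %6.0f -> numPag = %.0f\n", Bs[i], num_pag(c, Bs[i], n));
    return 0;
}

The output (953, 1902, 2856, 2956) matches the corresponding entries in the first table.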
Re: [PERFORM] Why will vacuum not end?
On Sun, 25 Apr 2004 09:05:11 -0400, "Shea,Dan [CIS]" <[EMAIL PROTECTED]> wrote:
>It is set at max_fsm_pages = 1500000 .

This might be too low. Your index has ca. 5 M pages, you are going to delete half of its entries, and what you delete is a contiguous range of values. So up to 2.5 M index pages might be freed (minus inner nodes and pages not completely empty). And there will be lots of free heap pages too ...

I wrote:
>If you are lucky VACUUM frees half the index pages. And if we assume
>that most of the time spent scanning an index goes into random page
>accesses, future VACUUMs will take "only" 30000 seconds per index scan.

After a closer look at the code and after having slept over it, I'm not so sure any more that the number of tuple ids to be removed has only minor influence on the time spent for a bulk delete run.

After the current VACUUM has finished, would you be so kind as to run another VACUUM VERBOSE with only a few dead tuples and post the results here?

Servus
Manfred

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] analyzer/planner and clustered rows
On Thu, 29 Apr 2004 19:09:09 -0400, Joseph Shraibman <[EMAIL PROTECTED]> wrote:
>How does the analyzer/planner deal with rows clustered together?

There's a correlation value per column. Just try

    SELECT attname, correlation
      FROM pg_stats
     WHERE tablename = '...';

if you are interested. It indicates how well the hypothetical order of tuples if sorted by that column corresponds to the physical order. +1.0 is perfect correlation, 0.0 is totally chaotic, -1.0 means reverse order. The optimizer is more willing to choose an index scan if correlation for the first index column is near +/-1.

> What if the data in the table happens to be close
>together because it was inserted together originally?

Having equal values close to each other is not enough, the values should be increasing, too. Compare

    5 5 5 4 4 4 7 7 7 2 2 2 6 6 6 3 3 3 8 8 8    low correlation

and

    2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8    correlation = 1.0

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
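A small sketch that computes a plain Pearson correlation between the values above and their physical positions (the planner's statistic is computed somewhat differently, but the effect is the same): low for the first ordering, close to 1 for the second.

/* clustered_corr.c - sketch: correlation between column values and their
 * physical positions for the two example orderings above. */
#include <stdio.h>
#include <math.h>

static double correlation(const double *v, int n)
{
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        double x = i, y = v[i];          /* x = physical position */
        sx += x; sy += y; sxx += x * x; syy += y * y; sxy += x * y;
    }
    double cov = sxy - sx * sy / n;
    double vx  = sxx - sx * sx / n;
    double vy  = syy - sy * sy / n;
    return cov / sqrt(vx * vy);
}

int main(void)
{
    double clumped[] = {5,5,5,4,4,4,7,7,7,2,2,2,6,6,6,3,3,3,8,8,8};
    double sorted[]  = {2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8};
    int n = 21;
    printf("clumped but unsorted: %.3f\n", correlation(clumped, n));
    printf("sorted:               %.3f\n", correlation(sorted,  n));
    return 0;
}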
Re: [PERFORM] planner/optimizer question
On Fri, 30 Apr 2004 19:46:24 +0200, Jochem van Dieten <[EMAIL PROTECTED]> wrote: >> While the storage overhead could be reduced to 1 bit (not a joke) > >You mean adding an isLossy bit and only where it is set the head >tuple has to be checked for visibility, if it is not set the head >tuple does not have to be checked? Yes, something like this. Actually I imagined it the other way round: a visible-to-all flag similar to the existing dead-to-all flag (search for LP_DELETE and ItemIdDeleted in nbtree.c). >> we'd >> still have the I/O overhead of locating and updating index tuples for >> every heap tuple deleted/updated. > >Would there be additional I/O for the additional bit in the index >tuple (I am unable to find the layout of index tuple headers in >the docs)? Yes, the visible-to-all flag would be set as a by-product of an index scan, if the heap tuple is found to be visible to all active transactions. This update is non-critical and, I think, not very expensive. Deleting (and hence updating) a tuple is more critical, regarding both consistency and performance. We'd have to locate all index entries pointing to the heap tuple and set their visible-to-all flags to false. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] planner/optimizer question
On Sat, 01 May 2004 13:18:04 +0200, Jochem van Dieten <[EMAIL PROTECTED]> wrote: >Tom Lane wrote: >> Oh really? I think you need to think harder about the transition >> conditions. Indeed. >> >> Dead-to-all is reasonably safe to treat as a hint bit because *it does >> not ever need to be undone*. Visible-to-all does not have that >> property. > >Yes, really :-) No, not really :-( As Tom has explained in a nearby message his concern is that -- unlike dead-to-all -- visible-to-all starts as false, is set to true at some point in time, and is eventually set to false again. Problems arise if one backend wants to set visible-to-all to true while at the same time another backend wants to set it to false. This could be curable by using a second bit as a deleted flag (might be even the same bit that's now used as dead-to-all, but I'm not sure). An index tuple having both the visible flag (formerly called visible-to-all) and the deleted flag set would cause a heap tuple access to check visibility. But that leaves the question of what to do after the deleting transaction has rolled back. I see no clean way from the visible-and-deleted state to visible-to-all. This obviously needs another round of hard thinking ... ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Shared buffers, Sort memory, Effective Cache Size
On Wed, 21 Apr 2004 10:01:30 -0700, Qing Zhao <[EMAIL PROTECTED]> wrote:
>I have recently configured my PG7.3 on a G5 (8GB RAM) with
>shmmax set to 512MB and shared_buffer=50000, sort_mem=4096
>and effective cache size = 10000. It seems working great so far but
>I am wondering if I should make effective cache size larger myself.

Yes, much larger! And while you are at it make shared_buffers smaller.

Servus
Manfred

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] planner/optimizer question
On Wed, 28 Apr 2004 07:35:41 +0100, "Gary Doades" <[EMAIL PROTECTED]> wrote: >Why is there an entry in the index for a row if the row is not valid? Because whether a row is seen as valid or not lies in the eye of the transaction looking at it. Full visibility information is stored in the heap tuple header. The developers' consensus is that this overhead should not be in every index tuple. Servus Manfred ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Simply join in PostrgeSQL takes too long
On Tue, 27 Apr 2004 18:01:34 -0400, Rod Taylor <[EMAIL PROTECTED]> wrote: >On Tue, 2004-04-27 at 17:27, Vitaly Belman wrote: >> Hello pgsql-performance, >> >> I discussed the whole subject for some time in DevShed and didn't >> achieve much (as for results). I wonder if any of you guys can help >> out: >> >> http://forums.devshed.com/t136202/s.html >The point is that a book cannot be of a certain genre more than once. Rod, he has a hierarchy of genres. Genre 1 has 6379 child genres and a book can be in more than one of these. Vitaly, though LIMIT makes this look like a small query, DISTINCT requires the whole result set to be retrieved. 0.7 seconds doesn't look so bad for several thousand rows. Did you try with other genre_ids? Maybe a merge join is not the best choice. Set enable_mergejoin to false and see whether you get a (hopefully faster) hash join, assuming that sort_mem is large enough to keep the hash table in memory. If you send me your table contents I'll try it on Linux. Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] planner/optimizer question
On Wed, 28 Apr 2004 09:05:04 -0400, Tom Lane <[EMAIL PROTECTED]> wrote: >> [ ... visibility information in index tuples ... ] >Storing that information would at least double the overhead space used >for each index tuple. The resulting index bloat would significantly >slow index operations by requiring more I/O. So it's far from clear >that this would be a win, even for those who care only about select >speed. While the storage overhead could be reduced to 1 bit (not a joke) we'd still have the I/O overhead of locating and updating index tuples for every heap tuple deleted/updated. Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Simply join in PostrgeSQL takes too long
On Wed, 28 Apr 2004 08:23:35 -0400, Rod Taylor <[EMAIL PROTECTED]> wrote: >The below plpgsql forces the kind of algorithm we wish the planner could >choose. It should be fairly quick irregardless of dataset. That reminds me of hash aggregation. So here's another idea for Vitaly: SELECT book_id FROM ... WHERE ... GROUP BY book_id LIMIT ... Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Simply join in PostrgeSQL takes too long
On Thu, 29 Apr 2004 13:36:47 -0400, Rod Taylor <[EMAIL PROTECTED]> wrote: >The reason for the function is that the sort routines (hash aggregation >included) will not stop in mid-sort Good point. Servus Manfred ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Quad processor options
On Tue, 11 May 2004 15:46:25 -0700, Paul Tuckfield <[EMAIL PROTECTED]> wrote:
>- the "cache" column shows that linux is using 2.3G for cache. (way too
>much)

There is no such thing as "way too much cache".

> you generally want to give memory to postgres to keep it "close" to
>the user,

Yes, but only a moderate amount of memory.

> not leave it unused to be claimed by linux cache

Cache is not unused memory.

>- I'll bet you have a low value for shared buffers, like 10000. On
>your 3G system
> you should ramp up the value to at least 1G (125000 8k buffers)

In most cases this is almost the worst thing you can do. The only thing even worse would be setting it to 1.5 G.

Postgres is just happy with a moderate shared_buffers setting. We usually recommend something like 10000. You could try 20000, but don't increase it beyond that without strong evidence that it helps in your particular case.

This has been discussed several times here, on -hackers and on -general. Search the archives for more information.

Servus
Manfred

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] [HACKERS] Number of pages in a random sample
On Mon, 26 Apr 2004 08:08:16 -0700, Sailesh Krishnamurthy <[EMAIL PROTECTED]> wrote: > "A Bi-Level Bernoulli Scheme for Database Sampling" > Peter Haas, Christian Koenig (SIGMOD 2004) Does this apply to our problem? AFAIK with Bernoulli sampling you don't know the sample size in advance. Anyway, thanks for the hint. Unfortunately I couldn't find the document. Do you have a link? Servus Manfred ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] optimization ideas for frequent, large(ish) updates in frequently accessed DB?
On Fri, 13 Feb 2004 16:21:29 +0100, I wrote: >Populate this table with > INSERT INTO idmap > SELECT id, id, true > FROM t; This should be INSERT INTO idmap SELECT DISTINCT id, id, true FROM t; Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Use of Functional Indexs and Planner estimates
On Tue, 8 Jun 2004 17:24:36 +1000, Russell Smith <[EMAIL PROTECTED]> wrote: >Also I am interested in how functional indexes have statistics collected for them, if >they do. Not in any released version. http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/commands/analyze.c | Revision 1.70 / Sun Feb 15 21:01:39 2004 UTC (3 months, 3 weeks ago) by tgl | Changes since 1.69: +323 -16 lines | | First steps towards statistics on expressional (nee functional) indexes. | This commit teaches ANALYZE to store such stats in pg_statistic, but | nothing is done yet about teaching the planner to use 'em. So statistics gathering for expressional indexes will be in 7.5, but I don't know about the state of the planner ... Servus Manfred ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] I could not get postgres to utilizy indexes
On Thu, 19 Aug 2004 09:54:47 +0200, "Leeuw van der, Tim" <[EMAIL PROTECTED]> wrote: >You asked the very same question yesterday, and I believe you got some useful >answers. Why do you post the question again? Tim, no need to be rude here. We see this effect from time to time when a new user sends a message to a mailing list while not subscribed. The sender gets an automated reply from majordomo, subscribes to the list and sends his mail again. One or two days later the original message is approved (by Marc, AFAIK) and forwarded to the list. Look at the timestamps in these header lines: |Received: from postgresql.org (svr1.postgresql.org [200.46.204.71]) | by svr4.postgresql.org (Postfix) with ESMTP id 32B1F5B04F4; | Wed, 18 Aug 2004 15:54:13 + (GMT) |Received: from localhost (unknown [200.46.204.144]) | by svr1.postgresql.org (Postfix) with ESMTP id E6B2B5E4701 | for <[EMAIL PROTECTED]>; Tue, 17 Aug 2004 11:23:07 -0300 (ADT) >[more instructions] And while we are teaching netiquette, could you please stop top-posting and full-quoting. Igor, welcome to the list! Did the suggestions you got solve your problem? Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Large # of rows in query extremely slow, not using
On Thu, 16 Sep 2004 20:51:11 -0500, Stephen Crowley <[EMAIL PROTECTED]> wrote: >explain analyze select * from history where date='2004-09-07' and >stock='ORCL' LIMIT 10; >" -> Index Scan using island_history_date_stock_time on >island_history (cost=0.00..183099.72 rows=102166 width=83) (actual >time=1612.000..1702.000 rows=10 loops=1)" ^^ LIMIT 10 hides what would be the most interesting info here. I don't believe that EXPLAIN ANALYSE SELECT * FROM history WHERE ... consumes lots of memory. Please try it. And when you post the results please include your Postgres version, some info about hardware and OS, and your non-default settings, especially random_page_cost and effective_cache_size. May I guess that the correlation of the physical order of tuples in your table to the contents of the date column is pretty good (examine correlation in pg_stats) and that island_history_date_stock_time is a 3-column index? It is well known that the optimizer overestimates the cost of index scans in those situations. This can be compensated to a certain degree by increasing effective_cache_size and/or decreasing random_page_cost (which might harm other planner decisions). You could also try CREATE INDEX history_date_stock ON history("date", stock); This will slow down INSERTs and UPDATEs, though. Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Large # of rows in query extremely slow, not using
On Fri, 17 Sep 2004 19:23:44 -0500, Stephen Crowley <[EMAIL PROTECTED]> wrote: >Seq Scan [...] rows=265632 > Filter: ((date = '2004-09-07'::date) AND ((stock)::text = 'MSFT'::text)) >Total runtime: 412703.000 ms > >random_page_cost and effective_cache_size are both default, 8 and 1000 Usually random_page_cost is 4.0 by default. And your effective_cache_size setting is far too low for a modern machine. >"Index Scan [...] rows=159618 >" Index Cond: ((date = '2004-09-07'::date) AND ((stock)::text = 'ORCL'::text))" >"Total runtime: 201009.000 ms" Extrapolating this to 265000 rows you should be able to get the MSFT result in ca. 330 seconds, if you can persuade the planner to choose an index scan. Fiddling with random_page_cost and effective_cache_size might do the trick. >So now this in all in proportion and works as expected.. the question >is, why would the fact that it needs to be vaccumed cause such a huge >hit in performance? When i vacuumed it did free up nearly 25% of the >space. So before the VACCUM a seq scan would have taken ca. 550 seconds. Your MSFT query with LIMIT 10 took ca. 350 seconds. It's not implausible to assume that more than half of the table had to be scanned to find the first ten rows matching the filter condition. Servus Manfred ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Possibly slow query
On Wed, 26 Jan 2005 07:16:25 -0800, "Peter Darley" <[EMAIL PROTECTED]> wrote: >SELECT User_ID >FROM Assignments A NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings >WHERE Setting='Status') ASet >WHERE A.User_ID IS NOT NULL > AND ASet.Assignment_ID IS NULL >GROUP BY User_ID; "ASet.Assignment_ID IS NULL" and "value IS NULL" as you had in your original post don't necessarily result in the same set of rows. SELECT DISTINCT a.User_ID FROM Assignments a LEFT JOIN Assignment_Settings s ON (a.Assignment_ID=s.Assignment_ID AND s.Setting='Status') WHERE a.User_ID IS NOT NULL AND s.Value IS NULL; Note how the join condition can contain subexpressions that only depend on columns from one table. BTW, |neo=# \d assignment_settings | [...] | setting | character varying(250) | not null | [...] |Indexes: |[...] |"assignment_settings_assignment_id_setting" unique, btree (assignment_id, setting) storing the setting names in their own table and referencing them by id might speed up some queries (and slow down others). Certainly worth a try ... Servus Manfred ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] index scan on =, but not < ?
On Thu, 10 Mar 2005 10:24:46 +1000, David Brown <[EMAIL PROTECTED]> wrote: >What concerns me is that this all depends on the correlation factor, and >I suspect that the planner is not giving enough weight to this. The planner does the right thing for correlations very close to 1 (and -1) and for correlations near zero. For correlations somewhere between 0 and 1 the cost is estimated by interpolation, but it tends too much towards the bad end, IMHO. Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] cpu_tuple_cost
On Mon, 14 Mar 2005 21:23:29 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
> I think that the "reduce random_page_cost" mantra
>is not an indication that that parameter is wrong, but that the
>cost models it feeds into need more work.

One of these areas is the cost interpolation depending on correlation. This has been discussed on -hackers in October 2002 and August 2003 ("Correlation in cost_index()"). My Postgres installations contain the patch presented during that discussion (and another index correlation patch), and I use *higher* values for random_page_cost (up to 10).

Servus
Manfred

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] multi-column index
On Wed, 16 Mar 2005 22:19:13 -0500, Tom Lane <[EMAIL PROTECTED]> wrote: >calculate the correlation explicitly for each index May be it's time to revisit an old proposal that has failed to catch anybody's attention during the 7.4 beta period: http://archives.postgresql.org/pgsql-hackers/2003-08/msg00937.php I'm not sure I'd store index correlation in a separate table today. You've invented something better for functional index statistics, AFAIR. Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] multi-column index
On Thu, 17 Mar 2005 16:55:15 +0800, Christopher Kings-Lynne <[EMAIL PROTECTED]> wrote: >Make it deal with cross-table fk correlations as well :) That's a different story. I guess it boils down to cross-column statistics for a single table. Part of this is the correlation between values in two or more columns, which is not the same as the correlation between column (or index tuple) values and tuple positions. And yes, I did notice the smiley ;-) Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] multi-column index
On Thu, 17 Mar 2005 23:48:30 -0800, Ron Mayer <[EMAIL PROTECTED]> wrote: >Would this also help estimates in the case where values in a table >are tightly clustered, though not in strictly ascending or descending >order? No, I was just expanding the existing notion of correlation from single columns to index tuples. >For example, address data has many fields that are related >to each other (postal codes, cities, states/provinces). This looks like a case for cross-column statistics, though you might not have meant it as such. I guess what you're talking about can also be described with a single column. In a list like 3 3 ... 3 1 1 ... 1 7 7 ... 7 4 4 ... 4 ... equal items are "clustered" together but the values are not "correlated" to their positions. This would require a whole new column characteristic, something like the probability that we find the same value in adjacent heap tuples, or the number of different values we can expect on one heap page. The latter might even be easy to compute during ANALYSE. Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] multi-column index
On Thu, 17 Mar 2005 13:15:32 -0500, Tom Lane <[EMAIL PROTECTED]> wrote: >I am coming around to the view that we really do need to calculate >index-specific correlation numbers, Correlation is a first step. We might also want distribution information like number of distinct index tuples and histograms. >Now, as to the actual mechanics of getting the numbers: the above link >seems to imply reading the whole index in index order. That turned out to be surprisingly easy (no need to look at data values, no operator lookup, etc.) to implement as a proof of concept. As it's good enough for my use cases I never bothered to change it. > Which is a >hugely expensive proposition for a big index, Just a thought: Could the gathering of the sample be integrated into the bulk delete phase of VACUUM? (I know, ANALYSE is not always performed as an option to VACUUM, and VACUUM might not even have to delete any index tuples.) > We need a way >to get the number from a small sample of pages. I had better (or at least different) ideas at that time, like walking down the tree, but somehow lost impetus :-( >The idea I was toying with was to recalculate the index keys for the >sample rows that ANALYZE already acquires, and then compare/sort >those. This seems to be the approach that perfectly fits into what we have now. > This is moderately expensive CPU-wise though, and it's also not >clear what "compare/sort" means for non-btree indexes. Nothing. We'd need some notion of "clusteredness" instead of correlation. C.f. my answer to Ron in this thread. BTW, the more I think about it, the more I come to the conclusion that when the planner starts to account for "clusteredness", random page cost has to be raised. Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
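A sketch of that sample-based idea (an illustration, not an actual patch): recompute the index key for each sampled row, sort by key, and correlate the key rank with the heap position. The sample data below is invented.

/* sample_correlation.c - sketch: estimate index correlation from an
 * ANALYZE-style sample by correlating heap order with key rank. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

typedef struct { long heap_pos; double key; } SampleRow;

static int by_key(const void *a, const void *b)
{
    double d = ((const SampleRow *)a)->key - ((const SampleRow *)b)->key;
    return (d > 0) - (d < 0);
}

int main(void)
{
    /* invented sample: heap position and (re)computed index key */
    SampleRow s[] = { {10, 3.0}, {20, 1.0}, {30, 4.0}, {40, 2.0}, {50, 5.0} };
    int n = 5;

    qsort(s, n, sizeof(SampleRow), by_key);

    /* Pearson correlation between key rank (0..n-1) and heap position */
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        double x = i, y = (double) s[i].heap_pos;
        sx += x; sy += y; sxx += x * x; syy += y * y; sxy += x * y;
    }
    double r = (sxy - sx * sy / n) /
               sqrt((sxx - sx * sx / n) * (syy - sy * sy / n));
    printf("estimated correlation: %.3f\n", r);
    return 0;
}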
Re: [PERFORM] Performance advice
[ This has been written offline yesterday. Now I see that most of it has already been covered. I send it anyway ... ] On Tue, 24 Jun 2003 09:39:32 +0200, "Michael Mattox" <[EMAIL PROTECTED]> wrote: >Websites are monitored every 5 or 10 minutes (depends on client), >there are 900 monitors which comes out to 7,800 monitorings per hour. So your server load - at least INSERT, UPDATE, DELETE - is absolutely predictable. This is good. It enables you to design a cron-driven VACUUM strategy. |INFO: --Relation public.jdo_sequencex-- |INFO: Pages 28: Changed 1, Empty 0; Tup 1: Vac 5124, Keep 0, UnUsed 0. ^ This table could stand more frequent VACUUMs, every 15 minutes or so. BTW, from the name of this table and from the fact that there is only one live tuple I guess that you are using it to keep track of a sequence number. By using a real sequence you could get what you need with less contention; and you don't have to VACUUM a sequence. |INFO: --Relation public.monitorx-- |INFO: Removed 170055 tuples in 6036 pages. |CPU 0.52s/0.81u sec elapsed 206.26 sec. |INFO: Pages 6076: Changed 0, Empty 0; Tup 2057: Vac 170055, Keep 568, UnUsed 356. |Total CPU 6.28s/13.23u sec elapsed 486.07 sec. The Vac : Tup ratio for this table is more than 80. You have to VACUUM this table more often. How long is "overnight"? Divide this by 80 and use the result as the interval between VACUUM [VERBOSE] [ANALYSE] public.monitorx; Thus you'd have approximately as many dead tuples as live tuples and the table size should not grow far beyond 150 pages (after an initial VACUUM FULL, of course). Then VACUUM of this table should take no more than 20 seconds. Caveat: Frequent ANALYSEs might trigger the need to VACUUM pg_catalog.pg_statistic. > The >monitor table has columns "nextdate" and "status" which are updated with >every monitoring, [...] > updating the "nextdate" before the monitoring and inserting the >status and status item records after. Do you mean updating monitor.nextdate before the monitoring and monitor.status after the monitoring? Can you combine these two UPDATEs into one? > During the vacuum my application does slow down quite a bit Yes, because VACUUM does lots of I/O. > and afterwards is slow speeds back up. ... because the working set is slowly fetched into the cache after having been flushed out by VACUUM. Your five largest relations are monitorstatus_statusitemsx, monitorstatusitemlistd8ea58a5x, monitorstatusitemlistx, monitorstatusitemx, and monitorstatusx. The heap relations alone (without indexes) account for 468701 pages, almost 4GB. VACUUMing these five relations takes 23 minutes for freeing less than 200 out of 6 million tuples for each relation. This isn't worth it. Unless always the same tuples are updated over and over, scheduling a VACUUM for half a million deletions/updates should be sufficient. >shared_buffers = 3072 # min max_connections*2 or 16, 8KB each >sort_mem = 8192 # min 64, size in KB >vacuum_mem = 24576 # min 1024, size in KB > >The rest are left uncommented (using the defaults). As has already been said, don't forget effective_cache_size. I'm not so sure about random_page_cost. Try to find out which queries are too slow. EXPLAIN ANALYSE is your friend. One more thing: I see 2 or 3 UPDATEs and 5 INSERTs per monitoring. Are these changes wrapped into a single transaction? Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
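The scheduling arithmetic above as a small sketch (the tuple counts come from the VACUUM VERBOSE output quoted above; the length of "overnight" is an assumption):

/* vacuum_interval.c - sketch: how often to VACUUM public.monitorx so dead
 * tuples stay roughly at the number of live tuples. */
#include <stdio.h>

int main(void)
{
    double live_tuples = 2057;          /* Tup from the VACUUM output */
    double dead_tuples = 170055;        /* Vac from the VACUUM output */
    double overnight_hours = 12.0;      /* assumed length of "overnight" */

    double ratio = dead_tuples / live_tuples;        /* > 80 here */
    double interval_minutes = overnight_hours * 60.0 / ratio;

    printf("Vac : Tup ratio : %.1f\n", ratio);
    printf("VACUUM interval : every %.0f minutes\n", interval_minutes);
    return 0;
}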
Re: [PERFORM] Performance advice
On Wed, 25 Jun 2003 11:47:48 +0200, "Michael Mattox" <[EMAIL PROTECTED]> wrote: >> |INFO: --Relation public.jdo_sequencex-- >> |INFO: Pages 28: Changed 1, Empty 0; Tup 1: Vac 5124, Keep 0, UnUsed 0. >> ^ >> This table could stand more frequent VACUUMs, every 15 minutes or so. > >Can you explain what the "Vac" is That's a long story, where shall I start? Search for MVCC in the docs and in the list archives. So you know that every DELETE and every UPDATE leaves behind old versions of tuples. The space occupied by these cannot be used immediately. VACUUM is responsible for finding dead tuples, which are so old that there is no active transaction that could be interested in their contents, and reclaiming the space. The number of such tuples is reported as "Vac". > and how you knew that it should be vacuumed more often? jdo_sequencex stores (5000 old versions and 1 active version of) a single row in 28 pages. Depending on when you did ANALYSE it and depending on the SQL statement, the planner might think that a sequential scan is the most efficient way to access this single row. A seq scan has to read 28 pages instead of a single page. Well, probably all 28 pages are in the OS cache or even in PG's shared buffers, but 27 pages are just wasted and push out pages you could make better use of. And processing those 28 pages does not come at no CPU cost. If you VACUUM frequently enough, this relation never grows beyond one page. >I'm using Java Data Objects (JDO) which is an O/R mapper. It generated the >schema from my object model by default it used a table for a sequence. I >just got finished configuring it to use a real postgres sequence. With the >way they have it designed, it opens and closes a connection each time it >retrieves a sequence. Would I get a performance increase if I modify their >code to retrieve multiple sequence numbers in one connection? For example I >could have it grab 50 at a time, which would replace 50 connections with 1. Better yet you modify the code to use the normal access functions for sequences. Servus Manfred ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] problem with pg_statistics
On Thu, 26 Jun 2003 10:08:05 -0400, Tom Lane <[EMAIL PROTECTED]> wrote: >Andre Schubert <[EMAIL PROTECTED]> writes: >> i think i need a little help with a problem with pg_statistic. > >Try reducing random_page_cost With index scan cost being more than 25 * seq scan cost, I guess that - all other things held equal - even random_page_cost = 1 wouldn't help. Andre might also want to experiment with effective_cache_size and with ALTER TABLE ... SET STATISTICS. Or there's something wrong with correlation? Andre, what hardware is this running on? What are the values of shared_buffers, random_page_cost, effective_cache_size, ... ? Could you show us the result of SELECT * FROM pg_stats WHERE tablename = "tbl_traffic" AND attname = "time_stamp"; Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] problem with pg_statistics
On Fri, 27 Jun 2003 08:07:35 +0200, Andre Schubert <[EMAIL PROTECTED]> wrote:
>Traffic data are inserted every 5 minutes with the actual datetime
>of the transaction, that's why the table should be physically ordered by time_stamp.

So I'd expect a correlation of nearly 1. Why do your statistics show a value of -0.479749? A negative correlation is a sign of descending sort order, and correlation values closer to 0 indicate poor correspondence between column values and tuple positions. Could this be the effect of initial data loading? Are there any updates or deletions in your traffic table?

>To answer Manfreds questions:
>> Andre, what hardware is this running on? What are the values of
>> shared_buffers, random_page_cost, effective_cache_size, ... ? Could
>> you show us the result of
>>
>> SELECT * FROM pg_stats
>> WHERE tablename = "tbl_traffic" AND attname = "time_stamp";
                     ^           ^               ^          ^
Oops, these should have been single quotes. It's too hot here these days :-)

>sort_mem = 32000
>shared_buffers = 13000

Personally I would set them to lower values, but if you have good reasons ...

>#effective_cache_size = 1000  # default in 8k pages

This is definitely too low. With 512MB or more I tend to set this to ca. 80% of available RAM. Use top and free to find hints for good values.

Servus
Manfred

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] problem with pg_statistics
On Fri, 27 Jun 2003 11:10:58 +0200, Andre Schubert <[EMAIL PROTECTED]> wrote: >Once a month we delete the all data of the oldest month. >And after that a vacuum full verbose analyze is performed. >Could this cause reordering of the data ? I may be wrong, but I think VACUUM FULL starts taking tuples from the end of the relation and puts them into pages at the beginning until read and write position meet somewhere in the middle. This explains the bad correlation. >And should i do a cluster idx_ts tbl_traffic ? I think so. >> >#effective_cache_size = 1000 # default in 8k pages >> >> This is definitely too low. With 512MB or more I tend to set this to >> ca. 80% of available RAM. Use top and free to find hints for good >> values. >> > >Ok, i will talk with my coworker ( he is the sysadmin of our machine ) >and look if can use such amount of RAM, because there are several other >processes that are running on these machines. >But i will test and report ... effective_cache_size does not *control* resource consumption, it just *reports* it as a hint to the planner. Servus Manfred ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Effective Cache Size
On Tue, 1 Jul 2003 15:50:14 +0200 , Howard Oblowitz <[EMAIL PROTECTED]> wrote: >What then will be the effect of setting this too high? The planner might choose an index scan where a sequential scan would be faster. >And too low? The planner might choose a sequential scan where an index scan would be faster. >How does it impact on other applications eg Java ? It doesn't -- at least not directly. (There could be very subtle effects when Postgres does a sequential scan over a large relation thus pushing everything else out of the cache, where an index scan would have read only a small number of pages. Or when a large index scan turns your machine from CPU bound to I/O bound.) Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Version 7 question
On Tue, 1 Jul 2003 15:02:21 +0200, "Michael Mattox" <[EMAIL PROTECTED]> wrote: >I have 1.5 gigs of RAM on my >server but I'm also running a few other java programs that take up probably >500 megs total of memory, leaving me 1gig for Postgres. Should I set my >shared buffers to be 25% of 1gig? That would be 32768. Then what should my >effective cache be? Right now I have it set to 64000 which would be >512megs. Between the buffers and cache that'd be a total of 768megs, >leaving approximately 768 for my other java apps & the OS. Michael, by setting effective_cache_size you do not allocate anything. This configuration variable is just a *hint* to the planner how much RAM is used for caching on your system (as shown by top or free). Servus Manfred ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
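Put differently, Michael's figures would end up in postgresql.conf roughly as follows (the numbers are his, not a recommendation); only the first line allocates anything:

  shared_buffers = 32768          # ~256 MB actually allocated as shared memory
  effective_cache_size = 64000    # ~512 MB; merely a hint describing the OS cache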
Re: [PERFORM] Help on my database performance
On Thu, 31 Jul 2003 11:06:09 -0400, "Jianshuo Niu" <[EMAIL PROTECTED]> wrote: >I ran the same explain analyze on two similar tables. However, the table >with less data took much more time than the one with more data. Could anyone >tell me what happened? >Seq Scan on tfd_catalog (cost=0.00..43769.82 rows=161282 width=10) (actual >time=3928.64..12905.76 rows=161282 loops=1) >Total runtime: 13240.21 msec > >Seq Scan on hm_catalog (cost=0.00..22181.18 rows=277518 width=9) (actual >time=21.32..6420.76 rows=277518 loops=1) >Total runtime: 6772.95 msec The first SELECT takes almost twice the time because tfd_catalog has almost twice as many pages as hm_catalog. This may be due to wider tuples or to more dead tuples in tfd_catalog. In the former case there's not much you can do. But the high startup cost of the first SELECT is a hint that there are lots of dead tuples. So VACUUM FULL ANALYSE might help. Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
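To find out which of the two explanations applies, one could compare the physical size of both tables and then clean up; this is only a sketch, using the table names from the plans above:

  SELECT relname, relpages, reltuples
    FROM pg_class
   WHERE relname IN ('tfd_catalog', 'hm_catalog');   -- relpages is in 8 kB blocks

  VACUUM FULL ANALYSE tfd_catalog;                   -- then re-check relpages

If relpages drops noticeably, dead tuples were the culprit; if not, the tuples in tfd_catalog are simply wider.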
Re: [PERFORM] Moving postgresql.conf tunables into 2003...
[jumping in late due to vacation] On Thu, 3 Jul 2003 17:06:46 -0700, Sean Chittenden <[EMAIL PROTECTED]> wrote: >> is some other problem that needs to be solved. (I'd wonder about >> index correlation myself; we know that that equation is pretty >> bogus.) > >Could be. I had him create a multi-column index on the date and a >non-unique highly redundant id. Tom has already suspected index correlation to be a possible source of the problem and recommended to CLUSTER on the index. A weakness of the current planner implementation is that a multi column index is always thought to have low correlation. In your case even after CLUSTER the 2-column index on (date, sensorid) is treated like a single column index with correlation 0.5. I have an experimental patch lying around somewhere that tries to work around these problems by offering different estimation methods for index scans. If you are interested, I'll dig it out. In the meantime have him try with a single column index on date. On 04 Jul 2003 08:29:04 -0400, Rod Taylor <[EMAIL PROTECTED]> wrote: |That's one heck of a poor estimate for the number of rows returned. | |> -> Seq Scan on mss_fwevent (cost=0.00..223312.60 rows=168478 width=12) |(actual time=24253.66..24319.87 rows=320 loops=1) > -> Index Scan using mss_fwevent_evtime_sensorid_idx on mss_fwevent > (cost=0.00..2442524.70 rows=168478 width=12) >(actual time=68.36..132.84 rows=320 loops=1) > Index Cond: ((evtime > (now() - '06:00'::interval)) AND (sensorid = 7)) > Filter: (NOT "action") Estimated number of rows being wrong by a factor 500 seems to be the main problem hiding everything else. With statistics already set to 1000, does this mean that sensorid, evtime, and action are not independent? It'd be interesting to know whether the estimation error comes from "Index Cond" or from "Filter". Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
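One way to see where the misestimation comes from, along the lines suggested above, is to test the conditions separately; the index name below is made up, and building the index on a large table costs time and disk space:

  CREATE INDEX mss_fwevent_evtime_idx ON mss_fwevent (evtime);    -- single-column index on the date
  ANALYSE mss_fwevent;

  EXPLAIN ANALYSE SELECT count(*) FROM mss_fwevent
   WHERE evtime > (now() - '06:00'::interval);                    -- the "Index Cond" part alone

  EXPLAIN ANALYSE SELECT count(*) FROM mss_fwevent
   WHERE evtime > (now() - '06:00'::interval) AND sensorid = 7;   -- plus the "Filter" part

Comparing estimated and actual row counts of the two plans shows which condition the planner gets wrong.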
Re: [PERFORM] Help on my database performance
On Thu, 31 Jul 2003 16:08:11 -0400, "Jianshuo Niu" <[EMAIL PROTECTED]> wrote: >explain analyze select count(*) from tfd_catalog ; >NOTICE: QUERY PLAN: > >Aggregate (cost=15986.02..15986.02 rows=1 width=0) > (actual time=1089.99..1089.99 rows=1 loops=1) > -> Seq Scan on tfd_catalog (cost=0.00..15582.82 rows=161282 width=0) > (actual time=0.11..833.41 rows=161282 loops=1) >Total runtime: 1090.51 msec >Could you tell me what does "Aggregate (cost=15986.02..15986.02 rows=1 >width=0) (actual time=1089.99..1089.99 rows=1 loops=1)" mean? It does not >show in my previous report. In your first post you did SELECT productid FROM tfd_catalog; now you did SELECT count(*) FROM tfd_catalog; count() is an aggregate function which in your case takes 161282 rows as input and produces a single row as output. The "actual" part of the "Aggregate" line tells you that the first resulting row is generated 1089.99 milliseconds after query start and the last row (not surprisingly) at the same time. The "cost" part contains the planner's estimations for these values. Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec
On Fri, 1 Aug 2003 18:17:17 -0300, "Fernando Papa" <[EMAIL PROTECTED]> wrote: > AND cont_publicacion.fecha_publicacion = (SELECT >max(cp1.fecha_publicacion) > FROM cont_publicacion cp1 > WHERE cp1.id_instalacion = >cont_publicacion.id_instalacion >AND cp1.id_contenido = cont_publicacion.id_contenido > >AND cp1.generar_vainilla = >cont_publicacion.generar_vainilla) If certain uniqueness conditions are met, the Postgres specific DISTINCT ON clause could help totally eliminating the subselect: SELECT DISTINCT ON ( cp.id_instalacion, cp.id_contenido, cp.generar_vainilla, cp.fecha_publicacion ) cc.id_contenido ,cc.pertenece_premium ,cc.Titulo_esp as v_sufix ,cc.url_contenido ,cc.tipo_acceso ,cc.id_sbc ,cc.cant_vistos ,cc.cant_votos ,cc.puntaje_total ,cc.id_contenido_padre ,jc.imagen_tapa_especial ,jc.info_general_esp as info_general ,jc.ayuda ,jc.tips_tricks_esp as tips_tricks ,jc.mod_imagen_tapa_especial ,cp.fecha_publicacion as fecha_publicacion ,cp.generar_Vainilla FROM cont_contenido cc ,juegos_config jc ,cont_publicacion cp WHERE cc.id_instalacion= 2 AND cc.id_sbc = 619 AND cc.id_tipo = 2 AND cc.id_instalacion = jc.id_instalacion AND cc.id_contenido = jc.id_contenido AND upper(cp.generar_Vainilla) = 'S' AND cp.id_instalacion = cc.id_instalacion AND cp.id_contenido = cc.id_contenido ORDER BY cp.id_instalacion, cp.id_contenido, cp.generar_vainilla, cp.fecha_publicacion desc However, this doesn't get the result in the original order, so you have to wrap another SELECT ... ORDER BY ... LIMIT around it. Or try to move the subselect into the FROM clause: SELECT cc.id_contenido ,cc.pertenece_premium ,cc.Titulo_esp as v_sufix ,cc.url_contenido ,cc.tipo_acceso ,cc.id_sbc ,cc.cant_vistos ,cc.cant_votos ,cc.puntaje_total ,cc.id_contenido_padre ,jc.imagen_tapa_especial ,jc.info_general_esp as info_general ,jc.ayuda ,jc.tips_tricks_esp as tips_tricks ,jc.mod_imagen_tapa_especial ,cp.fecha_publicacion as fecha_publicacion ,cp.generar_Vainilla FROM cont_contenido cc ,juegos_config jc ,(SELECT DISTINCT ON ( id_instalacion, id_contenido, generar_vainilla, fecha_publicacion ) * FROM cont_publicacion ORDER BY id_instalacion, id_contenido, generar_vainilla, fecha_publicacion desc ) AS cp WHERE cc.id_instalacion= 2 AND cc.id_sbc = 619 AND cc.id_tipo = 2 AND cc.id_instalacion = jc.id_instalacion AND cc.id_contenido = jc.id_contenido AND upper(cp.generar_Vainilla) = 'S' AND cp.id_instalacion = cc.id_instalacion AND cp.id_contenido = cc.id_contenido ORDER BY cp.fecha_publicacion desc LIMIT 10 OFFSET 0 [completely untested] Servus Manfred ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec
On Mon, 4 Aug 2003 12:02:46 -0300, "Fernando Papa" <[EMAIL PROTECTED]> wrote: >FROM >cont_contenido >,juegos_config >,cont_publicacion >,(SELECT max(cp1.fecha_publicacion) as max_pub > --change here > FROM cont_publicacion cp1) a --change here But this calculates the global maximum, not per id_instalacion, id_contenido, and generar_vainilla as in >AND cont_publicacion.fecha_publicacion = (SELECT > max(cp1.fecha_publicacion) > FROM cont_publicacion cp1 > WHERE cp1.id_instalacion = > cont_publicacion.id_instalacion > AND cp1.id_contenido = > cont_publicacion.id_contenido > AND cp1.generar_vainilla = > cont_publicacion.generar_vainilla) Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
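To keep the per-group semantics and still move the subselect into the FROM clause, the derived table has to be grouped and joined back on all three columns; an untested sketch based on Fernando's query:

  ,(SELECT id_instalacion, id_contenido, generar_vainilla,
           max(fecha_publicacion) AS max_pub
      FROM cont_publicacion
     GROUP BY id_instalacion, id_contenido, generar_vainilla) a
  ...
  AND cont_publicacion.id_instalacion    = a.id_instalacion
  AND cont_publicacion.id_contenido      = a.id_contenido
  AND cont_publicacion.generar_vainilla  = a.generar_vainilla
  AND cont_publicacion.fecha_publicacion = a.max_pub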
Re: [PERFORM] OSDL Database Test Suite 3 is available on PostgreSQL
On 01 Aug 2003 11:04:10 -0700, Jenny Zhang <[EMAIL PROTECTED]> wrote: >A sample OSDL-DBT3 test result report can be found at: >http://khack.osdl.org/stp/276912/ > >Your comments are welcome, | effective_cache_size | 1000 With 4GB of memory this is definitely too low and *can* (note that I don't say *must*) lead the planner to wrong decisions. | shared_buffers | 15200 ... looks reasonable. Did you test with other values? | sort_mem | 524288 This is a bit high, IMHO, but might be ok given that DBT3 is not run with many concurrent sessions (right?). http://khack.osdl.org/stp/276912/results/plot/vmstat_swap.png shows some swapping activity towards the end of the run which could be caused by a too high sort_mem setting. Servus Manfred ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] I can't wait too much: Total runtime 432478.44 msec
On Mon, 04 Aug 2003 16:10:18 +0200, I wrote: >SELECT DISTINCT ON ( >cp.id_instalacion, >cp.id_contenido, >cp.generar_vainilla, >cp.fecha_publicacion > ) Cut'n'paste error! fecha_publicacion should not be in the DISTINCT ON list. The same error is in my second suggestion (FROM (subselect)). Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
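With that correction the first suggestion would begin as shown below; fecha_publicacion stays in the ORDER BY as the last (descending) key, which is what makes DISTINCT ON keep the newest row of each group:

  SELECT DISTINCT ON (
         cp.id_instalacion,
         cp.id_contenido,
         cp.generar_vainilla
         )
         ...   -- select list unchanged
  ...
  ORDER BY cp.id_instalacion, cp.id_contenido,
           cp.generar_vainilla, cp.fecha_publicacion desc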
[PERFORM] Index correlation (was: Moving postgresql.conf tunables into 2003... )
On Thu, 07 Aug 2003 19:31:52 -0400, Tom Lane <[EMAIL PROTECTED]> wrote: >The correlation is between index order and heap order --- that is, are >the tuples in the table physically in the same order as the index? >The better the correlation, the fewer heap-page reads it will take to do >an index scan. This is true for a column that is the first column of a btree index. Correlation doesn't help with additional index columns and with functional indices. >Note it is possible to measure correlation without regard to whether >there actually is any index; But there is no need to, because the correlation is only used for index access cost estimation. >One problem we have is extrapolating from the single-column correlation >stats computed by ANALYZE to appropriate info for multi-column indexes. >It might be that the only reasonable fix for this is for ANALYZE to >compute multi-column stats too when multi-column indexes are present. I wonder whether it would be better to drop column correlation and calculate index correlation instead, i.e. correlation of index tuples to heap tuple positions. This would solve both the multi-column index and the functional index cost estimation problem. >People are used to the assumption that you don't need to re-ANALYZE >after creating a new index, but maybe we'll have to give that up. Index correlation would be computed on CREATE INDEX and whenever the heap relation is analysed ... Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Moving postgresql.conf tunables into 2003...
On Tue, 5 Aug 2003 15:26:09 -0700, Sean Chittenden <[EMAIL PROTECTED]> wrote: >> I have an experimental patch lying around somewhere that tries to >> work around these problems by offering different estimation methods >> for index scans. If you are interested, I'll dig it out. > >Sure, I'll take a gander... had my head in enough Knuth recently to >even hopefully have some kind of a useful response to the patch. Sean, the patch is at http://www.pivot.at/pg/16-correlation-732.diff. A short description of its usage can be found at http://archives.postgresql.org/pgsql-performance/2002-11/msg00256.php. If you are interested how the different interpolation methods work, read the source - it shouldn't be too hard to find. You might also want to read the thread starting at http://archives.postgresql.org/pgsql-hackers/2002-10/msg00072.php. >> does this mean that sensorid, evtime, and action are not >> independent? > >Hrm... sensorid is sequence and grows proportional with evtime, >obviously. So a *low* sensorid (7) is quite uncommon for a *late* evtime? This would help understand the problem. Unfortunately I have no clue what to do about it. :-( >Having spent a fair amount of time looking at the two following plans, >it seems as though an additional statistic is needed to change the >cost of doing an index lookup when the index is linearly ordered. I'm not sure I understand what you mean by "index is linearly ordered", but I guess correlation is that statistic you are talking about. However, it is calculated per column, not per index. >Whether CLUSTER does this or not, I don't know, If you CLUSTER on an index and then ANALYSE, you get a correlation of 1.0 (== optimum) for the first column of the index. > I never heard back >from him after getting the runtime down to a few ms. :-/ Pity! I'd have liked to see EXPLAIN ANALYSE for SELECT * FROM mss_fwevent WHERE sensorid = 7 AND evtime > (now() - '6 hours'::INTERVAL) AND NOT action; SELECT * FROM mss_fwevent WHERE sensorid = 7 AND evtime > (now() - '6 hours'::INTERVAL); SELECT * FROM mss_fwevent WHERE evtime > (now() - '6 hours'::INTERVAL); SELECT * FROM mss_fwevent WHERE sensorid = 7; > Are indexes >on linearly ordered data rebalanced somehow? I thought CLUSTER only >reordered data on disk. -sc AFAIK CLUSTER re-creates all indices belonging to the table. Servus Manfred ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Serious issues with CPU usage
On Mon, 8 Sep 2003 13:50:23 +0300, Andri Saar <[EMAIL PROTECTED]> wrote: >Basically I do this: >1) select about ~700 ID's I have to poll >2) poll them >3) update those 700 rows in that "table" I used (~2700 rows total). > >And I do this cycle once per minute, so yes, I've got a zillion updates. 700 >of 2700 is roughly 25%, so I'd have to vacuum once per minute? With such a small table VACUUM should be a matter of less than one second: fred=# vacuum verbose t; INFO: --Relation public.t-- INFO: Index t_pkey: Pages 65; Tuples 16384: Deleted 4096. CPU 0.01s/0.10u sec elapsed 0.21 sec. INFO: Removed 4096 tuples in 154 pages. CPU 0.04s/0.02u sec elapsed 0.07 sec. INFO: Pages 192: Changed 192, Empty 0; Tup 16384: Vac 4096, Keep 0, UnUsed 0. Total CPU 0.08s/0.16u sec elapsed 0.36 sec. VACUUM Time: 415.00 ms And this is on a 400 MHz machine under cygwin, so don't worry if you have a real computer. Servus Manfred ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Inconsistent performance
On Mon, 15 Sep 2003 22:26:45 -0400, Christopher Browne <[EMAIL PROTECTED]> wrote: >> select count (*) from table; >The only possible plan for THAT query will involve a seq scan of the >whole table. If the postmaster already has the data in cache, it >makes sense for it to run in 1 second. If it has to read it from >disk, 12 seconds makes a lot of sense. Yes. And note that the main difference is between having the data in memory and having to fetch it from disk. I don't believe that this difference can be explained by 9000 read calls hitting the operating system's cache. >You might want to increase the "shared_buffers" parameter in >postgresql.conf; that should lead to increased stability of times as >it should be more likely that the data in "table" will remain in >cache. Let's not jump to this conclusion before we know what's going on. Joseph Bove <[EMAIL PROTECTED]> wrote in another message above: | I did have shared_buffers and sort_mem both set higher originally (15000, | 32168) As I read this I think he meant "... and had the same performance problem." Joseph, what do you get, if you run that EXPLAIN ANALYSE SELECT count(*) ... several times? What do vmstat and top show while the query is running? Are there other processes active during or between the runs? What kind of processes? Postgres backends? Web server? ... Servus Manfred ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] rewrite in to exists?
On Thu, 18 Sep 2003 13:23:37 +0800, "Christopher Kings-Lynne" <[EMAIL PROTECTED]> wrote:
>Why can't you just go:
>
>select code, id, name, date_of_service from tbl where xxx <= 29800 and xx >= 29909 and code='XX' and client_code='XX' order by id, date_of_service;

Because (ignoring conditions on code and client_code for a moment) if for a given date there is at least one row satisfying the condition on xxx, the original query returns *all* rows having this date, regardless of their xxx value. For example (** marks rows that satisfy the xxx condition, * marks rows that are returned in addition because they fall on the same date):

 id |    date    |  xxx
----+------------+-------
  1 | 2003-01-01 |     1    *
  2 | 2003-01-01 | 29800    **
  3 | 2003-01-01 |     3    *
  4 | 2003-02-02 |     2
  5 | 2003-03-03 | 29900    **

>> select code, id, name, date_of_service
>> from tbl
>> where date_of_service in
>> (select date_of_service
>> from tbl
>> where xxx >= '29800'
>> and xxx <= '29909'
>> and code = 'XX')
>> and client_code = 'XX'
>> order by id, date_of_service;

To the original poster: You did not provide a lot of information, but the following suggestions might give you an idea ...

SELECT code, id, date_of_service
  FROM tbl
 WHERE EXISTS (SELECT * FROM tbl t2
                WHERE t2.xxx >= '29800' AND t2.xxx <= '29909'
                  AND t2.code = 'XX'
                  AND tbl.date_of_service = t2.date_of_service)
   AND client_code = 'XX'
 ORDER BY id, date_of_service;

SELECT t1.code, t1.id, t1.date_of_service
  FROM tbl t1
       INNER JOIN (SELECT DISTINCT date_of_service
                     FROM tbl
                    WHERE xxx >= '29800' AND xxx <= '29909'
                      AND code = 'XX') AS t2
             ON (t1.date_of_service = t2.date_of_service)
 WHERE t1.client_code = 'XX'
 ORDER BY id, date_of_service;

SELECT DISTINCT t1.code, t1.id, t1.date_of_service
  FROM tbl AS t1
       INNER JOIN tbl AS t2
             ON (t1.date_of_service = t2.date_of_service
                 AND t2.xxx >= '29800' AND t2.xxx <= '29909'
                 AND t2.code = 'XX')
 WHERE t1.client_code = 'XX'        -- might as well put this
                                    -- condition into the ON clause
 ORDER BY id, date_of_service;

The last one assumes that there are no duplicates on code, id, date_of_service in the desired result.

Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] rewrite in to exists?
On Thu, 18 Sep 2003 12:27:23 -0700 (GMT-07:00), LN Cisneros <[EMAIL PROTECTED]> wrote: >But, the EXISTS version doesn't Laurette, looking at that SELECT statement again I can't see what's wrong with it. One of us is missing something ;-) > really give me what I want... Can you elaborate? SELECT code, id, date_of_service FROM tbl WHERE EXISTS (SELECT * FROM tbl t2 WHERE t2.xxx >= '29800' AND t2.xxx <= '29909' AND t2.code = 'XX' AND tbl.date_of_service = t2.date_of_service) -- (!) AND client_code = 'XX' ORDER BY id, date_of_service; >all rows in tbl that ^^^ Well, all that have client_code = 'XX', as in your original query. > match the date of the subquery. The matching is done by the line with the (!) comment. Servus Manfred ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] osdl-dbt3 run results - puzzled by the execution plans
On Thu, 18 Sep 2003 15:36:50 -0700, Jenny Zhang <[EMAIL PROTECTED]> wrote:
>We thought the large effective_cache_size should lead us to better
>plans. But we found the opposite.

The common structure of your query plans is:

 Sort
   Sort Key: sum((partsupp.ps_supplycost * partsupp.ps_availqty))
   InitPlan
     ->  Aggregate
           ->  SubPlan
   ->  Aggregate
         Filter: (sum((ps_supplycost * ps_availqty)) > $0)
         ->  Group
               ->  Sort
                     Sort Key: partsupp.ps_partkey
                     ->  SubPlan (same as above)

where the SubPlan is

 ->  Merge Join  (cost=519.60..99880.05 rows=32068 width=65)
                 (actual time=114.78..17435.28 rows=30400 loops=1)  ctr=5.73
       Merge Cond: ("outer".ps_suppkey = "inner".s_suppkey)
       ->  Index Scan using i_ps_suppkey on partsupp
                 (cost=0.00..96953.31 rows=801712 width=34)
                 (actual time=0.42..14008.92 rows=799361 loops=1)  ctr=6.92
       ->  Sort  (cost=519.60..520.60 rows=400 width=31)
                 (actual time=106.88..143.49 rows=30321 loops=1)  ctr=3.63
             Sort Key: supplier.s_suppkey
             ->  SubSubPlan

for large effective_cache_size and

 ->  Nested Loop  (cost=0.00..130168.30 rows=32068 width=65)
                  (actual time=0.56..1374.41 rows=30400 loops=1)  ctr=94.71
       ->  SubSubPlan
       ->  Index Scan using i_ps_suppkey on partsupp
                  (cost=0.00..323.16 rows=80 width=34)
                  (actual time=0.16..2.98 rows=80 loops=380)  ctr=108.44
             Index Cond: (partsupp.ps_suppkey = "outer".s_suppkey)

for small effective_cache_size. Both subplans have an almost identical subsubplan:

 ->  Nested Loop  (cost=0.00..502.31 rows=400 width=31)
                  (actual time=0.23..110.51 rows=380 loops=1)  ctr=4.55
       Join Filter: ("inner".s_nationkey = "outer".n_nationkey)
       ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=10)
                               (actual time=0.08..0.14 rows=1 loops=1)  ctr=9.36
             Filter: (n_name = 'ETHIOPIA'::bpchar)
       ->  Seq Scan on supplier  (cost=0.00..376.00 rows=1 width=21)
                                 (actual time=0.10..70.72 rows=1 loops=1)  ctr=5.32

I have added the ctr (cost:time ratio) for each plan node. These values are mostly between 5 and 10 with two notable exceptions:

1) ->  Sort  (cost=519.60..520.60 rows=400 width=31)
             (actual time=106.88..143.49 rows=30321 loops=1)  ctr=3.63

It has already been noticed by Matt Clark that this is the only plan node where the row count estimation looks wrong. However, I don't believe that this has great influence on the total cost of the plan, because the ctr is not far from the usual range and if it were a bit higher, it would only add a few hundred cost units to a branch costing almost 100000 units. BTW I vaguely remember that there is something strange with the way actual rows are counted inside a merge join. Look at the branch below this plan node: It shows an actual row count of 380.

2) ->  Index Scan using i_ps_suppkey on partsupp
              (cost=0.00..323.16 rows=80 width=34)
              (actual time=0.16..2.98 rows=80 loops=380)  ctr=108.44

Here we have the only plan node where loops > 1, and it is the only one where the ctr is far off. The planner computes the cost for one loop and multiplies it by the number of loops (which it estimates quite accurately to be 400), thus getting a total cost of ca. 130000. We have no reason to believe that the single loop cost is very far from reality (for a *single* index scan), but the planner does not account for additional index scans hitting pages in the cache that have been brought in by preceding scans. This is a known problem, Tom has mentioned it several times, IIRC.

Now I'm very interested in getting a better understanding of this problem, so could you please report the results of

. \d i_ps_suppkey

. VACUUM VERBOSE ANALYSE partsupp;
  VACUUM VERBOSE ANALYSE supplier;

. SELECT attname, null_frac, avg_width, n_distinct, correlation
    FROM pg_stats
   WHERE tablename = 'partsupp' AND attname IN ('ps_suppkey', ...);
  Please insert other interesting column names for ..., especially those contained in i_ps_suppkey, if any.

. SELECT relname, relpages, reltuples
    FROM pg_class
   WHERE relname IN ('partsupp', 'supplier', ...);
  Add relevant index names for the ... here.

. EXPLAIN ANALYSE SELECT ps_partkey, ps_supplycost, ps_availqty
    FROM partsupp, supplier
   WHERE ps_suppkey = s_suppkey AND s_nationkey = '';
  The idea is to eliminate parts of the plan that are always the same. Omitting nation is possibly too much of a simplification. In this case please let me know.

Servus Manfred
Re: [PERFORM] osdl-dbt3 run results - puzzled by the execution
On Fri, 19 Sep 2003 11:35:35 -0700, Jenny Zhang <[EMAIL PROTECTED]> wrote: >I posted more results as you requested: Unfortunately they only confirm what I suspected earlier: >> 2) -> Index Scan using i_ps_suppkey on partsupp >> (cost=0.00..323.16 rows=80 width=34) >> (actual time=0.16..2.98 rows=80 loops=380) >> ctr=108.44 >> the planner does not >> account for additional index scans hitting pages in the cache that >> have been brought in by preceding scans. This is a known problem PF1 = estimated number of page fetches for one loop ~ 320 L = estimated number of loops ~ 400 P = number of pages in relation ~ 21000 Cutting down the number of heap page fetches if PF1 * L > P and P < effective_cache_size seems like an obvious improvement, but I was not able to figure out where to make this change. Maybe it belongs into costsize.c near run_cost += outer_path_rows * (inner_path->total_cost - inner_path->startup_cost) * joininfactor; in cost_nestloop() or it should be pushed into the index cost estimation functions. Hackers? For now you have to keep lying about effective_cache_size to make the planner overestimate merge joins to compensate for the planner's overestimation of nested loops. Sorry for having no better answer. Servus Manfred ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Optimizing >= and <= for numbers and dates
On Wed, 1 Oct 2003 19:45:29 +0200 (MEST), "Dimitri Nagiev" <[EMAIL PROTECTED]> wrote: >template1=# explain analyze select * from mytable where >mydate>='2003-09-01'; > Seq Scan on mytable (cost=0.00..2209.11 rows=22274 width=562) (actual > time=0.06..267.30 rows=22677 loops=1) > Filter: (mydate >= '2003-09-01'::date) > Total runtime: 307.71 msec Didn't you say that there are 25000 rows in the table? I can't believe that for selecting 90% of all rows an index scan would be faster. Try SET enable_seqscan = 0; explain analyze select * from mytable where mydate>='2003-09-01'; If you find the index scan to be faster, there might be lots of dead tuples in which case you should VACUUM FULL mytable; Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] planner doesn't use multicolumn index
On Wed, 8 Oct 2003 09:08:59 -0500 (CDT), Adrian Demaestri <[EMAIL PROTECTED]> wrote: >the type of the fields are int2 and >int4, the where condition is v.g. partido=99 and partida=123). Write your search condition as WHERE partido=99::int2 and partida=123 Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] vacuum locking
On Fri, 17 Oct 2003 09:52:26 -0600, Rob Nagler <[EMAIL PROTECTED]> wrote: >INFO: Removed 8368 tuples in 427 pages. >CPU 0.06s/0.04u sec elapsed 1.54 sec. >INFO: Pages 24675: Changed 195, Empty 0; Tup 1031519: Vac 8368, Keep 254, UnUsed >1739. >Total CPU 2.92s/2.58u sec elapsed 65.35 sec. > >INFO: Removed 232 tuples in 108 pages. >CPU 0.01s/0.02u sec elapsed 0.27 sec. >INFO: Pages 74836: Changed 157, Empty 0; Tup 4716475: Vac 232, Keep 11, UnUsed >641. >Total CPU 10.19s/6.03u sec elapsed 261.44 sec. The low UnUsed numbers indicate that FSM is working fine. >Assuming I vacuum every 15 minutes, it would seem like max_fsm_pages >should be 1000, because that's about what was reclaimed. The default >is 10000. Do I need to change this? ISTM you are VACUUMing too aggressively. You are reclaiming less than 1% and 0.005%, respectively, of tuples. I would increase the FSM settings to ca. 1000 max_fsm_relations and 100000 max_fsm_pages and VACUUM *less* often, say every two hours or so. ... or configure autovacuum to VACUUM a table when it has 10% dead tuples. Servus Manfred ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Performance Concern
On Sun, 26 Oct 2003 00:13:36 +0300, Hannu Krosing <[EMAIL PROTECTED]> wrote: >UPDATE baz > SET customer_id = '1234' > WHERE baz_key IN ( >SELECT baz_key > FROM baz innerbaz > WHERE customer_id IS NULL > and innerbaz.baz_key = baz.baz_key > LIMIT 1000 ); AFAICS this is not what the OP intended. It is equivalent to UPDATE baz SET customer_id = '1234' WHERE customer_id IS NULL; because the subselect is now correlated to the outer query and is evaluated for each row of the outer query which makes the LIMIT clause ineffective. Servus Manfred ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
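If the intention was to touch at most 1000 rows per statement, the subselect must not reference the outer table; an uncorrelated version would be (which 1000 rows the LIMIT picks is arbitrary without an ORDER BY, but that matches the original intent):

  UPDATE baz
     SET customer_id = '1234'
   WHERE baz_key IN (SELECT baz_key
                       FROM baz
                      WHERE customer_id IS NULL
                      LIMIT 1000);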
Re: [PERFORM] Guesses on what this NestLoop is for?
On Mon, 27 Oct 2003 15:32:41 -0800, Josh Berkus <[EMAIL PROTECTED]> wrote: >FROM event_types, events > LEFT OUTER JOIN ... >WHERE events.status = 1 or events.status = 11 > and events.event_date > '2003-10-27' > and events.etype_id = event_types.etype_id > and ( ... > ); > > >What I can't figure out is what is that inredibly expensive nested loop for? Sorry, I have no answer to your question, but may I ask whether you really want to get presumably 106 output rows for each event with status 1? Or did you mean WHERE (events.status = 1 OR events.status = 11) AND ... >Ideas? I'd also try to push that NOT EXISTS condition into the FROM clause: ...LEFT JOIN (SELECT DISTINCT ON (event_id) event_id, mod_date, mod_user FROM event_history ORDER BY event_id, mod_date ) AS eh ON (events.event_id = eh.event_id) ... WHERE ... AND CASE WHEN eh.event_id IS NULL THEN events.mod_user ELSE eh.mod_user END = 562 If mod_user is NOT NULL in event_history, then CASE ... END can be simplified to COALESCE(eh.mod_user, events.mod_user). Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] redundent index?
On Wed, 29 Oct 2003 10:17:24 -0500, Rod Taylor <[EMAIL PROTECTED]> wrote: >On Wed, 2003-10-29 at 09:03, Robert Treat wrote: >> Indexes: entity_watch_map_pkey primary key btree (entity_id, watch_id), >> ewm_entity_id btree (entity_id), >> >> I can't think of why the second index is there, as ISTM there is no >> instance where the first index wouldn't be used in place of the second > >The cost in evaluating the first index will be a little higher Yes, the actual cost may be a little higher. But the cost estimation might be significantly higher, so there can be border cases where the planner chooses a sequential scan over a multi-column index scan while a single-column index would correctly be recognized as being faster ... Servus Manfred ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
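Since DDL is transactional in Postgres, it is easy to measure whether the single-column index ever wins without really losing it; the WHERE value below is made up, and note that the DROP INDEX blocks writers to the table until the ROLLBACK:

  BEGIN;
  DROP INDEX ewm_entity_id;
  EXPLAIN ANALYSE SELECT count(*) FROM entity_watch_map WHERE entity_id = 42;
  ROLLBACK;    -- the index is back as if nothing had happened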
Re: [PERFORM] Seeking help with a query that takes too long
On Wed, 12 Nov 2003 08:34:50 -0500, "Nick Fankhauser" <[EMAIL PROTECTED]> wrote: > -> Index Scan using >actor_full_name_uppercase on actor (cost=0.00..6.01 rows=1 width=42) ^^ >(actual time=37.62..677.44 rows=3501 loops=1) ^ > Index Cond: >((actor_full_name_uppercase >= 'SANDERS'::character varying) AND >(actor_full_name_uppercase < 'SANDERT'::character varying)) > Filter: >(actor_full_name_uppercase ~~ 'SANDERS%'::text) Nick, can you find out why this row count estimation is so far off? \x SELECT * FROM pg_stats WHERE tablename='actor' AND attname='actor_full_name_uppercase'; BTW, there seem to be missing cases: > -> Nested Loop (cost=0.00..2214.66 rows=2 width=115) > (actual time=59.05..119929.71 rows=5879 loops=1) >-> Nested Loop (cost=0.00..2205.26 rows=3 width=76) > (actual time=51.46..66089.04 rows=5882 loops=1) Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Seeking help with a query that takes too long
On Wed, 12 Nov 2003 13:27:53 -0500, "Nick Fankhauser" <[EMAIL PROTECTED]> wrote: > >> You might have to resort to brute force, like "set enable_nestloop=false". > -> Seq Scan on >actor_case_assignment (cost=0.00..209980.49 rows=8669349 width=34) (actual >time=9.13..85504.05 rows=8670467 loops=1) Does actor_case_assignment contain more columns than just the two ids? If yes, do these additional fields account for ca. 70 bytes per tuple? If not, try VACUUM FULL ANALYSE actor_case_assignment; > -> Index Scan using >actor_full_name_uppercase on actor (cost=0.00..6.01 rows=1 width=42) >(actual time=51.67..24900.53 rows=3502 loops=1) This same index scan on actor has been much faster in your previous postings (677ms, 3200ms), probably due to caching effects. 7ms per tuple returned looks like a lot of disk seeks are involved. Is clustering actor on actor_full_name_uppercase an option or would this slow down other queries? Servus Manfred ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Seeking help with a query that takes too long
On Fri, 14 Nov 2003 11:00:38 -0500, "Nick Fankhauser" <[EMAIL PROTECTED]> wrote: >Good question... I've never used clustering in PostgreSQL before, so I'm >unsure. I presume this is like clustering in Oracle where the table is >ordered to match the index? Yes, something like that. With the exception that Postgres loses the clustered status while you INSERT and UPDATE tuples. So you have to re-CLUSTER from time to time. Look at pg_stats.correlation to see if it's necessary. > Is there a way to flush out >the cache in a testing situation like this in order to start from a >consistent base? To flush Postgres shared buffers: SELECT count(*) FROM another_large_table; To flush your database pages from the OS cache: tar cf /dev/null /some/large/directory And run each of your tests at least twice to get a feeling for how caching affects your specific queries. Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
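A quick way to watch the clustered state degrade over time (table and column names from Nick's earlier messages); run it after an ANALYSE so the numbers are current:

  SELECT attname, correlation
    FROM pg_stats
   WHERE tablename = 'actor'
     AND attname = 'actor_full_name_uppercase';
  -- close to 1 (or -1): still well ordered; close to 0: time to re-CLUSTER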
Re: [PERFORM] TEXT column and indexing
On Wed, 19 Nov 2003 10:18:18 +0100, Ian Barwick <[EMAIL PROTECTED]> wrote: >Indexes: >[...] >"opv_v_ix" btree (substr(value, 1, 128)) >SELECT obj_property_id > FROM object_property_value opv > WHERE opv.value = 'foo' Try ... WHERE substr(opv.value, 1, 128) = 'foo' HTH. Servus Manfred ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
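The search condition has to match the indexed expression for the index to be usable; if values can be longer than 128 characters, the exact comparison can simply be kept in addition. A sketch (whether the planner actually chooses the index still depends on the statistics):

  SELECT obj_property_id
    FROM object_property_value opv
   WHERE substr(opv.value, 1, 128) = substr('foo', 1, 128)   -- matches the index expression
     AND opv.value = 'foo';                                   -- exact comparison for long values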
Re: [PERFORM] optimization ideas for frequent, large(ish) updates in frequently accessed DB?
On Fri, 13 Feb 2004 01:58:34 +0100, "Marinos J. Yannikos" <[EMAIL PROTECTED]> wrote: >I'm looking for ideas that might improve the interactive performance of >the system, without slowing down the updates too much. IOW, you could accept slower updates. Did you actually try and throttle down the insert rate? > Here are the >characteristics of the table and its use: > >- approx. 2 million rows Doesn't sound worrying. What's the min/max/average size of these rows? How large is this table? SELECT relpages FROM pg_class WHERE relname='...'; What else is in this database, how many tables, how large is the database (du $PGDATA)? >- approx. 4-5 million rows per day are replaced in short bursts of >1-200k rows (average ~3000 rows per update) How often do you VACUUM [ANALYSE]? >- the table needs 6 indexes (not all indexes are used all the time, but >keeping them all the time slows the system down less than re-creating >some of them just before they're needed and dropping them afterwards) I agree. >- an "update" means that 1-200k rows with a common value in a particular >field are replaced with an arbitrary number of new rows (with the same >value in that field), i.e.: > >begin transaction; > delete from t where id=5; > insert into t (id,...) values (5,...); > ... [1-200k rows] >end; This is a wide variation in the number of rows. You told us the average batch size is 3000. Is this also a *typical* batch size? And what is the number of rows where you start to get the feeling that it slows down other sessions? Where do the new values come from? I don't think they are typed in :-) Do they come from external sources or from the same database? If the latter, INSERT INTO ... SELECT ... might help. >The problem is, that a large update of this kind can delay SELECT >queries running in parallel for several seconds, so the web interface >used by several people will be unusable for a short while. Silly question: By SELECT you mean pure SELECT transactions and not some transaction that *mostly* reads from the database? I mean, you are sure your SELECT queries are slowed down and not blocked by the "updates". Show us the EXPLAIN ANALYSE output for the same SELECT, once when it is fast and once when it is slow. BTW, what is fast and what is slow? >Currently, I'm using temporary tables: > [...] >This is slightly faster than inserting directly into t (and probably >faster than using COPY, even though using that might reduce the overall >load on the database). You might try using a prepared INSERT statement or COPY. >shared_buffers=100000 >(I tried many values, this seems to work well for us - 12GB RAM) >wal_buffers=500 >sort_mem=800000 >checkpoint_segments=16 >effective_cache_size=1000000 See Josh's comments. >Any help/suggestions would be greatly appreciated... Even if it's >something like "you need a faster db box, there's no other way" ;-) We have to find out what the bottleneck is. Tell us about your environment (hardware, OS, ...). Run top and/or vmstat and look for significant differences between times of normal processing and slow phases. Post top/vmstat output here if you need help. Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
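A sketch of the COPY variant mentioned above; t_tmp stands for the temporary table from Marinos' mail (its real name was snipped), and the file path is made up. COPY moves a whole batch in one command instead of 1-200k single-row INSERTs:

  COPY t_tmp FROM '/path/on/the/server/batch.dat';   -- server-side file
  -- or, streamed from the client connection:
  -- COPY t_tmp FROM STDIN;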
Re: [PERFORM] optimization ideas for frequent, large(ish) updates in frequently accessed DB?
Marinos, while you are busy answering my first set of questions :-), here is an idea that might help even out resource consumption. On Fri, 13 Feb 2004 01:58:34 +0100, "Marinos J. Yannikos" <[EMAIL PROTECTED]> wrote: >begin transaction; > delete from t where id=5; > insert into t (id,...) values (5,...); > ... [1-200k rows] >end; > >The problem is, that a large update of this kind can delay SELECT >queries running in parallel for several seconds, so the web interface >used by several people will be unusable for a short while. CREATE TABLE idmap ( internalid int NOT NULL PRIMARY KEY, visibleid int NOT NULL, active bool NOT NULL ); CREATE INDEX ipmap_visible ON idmap(visibleid); Populate this table with INSERT INTO idmap SELECT id, id, true FROM t; Change SELECT ... FROM t WHERE t.id = 5; to SELECT ... FROM t INNER JOIN idmap ON (idmap.internalid = t.id AND idmap.active) WHERE idmap.visibleid = 5; When you have to replace the rows in t for id=5, start by INSERT INTO idmap VALUES (12345, 5, false); Then repeatedly INSERT INTO t (id, ...) VALUES (12345, ...); at a rate as slow as you can accept. You don't have to wrap all INSERTs into a single transaction, but batching together a few hundred to a few thousand INSERTs will improve performance. When all the new values are in the database, you switch to the new id in one short transaction: BEGIN; UPDATE idmap SET active = false WHERE visibleid = 5 AND active; UPDATE idmap SET active = true WHERE internalid = 12345; COMMIT; Do the cleanup in off-peak hours (pseudocode): FOR delid IN (SELECT internalid FROM idmap WHERE NOT active) BEGIN DELETE FROM t WHERE id = delid; DELETE FROM idmap WHERE internalid = delid; END; VACUUM ANALYSE t; VACUUM ANALYSE idmap; To prevent this cleanup from interfering with INSERTs in progress, you might want to add a "beinginserted" flag to idmap. HTH. Servus Manfred ---(end of broadcast)--- TIP 8: explain analyze is your friend
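If one big transaction per off-peak run is acceptable, the cleanup pseudocode can be collapsed into plain SQL; this sketch ignores the suggested beinginserted flag, and the row-by-row loop above has the advantage of keeping the individual transactions small:

  DELETE FROM t
   WHERE id IN (SELECT internalid FROM idmap WHERE NOT active);
  DELETE FROM idmap WHERE NOT active;

  VACUUM ANALYSE t;
  VACUUM ANALYSE idmap;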