[PERFORM] postgresql-9.0 Windows service stops after database transaction

2010-09-23 Thread adrian . kitchingman
Hi all

I have a strange problem with my Windows postgresql-9.0 service 
stopping after any transaction which manipulates tables in any database 
(deleting records, inserting records, bulk importing via \copy, etc.). This 
problem occurs regardless of whether I'm accessing the database server via 
pgAdmin III on a client machine or on the server itself, as well as through 
the command prompt. The end result is that I have to constantly restart 
the postgresql service after any transaction. Sometimes there is a slight 
delay allowing a couple of transactions, but the service always 
eventually stops.

The problem appears to have started when I changed permissions on my 
machine to allow the user 'postgres' access to the C drive following the 
instructions in the last post of this thread:
http://www.jitterbit.com/PhpBB/viewtopic.php?f=5&t=869

The specs of the postgres environment are:
Windows XP SP3
PostgreSQL 9.0.0, compiled by Visual C++ build 1500, 32-bit
pgAdmin III 1.12.0 (Sep 17 2010, rev: REL-1_12_0)
PostGIS 2.0SVN

Hope someone can shed some light on this issue.

Cheers

Adrian




Re: [PERFORM] locking issue on simple selects?

2010-09-23 Thread Kevin Grittner
Tobias Brox  wrote:
 
> 13:19:13.840 - LOG: checkpoint complete
 
> 13:19:13.841 - LOG:  checkpoint starting
 
> 13:20:43.176 - LOG:  checkpoint complete
 
There wasn't a lot of time between the completion of one checkpoint
and the start of the next.  And the two checkpoints finished a
minute and a half apart.  Perhaps you need to boost your
checkpoint_segments setting?  What is it now?
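
For reference, checkpoint_segments can be inspected from any session;
raising it means editing postgresql.conf and reloading (a sketch, with an
illustrative value):

```sql
-- Current value; the default of 3 is often far too low for write-heavy loads:
SHOW checkpoint_segments;

-- After setting e.g. checkpoint_segments = 32 in postgresql.conf,
-- the change can be picked up without a restart:
SELECT pg_reload_conf();
```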
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-23 Thread Tobias Brox
On 23 September 2010 22:55, Kevin Grittner  wrote:
> Have you turned on checkpoint logging?

Yes ... it seems so:

13:19:13.840 - LOG: checkpoint complete: wrote 3849 buffers (0.2%); 0
transaction log file(s) added, 0 removed, 5 recycled; write=269.551 s,
sync=0.103 s, total=269.953 s
13:19:13.841 - LOG:  checkpoint starting: xlog
13:19:33 - the seq scan query towards the affected table started
13:20:31.454 - one of the index lookup queries towards the affected
table was finished
13:20:43.176 - LOG:  checkpoint complete: wrote 108199 buffers (6.9%);
0 transaction log file(s) added, 0 removed, 16 recycled; write=11.521
s, sync=77.533 s, total=89.335 s

> You might want to see if
> these are happening at some particular point in the checkpoint
> processing.  If so, look through the archives for posts from Greg
> Smith on how to tune that -- he's worked out a nice methodology to
> iteratively improve your configuration in this regard.

Thank you, I will ... hmm ... I found this blog post:

http://blog.2ndquadrant.com/en/2010/01/measuring-postgresql-checkpoin.html

Of course I'm doing it my own way:

select *, now() as snapshot
  into tmp_pg_stat_bgwriter
  from pg_stat_bgwriter;

create view tmp_delta_pg_stat_bgwriter as
  select a.checkpoints_timed  - b.checkpoints_timed  as checkpoints_timed,
         a.checkpoints_req    - b.checkpoints_req    as checkpoints_req,
         a.buffers_checkpoint - b.buffers_checkpoint as buffers_checkpoint,
         a.buffers_clean      - b.buffers_clean      as buffers_clean,
         a.maxwritten_clean   - b.maxwritten_clean   as maxwritten_clean,
         a.buffers_backend    - b.buffers_backend    as buffers_backend,
         a.buffers_alloc      - b.buffers_alloc      as buffers_alloc,
         now() - b.snapshot as interval
    from pg_stat_bgwriter a,
         (select * from tmp_pg_stat_bgwriter order by snapshot desc limit 1) as b;

Checkpoint timeout is set to 5 minutes.  Right now we're having
relatively low activity.  I'm not sure how to read the stats below,
but they look OK to me:

select * from tmp_delta_pg_stat_bgwriter ;
 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc |    interval
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------+-----------------
                 3 |               0 |               8277 |            15 |                0 |             185 |         18691 | 00:12:02.988842
(1 row)



Re: [PERFORM] Memory usage - indexes

2010-09-23 Thread Mark Kirkwood

On 24/09/10 09:50, Tobias Brox wrote:

> We've come to a tipping point with one of our database servers; it's
> generally quite loaded, but up until recently it was handling the load
> well. Now we're seeing that it struggles to process all the selects
> fast enough.  Sometimes we're observing some weird lock-like behaviour
> (see my other post on that), but most of the time the database server
> is just not capable of handling the load fast enough (causing the
> queries to pile up in the pg_stat_activity view).
>
> My main hypothesis is that all the important indexes used to fit
> snugly into memory, and now they don't.  We'll eventually get the
> server moved over to new and improved hardware, but while waiting for
> that to happen we need to focus on reducing the memory footprint of
> the database.  I have some general questions now ...
>
> 1) Are there any good ways to verify my hypothesis?  Some months ago I
> thought of running some small memory-gobbling program on the database
> server just to see how much memory I could remove before we would see
> indications of the database being overloaded.  It seems a bit radical,
> but I think the information learned from such an experiment would be
> very useful ... and we never managed to set up any testing environment
> that faithfully replicates production traffic.  Anyway, it's sort of
> too late now that we're already observing performance problems even
> without the memory-gobbling script running.
>
> 2) I've seen it discussed earlier on this list ... shared_buffers vs
> OS caches.  Some claim that adjusting the size of the shared buffers
> has very little effect.  Anyway, isn't there a risk that memory is
> wasted because important data is stored both in the OS cache and the
> shared buffers?  What would happen if we used almost all the available
> memory for shared buffers?  Or turned it down to a bare minimum and
> let the OS do almost all the cache handling?
>
> 3) We're discussing dropping some overlapping indexes ... i.e.
> dropping one out of two indexes looking like this:
>
> some_table(a)
> some_table(a,b)
>
> Would the query "select * from some_table where a=?" run slower if we
> drop the first index?  Significantly?
>
> (In our situation I found that the number of distinct b's for each a
> is low, and that the usage stats on the second index are quite low
> compared with the first one, so I think we'll drop the second index.)
>
> 4) We're discussing dropping other indexes.  Does it make sense at all
> as long as we're not experiencing problems with inserts/updates?  I
> suppose that if the index isn't used it will remain on disk and won't
> affect the memory usage ... but what if the index is rarely used ...
> wouldn't it be better to do a seqscan on a table that is frequently
> accessed and mostly in memory than to consult an index that is stored
> on the disk?
>
> Sorry for all the stupid questions ;-)



All good questions! Before (or maybe as well as) looking at index sizes 
vs memory I'd check to see if any of your commonly run queries have 
suddenly started to use different plans due to data growth, e.g:


- index scan to seq scan (perhaps because effective_cache_size is too 
small now)

- hash agg to sort (work_mem too small now)

We had a case of the 1st point happen here a while ago, symptoms looked 
very like what you are describing.


Re index size, you could try indexes like:

some_table(a)
some_table(b)

which may occupy less space, and the optimizer can bitmap and/or them to 
work like the compound index some_table(a,b).
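
A minimal sketch of this suggestion (index names are illustrative):

```sql
-- Two narrow single-column indexes instead of the compound some_table(a,b);
-- the planner can combine them with BitmapAnd (for AND) or BitmapOr (for OR):
CREATE INDEX some_table_a_idx ON some_table (a);
CREATE INDEX some_table_b_idx ON some_table (b);

-- EXPLAIN SELECT * FROM some_table WHERE a = 1 AND b = 2;
-- may then show a BitmapAnd over bitmap index scans on the two indexes.
```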


regards

Mark



[PERFORM] Memory usage - indexes

2010-09-23 Thread Tobias Brox
We've come to a tipping point with one of our database servers; it's
generally quite loaded, but up until recently it was handling the load
well. Now we're seeing that it struggles to process all the selects
fast enough.  Sometimes we're observing some weird lock-like behaviour
(see my other post on that), but most of the time the database server
is just not capable of handling the load fast enough (causing the
queries to pile up in the pg_stat_activity view).

My main hypothesis is that all the important indexes used to fit snugly
into memory, and now they don't.  We'll eventually get the server moved
over to new and improved hardware, but while waiting for that to happen
we need to focus on reducing the memory footprint of the database.  I
have some general questions now ...

1) Are there any good ways to verify my hypothesis?  Some months ago I
thought of running some small memory-gobbling program on the database
server just to see how much memory I could remove before we would see
indications of the database being overloaded.  It seems a bit radical,
but I think the information learned from such an experiment would be
very useful ... and we never managed to set up any testing environment
that faithfully replicates production traffic.  Anyway, it's sort of
too late now that we're already observing performance problems even
without the memory gobbling script running.

2) I've seen it discussed earlier on this list ... shared_buffers vs
OS caches.  Some claim that adjusting the size of the shared buffers
has very little effect.  Anyway, isn't there a risk that memory is
wasted because important data is stored both in the OS cache and the
shared buffers?  What would happen if we used almost all the available
memory for shared buffers?  Or turned it down to a bare minimum and let
the OS do almost all the cache handling?

3) We're discussing dropping some overlapping indexes ... i.e.
dropping one out of two indexes looking like this:

some_table(a)
some_table(a,b)

Would the query "select * from some_table where a=?" run slower if we
drop the first index?  Significantly?

(In our situation I found that the number of distinct b's for each a
is low, and that the usage stats on the second index are quite low
compared with the first one, so I think we'll drop the second index.)
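
A sketch for question 3, using the names from this post: the compound
index can still serve lookups on its leading column, so dropping
some_table(a) mainly costs a somewhat larger index to traverse:

```sql
-- With only some_table(a,b) present, this can still be answered by an
-- index scan on the compound index (a = ? matches its leading column):
EXPLAIN SELECT * FROM some_table WHERE a = 42;
```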

4) We're discussing dropping other indexes.  Does it make sense at all
as long as we're not experiencing problems with inserts/updates?  I
suppose that if the index isn't used it will remain on disk and won't
affect the memory usage ... but what if the index is rarely used ...
wouldn't it be better to do a seqscan on a table that is frequently
accessed and mostly in memory than to consult an index that is stored
on the disk?
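
For question 4, the statistics views can show whether an index is
actually used before deciding to drop it (a sketch; run against the live
database):

```sql
-- Indexes sorted so the never- or rarely-scanned ones come first:
SELECT relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
  FROM pg_stat_user_indexes
 ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC;
```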

Sorry for all the stupid questions ;-)



Re: [PERFORM] locking issue on simple selects?

2010-09-23 Thread Kevin Grittner
Tobias Brox  wrote:
 
> All while having this problem, there was one heavy query running
> in parallel ... not sure if that's relevant.
 
Have you turned on checkpoint logging?  You might want to see if
these are happening at some particular point in the checkpoint
processing.  If so, look through the archives for posts from Greg
Smith on how to tune that -- he's worked out a nice methodology to
iteratively improve your configuration in this regard.
 
-Kevin



Re: [PERFORM] locking issue on simple selects?

2010-09-23 Thread Tobias Brox
On 15 September 2010 12:05, Tobias Brox  wrote:
> Recently we've frequently encountered issues where some simple selects
> (meaning, selects doing an index lookup and fetching one row) have
> become stuck for several minutes.  Apparently all requests on one
> exact table gets stuck, all requests not related to said table are
> going through without any problems.

Now I've set up all kind of logging regarding locks, so it seems like
we're having issues that aren't lock-related.  I just did a bit of
research into one situation today.

All while having this problem, there was one heavy query running in
parallel ... not sure if that's relevant.

Then comes one query that requires a seq scan on the problem table
(that won't happen again - I just added a new index).  Four seconds
later comes another query requiring a simple index lookup.  Still more
queries come in, most of them simple index lookups, but on different
indexes.  After one minute there are 25 queries in the
pg_stat_activity view towards this table.  It's not a particularly
huge table.  Moments later all 25 queries have been executed.



Re: [PERFORM] how to enforce index sub-select over filter+seqscan

2010-09-23 Thread Merlin Moncure
On Thu, Sep 23, 2010 at 10:26 AM, Dmitry Teslenko  wrote:
> Hello!
>
> I have this table:
>
> create table test (
>        s1 varchar(255),
>        s2 varchar(255),
>        i1 integer,
>        i2 integer,
>
> ... over 100 other fields
>
> );
>
> table contains over 8 million records
>
> there's these indexes:
>
> create index is1 on test (s1);
> create index is2 on test (s2);
> create index ii1 on test (i1);
> create index ii2 on test (i2);
> create index ii3 on test (i1, i2);
>
> and then i run this query:
>
> select
> *
> from (
>        select  *
>        from test
>        where
>                is1 = 'aa' or is2 = 'aa'
>        )
> where
>        is1 = 1
>        or (is1  = 1
>                and is2 = 1)
>        or (is1  = 2
>                and is2 = 2)
>        or (is1  = 3
>                and is2 = 3)

hm, I think you meant to say:
s1 = 'aa' or s2 = 'aa', i1 = 1 ... etc. details are important!

Consider taking the combination of 'correct' pair of i1 and i2 and
building a table with 'values' and joining to that:

select  * from test
  join
  (
values (2,2),  (3,3), ...
  ) q(i1, i2) using(i1,i2)
  where  s1 = 'aa' or s2 = 'aa' or i1=1

merlin



Re: [PERFORM] Useless sort by

2010-09-23 Thread Scott Marlowe
On Thu, Sep 23, 2010 at 7:51 AM,   wrote:
> Not insulting, just amused bemusement.  PG portrays itself as the best OS 
> database, which it may well be.  But it does so by stressing the 
> row-by-agonizing-row approach to data.  In other words, as just a record 
> paradigm filestore for COBOL/java/C coders.  I was expecting more Relational 
> oomph.  As Dr. Codd says:  "A Relational Model of Data for Large Shared Data 
> Banks".  Less code, more data.

So what, exactly, would give pgsql more relational "oomph"?

Your assertion feels pretty hand-wavy right now.

-- 
To understand recursion, one must first understand recursion.



Re: [PERFORM] how to enforce index sub-select over filter+seqscan

2010-09-23 Thread Kevin Grittner
Dmitry Teslenko  wrote:
 
> Seqscan is very slow and I want to avoid it. Subquery is very fast
> and i don't know why postgres chooses that plan.
> 
> I know I can set enable_seqscan = off.
> Is there other ways to enforce index usage?
 
If you come at it from that angle, you probably won't get the best
resolution.  PostgreSQL can see the alternative plans, and develops
estimated costs of running each.  It uses the one that it thinks
will be fastest.  If it's wrong, there's probably something wrong
with the statistics it uses for estimating, or with the costing
information.  (There are some cases where it's not able to
accurately estimate costs even if these are right, but let's check
the more common cases first.)
 
Please provide a little more information, like PostgreSQL version,
the postgresql.conf contents (excluding comments), OS, hardware, and
the EXPLAIN ANALYZE output of the query with and without
enable_seqscan = off.
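
A session-local way to collect those two plans, using the test table
from the original post (a sketch):

```sql
-- Planner's preferred plan:
EXPLAIN ANALYZE SELECT * FROM test WHERE s1 = 'aa' OR s2 = 'aa';

-- Same query with seqscans discouraged, for comparison only:
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM test WHERE s1 = 'aa' OR s2 = 'aa';
RESET enable_seqscan;
```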
 
Other useful ideas here:
 
http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
-Kevin



[PERFORM] how to enforce index sub-select over filter+seqscan

2010-09-23 Thread Dmitry Teslenko
Hello!

I have this table:

create table test (
        s1 varchar(255),
        s2 varchar(255),
        i1 integer,
        i2 integer,

        ... over 100 other fields

);

table contains over 8 million records

there's these indexes:

create index is1 on test (s1);
create index is2 on test (s2);
create index ii1 on test (i1);
create index ii2 on test (i2);
create index ii3 on test (i1, i2);

and then i run this query:

select
        *
from (
        select  *
        from test
        where
                is1 = 'aa' or is2 = 'aa'
        )
where
        is1 = 1
        or (is1 = 1
                and is2 = 1)
        or (is1 = 2
                and is2 = 2)
        or (is1 = 3
                and is2 = 3)


The where part of the outer query can have a varying number of
"or (is1 = N
        and is2 = M)"
expressions; let's call this number X.

When X is low the planner chooses an index scan using is1 and is2,
then BitmapAnds that with an index scan using ii1, ii2 or ii3.

But when X is big enough (> 15) the planner chooses a seqscan and a
filter on i1, i2, s1, s2.  The seqscan is very slow and I want to
avoid it.  The subquery is very fast and I don't know why postgres
chooses that plan.

I know I can set enable_seqscan = off.
Is there other ways to enforce index usage?

postgres pg_class has the right estimate of the rowcount.
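
For reference, one way that estimate can be checked (a sketch):

```sql
-- Planner's stored rowcount estimate for the table:
SELECT reltuples::bigint AS estimated_rows
  FROM pg_class
 WHERE relname = 'test';
```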

-- 
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?



Re: [PERFORM] Useless sort by

2010-09-23 Thread gnuoytr
I can't tell if you meant for this to be insulting or my reading it that way is 
wrong, but it certainly wasn't put in a helpful tone.  Let me summarize for 
you.  You've been told that putting ORDER BY into a view is a generally poor 
idea anyway, that it's better to find ways avoid this class of concern 
altogether.  There are significant non-obvious technical challenges behind 
actually implementing the behavior you'd like to see; the concerns raised by 
Tom and Maciek make your idea impractical even if it were desired.  And for 
every person like yourself who'd see the benefit you're looking for, there are 
far more that would find a change in this area a major problem.  The concerns 
around breakage due to assumed but not required aspects of the relational model 
are the ones the users of the software will be confused by, not the developers 
of it.  You have the classification wrong; the feedback you've gotten here is 
from the developers being user oriented, not theory oriented or code
oriented.
-- 
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
Author, "PostgreSQL 9.0 High Performance"  Pre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book


Not insulting, just amused bemusement.  PG portrays itself as the best OS 
database, which it may well be.  But it does so by stressing the 
row-by-agonizing-row approach to data.  In other words, as just a record 
paradigm filestore for COBOL/java/C coders.  I was expecting more Relational 
oomph.  As Dr. Codd says:  "A Relational Model of Data for Large Shared Data 
Banks".  Less code, more data.

robert



Re: [PERFORM] Using Between

2010-09-23 Thread mark
>>> The question is how can we make it faster.

>>If there's just one region ID for any given postal code, you might try
>>adding a column to vehicleused and storing the postal codes there.
>>You could possibly populate that column using a trigger; probably it
>>doesn't change unless the postalcode changes.  Then you could index
>>that column and query against it directly, rather than joining to
>>PostalCodeRegionCountyCity.  Short of that, I don't see any obvious
>>way to avoid reading most of the vehicleused table.  There may or may
>>not be an index that can speed that up slightly and of course you can
>>always throw hardware at the problem, but fundamentally reading half a
>>million or more rows isn't going to be instantaneous.

>>Incidentally, it would probably simplify things to store postal codes
>>in the same case throughout the system. If you can avoid the need to
>>write lower(x) = lower(y) and just write x = y you may get better
>>plans.  I'm not sure that's the case in this particular example but
>>it's something to think about.

Something else you might test is bumping the read-ahead value. Most Linux
installs have this at 256; you might try bumping the value to ~8 MB and
tuning from there. This may help you slightly for seq scan performance. As
always: YMMV. It's not going to magically fix low-performing I/O subsystems
and it won't help many applications of PG, but there are a few outlying
instances where this change can help a little bit.


I am sure someone will step in and tell you it is a bad idea - AND they will
probably have perfectly valid reasons for why it is, so you will need to
consider the ramifications. If at all possible, test and tune to see.
..: Mark



>>-- 
>>Robert Haas
>>EnterpriseDB: http://www.enterprisedb.com
>>The Enterprise Postgres Company


