Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?

2012-12-27 Thread Richard Neill



The partial index is highly leveraged.  If every tuple in the
table is updated once, that amounts to every tuple in the index
being updated 25,000 times.

How so? That sounds like O(n^2) behaviour.

If the table has 5 million rows while the index has 200 (active) rows at
any given time, then to update every row in the table to null and back
again would be 100% turn over of the table.  But each such change would
lead to an addition and then a deletion from the index.  So 100%
turnover of the table would be a 5 million / 200 = 25,000 fold turnover of
the index.


Sorry, I was being dense. I misread that as:
   every time a single tuple in the table is updated, the entire index
(every row) is updated.
Yes, of course your explanation makes sense.



There is some code that allows a btree index entry to get killed (and so
the slot to be reused) without any vacuum, if a scan follows that entry
and finds the corresponding tuple in the table no longer visible to
anyone.  I have not examined this code, and don't know whether it is
doing its job but just isn't enough to prevent the bloat, or if for some
reason it is not applicable to your situation.



It looks like my solution is going to be a REINDEX invoked from cron, or 
maybe just every 100k inserts.
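
For reference, the cron job would just run something like this via psql
(the index name is only a placeholder for my partial index; REINDEX takes
an exclusive lock on the index, so it's best run at a quiet moment):

REINDEX INDEX tbl_tracker_performance_idx;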



In terms of trying to improve this behaviour for other PG users in the 
future, are there any more diagnostics I can do for you? Having found a 
special case, I'd like to help permanently resolve it if I can.



Thanks very much again.

Best wishes,

Richard









[PERFORM] Improve performance for writing

2012-12-27 Thread Markus Innerebner
Hello

please do not consider this email as yet another question about how to speed
up writing.

The situation is different:

After the computation, my algorithm stores the result as tuples in a DB.
In addition to normal values (e.g. a, b), the tuples contain SQL statements
that fetch values (for instance the geometry attribute) from another table
(e.g. orig_table).

e.g. 

INSERT INTO dest_table (
  Select a,b, s.geometry,s.length from orig_table s where s.id=?
)

The number of inserts depends on the size of the result and varies from
10,000 to 1,000,000.

My question is: how can I speed up such inserts?

Plain COPY statements won't work, since I need the additional values.
INSERT statements take a long time (even when done in bulk).

What do you suggest in such a situation?

Would it be better to:
- first use COPY to store the values in a new table
- and then update the new table with values from the origin table?
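
Roughly, I mean something like this (table and column names/types here are
only for illustration):

CREATE TEMP TABLE staging (a int, b int, orig_id int);

-- 1. bulk-load the computed tuples into the staging table
COPY staging FROM '/tmp/result.csv' WITH CSV;  -- or \copy from the client

-- 2. fill dest_table with one set-based join, instead of
--    10,000 .. 1,000,000 single-row INSERT ... SELECT statements
INSERT INTO dest_table (a, b, geometry, length)
SELECT st.a, st.b, s.geometry, s.length
FROM staging st
JOIN orig_table s ON s.id = st.orig_id;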


thanks for your hints / suggestions

cheers  Markus



Re: [PERFORM] Improve performance for writing

2012-12-27 Thread Charles Gomes
Markus,

Have you looked over here:
http://www.postgresql.org/docs/9.2/static/populate.html



 From: markus.innereb...@inf.unibz.it
 Subject: [PERFORM] Improve performance for writing
 Date: Thu, 27 Dec 2012 14:10:40 +0100
 To: pgsql-performance@postgresql.org

 Hello

 please do not consider this email as yet another question about how to speed
 up writing.

 The situation is different:

 After the computation, my algorithm stores the result as tuples in a DB.
 In addition to normal values (e.g. a, b), the tuples contain SQL statements
 that fetch values (for instance the geometry attribute) from another table
 (e.g. orig_table).

 e.g.

 INSERT INTO dest_table (
 Select a,b, s.geometry,s.length from orig_table s where s.id=?
 )

 The number of inserts depends on the size of the result and varies from
 10,000 to 1,000,000.

 My question is: how can I speed up such inserts?

 Plain COPY statements won't work, since I need the additional values.
 INSERT statements take a long time (even when done in bulk).

 What do you suggest in such a situation?

 Would it be better to:
 - first use COPY to store the values in a new table
 - and then update the new table with values from the origin table?


 thanks for your hints / suggestions

 cheers Markus




Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-27 Thread Charles Gomes

 Date: Wed, 26 Dec 2012 23:03:33 -0500 
 Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table 
 From: jeff.ja...@gmail.com 
 To: charle...@outlook.com 
 CC: ondrej.iva...@gmail.com; pgsql-performance@postgresql.org 
  
 On Monday, December 24, 2012, Charles Gomes wrote: 
  
  
   
   I think your performance bottleneck is almost certainly the dynamic 
   SQL.  Using C to generate that dynamic SQL isn't going to help much, 
   because it is still the SQL engine that has to parse, plan, and execute 
   it. 
   
  
 Jeff, I've changed the code from dynamic to: 
  
 CREATE OR REPLACE FUNCTION quotes_insert_trigger() 
 RETURNS trigger AS $$ 
 DECLARE 
 r_date text; 
 BEGIN 
 r_date = to_char(new.received_time, 'YYYY_MM_DD'); 
 case r_date 
  when '2012_09_10' then 
  insert into quotes_2012_09_10 values (NEW.*) using new; 
  return; 
 ... 
  
  
 However I've got no speed improvement. 
 I need to keep two weeks' worth of partitions at a time; that's why all 
 the WHEN statements. 
  
 The 'using new' and return without argument are syntax errors. 
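 
 A corrected branch would look roughly like this (a sketch, keeping your 
 partition names): 
 
  when '2012_09_10' then 
      insert into quotes_2012_09_10 values (NEW.*); 
      return NULL;  -- cancel the insert into the parent table 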
  
 When I do a model system with those fixed, I get about 2 fold  
 improvement over the dynamic SQL performance.  Even if your performance  
 did not go up, did your CPU usage go down?  Perhaps you have multiple  
 bottlenecks all sitting at about the same place, and so tackling any  
 one of them at a time doesn't get you anywhere. 



I’ve run a small test with the fixes you mentioned and it changed from
1H:20M to 1H:30M to insert 39600 rows.


 

If there was another bottleneck, performance when targeting the
partitions directly would not be twice as fast. I’ve run another long insert
test and it takes 4H:15M to complete using triggers to distribute the inserts.
When targeting the partitions directly it completes in 1H:55M. That’s
both with 70 simultaneous workers, the same data, and 118800 rows.


 

The tests that Emmanuel did translating the trigger to C showed a great
performance improvement, and his code is very general and could work for
anyone using CHECKs with triggers. I’m still working on fixing it so it’s
compatible with 9.2.

So far I’m having a hard time using the C triggers anyway:

ERROR:  could not load library
/var/lib/pgsql/pg_trigger_example.so:
/var/lib/pgsql/pg_trigger_example.so: failed to map segment from shared object:
Operation not permitted

I will do more reading on it.

I think getting it to work again would bring some value, so more people can
be aware of the performance improvement of using C instead of PL/pgSQL.


  
 How do both the dynamic and the CASE versions scale with the number of  
 threads?  I think you said you had something like 70 sessions, but only  
 8 CPUs.  That probably will do bad things with contention, and I don't  
 see how using more connections than CPUs is going to help you here.  If  
 the CASE starts out faster in single thread but then flat lines and the  
 EXECUTE catches up, that suggests a different avenue of investigation  
 than if they are always the same. 
  


I didn’t see a significant change in CPU utilization; it seems to be a
bit less, but not by much. However, I/O is still idling.


  
 Wish postgres could automate the partition process natively like the  
 other sql db. 
  
 More automated would be nice (i.e. one operation to make both the check  
 constraints and the trigger, so they can't get out of sync), but would  
 not necessarily mean faster.  I don't know what you mean about other  
 db.  Last time I looked at partitioning in mysql, it was only about  
 breaking up the underlying storage into separate files (without regards  
 to contents of the rows), so that is the same as what postgres does  
 automatically.  And in Oracle, their partitioning seemed about the same  
 as postgres's as far as administrative tedium was concerned.  I'm not  
 familiar with how the MS product handles it, and maybe my experience  
 with the other two is out of date. 



The other free SQL DB supports a more elaborate scheme, for example: 

CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)
    ENGINE=INNODB
    PARTITION BY HASH( MONTH(tr_date) )
    PARTITIONS 6;

It also supports partitioning by RANGE, LIST or KEY.


The paid one uses a very similar style:

CREATE TABLE dept (deptno NUMBER, deptname VARCHAR(32))
 PARTITION BY HASH(deptno) PARTITIONS 16;

Also:
CREATE TABLE sales
  ( prod_id   NUMBER(6)
  , cust_id   NUMBER
  , time_id   DATE
  , quantity_sold NUMBER(3)
  )
 PARTITION BY RANGE (time_id)
 ( PARTITION sales_q1_2006 VALUES LESS THAN 
(TO_DATE('01-APR-2006','dd-MON-yyyy'))
TABLESPACE tsa
 , PARTITION sales_q2_2006 VALUES LESS THAN 
(TO_DATE('01-JUL-2006','dd-MON-yyyy'))
TABLESPACE tsb
...




  
 Cheers, 
  
 Jeff

[PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?

2012-12-27 Thread Jeff Janes
On Thursday, December 20, 2012, Jeff Janes wrote:

 On Thursday, December 20, 2012, Richard Neill wrote:



  ->  Bitmap Index Scan on tbl_tracker_exit_state_idx
 (cost=0.00..8.36 rows=151 width=0) (actual time=7.946..7.946 rows=20277
 loops=1)


 This is finding 100 times more rows than it thinks it will.  If that could
 be fixed, surely this plan would not look as good.  But then, it would
 probably just switch to another plan that is not the one you want, either.


I guess the issue here is that the histogram postgres uses to estimate the
number of rows that will be found is based on visible rows, and it is
correctly estimating the number of visible rows that will be found.  And
that is the relevant thing to pass up to a higher join for its estimation.
 But for estimating the number of blocks a given index scan will access,
the right thing would be the number of tuples visited, not the number of
them found to be visible.  So that is where this plan goes systematically
wrong.

I guess the correct thing would be for postgres to keep two histograms, one
of all tuples and one of all visible tuples, and to produce different
selectivity estimates for different purposes.  But I don't see that change
getting made.  It is only meaningful in cases where there is a fundamental
skew in distribution between visible tuples and
invisible-but-as-yet-unvacuumed tuples.

I think that that fundamental skew is the source of both the
underestimation of the bitmap scan cost, and overestimation of the partial
index scan (although I can't get it to overestimate that by anywhere near
the amount you were seeing).

I still think your best bet is to get rid of the partial index and trade
the full one on (parcel_id_code) for one on (parcel_id_code,exit_state).  I
think that will be much less fragile than reindexing in a cron job.

Cheers,

Jeff




Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?

2012-12-27 Thread Richard Neill



On 27/12/12 16:17, Jeff Janes wrote:


I still think your best bet is to get rid of the partial index and trade
the full one on (parcel_id_code) for one on (parcel_id_code,exit_state).
  I think that will be much less fragile than reindexing in a cron job.



So, at the moment, I have 3 indexes:
  full: parcel_id_code
  full: exit_state
  partial: parcel_id_code where exit_state is null

Am I right that when you suggest just a single, joint index
(parcel_id_code, exit_state)
instead of all 3 of the others,

it will allow me to optimally run all of the following?

1.  SELECT * from tbl_tracker WHERE parcel_id_code=22345 AND exit_state 
IS NULL


(this is the one we've been discussing)


2.  SELECT * from tbl_tracker where parcel_id_code=44533

3.  SELECT * from tbl_tracker where exit_code = 2

(2 and 3 are examples of queries I need to run for other purposes, 
unrelated to this thread, but which use the other indexes.).



Thanks,

Richard






Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?

2012-12-27 Thread Tom Lane
Richard Neill rn...@richardneill.org writes:
 So, at the moment, I have 3 indexes:
full: parcel_id_code
full: exit_state
   partial: parcel_id_code where exit_state is null

 Am I right that when you suggest just a single, joint index
  (parcel_id_code,exit_state)
 instead of all 3 of the others,

I think he was just recommending replacing the first and third indexes.

 it will allow me to optimally run all of the following? :
 1.  SELECT * from tbl_tracker WHERE parcel_id_code=22345 AND exit_state 
 IS NULL
 2.  SELECT * from tbl_tracker where parcel_id_code=44533
 3.  SELECT * from tbl_tracker where exit_code = 2

You will need an index with exit_state as the first column to make #3
perform well --- at least, assuming that an index is going to help at
all anyway.  The rule of thumb is that if a query is going to fetch
more than a few percent of a table, an index is not useful because
it's going to be touching most table pages anyway, so a seqscan will
win.  I've forgotten now what you said the stats for exit_code values
other than null were, but it's quite possible that an index is useless
for #3.
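
A quick way to check is something like

SELECT exit_code, count(*) FROM tbl_tracker GROUP BY exit_code;

to see what fraction of the table exit_code = 2 actually covers; if it is
more than a few percent, a seqscan is the right plan for #3 anyway.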

These considerations are mostly covered in the manual:
http://www.postgresql.org/docs/9.2/static/indexes.html

regards, tom lane




Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread François Beausoleil

On 2012-12-27, at 12:10, Nikolas Everett wrote:

 We just upgraded from 8.3 to 9.1 and we're seeing some performance problems.  
 When we EXPLAIN ANALYZE our queries, the explain result claims that the queries 
 are reasonably fast but the wall clock time is way way longer.  Does anyone 
 know why this might happen?
 
 Like so:
 db=> \timing
 db=> EXPLAIN ANALYZE SELECT max(id) FROM foo WHERE blah_id = 1209123;
 
 The plan is sensible.  The estimates are sensible.  The actual DB time reads 
 like it is very sensible.  But the wall clock time is like 11 seconds and the 
 \timing report confirms it.
 
 Any ideas?

Could you post the actual plans? On both versions? That would help a lot.

Also, http://explain.depesz.com/ helps readability.

Bye,
François





Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Richard Neill

On 27/12/12 17:21, François Beausoleil wrote:


On 2012-12-27, at 12:10, Nikolas Everett wrote:


We just upgraded from 8.3 to 9.1 and we're seeing some performance problems.  
When we EXPLAIN ANALYZE our queries, the explain result claims that the queries 
are reasonably fast but the wall clock time is way way longer.  Does anyone 
know why this might happen?



Is it possible you missed an optimisation setting in the migration 
process? I made that mistake, and much later found performance was 
somewhat degraded (but surprisingly not as much as I'd expected) by my 
having failed to set effective_cache_size.


Also, if you just did a dump/restore, it might help to run ANALYZE once
(it seems that ANALYZE is normally run automatically via autovacuum, but the 
first time you insert the data, it may not happen).
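
(That is, just a one-off

ANALYZE;

at the database level, right after the restore completes.)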


A side-effect of ANALYZE is that it will pull all the tables into the OS 
memory cache (or try to), which may give significantly faster results 
(try running the same query twice in succession: it's often 5x faster 
the 2nd time).


HTH,

Richard




Re: [PERFORM] backend suddenly becomes slow, then remains slow

2012-12-27 Thread Andrew Dunstan

On 12/26/2012 11:03 PM, Jeff Janes wrote:
On Fri, Dec 14, 2012 at 10:40 AM, Andrew Dunstan 
andrew.duns...@pgexperts.com wrote:
 One of my clients has an odd problem. Every so often a backend will 
suddenly
 become very slow. The odd thing is that once this has happened it 
remains
 slowed down, for all subsequent queries. Zone reclaim is off. There 
is no IO
 or CPU spike, no checkpoint issues or stats timeouts, no other 
symptom that

 we can see.

By no spike, do you mean that the system as a whole is not using an 
unusual amount of IO or CPU, or that this specific slow back-end is 
not using an unusual amount?



both, really.



Could you strace it and see what it is doing?



Not very easily, because it's a pool connection and we've lowered the 
pool session lifetime as part of the amelioration :-) So it's not 
happening very much any more.




 The problem was a lot worse than it is now, but two steps have
 alleviated it mostly, but not completely: much less aggressive 
autovacuuming
 and reducing the maximum lifetime of backends in the connection 
pooler to 30

 minutes.

Do you have a huge number of tables?  Maybe over the course of a 
long-lived connection, it touches enough tables to bloat the relcache 
/ syscache.  I don't know how the autovac would be involved in that, 
though.





Yes, we do indeed have a huge number of tables. This seems a plausible 
thesis.


cheers

andrew






Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Nikolas Everett
Thanks!

http://explain.depesz.com/s/yfs

Looks like we're running a load of about 6.  The machines have two physical
cores hyperthreaded to 32 cores.

Interesting - the data is stored on nfs on a netapp.  We don't seem to have
a ton of nfs traffic.

Also we've got shared_buffers set to 48 gigs which is comfortably less than
the 146 gigs in the machine.


On Thu, Dec 27, 2012 at 12:21 PM, François Beausoleil
franc...@teksol.info wrote:


 On 2012-12-27, at 12:10, Nikolas Everett wrote:

  We just upgraded from 8.3 to 9.1 and we're seeing some performance
 problems.  When we EXPLAIN ANALYZE our queries the explain result claim
 that the queries are reasonably fast but the wall clock time is way way
 longer.  Does anyone know why this might happen?
 
  Like so:
  db=> \timing
  db=> EXPLAIN ANALYZE SELECT max(id) FROM foo WHERE blah_id = 1209123;
 
  The plan is sensible.  The estimates are sensible.  The actual DB time
 reads like it is very sensible.  But the wall clock time is like 11 seconds
 and the \timing report confirms it.
 
  Any ideas?

 Could you post the actual plans? On both versions? That would help a lot.

 Also, http://explain.depesz.com/ helps readability.

 Bye,
 François




Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Nikolas Everett
New news - the hot slave seems to be performing as expected with no long
pauses.

It looks like we're using an archive_timeout of 60 seconds and the default
checkpoint_timeout and checkpoint_completion_target.  I didn't do any of the
research on this.  It seems like we're asking postgres to clear all of the
dirty buffers every 60 seconds.  With 48 gigs of shared buffers we could
have quite a few buffers to clear.  Is there some place I could check on
how all that is going?



Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Nikolas Everett
Actually that last paragraph doesn't make much sense.  Please ignore it.


On Thu, Dec 27, 2012 at 12:58 PM, Nikolas Everett nik9...@gmail.com wrote:

 New news - the hot slave seems to be performing as expected with no long
 pauses.

 It looks like we're using an archive_timeout of 60 seconds and the default
 checkpoint_timeout and checkpoint_completion_target.  I didn't do any of the
 research on this.  It seems like we're asking postgres to clear all of the
 dirty buffers every 60 seconds.  With 48 gigs of shared buffers we could
 have quite a few buffers to clear.  Is there some place I could check on
 how all that is going?






Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?

2012-12-27 Thread Jeff Janes
On Thursday, December 27, 2012, Richard Neill wrote:



 On 27/12/12 16:17, Jeff Janes wrote:


 I still think your best bet is to get rid of the partial index and trade
 the full one on (parcel_id_code) for one on (parcel_id_code,exit_state).
   I think that will be much less fragile than reindexing in a cron job.


 So, at the moment, I have 3 indexes:
   full: parcel_id_code
   full: exit_state
   partial: parcel_id_code where exit_state is null

 Am I right that when you suggest just a single, joint index
 (parcel_id_code,exit_state)
 instead of all 3 of the others,


No, just instead of 1 and 3.  You still need an index on (exit_state) in
order to efficiently satisfy query 3 below.

Alternatively, you could keep index 1, and replace 2 and 3 with one on
(exit_state, parcel_id_code).  And in fact this might be the better way to
go, because a big problem you are facing is that the (exit_state) index is
looking falsely attractive, and the easiest way to overcome that is to get
rid of that index and replace it with one that can do everything that it
can do, but more.
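
In DDL terms the two options are roughly as follows (index names are
assumed, since I don't have your schema in front of me):

-- option A: replace 1 and 3
DROP INDEX tbl_tracker_parcel_id_code_idx;
DROP INDEX tbl_tracker_partial_idx;
CREATE INDEX ON tbl_tracker (parcel_id_code, exit_state);

-- option B: keep 1, replace 2 and 3
DROP INDEX tbl_tracker_exit_state_idx;
DROP INDEX tbl_tracker_partial_idx;
CREATE INDEX ON tbl_tracker (exit_state, parcel_id_code);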

Theoretically there is a technique called a loose scan or skip scan which
could allow you to make one index, (exit_state, parcel_id_code), replace
all 3 of the above, but postgres does not yet implement that technique.  I
think there is a way to achieve the same thing using recursive SQL, as
sketched below.  But I doubt it would be worth it, as too much index
maintenance is not your root problem.
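
For the curious, the recursive-SQL emulation of a loose scan looks roughly
like this (it probes the index once per distinct value instead of scanning
every entry):

WITH RECURSIVE t AS (
  SELECT min(exit_state) AS val FROM tbl_tracker
  UNION ALL
  SELECT (SELECT min(exit_state) FROM tbl_tracker WHERE exit_state > t.val)
  FROM t WHERE t.val IS NOT NULL
)
SELECT val FROM t WHERE val IS NOT NULL;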



 3.  SELECT * from tbl_tracker where exit_code = 2


Cheers,

Jeff


[PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-27 Thread Jeff Janes
On Wednesday, December 26, 2012, Pavel Stehule wrote:

 2012/12/27 Jeff Janes jeff.ja...@gmail.com:
 
  More automated would be nice (i.e. one operation to make both the check
  constraints and the trigger, so they can't get out of sync), but would
 not
  necessarily mean faster.


[snip some benchmarking]

A native implementation should evaluate expressions significantly more
 efficiently, mainly simple expressions (this is significant for a
 large number of partitions), and probably can do tuple forwarding
 faster than a heavyweight INSERT statement (it is a question whether it
 is possible to decrease some overhead with more sophisticated syntax, by
 removing record expansion).


If the main goal is to make it faster, I'd rather see all of plpgsql get
faster, rather than just a special case of partitioning triggers.  For
example, right now a CASE expression statement with 100 branches is about
the same speed as an equivalent list of 100 elsif.  So it seems to be doing
a linear search, when it could be doing a hash that should be a lot faster.




 So a native implementation can carry a significant speedup - mainly if we
 can distribute tuples without expression evaluation (evaluated by the
 executor)


Making partitioning inserts native does open up other opportunities to make
it faster, and also to make it administratively easier; but do we want to
try to tackle both of those goals simultaneously?  I think the
administrative aspects would come first.  (But I doubt I will be the one to
implement either, so my vote doesn't count for much here.)


Cheers,

Jeff





Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-27 Thread Jeff Janes
On Monday, December 24, 2012, Charles Gomes wrote:

 By the way, I've just re-wrote the code to target the partitions
 individually and I've got almost 4 times improvement.
 Shouldn't it be faster to process the trigger?  I would understand if there
 was no CPU left, but there is lots of CPU to chew.


Once you turned off hyperthreading, it was reporting 75% CPU usage.
 Assuming that that accounting is perfect, that means you could only get
33% faster if you were to somehow start using all of the CPU.  So I don't
think I'd call that a lot of CPU left.  And if you have 70 processes
fighting for 8 cores, I'm not surprised you can't get above that CPU usage.



 It seems that there will be no other way to speedup unless the insert code
 is partition aware.



There may be other ways, but that one will probably get you the most gain,
especially if you use COPY or \copy.  Since the main goal of partitioning
is to allow your physical storage layout to conspire with your bulk
operations, it is hard to see how you can get the benefits of partitioning
without having your bulk loading participate in that conspiracy.
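
Concretely, that means having the loader pick the partition itself, e.g.
(file and table names assumed):

COPY quotes_2012_09_10 FROM '/data/quotes_2012_09_10.csv' WITH CSV;

one COPY per partition, rather than pushing every row through a trigger on
the parent table.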


Cheers,

Jeff


Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-27 Thread Charles Gomes
Pavel, 

I've been trying to port the work of Emmanuel 
http://archives.postgresql.org/pgsql-hackers/2008-12/msg01221.php


His implementation is pretty straightforward: a simple trigger doing
constraint checks, with caching for bulk inserts.
So far this is what I've got: http://www.widesol.com/~charles/pgsql/partition.c
I had some issues, as he uses HeapTuples, and on 9.2 I see a Slot.



 From: pavel.steh...@gmail.com
 Date: Thu, 27 Dec 2012 19:46:12 +0100
 Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
 To: jeff.ja...@gmail.com
 CC: charle...@outlook.com; ondrej.iva...@gmail.com; 
 pgsql-performance@postgresql.org

 2012/12/27 Jeff Janes jeff.ja...@gmail.com:
  On Wednesday, December 26, 2012, Pavel Stehule wrote:
 
  2012/12/27 Jeff Janes jeff.ja...@gmail.com:
  
   More automated would be nice (i.e. one operation to make both the check
   constraints and the trigger, so they can't get out of sync), but would
   not
   necessarily mean faster.
 
 
  [snip some benchmarking]
 
  A native implementation should evaluate expressions significantly more
  efficiently, mainly simple expressions (this is significant for a
  large number of partitions), and probably can do tuple forwarding
  faster than a heavyweight INSERT statement (it is a question whether it
  is possible to decrease some overhead with more sophisticated syntax, by
  removing record expansion).
 
 
  If the main goal is to make it faster, I'd rather see all of plpgsql get
  faster, rather than just a special case of partitioning triggers. For
  example, right now a CASE expression statement with 100 branches is about
  the same speed as an equivalent list of 100 elsif. So it seems to be doing
  a linear search, when it could be doing a hash that should be a lot faster.

 a bottleneck is not in PL/pgSQL directly. It is in the PostgreSQL
 expression executor. Personally I don't see any simple optimization -
 maybe some variant of JIT (for the expression executor) could improve
 performance.

 Any other optimization would require a significant redesign of PL/pgSQL,
 which is work I would not do now - personally, it is not work I would
 start by myself, because using plpgsql triggers for partitioning is bad
 usage of plpgsql - and I believe that after a native implementation any
 such work will be useless. Designing some generic C trigger, or a really
 full implementation, is the better work.

 Moreover, there is still the expensive INSERT statement - forwarding the
 tuple at the C level should be significantly faster, because it doesn't
 have to be generic.

 
 
 
 
  So a native implementation can carry a significant speedup - mainly if we
  can distribute tuples without expression evaluation (evaluated by the
  executor)
 
 
  Making partitioning inserts native does open up other opportunities to make
  it faster, and also to make it administratively easier; but do we want to
  try to tackle both of those goals simultaneously? I think the
  administrative aspects would come first. (But I doubt I will be the one to
  implement either, so my vote doesn't count for much here.)

 Anybody who starts work on a native implementation will have my support
 (it is a feature that a lot of customers need). I have customers that can
 support development and I believe there are others. Actually it needs only
 one tenacious person, because it is work for two years.

 Regards

 Pavel

 
 
  Cheers,
 
  Jeff
 
 
 





Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-27 Thread Stephen Frost
* Jeff Janes (jeff.ja...@gmail.com) wrote:
 If the main goal is to make it faster, I'd rather see all of plpgsql get
 faster, rather than just a special case of partitioning triggers.  For
 example, right now a CASE expression statement with 100 branches is about
 the same speed as an equivalent list of 100 elsif.  So it seems to be doing
 a linear search, when it could be doing a hash that should be a lot faster.

That's a nice thought, but I'm not sure that it'd really be practical.
CASE statements in plpgsql are completely general and really behave more
like an if/elsif tree than a C-style switch() statement or similar.  For
one thing, the expression need not use the same variables, could be
complex multi-variable conditionals, etc.

Figuring out that you could build a dispatch table for a given CASE
statement and then building it, storing it, and remembering to use it,
wouldn't be cheap.

On the other hand, I've actually *wanted* a simpler syntax on occasion.
I have no idea if there'd be a way to make it work, but this would be
kind of nice:

CASE OF x -- or whatever
  WHEN 1 THEN blah blah
  WHEN 2 THEN blah blah
  WHEN 3 THEN blah blah
END

which would be possible to build into a dispatch table by looking at the
type of x and the literals used in the overall CASE statement.  Even so,
there would likely be some number of WHEN conditions required before
it'd actually be more efficient to use, though perhaps getting rid of
the expression evaluation (if that'd be possible) would make up for it.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Tom Lane
Nikolas Everett nik9...@gmail.com writes:
 We just upgraded from 8.3 to 9.1 and we're seeing some performance
 problems.  When we EXPLAIN ANALYZE our queries, the explain result claims
 that the queries are reasonably fast but the wall clock time is way way
 longer.  Does anyone know why this might happen?

 Like so:
 db=> \timing
 db=> EXPLAIN ANALYZE SELECT max(id) FROM foo WHERE blah_id = 1209123;

 The plan is sensible.  The estimates are sensible.  The actual DB time
 reads like it is very sensible.  But the wall clock time is like 11 seconds
 and the \timing report confirms it.

Seems like the extra time would have to be in parsing/planning, or in
waiting to acquire AccessShareLock on the table.  It's hard to believe
the former for such a simple query, unless the table has got thousands
of indexes or something silly like that.  Lock waits are surely possible
if there is something else contending for exclusive lock on the table,
but it's hard to see how the wait time would be so consistent.

BTW, the explain.depesz.com link you posted clearly does not correspond
to the above query (it's not doing a MAX), so another possibility is
confusion about what query is really causing trouble.  We've seen people
remove essential details before while trying to anonymize their query.

regards, tom lane




Re: [PERFORM] Performance on Bulk Insert to Partitioned Table

2012-12-27 Thread Pavel Stehule
2012/12/27 Stephen Frost sfr...@snowman.net:
 * Jeff Janes (jeff.ja...@gmail.com) wrote:
 If the main goal is to make it faster, I'd rather see all of plpgsql get
 faster, rather than just a special case of partitioning triggers.  For
 example, right now a CASE expression statement with 100 branches is about
 the same speed as an equivalent list of 100 elsif.  So it seems to be doing
 a linear search, when it could be doing a hash that should be a lot faster.

 That's a nice thought, but I'm not sure that it'd really be practical.
 CASE statements in plpgsql are completely general and really behave more
 like an if/elsif tree than a C-style switch() statement or similar.  For
 one thing, the expression need not use the same variables, could be
 complex multi-variable conditionals, etc.

 Figuring out that you could build a dispatch table for a given CASE
 statement and then building it, storing it, and remembering to use it,
 wouldn't be cheap.

 On the other hand, I've actually *wanted* a simpler syntax on occasion.
 I have no idea if there'd be a way to make it work, but this would be
 kind of nice:

 CASE OF x -- or whatever
   WHEN 1 THEN blah blah
   WHEN 2 THEN blah blah
   WHEN 3 THEN blah blah
 END

 which would be possible to build into a dispatch table by looking at the
 type of x and the literals used in the overall CASE statement.  Even so,
 there would likely be some number of WHEN conditions required before
 it'd actually be more efficient to use, though perhaps getting rid of
 the expression evaluation (if that'd be possible) would make up for it.

I understand, but I am not happy with it. CASE is relatively complex:
there is the SQL CASE too, and this would be a third variant of CASE. Maybe
some simple CASE statements can be supported by the parser, and there could
be a local optimization (probably only for numerics, without casting). But
it needs a relatively large amount of new code - will this code be accepted?

Regards

Pavel


 Thanks,

 Stephen




Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Nikolas Everett
Sorry for the confusion around the queries.  Both queries are causing
trouble.  We've noticed that just EXPLAINING the very simple queries takes
forever.

After more digging it looks like this table has an inordinate number
of indices (10 ish).  There are a whole bunch of conditional indices for other
columns that we're not checking.  The particular column that is causing us
trouble exists in both a regular (con_id) and a composite index (con_id,
something_else).

We checked on locks and don't see any ungranted locks.  Would waiting on
the AccessShareLock not appear in pg_locks?
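
(For the record, we checked with roughly:

SELECT locktype, relation::regclass, mode, pid
FROM pg_locks
WHERE NOT granted;

and it returned no rows.)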

Thanks!

Nik


On Thu, Dec 27, 2012 at 2:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Nikolas Everett nik9...@gmail.com writes:
  We just upgraded from 8.3 to 9.1 and we're seeing some performance
  problems.  When we EXPLAIN ANALYZE our queries, the explain result claims
  that the queries are reasonably fast but the wall clock time is way way
  longer.  Does anyone know why this might happen?

  Like so:
  db=> \timing
  db=> EXPLAIN ANALYZE SELECT max(id) FROM foo WHERE blah_id = 1209123;

  The plan is sensible.  The estimates are sensible.  The actual DB time
  reads like it is very sensible.  But the wall clock time is like 11
 seconds
  and the \timing report confirms it.

 Seems like the extra time would have to be in parsing/planning, or in
 waiting to acquire AccessShareLock on the table.  It's hard to believe
 the former for such a simple query, unless the table has got thousands
 of indexes or something silly like that.  Lock waits are surely possible
 if there is something else contending for exclusive lock on the table,
 but it's hard to see how the wait time would be so consistent.

 BTW, the explain.depesz.com link you posted clearly does not correspond
 to the above query (it's not doing a MAX), so another possibility is
 confusion about what query is really causing trouble.  We've seen people
 remove essential details before while trying to anonymize their query.

 regards, tom lane



Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Nikolas Everett
Another other thing - the query seems to get faster after the first time we
plan it.  I'm not sure that this is the case but I think it might be.


On Thu, Dec 27, 2012 at 2:28 PM, Nikolas Everett nik9...@gmail.com wrote:

 Sorry for the confusion around the queries.  Both queries are causing
 trouble.  We've noticed that just EXPLAINING the very simple queries takes
 forever.

 After more digging it looks like this table has an inordinate number
 of indices (10 ish).  There are a whole bunch of conditional indices for other
 columns that we're not checking.  The particular column that is causing us
 trouble exists in both a regular (con_id) and a composite index (con_id,
 something_else).

 We checked on locks and don't see any ungranted locks.  Would waiting on
 the AccessShareLock not appear in pg_locks?

 Thanks!

 Nik


 On Thu, Dec 27, 2012 at 2:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Nikolas Everett nik9...@gmail.com writes:
  We just upgraded from 8.3 to 9.1 and we're seeing some performance
  problems.  When we EXPLAIN ANALYZE our queries, the explain result claims
  that the queries are reasonably fast but the wall clock time is way way
  longer.  Does anyone know why this might happen?

  Like so:
  db=> \timing
  db=> EXPLAIN ANALYZE SELECT max(id) FROM foo WHERE blah_id = 1209123;

  The plan is sensible.  The estimates are sensible.  The actual DB time
  reads like it is very sensible.  But the wall clock time is like 11
 seconds
  and the \timing report confirms it.

 Seems like the extra time would have to be in parsing/planning, or in
 waiting to acquire AccessShareLock on the table.  It's hard to believe
 the former for such a simple query, unless the table has got thousands
 of indexes or something silly like that.  Lock waits are surely possible
 if there is something else contending for exclusive lock on the table,
 but it's hard to see how the wait time would be so consistent.

 BTW, the explain.depesz.com link you posted clearly does not correspond
 to the above query (it's not doing a MAX), so another possibility is
 confusion about what query is really causing trouble.  We've seen people
 remove essential details before while trying to anonymize their query.

 regards, tom lane





Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Tom Lane
Nikolas Everett nik9...@gmail.com writes:
 After more digging it looks like this table has an inordinate number
 of indices (10 ish).

10 doesn't sound like a lot.

 There are a whole bunch of conditional indices for other
 columns that we're not checking.  The particular column that is causing us
 trouble exists in both a regular (con_id) and a composite index (con_id,
 something_else).

You're not being at all clear here.  Are you trying to say that only
queries involving WHERE col = constant for a particular column seem
to be slow?  If so, maybe the column has a weird datatype or a wildly
out of line statistics target?  (Still hard to see how you get to 11-ish
seconds in planning, though.)

One thing you might do is watch the backend process in top or local
equivalent, and see if it's spending most of the 11 seconds sleeping, or
accumulating CPU time, or accumulating I/O.  That info would eliminate
a lot of possibilities.

regards, tom lane






[PERFORM] sched_migration_cost for high-connection counts

2012-12-27 Thread Shaun Thomas

Hey guys,

I recently stumbled over a Linux scheduler setting that has outright 
shocked me. So, after reading through this:


http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html

it became readily apparent we were hitting the same wall. I could do a 
pgbench and increase the connection count by 100 every iteration, and 
eventually performance just fell off a proverbial cliff and never recovered.


For our particular systems, this barrier is somewhere around 800 
processes. Select-only performance on a 3600-scale pgbench database in 
cache falls from about 70k TPS to about 12k TPS after crossing that 
line. Worse, sar shows over 70% CPU dedicated to system overhead.


After some fiddling around, I changed sched_migration_cost from its 
default of 500000 to 5000000 and performance returned to linear scaling 
immediately. It's literally night and day. Setting it back to 500000 
reverts to the terrible performance.


In addition, setting the migration cost to a higher value does not 
negatively affect any other performance metric I've checked. This is on 
an Ubuntu 12.04 system, and I'd love if someone out there could 
independently verify this, because frankly, I find it difficult to believe.


If legit, high-connection systems would benefit greatly.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com





Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Tom Lane
Nikolas Everett nik9...@gmail.com writes:
 We straced the backend during the explain and it looked like the open
 commands were taking several seconds each.

Kind of makes me wonder if you have a whole lot of tables ("whole lot"
in this context probably means tens of thousands) and are storing the
database on a filesystem that doesn't scale well to lots of files in one
directory.  If that's the explanation, the reason the 8.3 installation
was okay was likely that it was stored on a more modern filesystem.

BTW, please keep the list cc'd on replies.

regards, tom lane




Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Nikolas Everett
On Thu, Dec 27, 2012 at 4:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Nikolas Everett nik9...@gmail.com writes:
  We straced the backend during the explain and it looked like the open
  commands were taking several seconds each.

 Kind of makes me wonder if you have a whole lot of tables (whole lot
 in this context probably means tens of thousands) and are storing the
 database on a filesystem that doesn't scale well to lots of files in one
 directory.  If that's the explanation, the reason the 8.3 installation
 was okay was likely that it was stored on a more modern filesystem.


We have 1897 files for our largest database, which really isn't a whole lot.
The old servers were EXT3 over FC to a NetApp, running RHEL5 PPC.  The new
servers are on NFS to the same NetApp, running RHEL5 Intel.  We've failed
over from our physical primary to a virtual secondary, both of which seem to
have the same problem.  We're in the process of rebuilding a hot slave on
EXT3 over iSCSI.  We'll fail over to it as soon as we can.

We never tried stracing the PPC infrastructure but it obviously didn't have
this problem.

We also have another cluster running with an identical setup which doesn't
seem to have the problem.  In fact, the problem never came up during
correctness testing either - it specifically required load before it
came up.



 BTW, please keep the list cc'd on replies.


Itchy reply finger.



 regards, tom lane



Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Tom Lane
Nikolas Everett nik9...@gmail.com writes:
 On Thu, Dec 27, 2012 at 4:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Nikolas Everett nik9...@gmail.com writes:
 We straced the backend during the explain and it looked like the open
 commands were taking several seconds each.

 Kind of makes me wonder if you have a whole lot of tables (whole lot
 in this context probably means tens of thousands) and are storing the
 database on a filesystem that doesn't scale well to lots of files in one
 directory.  If that's the explanation, the reason the 8.3 installation
 was okay was likely that it was stored on a more modern filesystem.

 We have 1897 files for our largest database which really isn't a whole lot.

OK...

  The old servers were EXT3 over FC to a NetApp running RHEL5 PPC.  The new
 servers are on NFS to the same NetApp running RHEL5 Intel.

Now I'm wondering about network glitches or NFS configuration problems.
This is a bit outside my expertise unfortunately, but it seems clear
that your performance issue is somewhere in that area.

regards, tom lane




Re: [PERFORM] explain analyze reports that my queries are fast but they run very slowly

2012-12-27 Thread Nikolas Everett
It looks like it was a problem with NFS.  We're not really sure what was
wrong with it but once we failed over to an iSCSI mount for the data
everything is running just fine.


On Thu, Dec 27, 2012 at 6:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Nikolas Everett nik9...@gmail.com writes:
  On Thu, Dec 27, 2012 at 4:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Nikolas Everett nik9...@gmail.com writes:
  We straced the backend during the explain and it looked like the open
  commands were taking several seconds each.

  Kind of makes me wonder if you have a whole lot of tables (whole lot
  in this context probably means tens of thousands) and are storing the
  database on a filesystem that doesn't scale well to lots of files in one
  directory.  If that's the explanation, the reason the 8.3 installation
  was okay was likely that it was stored on a more modern filesystem.

  We have 1897 files for our largest database which really isn't a whole
 lot.

 OK...

   The old servers were EXT3 over FC to a NetApp running RHEL5 PPC.  The
 new
  servers are on NFS to the same NetApp running RHEL5 Intel.

 Now I'm wondering about network glitches or NFS configuration problems.
 This is a bit outside my expertise unfortunately, but it seems clear
 that your performance issue is somewhere in that area.

 regards, tom lane



Re: [PERFORM] backend suddenly becomes slow, then remains slow

2012-12-27 Thread Jeff Janes
On Thursday, December 27, 2012, Andrew Dunstan wrote:

 On 12/26/2012 11:03 PM, Jeff Janes wrote:


 Do you have a huge number of tables?  Maybe over the course of a
 long-lived connection, it touches enough tables to bloat the relcache /
 syscache.  I don't know how the autovac would be involved in that, though.



 Yes, we do indeed have a huge number of tables. This seems a plausible
 thesis.


All of the syscache things have compiled hard-coded numbers of buckets, at
most 2048, and once those are exceeded the resulting collision resolution
becomes essentially linear.  It is not hard to exceed 2048 tables by a
substantial multiple, and even less hard to exceed 2048 columns (summed
over all tables).

I don't know why syscache doesn't use dynahash; whether it is older than
dynahash and was never converted out of inertia, or if there are extra
features that don't fit the dynahash API.  If the former, then converting
them to use dynahash should give automatic resizing for free.  Maybe that
conversion should be a To Do item?



Cheers,

Jeff


Re: [PERFORM] Why does the query planner use two full indexes, when a dedicated partial index exists?

2012-12-27 Thread Jeff Janes
On Thursday, December 20, 2012, Jeff Janes wrote:

 On Thursday, December 20, 2012, Tom Lane wrote:


 What I did to try to duplicate Richard's situation was to create a test
 table in which all the exit_state values were NULL, then build the
 index, then UPDATE all but a small random fraction of the rows to 1,
 then vacuum.  This results in a rather bloated partial index, but I
 think that's probably what he's got given that every record initially
 enters the table with NULL exit_state.  It would take extremely frequent
 vacuuming to keep the partial index from accumulating a lot of dead
 entries.



Once I cranked up default_statistics_target, I could start reproducing the
very high estimates (5000) for the partial index in 9.1.

As you say, switching to 9.2 or above lowers it quite a bit, but I still get
some pretty high estimates: ~100 when 8 would be more accurate.

The problem is in genericcostestimate:

if (index->pages > 1 && index->tuples > 1)
    numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);

The index->pages should probably not include index pages which are empty.
Even with aggressive vacuuming, most of the pages in the partial index
seem to be empty at any given time.

However, I don't know if that number is exposed readily.  And it seems to
be updated too slowly to be useful, if pg_freespace is to be believed.

But I wonder if it couldn't be clamped so that there can be no more pages
than there are tuples:

numIndexPages = ceil(numIndexTuples * Min(1.0, index->pages / index->tuples));


 Cheers,

Jeff