Re: [PERFORM] Inserts or Updates

2012-02-08 Thread Ofer Israeli
Andy Colson wrote:
 Oh, I knew I'd seen index usage stats someplace.
 
 give this a run:
 
 select * from pg_stat_user_indexes where relname = 'SuperBigTable';
 
 http://www.postgresql.org/docs/current/static/monitoring-stats.html
 
 -Andy
 


Thanks.  We have begun analyzing the indexes and indeed found many are pretty 
useless and will be removed.
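
For anyone following along, never-scanned indexes can be listed with something
like this (a sketch; confirm statistics have been collecting long enough, and
check any replicas, before dropping anything):

    SELECT schemaname, relname, indexrelname, idx_scan
    FROM pg_stat_user_indexes
    WHERE idx_scan = 0
    ORDER BY schemaname, relname;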



Re: [PERFORM] index scan forward vs backward = speed difference of 357X slower!

2012-02-08 Thread Tom Lane
Kevin Traster ktras...@freshgrillfoods.com writes:
 The query plan and estimates are exactly the same, except desc has index
 scan backwards instead of index scan for changes_shareschange.
 Yet, actual runtime performance is different by 357x slower for the
 ascending version instead of descending.

Apparently, there are some rows passing the filter condition that are
close to the end of the index, but none that are close to the start.
So it takes a lot longer to find the first 15 matches in one case than
the other.  You haven't shown us the index definition, but I gather from
the fact that the scan condition is just a Filter (not an Index Cond)
that the index itself doesn't offer any clue as to whether a given row
meets those conditions.  So this plan is going to be doing a lot of
random-access heap probes until it finds a match.

 Why and how do I fix it?

Probably, you need an index better suited to the query condition.
If you have one and the problem is that the planner's not choosing it,
then this is going to take more information to resolve.
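
For illustration only, a sketch of what "better suited" might look like here,
assuming the filter columns (activity, mfiled) and sort key (shareschange)
that surface later in this thread:

    CREATE INDEX changes_activity_mfiled_shareschange
        ON changes (activity, mfiled, shareschange);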

regards, tom lane



[PERFORM] Vacuuming problems on TOAST table

2012-02-08 Thread Ofer Israeli
Hi all,

We have a four-column table whose rows also spill into a TOAST table, where the
TOASTed entries are ~35KB each.
The table size is 10K records.
The table is updated at a rate of ~100 updates a minute.

During our testing we see that the table size increases substantially.  When 
looking at the autovacuum log, set with default configuration, it seems that it 
ran for around 60 seconds (see below and note that this was a 1-minute test, 
i.e. only 100 updates)!

pid:2148 tid:0 sid:4f32a8de.864 sln:2 sst:2012-02-08 18:54:54 IST [2012-02-08 
19:24:06.967 IST]DEBUG:  autovac_balance_cost(pid=4560 db=16385, rel=17881, 
cost_limit=200, cost_delay=20)
pid:2148 tid:0 sid:4f32a8de.864 sln:3 sst:2012-02-08 18:54:54 IST [2012-02-08 
19:24:37.622 IST]DEBUG:  autovac_balance_cost(pid=4560 db=16385, rel=17881, 
cost_limit=200, cost_delay=20)
pid:4560 tid:0 sid:4f32af99.11d0 sln:14 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.518 IST]DEBUG:  scanned index pg_toast_17881_index to remove 1700 
row versions
pid:4560 tid:0 sid:4f32af99.11d0 sln:15 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.518 IST]DETAIL:  CPU 0.00s/0.00u sec elapsed 0.74 sec.
pid:4560 tid:0 sid:4f32af99.11d0 sln:16 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.581 IST]DEBUG:  pg_toast_17881: removed 1700 row versions in 494 
pages
pid:4560 tid:0 sid:4f32af99.11d0 sln:17 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.581 IST]DETAIL:  CPU 0.00s/0.00u sec elapsed 0.06 sec.
pid:4560 tid:0 sid:4f32af99.11d0 sln:18 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.581 IST]DEBUG:  index pg_toast_17881_index now contains 169983 row 
versions in 473 pages
pid:4560 tid:0 sid:4f32af99.11d0 sln:19 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.581 IST]DETAIL:  1700 index row versions were removed.
 4 index pages have been deleted, 0 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
pid:4560 tid:0 sid:4f32af99.11d0 sln:20 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.581 IST]DEBUG:  pg_toast_17881: found 1471 removable, 169983 
nonremovable row versions in 42921 pages
pid:4560 tid:0 sid:4f32af99.11d0 sln:21 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.581 IST]DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 495 pages contain useful free space.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 66.36 sec.
pid:4560 tid:0 sid:4f32af99.11d0 sln:22 sst:2012-02-08 19:23:37 IST [2012-02-08 
19:24:43.581 IST]LOG:  automatic vacuum of table 
uepm.pg_toast.pg_toast_17881: index scans: 1
 pages: 0 removed, 42921 remain
 tuples: 1471 removed, 169983 remain
 system usage: CPU 0.00s/0.00u sec elapsed 66.36 sec

When we set a higher cost limit for autovacuum (we tried a value of 2000), it
ran for even longer: ~400 seconds!


The only other information that I have here is that each TOASTed value is split
into 17 chunks (which makes sense, considering TOAST splits values into ~2KB
chunks).

And one more thing that seems a bit strange - after a 1-minute run, we would 
expect to see 1700 Tuples Updated (100*17), but instead we see 1700 Tuples 
Inserted (and no deletes).


Anyone have a clue on this phenomenon?


Thanks,
Ofer


Re: [PERFORM] Vacuuming problems on TOAST table

2012-02-08 Thread Tom Lane
Ofer Israeli of...@checkpoint.com writes:
 During our testing we see that the table size increases substantially.  When 
 looking at the autovacuum log, set with default configuration, it seems that 
 it ran for around 60 seconds (see below and note that this was a 1-minute 
 test, i.e. only 100 updates)!

autovacuum is intended to run fairly slowly, so as to not consume too
much resources.  If you think it's too slow you can adjust the
autovacuum_cost tunables.
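
For reference, the knobs in question can be inspected per session; the
comments show the stock defaults of this era:

    SHOW vacuum_cost_limit;              -- 200 by default
    SHOW autovacuum_vacuum_cost_limit;   -- -1 by default, i.e. falls back to vacuum_cost_limit
    SHOW autovacuum_vacuum_cost_delay;   -- 20ms by default; a lower delay speeds autovacuum up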

 When we set a higher cost limit for autovacuum (we tried a value of 2000),
 it ran for even longer: ~400 seconds!

That's the wrong direction, no?

regards, tom lane



Re: [PERFORM] index scan forward vs backward = speed difference of 357X slower!

2012-02-08 Thread Kevin Traster
Typo: work_mem = 32 MB

The definition for both column and index:
 shareschange  | numeric |
changes_shareschange btree (shareschange)

Index created using: CREATE INDEX changes_shareschange ON changes(shareschange);

The entire table is created nightly (and analyzed afterwards), and
used only for reporting - there are no updates/deletes, so there shouldn't
be any dead rows in the table.
Likewise, there are no nulls in the column.

Please elaborate on:

You haven't shown us the index definition, but I gather from
 the fact that the scan condition is just a Filter (not an Index Cond)
 that the index itself doesn't offer any clue as to whether a given row
 meets those conditions

Are you saying it is the retrieval of the 15 physically randomly located
rows to meet the ascending condition that causes the 5 sec difference?
The table is not clustered, so access is random for descending also.

The condition is shareschange ascending; I have an index for that
condition, and the planner is using it.

What else can I look at?



On Wed, Feb 8, 2012 at 11:31 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Traster ktras...@freshgrillfoods.com writes:
 The query plan and estimates are exactly the same, except desc has index
 scan backwards instead of index scan for changes_shareschange.
 Yet, actual runtime performance is different by 357x slower for the
 ascending version instead of descending.

 Apparently, there are some rows passing the filter condition that are
 close to the end of the index, but none that are close to the start.
 So it takes a lot longer to find the first 15 matches in one case than
 the other.  You haven't shown us the index definition, but I gather from
 the fact that the scan condition is just a Filter (not an Index Cond)
 that the index itself doesn't offer any clue as to whether a given row
 meets those conditions.  So this plan is going to be doing a lot of
 random-access heap probes until it finds a match.

 Why and how do I fix it?

 Probably, you need an index better suited to the query condition.
 If you have one and the problem is that the planner's not choosing it,
 then this is going to take more information to resolve.

                        regards, tom lane



Re: [PERFORM] Vacuuming problems on TOAST table

2012-02-08 Thread Ofer Israeli
Tom Lane wrote:
 Ofer Israeli of...@checkpoint.com writes:
 During our testing we see that the table size increases
 substantially.  When looking at the autovacuum log, set with default
 configuration, it seems that it ran for around 60 seconds (see below
 and note that this was a 1-minute test, i.e. only 100 updates)!   
 
 autovacuum is intended to run fairly slowly, so as to not consume too
 much resources.  If you think it's too slow you can adjust the
 autovacuum_cost tunables.  
 
 When we set a higher cost limit for autovacuum (we tried a value of 2000),
 it ran for even longer: ~400 seconds!
 
 That's the wrong direction, no?

The settings we used were not in the postgresql.conf file, but rather an update
of the pg_autovacuum table, where we set vac_cost_limit to 2000. The reason
being that we wanted this definition only for the big (TOASTed) table I was
referring to.
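
For the record, a sketch of that kind of per-table override on the pre-8.4
pg_autovacuum catalog (assuming a row for the table already exists; otherwise
one must be INSERTed):

    UPDATE pg_autovacuum
    SET vac_cost_limit = 2000
    WHERE vacrelid = 'pg_toast.pg_toast_17881'::regclass;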

The logged settings in the ~400 second case were:
autovac_balance_cost(pid=6224 db=16385, rel=17881, cost_limit=10, cost_delay=1)

This comes as quite a surprise, as it seems that the cost_limit we set is not
being applied. Or am I missing something?


Thanks,
Ofer




Re: [PERFORM] pl/pgsql functions outperforming sql ones?

2012-02-08 Thread Robert Haas
On Sat, Jan 28, 2012 at 11:20 PM, Carlo Stonebanks
stonec.regis...@sympatico.ca wrote:
 Update: The main stored function in question and all of its
 sub-functions were recoded to new pure sql functions.

 I then stub tested the sub functions sql vs. plpgsql.

 Here were the results for new sql vs old plpgsql:

 Individual sub functions tested 20-30% faster

 But the main function calling new sql sub functions ran 100% slower

 So I tried this:

 I modified the old plpgsql function to call the new sql sub functions.

 THAT ran 20-30% faster than the unmodified version.

 That modified function is listed below. All the functions ending in 2 are
 the new SQL versions.

One advantage of PL/pgSQL for code like this is that you can compute
values once and save them in variables.  SQL doesn't have variables,
so you can end up repeating the same SQL in multiple places (causing
multiple evaluations); even if you manage to avoid that, the system
can inline things in multiple places and produce the same effect.
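
A minimal sketch of the contrast, with expensive() standing in as a
hypothetical costly function (it must exist before these definitions compile):

    -- PL/pgSQL: compute once, reuse via a variable
    CREATE FUNCTION f_plpgsql(x int) RETURNS int AS $$
    DECLARE
        v int;
    BEGIN
        v := expensive(x);   -- hypothetical helper, evaluated once
        RETURN v + v;
    END $$ LANGUAGE plpgsql;

    -- SQL: no variables, so the expression is written (and may be
    -- evaluated) twice; inlining can multiply evaluations further
    CREATE FUNCTION f_sql(x int) RETURNS int AS $$
        SELECT expensive(x) + expensive(x);
    $$ LANGUAGE sql;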

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [PERFORM] index scan forward vs backward = speed difference of 357X slower!

2012-02-08 Thread Merlin Moncure
On Wed, Feb 8, 2012 at 1:58 PM, Kevin Traster
ktras...@freshgrillfoods.com wrote:
 Typo: Work_mem = 32 MB

 The definition for both column and index:
  shareschange                  | numeric |
 changes_shareschange btree (shareschange)

 Index created using: CREATE INDEX changes_shareschange ON 
 changes(shareschange);

 The entire table is created nightly (and analyzed afterwards), and
 used only for reporting - there are no updates/deletes, so there shouldn't
 be any dead rows in the table.
 Likewise, there are no nulls in the column.

 Please elaborate on:

You haven't shown us the index definition, but I gather from
 the fact that the scan condition is just a Filter (not an Index Cond)
 that the index itself doesn't offer any clue as to whether a given row
 meets those conditions

 Are you saying it is the retrieval of the 15 physically randomly located
 rows to meet the ascending condition that causes the 5 sec difference?
 The table is not clustered, so access is random for descending also.

 The condition is shareschange ascending; I have an index for that
 condition, and the planner is using it.

This is not a problem with dead rows; rather, the index is not really
satisfying your query, and the database has to look through an
indeterminate number of rows until the 'limit 15' is satisfied.  Yeah,
backwards scans are slower, especially for disk-bound scans, but you
also have to consider how many filter misses you have.  The smoking
gun is here:

Index Scan Backward using changes_shareschange on changes
(cost=0.00..925150.26 rows=181997 width=98) (actual time=3.161..15.843
rows=15 loops=1)
Filter: ((activity = ANY ('{4,5}'::integer[])) AND (mfiled = $1))

When you see Filter: xyz, xyz is what each record has to be compared
against after the index pointed you to an area(s) in the heap.  It's
pure luck, going forwards or backwards, that determines how many records
you have to look through to get 15 good ones (as defined by satisfying
the filter).  To prove that one way or the other, you can convert your
WHERE clause to a boolean-returning expression (and bump the limit
appropriately) to see how many records get filtered out.
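
A sketch of that diagnostic, with a literal standing in for the plan's $1
parameter:

    SELECT (activity = ANY ('{4,5}'::integer[])) AND mfiled = 100 AS passes_filter,
           count(*)
    FROM (SELECT activity, mfiled
          FROM changes
          ORDER BY shareschange ASC
          LIMIT 5000) s        -- limit bumped, per the suggestion
    GROUP BY 1;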

merlin



Re: [PERFORM] index scan forward vs backward = speed difference of 357X slower!

2012-02-08 Thread Kevin Traster
 This is not a problem with dead rows; rather, the index is not really
 satisfying your query, and the database has to look through an
 indeterminate number of rows until the 'limit 15' is satisfied.  Yeah,
 backwards scans are slower, especially for disk-bound scans, but you
 also have to consider how many filter misses you have.  The smoking
 gun is here:

 Index Scan Backward using changes_shareschange on changes
 (cost=0.00..925150.26 rows=181997 width=98) (actual time=3.161..15.843
 rows=15 loops=1)
 Filter: ((activity = ANY ('{4,5}'::integer[])) AND (mfiled = $1))

 When you see Filter: xyz, xyz is what each record has to be compared
 against after the index pointed you to an area(s) in the heap.  It's
 pure luck, going forwards or backwards, that determines how many records
 you have to look through to get 15 good ones (as defined by satisfying
 the filter).  To prove that one way or the other, you can convert your
 WHERE clause to a boolean-returning expression (and bump the limit
 appropriately) to see how many records get filtered out.

 merlin

I have indexes also on activity and mfiled (both btree) - wouldn't the
database use them? - Kevin



Re: [PERFORM] index scan forward vs backward = speed difference of 357X slower!

2012-02-08 Thread Kevin Grittner
Kevin Traster ktras...@freshgrillfoods.com wrote:
 
 I have indexes also on activity and mfiled (both btree) - wouldn't
 the database use them? - Kevin
 
It will use them if they are part of the plan which had the lowest
cost when it compared the costs of all possible plans.
 
You haven't really shown us the schema, so there's more guesswork
involved in trying to help you than there could be.  This page might
be worth reviewing:
 
http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
In particular, if there are indexes that aren't being used which you
think should be, there is a good chance that either there is a type
mismatch or your costing factors may need adjustment.
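
For example (query shapes borrowed from earlier in this thread; values
illustrative only):

    -- type mismatch: make sure the literal's type matches the column's
    EXPLAIN SELECT * FROM changes WHERE activity = ANY ('{4,5}'::integer[]);

    -- costing: see whether the plan changes under a session-local tweak
    SET random_page_cost = 2.0;
    EXPLAIN SELECT * FROM changes WHERE activity = ANY ('{4,5}'::integer[]);
    RESET random_page_cost;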
 
-Kevin



Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Josh Berkus
On 2/7/12 4:59 PM, Peter van Hardenberg wrote:
 Per the thread from last month, I've updated the default
 random_page_cost on Heroku Postgres to reduce the expected cost of a
 random_page on all new databases.

This is because Heroku uses AWS storage, which has fast seeks but poor
throughput compared to internal disk on a standard system, BTW.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Scott Marlowe
On Wed, Feb 8, 2012 at 5:50 PM, Josh Berkus j...@agliodbs.com wrote:
 On 2/7/12 4:59 PM, Peter van Hardenberg wrote:
 Per the thread from last month, I've updated the default
 random_page_cost on Heroku Postgres to reduce the expected cost of a
 random_page on all new databases.

 This is because Heroku uses AWS storage, which has fast seeks but poor
 throughput compared to internal disk on a standard system, BTW.

Also, judging by the other thread, it might be better to stop somewhere
closer to 1.2 to 1.4.



Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Peter van Hardenberg
Having read the thread, I don't really see how I could study what a
more principled value would be.

That said, I have access to a very large fleet from which I can collect
data, so I'm all ears for suggestions about how to measure and would
gladly share the results with the list.

Peter

On Wed, Feb 8, 2012 at 5:39 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Wed, Feb 8, 2012 at 5:50 PM, Josh Berkus j...@agliodbs.com wrote:
 On 2/7/12 4:59 PM, Peter van Hardenberg wrote:
 Per the thread from last month, I've updated the default
 random_page_cost on Heroku Postgres to reduce the expected cost of a
 random_page on all new databases.

 This is because Heroku uses AWS storage, which has fast seeks but poor
 throughput compared to internal disk on a standard system, BTW.

 Also, judging by the other thread, it might be better to stop somewhere
 closer to 1.2 to 1.4.




-- 
Peter van Hardenberg
San Francisco, California
Everything was beautiful, and nothing hurt. -- Kurt Vonnegut



Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Scott Marlowe
On Wed, Feb 8, 2012 at 6:45 PM, Peter van Hardenberg p...@pvh.ca wrote:
 Having read the thread, I don't really see how I could study what a
 more principled value would be.

Agreed.  Just pointing out more research needs to be done.

 That said, I have access to a very large fleet from which I can collect
 data, so I'm all ears for suggestions about how to measure and would
 gladly share the results with the list.

I wonder if some kind of script that grabbed random queries and ran
them with explain analyze and various random_page_cost to see when
they switched and which plans are faster would work?



Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Marcos Ortiz Valmaseda

On 08/02/12 21:15, Peter van Hardenberg wrote:

Having read the thread, I don't really see how I could study what a
more principled value would be.

That said, I have access to a very large fleet from which I can collect
data, so I'm all ears for suggestions about how to measure and would
gladly share the results with the list.

Peter

On Wed, Feb 8, 2012 at 5:39 PM, Scott Marlowe scott.marl...@gmail.com wrote:

On Wed, Feb 8, 2012 at 5:50 PM, Josh Berkus j...@agliodbs.com wrote:

On 2/7/12 4:59 PM, Peter van Hardenberg wrote:

Per the thread from last month, I've updated the default
random_page_cost on Heroku Postgres to reduce the expected cost of a
random_page on all new databases.

This is because Heroku uses AWS storage, which has fast seeks but poor
throughput compared to internal disk on a standard system, BTW.

Also, judging by the other thread, it might be better to stop somewhere
closer to 1.2 to 1.4.




You can execute several queries with the three different values provided
by Scott and Josh:

- SET random_page_cost = 2.0
First execution of the queries with EXPLAIN ANALYZE
- SET random_page_cost = 1.4
Second execution of the queries with EXPLAIN ANALYZE
- SET random_page_cost = 1.2
Third execution of the queries with EXPLAIN ANALYZE
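
Concretely, a session-local sketch of that procedure (the query is a stand-in
borrowed from earlier in this digest; substitute your own workload):

    SET random_page_cost = 2.0;
    EXPLAIN ANALYZE SELECT * FROM changes WHERE activity = 4;  -- record plan and timing
    SET random_page_cost = 1.4;
    EXPLAIN ANALYZE SELECT * FROM changes WHERE activity = 4;
    SET random_page_cost = 1.2;
    EXPLAIN ANALYZE SELECT * FROM changes WHERE activity = 4;
    RESET random_page_cost;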

And then you can compare the patterns behind these query executions.
Regards,

--
Marcos Luis Ortíz Valmaseda
 Sr. Software Engineer (UCI)
 http://marcosluis2186.posterous.com
 http://www.linkedin.com/in/marcosluis2186
 Twitter: @marcosluis2186






Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Peter van Hardenberg
On Wed, Feb 8, 2012 at 6:28 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Wed, Feb 8, 2012 at 6:45 PM, Peter van Hardenberg p...@pvh.ca wrote:
 That said, I have access to a very large fleet from which I can collect
 data, so I'm all ears for suggestions about how to measure and would
 gladly share the results with the list.

 I wonder if some kind of script that grabbed random queries and ran
 them with explain analyze and various random_page_cost to see when
 they switched and which plans are faster would work?

We aren't exactly in a position where we can adjust random_page_cost
on our users' databases arbitrarily to see what breaks. That would
be... irresponsible of us.

How would one design a meta-analyzer which we could run across many
databases and collect data? Could we perhaps collect useful
information from pg_stat_user_indexes, for example?
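
One hedged starting point for such a signal, using only the stats collector:

    -- tables read mostly by sequential scans despite traffic; a high
    -- seq_share may hint that index plans are being costed out
    SELECT relname, seq_scan, idx_scan,
           round(seq_scan::numeric / nullif(seq_scan + idx_scan, 0), 2) AS seq_share
    FROM pg_stat_user_tables
    ORDER BY seq_scan DESC
    LIMIT 20;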

-p

-- 
Peter van Hardenberg
San Francisco, California
Everything was beautiful, and nothing hurt. -- Kurt Vonnegut



Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Peter van Hardenberg
On Wed, Feb 8, 2012 at 6:47 PM, Peter van Hardenberg p...@pvh.ca wrote:
 On Wed, Feb 8, 2012 at 6:28 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Wed, Feb 8, 2012 at 6:45 PM, Peter van Hardenberg p...@pvh.ca wrote:
 That said, I have access to a very large fleet from which I can collect
 data, so I'm all ears for suggestions about how to measure and would
 gladly share the results with the list.

 I wonder if some kind of script that grabbed random queries and ran
 them with explain analyze and various random_page_cost to see when
 they switched and which plans are faster would work?

 We aren't exactly in a position where we can adjust random_page_cost
 on our users' databases arbitrarily to see what breaks. That would
 be... irresponsible of us.


Oh, of course we could do this on the session, but executing
potentially expensive queries would still be unneighborly.

Perhaps another way to think of this problem would be that we want to
find queries where the cost estimate is inaccurate.

-- 
Peter van Hardenberg
San Francisco, California
Everything was beautiful, and nothing hurt. -- Kurt Vonnegut



Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Scott Marlowe
On Wed, Feb 8, 2012 at 7:54 PM, Peter van Hardenberg p...@pvh.ca wrote:
 On Wed, Feb 8, 2012 at 6:47 PM, Peter van Hardenberg p...@pvh.ca wrote:
 On Wed, Feb 8, 2012 at 6:28 PM, Scott Marlowe scott.marl...@gmail.com 
 wrote:
 On Wed, Feb 8, 2012 at 6:45 PM, Peter van Hardenberg p...@pvh.ca wrote:
 That said, I have access to a very large fleet from which I can collect
 data, so I'm all ears for suggestions about how to measure and would
 gladly share the results with the list.

 I wonder if some kind of script that grabbed random queries and ran
 them with explain analyze and various random_page_cost to see when
 they switched and which plans are faster would work?

 We aren't exactly in a position where we can adjust random_page_cost
 on our users' databases arbitrarily to see what breaks. That would
 be... irresponsible of us.


 Oh, of course we could do this on the session, but executing
 potentially expensive queries would still be unneighborly.

 Perhaps another way to think of this problem would be that we want to
 find queries where the cost estimate is inaccurate.

Yeah, have a script the users run for you Heroku guys in their spare
time to see what queries are using the most time, and then jangle
random_page_cost while running them to get an idea of what's faster
and why.



Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Greg Sabino Mullane



 You can execute several queries with the three different values provided 
 by Scott and Josh.
 - SET random_page_cost  = 2.0
 First execution of the queries with EXPLAIN ANALYZE
 - SET random_page_cost  = 1.4
 Second execution of the queries with EXPLAIN ANALYZE
 - SET random_page_cost  = 1.2
 Third execution of the queries with EXPLAIN ANALYZE

Well, such a tool would ideally be smarter than that, such that 
you would run EXPLAIN and compare to the previous plan and 
only run EXPLAIN ANALYZE if the plan changed. One could even 
decrement rpc slowly and find out at what point it changes, 
which would be more interesting than testing arbitrary numbers.
Would lead to some really sweet graphs as well. :)
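
A rough sketch of that loop as a DO block (PostgreSQL 9.0+; the query is a
stand-in and the step size is arbitrary):

    DO $$
    DECLARE
        rpc  numeric := 4.0;
        r    record;
        plan text;
        prev text := '';
    BEGIN
        WHILE rpc >= 1.0 LOOP
            -- transaction-local, so nothing leaks past the DO block
            PERFORM set_config('random_page_cost', rpc::text, true);
            plan := '';
            FOR r IN EXECUTE 'EXPLAIN SELECT * FROM changes WHERE activity = 4' LOOP
                plan := plan || r."QUERY PLAN" || E'\n';
            END LOOP;
            IF plan <> prev THEN
                RAISE NOTICE 'plan changed at random_page_cost = %', rpc;
                prev := plan;
            END IF;
            rpc := rpc - 0.1;
        END LOOP;
    END $$;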

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201202082338
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8






Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-08 Thread Marcos Ortiz Valmaseda

On 09/02/12 00:09, Greg Sabino Mullane wrote:




You can execute several queries with the three different values provided
by Scott and Josh.
- SET random_page_cost  = 2.0
First execution of the queries with EXPLAIN ANALYZE
- SET random_page_cost  = 1.4
Second execution of the queries with EXPLAIN ANALYZE
- SET random_page_cost  = 1.2
Third execution of the queries with EXPLAIN ANALYZE

Well, such a tool would ideally be smarter than that, such that
you would run EXPLAIN and compare to the previous plan and
only run EXPLAIN ANALYZE if the plan changed. One could even
decrement rpc slowly and find out at what point it changes,
which would be more interesting than testing arbitrary numbers.
Would lead to some really sweet graphs as well. :)

Well, the MyYearBook.com guys built something similar called Posuta. I
don't know if the project is still alive, but we can ask them
(michael.glaesem...@myyearbook.com).

http://area51.myyearbook.com

Posuta could be a starting point for it. It uses Ruby and Clojure for core
functionality, and jQuery/Flot for graphics.


--
Marcos Luis Ortíz Valmaseda
 Sr. Software Engineer (UCI)
 http://marcosluis2186.posterous.com
 http://www.linkedin.com/in/marcosluis2186
 Twitter: @marcosluis2186




