[PERFORM] Performance on large, append-only tables

2012-02-10 Thread David Yeu
Hi there, We've got a pretty large table that sees millions of new rows a day, and we're trying our best to optimize queries against it. We're hoping to find some guidance on this list. Thankfully, the types of queries that we perform against this table are pretty constrained. We never update

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread Merlin Moncure
On Wed, Feb 8, 2012 at 12:03 PM, David Yeu david@skype.net wrote: Hi there, We've got a pretty large table that sees millions of new rows a day, and we're trying our best to optimize queries against it. We're hoping to find some guidance on this list. Thankfully, the types of queries

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread Claudio Freire
On Wed, Feb 8, 2012 at 3:03 PM, David Yeu david@skype.net wrote: Thankfully, the types of queries that we perform against this table are pretty constrained. We never update rows and we never join against other tables. The table essentially looks like this: | id | group_id | created_at |

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread Marti Raudsepp
On Wed, Feb 8, 2012 at 20:03, David Yeu david@skype.net wrote:  * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;  * Pages of twenty rows. A good improvement for this sort of queries is the scalable paging trick. Instead of increasing the OFFSET argument -- which means that

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread Tom Lane
David Yeu david@skype.net writes: Our queries essentially fall into the following cases: * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20; * Š WHERE group_id = ? AND id ? ORDER BY created_at DESC; * Š WHERE group_id = ? AND id ? ORDER BY created_at DESC LIMIT 20; * Š WHERE

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread Kevin Grittner
David Yeu david@skype.net wrote: We have indices against the primary key and the group_id. Our queries essentially fall into the following cases: * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20; * Š WHERE group_id = ? AND id ? ORDER BY created_at DESC; * Š WHERE group_id

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread David Yeu
Yeah, Reply-All... Begin forwarded message: From: David Yeu david@skype.net Subject: Re: [PERFORM] Performance on large, append-only tables Date: February 10, 2012 10:59:04 AM EST To: Merlin Moncure mmonc...@gmail.com On Feb 10, 2012, at 10:19 AM, Merlin Moncure wrote: You can

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread Claudio Freire
On Fri, Feb 10, 2012 at 1:19 PM, David Yeu david@skype.net wrote: = EXPLAIN ANALYZE SELECT  lines.* FROM lines WHERE (lines.deleted_at IS NULL) AND (lines.group_id = ?) AND (id ?) ORDER BY id DESC LIMIT 20 OFFSET 0; Interesting... Do you have many deleted rows? Do you always filter

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread David Yeu
On Feb 10, 2012, at 11:26 AM, Claudio Freire wrote: That, and an index on (group_id, created_at) where (deleted_at IS NULL) to catch the sorted by date kind of query, and you'll be done I think. Yeah, I didn't quite get that right -- we're actually sorting all these queries by id DESC, not

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread Claudio Freire
On Fri, Feb 10, 2012 at 1:45 PM, David Yeu david@skype.net wrote: On Feb 10, 2012, at 11:26 AM, Claudio Freire wrote: That, and an index on (group_id, created_at) where (deleted_at IS NULL) to catch the sorted by date kind of query, and you'll be done I think. Yeah, I didn't quite get

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread David Yeu
On Feb 10, 2012, at 11:58 AM, Claudio Freire wrote: From your OP: * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20; Yup, sorry. Dave -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription:

Re: [PERFORM] Performance on large, append-only tables

2012-02-10 Thread Claudio Freire
On Fri, Feb 10, 2012 at 2:00 PM, David Yeu david@skype.net wrote: From your OP: * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20; Yup, sorry. Ah, ok, so that should do it. If you need further improvement, remember to take a look at the deleted stuff. -- Sent via

Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-10 Thread Josh Berkus
On 2/9/12 2:41 PM, Peter van Hardenberg wrote: Hmm, perhaps we could usefully aggregate auto_explain output. The other option is to take a statistical approach. After all, what you want to do is optimize average response times across all your user's databases, not optimize for a few specific

Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-10 Thread Cédric Villemain
Le vendredi 10 février 2012 20:32:50, Josh Berkus a écrit : On 2/9/12 2:41 PM, Peter van Hardenberg wrote: Hmm, perhaps we could usefully aggregate auto_explain output. The other option is to take a statistical approach. After all, what you want to do is optimize average response times

Re: [PERFORM] timestamp with time zone

2012-02-10 Thread Alessandro Gagliardi
Hm. Tried running ANALYZE. Took almost 10 minutes to run. (Don't know if it would have been run automatically since I last tried this yesterday, but figured it couldn't hurt.) Still, no difference: http://explain.depesz.com/s/xHq Actually, it's 10x worse (maybe because this is my first time

Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-10 Thread Peter van Hardenberg
On Fri, Feb 10, 2012 at 11:32 AM, Josh Berkus j...@agliodbs.com wrote: On 2/9/12 2:41 PM, Peter van Hardenberg wrote: So one thought would be to add in pg_stat_statements to your platform ... something I'd like to see Heroku do anyway.  Then you can sample this across dozens (or hundreds) of

Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-10 Thread Josh Berkus
Peter, We've funded some work by Peter Geoghegan to make pg_stat_statements more useful, but the patch is currently sitting in the commitfest in need of a champion. I'd very much like to see it landed. Ok, let me review it then ... Between that work, 9.2, and Dimitri's extension whitelist

Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres

2012-02-10 Thread Peter van Hardenberg
On Fri, Feb 10, 2012 at 5:40 PM, Josh Berkus j...@agliodbs.com wrote: Peter, We've funded some work by Peter Geoghegan to make pg_stat_statements more useful, but the patch is currently sitting in the commitfest in need of a champion. I'd very much like to see it landed. Ok, let me review

Re: [PERFORM] rough benchmarks, sata vs. ssd

2012-02-10 Thread CSS
On Feb 3, 2012, at 6:23 AM, Ivan Voras wrote: On 31/01/2012 09:07, CSS wrote: Hello all, Just wanted to share some results from some very basic benchmarking runs comparing three disk configurations on the same hardware: http://morefoo.com/bench.html That's great! Thanks. I did