Re: [PERFORM] Millions of tables

Greg Spiegelberg Tue, 27 Sep 2016 08:51:40 -0700

On Tue, Sep 27, 2016 at 9:42 AM, Mike Sofen <mso...@runbox.com> wrote:


> *From:* Mike Sofen   *Sent:* Tuesday, September 27, 2016 8:10 AM
>
> *From:* Greg Spiegelberg   *Sent:* Monday, September 26, 2016 7:25 AM
> I've gotten more responses than anticipated and have answered some
> questions and gotten some insight but my challenge again is what should I
> capture along the way to prove or disprove this storage pattern?
> Alternatives to the storage pattern aside, I need ideas to test rig,
> capture metrics and suggestions to tune it.
>
>
>
> In the next 24 hours, I will be sending ~1 trillion records to the test
> database.  Because of time to set up, I'd rather have things set up
> properly the first go.
>
>
>
> Thanks!
>
> -Greg
>
> ---------------------
>
> Greg, I ran another quick test on a wider table than you’ve described, but
> this time with 80 million rows, with core counts, ram and ssd storage
> similar to what you’d have on that AWS EC2 instance.  This table had 7
> columns (3 integers, 3 text, 1 timestamptz) with an average width of 157
> chars, one btree index on the pk int column.  Using explain analyze, I
> picked one id value out of the 80m and ran a select * where id = x.  It did
> an index scan, had a planning time of 0.077ms, and an execution time of
> 0.254 seconds.  I ran the query for a variety of widely spaced values (so
> the data was uncached) and the timing never changed. This has been
> mirroring my general experience with PG – very fast reads on indexed
> queries.
>
>
>
> Summary:  I think your buckets can be WAY bigger than you are envisioning
> for the simple table design you’ve described.  I’m betting you can easily
> do 500 million rows per bucket before approaching anything close to the
> 30ms max query time.
>
>
>
> Mike Sofen (Synthetic Genomics)
>
>
>
> Totally typo’d the execution time:  it was 0.254 MILLISECONDS, not
> SECONDS.  Thus my comment about going up 10x in bucket size instead of
> appearing to be right at the limit.  Sorry!
>
>
>
I figured.  :)

Haven't ruled it out but expectations of this implementation is to perform
at worst 3X slower than memcache or Redis.

Bigger buckets mean a wider possibility of response times.  Some buckets
may contain 140k records and some 100X more.

-Greg

Re: [PERFORM] Millions of tables

Reply via email to