From: Rick Otten   Sent: Monday, September 26, 2016 3:24 AM
Are the tables constantly being written to, or is this a mostly read scenario? 

 

With regards to consistent query performance, I think you need to get out of 
AWS.  That environment is terrible if you are going for consistency unless you 
buy dedicated hardware, and then you are paying so much money it is ridiculous.

 

Also I think having 10M rows in a table is not a problem for the query times 
you are referring to.  So instead of millions of tables, unless I'm doing my 
math wrong, you probably only need thousands of tables.

----------

Excellent thoughts:  the read/write behavior will (and should) drive a lot of the 
design;  AWS does not guarantee consistency or latency;  and 10M rows is 
nothing to PG.

 

Re AWS:  we’re on it, at least for now.  In my profiling of our performance 
there, latencies are consistently low; there will always be the occasional 
random spike, but the statistical average stays low.  I just ran a quick test 
against a modest-sized table on a modest-sized EC2 instance (m4.xlarge: 4 
cores, 16 GB RAM, 3 TB SSD).  The table has 15M rows but is large on disk (it 
represents nearly 500M rows compressed into jsonb documents), with 5 indexed 
key columns and 12 columns total.  Querying for a single, non-PK, indexed 
value using “select *” (so it included the jsonb) took 22ms; without the jsonb 
it took 11ms.  Especially with the db/memory-optimized EC2 instances now 
available (with guaranteed IOPS), performance against even 100M-row tables 
should still stay within your requirements.
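
For concreteness, the test looked roughly like the sketch below.  The table 
and column names are made up for illustration (not our actual schema); the 
shape is what matters: a dozen columns, five of them indexed keys, plus a 
jsonb column holding the packed source rows (roughly 30 source rows per 
document, which is how 15M table rows stand in for ~500M source rows).

    -- Hypothetical schema, for illustration only
    CREATE TABLE sample_results (
        result_id   bigserial    PRIMARY KEY,
        subject_id  bigint       NOT NULL,
        assay_id    bigint       NOT NULL,
        batch_id    bigint       NOT NULL,
        run_date    timestamptz  NOT NULL,
        source_tag  text         NOT NULL,
        payload     jsonb        NOT NULL   -- packed source rows, ~30 per document
        -- plus a handful of other scalar columns to reach 12 total
    );

    CREATE INDEX ON sample_results (subject_id);
    CREATE INDEX ON sample_results (assay_id);
    CREATE INDEX ON sample_results (batch_id);
    CREATE INDEX ON sample_results (run_date);
    CREATE INDEX ON sample_results (source_tag);

    -- Timed queries against a single, non-PK, indexed value:
    EXPLAIN ANALYZE
    SELECT *                                    -- includes the jsonb payload (~22ms)
    FROM   sample_results
    WHERE  batch_id = 12345;

    EXPLAIN ANALYZE
    SELECT result_id, subject_id, assay_id,
           batch_id, run_date, source_tag      -- jsonb excluded (~11ms)
    FROM   sample_results
    WHERE  batch_id = 12345;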

 

So Rick’s point about not needing millions of tables is right on.  If there’s a 
way to create table “clumps”, at least you’ll have a more modest table count.
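
One possible shape for that clumping (purely a sketch on my part, not 
something anyone in the thread has specified): hash each logical dataset’s id 
into a fixed number of physical tables, so millions of logical groupings 
collapse into a few thousand physical tables of roughly 10M rows each.  In 
current PostgreSQL (9.x) the routing would live in the application or a 
PL/pgSQL helper, since there’s no declarative partitioning yet.

    -- Hypothetical "clump" tables: data_0000 .. data_3999 all share one shape
    CREATE TABLE data_0000 (
        dataset_id  bigint       NOT NULL,
        ts          timestamptz  NOT NULL,
        doc         jsonb        NOT NULL
    );
    CREATE INDEX ON data_0000 (dataset_id, ts);

    -- Route a dataset to its clump by hashing its id into 4,000 buckets
    -- (the double modulo keeps the bucket non-negative for negative hashes)
    SELECT 'data_' ||
           lpad(((hashtext('dataset-12345') % 4000 + 4000) % 4000)::text, 4, '0')
           AS target_table;

Queries then always filter on dataset_id within the routed table, so each 
lookup touches one ~10M-row table instead of one of millions of tiny ones.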

 

Mike Sofen (Synthetic Genomics)
