Greg, sorry for the re-send: I had forgotten to include the list.
On Wed, Oct 5, 2016 at 2:34 PM, Greg Spiegelberg wrote:
> Data is not static. The 4M tables fall into one of two groups.
>
> Group A contains 2M tables. INSERT will occur ~100 times/day and maximum
>
On 10/5/16 7:34 AM, Greg Spiegelberg wrote:
When you say "must do a vacuum of the entire database", are you saying
the entire database must be vacuum'd as a whole per 2B transactions or
all tables must be vacuum'd eventually at least once?
All tables at least once. Prior to 9.6, that had to
On Fri, Sep 30, 2016 at 4:49 PM, Jim Nasby wrote:
> On 9/29/16 6:11 AM, Alex Ignatov (postgrespro) wrote:
>
>> With millions of tables you have to set autovacuum_max_workers
>> sky-high =). We have situations where, at thousands of tables,
>> autovacuum can’t
On Thu, Sep 29, 2016 at 4:11 AM, Alex Ignatov (postgrespro) <
a.igna...@postgrespro.ru> wrote:
>
>
> *From:* pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
> ow...@postgresql.org] *On Behalf Of *
>
> Thank you Terry. You get the gold star. :) I was waiting for that to
>
On 9/29/16 6:11 AM, Alex Ignatov (postgrespro) wrote:
With millions of tables you have to set autovacuum_max_workers
sky-high =). We have situations where, at thousands of tables, autovacuum
can’t vacuum all the tables that need it. It simply vacuums some of
the most modified tables and never reaches
From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Greg Spiegelberg
Sent: Tuesday, September 27, 2016 7:28 PM
To: Terry Schmitt <tschm...@schmittworks.com>
Cc: pgsql-performa. <pgsql-performance@postgresql.org>
Subject:
On 26 September 2016 at 05:19, Greg Spiegelberg wrote:
> I did look at PostgresXL and CitusDB. Both are admirable however neither
> could support the need to read a random record consistently under 30ms.
> It's a similar problem Cassandra and others have: network latency.
Greg,
* Greg Spiegelberg (gspiegelb...@gmail.com) wrote:
> On Wed, Sep 28, 2016 at 11:27 AM, Stephen Frost wrote:
> > * Greg Spiegelberg (gspiegelb...@gmail.com) wrote:
> > > Bigger buckets mean a wider possibility of response times. Some buckets
> > > may contain 140k
On Wed, Sep 28, 2016 at 11:27 AM, Stephen Frost wrote:
> Greg,
>
> * Greg Spiegelberg (gspiegelb...@gmail.com) wrote:
> > Bigger buckets mean a wider possibility of response times. Some buckets
> > may contain 140k records and some 100X more.
>
> Have you analyzed the depth
Greg,
* Greg Spiegelberg (gspiegelb...@gmail.com) wrote:
> Bigger buckets mean a wider possibility of response times. Some buckets
> may contain 140k records and some 100X more.
Have you analyzed the depth of the btree indexes to see how many more
pages need to be read to handle finding a row
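A back-of-envelope answer to the index-depth question: B-tree depth grows with the logarithm of the row count, so a bucket 100X larger is often only one level deeper, if that. A rough sketch (the fanout of 256 keys per page is an assumption, not a measured value):

```python
import math

def btree_depth_estimate(rows, keys_per_page=256):
    """Rough B-tree depth estimate: about log_fanout(rows) levels.
    keys_per_page is an assumed average fanout, not a measured one."""
    if rows <= keys_per_page:
        return 1
    return math.ceil(math.log(rows, keys_per_page))

# A 140k-row bucket vs. one 100X larger:
print(btree_depth_estimate(140_000))     # -> 3
print(btree_depth_estimate(14_000_000))  # -> 3
```

With this assumed fanout, both bucket sizes land at the same depth, which is why bucket-size variance may not translate directly into extra page reads.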
On Wed, Sep 28, 2016 at 9:39 AM, Vitalii Tymchyshyn wrote:
> Have you considered having many databases (e.g. 100) and possibly many
> postgresql servers (e.g. 10) started on different ports?
> This would give you 1000x less tables per db.
>
The system design already allows for many
If going that route, why not just use plproxy?
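The many-databases/many-ports layout boils down to deterministic routing of a key to one of 10 x 100 = 1000 shards. A hypothetical sketch of such a router (host names and shard naming are invented for illustration):

```python
import hashlib

# Hypothetical hosts: 10 servers, each holding 100 logical databases.
SERVERS = [f"pg{i}.example.internal:{5432 + i}" for i in range(10)]
DATABASES_PER_SERVER = 100

def route(key: str):
    """Deterministically map a key to a (server, database) pair
    among the 1000 shards."""
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    shard = h % (len(SERVERS) * DATABASES_PER_SERVER)   # 0..999
    server = SERVERS[shard // DATABASES_PER_SERVER]
    database = f"shard_{shard % DATABASES_PER_SERVER:02d}"
    return server, database
```

The same hash-based routing is what plproxy would do server-side; doing it in the client just keeps the extra network hop out of the read path.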
On Wed, Sep 28, 2016 at 11:39 AM, Vitalii Tymchyshyn wrote:
> Have you considered having many databases (e.g. 100) and possibly many
> postgresql servers (e.g. 10) started on different ports?
> This would give you 1000x less tables per
On Tue, Sep 27, 2016 at 10:15 AM, Terry Schmitt
wrote:
>
>
> On Sun, Sep 25, 2016 at 7:50 PM, Greg Spiegelberg
> wrote:
>
>> Hey all,
>>
>> Obviously everyone who's been in PostgreSQL or almost any RDBMS for a
>> time has said not to have
On Sun, Sep 25, 2016 at 7:50 PM, Greg Spiegelberg
wrote:
> Hey all,
>
> Obviously everyone who's been in PostgreSQL or almost any RDBMS for a time
> has said not to have millions of tables. I too have long believed it until
> recently.
>
> AWS d2.8xlarge instance with
On Tue, Sep 27, 2016 at 9:42 AM, Mike Sofen wrote:
> *From:* Mike Sofen *Sent:* Tuesday, September 27, 2016 8:10 AM
>
> *From:* Greg Spiegelberg *Sent:* Monday, September 26, 2016 7:25 AM
> I've gotten more responses than anticipated and have answered some
> questions and
On Tue, Sep 27, 2016 at 8:30 AM, Craig James wrote:
> On Sun, Sep 25, 2016 at 7:50 PM, Greg Spiegelberg
> wrote:
>
>> Hey all,
>>
>> Obviously everyone who's been in PostgreSQL or almost any RDBMS for a
>> time has said not to have millions of
From: Mike Sofen Sent: Tuesday, September 27, 2016 8:10 AM
From: Greg Spiegelberg Sent: Monday, September 26, 2016 7:25 AM
I've gotten more responses than anticipated and have answered some questions
and gotten some insight but my challenge again is what should I capture along
the way to
From: Greg Spiegelberg Sent: Monday, September 26, 2016 7:25 AM
I've gotten more responses than anticipated and have answered some questions
and gotten some insight but my challenge again is what should I capture along
the way to prove or disprove this storage pattern? Alternatives to
On Sun, Sep 25, 2016 at 7:50 PM, Greg Spiegelberg
wrote:
> Hey all,
>
> Obviously everyone who's been in PostgreSQL or almost any RDBMS for a time
> has said not to have millions of tables. I too have long believed it until
> recently.
>
> AWS d2.8xlarge instance with
Jeff Janes writes:
> A problem is that those statistics are stored in one file (per database; it
> used to be one file per cluster). With 8 million tables, that is going to
> be a pretty big file. But the code pretty much assumes the file is going
> to be pretty small, and
On Mon, Sep 26, 2016 at 5:53 AM, Greg Spiegelberg
wrote:
>
>>
>> I may need to understand autovacuum better. My impression was it
> consulted statistics and performed vacuums one table at a time based on the
> vacuum threshold formula on https://www.postgresql.org/
>
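For reference, the formula in the PostgreSQL docs: a table is autovacuumed once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples (stock defaults 50 and 0.2). A small sketch of that trigger condition:

```python
def autovacuum_vacuum_trigger(reltuples, n_dead_tup,
                              threshold=50, scale_factor=0.2):
    """True when a table qualifies for autovacuum:
    dead tuples > autovacuum_vacuum_threshold
                  + autovacuum_vacuum_scale_factor * reltuples
    (defaults shown are the stock PostgreSQL settings)."""
    return n_dead_tup > threshold + scale_factor * reltuples

print(autovacuum_vacuum_trigger(1_000_000, 100_000))  # False: needs > 200,050
print(autovacuum_vacuum_trigger(1_000_000, 300_000))  # True
```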
On Sun, Sep 25, 2016 at 8:50 PM, Greg Spiegelberg
wrote:
> Hey all,
>
> Obviously everyone who's been in PostgreSQL or almost any RDBMS for a time
> has said not to have millions of tables. I too have long believed it until
> recently.
>
> AWS d2.8xlarge instance with
On 26 September 2016 at 20:51, Greg Spiegelberg
wrote:
>
> An alternative if you exhaust or don't trust other options, use a foreign
>> data wrapper to access your own custom storage. A single table at the PG
>> level, you can shard the data yourself into 8 bazillion
On Mon, Sep 26, 2016 at 4:23 AM, Rick Otten
wrote:
> Are the tables constantly being written to, or is this a mostly read
> scenario? One architecture possibility, if the writes are not so
> frequent, is to create just a handful of very big tables for writing, and
>
On Mon, Sep 26, 2016 at 7:05 AM, Mike Sofen wrote:
> *From:* Rick Otten *Sent:* Monday, September 26, 2016 3:24 AM
> Are the tables constantly being written to, or is this a mostly read
> scenario?
>
>
>
> With regards to consistent query performance, I think you need to get
Consider the problem though. Random access to trillions of records with no
guarantee any one will be fetched twice in a short time frame nullifies the
effectiveness of a cache unless the cache is enormous. If such a cache
were that big, 100's of TB's, I wouldn't be looking at on-disk storage
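A quick sanity check of that cache-size claim, under assumed numbers (2 trillion records, roughly 100 bytes cached per record; both figures are illustrative, not from the thread):

```python
records = 2_000_000_000_000   # "trillions of records" (assumed 2T)
bytes_per_record = 100        # assumed average cached row size
cache_bytes = records * bytes_per_record

print(cache_bytes / 1e12)     # -> 200.0 TB, i.e. hundreds of TB
```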
On Mon, Sep 26, 2016 at 3:43 AM, Stuart Bishop
wrote:
> On 26 September 2016 at 11:19, Greg Spiegelberg
> wrote:
>
>> I did look at PostgresXL and CitusDB. Both are admirable however neither
>> could support the need to read a random record
From: Rick Otten Sent: Monday, September 26, 2016 3:24 AM
Are the tables constantly being written to, or is this a mostly read scenario?
With regards to consistent query performance, I think you need to get out of
AWS. That environment is terrible if you are going for consistency unless
Something that is not talked about at all in this thread is caching. A bunch
of memcache servers in front of the DB should be able to help with the 30ms
constraint (doesn't have to be memcache, some caching technology).
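The caching suggestion is the classic cache-aside pattern. A minimal sketch, with an in-process dict standing in for memcache (a real deployment would use a memcache or Redis client instead):

```python
import time

class CacheAside:
    """Cache-aside sketch: check the cache first, fall back to the
    database loader on a miss, store the result with a TTL."""
    def __init__(self, loader, ttl=60.0):
        self.loader = loader   # e.g. a function doing the SQL lookup
        self.ttl = ttl
        self._store = {}

    def get(self, key):
        hit = self._store.get(key)
        if hit is not None and time.monotonic() < hit[0]:
            return hit[1]                                 # cache hit
        value = self.loader(key)                          # slow DB read
        self._store[key] = (time.monotonic() + self.ttl, value)
        return value

calls = []
cache = CacheAside(loader=lambda k: calls.append(k) or f"row-{k}")
assert cache.get("42") == "row-42"
assert cache.get("42") == "row-42"   # served from cache
assert calls == ["42"]               # loader hit the DB only once
```

Whether this helps depends entirely on access locality, which is exactly the objection raised elsewhere in the thread about uniformly random reads over trillions of rows.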
--
http://yves.zioup.com
gpg: 4096R/32B0F416
Following list etiquette, response inline ;)
On Mon, Sep 26, 2016 at 2:28 AM, Álvaro Hernández Tortosa
wrote:
>
>
> On 26/09/16 05:50, Greg Spiegelberg wrote:
>
>> Hey all,
>>
>> Obviously everyone who's been in PostgreSQL or almost any RDBMS for a
>> time has said not to have
Are the tables constantly being written to, or is this a mostly read
scenario? One architecture possibility, if the writes are not so
frequent, is to create just a handful of very big tables for writing, and
then make smaller tables as materialized views for reading. The vacuum and
bloat
On 26 September 2016 at 11:19, Greg Spiegelberg
wrote:
> I did look at PostgresXL and CitusDB. Both are admirable however neither
> could support the need to read a random record consistently under 30ms.
> It's a similar problem Cassandra and others have: network
On 26/09/16 05:50, Greg Spiegelberg wrote:
Hey all,
Obviously everyone who's been in PostgreSQL or almost any RDBMS for a
time has said not to have millions of tables. I too have long
believed it until recently.
AWS d2.8xlarge instance with 9.5 is my test rig using XFS on EBS (io1)
for
-sorry for my last email, which was also not bottom-posted-
Hi Greg,
On Mon, Sep 26, 2016 at 11:19 AM, Greg Spiegelberg
wrote:
> I did look at PostgresXL and CitusDB. Both are admirable however neither
> could support the need to read a random record consistently under
On Sun, Sep 25, 2016 at 7:50 PM, Greg Spiegelberg
wrote:
> Hey all,
>
> Obviously everyone who's been in PostgreSQL or almost any RDBMS for a time
> has said not to have millions of tables. I too have long believed it until
> recently.
>
> AWS d2.8xlarge instance with
Hi Greg,
Please follow the conventions of this mailing list, to avoid confusion
- see bottom of this posting for further comments
On 26/09/16 17:05, Greg Spiegelberg wrote:
Precisely why I shared with the group. I must understand the risks
involved. I need to explore if it can be stable
I did look at PostgresXL and CitusDB. Both are admirable; however, neither
could support the need to read a random record consistently under 30ms.
It's a similar problem Cassandra and others have: network latency. At this
scale, to provide the ability to access any given record amongst trillions
Precisely why I shared with the group. I must understand the risks
involved. I need to explore whether it can be stable at this size and, if
so, when it becomes unstable. Aside from locking down user access to
superuser, is there a way to prohibit database-wide VACUUM & ANALYZE?
Certainly putting my trust
From: Greg Spiegelberg Sent: Sunday, September 25, 2016 7:50 PM
… Over the weekend, I created 8M tables with 16M indexes on those tables.
… A system or database crash could take potentially hours to days to recover.
There are likely other issues ahead.
You may wonder, "why is Greg
Dear Greg,
Have you checked PostgresXL ?
With millions of tables, how does the app choose which table is
appropriate? In my opinion, at that scale it should use parallel queries
with data sharding, as PostgresXL does.
Thanks,
Julyanto SUTANDANG
Equnix Business Solutions, PT
(An Open