On Mon, Sep 26, 2016 at 3:43 AM, Stuart Bishop <stu...@stuartbishop.net> wrote:
> On 26 September 2016 at 11:19, Greg Spiegelberg <gspiegelb...@gmail.com>
> wrote:
>
>> I did look at PostgresXL and CitusDB. Both are admirable however neither
>> could support the need to read a random record consistently under 30ms.
>> It's a similar problem Cassandra and others have: network latency. At this
>> scale, to provide the ability to access any given record amongst trillions
>> it is imperative to know precisely where it is stored (system & database)
>> and read a relatively small index. I have other requirements that prohibit
>> use of any technology that is eventually consistent.
>>
>> I liken the problem to fishing. To find a particular fish of length,
>> size, color &c in a data lake you must accept the possibility of scanning
>> the entire lake. However, if all fish were in barrels where each barrel
>> had a particular kind of fish of specific length, size, color &c then the
>> problem is far simpler.
>>
>> -Greg
>
> My gut tells me that if you do solve the problem and get PostgreSQL (or
> anything) reading consistently at under 30ms with that many tables you will
> have solved one problem by creating another.

Exactly why I am exploring. What are the trade-offs?

> You discounted Cassandra due to network latency, but are now trying a
> monolithic PostgreSQL setup. It might be worth trying a single node
> ScyllaDB or Cassandra deploy (no need for QUORUM or network overhead),
> perhaps using layered compaction so all your data gets broken out into
> 160MB chunks. And certainly wander over to the ScyllaDB mailing list, as
> they are very focused on performance problems like yours and should offer
> some insight even if a Cassandra style architecture cannot meet your
> requirements.

Cassandra performance, according to the experts I consulted, starts to fall
off once the stored dataset exceeds ~3 TB. Much too small for my use case.
Again, I do have other reasons for not using Cassandra and the others,
namely deduplication of information referenced by my millions of tables.
There are no such guarantees in many of the offerings outside the RDBMS
realm.

> An alternative, if you exhaust or don't trust other options: use a foreign
> data wrapper to access your own custom storage. A single table at the PG
> level, you can shard the data yourself into 8 bazillion separate stores, in
> whatever structure suits your read and write operations (maybe reusing an
> embedded db engine, ordered flat file+log+index, whatever).

However, even 8 bazillion FDW's may cause an "overflow" of relationships,
and I would lose the benefit of an efficient storage engine while PostgreSQL
acts as little more than a traffic cop. In such a case, I would opt to put
that routing logic in the application and access the true storage directly
rather than going through FDW's.

-Greg
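To sketch what I mean by putting the routing logic in the app: the "barrel"
for a record is computed deterministically from its key, so a point read
touches exactly one small store and one small index, with no scan of the
lake. A rough Python sketch (the shard count and the fish_NNNNN naming
scheme are made up for illustration, not my actual layout):

```python
import hashlib

N_BARRELS = 8192  # hypothetical number of "barrels" (shards)

def barrel_for(key: str) -> str:
    """Deterministically map a record key to its barrel name.

    The app computes the target store itself instead of asking the
    database (or an FDW layer) to route, so a lookup goes straight
    to one small table/file and its small index.
    """
    h = int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)
    return f"fish_{h % N_BARRELS:05d}"

# Usage: the application builds the table (or file/store) name and
# queries only that one relation, e.g. with psycopg2:
#   cur.execute(f"SELECT * FROM {barrel_for(key)} WHERE id = %s", (key,))
```

The same function runs identically on every app server, so no shared
lookup table or coordinator is needed to find where a record lives.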