Why not store the metadata in pg and the payload in S3?
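Something along these lines (purely illustrative; the table and column names below are mine, not from your design): keep the small, queryable columns in Postgres and store the large Body JSON as an S3 object referenced by key.

  -- Illustrative sketch only: metadata stays relational, payload lives in S3.
  CREATE TABLE messages_meta (
      message_id   uuid PRIMARY KEY,
      user_country text NOT NULL,
      user_id      text NOT NULL,
      title        text,
      created      timestamptz NOT NULL DEFAULT now(),
      body_s3_key  text NOT NULL  -- e.g. 'messages/<message_id>.json' in your bucket
  );

The Lambda would then read the (small) rows from Postgres and fetch the JSON bodies from S3 only for the messages it actually needs, which keeps the table and its TOAST traffic small and cheap to scan.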
On Mon, Nov 27, 2017 at 11:58 AM Jean Baro <jfb...@gmail.com> wrote:

> Hi there,
>
> We are creating a new DB which will behave much like a file system; I mean,
> there will be no complex queries or joins running in the DB. The idea is to
> grab the WHOLE set of messages for a particular user and then filter, order,
> combine or full-text search in the function itself (AWS Lambda). The maximum
> number of messages is limited to 1,000 per user, so we expect Postgres to
> deliver excellent performance for this scenario.
>
> As I am not really familiar with PG (9.6, or 10 in case RDS releases it
> before February), I would like to share what we are planning to do for this
> DB. If you guys could share your thoughts, that would be great! :)
>
> Table structure:
>
> · MessageID (UUID) – PK
> · UserCountry (ISO)
> · UserRole (TEXT 15)
> · UserID (TEXT 30) – FK (although there is no constraint)
> · LifeCycle (DATE RANGE? Or 2 timestamps with TZ, start_date and end_date?)
> · Channel (TEXT 15)
> · Tags (TEXT 2000)
> · Menu (TEXT 200)
> · Icon (TEXT 500) – URL to an image which will be used as an icon
> · Title (TEXT 150)
> · Body (JSON – up to 10K) – Metadata describing all the data specific to a
>   type of message. The JSON changes according to the type of message. We
>   assume most messages will use less than 1K for this field.
> · Delete (BOOLEAN) – soft delete
> · Created (Timestamp with TZ)
> · CreatedBy (TEXT 50)
>
> Only 1 table:
>
> · Messages
>
> 3 indexes:
>
> · MessageID PK (UUID)
> · Main fetch key (UserCountry + UserID)
> · End_date (to locate old messages that can be moved to another DB, which
>   will hold the old messages)
>
> Sizing and worst-case scenario:
>
> · 500MM messages in the main DB
> · 4K queries per second (by UserID) – max time of 500ms per query. Simple
>   SELECTs, with no ORDER BY or GROUP BY; just grab all the messages for a
>   particular user, max 1,000 messages per user.
> · 1K inserts per second on average (so that in 1 hour we can insert around
>   3MM messages)
> · 1K deletes per second on average (so that in 1 hour we can remove around
>   3MM messages)
>
> My questions are:
>
> - Can we use any kind of compression in PostgreSQL which would result in
>   reduced IO and disk size?
> - We are not relying on any kind of table partitioning; is that the best
>   approach for this scenario?
> - Is PG on RDS capable of delivering this type of performance while
>   requiring low maintenance?
> - What about autovacuum? Any suggestions on how to tune it for such a
>   workload (we will insert and delete millions of rows every day)?
>
> P.S.: We are going to test all this, but if we don't get the performance we
> are expecting, all optimization tips from you guys will be really
> appreciated. :)
>
> Thanks

--
Regards,
/Aaron
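For reference, a rough SQL sketch of the schema described in the quoted post. All type choices, the two-timestamp modelling of LifeCycle, and the autovacuum settings are assumptions for illustration, not decisions from the original post.

  -- Sketch of the proposed Messages table; types inferred from the sizes above.
  CREATE TABLE messages (
      message_id   uuid PRIMARY KEY,
      user_country text NOT NULL,         -- ISO country code
      user_role    varchar(15),
      user_id      varchar(30) NOT NULL,  -- FK by convention only, no constraint
      start_date   timestamptz,           -- LifeCycle modelled as two timestamps
      end_date     timestamptz,
      channel      varchar(15),
      tags         varchar(2000),
      menu         varchar(200),
      icon         varchar(500),
      title        varchar(150),
      body         jsonb,                 -- up to ~10K, usually under 1K
      deleted      boolean NOT NULL DEFAULT false,  -- soft delete
      created      timestamptz NOT NULL DEFAULT now(),
      created_by   varchar(50)
  );

  -- The two secondary indexes described above.
  CREATE INDEX messages_user_fetch_idx ON messages (user_country, user_id);
  CREATE INDEX messages_end_date_idx   ON messages (end_date);

  -- The main fetch path: all messages for one user, filtered and ordered
  -- later in the Lambda. Values here are placeholders.
  SELECT *
    FROM messages
   WHERE user_country = 'BR'
     AND user_id = 'some-user-id';

  -- Per-table autovacuum tuning (illustrative values only), so heavy
  -- insert/delete churn does not accumulate millions of dead rows.
  ALTER TABLE messages SET (
      autovacuum_vacuum_scale_factor = 0.01,
      autovacuum_vacuum_cost_delay   = 0
  );

The end_date index would be what an archival job scans when moving old messages out to the secondary DB.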