Thanks Mike,

We are using a standard RDS instance (m4.large); it's not Aurora, which is
a much more powerful offering (according to AWS).

Yes, we could install it on EC2, but that would take some extra effort on
our side. It could still be a worthwhile investment if it helps us find the
bottleneck, BUT after tuning, the database must run on RDS for production
use, as the company I work for requires that we run this DB as a managed
service (RDS in this case).

Mike, what can we expect to see if we run PG on EC2? More logging? More
tuning options? Let me know what you have in mind so that I can convince
the rest of the team. But keep in mind that, in the end, this workload must
run on RDS m4.large (500 GB to 1 TB of general-purpose SSD).

Again, thanks a lot!

On Dec 27, 2017 at 13:59, "Mike Sofen" <> wrote:

Hi Jean,

I’ve used Postgres on a regular EC2 instance (an m4.xlarge) storing
complex genomic data: hundreds of millions of rows in a table, with “normal”
indexed queries returning in 50-100 ms depending on the query…so this isn’t
a Postgres issue per se.

Your table and index structures look OK, although in PG you should use the
“text” datatype instead of varchar; it is the optimized type for storing
string data of any size (even a 2-char country code).  Since you have two
such columns that you’ve indexed and are querying on, there is a chance
you’ll see an improvement.
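In case it helps, that change can be sketched as below (column names taken from Jean's DDL; this is a sketch only, and changing a column's type can force a full rewrite of a table this size, so test it on a copy first):

```sql
-- Sketch: convert the two indexed string columns to text.
-- varchar -> text is typically a cheap metadata-only change in PG,
-- but character(n) -> text may rewrite the table and its indexes,
-- so try this on a copy of the data before production.
ALTER TABLE public.card
    ALTER COLUMN user_id TYPE text,
    ALTER COLUMN user_country TYPE text;
```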

I have not yet used Aurora or RDS for any large data…it sure seems like the
finger could be pointing there, but it isn’t clear what mechanism in Aurora
could be creating the slowness.

Is there a possibility of you creating the same db on a normal EC2 instance
with PG installed and running the same test?  There is nothing else obvious
about your data/structure that could result in such terrible performance.
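One simple way to keep the comparison apples-to-apples (assuming psql access to both instances; the literal values below are placeholders for a real user_id and user_country) is to time the identical query on each:

```sql
-- In psql, on both the RDS and the EC2 instance
-- ('4684' and 'BR' are placeholder values):
\timing on
SELECT * FROM public.card
WHERE user_id = '4684' AND user_country = 'BR';
```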

Mike Sofen

*From:* Jean Baro []
*Sent:* Wednesday, December 27, 2017 7:14 AM


We are still seeing queries (by UserID + UserCountry) taking over 2
seconds, even when there is no batch insert going on at the same time.

Each query returns 100 to 200 messages, which would be about a 400 KB
payload, which is super tiny.

I don't know what else I can do within these limitations: m4.large, 167M
rows, an almost 500 GB database, and 29 GB of indexes (all indexes).

I am probably too optimistic, but I was expecting queries (up to 50 queries
per second) to return (99th percentile) in under 500 ms or even less, as
the index is simple and there is no aggregation or join involved.
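For what it's worth, the actual plan timing can be captured with EXPLAIN; the query shape below is assumed from the index definition, and the literal values are placeholders:

```sql
-- Assumed shape of the slow query; ANALYZE executes it for real,
-- and BUFFERS shows how much I/O was served from cache vs. disk:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM public.card
WHERE user_id = '4684'
  AND user_country = 'BR';
```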

Any suggestion?

The table structure:

CREATE TABLE public.card
(
    id character(36) NOT NULL,
    user_id character varying(40) NOT NULL,
    user_country character(2) NOT NULL,
    user_channel character varying(40),
    user_role character varying(40),
    created_by_system_key character(36) NOT NULL,
    created_by_username character varying(40),
    created_at timestamp with time zone NOT NULL,
    last_modified_at timestamp with time zone NOT NULL,
    date_start timestamp with time zone NOT NULL,
    date_end timestamp with time zone NOT NULL,
    payload json NOT NULL,
    tags character varying(500),
    menu character varying(50),
    deleted boolean NOT NULL,
    campaign character varying(500) NOT NULL,
    correlation_id character varying(50),
    PRIMARY KEY (id)
);

CREATE INDEX idx_user_country
    ON public.card USING btree
    (user_id COLLATE pg_catalog."default", user_country COLLATE pg_catalog."default");
CREATE INDEX idx_last_modified_at
    ON public.card USING btree
    (last_modified_at ASC NULLS LAST);

CREATE INDEX idx_campaign
    ON public.card USING btree
    (campaign ASC NULLS LAST);


'Index Scan using idx_user_country on card  (cost=0.57..1854.66 rows=460
'  Index Cond: (((user_id)::text = '4684'::text) AND (user_country =
On Dec 25, 2017 at 01:10, "Jean Baro" <> wrote:

Thanks for the clarification guys.

It will be super useful. After trying this I'll post the results!

Merry Christmas!
