Multiple connections, but we are going to test it with only one. Would it make any difference?
Thanks

On Dec 24, 2017 at 21:52, "michael...@sqlexec.com" <michael...@sqlexec.com> wrote:

> Are the inserts being done through one connection or multiple connections
> concurrently?
>
> > On Dec 24, 2017, at 2:51 PM, Jean Baro <jfb...@gmail.com> wrote:
> >
> > Hi there,
> >
> > We are testing a new application to try to find performance issues.
> >
> > AWS RDS m4.large 500GB storage (SSD)
> >
> > One table only, called Messages:
> >
> > Uuid
> > Country (ISO)
> > Role (Text)
> > User id (Text)
> > GroupId (integer)
> > Channel (text)
> > Title (Text)
> > Payload (JSON, up to 20kb)
> > Starts_in (UTC)
> > Expires_in (UTC)
> > Seen (boolean)
> > Deleted (boolean)
> > LastUpdate (UTC)
> > Created_by (UTC)
> > Created_in (UTC)
> >
> > Indexes:
> >
> > UUID (PK)
> > UserID + Country (main index)
> > LastUpdate
> > GroupID
> >
> > We inserted 160MM rows, around 2KB each. No partitioning.
> >
> > Inserts started at around 3,000 per second but (as expected) slowed
> > down as the number of rows increased. In the end we got around 500
> > inserts per second.
> >
> > Queries by UserID + Country took less than 2 seconds, but while the
> > batch insert was running the queries took over 20 seconds!!!
> >
> > We had 20 Lambdas getting messages from SQS and bulk inserting them
> > into PostgreSQL.
> >
> > The insert performance is important, but we would slow it down if
> > needed in order to ensure a flatter query performance (below 2
> > seconds). Each query (UserID + Country) returns around 100 different
> > messages, which are filtered and ordered by the synchronous Lambda
> > function. So we don't do any special filtering, sorting, ordering or
> > full text search in Postgres. In some ways we use it more like a
> > glorified file system. :)
> >
> > We are going to limit the number of Lambda workers to 1 or 2, and then
> > run some queries concurrently to see whether the query performance is
> > affected too much. We aim to get at least 50 queries per second
> > (returning 100 messages each) under 2 seconds, even when there are
> > millions of messages on SQS being inserted into PG.
> >
> > We haven't done any performance tuning in the DB.
> >
> > With all that said, the question is:
> >
> > What can be done to ensure good query performance (UserID + Country)
> > even when the bulk insert is running (low priority)?
> >
> > We are limited to using AWS RDS at the moment.
> >
> > Cheers
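
The quoted message mentions slowing the inserts down if needed to keep query latency flat. One way to do that on the worker side, besides reducing the number of Lambdas, is to cap the row rate each worker sends. The following is only a minimal sketch of that idea, not something taken from the thread: `InsertThrottle`, `wait_for` and the rate passed in are hypothetical, and the actual bulk insert call is deliberately left out.

```python
import time


class InsertThrottle:
    """Hypothetical helper: caps the average insert rate of one worker."""

    def __init__(self, max_rows_per_sec, clock=time.monotonic, sleep=time.sleep):
        if max_rows_per_sec <= 0:
            raise ValueError("max_rows_per_sec must be positive")
        self.max_rows_per_sec = max_rows_per_sec
        # clock and sleep are injectable so the pacing logic can be tested
        # without real waiting.
        self._clock = clock
        self._sleep = sleep
        # Earliest time at which the next batch may be sent.
        self._next_allowed = clock()

    def wait_for(self, batch_rows):
        """Block until a batch of batch_rows rows may be sent.

        Returns the number of seconds spent sleeping.
        """
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the time slice this batch "costs" at the target rate.
        start = max(now, self._next_allowed)
        self._next_allowed = start + batch_rows / self.max_rows_per_sec
        return delay
```

Each worker would call `throttle.wait_for(len(batch))` just before its bulk insert. The rate is an assumption to be tuned by watching the UserID + Country query latency while the inserts run.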