Hello there.

I am not a PG expert, as I currently work as an Enterprise Architect (who
believes in OSS and in particular PostgreSQL 😍). So please forgive me if
this question is too simple. 🙏

Here it goes:

We have a new inventory system running on its own database (PG 10, AWS RDS
db.m5.2xlarge, 1TB SSD EBS, Multi-AZ). The effective size of the DB is less
than 10GB at the moment; we provisioned 1TB to get more IOPS from EBS.

As we don't have a lot of different products in our catalogue, it's quite
common (especially when a particular product is on sale) to have a high
rate of concurrent updates against the same row. There is also a frequent
(every 30 minutes) batch update, coming from the warehouses (SAP), for all
items whose current stock/inventory has changed. We have just installed
this system for a new tenant (one of the smallest ones) and although it's
running great so far, we believe this solution would not scale as we roll
out the system to new (and bigger) tenants. Currently there are up to
1,500 transactions per second (mostly SELECTs and one particular UPDATE,
which I believe is the one being aborted/deadlocked at times) in this
inventory database.
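
As a side note, a minimal way to see which sessions are blocked, and by
whom, on PG 10 is something along these lines:

  SELECT pid,
         pg_blocking_pids(pid) AS blocked_by,
         wait_event_type,
         state,
         query
  FROM pg_stat_activity
  WHERE cardinality(pg_blocking_pids(pid)) > 0;

That, together with log_lock_waits, should at least show which statement
is involved in the waits.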

I am not a DBA, but as the DBAs (most of them old-school Oracle DBAs who
are not happy with the move to PostgreSQL) are considering ditching
PostgreSQL without any prior tuning, I would like to understand the
possibilities.

Considering this is a highly concurrent (same-row) system, I thought I
would suggest:

1) Set shared_buffers to 25% of the RAM on the RDS instance;

2) Install a pair (HA) of PgBouncers (session pooling) in front of PG and
set them up so that only 32 connections (4 per core) are kept open to the
database at the same time, while all connections going to PgBouncer (might
be thousands) are queued whenever there are more than 32 active
connections to the database. We have reached more than 500 concurrent
connections so far, and these numbers will grow.

3) Set work_mem to 3 times the size of the largest temp file;

4) Set maintenance_work_mem to 2GB;

5) Set effective_cache_size to 50% of total memory (rough sketches of
these settings and the PgBouncer config are below).
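
To make the numbers concrete, here is a rough sketch of the parameter
values I have in mind (illustrative only, assuming the 32 GB of RAM on the
m5.2xlarge; on RDS these would go through a parameter group rather than
postgresql.conf):

  # RDS parameter group / postgresql.conf (illustrative values)
  shared_buffers = 8GB            # ~25% of 32 GB RAM
  effective_cache_size = 16GB     # ~50% of total memory
  maintenance_work_mem = 2GB
  work_mem = 64MB                 # placeholder; to be ~3x the largest temp file we observe

And a minimal pgbouncer.ini along the lines of item 2 (database name and
endpoint are placeholders; pool_mode is session as proposed, though I
understand transaction pooling is also common for this kind of workload):

  [databases]
  inventory = host=<rds-endpoint> port=5432 dbname=inventory

  [pgbouncer]
  listen_addr = *
  listen_port = 6432
  auth_type = md5
  auth_file = /etc/pgbouncer/userlist.txt
  pool_mode = session
  default_pool_size = 32        ; 4 per vCPU on 8 vCPUs
  max_client_conn = 2000        ; clients beyond the pool wait in the queue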

The most frequently used UPDATE is already a HOT update, as neither it nor
any trigger changes indexed columns.
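
For what it's worth, a quick way to confirm the HOT ratio is via
pg_stat_user_tables (the table name here is just illustrative):

  SELECT relname, n_tup_upd, n_tup_hot_upd,
         round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_pct
  FROM pg_stat_user_tables
  WHERE relname = 'inventory';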

It seems to me the kind of problem we have is similar to those systems
which sell a limited number of tickets to large concerts/events, like
Google I/O used to be... where everyone tries to buy a ticket as soon as
possible, and the system has to keep a consistent count of available
tickets. I believe that's a hard problem to solve, so that's why I am
asking for suggestions/ideas from the experts.
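
In case it helps to see what I mean, the contended statement is
essentially an atomic conditional decrement, something like this (table
and column names are made up):

  -- Every buyer of the same product queues on this one row lock
  UPDATE inventory
     SET available = available - 1
   WHERE product_id = $1
     AND available > 0
  RETURNING available;

Under READ COMMITTED each waiter re-checks the condition once the lock is
released, so stock never goes negative, but all of those sessions still
serialize on that single hot row.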

Thanks so much!
