Hi Tomas, On Mon, Jun 29, 2020 at 12:10 AM Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > > Hi, > > One of the issues I'm fairly regularly reminded by users/customers is > that inserting into tables sharded using FDWs are rather slow. We do > even get it reported on pgsql-bugs from time to time [1]. > > Some of the slowness / overhead is expected, doe to the latency between > machines in the sharded setup. Even just 1ms latency will make it way > more expensive than a single instance. > > But let's do a simple experiment, comparing a hash-partitioned regular > partitions, and one with FDW partitions in the same instance. Scripts to > run this are attached. The duration of inserting 1M rows to this table > (average of 10 runs on my laptop) looks like this: > > regular: 2872 ms > FDW: 64454 ms > > Yep, it's ~20x slower. On setup with ping latency well below 0.05ms. > Imagine how would it look on sharded setups with 0.1ms or 1ms latency, > which is probably where most single-DC clusters are :-( > > Now, the primary reason why the performance degrades like this is that > while FDW has batching for SELECT queries (i.e. we read larger chunks of > data from the cursors), we don't have that for INSERTs (or other DML). > Every time you insert a row, it has to go all the way down into the > partition synchronously. > > For some use cases this may be reduced by having many independent > connnections from different users, so the per-user latency is higher but > acceptable. But if you need to import larger amounts of data (say, a CSV > file for analytics, ...) this may not work. > > Some time ago I wrote an ugly PoC adding batching, just to see how far > would it get us, and it seems quite promising - results for he same > INSERT benchmarks look like this: > > FDW batching: 4584 ms > > So, rather nice improvement, I'd say ...
Very nice indeed. > Before I spend more time hacking on this, I have a couple open questions > about the design, restrictions etc. I think you may want to take a look this recent proposal by Andrey Lepikhov: * [POC] Fast COPY FROM command for the table with foreign partitions * https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru -- Amit Langote EnterpriseDB: http://www.enterprisedb.com