Hi po 26. 2. 2024 v 7:28 odesÃlatel Emiel Mols <em...@crisp.nl> napsal:
> Hello, > > To improve our unit and end-to-end testing performance, we are looking to > optimize initialization of around 500-1000 database *schemas* from a > schema.sql file. > > Background: in postgres, you cannot change databases on > existing connections, and using schemas would allow us to minimize the > amount of active connections needed from our test runners, as we can reuse > connections by updating search_path. In our current database-per-test setup > we see that there is around a ~30% (!!) total CPU overhead in > native_queued_spin_lock_slowpath (as profiled with linux perf), presumably > because of the high connection count. We run ~200 tests in parallel to > saturate a 128 core machine. > > In the schema-per-test setup, however, it becomes harder to cheaply create > the schema. Before we could `create database test001 with template > testbase` to set up the database for a test, and this was reasonably fast. > Re-inserting a schema with ~500 table/index definitions across 500 test > schema's is prohibitively expensive (around 1000ms per schema insertion > means we're wasting 500 cpu-seconds, and there seems to be quite some lock > contention too). Linux perf shows that in this setup we're reducing the > native_queued_spin_lock_slowpath overhead to around 10%, but total test > time is still slower due to all schema initialization being done. Also it > feels quite elegant functions and types can be reused between tests. > > Does anyone have some insights or great ideas :)? Also pretty curious to > the fundamental reason why having high/changing connection counts to > postgres results in this much (spin)lock contention (or perhaps we're doing > something wrong in either our configuration or measurements?). > > An alternative we haven't explored yet is to see if we can use pgbouncer > or other connection pooler to mitigate the 30% issue (set limits so there > are only ever X connections to postgres in total, and perhaps max Y per > test/database). This does add another piece of infrastructure/complexity, > so not really prefered. > For testing a) use templates - CREATE DATABASE test TEMPLATE some; b) disable fsync (only for testing!!!) Regards Pavel > Best, > > Emiel >