Hi,

On 2023-08-28 17:29:56 +1200, Thomas Munro wrote:
> Every time we run a SQL query, we fork a new psql process and a new
> cold backend process. It's not free on Unix, and quite a lot worse on
> Windows, at around 70ms per query. Take amcheck/001_verify_heapam for
> example. It runs 272 subtests firing off a stream of queries, and
> completes in ~51s on Windows (!), and ~6-9s on the various Unixen, on
> CI.

Whoa.

> Here are some timestamps I captured from CI by instrumenting various
> Perl and C bits:
>
> 0.000s: IPC::Run starts
> 0.023s: postmaster socket sees connection
> 0.025s: postmaster has created child process
> 0.033s: backend starts running main()
> 0.039s: backend has reattached to shared memory
> 0.043s: backend connection authorized message
> 0.046s: backend has executed and logged query
> 0.070s: IPC::Run returns
>
> I expected process creation to be slow on that OS, but it seems like
> something happening at the end is even slower. CI shows Windows
> consuming 4 CPUs at 100% for a full 10 minutes to run a test suite
> that finishes in 2-3 minutes everywhere else with the same number of
> CPUs.

It finishes in that time on linux, even with sanitizers enabled...

> As an experiment, I hacked up a not-good-enough-to-share experiment
> where $node->safe_psql() would automatically cache a BackgroundPsql
> object and reuse it, and the times for that test dropped ~51 -> ~9s on
> Windows, and ~7 -> ~2s on the Unixen. But even that seems non-ideal
> (well it's certainly non-ideal the way I hacked it up anyway...). I
> suppose there are quite a few ways we could do better:

That's a really impressive win.

Even if we "just" converted some of the safe_psql() cases and converted
poll_query_until() to this, we'd win a lot.

> 1. Don't fork anything at all: open (and cache) a connection directly
> from Perl.

One advantage of that is that the socket is entirely controlled by perl, so
waiting for IO should be easy...

> 2b. Maybe give psql or a new libpq-wrapper a new low level stdio/pipe
> protocol that is more fun to talk to from Perl/machines?

That does also seem promising - a good chunk of the complexity around some of
the IPC::Run uses is that we end up parsing psql input/output...

Greetings,

Andres Freund