Resurrecting this old thread ... I decided it'd be interesting to re-examine where initdb's runtime is going, seeing that we just got done with a lot of bootstrap data restructuring. I stuck some timing code into initdb, and got results like this:
creating directory /home/postgres/testversion/data ... ok
elapsed = 0.256 msec
creating subdirectories ... ok
elapsed = 2.385 msec
selecting default max_connections ... 100
elapsed = 13.528 msec
selecting default shared_buffers ... 128MB
elapsed = 13.699 msec
selecting dynamic shared memory implementation ... posix
elapsed = 0.129 msec
elapsed = 281.335 msec in select_default_timezone
creating configuration files ... ok
elapsed = 1.319 msec
running bootstrap script ... ok
elapsed = 162.143 msec
performing post-bootstrap initialization ... ok
elapsed = 832.569 msec

Sync to disk skipped.

real    0m1.316s
user    0m0.941s
sys     0m0.395s

(I'm using "initdb -N" because the cost of the sync step is so platform-dependent, and it's not interesting anyway for buildfarm or make check-world testing. Also, I rearranged the code slightly so that select_default_timezone could be timed separately from the rest of the "creating configuration files" step.)

In trying to break down the "post-bootstrap initialization" step a bit further, I soon realized that trying to time the sub-steps from initdb is useless. initdb is just shoving bytes down the pipe as fast as the kernel will let it; it has no idea how long it's taking the backend to do any one query or queries. So I ended up adding "-c log_statement=all -c log_min_duration_statement=0" to the backend_options, and digging query durations out of the log output. I got these totals for the major steps in the post-boot run:

pg_authid setup:                          0.909 ms
pg_depend setup:                         64.980 ms
system views:                           106.221 ms
pg_description:                          39.665 ms
pg_collation:                            65.162 ms
conversions:                             72.024 ms
text search:                             29.454 ms
init-acl hacking:                        14.339 ms
information schema:                     188.497 ms
plpgsql:                                  2.531 ms
analyze/vacuum/additional db creation:  171.762 ms

So the conversions don't look nearly as interesting as Andreas suggested upthread. Pushing them into .bki format would at best save ~ 70 ms out of 1300. Which is not nothing, but it's not going to change the world either.
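For anyone wanting to repeat the per-step breakdown, here is a minimal sketch of how the durations could be totalled from the backend's log output. It is not the script used for the numbers above. It assumes each logged duration shows up on a line containing "duration: N ms", and it assigns each duration to whichever step marker was seen most recently. The function name total_by_step and the (label, substring) marker pairs are hypothetical; you would have to pick substrings that identify where each step begins in your own log.

```python
import re

# Assumed log shape: a line containing "duration: <number> ms".
DURATION_RE = re.compile(r"duration: (\d+(?:\.\d+)?) ms")


def total_by_step(lines, step_markers):
    """Sum logged durations per step.

    step_markers is a list of (label, substring) pairs (hypothetical):
    a log line containing the substring starts, or continues, that step.
    Durations logged before any marker has been seen are ignored.
    """
    totals = {label: 0.0 for label, _ in step_markers}
    current = None
    for line in lines:
        # Switch buckets when a line matches one of the step markers.
        for label, needle in step_markers:
            if needle in line:
                current = label
                break
        # Add any duration on this line to the current bucket.
        match = DURATION_RE.search(line)
        if match and current is not None:
            totals[current] += float(match.group(1))
    return totals


if __name__ == "__main__":
    # Made-up sample lines, only to show the expected input shape.
    sample = [
        "LOG:  statement: CREATE VIEW pg_roles AS SELECT 1;",
        "LOG:  duration: 1.500 ms",
        "LOG:  statement: COMMENT ON foo IS 'x';",
        "LOG:  duration: 0.250 ms",
    ]
    markers = [("system views", "CREATE VIEW"),
               ("pg_description", "COMMENT ON")]
    for label, ms in total_by_step(sample, markers).items():
        print(f"{label}: {ms:.3f} ms")
```

To run it against a real log, pass an open file object as the first argument, since the function only iterates over lines. Choosing the markers is the manual part: the backend has no notion of initdb's steps, so the boundaries have to be recognized from the statements themselves.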
Really the only thing here that jumps out as being unduly expensive for what it's doing is select_default_timezone. That is, and always has been, a brute-force algorithm; I wonder if there's a way to do better? We can probably guess that every non-Windows platform is using the IANA timezone data these days. If there were some way to extract the name of the active timezone setting directly, we wouldn't have to try to reverse-engineer it. But I don't know of any portable way :-(

			regards, tom lane