One more observation while I'm thinking about bulk_write.c: it writes the data out and asks the checkpointer to fsync it, but doesn't call smgrwriteback(). I assume that means that on Linux the physical writeback sometimes won't start until the checkpointer eventually calls fsync(), sequentially, one segment file at a time. It's difficult to decide what to do about that, though: unlike checkpoints, which have rate control/spreading, bulk writes could more easily flood the I/O subsystem in a burst. I expect most non-Linux systems' write-behind heuristics to kick in for bulk sequential writes, but on Linux, where most users live, there is no write-behind heuristic AFAIK (on the most common file systems anyway). So you have to crank that handle yourself if you want the kernel to wake up and smell the coffee before it hits its internal limits, and then you have to decide how fast to crank it.
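To make "cranking the handle" concrete, here is a minimal standalone sketch, not what bulk_write.c does today. It writes a buffer in 8kB blocks and, every so often, asks the kernel to start writeback of the range just written using sync_file_range(SYNC_FILE_RANGE_WRITE), which as far as I know is what smgrwriteback() boils down to on Linux. The function name bulk_write_with_writeback and the WRITEBACK_CHUNK knob are made up for illustration, and the 256kB value is an arbitrary assumption, not a recommendation. The hint's return value is deliberately ignored here because it is only advisory.

```c
#define _GNU_SOURCE				/* for sync_file_range() on Linux */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical knobs, chosen only for illustration. */
#define BLOCK_SIZE		8192
#define WRITEBACK_CHUNK	(256 * 1024)

/*
 * Write len bytes from buf to fd, starting at file offset 0.  After every
 * WRITEBACK_CHUNK bytes, hint the kernel to begin writing that range out.
 *
 * Returns the number of writeback hints issued, or -1 if a write fails.
 */
static int
bulk_write_with_writeback(int fd, const char *buf, size_t len)
{
	size_t		written = 0;	/* bytes handed to the kernel so far */
	size_t		flushed = 0;	/* bytes already covered by a hint */
	int			nhints = 0;

	assert(buf != NULL);

	while (written < len)
	{
		size_t		remaining = len - written;
		size_t		want = remaining < BLOCK_SIZE ? remaining : BLOCK_SIZE;
		ssize_t		n = pwrite(fd, buf + written, want, (off_t) written);

		if (n <= 0)
			return -1;			/* sketch: no EINTR/short-write retry logic */
		written += (size_t) n;

		if (written - flushed >= WRITEBACK_CHUNK)
		{
#ifdef __linux__
			/* Start asynchronous writeback; this does not wait for it. */
			(void) sync_file_range(fd, (off_t) flushed,
								   (off_t) (written - flushed),
								   SYNC_FILE_RANGE_WRITE);
#endif
			flushed = written;
			nhints++;
		}
	}

	return nhints;
}
```

Note that this only initiates writeback; an fsync() is still needed later for durability. Also, the sketch has no rate control at all: the size of WRITEBACK_CHUNK and the pace of the loop are exactly the "how fast to crank it" question, which it leaves open.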
This problem will come into closer focus when we start talking about streaming writes. For the current non-streaming bulk_write.c coding, I don't have any particular idea of what to do about it, so I'm just noting the observation here. Sorry for the sudden wall of text/monologue; this is all a sort of reaction to bulk_write.c, triggered by a couple of debugging sessions, that I should perhaps have sent to the bulk_write.c thread.