Tom, Andres - Is there an issue tracker where I could follow the progress on this issue?
Thanks so much!

On Mon, Oct 2, 2017 at 9:06 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Andres Freund <and...@anarazel.de> writes:
> > On 2017-10-02 19:50:51 -0400, Tom Lane wrote:
> >> What I saw was that the backend process was consuming 100% of (one) CPU,
> >> while the I/O transaction rate viewed by "iostat 1" started pretty low
> >> --- under 10% of what the machine is capable of --- and dropped from
> >> there as the copy proceeded. I did not think to check if that was user
> >> or kernel-space CPU, but I imagine it has to be the latter.
>
> > So that's pretty clearly a kernel bug... Hm. I wonder if it's mmap() or
> > msync() that's the problem here. I guess you didn't run a profile?
>
> Interestingly, profiling with Activity Monitor seems to blame the problem
> entirely on munmap() ... which squares with the place I hit every time
> when randomly stopping the process with gdb^Hlldb, so I'm inclined to
> believe it.
>
> This still offers no insight as to why CREATE DATABASE is hitting the
> problem while regular flush activity doesn't.
>
> > One interesting thing here is that in the CREATE DATABASE case there'll
> > probably be a lot larger contiguous mappings than in *_flush_after
> > cases. So it might be related to the size of the mapping / flush "unit".
>
> Meh, the mapping is only 64K in this case vs. 8K in the other. Hard
> to credit that it breaks that easily.
>
> regards, tom lane
>
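
For anyone who wants to poke at the map / flush / unmap sequence discussed above outside of the backend, here is a rough Python sketch that times each step separately over 64K chunks of a scratch file. It is only an illustration under stated assumptions, not the PostgreSQL code path: the names `flush_range` and `time_chunks` are made up for this example, the scratch file is a stand-in for a relation file, and Python's `mmap.flush()` calls msync() with flags that may differ from what the backend uses. The 64K chunk size is taken from Tom's message.

```python
import mmap
import os
import tempfile
import time


def flush_range(fd, offset, nbytes):
    """Map one file range, flush it, unmap it; return seconds spent in each step."""
    t0 = time.perf_counter()
    # offset must be a multiple of mmap.ALLOCATIONGRANULARITY
    m = mmap.mmap(fd, nbytes, access=mmap.ACCESS_WRITE, offset=offset)  # mmap()
    t1 = time.perf_counter()
    m.flush()  # msync() on POSIX systems
    t2 = time.perf_counter()
    m.close()  # munmap()
    t3 = time.perf_counter()
    return {"mmap": t1 - t0, "msync": t2 - t1, "munmap": t3 - t2}


def time_chunks(path, chunk=64 * 1024):
    """Walk the file in fixed-size chunks and total the time spent per step."""
    totals = {"mmap": 0.0, "msync": 0.0, "munmap": 0.0}
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDWR)
    try:
        # Only whole chunks are mapped; a short tail is skipped.
        for offset in range(0, size - chunk + 1, chunk):
            for step, secs in flush_range(fd, offset, chunk).items():
                totals[step] += secs
    finally:
        os.close(fd)
    return totals


if __name__ == "__main__":
    # Hypothetical scratch file: 16 MB of zeros.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * (16 << 20))
        scratch = f.name
    try:
        for step, secs in time_chunks(scratch).items():
            print(f"{step:7s} {secs:.6f} s")
    finally:
        os.unlink(scratch)
```

If munmap() dominates the totals on an affected machine, that would be consistent with what Activity Monitor reported, though a small scratch file may well not reproduce the slowdown seen with CREATE DATABASE.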