"Tsunakawa, Takayuki" <tsunakawa.ta...@jp.fujitsu.com> writes:
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
>> I think that we shouldn't start changing things based on guesses about what
>> the problem is, even if they're fairly smart guesses. The thing to do would
>> be to construct a test rig, crash the server repeatedly, and add debugging
>> instrumentation to figure out where the time is actually going.

> We have tried to reproduce the problem in the past several days with much
> more stress on our environment than on the customer's one -- 1,000 tables
> aiming for a stats file dozens of times larger, and repeated reconnection
> requests from hundreds of clients -- but we could not succeed.

>> I do think your theory about the stats collector might be worth pursuing.
>> It seems that the stats collector only responds to SIGQUIT, ignoring SIGTERM.
>> Making it do a clean shutdown on SIGTERM and a fast exit on SIGQUIT seems
>> possibly worthwhile.

> Thank you for giving me the confidence to proceed. And I also believe that
> postmaster should close the listening ports earlier. Regardless of whether
> these will solve this problem (I'm not confident they will), I think it'd
> be better to fix these two points so that postmaster doesn't take longer
> than necessary. I think I'll create a patch after giving it a bit more thought.

FWIW, I'm pretty much -1 on messing with the timing of the socket close
actions. I broke that once within recent memory, so maybe I'm gun-shy,
but I think that the odds of unpleasant side effects greatly outweigh any
likely benefit there.

Allowing SIGQUIT to prompt fast shutdown of the stats collector seems sane,
though. Try to make sure it doesn't leave partly-written stats files
behind.

			regards, tom lane
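
One way to picture the SIGTERM/SIGQUIT split and the partly-written-file concern discussed above is the minimal Python sketch below. It is not PostgreSQL's stats collector code (which is written in C); every name here (`write_stats_atomically`, `on_sigterm`, `on_sigquit`, `install_handlers`, `shutdown_requested`) is hypothetical, and it assumes a POSIX system where renaming within one filesystem is atomic. The idea is that the stats file is only ever replaced by a completed temporary file, so a fast exit at any point cannot leave a truncated file under the final name.

```python
import os
import signal
import tempfile

# Hypothetical flag that a main loop would poll; not a PostgreSQL variable.
shutdown_requested = False


def write_stats_atomically(path: str, payload: bytes) -> None:
    """Write payload to a temp file in the target directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".stats-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before the rename
        os.replace(tmp_path, path)  # readers see the old file or the new one, never a partial one
    except BaseException:
        # Clean up the temp file if anything failed before the rename completed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def on_sigterm(signum, frame):
    # Clean shutdown: only set a flag; the main loop would write stats and then exit.
    global shutdown_requested
    shutdown_requested = True


def on_sigquit(signum, frame):
    # Fast exit: leave immediately without writing stats or running cleanup handlers.
    os._exit(2)


def install_handlers() -> None:
    # Assumes a POSIX platform; SIGQUIT is not available on Windows.
    signal.signal(signal.SIGTERM, on_sigterm)
    signal.signal(signal.SIGQUIT, on_sigquit)
```

With this arrangement a fast exit that lands in the middle of a write can at worst leave a stray `.stats-*.tmp` file behind, which a later startup could delete; the file under the final name is always complete.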