Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
Jeff Janes writes:
> On Wed, Mar 12, 2014 at 7:42 AM, Tom Lane wrote:
>> We've seen sporadic reports of that sort of behavior for years, but no
>> developer has ever been able to reproduce it reliably. Now that you've
>> got a reproducible case, do you want to poke into it and see what's going
>> on?

> I didn't know we were trying to reproduce it, nor that it was a mystery.
> Do anything that causes serious IO constipation, and you will probably see
> that message.

The cases that are a mystery to me are where there's no reason to believe
that I/O is particularly overloaded. But perhaps Kaigai-san's example is
only that ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
On Wed, Mar 12, 2014 at 7:42 AM, Tom Lane wrote:

> Kouhei Kaigai writes:
> > WARNING: pgstat wait timeout
> > WARNING: pgstat wait timeout
> > WARNING: pgstat wait timeout
> > WARNING: pgstat wait timeout
>
> > Once I got above messages, write performance is dramatically
> > degraded, even though I didn't take detailed investigation.
>
> > I could reproduce it on the latest master branch without my
> > enhancement, so I guess it is not a problem something special
> > to me.
> > One other strangeness is, right now, this problem is only
> > happen on my virtual machine environment - VMware ESXi 5.5.0.
> > I couldn't reproduce the problem on my physical environment
> > (Fedora20, core i5-4570S).
>
> We've seen sporadic reports of that sort of behavior for years, but no
> developer has ever been able to reproduce it reliably. Now that you've
> got a reproducible case, do you want to poke into it and see what's going
> on?
>

I didn't know we were trying to reproduce it, nor that it was a mystery.
Do anything that causes serious IO constipation, and you will probably see
that message. For example, turn off synchronous_commit and run the default
pgbench transaction at a scale that is large but still comfortably fits in
RAM, and wait for a checkpoint sync phase to kick in.

The pgstat wait timeout is a symptom, not the cause.

Cheers,

Jeff
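The recipe above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names (`pgbench_plan`, `run_plan`), the database name, and the scale, client, and duration values are all hypothetical and are not taken from the thread. The scale has to be picked per machine so that the data set is large but still fits in RAM. The `pgbench` flags used (`-i`, `-s`, `-c`, `-j`, `-T`) and the libpq `PGOPTIONS` environment variable are standard.

```python
import os
import subprocess


def pgbench_plan(dbname, scale, clients, seconds):
    """Build the environment and command lines for the load test.

    All values are illustrative; choose `scale` so the data set is
    large but still fits comfortably in RAM on the test machine.
    """
    env = dict(os.environ)
    # libpq passes PGOPTIONS to the server as per-session settings, so
    # synchronous_commit is turned off only for the pgbench sessions.
    env["PGOPTIONS"] = "-c synchronous_commit=off"
    # Step 1: initialise the pgbench tables at the chosen scale factor.
    init = ["pgbench", "-i", "-s", str(scale), dbname]
    # Step 2: run the default transaction long enough that at least one
    # checkpoint sync phase happens while the load is running.
    run = ["pgbench", "-c", str(clients), "-j", str(clients),
           "-T", str(seconds), dbname]
    return env, [init, run]


def run_plan(env, commands):
    """Execute the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    # Dry run: only print what would be executed.
    env, commands = pgbench_plan("bench", scale=1000, clients=8, seconds=1800)
    for cmd in commands:
        print("PGOPTIONS=%r %s" % (env["PGOPTIONS"], " ".join(cmd)))
```

Calling `run_plan(env, commands)` instead of printing would start the load. The warning, if it appears, should show up in the server log while a checkpoint is syncing.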
Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
On 12 March 2014, 14:54, Kouhei Kaigai wrote:
> It is another topic from the main thread,
>
> I noticed the following message under the test cases that
> takes heavy INSERT workload; provided by Haribabu.
>
> [kaigai@iwashi ~]$ createdb mytest
> [kaigai@iwashi ~]$ psql -af ~/cache_scan.sql mytest
> \timing
> Timing is on.
> --cache scan select 5 million
> create table test(f1 int, f2 char(70), f3 float, f4 char(100));
> CREATE TABLE
> Time: 22.373 ms
> truncate table test;
> TRUNCATE TABLE
> Time: 17.705 ms
> insert into test values (generate_series(1,500), 'fujitsu', 1.1,
> 'Australia software tech pvt ltd');
> WARNING: pgstat wait timeout
> WARNING: pgstat wait timeout
> WARNING: pgstat wait timeout
> WARNING: pgstat wait timeout
> :
>
> Once I got above messages, write performance is dramatically
> degraded, even though I didn't take detailed investigation.
>
> I could reproduce it on the latest master branch without my
> enhancement, so I guess it is not a problem something special
> to me.
> One other strangeness is, right now, this problem is only
> happen on my virtual machine environment - VMware ESXi 5.5.0.
> I couldn't reproduce the problem on my physical environment
> (Fedora20, core i5-4570S).
> Any ideas?

I've seen this happening in cases when it was impossible to write the stat
file for some reason. IIRC there were three basic causes I've seen in the
past:

(1) writing the stat copy failed - for example when the temporary stat
    directory was placed in tmpfs, but it was too small

(2) writing the stat copy took too long - e.g. with tmpfs and memory
    pressure, forcing the system to swap to free space for the stat copy

(3) IIRC the inquiry (backend -> postmaster) to write the file is sent
    using UDP, which may be dropped in some cases (e.g. when the system
    is overloaded), so the postmaster does not even know it should
    write the file

I'm not familiar with VMware ESXi virtualization, but I suppose it might be
relevant to all three causes.
regards
Tomas
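To illustrate why all three causes end in the same warning, here is a minimal sketch of a request-and-poll loop. It is not PostgreSQL's actual code: the function names, the file format (a single timestamp), and the timing values are assumptions made for the example. The requester sends a fire-and-forget inquiry, polls for a copy of the file stamped no earlier than its request, resends the inquiry now and then, and gives up after `max_wait` seconds. A dropped inquiry, a failed write, and a write that takes too long all look identical from the requester's side: no fresh file before the deadline.

```python
import os
import tempfile
import time


def write_stats(path):
    """Stand-in for the stats writer: atomically publish a stamped file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(repr(time.monotonic()))
    os.replace(tmp, path)  # readers never see a half-written file


def read_stamp(path):
    """Return the timestamp stored in the file, or None if unreadable."""
    try:
        with open(path) as f:
            return float(f.read())
    except (FileNotFoundError, ValueError):
        return None


def wait_for_fresh_stats(path, send_inquiry, max_wait=0.5,
                         retry_delay=0.01, resend_every=0.1):
    """Request a fresh stats file and poll until it appears.

    Returns True once the file carries a stamp no older than this
    request, or False after `max_wait` seconds (the timeout case).
    """
    requested_at = time.monotonic()
    deadline = requested_at + max_wait
    next_send = requested_at
    while True:
        now = time.monotonic()
        if now >= deadline:
            return False
        if now >= next_send:
            # Fire and forget, like a UDP datagram: no delivery guarantee,
            # so the inquiry is repeated while we keep waiting.
            send_inquiry()
            next_send = now + resend_every
        stamp = read_stamp(path)
        if stamp is not None and stamp >= requested_at:
            return True
        time.sleep(retry_delay)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "stats")
        # Inquiry delivered and file written promptly.
        print("delivered:", wait_for_fresh_stats(path, lambda: write_stats(path)))
        # Every inquiry dropped: only a stale file exists, so we time out.
        print("dropped:", wait_for_fresh_stats(path, lambda: None))
```

The second call times out even though a stats file exists, because the existing copy is older than the request. That matches the observation that the warning says nothing about which of the three causes is responsible.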
Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
Kouhei Kaigai writes:
> WARNING: pgstat wait timeout
> WARNING: pgstat wait timeout
> WARNING: pgstat wait timeout
> WARNING: pgstat wait timeout

> Once I got above messages, write performance is dramatically
> degraded, even though I didn't take detailed investigation.

> I could reproduce it on the latest master branch without my
> enhancement, so I guess it is not a problem something special
> to me.
> One other strangeness is, right now, this problem is only
> happen on my virtual machine environment - VMware ESXi 5.5.0.
> I couldn't reproduce the problem on my physical environment
> (Fedora20, core i5-4570S).

We've seen sporadic reports of that sort of behavior for years, but no
developer has ever been able to reproduce it reliably. Now that you've
got a reproducible case, do you want to poke into it and see what's going
on?

regards, tom lane