pgbench initialization has been running for almost 5 hours now and is still stuck before the vacuum phase starts .. something is definitely wrong, as I don't remember it taking this long the first time I created the db. Here are the current stats:
*iostat (xbd13-14 are WAL zpool)*
device     r/s    w/s     kr/s     kw/s  qlen  svc_t  %b
xbd8     161.3  109.8   1285.4   3450.5     0   12.5  19
xbd7     159.5  110.6   1272.3   3450.5     0   11.4  14
xbd6     161.1  108.8   1284.4   3270.6     0   10.9  14
xbd5     159.5  109.0   1273.1   3270.6     0   11.6  15
xbd14      0.0    0.0      0.0      0.0     0    0.0   0
xbd13      0.0    0.0      0.0      0.0     0    0.0   0
xbd12    204.6  110.8   1631.3   3329.2     0    9.1  15
xbd11    216.0  111.2   1722.5   3329.2     1    8.6  16
xbd10    197.2  109.4   1573.5   3285.8     0    9.8  15
xbd9     195.0  109.4   1557.1   3285.8     0    9.9  15

*zpool iostat (db pool)*
            capacity     operations    bandwidth
pool      alloc   free   read  write   read  write
db         143G   255G  1.40K  1.53K  11.2M  12.0M

*vmstat*
 procs     memory       page                      disks     faults          cpu
 r b w    avm     fre   flt  re  pi  po    fr  sr ad0 xb8   in   sy    cs  us sy id
 0 0 0   5634M    28G     7   0   0   0  7339   0   0 245 2091 6358 20828   2  5 93
 0 0 0   5634M    28G    10   0   0   0  6989   0   0 312 1993 6033 20090   1  4 95
 0 0 0   5634M    28G     7   0   0   0  6803   0   0 292 1974 6111 22763   2  5 93
 0 0 0   5634M    28G    10   0   0   0  7418   0   0 339 2041 6170 20838   2  4 94
 0 0 0   5634M    28G   123   0   0   0  6980   0   0 282 1977 5906 19961   2  4 94

*top*
last pid:  2430;  load averages:  0.72,  0.73,  0.69    up 0+04:56:16  04:52:53
32 processes:  1 running, 31 sleeping
CPU:  1.8% user,  0.0% nice,  5.3% system,  1.4% interrupt, 91.5% idle
Mem: 1817M Active, 25M Inact, 36G Wired, 24K Cache, 699M Buf, 28G Free
Swap:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1283 pgsql       1  34    0  3967M  1896M zio->i  5  80:14 21.00% postgres
 1282 pgsql       1  25    0 25740K  3088K select  2  10:34  0.00% pgbench
 1274 pgsql       1  20    0  2151M 76876K select  1   0:09  0.00% postgres
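In case it is useful, here is roughly what I am running in another session to see where the loading backend is spending its time (this assumes the 9.2 pg_stat_activity column names; on 9.1 the pid/query columns are procpid/current_query instead):

    # what the backend doing the pgbench COPY is up to, and for how long
    psql -c "SELECT pid, waiting, now() - query_start AS runtime, query FROM pg_stat_activity;"

    # per-vdev activity on the db pool, refreshed every 5 seconds
    zpool iostat -v db 5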
On Wed, Sep 12, 2012 at 9:16 PM, Sébastien Lorion <s...@thestrangefactory.com> wrote:
> I recreated the DB and WAL pools, and launched pgbench -i -s 10000. Here are the stats during the load (still running):
>
> *iostat (xbd13-14 are WAL zpool)*
> device     r/s    w/s   kr/s      kw/s  qlen  svc_t  %b
> xbd8       0.0  471.5    0.0   14809.3    40   67.9  84
> xbd7       0.0  448.1    0.0   14072.6    39   62.0  74
> xbd6       0.0  472.3    0.0   14658.6    39   61.3  77
> xbd5       0.0  464.7    0.0   14433.1    39   61.4  76
> xbd14      0.0    0.0    0.0       0.0     0    0.0   0
> xbd13      0.0    0.0    0.0       0.0     0    0.0   0
> xbd12      0.0  460.1    0.0   14189.7    40   63.4  78
> xbd11      0.0  462.9    0.0   14282.8    40   61.8  76
> xbd10      0.0  477.0    0.0   14762.1    38   61.2  77
> xbd9       0.0  477.6    0.0   14796.2    38   61.1  77
>
> *zpool iostat (db pool)*
>             capacity     operations    bandwidth
> pool      alloc   free   read  write   read  write
> db        11.1G   387G      0  6.62K      0  62.9M
>
> *vmstat*
>  procs     memory       page                      disks     faults           cpu
>  r b w    avm     fre   flt  re  pi  po     fr  sr ad0 xb8   in    sy    cs  us sy id
>  0 0 0   3026M    35G   126   0   0   0  29555   0   0 478 2364 31201 26165  10  9 81
>
> *top*
> last pid:  1333;  load averages:  1.89,  1.65,  1.08    up 0+01:17:08  01:13:45
> 32 processes:  2 running, 30 sleeping
> CPU: 10.3% user,  0.0% nice,  7.8% system,  1.2% interrupt, 80.7% idle
> Mem: 26M Active, 19M Inact, 33G Wired, 16K Cache, 25M Buf, 33G Free
>
> On Wed, Sep 12, 2012 at 9:02 PM, Sébastien Lorion <s...@thestrangefactory.com> wrote:
>>
>> One more question .. I could not set wal_sync_method to anything else but fsync .. is that expected, or should other choices also be available? I am not sure how SSD cache flushing is handled on EC2, but I hope it is flushing the whole cache on every sync .. As a side note, I got corrupted databases (errors about pg_xlog directories not found, etc.) at first when running my tests, and I suspect it was because of vfs.zfs.cache_flush_disable=1, though I cannot prove it for sure.
>>
>> Sébastien
>>
>> On Wed, Sep 12, 2012 at 8:49 PM, Sébastien Lorion <s...@thestrangefactory.com> wrote:
>>>
>>> Is dedicating 2 drives to WAL too much? Since my whole raid is comprised of SSD drives, should I just put it in the main pool?
>>>
>>> Sébastien
>>>
>>> On Wed, Sep 12, 2012 at 8:28 PM, Sébastien Lorion <s...@thestrangefactory.com> wrote:
>>>>
>>>> Ok, makes sense .. I will update that as well and report back. Thank you for your advice.
>>>>
>>>> Sébastien
>>>>
>>>> On Wed, Sep 12, 2012 at 8:04 PM, John R Pierce <pie...@hogranch.com> wrote:
>>>>>
>>>>> On 09/12/12 4:49 PM, Sébastien Lorion wrote:
>>>>>>
>>>>>> You set shared_buffers way below what is suggested in Greg Smith's book (25% or more of RAM) .. what is the rationale behind that rule of thumb? Other values are more or less what I set, though I could lower effective_cache_size and vfs.zfs.arc_max and see how it goes.
>>>>>
>>>>> I think those 25% rules were typically created when RAM was no more than 4-8GB.
>>>>>
>>>>> For our highly transactional workload, at least, too large a shared_buffers seems to slow us down, perhaps due to the higher overhead of managing that many 8k buffers. I've heard other read-mostly workloads, such as data warehousing, can take advantage of larger buffer counts.
>>>>>
>>>>> --
>>>>> john r pierce                            N 37, W 122
>>>>> santa cruz ca                         mid-left coast
>>>>>
>>>>>
>>>>> --
>>>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>>>> To make changes to your subscription:
>>>>> http://www.postgresql.org/mailpref/pgsql-general
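PS: for my wal_sync_method question quoted above, one thing I plan to try is pg_test_fsync from contrib, which should report which sync methods actually work on this FreeBSD/EC2 setup and how they compare (assuming the contrib tools are installed; the test file path below is only an example and should sit on the WAL pool):

    # compare the available sync methods against the WAL zpool
    pg_test_fsync -f /wal/pg_test_fsync.out

    # confirm what the server is currently using
    psql -c "SHOW wal_sync_method;"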