pgbench initialization has been running for almost 5 hours now and is still stuck before the vacuum phase starts .. something is definitely wrong, as I don't remember it taking this long the first time I created the db. Here are the current stats:
*iostat (xbd13-14 are WAL zpool)*
device     r/s    w/s     kr/s     kw/s  qlen  svc_t  %b
xbd8     161.3  109.8   1285.4   3450.5     0   12.5  19
xbd7     159.5  110.6   1272.3   3450.5     0   11.4  14
xbd6     161.1  108.8   1284.4   3270.6     0   10.9  14
xbd5     159.5  109.0   1273.1   3270.6     0   11.6  15
xbd14      0.0    0.0      0.0      0.0     0    0.0   0
xbd13      0.0    0.0      0.0      0.0     0    0.0   0
xbd12    204.6  110.8   1631.3   3329.2     0    9.1  15
xbd11    216.0  111.2   1722.5   3329.2     1    8.6  16
xbd10    197.2  109.4   1573.5   3285.8     0    9.8  15
xbd9     195.0  109.4   1557.1   3285.8     0    9.9  15

*zpool iostat (db pool)*
            capacity     operations    bandwidth
pool      alloc   free   read  write   read  write
db         143G   255G  1.40K  1.53K  11.2M  12.0M

*vmstat*
 procs     memory       page                      disks     faults          cpu
 r b w    avm     fre   flt  re  pi  po    fr  sr ad0 xb8   in   sy    cs  us sy id
 0 0 0   5634M    28G     7   0   0   0  7339   0   0 245 2091 6358 20828   2  5 93
 0 0 0   5634M    28G    10   0   0   0  6989   0   0 312 1993 6033 20090   1  4 95
 0 0 0   5634M    28G     7   0   0   0  6803   0   0 292 1974 6111 22763   2  5 93
 0 0 0   5634M    28G    10   0   0   0  7418   0   0 339 2041 6170 20838   2  4 94
 0 0 0   5634M    28G   123   0   0   0  6980   0   0 282 1977 5906 19961   2  4 94

*top*
last pid:  2430;  load averages:  0.72,  0.73,  0.69    up 0+04:56:16  04:52:53
32 processes:  1 running, 31 sleeping
CPU:  1.8% user,  0.0% nice,  5.3% system,  1.4% interrupt, 91.5% idle
Mem: 1817M Active, 25M Inact, 36G Wired, 24K Cache, 699M Buf, 28G Free
Swap:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1283 pgsql       1  34    0  3967M  1896M zio->i  5  80:14 21.00% postgres
 1282 pgsql       1  25    0 25740K  3088K select  2  10:34  0.00% pgbench
 1274 pgsql       1  20    0  2151M 76876K select  1   0:09  0.00% postgres
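In case it is useful, here is roughly what I am running in another session to see where the loading backend is spending its time (this assumes the 9.2 pg_stat_activity column names; on 9.1 the pid/query columns are procpid/current_query instead):

    # what the backend doing the pgbench COPY is up to, and for how long
    psql -c "SELECT pid, waiting, now() - query_start AS runtime, query FROM pg_stat_activity;"

    # per-vdev activity on the db pool, refreshed every 5 seconds
    zpool iostat -v db 5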
On Wed, Sep 12, 2012 at 9:16 PM, Sébastien Lorion <s...@thestrangefactory.com> wrote:
> I recreated the DB and WAL pools, and launched pgbench -i -s 10000. Here are the stats during the load (still running):
>
> *iostat (xbd13-14 are WAL zpool)*
> device     r/s    w/s   kr/s      kw/s  qlen  svc_t  %b
> xbd8       0.0  471.5    0.0   14809.3    40   67.9  84
> xbd7       0.0  448.1    0.0   14072.6    39   62.0  74
> xbd6       0.0  472.3    0.0   14658.6    39   61.3  77
> xbd5       0.0  464.7    0.0   14433.1    39   61.4  76
> xbd14      0.0    0.0    0.0       0.0     0    0.0   0
> xbd13      0.0    0.0    0.0       0.0     0    0.0   0
> xbd12      0.0  460.1    0.0   14189.7    40   63.4  78
> xbd11      0.0  462.9    0.0   14282.8    40   61.8  76
> xbd10      0.0  477.0    0.0   14762.1    38   61.2  77
> xbd9       0.0  477.6    0.0   14796.2    38   61.1  77
>
> *zpool iostat (db pool)*
>             capacity     operations    bandwidth
> pool      alloc   free   read  write   read  write
> db        11.1G   387G      0  6.62K      0  62.9M
>
> *vmstat*
>  procs     memory       page                      disks     faults           cpu
>  r b w    avm     fre   flt  re  pi  po     fr  sr ad0 xb8   in    sy    cs  us sy id
>  0 0 0   3026M    35G   126   0   0   0  29555   0   0 478 2364 31201 26165  10  9 81
>
> *top*
> last pid:  1333;  load averages:  1.89,  1.65,  1.08    up 0+01:17:08  01:13:45
> 32 processes:  2 running, 30 sleeping
> CPU: 10.3% user,  0.0% nice,  7.8% system,  1.2% interrupt, 80.7% idle
> Mem: 26M Active, 19M Inact, 33G Wired, 16K Cache, 25M Buf, 33G Free
>
> On Wed, Sep 12, 2012 at 9:02 PM, Sébastien Lorion <s...@thestrangefactory.com> wrote:
>>
>> One more question .. I could not set wal_sync_method to anything else but fsync .. is that expected, or should other choices also be available? I am not sure how SSD cache flushing is handled on EC2, but I hope it is flushing the whole cache on every sync .. As a side note, I got corrupted databases (errors about pg_xlog directories not found, etc.) at first when running my tests, and I suspect it was because of vfs.zfs.cache_flush_disable=1, though I cannot prove it for sure.
>>
>> Sébastien
>>
>> On Wed, Sep 12, 2012 at 8:49 PM, Sébastien Lorion <s...@thestrangefactory.com> wrote:
>>>
>>> Is dedicating 2 drives to WAL too much? Since my whole raid is comprised of SSD drives, should I just put it in the main pool?
>>>
>>> Sébastien
>>>
>>> On Wed, Sep 12, 2012 at 8:28 PM, Sébastien Lorion <s...@thestrangefactory.com> wrote:
>>>>
>>>> Ok, makes sense .. I will update that as well and report back. Thank you for your advice.
>>>>
>>>> Sébastien
>>>>
>>>> On Wed, Sep 12, 2012 at 8:04 PM, John R Pierce <pie...@hogranch.com> wrote:
>>>>>
>>>>> On 09/12/12 4:49 PM, Sébastien Lorion wrote:
>>>>>>
>>>>>> You set shared_buffers way below what is suggested in Greg Smith's book (25% or more of RAM) .. what is the rationale behind that rule of thumb? Other values are more or less what I set, though I could lower effective_cache_size and vfs.zfs.arc_max and see how it goes.
>>>>>
>>>>> I think those 25% rules were typically created when RAM was no more than 4-8GB.
>>>>>
>>>>> For our highly transactional workload, at least, too large a shared_buffers seems to slow us down, perhaps due to the higher overhead of managing that many 8k buffers. I've heard other read-mostly workloads, such as data warehousing, can take advantage of larger buffer counts.
>>>>>
>>>>> --
>>>>> john r pierce                            N 37, W 122
>>>>> santa cruz ca                         mid-left coast
>>>>>
>>>>>
>>>>> --
>>>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>>>> To make changes to your subscription:
>>>>> http://www.postgresql.org/mailpref/pgsql-general
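PS: for my wal_sync_method question quoted above, one thing I plan to try is pg_test_fsync from contrib, which should report which sync methods actually work on this FreeBSD/EC2 setup and how they compare (assuming the contrib tools are installed; the test file path below is only an example and should sit on the WAL pool):

    # compare the available sync methods against the WAL zpool
    pg_test_fsync -f /wal/pg_test_fsync.out

    # confirm what the server is currently using
    psql -c "SHOW wal_sync_method;"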