On Mon, Jan 2, 2017 at 6:46 AM, Wido den Hollander <[email protected]> wrote:

>
> > On 2 January 2017 at 15:43, Matteo Dacrema <[email protected]> wrote:
> >
> >
> > Increasing pg_num leads to several slow requests and a cluster freeze,
> > but from what I’ve seen so far, that is caused by the PG creation
> > operation. During the creation period all requests are frozen, and the
> > creation period takes a long time even for 128 PGs.
> >
> > I’ve observed that during the creation period most of the OSDs run at
> > 100% of their performance capacity. I think that with no operations
> > running in the cluster I’ll be able to increase pg_num quickly, without
> > causing downtime several times.
> >
>
> First, slowly increase pg_num to the number you want, then increase
> pgp_num in small baby steps as well.
>
> Wido
>

As Wido mentioned, low and slow is the way to go for production environments:
increase in small increments.

pg_num increases should be fairly transparent to client IO, but test first
by raising pg_num on your pool in progressively larger steps. pgp_num
increases will cause client interruption in a lot of cases, so this is what
you'll need to be wary of.

Here's some select logic from a quick and dirty script I wrote for our last
PG increase job; maybe it will help in your endeavors:

https://gist.github.com/oddomatik/7cca9b64d7b13d17e800cc35894037ac
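
For anyone who can't reach the gist, here is a rough Python sketch of the
low-and-slow approach described above. It is not the code from the gist. It
raises pg_num to the target in small steps, then does the same for pgp_num,
waiting for the cluster to settle after every step. The step size, the polling
interval, and the use of `ceph health` reporting HEALTH_OK as the "settled"
signal are assumptions made for illustration, and the helper names are
hypothetical.

```python
import subprocess
import time
from typing import Callable, Iterator, List


def step_targets(current: int, target: int, step: int) -> Iterator[int]:
    """Yield intermediate values from current up to target in fixed increments."""
    if step <= 0:
        raise ValueError("step must be positive")
    while current < target:
        current = min(current + step, target)
        yield current


def run_ceph(args: List[str]) -> None:
    """Run one ceph CLI command and raise if it exits non-zero."""
    subprocess.run(["ceph", *args], check=True)


def cluster_settled() -> bool:
    """Assumption: HEALTH_OK from `ceph health` means it is safe to continue."""
    result = subprocess.run(
        ["ceph", "health"], check=True, capture_output=True, text=True
    )
    return result.stdout.startswith("HEALTH_OK")


def grow_pool(
    pool: str,
    current: int,
    target: int,
    step: int,
    run: Callable[[List[str]], None] = run_ceph,
    settled: Callable[[], bool] = cluster_settled,
    poll_seconds: float = 30.0,
) -> None:
    """Raise pg_num to target in small steps, then pgp_num the same way."""
    # pg_num goes all the way up first; pgp_num follows afterwards.
    for key in ("pg_num", "pgp_num"):
        for value in step_targets(current, target, step):
            run(["osd", "pool", "set", pool, key, str(value)])
            # Wait for the cluster to settle before taking the next step.
            while not settled():
                time.sleep(poll_seconds)
```

A call might look like `grow_pool("volumes", 3072, 8192, 128)`, where the pool
name `volumes` and the step of 128 are placeholders. The `run` and `settled`
parameters exist so the loop can be dry-run with stand-ins before it touches a
real cluster.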


>
> > Matteo
> >
> > > On 02 Jan 2017, at 15:02, [email protected] wrote:
> > >
> > > Well, as the doc said:
> > >> Set or clear the pause flags in the OSD map. If set, no IO requests
> > >> will be sent to any OSD. Clearing the flags via unpause results in
> > >> resending pending requests.
> > > If you do that on a production cluster, that means your cluster will no
> > > longer be in production :)
> > >
> > > It depends on your needs: maybe you want to do this operation as
> > > fast as possible, or maybe you want to make it as transparent as
> > > possible from a user's point of view.
> > >
> > > You may have a look at osd_recovery_op_priority &
> > > osd_client_op_priority, they might be interesting for you
> > >
> > > On 02/01/2017 14:37, Matteo Dacrema wrote:
> > >> Hi All,
> > >>
> > >> What happens if I set the pause flag on a production cluster?
> > >> I mean, will all requests remain pending/waiting, or will all the
> > >> volumes attached to the VMs become read-only?
> > >>
> > >> I need to quickly increase the placement group count from 3072 to 8192,
> > >> or better to 165336, and I think doing it without client operations
> > >> will be much faster.
> > >>
> > >> Thanks
> > >> Regards
> > >> Matteo
> > >>
> > >>
> > >>
> > >>
> > >> _______________________________________________
> > >> ceph-users mailing list
> > >> [email protected]
> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >>
> > >
> > >
> >
>



-- 
Brian Andrus
Cloud Systems Engineer
DreamHost, LLC