Thanks Dan,

After talking it through, we've decided to adopt your approach too and
leave the tunables until after the upgrade.

Regards,
Adrian.

On Mon, May 14, 2018 at 5:14 PM, Dan van der Ster <d...@vanderster.com>
wrote:

> Hi Adrian,
>
> Is there a strict reason why you *must* upgrade the tunables?
>
> It is normally OK to run with old (e.g. hammer) tunables on a luminous
> cluster. The crush placement won't be state of the art, but that's not
> a huge problem.
>
> We have a lot of data in a jewel cluster with hammer tunables. We'll
> upgrade that to luminous soon, but don't plan to set chooseleaf_stable
> until there's a less disruptive procedure, e.g. [1].
>
> Cheers, Dan
>
> [1] One idea I had to make this much less disruptive would be to
> script something that uses upmaps to lock all PGs into their current
> placement, then set chooseleaf_stable, then gradually remove the
> upmaps. There are some details to work out, and it requires all
> clients to be running luminous, but I think something like this could
> help...
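>
> Roughly, the procedure I have in mind looks like the sketch below. It's
> untested and the commands are from memory, so treat it as an outline
> only; step 4 still needs some glue scripting, and it assumes
> require-min-compat-client is already luminous so upmap is allowed:
>
>   # 1. Grab the current osdmap and crushmap.
>   ceph osd getmap -o osdmap.current
>   ceph osd getcrushmap -o crush.current
>
>   # 2. Build a candidate crushmap with chooseleaf_stable=1 and an osdmap
>   #    that uses it, without touching the live cluster yet.
>   crushtool -i crush.current --set-chooseleaf-stable 1 -o crush.stable
>   cp osdmap.current osdmap.stable
>   osdmaptool osdmap.stable --import-crush crush.stable
>
>   # 3. Dump the PG->OSD mappings for both maps and diff them offline.
>   osdmaptool osdmap.current --test-map-pgs-dump > pgs.current
>   osdmaptool osdmap.stable --test-map-pgs-dump > pgs.stable
>
>   # 4. For every PG whose up set would change, add an upmap pinning it
>   #    to today's placement (pairs are "new-osd old-osd"; generating
>   #    them from the two dumps is the part still to be scripted):
>   #      ceph osd pg-upmap-items <pgid> <new-osd> <old-osd> [...]
>
>   # 5. Now set chooseleaf_stable for real - nothing should move, since
>   #    the upmaps cancel out the new placements.
>   ceph osd setcrushmap -i crush.stable
>
>   # 6. Later, remove the pins a few PGs at a time and let backfill
>   #    catch up:
>   #      ceph osd rm-pg-upmap-items <pgid>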
>
>
>
>
> On Mon, May 14, 2018 at 9:01 AM, Adrian <aussie...@gmail.com> wrote:
> > Hi all,
> >
> > We recently upgraded our old ceph cluster to jewel (5x mon, 21x storage
> > hosts, each with 9x 6TB filestore OSDs and 3x SSDs holding 3 journals
> > each) - it's mostly used for openstack compute/cinder.
> >
> > To get there we had to go with chooseleaf_vary_r = 4 in order to
> > minimize client impact and save time. We now need to get to luminous,
> > and we're on a deadline, so time is limited.
> >
> > Current tunables are:
> >   {
> >       "choose_local_tries": 0,
> >       "choose_local_fallback_tries": 0,
> >       "choose_total_tries": 50,
> >       "chooseleaf_descend_once": 1,
> >       "chooseleaf_vary_r": 4,
> >       "chooseleaf_stable": 0,
> >       "straw_calc_version": 1,
> >       "allowed_bucket_algs": 22,
> >       "profile": "unknown",
> >       "optimal_tunables": 0,
> >       "legacy_tunables": 0,
> >       "minimum_required_version": "firefly",
> >       "require_feature_tunables": 1,
> >       "require_feature_tunables2": 1,
> >       "has_v2_rules": 0,
> >       "require_feature_tunables3": 1,
> >       "has_v3_rules": 0,
> >       "has_v4_buckets": 0,
> >       "require_feature_tunables5": 0,
> >       "has_v5_rules": 0
> >   }
> >
> > Setting chooseleaf_stable to 1, the crush compare tool says:
> >
> >   Replacing the crushmap specified with --origin with the crushmap
> >   specified with --destination will move 8774 PGs (59.08417508417509%
> >   of the total) from one item to another.
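> >
> > For context, the comparison workflow is roughly the following (from
> > memory, so the exact crush compare invocation may differ by version;
> > the tunables dump above is just "ceph osd crush show-tunables"):
> >
> >   ceph osd getcrushmap -o crush.current
> >   crushtool -i crush.current --set-chooseleaf-stable 1 -o crush.stable
> >   crush compare --origin crush.current --destination crush.stable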
> >
> > Current tunings we have in ceph.conf are:
> >   #THROTTLING CEPH
> >   osd_max_backfills = 1
> >   osd_recovery_max_active = 1
> >   osd_recovery_op_priority = 1
> >   osd_client_op_priority = 63
> >
> >   #PERFORMANCE TUNING
> >   osd_op_threads = 6
> >   filestore_op_threads = 10
> >   filestore_max_sync_interval = 30
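> >
> > If it matters, we understand those throttles can also be adjusted on
> > the fly while the rebalance runs rather than only via ceph.conf -
> > something along these lines (from memory, so double-check the option
> > names for your version):
> >
> >   ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
> >   ceph tell osd.* injectargs '--osd-recovery-op-priority 1 --osd-client-op-priority 63'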
> >
> > I was wondering if anyone has advice on anything else we can do to
> > balance client impact against recovery speed, or any war stories about
> > other things to consider.
> >
> > I'm also wondering about the interplay between chooseleaf_vary_r and
> > chooseleaf_stable. Are we better off with:
> >
> > 1) sticking with chooseleaf_vary_r = 4, setting chooseleaf_stable = 1,
> >    upgrading, and then stepping chooseleaf_vary_r down to 1 incrementally
> >    (sketched below) when more time is available, or
> >
> > 2) stepping chooseleaf_vary_r down incrementally first, then setting
> >    chooseleaf_stable, and finally upgrading?
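> >
> > For the stepping we're assuming the usual offline crushmap edit - the
> > commands below are untested and from memory, repeated for vary_r = 3,
> > then 2, then 1, waiting for the cluster to settle in between:
> >
> >   ceph osd getcrushmap -o crush.bin
> >   crushtool -i crush.bin --set-chooseleaf-vary-r 3 -o crush.new
> >   ceph osd setcrushmap -i crush.new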
> >
> > All this bearing in mind that we'd like to keep the time it takes us to
> > get to luminous as short as possible ;-) (we're guesstimating a 59%
> > rebalance would take many days).
> >
> > Any advice/thoughts gratefully received.
> >
> > Regards,
> > Adrian.
> >
> > --
> > ---
> > Adrian : aussie...@gmail.com
> > If violence doesn't solve your problem, you're not using enough of it.
> >
>



-- 
---
Adrian : aussie...@gmail.com
If violence doesn't solve your problem, you're not using enough of it.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
