Re: [ceph-users] Migration from "classless pre luminous" to "device classes" CRUSH.

2018-02-03 Thread Konstantin Shalygin

The migration completed flawlessly, with no issues and no slow requests.
Thanks.



k


Re: [ceph-users] Migration from "classless pre luminous" to "device classes" CRUSH.

2018-02-01 Thread Konstantin Shalygin

On 02/01/2018 08:56 PM, David Turner wrote:
You can attempt to mitigate this by creating new, duplicate rules and
changing one pool at a time to start using them.



Yes, I'm already prepared to use this strategy.



k


Re: [ceph-users] Migration from "classless pre luminous" to "device classes" CRUSH.

2018-02-01 Thread David Turner
It doesn't matter what your failure domain is; changing your crush rules to
use device classes causes significant data movement. You can attempt to
mitigate this by creating new, duplicate rules and changing one pool at a
time to start using them. That way you can somewhat control the
backfilling, unless you have one pool holding the vast majority of the data.
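
A minimal sketch of that per-pool approach on a Luminous cluster; the root
"default", the failure domain "rack", and the rule/pool names
"replicated_hdd" / "mypool" are placeholders, not values from this thread:

   # Create a duplicate replicated rule restricted to the hdd device class
   ceph osd crush rule create-replicated replicated_hdd default rack hdd

   # Switch one pool at a time to the new rule and let backfill settle
   # before moving the next pool
   ceph osd pool set mypool crush_rule replicated_hdd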

On Thu, Feb 1, 2018, 5:31 AM Konstantin Shalygin  wrote:

>
> > We had MASSIVE data movement upon changing the crush rules to
> > device-class-based ones. I'm not sure about the exact reasons, but I
> > assume that the order of hosts in the crush tree has changed (hosts are
> > ordered lexically now...), which resulted in about 80% of the data being
> > moved around.
>
> What is your failure domain? Host, I think?
>
> This cluster's failure domain is rack.
>
>
>
> k
>


Re: [ceph-users] Migration from "classless pre luminous" to "device classes" CRUSH.

2018-02-01 Thread Konstantin Shalygin



We had MASSIVE data movement upon changing the crush rules to
device-class-based ones. I'm not sure about the exact reasons, but I assume
that the order of hosts in the crush tree has changed (hosts are ordered
lexically now...), which resulted in about 80% of the data being moved around.


What is your failure domain? Host, I think?

This cluster's failure domain is rack.
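
For reference, a quick way to double-check a rule's failure domain (rule name
below is a placeholder) is to dump the rule and look at the "type" in its
chooseleaf step:

   ceph osd crush rule dump replicated_rule
   # the "type" field of the "chooseleaf_firstn" step, e.g. "rack" or
   # "host", is the failure domain that rule uses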



k



Re: [ceph-users] Migration from "classless pre luminous" to "device classes" CRUSH.

2018-02-01 Thread Burkhard Linke

Hi,


On 02/01/2018 10:43 AM, Konstantin Shalygin wrote:


Hi cephers.


I have a typical double-root CRUSH map - one root for NVMe pools and one for
HDD pools - created on a Kraken cluster (what I mean:
http://cephnotes.ksperis.com/blog/2015/02/02/crushmap-example-of-a-hierarchical-cluster-map).


The cluster has now been upgraded to Luminous, I'm going to move to
device-class CRUSH rules, and I'm looking for others' experience.


1. Is enabling a new crush rule with device classes safe for data and clients?

2. How much data movement should I expect? Should I be ready for slow requests?



We have changed our similar setup to a device-class-based one. According to
the documentation, device classes are implemented via 'shadow' crush trees;
'ceph osd crush tree --show-shadow' displays all trees, including the
device-class-specific ones. This is what allows the device class setup to
remain backwards compatible with older releases.
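
For anyone checking this on their own cluster first, two read-only commands
that show the device classes and the shadow trees (neither makes any change):

   # List the device classes assigned to the OSDs
   ceph osd crush class ls

   # Show the full CRUSH tree, including the per-class shadow trees
   ceph osd crush tree --show-shadow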


We had MASSIVE data movement upon changing the crush rules to
device-class-based ones. I'm not sure about the exact reasons, but I assume
that the order of hosts in the crush tree has changed (hosts are ordered
lexically now...), which resulted in about 80% of the data being moved around.


So be prepared for slow requests, and set the corresponding 
configuration values to reduce the backfill impact.
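
A minimal sketch of the kind of throttling meant here, applied at runtime via
injectargs; the values are illustrative, not recommendations:

   # Limit concurrent backfills and recovery ops per OSD, and add a small
   # sleep between recovery ops to reduce the client impact
   ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-sleep 0.1'

   # Optionally set norebalance while editing the rules, then unset it to
   # let the backfill start in a controlled way
   ceph osd set norebalance
   ceph osd unset norebalance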


Regards,
Burkhard