Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-19 Thread Van Leeuwen, Robert
>> There is no way to fill up all disks evenly with the same number of
>> Bytes and then stop filling the small disks when they're full and
>> only continue filling the larger disks.

>This is possible by adjusting crush weights.  Initially the smaller
>drives are weighted more highly than larger drives.  As data gets added,
>the weights are changed so that larger drives continue to fill while no
>drive becomes overfull.

So IMHO, when you diverge from the default crush weights (e.g. for performance),
it is something you should either do permanently or with a very clear path to
what the next steps will be (e.g. we need to do this temporarily until the new
hardware comes in).

You gain some short-term performance while you have the space.
However, as the cluster gets fuller:
You will need to change the weights, which results in a lot of data movement,
which is a pain in itself, especially when the cluster is near its IO limits.
Finally, you still end up with the exact same (bad) performance, except it may
now arrive as a performance cliff instead of a gradual worsening over time.

A possible way to solve this would be to implement some mechanism that
reshuffles the data based on the IO patterns, so each OSD gets the same
IO pressure (or, even better, based on a new "IOPS" weight you can set).
You could then give each OSD the proper "size weight" and ceph would make
sure you get optimal IO performance by ensuring each disk has the proper
amount of "hot" data where the IOs happen.
But I guess that's a very hard thing to build properly.
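
Just to illustrate the idea (purely hypothetical, nothing like this exists
in ceph today): you could imagine periodically deriving an "iops weight"
per OSD from measured performance and feeding it into the override
reweight, roughly like:

   # hypothetical sketch; /tmp/osd_iops.txt holds one "<osd-id> <iops>"
   # pair per line (e.g. "12 150"), gathered however you like (fio, iostat, ...)
   max=$(awk '$2>m {m=$2} END {print m}' /tmp/osd_iops.txt)
   while read osd iops; do
       w=$(echo "scale=2; $iops / $max" | bc)
       # echo first and sanity-check before actually applying anything
       echo ceph osd reweight "$osd" "$w"
   done < /tmp/osd_iops.txt

The hard part, of course, is that this only scales how much data each OSD
holds; it still has no notion of which data is actually hot.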

Final note: if you only have SSDs in the cluster, the problem might not be
there, because bigger SSDs are usually also faster :)

Cheers,
Robert van Leeuwen





[ceph-users] osds with different disk sizes may killing performance

2018-04-18 Thread Chad William Seys
You'll find it said time and time again on the ML... avoid disks of
different sizes in the same cluster. It's a headache that sucks. It's
not impossible, it's not even overly hard to pull off... but it's
very easy to cause a mess and a lot of headaches. It will also make
it harder to diagnose performance issues in the cluster.

Not very practical for clusters which aren't new.


There is no way to fill up all disks evenly with the same number of
Bytes and then stop filling the small disks when they're full and
only continue filling the larger disks.


This is possible by adjusting crush weights.  Initially the smaller 
drives are weighted more highly (relative to their size) than the larger 
drives.  As data gets added, the weights are changed so that the larger 
drives continue to fill while no drive becomes overfull.
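
For example (the weights are illustrative, assuming the size-derived
defaults are roughly 3.6 for a 4TB drive and 7.3 for an 8TB drive):

   # early on: hold the 8TB osds down so small and large osds hold
   # roughly the same number of bytes
   ceph osd crush reweight osd.40 3.6
   # later, as the 4TB osds approach full, shift weight onto the 8TB osds
   ceph osd crush reweight osd.40 5.5
   # ...and eventually back to the full size-based weight
   ceph osd crush reweight osd.40 7.3
   # watch per-osd utilization while doing this
   ceph osd df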



What will happen if you are filling all disks evenly with Bytes
instead of % is that the small disks will get filled completely and
all writes to the cluster will block until you do something to reduce
the amount used on the full disks.
That means the crush weights were not adjusted correctly as the cluster 
filled.



but in this case you would have a steep drop-off in performance. When
you reach the fill level where small drives do not accept more data,
you would suddenly have a performance cliff where only your larger disks
are doing new writes, and only the larger disks are doing reads on new data.


Good point!  Although if this is implemented by changing crush weights, 
adjusting the weights as the cluster fills will cause the data to churn, 
so new data will not be assigned only to the larger drives. :)


Chad.



Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-13 Thread David Turner
You'll find it said time and time again on the ML... avoid disks of
different sizes in the same cluster.  It's a headache that sucks.  It's not
impossible, it's not even overly hard to pull off... but it's very easy to
cause a mess and a lot of headaches.  It will also make it harder to
diagnose performance issues in the cluster.

There is no way to fill up all disks evenly with the same number of Bytes
and then stop filling the small disks when they're full and only continue
filling the larger disks.  What will happen if you are filling all disks
evenly with Bytes instead of % is that the small disks will get filled
completely and all writes to the cluster will block until you do something
to reduce the amount used on the full disks.

On Fri, Apr 13, 2018 at 1:28 AM Ronny Aasen wrote:

> On 13. april 2018 05:32, Chad William Seys wrote:
> > Hello,
> > I think your observations suggest that, to a first approximation,
> > filling drives with bytes to the same absolute level is better for
> > performance than filling drives to the same percentage full. Assuming
> > random distribution of PGs, this would cause the smallest drives to be
> > as active as the largest drives.
> > E.g. if every drive had 1TB of data, each would be equally likely to
> > contain the PG of interest.
> > Of course, as more data was added the smallest drives could not hold
> > more and the larger drives would become more active, but at least the
> > smaller drives would be as active as possible.
>
> but in this case you would have a steep drop-off in performance. When
> you reach the fill level where small drives do not accept more data,
> you would suddenly have a performance cliff where only your larger disks
> are doing new writes, and only the larger disks are doing reads on new data.
>
>
> it is also easier to make the logical connection while you are
> installing new nodes/disks than a year later when your cluster just
> happens to reach that fill level.
>
> it would also be an easier job to balance disks between nodes when you
> are adding OSDs anyway and the new ones are mostly empty, rather than
> when your small OSDs are full and your large disks already hold
> significant data.
>
>
>
> kind regards
> Ronny Aasen


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-12 Thread Ronny Aasen

On 13. april 2018 05:32, Chad William Seys wrote:

Hello,
   I think your observations suggest that, to a first approximation,
filling drives with bytes to the same absolute level is better for
performance than filling drives to the same percentage full. Assuming
random distribution of PGs, this would cause the smallest drives to be
as active as the largest drives.
   E.g. if every drive had 1TB of data, each would be equally likely to
contain the PG of interest.
   Of course, as more data was added the smallest drives could not hold
more and the larger drives would become more active, but at least the
smaller drives would be as active as possible.


but in this case you would have a steep drop-off in performance. When
you reach the fill level where small drives do not accept more data,
you would suddenly have a performance cliff where only your larger disks
are doing new writes, and only the larger disks are doing reads on new data.



it is also easier to make the logical connection while you are
installing new nodes/disks than a year later when your cluster just
happens to reach that fill level.


it would also be an easier job to balance disks between nodes when you
are adding OSDs anyway and the new ones are mostly empty, rather than
when your small OSDs are full and your large disks already hold
significant data.




kind regards
Ronny Aasen


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-12 Thread Chad William Seys

Hello,
  I think your observations suggest that, to a first approximation,
filling drives with bytes to the same absolute level is better for
performance than filling drives to the same percentage full. Assuming
random distribution of PGs, this would cause the smallest drives to be
as active as the largest drives.
  E.g. if every drive had 1TB of data, each would be equally likely to
contain the PG of interest.
  Of course, as more data was added the smallest drives could not hold
more and the larger drives would become more active, but at least the
smaller drives would be as active as possible.


Thanks!
Chad.



Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-12 Thread Steve Taylor
I can't comment directly on the relation XFS fragmentation has to Bluestore, 
but I had a similar issue probably 2-3 years ago where XFS fragmentation was 
causing a significant degradation in cluster performance. The use case was RBDs 
with lots of snapshots created and deleted at regular intervals. XFS got pretty 
severely fragmented and the cluster slowed down quickly.

The solution I found was to set the XFS allocsize to match the RBD object size 
via osd_mount_options_xfs. Of course I also had to defragment XFS to clear up 
the existing fragmentation, but that was fairly painless. XFS fragmentation 
hasn't been an issue since. That solution isn't as applicable in an object 
store use case where the object size is more variable, but increasing the XFS 
allocsize could still help.
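
In case it's useful, the relevant bits looked roughly like this (device and
mount point names are just examples):

   # ceph.conf, filestore OSDs: match allocsize to the 4MB default RBD object size
   [osd]
   osd_mount_options_xfs = rw,noatime,inode64,allocsize=4M

   # check and clear existing fragmentation on one OSD
   xfs_db -c frag -r /dev/sdb1
   xfs_fsr -v /var/lib/ceph/osd/ceph-12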

As far as Bluestore goes, I haven't deployed it in production yet, but I would 
expect that manipulating bluestore_min_alloc_size in a similar fashion would 
yield similar benefits. Of course you are then wasting some disk space for 
every object that ends up being smaller than that allocation size in both 
cases. That's the trade-off.
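
I.e. something along these lines (untested on my side, and note that it only
takes effect for OSDs created after the change, since the allocation size is
fixed when the OSD is created):

   [osd]
   # overrides the per-device-class defaults
   # (bluestore_min_alloc_size_hdd / bluestore_min_alloc_size_ssd) when non-zero
   bluestore_min_alloc_size = 65536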






Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799






On Thu, 2018-04-12 at 04:13 +0200, Marc Roos wrote:


Is that not obvious? The 8TB is handling twice as much as the 4TB. AFAIK
there is no linear relationship between the iops of a disk and its size.


But this xfs defragmentation is interesting; how does it
relate/compare to bluestore?





-Original Message-
From: Yao Zongyou [mailto:yaozong...@outlook.com]
Sent: donderdag 12 april 2018 4:36
To: ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: *SPAM* [ceph-users] osds with different disk sizes may
killing performance
Importance: High

Hi,

For anybody who may be interested, here I share the process of locating
the reason for a ceph cluster performance slowdown in our environment.

Internally, we have a cluster with 1.1PB capacity, 800TB used, and about
500TB of raw user data. Each day, 3TB of data is uploaded and the oldest
3TB is lifecycled (we are using the s3 object store, and bucket lifecycle
is enabled). As time went by, the cluster became somewhat slower, and we
suspected xfs fragmentation was the culprit.

After some testing, we did find that xfs fragmentation slows down
filestore's performance: for example, at 15% fragmentation the performance
is 85% of the original, and at 25% it is 74.73% of the original.

But the main reason for our cluster's deterioration in performance is not
xfs fragmentation.

Initially, our ceph cluster contained only osds with 4TB disks. As time
went by, we scaled out the cluster by adding new osds with 8TB disks. As
the new disks' capacity is double that of the old disks, each new osd's
weight is double that of an old osd, so a new osd holds double the pgs
and uses double the disk space of an old osd. Everything looks good and
fine.

But even though a new osd has double the capacity of an old osd, its
performance is not double that of an old osd. After digging into our
internal system stats, we found that the newly added disks' io util is
about two times that of the old ones, and from time to time the new
disks' io util rises to 100%. The newly added osds are the performance
killer: they slow down the whole cluster's performance.

Once the reason was found, the solution was very simple. After lowering
the newly added osds' weight, the annoying slow request warnings died
away.

So the conclusion is: in a cluster with different osd disk sizes, an
osd's weight should not be determined only by its capacity; we should
also take its performance into account.

Best wishes,
Yao Zongyou


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-12 Thread ulembke

Hi,
you can also set the primary_affinity to 0.5 on the 8TB disks to lower
the read load on them (that way you don't waste 50% of their space, as
you would by halving the crush weight).
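
For example (the osd id is a placeholder):

   ceph osd primary-affinity osd.40 0.5
   # on older releases the mons may reject this unless you first set
   # mon_osd_allow_primary_affinity = true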



Udo

On 2018-04-12 04:36, Yao Zongyou wrote:

Hi, 

For anybody who may be interested, here I share the process of locating
the reason for a ceph cluster performance slowdown in our environment.

Internally, we have a cluster with 1.1PB capacity, 800TB used, and about
500TB of raw user data. Each day, 3TB of data is uploaded and the oldest
3TB is lifecycled (we are using the s3 object store, and bucket lifecycle
is enabled). As time went by, the cluster became somewhat slower, and we
suspected xfs fragmentation was the culprit.

After some testing, we did find that xfs fragmentation slows down
filestore's performance: for example, at 15% fragmentation the performance
is 85% of the original, and at 25% it is 74.73% of the original.

But the main reason for our cluster's deterioration in performance is not
xfs fragmentation.

Initially, our ceph cluster contained only osds with 4TB disks. As time
went by, we scaled out the cluster by adding new osds with 8TB disks. As
the new disks' capacity is double that of the old disks, each new osd's
weight is double that of an old osd, so a new osd holds double the pgs
and uses double the disk space of an old osd. Everything looks good and
fine.

But even though a new osd has double the capacity of an old osd, its
performance is not double that of an old osd. After digging into our
internal system stats, we found that the newly added disks' io util is
about two times that of the old ones, and from time to time the new
disks' io util rises to 100%. The newly added osds are the performance
killer: they slow down the whole cluster's performance.

Once the reason was found, the solution was very simple. After lowering
the newly added osds' weight, the annoying slow request warnings died
away.

So the conclusion is: in a cluster with different osd disk sizes, an
osd's weight should not be determined only by its capacity; we should
also take its performance into account.

Best wishes,
Yao Zongyou


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-12 Thread Marc Roos
 
Is that not obvious? The 8TB is handling twice as much as the 4TB. AFAIK
there is no linear relationship between the iops of a disk and its size.


But this xfs defragmentation is interesting; how does it
relate/compare to bluestore?





-Original Message-
From: Yao Zongyou [mailto:yaozong...@outlook.com]
Sent: donderdag 12 april 2018 4:36
To: ceph-users@lists.ceph.com
Subject: *SPAM* [ceph-users] osds with different disk sizes may 
killing performance
Importance: High

Hi, 

For anybody who may be interested, here I share the process of locating
the reason for a ceph cluster performance slowdown in our environment.

Internally, we have a cluster with 1.1PB capacity, 800TB used, and about
500TB of raw user data. Each day, 3TB of data is uploaded and the oldest
3TB is lifecycled (we are using the s3 object store, and bucket lifecycle
is enabled). As time went by, the cluster became somewhat slower, and we
suspected xfs fragmentation was the culprit.

After some testing, we did find that xfs fragmentation slows down
filestore's performance: for example, at 15% fragmentation the performance
is 85% of the original, and at 25% it is 74.73% of the original.

But the main reason for our cluster's deterioration in performance is not
xfs fragmentation.

Initially, our ceph cluster contained only osds with 4TB disks. As time
went by, we scaled out the cluster by adding new osds with 8TB disks. As
the new disks' capacity is double that of the old disks, each new osd's
weight is double that of an old osd, so a new osd holds double the pgs
and uses double the disk space of an old osd. Everything looks good and
fine.

But even though a new osd has double the capacity of an old osd, its
performance is not double that of an old osd. After digging into our
internal system stats, we found that the newly added disks' io util is
about two times that of the old ones, and from time to time the new
disks' io util rises to 100%. The newly added osds are the performance
killer: they slow down the whole cluster's performance.

Once the reason was found, the solution was very simple. After lowering
the newly added osds' weight, the annoying slow request warnings died
away.

So the conclusion is: in a cluster with different osd disk sizes, an
osd's weight should not be determined only by its capacity; we should
also take its performance into account.

Best wishes,
Yao Zongyou


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-12 Thread Konstantin Shalygin

On 04/12/2018 11:21 AM, 宗友 姚 wrote:

Currently, this can only be done by hand. Maybe we need some scripts to handle 
this automatically.



Mixed hosts, i.e. half old disks + half new disks per host, would be better 
than separate "old hosts" and "new hosts" in your case.




k


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-12 Thread Wido den Hollander


On 04/12/2018 04:36 AM, Yao Zongyou wrote:
> Hi, 
> 
> For anybody who may be interested, here I share the process of locating the 
> reason for a ceph cluster performance slowdown in our environment.
> 
> Internally, we have a cluster with 1.1PB capacity, 800TB used, and about 500TB 
> of raw user data. Each day, 3TB of data is uploaded and the oldest 3TB is 
> lifecycled (we are using the s3 object store, and bucket lifecycle is enabled). 
> As time went by, the cluster became somewhat slower, and we suspected xfs 
> fragmentation was the culprit.
> 
> After some testing, we did find that xfs fragmentation slows down filestore's 
> performance: for example, at 15% fragmentation the performance is 85% of the 
> original, and at 25% it is 74.73% of the original.
> 
> But the main reason for our cluster's deterioration in performance is not xfs 
> fragmentation.
> 
> Initially, our ceph cluster contained only osds with 4TB disks. As time went 
> by, we scaled out the cluster by adding new osds with 8TB disks. As the new 
> disks' capacity is double that of the old disks, each new osd's weight is 
> double that of an old osd, so a new osd holds double the pgs and uses double 
> the disk space of an old osd. Everything looks good and fine.
> 
> But even though a new osd has double the capacity of an old osd, its 
> performance is not double that of an old osd. After digging into our internal 
> system stats, we found that the newly added disks' io util is about two times 
> that of the old ones, and from time to time the new disks' io util rises to 
> 100%. The newly added osds are the performance killer: they slow down the 
> whole cluster's performance.
> 
> Once the reason was found, the solution was very simple. After lowering the 
> newly added osds' weight, the annoying slow request warnings died away.
> 

This is to be expected. However, lowering the weight of new disks means
that you can't fully use their storage capacity.

This is the nature of having a heterogeneous cluster with Ceph.
Different disks of different sizes mean that performance will fluctuate.

Wido

> So the conclusion is: in a cluster with different osd disk sizes, an osd's 
> weight should not be determined only by its capacity; we should also take 
> its performance into account.
> Best wishes,
> Yao Zongyou


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread 宗友 姚
Currently, this can only be done by hand. Maybe we need some scripts to handle 
this automatically.
I don't know if 
https://github.com/ceph/ceph/tree/master/src/pybind/mgr/balancer can handle 
this.
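
For completeness, enabling it looks something like this, though as far as I 
can tell it only balances data/PG placement against the crush weights, not 
io load:

   ceph mgr module enable balancer
   ceph balancer mode crush-compat    # or 'upmap' with luminous+ clients
   ceph balancer on
   ceph balancer status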


From: Konstantin Shalygin <k0...@k0ste.ru>
Sent: Thursday, April 12, 2018 12:00
To: ceph-users@lists.ceph.com
Cc: Yao Zongyou
Subject: Re: [ceph-users] osds with different disk sizes may killing performance

On 04/12/2018 10:58 AM, Yao Zongyou wrote:
> Yes, according to the crush algorithm, large drives are given a high weight; 
> this is expected. By default, crush gives no consideration to each drive's 
> performance, which can leave the io load unbalanced, and the osd with the 
> highest io util may slow down the whole cluster.


You can control how much a drive is used by adjusting its crush weight.




k


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread Konstantin Shalygin

On 04/12/2018 10:58 AM, Yao Zongyou wrote:

Yes, according to the crush algorithm, large drives are given a high weight; 
this is expected. By default, crush gives no consideration to each drive's 
performance, which can leave the io load unbalanced, and the osd with the 
highest io util may slow down the whole cluster.



You can control how much a drive is used by adjusting its crush weight.




k


Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread Yao Zongyou
Yes, according to the crush algorithm, large drives are given a high weight; 
this is expected. By default, crush gives no consideration to each drive's 
performance, which can leave the io load unbalanced, and the osd with the 
highest io util may slow down the whole cluster.


From: Konstantin Shalygin <k0...@k0ste.ru>
Sent: Thursday, April 12, 2018 11:29
To: ceph-users@lists.ceph.com
Cc: Yao Zongyou
Subject: Re: [ceph-users] osds with different disk sizes may killing performance

> After digging into our internal system stats, we found that the newly added 
> disks' io util is about two times that of the old ones.

This is obvious and expected. Your 8TB drives are weighted double against the
4TB ones and do *double* the crush work in comparison.




k



Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread Konstantin Shalygin

After digging into our internal system stats, we found that the newly added 
disks' io util is about two times that of the old ones.


This is obvious and expected. Your 8TB drives are weighted double against the 
4TB ones and do *double* the crush work in comparison.





k



[ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread Yao Zongyou
Hi, 

For anybody who may be interested, here I share the process of locating the 
reason for a ceph cluster performance slowdown in our environment.

Internally, we have a cluster with 1.1PB capacity, 800TB used, and about 500TB 
of raw user data. Each day, 3TB of data is uploaded and the oldest 3TB is 
lifecycled (we are using the s3 object store, and bucket lifecycle is enabled). 
As time went by, the cluster became somewhat slower, and we suspected xfs 
fragmentation was the culprit.

After some testing, we did find that xfs fragmentation slows down filestore's 
performance: for example, at 15% fragmentation the performance is 85% of the 
original, and at 25% it is 74.73% of the original.

But the main reason for our cluster's deterioration in performance is not xfs 
fragmentation.

Initially, our ceph cluster contained only osds with 4TB disks. As time went 
by, we scaled out the cluster by adding new osds with 8TB disks. As the new 
disks' capacity is double that of the old disks, each new osd's weight is 
double that of an old osd, so a new osd holds double the pgs and uses double 
the disk space of an old osd. Everything looks good and fine.

But even though a new osd has double the capacity of an old osd, its 
performance is not double that of an old osd. After digging into our internal 
system stats, we found that the newly added disks' io util is about two times 
that of the old ones, and from time to time the new disks' io util rises to 
100%. The newly added osds are the performance killer: they slow down the 
whole cluster's performance.

Once the reason was found, the solution was very simple. After lowering the 
newly added osds' weight, the annoying slow request warnings died away.

So the conclusion is: in a cluster with different osd disk sizes, an osd's 
weight should not be determined only by its capacity; we should also take 
its performance into account.
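
For concreteness, that kind of adjustment looks like this (the numbers are 
illustrative rather than our exact values; 'ceph osd reweight' with a 0-1 
override weight is the other option):

   # pull an 8TB osd's crush weight down from its size-derived ~7.3
   # towards the 4TB osds' ~3.6, then watch io util and %USE settle
   ceph osd crush reweight osd.40 5.0
   ceph osd df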

Best wishes,
Yao Zongyou