And some output from rest-bench:
2013-03-04 19:31:41.503865 min lat: 0.166207 max lat: 3.44611 avg lat: 0.911577
2013-03-04 19:31:41.503865  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
2013-03-04 19:31:41.503865   40      16       715       699   69.7985        64   1.54288  0.911577
2013-03-04 19:31:42.504218   41      16       721       705   68.6825        24  0.949049  0.909889
2013-03-04 19:31:43.504528   42      16       742       726   69.0462        84  0.566944    0.9164
2013-03-04 19:31:44.504857   43      16       761       745   69.2071        76   1.17317  0.919921
2013-03-04 19:31:45.505099   44      16       766       750   68.0899        20   1.23423  0.918905
2013-03-04 19:31:46.506975   45      16       785       769   68.2626        76  0.711296   0.92321
2013-03-04 19:31:47.507964   46      16       794       778   67.5607        36   1.79786  0.926638
2013-03-04 19:31:48.508148   47      16       812       796   67.6548        72  0.847533  0.930029
2013-03-04 19:31:49.508347   48      16       829       813   67.6617        68  0.807918  0.940498
2013-03-04 19:31:50.508547   49      16       840       824   67.1792        44   0.95126  0.938767
2013-03-04 19:31:51.508753   50      16       858       842   67.2752        72  0.711993  0.937664
2013-03-04 19:31:52.509076   51      13       859       846   66.2706        16   1.49896  0.939526
2013-03-04 19:31:53.509662 Total time run: 51.235707
Total writes made: 859
Write size: 4194304
Bandwidth (MB/sec): 67.063
Stddev Bandwidth: 22.35
Max bandwidth (MB/sec): 100
Min bandwidth (MB/sec): 0
Average Latency: 0.951978
Stddev Latency: 0.456654
Max latency: 3.44611
Min latency: 0.166207
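As a quick sanity check, the reported aggregate bandwidth follows directly from the summary totals above (a minimal sketch; the constants are copied straight from the rest-bench summary):

```python
# Recompute rest-bench's aggregate bandwidth from the summary totals:
# bandwidth = (writes * write_size) / total_time.
total_writes = 859            # "Total writes made"
write_size_bytes = 4194304    # "Write size" (4 MiB objects)
total_time_s = 51.235707      # "Total time run"

bandwidth_mb_s = total_writes * write_size_bytes / (1 << 20) / total_time_s
print(round(bandwidth_mb_s, 3))   # 67.063, matching the reported figure
```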
On Mon, Mar 4, 2013 at 6:42 PM, Sławomir Skowron <[email protected]> wrote:
> Alone (one of the slow OSDs in the triple mentioned earlier):
>
> 2013-03-04 18:39:27.683035 osd.23 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 15.241943 sec at 68795 KB/sec
>
> In a for loop (some slow requests appear):
>
> for x in `seq 0 25`; do ceph osd tell $x bench;done
> 2013-03-04 18:41:08.259454 osd.12 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.658448 sec at 27844 KB/sec
> 2013-03-04 18:41:07.850213 osd.5 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.402402 sec at 28034 KB/sec
> 2013-03-04 18:41:07.850231 osd.11 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.201831 sec at 28186 KB/sec
> 2013-03-04 18:41:08.100186 osd.10 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.540605 sec at 27931 KB/sec
> 2013-03-04 18:41:08.319766 osd.21 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.532806 sec at 27937 KB/sec
> 2013-03-04 18:41:08.415835 osd.14 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.772730 sec at 27760 KB/sec
> 2013-03-04 18:41:08.775264 osd.9 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.195523 sec at 27452 KB/sec
> 2013-03-04 18:41:08.808824 osd.6 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.338387 sec at 27350 KB/sec
> 2013-03-04 18:41:08.923809 osd.19 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.177933 sec at 27465 KB/sec
> 2013-03-04 18:41:08.925848 osd.18 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.201476 sec at 27448 KB/sec
> 2013-03-04 18:41:08.936961 osd.15 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.273058 sec at 27397 KB/sec
> 2013-03-04 18:41:08.619022 osd.20 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.713017 sec at 27804 KB/sec
> 2013-03-04 18:41:08.764705 osd.22 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.954886 sec at 27626 KB/sec
> 2013-03-04 18:41:08.499156 osd.0 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.035553 sec at 27568 KB/sec
> 2013-03-04 18:41:07.873457 osd.2 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.489969 sec at 27969 KB/sec
> 2013-03-04 18:41:08.134530 osd.13 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.513056 sec at 27952 KB/sec
> 2013-03-04 18:41:08.219142 osd.1 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.856368 sec at 27698 KB/sec
> 2013-03-04 18:41:08.485806 osd.4 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.060621 sec at 27550 KB/sec
> 2013-03-04 18:41:08.612236 osd.7 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.122105 sec at 27505 KB/sec
> 2013-03-04 18:41:08.647494 osd.8 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.134885 sec at 27496 KB/sec
> 2013-03-04 18:41:08.649267 osd.3 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.961966 sec at 27621 KB/sec
> 2013-03-04 18:41:08.943610 osd.24 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.091272 sec at 27527 KB/sec
> 2013-03-04 18:41:08.975838 osd.17 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.270884 sec at 27398 KB/sec
> 2013-03-04 18:41:09.544561 osd.23 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.715030 sec at 27084 KB/sec
> 2013-03-04 18:41:08.969981 osd.16 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 38.287596 sec at 27386 KB/sec
> 2013-03-04 18:41:09.533789 osd.25 [INF] bench: wrote 1024 MB in blocks
> of 4096 KB in 37.954333 sec at 27627 KB/sec
>
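(As an aside for anyone reading along: sorting output like the above by throughput makes slow OSDs stand out immediately. A minimal sketch; the sample lines below are copied from the log format above, and the script only assumes the `[INF] bench: ... at NNNNN KB/sec` shape:)

```python
import re

# Parse '[INF] bench: ... at NNNNN KB/sec' lines (format as in the log
# above) and list OSDs from slowest to fastest.
sample = """\
2013-03-04 18:41:08.259454 osd.12 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 37.658448 sec at 27844 KB/sec
2013-03-04 18:41:09.544561 osd.23 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 38.715030 sec at 27084 KB/sec
2013-03-04 18:41:07.850213 osd.5 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 37.402402 sec at 28034 KB/sec
"""

pattern = re.compile(r"(osd\.\d+) .* at (\d+) KB/sec")
results = sorted(
    (int(m.group(2)), m.group(1))
    for m in map(pattern.search, sample.splitlines()) if m
)
for kb_s, osd in results:
    print(f"{osd}: {kb_s} KB/sec")   # slowest first
```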
> My XFS is a little fragmented, but performance is still good.
>
> On Mon, Mar 4, 2013 at 6:25 PM, Gregory Farnum <[email protected]> wrote:
>> On Mon, Mar 4, 2013 at 9:23 AM, Sławomir Skowron <[email protected]> wrote:
>>> On Mon, Mar 4, 2013 at 6:02 PM, Sage Weil <[email protected]> wrote:
>>>> On Mon, 4 Mar 2013, Sławomir Skowron wrote:
>>>>> OK, thanks for the response. But I have a crush map like the one
>>>>> in the attachment.
>>>>>
>>>>> All data should be balanced equally, not counting the hosts with 0.5 weight.
>>>>>
>>>>> How can I make the data auto-balance when I know that some PGs have
>>>>> too much data? I have 4800 PGs on RGW alone, with 78 OSDs; that
>>>>> should be quite enough.
>>>>>
>>>>> pool 3 '.rgw.buckets' rep size 3 crush_ruleset 0 object_hash rjenkins
>>>>> pg_num 4800 pgp_num 4800 last_change 908 owner 0
>>>>>
>>>>> When will it be possible to expand the number of PGs?
>>>>
>>>> Soon. :)
>>>>
>>>> The bigger question for me is why there is one PG that is getting pounded
>>>> while the others are not. Is there a large skew in the workload toward a
>>>> small number of very hot objects?
>>>
>>> Yes, there are constantly about 100-200 operations per second, all
>>> going to the RGW backend. But when the problems come there are more
>>> requests, more GETs and PUTs, because applications with short
>>> timeouts reconnect. Statistically, though, new PUTs are spread
>>> across many PGs, so this should not overload a single primary OSD.
>>> Maybe balancing reads across all replicas could help a little?
>>>
>>>> I expect it should be obvious if you go
>>>> to the loaded osd and do
>>>>
>>>> ceph --admin-daemon /var/run/ceph/ceph-osd.NN.asok dump_ops_in_flight
>>>>
>>>
>>> Yes, I did that, but such long operations only appear when the
>>> cluster becomes unstable. Normally there are no ops in the queue,
>>> only when the cluster is rebalancing, remapping, or the like.
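For watching that queue, a small helper: given the JSON that dump_ops_in_flight returns, print any op older than a threshold. This is only a sketch; the field names ('ops', 'age', 'description') reflect what OSD admin sockets emitted around this era and may differ between Ceph versions, and the sample document below is illustrative, not real cluster output:

```python
import json

def slow_ops(dump_json, threshold_s=1.0):
    """Return (age, description) for in-flight ops older than threshold.

    Field names ('ops', 'age', 'description') are assumptions based on
    this era of Ceph and may vary between versions.
    """
    doc = json.loads(dump_json)
    return [(op["age"], op["description"])
            for op in doc.get("ops", []) if op["age"] > threshold_s]

# Illustrative sample document (not real cluster output):
sample = json.dumps({
    "num_ops": 2,
    "ops": [
        {"description": "osd_op(client.4123.0:1 foo [write 0~4194304])",
         "age": 3.2},
        {"description": "osd_op(client.4123.0:2 bar [read 0~4096])",
         "age": 0.1},
    ],
})

for age, desc in slow_ops(sample):
    print(f"{age:.1f}s  {desc}")   # only the 3.2 s write is flagged
```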
>>
>> Have you checked the baseline disk performance of the OSDs? Perhaps
>> it's not that the PG is bad but that the OSDs are slow.
>
>
>
> --
> -----
> Pozdrawiam
>
> Sławek "sZiBis" Skowron
--
-----
Pozdrawiam
Sławek "sZiBis" Skowron