Re: [ceph-users] Can't get ceph mgr balancer to work (Luminous 12.2.4)

2018-05-27 Thread Linh Vu
I turned debug_mgr up to 4/5 and caught this while executing the plan. Apparently 
the command hits an error, yet the reply is "Success" even though nothing is 
actually done. I'm also not sure what the 'foo' part is doing there.
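
(For reference, the debug level can be raised at runtime through the mgr admin 
socket, roughly like this; the mgr id is just a placeholder for whichever mgr 
happens to be active:)

# ceph daemon mgr.$(hostname -s) config set debug_mgr 4/5
# ceph daemon mgr.$(hostname -s) config get debug_mgr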


2018-05-28 14:24:02.570822 7fc3f5ff7700  0 log_channel(audit) log [DBG] : 
from='client.282784 $IPADDRESS:0/2504876563' entity='client.admin' 
cmd=[{"prefix": "balancer execute", "plan": "mynewplan2", "target": ["mgr", 
""]}]: dispatch
2018-05-28 14:24:02.570858 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer status'
2018-05-28 14:24:02.570862 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer mode'
2018-05-28 14:24:02.570864 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer on'
2018-05-28 14:24:02.570866 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer off'
2018-05-28 14:24:02.570869 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer eval'
2018-05-28 14:24:02.570872 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer eval-verbose'
2018-05-28 14:24:02.570874 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer optimize'
2018-05-28 14:24:02.570876 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer show'
2018-05-28 14:24:02.570878 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer rm'
2018-05-28 14:24:02.570880 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer reset'
2018-05-28 14:24:02.570882 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer dump'
2018-05-28 14:24:02.570885 7fc3f5ff7700  1 mgr.server handle_command 
pyc_prefix: 'balancer execute'
2018-05-28 14:24:02.570886 7fc3f5ff7700  4 mgr.server handle_command passing 
through 3
2018-05-28 14:24:02.571183 7fc3f67f8700  1 mgr[balancer] Handling command: 
'{'prefix': 'balancer execute', 'plan': 'mynewplan2', 'target': ['mgr', '']}'
2018-05-28 14:24:02.571252 7fc3f67f8700  4 mgr[balancer] Executing plan 
mynewplan2
2018-05-28 14:24:02.571855 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.3 mappings [{'to': 5L, 'from': 15L}, {'to': 45L, 'from': 
33L}, {'to': 58L, 'from': 62L}]
2018-05-28 14:24:02.572073 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.8 mappings [{'to': 45L, 'from': 41L}, {'to': 5L, 'from': 13L}]
2018-05-28 14:24:02.572217 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.b mappings [{'to': 18L, 'from': 17L}, {'to': 45L, 'from': 
41L}, {'to': 5L, 'from': 13L}]
2018-05-28 14:24:02.572367 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.10 mappings [{'to': 28L, 'from': 27L}]
2018-05-28 14:24:02.572491 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.11 mappings [{'to': 58L, 'from': 51L}]
2018-05-28 14:24:02.572602 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.2a mappings [{'to': 45L, 'from': 32L}, {'to': 21L, 'from': 
27L}]
2018-05-28 14:24:02.572712 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.32 mappings [{'to': 45L, 'from': 40L}, {'to': 5L, 'from': 
7L}, {'to': 58L, 'from': 51L}]
2018-05-28 14:24:02.572848 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.47 mappings [{'to': 45L, 'from': 43L}, {'to': 58L, 'from': 
51L}]
2018-05-28 14:24:02.572940 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.4c mappings [{'to': 5L, 'from': 4L}, {'to': 54L, 'from': 51L}]
2018-05-28 14:24:02.573028 7fc3f67f8700  4 mgr[balancer] ceph osd 
pg-upmap-items 10.61 mappings [{'to': 54L, 'from': 51L}]
2018-05-28 14:24:02.573341 7fc3f67f8700  0 mgr[balancer] Error on command
2018-05-28 14:24:02.573407 7fc3f67f8700  1 mgr.server reply handle_command (0) 
Success
2018-05-28 14:24:02.573560 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.573617 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.573660 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.573699 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.573873 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.573915 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.573961 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.574008 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.574047 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
2018-05-28 14:24:02.574142 7fc3f67f8700  1 mgr[restful] Unknown request 'foo'
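
(A quick way to check whether any of those mappings actually reached the osdmap 
despite the "Success" reply is to look at the osd dump, e.g.:)

# ceph osd dump | grep pg_upmap

An empty result there would mean none of the pg-upmap-items above were applied.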




From: ceph-users on behalf of Linh Vu
Sent: Monday, 28 May 2018 12:47:41 PM
To: ceph-users
Subject: [ceph-users] Can't get ceph mgr balancer to work (Luminous 12.2.4)


Hi all,


I'm testing out the ceph mgr balancer as per 
http://docs.ceph.com/docs/master/mgr/balancer/ on our test cluster on Luminous 
12.2.4, but can't seem to get it to work. Everything looks good during prep: 
the new plan shows it will make some changes, but nothing is ever executed. 
Am I missing something? Details below:

# ceph mgr 

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-27 Thread Alexandre DERUMIER
>> Could you send me the full output of dump_mempools?

#  ceph daemon mds.ceph4-2.odiso.net dump_mempools 
{
"bloom_filter": {
"items": 41262668,
"bytes": 41262668
},
"bluestore_alloc": {
"items": 0,
"bytes": 0
},
"bluestore_cache_data": {
"items": 0,
"bytes": 0
},
"bluestore_cache_onode": {
"items": 0,
"bytes": 0
},
"bluestore_cache_other": {
"items": 0,
"bytes": 0
},
"bluestore_fsck": {
"items": 0,
"bytes": 0
},
"bluestore_txc": {
"items": 0,
"bytes": 0
},
"bluestore_writing_deferred": {
"items": 0,
"bytes": 0
},
"bluestore_writing": {
"items": 0,
"bytes": 0
},
"bluefs": {
"items": 0,
"bytes": 0
},
"buffer_anon": {
"items": 712726,
"bytes": 106964870
},
"buffer_meta": {
"items": 15,
"bytes": 1320
},
"osd": {
"items": 0,
"bytes": 0
},
"osd_mapbl": {
"items": 0,
"bytes": 0
},
"osd_pglog": {
"items": 0,
"bytes": 0
},
"osdmap": {
"items": 216,
"bytes": 12168
},
"osdmap_mapping": {
"items": 0,
"bytes": 0
},
"pgmap": {
"items": 0,
"bytes": 0
},
"mds_co": {
"items": 50741038,
"bytes": 5114319203
},
"unittest_1": {
"items": 0,
"bytes": 0
},
"unittest_2": {
"items": 0,
"bytes": 0
},
"total": {
"items": 92716663,
"bytes": 5262560229
}
}
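
(Almost all of that total sits in the mds_co pool; it can be pulled out on its 
own with the same jq filter used in the monitoring loop quoted further down, 
for example:)

# ceph daemon mds.ceph4-2.odiso.net dump_mempools | jq -c '.mds_co'
{"items":50741038,"bytes":5114319203}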





ceph daemon mds.ceph4-2.odiso.net perf dump 
{
"AsyncMessenger::Worker-0": {
"msgr_recv_messages": 1276789161,
"msgr_send_messages": 1317625246,
"msgr_recv_bytes": 10630409633633,
"msgr_send_bytes": 1093972769957,
"msgr_created_connections": 207,
"msgr_active_connections": 204,
"msgr_running_total_time": 63745.463077594,
"msgr_running_send_time": 22210.867549070,
"msgr_running_recv_time": 51944.624353942,
"msgr_running_fast_dispatch_time": 9185.274084187
},
"AsyncMessenger::Worker-1": {
"msgr_recv_messages": 641622644,
"msgr_send_messages": 616664293,
"msgr_recv_bytes": 7287546832466,
"msgr_send_bytes": 588278035895,
"msgr_created_connections": 494,
"msgr_active_connections": 494,
"msgr_running_total_time": 35390.081250881,
"msgr_running_send_time": 11559.689889195,
"msgr_running_recv_time": 29844.885712902,
"msgr_running_fast_dispatch_time": 6361.466445253
},
"AsyncMessenger::Worker-2": {
"msgr_recv_messages": 1972469623,
"msgr_send_messages": 1886060294,
"msgr_recv_bytes": 7924136565846,
"msgr_send_bytes": 5072502101797,
"msgr_created_connections": 181,
"msgr_active_connections": 176,
"msgr_running_total_time": 93257.811989806,
"msgr_running_send_time": 35556.662488302,
"msgr_running_recv_time": 81686.262228047,
"msgr_running_fast_dispatch_time": 6476.875317930
},
"finisher-PurgeQueue": {
"queue_len": 0,
"complete_latency": {
"avgcount": 3390753,
"sum": 44364.742135193,
"avgtime": 0.013084038
}
},
"mds": {
"request": 2780760988,
"reply": 2780760950,
"reply_latency": {
"avgcount": 2780760950,
"sum": 8467119.492491407,
"avgtime": 0.003044892
},
"forward": 0,
"dir_fetch": 173374097,
"dir_commit": 3235888,
"dir_split": 23,
"dir_merge": 45,
"inode_max": 2147483647,
"inodes": 1762555,
"inodes_top": 388540,
"inodes_bottom": 173389,
"inodes_pin_tail": 1200626,
"inodes_pinned": 1207497,
"inodes_expired": 32837415801,
"inodes_with_caps": 1206864,
"caps": 1565063,
"subtrees": 2,
"traverse": 2976675748,
"traverse_hit": 1725898480,
"traverse_forward": 0,
"traverse_discover": 0,
"traverse_dir_fetch": 157542892,
"traverse_remote_ino": 46197,
"traverse_lock": 294516,
"load_cent": 18446743922292121894,
"q": 169,
"exported": 0,
"exported_inodes": 0,
"imported": 0,
"imported_inodes": 0
},
"mds_cache": {
"num_strays": 6004,
"num_strays_delayed": 23,
"num_strays_enqueuing": 0,
"strays_created": 3123475,
"strays_enqueued": 3118819,
"strays_reintegrated": 1279,
"strays_migrated": 0,
"num_recovering_processing": 0,
"num_recovering_enqueued": 0,
"num_recovering_prioritized": 0,
"recovery_started": 17,
"recovery_completed": 17,
"ireq_enqueue_scrub": 0,

[ceph-users] Can't get ceph mgr balancer to work (Luminous 12.2.4)

2018-05-27 Thread Linh Vu
Hi all,


I'm testing out the ceph mgr balancer as per 
http://docs.ceph.com/docs/master/mgr/balancer/ on our test cluster on Luminous 
12.2.4, but can't seem to get it to work. Everything looks good during prep: 
the new plan shows it will make some changes, but nothing is ever executed. 
Am I missing something? Details below:

# ceph mgr module enable balancer

# ceph balancer eval
current cluster score 0.06 (lower is better)

# ceph balancer mode upmap

# ceph balancer optimize mynewplan2

# ceph balancer status
{
"active": true,
"plans": [
"mynewplan2"
],
"mode": "upmap"
}

# ceph balancer show mynewplan2
# starting osdmap epoch 10629
# starting crush version 71
# mode upmap
ceph osd pg-upmap-items 10.3 15 5 33 45 62 58
ceph osd pg-upmap-items 10.8 41 45 13 5
ceph osd pg-upmap-items 10.b 17 18 41 45 13 5
ceph osd pg-upmap-items 10.10 27 28
ceph osd pg-upmap-items 10.11 51 58
ceph osd pg-upmap-items 10.2a 32 45 27 21
ceph osd pg-upmap-items 10.32 40 45 7 5 51 58
ceph osd pg-upmap-items 10.47 43 45 51 58
ceph osd pg-upmap-items 10.4c 4 5 51 54
ceph osd pg-upmap-items 10.61 51 54

# ceph balancer eval mynewplan2
plan mynewplan2 final score 0.010474 (lower is better)

# ceph balancer execute mynewplan2
(nothing happens)

# ceph balancer on
(still nothing happens)
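
(As a cross-check, the pg-upmap-items lines printed by "balancer show" above 
are ordinary CLI commands, so they should be applicable by hand, something like 
the following. Note that upmap mappings also require the cluster to only allow 
luminous-or-newer clients:)

# ceph osd set-require-min-compat-client luminous
# ceph osd pg-upmap-items 10.3 15 5 33 45 62 58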

Regards,
Linh
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-27 Thread Yan, Zheng
Could you send me the full output of dump_mempools?

On Thu, May 24, 2018 at 7:22 PM, Alexandre DERUMIER  wrote:
> Thanks!
>
>
> here the profile.pdf
>
> 10-15 min of profiling; I couldn't run it longer because my clients were lagging.
>
> but I think it should be enough to observe the rss memory increase.
>
>
>
>
> - Original message -
> From: "Zheng Yan" 
> To: "aderumier" 
> Cc: "ceph-users" 
> Sent: Thursday, 24 May 2018 11:34:20
> Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
>
> On Tue, May 22, 2018 at 3:11 PM, Alexandre DERUMIER  
> wrote:
>> Hi, some new stats: mds memory is now 16G,
>>
>> I have almost the same number of items and bytes in cache as some weeks ago, 
>> when the mds was using 8G. (ceph 12.2.5)
>>
>>
>> root@ceph4-2:~# while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf 
>> dump | jq '.mds_mem.rss'; ceph daemon mds.ceph4-2.odiso.net dump_mempools | 
>> jq -c '.mds_co'; done
>> 16905052
>> {"items":43350988,"bytes":5257428143}
>> 16905052
>> {"items":43428329,"bytes":5283850173}
>> 16905052
>> {"items":43209167,"bytes":5208578149}
>> 16905052
>> {"items":43177631,"bytes":5198833577}
>> 16905052
>> {"items":43312734,"bytes":5252649462}
>> 16905052
>> {"items":43355753,"bytes":5277197972}
>> 16905052
>> {"items":43700693,"bytes":5303376141}
>> 16905052
>> {"items":43115809,"bytes":5156628138}
>> ^C
>>
>>
>>
>>
>> root@ceph4-2:~# ceph status
>> cluster:
>> id: e22b8e83-3036-4fe5-8fd5-5ce9d539beca
>> health: HEALTH_OK
>>
>> services:
>> mon: 3 daemons, quorum ceph4-1,ceph4-2,ceph4-3
>> mgr: ceph4-1.odiso.net(active), standbys: ceph4-2.odiso.net, 
>> ceph4-3.odiso.net
>> mds: cephfs4-1/1/1 up {0=ceph4-2.odiso.net=up:active}, 2 up:standby
>> osd: 18 osds: 18 up, 18 in
>> rgw: 3 daemons active
>>
>> data:
>> pools: 11 pools, 1992 pgs
>> objects: 75677k objects, 6045 GB
>> usage: 20579 GB used, 6246 GB / 26825 GB avail
>> pgs: 1992 active+clean
>>
>> io:
>> client: 14441 kB/s rd, 2550 kB/s wr, 371 op/s rd, 95 op/s wr
>>
>>
>> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net cache status
>> {
>> "pool": {
>> "items": 44523608,
>> "bytes": 5326049009
>> }
>> }
>>
>>
>> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net perf dump
>> {
>> "AsyncMessenger::Worker-0": {
>> "msgr_recv_messages": 798876013,
>> "msgr_send_messages": 825999506,
>> "msgr_recv_bytes": 7003223097381,
>> "msgr_send_bytes": 691501283744,
>> "msgr_created_connections": 148,
>> "msgr_active_connections": 146,
>> "msgr_running_total_time": 39914.832387470,
>> "msgr_running_send_time": 13744.704199430,
>> "msgr_running_recv_time": 32342.160588451,
>> "msgr_running_fast_dispatch_time": 5996.336446782
>> },
>> "AsyncMessenger::Worker-1": {
>> "msgr_recv_messages": 429668771,
>> "msgr_send_messages": 414760220,
>> "msgr_recv_bytes": 5003149410825,
>> "msgr_send_bytes": 396281427789,
>> "msgr_created_connections": 132,
>> "msgr_active_connections": 132,
>> "msgr_running_total_time": 23644.410515392,
>> "msgr_running_send_time": 7669.068710688,
>> "msgr_running_recv_time": 19751.610043696,
>> "msgr_running_fast_dispatch_time": 4331.023453385
>> },
>> "AsyncMessenger::Worker-2": {
>> "msgr_recv_messages": 1312910919,
>> "msgr_send_messages": 1260040403,
>> "msgr_recv_bytes": 5330386980976,
>> "msgr_send_bytes": 3341965016878,
>> "msgr_created_connections": 143,
>> "msgr_active_connections": 138,
>> "msgr_running_total_time": 61696.635450100,
>> "msgr_running_send_time": 23491.027014598,
>> "msgr_running_recv_time": 53858.409319734,
>> "msgr_running_fast_dispatch_time": 4312.451966809
>> },
>> "finisher-PurgeQueue": {
>> "queue_len": 0,
>> "complete_latency": {
>> "avgcount": 1889416,
>> "sum": 29224.227703697,
>> "avgtime": 0.015467333
>> }
>> },
>> "mds": {
>> "request": 1822420924,
>> "reply": 1822420886,
>> "reply_latency": {
>> "avgcount": 1822420886,
>> "sum": 5258467.616943274,
>> "avgtime": 0.002885429
>> },
>> "forward": 0,
>> "dir_fetch": 116035485,
>> "dir_commit": 1865012,
>> "dir_split": 17,
>> "dir_merge": 24,
>> "inode_max": 2147483647,
>> "inodes": 1600438,
>> "inodes_top": 210492,
>> "inodes_bottom": 100560,
>> "inodes_pin_tail": 1289386,
>> "inodes_pinned": 1299735,
>> "inodes_expired": 3476046,
>> "inodes_with_caps": 1299137,
>> "caps": 2211546,
>> "subtrees": 2,
>> "traverse": 1953482456,
>> "traverse_hit": 1127647211,
>> "traverse_forward": 0,
>> "traverse_discover": 0,
>> "traverse_dir_fetch": 105833969,
>> "traverse_remote_ino": 31686,
>> "traverse_lock": 4344,
>> "load_cent": 182244014474,
>> "q": 104,
>> "exported": 0,
>> "exported_inodes": 0,
>> "imported": 0,
>> "imported_inodes": 0
>> },
>> "mds_cache": {
>> "num_strays": 14980,
>> "num_strays_delayed": 7,
>> "num_strays_enqueuing": 0,
>> "strays_created": 1672815,
>> "strays_enqueued": 1659514,
>> "strays_reintegrated": 666,
>> "strays_migrated": 0,
>> "num_recovering_processing": 0,
>> "num_recovering_enqueued": 0,
>> 

[ceph-users] RBD lock on unmount

2018-05-27 Thread Joshua Collins

Hi

I've set up a Ceph cluster with an RBD storage device for use in a 
pacemaker/corosync cluster. When attempting to move the resources from 
one node to the other, the filesystem on the RBD will not unmount. lsof 
and fuser show no files in use on the device. I thought this might be an 
issue with an NFS lock, so I moved the Ceph OSD and monitor off the 
machine where the filesystem is mounted and onto a virtual machine, but 
I'm still unable to unmount the filesystem.


Is this a known issue with RBD filesystem mounts? Is there a system 
change I need to make in order to get the filesystem to reliably unmount?
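
A couple of RBD-side checks can also help narrow this down (a rough sketch; 
rbd/myimage is a placeholder for the actual pool/image name):

# rbd status rbd/myimage    # clients that still hold a watch on the image
# rbd showmapped            # kernel-mapped RBD images on this host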


Thanks in advance,

--
Joshua Collins
Systems Engineer

VRT Systems
38b Douglas Street
Milton QLD 4064
T +61 7 3535 9615
F +61 7 3535 9699
E joshua.coll...@vrt.com.au

www.vrt.com.au



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Erasure: Should k+m always be equal to the total number of OSDs?

2018-05-27 Thread Paul Emmerich
2018-05-27 16:30 GMT+02:00 Leônidas Villeneuve:

>
> I'm very confused, as every tutorial and doc takes k+m = number of OSDs for
> granted, but none of them say anything about adding more OSDs and hosts to an
> existing erasure-coded cluster.
>

Which doc or tutorial states that? Because that's a horribly bad setup.

k + m should be at least one less than your total number of servers and
less than ~10 or so as a rule of thumb.
(There are of course scenarios where different configurations are feasible)
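
For instance, a profile following that rule of thumb on, say, 7 hosts could 
look roughly like this (profile/pool names and PG counts are just placeholders):

# ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
# ceph osd erasure-code-profile get ec42
# ceph osd pool create ecpool 64 64 erasure ec42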


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Erasure: Should k+m always be equal to the total number of OSDs?

2018-05-27 Thread Leônidas Villeneuve
Hello everyone,

I'm struggling with the documentation regarding the scalability of Erasure
coded pools.

Let's say I have an erasure-coded pool (Jerasure, Reed-Solomon) with k = 6
and m = 3, distributed among 3 hosts, with 3 OSDs each. Everything seems
nice and working (this is just an example), but I'm reaching the total
usable capacity of my pool.

What should I do?

Can I just add another host with 3 OSDs without touching the k and m values
and be fine? And if I change k (and/or m), will Ceph migrate data between
hosts, causing bandwidth bottlenecks in the process?

I'm very confused, as every tutorial and doc takes k+m = number of OSDs for
granted, but none of them say anything about adding more OSDs and hosts to an
existing erasure-coded cluster.

I'd be really glad if someone could help me, please.

Best regards,

- Leonidas.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com