Please disregard my last email. I followed the recommendation for the tunables, but missed the note that the kernel version should be 3.5 or later in order to support them. I reverted them back to the legacy ones and everything is back online.
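For the record, here is roughly how I reverted to the legacy map (just a sketch of the usual getcrushmap/crushtool round trip; the file names are only examples):

    ceph osd getcrushmap -o crush.bin       # dump the current crush map
    crushtool -d crush.bin -o crush.txt     # decompile it to text
    # edit crush.txt and drop the "tunable ..." lines, then recompile
    # (crushtool may want --enable-unsafe-tunables if any tunable lines stay in)
    crushtool -c crush.txt -o crush.legacy
    ceph osd setcrushmap -i crush.legacy    # inject the legacy map back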
2013/1/10 Roman Hlynovskiy <[email protected]>:
> Hello again!
>
> I left the system in a working state overnight and got it in a weird
> state this morning:
>
> chef@ceph-node02:/var/log/ceph$ ceph -s
>    health HEALTH_OK
>    monmap e4: 3 mons at {a=192.168.7.11:6789/0,b=192.168.7.12:6789/0,c=192.168.7.13:6789/0}, election epoch 254, quorum 0,1,2 a,b,c
>    osdmap e348: 3 osds: 3 up, 3 in
>    pgmap v114606: 384 pgs: 384 active+clean; 161 GB data, 326 GB used, 429 GB / 755 GB avail
>    mdsmap e4623: 1/1/1 up {0=b=up:active}, 1 up:standby
>
> so, it looks ok at first glance, however I am not able
> to mount ceph from any of the nodes:
> be01:~# mount /var/www/jroger.org/data
> mount: 192.168.7.11:/: can't read superblock
>
> on the nodes which had ceph mounted yesterday I am able to look
> through the filesystem, but any kind of data read causes the client to
> hang.
>
> I made a trace on the active mds with debug ms/mds = 20
> (http://wh.of.kz/ceph_logs.tar.gz)
> Could you please help to identify what's going on?
>
> 2013/1/9 Roman Hlynovskiy <[email protected]>:
>>>> How many pgs do you have? ('ceph osd dump | grep ^pool').
>>>
>>> I believe this is it. 384 PGs, but three pools of which only one (or maybe
>>> a second one, sort of) is in use. Automatically setting the right PG counts
>>> is coming some day, but until then being able to set up pools of the right
>>> size is a big gotcha. :(
>>> Depending on how mutable the data is, recreate with larger PG counts on the
>>> pools in use. Otherwise we can do something more detailed.
>>> -Greg
>>
>> hm... what would be the recommended PG count per pool?
>>
>> chef@cephgw:~$ ceph osd lspools
>> 0 data,1 metadata,2 rbd,
>> chef@cephgw:~$ ceph osd pool get data pg_num
>> PG_NUM: 128
>> chef@cephgw:~$ ceph osd pool get metadata pg_num
>> PG_NUM: 128
>> chef@cephgw:~$ ceph osd pool get rbd pg_num
>> PG_NUM: 128
>>
>> according to
>> http://ceph.com/docs/master/rados/operations/placement-groups/
>>
>>               (OSDs * 100)
>>   Total PGs = ------------
>>                 Replicas
>>
>> I have 3 OSDs and 2 replicas for each object, which gives a recommended PG count of 150
>>
>> will it make much difference to set 150 instead of 128, or should I
>> base it on different values?
>>
>> btw, just one more off-topic question:
>>
>> chef@ceph-node03:~$ ceph pg dump | egrep -v '^(0\.|1\.|2\.)' | column -t
>> dumped all in format plain
>> version            113906
>> last_osdmap_epoch  323
>> last_pg_scan       1
>> full_ratio         0.95
>> nearfull_ratio     0.85
>> pg_stat  objects  mip  degr  unf  bytes         log       disklog  state  state_stamp  v  reported  up  acting  last_scrub  scrub_stamp  last_deep_scrub  deep_scrub_stamp
>> pool 0   74748    0    0     0    286157692336  17668034  17668034
>> pool 1   618      0    0     0    131846062     6414518   6414518
>> pool 2   0        0    0     0    0             0         0
>> sum      75366    0    0     0    286289538398  24082552  24082552
>> osdstat  kbused     kbavail    kb         hb in  hb out
>> 0        157999220  106227596  264226816  [1,2]  []
>> 1        185604948  78621868   264226816  [0,2]  []
>> 2        219475396  44751420   264226816  [0,1]  []
>> sum      563079564  229600884  792680448
>>
>> pool 0 (data) is used for data storage
>> pool 1 (metadata) is used for metadata storage
>>
>> what is pool 2 (rbd) for? looks like it's absolutely empty.
>>
>>
>>>
>>>>
>>>> You might also adjust the crush tunables, see
>>>>
>>>> http://ceph.com/docs/master/rados/operations/crush-map/?highlight=tunable#tunables
>>>>
>>>> sage
>>>>
>>
>> Thanks for the link, Sage. I set the tunable values according to the doc.
>> Btw, the online document is missing the magical param for the crushmap
>> which allows those scary_tunables )
>>
>>
>>
>> --
>> ...WBR, Roman Hlynovskiy
>
>
>
> --
> ...WBR, Roman Hlynovskiy

--
...WBR, Roman Hlynovskiy
