On Wed, Jul 18, 2012 at 11:18 AM, Gregory Farnum <[email protected]> wrote:
> On Tuesday, July 17, 2012 at 11:22 PM, Andrey Korolyov wrote:
>> On Wed, Jul 18, 2012 at 10:09 AM, Gregory Farnum <[email protected]> wrote:
>> > On Monday, July 16, 2012 at 11:55 AM, Andrey Korolyov wrote:
>> > > On Mon, Jul 16, 2012 at 10:48 PM, Gregory Farnum <[email protected]> wrote:
>> > > > "ceph pg set_full_ratio 0.95"
>> > > > "ceph pg set_nearfull_ratio 0.94"
>> > > >
>> > > > On Monday, July 16, 2012 at 11:42 AM, Andrey Korolyov wrote:
>> > > >
>> > > > > On Mon, Jul 16, 2012 at 8:12 PM, Gregory Farnum <[email protected]> wrote:
>> > > > > > On Saturday, July 14, 2012 at 7:20 AM, Andrey Korolyov wrote:
>> > > > > > > On Fri, Jul 13, 2012 at 9:09 PM, Sage Weil <[email protected]> wrote:
>> > > > > > > > On Fri, 13 Jul 2012, Gregory Farnum wrote:
>> > > > > > > > > On Fri, Jul 13, 2012 at 1:17 AM, Andrey Korolyov
>> > > > > > > > > <[email protected]> wrote:
>> > > > > > > > > > Hi,
>> > > > > > > > > >
>> > > > > > > > > > Recently I reduced my test suite from 6 to 4 osds at
>> > > > > > > > > > ~60% usage on a six-node cluster, and I removed a bunch
>> > > > > > > > > > of rbd objects during recovery to avoid overfilling.
>> > > > > > > > > > Right now I'm constantly receiving a warning about a
>> > > > > > > > > > nearfull state on a non-existent osd:
>> > > > > > > > > >
>> > > > > > > > > > health HEALTH_WARN 1 near full osd(s)
>> > > > > > > > > > monmap e3: 3 mons at
>> > > > > > > > > > {0=192.168.10.129:6789/0,1=192.168.10.128:6789/0,2=192.168.10.127:6789/0},
>> > > > > > > > > > election epoch 240, quorum 0,1,2 0,1,2
>> > > > > > > > > > osdmap e2098: 4 osds: 4 up, 4 in
>> > > > > > > > > > pgmap v518696: 464 pgs: 464 active+clean; 61070 MB data,
>> > > > > > > > > > 181 GB
>> > > > > > > > > > used, 143 GB / 324 GB avail
>> > > > > > > > > > mdsmap e181: 1/1/1 up {0=a=up:active}
>> > > > > > > > > >
>> > > > > > > > > > HEALTH_WARN 1 near full osd(s)
>> > > > > > > > > > osd.4 is near full at 89%
>> > > > > > > > > >
>> > > > > > > > > > Needless to say, osd.4 remains only in ceph.conf, not in
>> > > > > > > > > > the crushmap. The reduction was done 'on-line', i.e.
>> > > > > > > > > > without restarting the entire cluster.
>> > > > > > > > >
>> > > > > > > > > Whoops! It looks like Sage has written some patches to fix
>> > > > > > > > > this, but
>> > > > > > > > > for now you should be good if you just update your ratios to
>> > > > > > > > > a larger
>> > > > > > > > > number, and then bring them back down again. :)
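>> > > > > > > > > E.g., something like this (the temporary values are just
>> > > > > > > > > for illustration):
>> > > > > > > > >
>> > > > > > > > > ceph pg set_full_ratio 0.97
>> > > > > > > > > ceph pg set_nearfull_ratio 0.90
>> > > > > > > > > # then restore the previous values:
>> > > > > > > > > ceph pg set_full_ratio 0.95
>> > > > > > > > > ceph pg set_nearfull_ratio 0.85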
>> > > > > > > >
>> > > > > > > > Restarting ceph-mon should also do the trick.
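>> > > > > > > > (How exactly depends on your init setup; with the stock
>> > > > > > > > init script, something like "service ceph restart mon.0"
>> > > > > > > > on each monitor node.)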
>> > > > > > > >
>> > > > > > > > Thanks for the bug report!
>> > > > > > > > sage
>> > > > > > >
>> > > > > > > Should I restart mons simultaneously?
>> > > > > > I don't think restarting will actually do the trick for you;
>> > > > > > you will need to set the ratios again.
>> > > > > >
>> > > > > > > Restarting them one by one has no effect, and neither does
>> > > > > > > filling the data pool up to ~95 percent (btw, when I deleted
>> > > > > > > a 50Gb file on cephfs, the mds got stuck permanently and
>> > > > > > > usage remained the same until I dropped and recreated the
>> > > > > > > data pool - I hope that's one of the known posix-layer
>> > > > > > > bugs). I also deleted the entry from the config and then
>> > > > > > > restarted the mons, with no effect. Any suggestions?
>> > > > > >
>> > > > > > I'm not sure what you're asking about here?
>> > > > > > -Greg
>> > > > >
>> > > > > Oh, sorry, I misread and thought that you were suggesting
>> > > > > filling up the osds. How can I set the full/nearfull ratios
>> > > > > correctly?
>> > > > >
>> > > > > $ ceph injectargs '--mon_osd_full_ratio 96'
>> > > > > parsed options
>> > > > > $ ceph injectargs '--mon_osd_near_full_ratio 94'
>> > > > > parsed options
>> > > > >
>> > > > > $ ceph pg dump | grep 'full'
>> > > > > full_ratio 0.95
>> > > > > nearfull_ratio 0.85
>> > > > >
>> > > > > Setting the parameters in ceph.conf and then restarting the
>> > > > > mons does not affect the ratios either.
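>> > > > > For reference, the ceph.conf entries were along these lines
>> > > > > (exact values illustrative):
>> > > > >
>> > > > > [mon]
>> > > > >     mon osd full ratio = .96
>> > > > >     mon osd near full ratio = .94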
>> > > >
>> > >
>> > > Thanks, that worked, but setting the values back brings the warning back.
>> > Hrm. That shouldn't be possible if the OSD has been removed. How did you
>> > take it out? It sounds like maybe you just marked it in the OUT state (and
>> > turned it off quite quickly) without actually taking it out of the cluster?
>> > -Greg
>>
>> The way I did the removal, it was definitely not like that - in the
>> first place, I marked the osds (4 and 5, on the same host) out, then
>> rebuilt the crushmap, and then killed the osd processes. As I
>> mentioned before, osd.4 does not exist in the crushmap and therefore
>> it shouldn't be reported at all (theoretically).
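>> For reference, the crushmap rebuild was along these lines (file
>> names illustrative):
>>
>> ceph osd getcrushmap -o crush.bin
>> crushtool -d crush.bin -o crush.txt
>> # edit crush.txt: remove the osd.4 / osd.5 device and bucket entries
>> crushtool -c crush.txt -o crush.new
>> ceph osd setcrushmap -i crush.new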
>
> Okay, that's what happened: marking an OSD out in the CRUSH map means all
> the data gets moved off it, but that doesn't remove it from all the places
> where it's registered in the monitor and in the map, for a few reasons:
> 1) You might want to mark an OSD out before taking it down, to allow for more
> orderly data movement.
> 2) OSDs can get marked out automatically, but the system shouldn't be able to
> forget about them on its own.
> 3) You might want to remove an OSD from the CRUSH map in the process of
> placing it somewhere else (perhaps you moved the physical machine to a new
> location).
> etc.
>
> You want to run "ceph osd rm 4 5" and that should unregister both of them
> from everything[1]. :)
> -Greg
> [1]: Except for the full lists, which have a bug in the version of code
> you're running — remove the OSDs, then adjust the full ratios again, and all
> will be well.
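> I.e., roughly (the temporary ratio values are just for illustration):
>
> ceph osd rm 4 5
> ceph pg set_full_ratio 0.97
> ceph pg set_nearfull_ratio 0.90
> ceph pg set_full_ratio 0.95
> ceph pg set_nearfull_ratio 0.85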
>
$ ceph osd rm 4
osd.4 does not exist
$ ceph -s
health HEALTH_WARN 1 near full osd(s)
monmap e3: 3 mons at
{0=192.168.10.129:6789/0,1=192.168.10.128:6789/0,2=192.168.10.127:6789/0},
election epoch 58, quorum 0,1,2 0,1,2
osdmap e2198: 4 osds: 4 up, 4 in
pgmap v586056: 464 pgs: 464 active+clean; 66645 MB data, 231 GB
used, 95877 MB / 324 GB avail
mdsmap e207: 1/1/1 up {0=a=up:active}
$ ceph health detail
HEALTH_WARN 1 near full osd(s)
osd.4 is near full at 89%
$ ceph osd dump
....
max_osd 4
osd.0 up in weight 1 up_from 2183 up_thru 2187 down_at 2172
last_clean_interval [2136,2171) 192.168.10.128:6800/4030
192.168.10.128:6801/4030 192.168.10.128:6802/4030 exists,up
68b3deec-e80a-48b7-9c29-1b98f5de4f62
osd.1 up in weight 1 up_from 2136 up_thru 2186 down_at 2135
last_clean_interval [2115,2134) 192.168.10.129:6800/2980
192.168.10.129:6801/2980 192.168.10.129:6802/2980 exists,up
b2a26fe9-aaa8-445f-be1f-fa7d2a283b57
osd.2 up in weight 1 up_from 2181 up_thru 2187 down_at 2172
last_clean_interval [2136,2171) 192.168.10.128:6803/4128
192.168.10.128:6804/4128 192.168.10.128:6805/4128 exists,up
378d367a-f7fb-4892-9ec9-db8ffdd2eb20
osd.3 up in weight 1 up_from 2136 up_thru 2186 down_at 2135
last_clean_interval [2115,2134) 192.168.10.129:6803/3069
192.168.10.129:6804/3069 192.168.10.129:6805/3069 exists,up
faf8eda8-55fc-4a0e-899f-47dbd32b81b8
....