Re: [ceph-users] mds standby + standby-reply upgrade

2016-07-18 Thread Dzianis Kahanovich
Patrick Donnelly writes:

>> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>>
>> Now, after the upgrade and the next mon restart, the active monitor crashes with
>> "assert(info.state == MDSMap::STATE_STANDBY)" (even with no MDS daemons running).
> 
> This is the first time you've upgraded your pool to jewel, right?
> Straight from 9.X to 10.2.2?
> 

Yes

-- 
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.by/


Re: [ceph-users] mds standby + standby-reply upgrade

2016-07-08 Thread Patrick Donnelly
Hi Dzianis,

On Thu, Jun 30, 2016 at 4:03 PM, Dzianis Kahanovich  wrote:
> Upgraded infernalis -> jewel (git, Gentoo). The upgrade was done by stopping
> and restarting everything globally in one shot.
>
> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>
> Now, after the upgrade and the next mon restart, the active monitor crashes with
> "assert(info.state == MDSMap::STATE_STANDBY)" (even with no MDS daemons running).

This is the first time you've upgraded your pool to jewel, right?
Straight from 9.X to 10.2.2?
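
If you want to double-check exactly which build each daemon is running,
something like this should do it (a sketch: it assumes default admin socket
locations and your daemon names, and each "ceph daemon" call has to run on
that daemon's host):

# ceph --version
# ceph daemon mds.a version
# ceph daemon mon.a version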

-- 
Patrick Donnelly


Re: [ceph-users] mds standby + standby-reply upgrade

2016-07-05 Thread Gregory Farnum
On Mon, Jul 4, 2016 at 12:38 PM, Dzianis Kahanovich  wrote:
> Gregory Farnum writes:
>> On Thu, Jun 30, 2016 at 1:03 PM, Dzianis Kahanovich  wrote:
>>> Upgraded infernalis -> jewel (git, Gentoo). The upgrade was done by stopping
>>> and restarting everything globally in one shot.
>>>
>>> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>>>
>>> Now, after the upgrade and the next mon restart, the active monitor crashes with
>>> "assert(info.state == MDSMap::STATE_STANDBY)" (even with no MDS daemons running).
>>> I fixed it with:
>>>
>>> --- a/src/mon/MDSMonitor.cc 2016-06-27 21:26:26.0 +0300
>>> +++ b/src/mon/MDSMonitor.cc 2016-06-28 10:44:32.0 +0300
>>> @@ -2793,7 +2793,11 @@ bool MDSMonitor::maybe_promote_standby(s
>>>  for (const auto &j : pending_fsmap.standby_daemons) {
>>>    const auto &gid = j.first;
>>>    const auto &info = j.second;
>>> -  assert(info.state == MDSMap::STATE_STANDBY);
>>> +//  assert(info.state == MDSMap::STATE_STANDBY);
>>> +  if (info.state != MDSMap::STATE_STANDBY) {
>>> +    dout(0) << "gid " << gid << " ex-assert(info.state == MDSMap::STATE_STANDBY) " << do_propose << dendl;
>>> +    return do_propose;
>>> +  }
>>>
>>>    if (!info.standby_replay) {
>>>      continue;
>>>
>>>
>>> Now: e5442: 1/1/1 up {0=a=up:active}, 1 up:standby
>>> - but there are really 3 MDS daemons running (active, standby-replay, standby).
>>>
>>> # ceph mds dump
>>> dumped fsmap epoch 5442
>>> fs_name cephfs
>>> epoch   5441
>>> flags   0
>>> created 2016-04-10 23:44:38.858769
>>> modified        2016-06-27 23:08:26.211880
>>> tableserver 0
>>> root    0
>>> session_timeout 60
>>> session_autoclose   300
>>> max_file_size   1099511627776
>>> last_failure5239
>>> last_failure_osd_epoch  18473
>>> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>>> ranges,3=default file layouts on dirs,4=dir inode in separate object,
>>> 5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
>>> max_mds 1
>>> in  0
>>> up  {0=3104110}
>>> failed
>>> damaged
>>> stopped
>>> data_pools  5
>>> metadata_pool   6
>>> inline_data disabled
>>> 3104110:        10.227.227.103:6800/14627 'a' mds.0.5436 up:active seq 30
>>> 3084126:        10.227.227.104:6800/24069 'c' mds.0.0 up:standby-replay seq 1
>>>
>>>
>>> With standby-replay set to false everything is OK: 1/1/1 up {0=a=up:active}, 2 up:standby
>>>
>>> How can I fix this three-MDS behaviour?
>>
>> Ah, you hit a known bug with that assert. I thought the fix was
>> already in the latest point release; are you behind?
>> -Greg
>>
>
> Checked the logs - the crash was observed in version 10.2.2-45-g9aafefe
> (9aafefeab6b0f01d7467f70cb2f1b16ae88340e8), the latest git jewel branch as of 27.06.
> Which point release contains the fix?

Ah, I see another report of this as well. Created a ticket:
http://tracker.ceph.com/issues/16592.
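
Until that lands, disabling standby-replay (which you already found avoids
the problem) is probably the simplest workaround. A sketch, assuming your
standby-replay daemon is the one named 'c' and a stock /etc/ceph/ceph.conf:

[mds.c]
    mds standby replay = false

Then restart mds.c and confirm "ceph mds dump" reports two plain standbys.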
-Greg


Re: [ceph-users] mds standby + standby-reply upgrade

2016-07-04 Thread Dzianis Kahanovich
Gregory Farnum writes:
> On Thu, Jun 30, 2016 at 1:03 PM, Dzianis Kahanovich  wrote:
>> Upgraded infernalis -> jewel (git, Gentoo). The upgrade was done by stopping
>> and restarting everything globally in one shot.
>>
>> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>>
>> Now, after the upgrade and the next mon restart, the active monitor crashes with
>> "assert(info.state == MDSMap::STATE_STANDBY)" (even with no MDS daemons running).
>> I fixed it with:
>>
>> --- a/src/mon/MDSMonitor.cc 2016-06-27 21:26:26.0 +0300
>> +++ b/src/mon/MDSMonitor.cc 2016-06-28 10:44:32.0 +0300
>> @@ -2793,7 +2793,11 @@ bool MDSMonitor::maybe_promote_standby(s
>>  for (const auto &j : pending_fsmap.standby_daemons) {
>>    const auto &gid = j.first;
>>    const auto &info = j.second;
>> -  assert(info.state == MDSMap::STATE_STANDBY);
>> +//  assert(info.state == MDSMap::STATE_STANDBY);
>> +  if (info.state != MDSMap::STATE_STANDBY) {
>> +    dout(0) << "gid " << gid << " ex-assert(info.state == MDSMap::STATE_STANDBY) " << do_propose << dendl;
>> +    return do_propose;
>> +  }
>>
>>    if (!info.standby_replay) {
>>      continue;
>>
>>
>> Now: e5442: 1/1/1 up {0=a=up:active}, 1 up:standby
>> - but there are really 3 MDS daemons running (active, standby-replay, standby).
>>
>> # ceph mds dump
>> dumped fsmap epoch 5442
>> fs_name cephfs
>> epoch   5441
>> flags   0
>> created 2016-04-10 23:44:38.858769
>> modified        2016-06-27 23:08:26.211880
>> tableserver 0
>> root    0
>> session_timeout 60
>> session_autoclose   300
>> max_file_size   1099511627776
>> last_failure5239
>> last_failure_osd_epoch  18473
>> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>> ranges,3=default file layouts on dirs,4=dir inode in separate object,
>> 5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
>> max_mds 1
>> in  0
>> up  {0=3104110}
>> failed
>> damaged
>> stopped
>> data_pools  5
>> metadata_pool   6
>> inline_data disabled
>> 3104110:        10.227.227.103:6800/14627 'a' mds.0.5436 up:active seq 30
>> 3084126:        10.227.227.104:6800/24069 'c' mds.0.0 up:standby-replay seq 1
>>
>>
>> With standby-replay set to false everything is OK: 1/1/1 up {0=a=up:active}, 2 up:standby
>>
>> How can I fix this three-MDS behaviour?
> 
> Ah, you hit a known bug with that assert. I thought the fix was
> already in the latest point release; are you behind?
> -Greg
> 

Checked the logs - the crash was observed in version 10.2.2-45-g9aafefe
(9aafefeab6b0f01d7467f70cb2f1b16ae88340e8), the latest git jewel branch as of 27.06.
Which point release contains the fix?
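
For reference, I got that version string from the startup banner in the
daemon logs, roughly like this (assuming default log locations):

# grep -h 'ceph version' /var/log/ceph/ceph-mon.*.log | tail -n1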

-- 
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.by/


Re: [ceph-users] mds standby + standby-reply upgrade

2016-06-30 Thread Gregory Farnum
On Thu, Jun 30, 2016 at 1:03 PM, Dzianis Kahanovich  wrote:
> Upgraded infernalis -> jewel (git, Gentoo). The upgrade was done by stopping
> and restarting everything globally in one shot.
>
> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>
> Now, after the upgrade and the next mon restart, the active monitor crashes with
> "assert(info.state == MDSMap::STATE_STANDBY)" (even with no MDS daemons running).
> I fixed it with:
>
> --- a/src/mon/MDSMonitor.cc 2016-06-27 21:26:26.0 +0300
> +++ b/src/mon/MDSMonitor.cc 2016-06-28 10:44:32.0 +0300
> @@ -2793,7 +2793,11 @@ bool MDSMonitor::maybe_promote_standby(s
>  for (const auto &j : pending_fsmap.standby_daemons) {
>    const auto &gid = j.first;
>    const auto &info = j.second;
> -  assert(info.state == MDSMap::STATE_STANDBY);
> +//  assert(info.state == MDSMap::STATE_STANDBY);
> +  if (info.state != MDSMap::STATE_STANDBY) {
> +    dout(0) << "gid " << gid << " ex-assert(info.state == MDSMap::STATE_STANDBY) " << do_propose << dendl;
> +    return do_propose;
> +  }
>
>    if (!info.standby_replay) {
>      continue;
>
>
> Now: e5442: 1/1/1 up {0=a=up:active}, 1 up:standby
> - but there are really 3 MDS daemons running (active, standby-replay, standby).
>
> # ceph mds dump
> dumped fsmap epoch 5442
> fs_name cephfs
> epoch   5441
> flags   0
> created 2016-04-10 23:44:38.858769
> modified        2016-06-27 23:08:26.211880
> tableserver 0
> root    0
> session_timeout 60
> session_autoclose   300
> max_file_size   1099511627776
> last_failure5239
> last_failure_osd_epoch  18473
> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
> ranges,3=default file layouts on dirs,4=dir inode in separate object,
> 5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
> max_mds 1
> in  0
> up  {0=3104110}
> failed
> damaged
> stopped
> data_pools  5
> metadata_pool   6
> inline_data disabled
> 3104110:        10.227.227.103:6800/14627 'a' mds.0.5436 up:active seq 30
> 3084126:        10.227.227.104:6800/24069 'c' mds.0.0 up:standby-replay seq 1
>
>
> With standby-replay set to false everything is OK: 1/1/1 up {0=a=up:active}, 2 up:standby
>
> How can I fix this three-MDS behaviour?

Ah, you hit a known bug with that assert. I thought the fix was
already in the latest point release; are you behind?
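
If you need to keep a local patch in the meantime, a gentler variant of your
guard that skips just the unexpected record, instead of returning out of the
whole promotion pass, might behave better. Untested sketch against the same
jewel-era MDSMonitor.cc (ceph_mds_state_name() should be available there):

  for (const auto &j : pending_fsmap.standby_daemons) {
    const auto &gid = j.first;
    const auto &info = j.second;
    if (info.state != MDSMap::STATE_STANDBY) {
      // Tolerate records left in a non-standby state by the upgrade,
      // but keep scanning the remaining daemons instead of bailing out.
      dout(0) << "gid " << gid << " in unexpected state "
              << ceph_mds_state_name(info.state) << ", skipping" << dendl;
      continue;
    }
    if (!info.standby_replay) {
      continue;
    }
    // ... promotion logic unchanged ...
  }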
-Greg