On 07/08/2013, at 4:06 PM, "Ulrich Windl" <[email protected]> 
wrote:

>>>> Andrew Beekhof <[email protected]> wrote on 06.08.2013 at 22:10 in message
> <[email protected]>:
> 
>> On 06/08/2013, at 5:24 PM, "Ulrich Windl" 
>> <[email protected]> 
>> wrote:
>> 
>>>>>> Thomas Glanzmann <[email protected]> wrote on 05.08.2013 at 19:03 in
>>> message <[email protected]>:
>>>> Hello Ulrich,
>>>> 
>>>>> Did it happen when you put the cluster into maintenance-mode, or did
>>>>> it happen after someone fiddled with the resources manually? Or did it
>>>>> happen when you turned maintenance-mode off again?
>>>> 
>>>> I did not remember, but I checked the log files, and yes, I made a config
>>>> change (I removed apache_loadbalancer from group apache). That is probably
>>>> the reason I could not reproduce it in my lab environment: I never tried
>>>> to fiddle with it afterwards. The way to reproduce it is probably to put
>>>> the cluster into maintenance-mode and then change something in the config,
>>>> at which point it crashes. I have to verify that in my lab; I'll do that
>>>> right now and report back.
>>> 
>>> Hi!
>>> 
>>> I think it's a common misconception that you can modify cluster resources
>>> while in maintenance mode:
>> 
>> No, you _should_ be able to.  If that's not the case, it's a bug.
> 
> So the end of maintenance mode starts with a "re-probe"?

No, but it doesn't need to.  
The policy engine already knows if the resource definitions changed and the 
recurring monitor ops will find out if any are not running.
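
The two checks described above can be illustrated with a toy sketch. This is
not Pacemaker code and makes no claim about how the policy engine is
implemented; the digest function, the `plan_after_maintenance` helper, and the
data layout are all assumptions made up for illustration. It only shows the
idea: compare each resource definition against what was recorded earlier, and
let monitor results reveal anything that is not running.

```python
import hashlib
import json


def definition_digest(definition: dict) -> str:
    """Return a stable digest of a resource definition (hypothetical helper)."""
    canonical = json.dumps(definition, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_after_maintenance(recorded: dict, current: dict, monitors: dict) -> dict:
    """Classify each resource once maintenance mode is switched off.

    recorded: resource id -> digest stored before maintenance mode
    current:  resource id -> definition now present in the configuration
    monitors: resource id -> True if a recurring monitor found it running
    """
    plan = {}
    for rsc_id, definition in current.items():
        if recorded.get(rsc_id) != definition_digest(definition):
            # The definition differs from what was recorded: no re-probe needed
            # to notice this, the configuration itself tells us.
            plan[rsc_id] = "definition changed"
        elif not monitors.get(rsc_id, False):
            # Unchanged definition, but the monitor reports it is not running.
            plan[rsc_id] = "not running"
        else:
            plan[rsc_id] = "ok"
    # Resources that were recorded but no longer exist in the configuration.
    for rsc_id in recorded.keys() - current.keys():
        plan[rsc_id] = "removed from configuration"
    return plan
```

For example, if `apache_loadbalancer` had its monitor interval edited and a
second, hypothetical resource `apache_ip` was deleted during maintenance, the
sketch would report "definition changed" for the first and "removed from
configuration" for the second, without any re-probe.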

> Even if it did, all resources should be considered to be in an unclean state
> until the re-probe has finished.

They should probably be considered unclean the moment you turn maintenance mode 
on.

> I have had some quite bad experiences with maintenance mode...
> 
>> 
>>> It seems that when you turn maintenance mode off, the cluster expects the
>>> state it had when maintenance mode was turned on. Maybe it's like an
>>> airplane's autopilot: you can turn it off, fly the plane the way the
>>> autopilot would have done, and when you turn the autopilot on again, the
>>> flight path will continue; however, if you change direction while the
>>> autopilot is off, big confusion may arise when you turn it on again... ;-)
>>> 
>>> 
>>> Am I right?
>> 
>> No, more likely it's just not a well-tested scenario.
>> 
>>> 
>>> Still this doesn't explain where your configuration or log files went...
>>> 
>>> Regards,
>>> Ulrich
>>> 
>>> 
>>>> 
>>>> ...
>>>> Aug  4 18:49:18 apache-03 cib: [29394]: info: cib:diff: +   <configuration >
>>>> Aug  4 18:49:18 apache-03 cib: [29394]: info: cib:diff: +     <crm_config >
>>>> Aug  4 18:49:18 apache-03 cib: [29394]: info: cib:diff: +       <cluster_property_set id="cib-bootstrap-options" >
>>>> Aug  4 18:49:18 apache-03 cib: [29394]: info: cib:diff: +         <nvpair id="cib-bootstrap-options-maintenance-mode" name="maintenance-mode" value="true" __crm_diff_marker__="added:top" />
>>>> Aug  4 18:49:18 apache-03 cib: [29394]: info: cib:diff: +       </cluster_property_set>
>>>> Aug  4 18:49:18 apache-03 cib: [29394]: info: cib:diff: +     </crm_config>
>>>> Aug  4 18:49:18 apache-03 cib: [29394]: info: cib:diff: +   </configuration>
>>>> Aug  4 18:49:18 apache-03 cib: [29394]: info: cib:diff: + </cib>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: - <cib admin_epoch="0" epoch="94" num_updates="100" >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -   <configuration >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -     <resources >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -       <group id="apache" >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -         <primitive class="ocf" id="apache_loadbalancer" provider="heartbeat" type="apachetg" __crm_diff_marker__="removed:top" >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -           <operations >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -             <op id="apache_loadbalancer-monitor-60s" interval="60s" name="monitor" />
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -           </operations>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -         </primitive>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -       </group>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -     </resources>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: -   </configuration>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: - </cib>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: + <cib epoch="95" num_updates="1" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.6" update-origin="apache-03" update-client="cibadmin" cib-last-written="Sun Aug  4 18:49:18 2013" have-quorum="1" dc-uuid="61e8f424-b538-4352-b3fe-955ca853e5fb" >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +   <configuration >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +     <resources >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +       <primitive class="ocf" id="apache_loadbalancer" provider="heartbeat" type="apachetg" __crm_diff_marker__="added:top" >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +         <operations >
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +           <op id="apache_loadbalancer-monitor-60s" interval="60s" name="monitor" />
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +         </operations>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +       </primitive>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +     </resources>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: +   </configuration>
>>>> Aug  4 18:50:20 apache-03 cib: [29394]: info: cib:diff: + </cib>
>>>> ...
>>>> Aug  4 18:50:27 apache-03 heartbeat: [29380]: ERROR: Managed /usr/lib/heartbeat/crmd process 29398 dumped core
>>>> 
>>>> The complete syslog is in my other e-mail, which I just sent to Alan, if
>>>> you want to check it.
>>>> 
>>>> Cheers,
>>>>       Thomas
>>>> _______________________________________________
>>>> Linux-HA mailing list
>>>> [email protected] 
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha 
>>>> See also: http://linux-ha.org/ReportingProblems 
