Re: [Puppet-dev] Discussion for fixing 2.6.5 regression around :absent -> correct_value changes.

Markus Roberts Wed, 02 Mar 2011 07:53:47 -0800

R.I. --

> > If :absent is there, I believe it was correct (that is, that the
> > file really was absent).
>
> the file wasn't absent.



This is the thing we need to be trying to reproduce then.

> also notice it's :type, not :ensure.
>

That's an implementation detail, which I don't think we need to worry about.


> > >
> > > The one case where we have an issue (I believe) is when the file
> > > was absent, being managed but not audited, and we update to a new
> > > version with Jesse's fix at the same time we start auditing.
> > > (Without the fix, it's much worse, as R.I. documented).
> >
> > all agreed. Thanks for clarifying and tightening the language Markus.
> >
> > I mis-characterized the situation because I was under the impression
> > I'd shown that content changes triggered :absent, not just ...
> > absence.
>
> I don't agree.
>
> I think there are 2 things going on, and we're focussing on the wrong one.
>
> INCORRECT STUFF WRITTEN TO STATE.YAML:
>

I agree, this is definitely the part we should be focusing on.


> something, at some point wrote stuff into state.yaml that is wrong, in
> this case:
>
>    !ruby/sym type: !ruby/sym absent
>
> for files that were _never_ set to absent.  Setting them to ensure => absent
> today doesn't result in the same line being written either; I think there's
> some combination of properties, source etc. that causes this.
>

From my experimentation, what causes it is the file being actually absent.

Another way to put this: it appears that when a resource is both audited and
managed, the state that is recorded is the state *before* the changes due to
management are applied.  This can happen either because you intentionally
set a managed resource to audit or, as we found, because audit is set for
you as a result of a bug.
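
To make the ordering concrete, here is a minimal sketch of what I think is
happening.  It is a toy model, not Puppet's actual code: apply_resource, the
record: option and the :type/:synced keys are hypothetical stand-ins for the
real machinery.

```ruby
# Toy model (NOT Puppet's real code): one property per resource and a
# state store keyed by resource title.
def apply_resource(state, title, current, desired, record: :before)
  entry = (state[title] ||= {})
  entry[:type] = current if record == :before  # record the state as found
  final = desired                              # management brings it in line
  entry[:type] = final if record == :after     # record the state as left
  entry[:synced] = Time.now if current != desired
  final
end

state = {}
apply_resource(state, "File[/tmp/example]", :absent, :file, record: :before)
state["File[/tmp/example]"][:type]  # => :absent, although the file now exists
```

With record: :after the stored value would be :file, which is what a later
audit run would expect to find.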



> With current code I cannot reproduce this.  We might have had a bug in the
> past that did this; my machines have been updated since back in the 0.24
> days and state never gets deleted.  For all we know we fixed this bug
> months or years ago.
>
> The specifics of this bug don't matter; it's this _kind_ of bug that
> matters.  Let's ignore this problem for now and focus on the next bit.
>
> WE DO NOT SCRUB INCORRECT THINGS FROM STATE.YAML:
>

I am not sure if this is presently working or not, but I believe it was a
bug previously.



> Imagine something writes some bogus info to state.yaml, be it corruption,
> software bug, a user messing around or whatever.  The origin doesn't
> matter.
>
> With auditing disabled we only update some parts of a specific resource
> in state.yaml, we do not make sure there isn't bogus stuff in there.  So
> for example in the case:
>
>  "File[/etc/mcollective/ssl/clients/rip.pem]":
>     !ruby/sym type: !ruby/sym absent
>     !ruby/sym checked: 2011-03-02 08:59:18.609780 +00:00
>     !ruby/sym synced: 2011-01-01 21:01:56.182650 +00:00
>
> The file exists; it's part of a recursive copy with purge enforced.  This
> specific file has never been set as ensure => absent.  Also please note
> the property is TYPE not ENSURE.  The correct value here is TYPE => FILE,
> not ABSENT.
>
> The :type being :absent is bogus and should not be there.  During a normal
> run with auditing disabled puppet should not care about this property, and
> the fact that we do not care about it should be enforced by removing it.
>
> As it stands we are effectively orphaning this data, never updating it,
> never verifying it and never cleaning it.
>
> I imagine we have something like a hash of file properties from
> state.yaml and are just updating the checked and synced ones, then writing
> out whatever else is there unchanged.  What I think we should do is clear
> the hash for the file, make new values, then save it.  We should do this in
> both audited and unaudited mode so that we remove stale information about a
> managed file from state.yaml.
>
> The impact of this is that as soon as a user changes the resources into a
> state where puppet _does_ care for this property - by enabling auditing -
> the information in there is wrong, doesn't reflect current state and causes
> audit events to fire in error.
>

Bingo.  I think this is the crux of the matter.
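
A rough sketch of the difference between the two behaviours, as I understand
the proposal.  The helper names, the BOOKKEEPING list and the shape of the
entry hash are assumptions made for illustration; none of this is Puppet's
actual storage code.

```ruby
# Hypothetical helpers (NOT Puppet's storage code).
BOOKKEEPING = [:checked, :synced].freeze

# Update in place: anything already stored for the resource survives,
# including stale property values nobody is looking at.
def update_in_place(old_entry, fresh)
  old_entry.merge(fresh)
end

# Scrub: keep the bookkeeping keys plus the properties audited on this
# run, and drop everything else.
def scrub_entry(old_entry, fresh, audited = [])
  allowed = BOOKKEEPING + audited
  old_entry.merge(fresh).select { |key, _| allowed.include?(key) }
end

stale = { type: :absent, checked: Time.at(0), synced: Time.at(0) }
fresh = { checked: Time.now }

update_in_place(stale, fresh).key?(:type)  # => true, the bogus :type lingers
scrub_entry(stale, fresh).key?(:type)      # => false, it has been dropped
```

Note that this only helps for properties that are not being audited on the
current run.  It does not tell us whether a stored value for an audited
property is correct.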


> We should care about the first problem, but right now I think it's a
> distraction.
>

I'm not so sure; I think writing the state that was found rather than the
state that was left, if that is what's happening, may also be an issue we
need to look at.

> How that :type got in there doesn't matter right now.  What matters is
> that a few people on IRC confirmed they have similar bogus data in their
> state.yaml, so we should focus on the 2nd issue: how do we make puppet's
> handling of state.yaml more robust, so that old bugs that dumped unexpected
> or incompatible data into it don't impact new versions and cause them to
> send unexpected notifies?
>

I'm not sure that the "how it got in there" part is irrelevant (for
instance, I'd like it if you could confirm that state.yaml shows
type=>absent on a node that has not been upgraded, and note the version),
but I agree that a borked state.yaml shouldn't cause a node to panic and go
into spin mode.  The question is, how do we detect it?  That is, how do we
distinguish it from "correct" information?

-- M
-----------------------------------------------------------
When in trouble or in doubt, run in circles,
scream and shout. -- 1920's parody of the
maritime general prudential rule
------------------------------------------------------------
