On Tue, May 10, 2011 at 09:29:59AM -0400, Mark Burgess wrote:
>
>Wow, Jesse -- someone actually reads the stuff I write. Thank you! ;-)

Yeah, it's interesting stuff, when I have time to read it.  :-)

For me, one of the strongest points in favor of CFengine is the amount
of research that has been put into it.

>What Cfengine does with monitoring is really different to any other
>software. Sometimes people say -- we don't want that monitoring stuff
>because we like RRD tool better, or something like that. But these offer
>quite different insights. I think we have a job to do making it easier
>to understand the value of the results. A lot of this is visual, so it
>belongs in a GUI.

One of the drawbacks of RRD files is that they are just data files.
Thus, they are inherently inert, and require an external process to make
sense of them (Ganglia, Cacti, MRTG, etc).  There is some anomaly
detection support in the form of Holt-Winters forecasting (and something
else in very new versions of rrdtool), but that's still of limited
use.  It's hard to get that anomaly information "out" of the .rrd file
and into a useful form.

A lot of it is visual, but a lot of it isn't.  It's hard to send a graph
to a pager (although with smartphones, this is less of an issue as of
late...).

>After 10 years of researching this, I think the usefulness of the
>anomaly detection classes is limited, because it is so context sensitive
>to many factors. It needs a human interpretation to make sense of it, so
>really these numbers are for the intelligent sysadmin to watch with
>interest.

I'd also humbly suggest that the lack of useful examples doesn't help
either.  There are a *few* examples, and they are frequently recycled
between documents.  I saw the same "entropy_www_in_low.www_high_dev1"
sort of example in almost every paper I read. ;-)

That said, the cfengine-Anomalies.pdf paper does have several more.

One general problem with CFengine is that it is "really big."  There's a
lot to wrap your head around, and a set of promises that work for *me*
at *my location* are probably useless for someone else.  So passing
around examples, while useful, is far from a "drop-in" sort of solution.

On the other hand, based on my understanding of how the anomaly detection
works, it's system (and site) neutral because the anomaly classes are
based on a baseline for that specific system.  So in theory, it should
be possible to come up with a fairly generic set of classes for finding
problems.

In practice...I'm not sure, I haven't tried yet.  :)
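
Here's a rough sketch of what I mean by "based on a baseline for that
specific system," in Python rather than cfengine syntax.  This is NOT how
cf-monitord computes things (as far as I understand the papers, it keeps
weighted averages over time rather than a plain mean); the function name,
the sample numbers, and the "_low_dev1" and "_normal" suffixes are my own
assumptions for illustration.  Only the "_high_dev1" meaning (more than 1
standard deviation above the average) comes from the discussion quoted
below.

```python
from statistics import mean, pstdev


def deviation_class(metric, history, current):
    """Name a cfengine-style class for `current` relative to this host's own history.

    Illustrative only: uses a plain mean and standard deviation of `history`.
    """
    avg = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # No variation recorded, so there is no scale to judge a deviation against.
        return f"{metric}_normal"
    if current > avg + sigma:
        return f"{metric}_high_dev1"
    if current < avg - sigma:
        return f"{metric}_low_dev1"  # hypothetical suffix, mirrors the high case
    return f"{metric}_normal"        # hypothetical suffix


# Two made-up hosts with very different baselines for the same metric.
quiet_host = [10, 12, 11, 9, 10, 11, 12, 10]
busy_host = [400, 420, 390, 410, 405, 415, 395, 400]

# The same reading is unusually high on one host and unusually low on the other.
print(deviation_class("messages", quiet_host, 50))
print(deviation_class("messages", busy_host, 50))
```

The point is that the rule itself never mentions a site-specific threshold;
each host only gets compared against its own history.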

>On 05/10/2011 02:26 PM, Jesse Becker wrote:
>> On Tue, May 10, 2011 at 02:08:22AM -0400, Jerome Baum wrote:
>>> On Tue, May 10, 2011 at 07:51, Aleksey Tsalolikhin
>>> <atsaloli.t...@gmail.com> wrote:
>>> What is entropy here and how it is computed?  Are both low and high
>>> entropy "bad"?  Or is low entropy good, high entropy bad?
>> Generally speaking, entropy is a measurement of "disorder" or
>> "variation."  There are specific, formal definitions, but I think that
>> these two are sufficient for now.
>>
>>> Low entropy is bad (not "bad" but bad, for security reasons). Entropy is 
>>> basically how much "randomness" is available, which is very important for 
>>> cryptographic systems -- such as SSL, SSH, and security in cfengine.
>> Right.  Entropy is typically used to make your RNG much more random. :)
>> It is possible to "run out" of entropy as well.  An example of this, on
>> Linux systems, is to compare the behavior of "od /dev/random" and
>> "od /dev/urandom".  The output from /dev/random will pause when you run
>> out of entropy, whereas output from /dev/urandom has no such limitation.
>> The data from /dev/random is considered much higher quality with regard
>> to randomness.
>>
>>> You tend to get low entropy on server systems w/out keyboard and mouse to 
>>> take entropy from. For further reading 
>>> http://en.wikipedia.org/wiki/Entropy_(computing) helps.
>> Yep, and various other sources as well (audio input, video, etc).
>>
>> Back to cfengine...
>>
>> The entropy and anomaly classes come from cf-monitord (so if you turn it
>> off, you won't get those classes).  The cf-monitord process will try to track
>> various metrics, and provide those to cf-agent.  It can actually watch
>> the traffic flows, and categorize traffic by port number, but this
>> requires, essentially, letting cf-monitord "sniff" all traffic--which
>> might not be acceptable in your environment.
>>
>> Metrics other than network traffic can also be checked.  Your other
>> email mentions "loadavg_high_ldt", which means that cf-monitord thinks
>> that, at that time, the load average was higher than usual based on the
>> "Leap-Detection Test" (hence "ldt").  You may also see entries like
>> "messages_high_dev1", which indicate that the current value of the
>> metric is more than 1 standard deviation above the average.
>>
>> This paper also talks about it in detail for CF2:
>>
>>      http://www.iu.hio.no/cfengine/docs/cfengine-Anomalies.pdf
>>
>> And this one goes into the mathematics behind it:
>>
>>      http://www.iu.hio.no/~mark/papers/anomaly.pdf
>>
>> One of the better explanations of how anomaly detection works is
>> actually in the SAGE short-topics booklet that Mark Burgess and Aeleen
>> Frisch wrote a few years back.  It uses CF2 syntax, but I believe that
>> the general concepts are still valid.  Unfortunately, it doesn't cover
>> the LDT stuff I mentioned before.
>>
>>      http://www.sage.org/pubs/16_cfengine/
>>
>> Unfortunately, I've been unable to find a paper that discusses anomaly
>> detection for CF3 in detail.
>>
>
>_______________________________________________
>Help-cfengine mailing list
>Help-cfengine@cfengine.org
>https://cfengine.org/mailman/listinfo/help-cfengine

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)