It'll support dynamic-key, so any property will work.

On Wed, Oct 7, 2015 at 2:38 PM, chenlin rao <[email protected]> wrote:
> I hope there is a stats about metrics based on $programname, $severity,
> $fromhost-ip etc, extends the ruleset(impstats).
>
> 2015-10-07 16:19 GMT+08:00 singh.janmejay <[email protected]>:
>
>> --
>> Regards,
>> Janmejay
>>
>> PS: Please blame the typos in this mail on my phone's uncivilized soft
>> keyboard sporting it's not-so-smart-assist technology.
>>
>> On Oct 7, 2015 3:26 AM, "David Lang" <[email protected]> wrote:
>> >
>> > On Wed, 7 Oct 2015, singh.janmejay wrote:
>> >
>> >> --
>> >> Regards,
>> >> Janmejay
>> >>
>> >> PS: Please blame the typos in this mail on my phone's uncivilized soft
>> >> keyboard sporting it's not-so-smart-assist technology.
>> >>
>> >> On Oct 6, 2015 10:32 PM, "David Lang" <[email protected]> wrote:
>> >>>
>> >>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>> >>>
>> >>>> It is possible to use global-variables (it'll require some
>> >>>> enhancements, table-support etc), but it'll be very inefficient
>> >>>> compared to this approach. For instance, choice of data-structure etc
>> >>>> allows making the solution a lot more efficient.
>> >>>
>> >>> As for the data structures, Rainer has been identifying inefficiencies
>> >>> in how json-c works and working to improve them
>> >>
>> >> That optimizes the variable system. But it is still a general-purpose
>> >> variable system. It can't and shouldn't understand relationships
>> >> between variables.
>> >
>> > what relationships are there between the different metrics?
>> >
>>
>> The fact that they are read in one shot, reported and reset.
>>
>> >>>
>> >>>> Here it's possible to locklessly increment counters in most cases, so
>> >>>> its overhead is a lot lower than global-variables.
>> >>>
>> >>> how can you manage counters in multiple threads without locks?
>> >>> Especially when dealing with batches.
>> >>>
>> >>
>> >> Consider a trie based implementation. With bounded fanout-factor, it's
>> >> O(1) wrt metric-names cardinality. It also has very little lock
>> >> contention involved. Usually operations work with read-locks; only when
>> >> a new metric is initialized does it require a write lock on the parent
>> >> node. If recycles are few and far apart, lock contention would be
>> >> negligible.
>> >
>> > if you have multiple threads that may need to update the same metric at
>> > the same time, a tree doesn't eliminate the locking.
>> >
>>
>> The only situation involving a lock that is contended for is when a metric
>> is to be initialized. Consider this trie:
>>
>> A -> B -> C
>>        -> D
>>
>> Now for incrementing key ABC, no contention exists, because it involves
>> read-locks only. It just uses atomic-increment to bump the counter at node
>> C. Same for ABD.
>>
>> But ABE will require a write-lock at node B, because node E doesn't exist
>> yet. However, keys that do not share a parent can again be initialized
>> concurrently. Init operation can be amortized over a large set of increment
>> operations, making its cost negligible (this is the knob that reset interval
>> exposes).
>>
>> > The current json-c locking is being made intentionally over-broad right
>> > now because it appears that some json-c code is not thread-safe and we
>> > haven't identified it yet. Once that's tracked down and fixed (or json-c
>> > replaced), updating one item should not require locking any more than
>> > that item.
>> >
>> >>>
>> >>>> Recycle is precisely to allow this lockless mechanism to work. It's
>> >>>> basically saying, it'll track metric-names it has seen in the last 1
>> >>>> hour. If we kill tracking of it as soon as we don't see an increment
>> >>>> (between 2 reporting runs of impstats), it'll lead to unnecessary
>> >>>> churn when low-values are common or load is not uniform in time.
>> >>>
>> >>> that depends on the cost of initializing a metric vs the cost of
>> >>> tracking the recycle mechanism.
>> >>>
>> >>
>> >> 0 value data-points can easily be filtered out, so they don't create any
>> >> processing overhead downstream. Cost of tracking for recycle is minimal
>> >> because it's a single counter being tracked; when it reaches zero it's
>> >> reset to its original starting value and the trie is killed after
>> >> reporting accumulated stats.
>> >
>> > actually, filtering out 0 data-points can be a very bad thing. Far too
>> > many monitoring tools produce straight-line graphs/estimates between
>> > reported data-points, so it's very important to report 0 value
>> > data-points
>> >
>>
>> I agree.
>>
>> >
>> >>>> Implementing it on top of global-variables not only has a very high
>> >>>> performance-penalty (it'll be prohibitive for high-throughput
>> >>>> scenarios), it also exposes too much complexity to the user (where
>> >>>> user has to worry about reset etc).
>> >>>>
>> >>>> I don't plan to have a scheduler in this implementation.
>> >>>> GetAllStatsLines call will purge the tree instead of reset at that
>> >>>> interval. It's basically a balance between freeing-up memory occupied
>> >>>> by stale-metric-names vs. performance (lockless handling of
>> >>>> increment). So it will be governed by the impstats schedule. Maybe I
>> >>>> should change the name to a better one (equivalent of
>> >>>> purge_known_keys_after_they_have_been_reported_N_times).
>> >>>
>> >>> if this is just adding additional metrics to the impstats output that
>> >>> eliminates the scheduler/reset issue.
>> >>>
>> >>> I think we should have a metric configuration be fairly static, allow
>> >>> configuring custom metrics and add to them, but don't use data from the
>> >>> message as part of the name of the metric, and continue reporting them
>> >>> forever, even if they are 0 (so no need to 'recycle' names)
>> >>
>> >> Dynamic metrics are a real usecase for any shared system (utilisation
>> >> across several users, several hosts, several clusters, several-subnets
>> >> etc are easily reportable with this). The only way to report utilisation
>> >> in many scenarios is to have dyn-metric names. The alternative is to
>> >> pre-declare all keys, but that to me is a more indirect solution. It's
>> >> not as flexible/adaptive.
>> >>
>> >> I think declarative static-key is a useful feature on its own, e.g. when
>> >> classifying reportable metrics into buckets known in advance, but
>> >> dyn-key and configurable-static-key are not interchangeable.
>> >
>> > dynamic systems will have pathological failure conditions. Consider
>> > what happens when someone uses hostname in a dynafile template and then
>> > some system starts spewing malformed logs that put garbage data in that
>> > field. Creating hundreds or thousands of metric variables is much worse.
>>
>> The max-cardinality optional field in dyn-metric-namespace declaration was
>> exactly to prevent this kind of unbounded growth. We can choose a sensible
>> default.
>>
>> >
>> > I agree that pre-declared keys are less flexible, but they are also going
>> > to be far safer and easier to deal with.
>> >
>> > David Lang
>> > _______________________________________________
>> > rsyslog mailing list
>> > http://lists.adiscon.net/mailman/listinfo/rsyslog
>> > http://www.rsyslog.com/professional-services/
>> > What's up with rsyslog? Follow https://twitter.com/rgerhards
>> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>> > of sites beyond our control.
>> > PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
--
Regards,
Janmejay
http://codehunk.wordpress.com
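
The bookkeeping discussed in this thread (metric names kept in a trie, a max-cardinality cap, and a purge once the known keys have been reported N times) can be sketched roughly as below. This is a single-threaded Python illustration under stated assumptions, not rsyslog code: the names DynStats, max_cardinality and purge_after_reports are hypothetical, and the comments only mark where the read-lock, atomic-increment and write-lock steps described in the thread would sit in a threaded implementation. Zero-valued keys are still reported until the purge, as agreed above.

```python
class Node:
    """One trie node (hypothetical structure, for illustration only)."""

    def __init__(self):
        self.children = {}      # next character -> Node
        self.count = 0          # counter value since the last report
        self.is_metric = False  # True only on the node that ends a known key


class DynStats:
    """Hypothetical dynamic-metric namespace; all names are assumptions."""

    def __init__(self, max_cardinality=1000, purge_after_reports=3):
        self.max_cardinality = max_cardinality
        self.purge_after_reports = purge_after_reports
        self._purge()

    def _purge(self):
        # Forget every known key and restart the report countdown.
        self.root = Node()
        self.cardinality = 0
        self.reports_left = self.purge_after_reports

    def _lookup(self, key):
        # Fast path: a threaded version would only need read-locks here.
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node if node.is_metric else None

    def _insert(self, key):
        # Slow path: a threaded version would write-lock the parent node
        # whenever a missing child has to be created.
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, Node())
        node.is_metric = True
        self.cardinality += 1
        return node

    def inc(self, key, delta=1):
        node = self._lookup(key)
        if node is None:
            if self.cardinality >= self.max_cardinality:
                return False    # cap reached: the new key is not tracked
            node = self._insert(key)
        node.count += delta     # would be an atomic increment in C
        return True

    def _collect(self, node, prefix, stats):
        if node.is_metric:
            stats[prefix] = node.count  # zero values are reported too
            node.count = 0
        for ch, child in node.children.items():
            self._collect(child, prefix + ch, stats)

    def report(self):
        # Read and reset all counters in one pass; purge after N reports.
        stats = {}
        self._collect(self.root, "", stats)
        self.reports_left -= 1
        if self.reports_left == 0:
            self._purge()
        return stats
```

With max_cardinality=2, incrementing ABC and ABD succeeds while ABE is rejected until the trie has been purged, which is the bound on unbounded key growth mentioned above.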

