It is possible to use global variables (that would require some enhancements, such as table support), but it would be very inefficient compared to this approach. For instance, a dedicated implementation is free to choose its own data structure, which makes the solution a lot more efficient.
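To make the data-structure point a bit more concrete, here is a rough sketch (in Python for brevity, not the actual rsyslog code). It assumes a per-namespace table of counters with a cardinality cap, where names that see no increments for a given number of reporting runs are dropped. The class name DynStats, the fields dropped and purge_after_reports, and the choice to report idle names with a count of 0 are all hypothetical. The sketch also does not show the lockless part: in a C implementation the common case (incrementing a name that is already in the table) could be a plain atomic add, and only inserting a new name or purging would need exclusive access.

```python
class DynStats:
    """Hypothetical per-namespace table of dynamically named counters."""

    def __init__(self, name_space, max_cardinality=10_000, purge_after_reports=3):
        self.name_space = name_space
        self.max_cardinality = max_cardinality
        self.purge_after_reports = purge_after_reports
        # metric_name -> [count since last report, consecutive idle reports]
        self._counters = {}
        # increments refused because the table was already full
        self.dropped = 0

    def inc(self, metric_name):
        """Increment a counter; refuse new names once the table is full."""
        entry = self._counters.get(metric_name)
        if entry is None:
            # Slow path: a new name has to be inserted into the table.
            if len(self._counters) >= self.max_cardinality:
                self.dropped += 1
                return False
            entry = self._counters[metric_name] = [0, 0]
        # Fast path: the name is already known, only the count changes.
        entry[0] += 1
        return True

    def report(self):
        """Return {name: count} for this run, reset counts, purge stale names."""
        snapshot = {}
        for name in list(self._counters):
            entry = self._counters[name]
            snapshot[name] = entry[0]
            if entry[0] == 0:
                # No increment since the previous run: one more idle report.
                entry[1] += 1
                if entry[1] >= self.purge_after_reports:
                    del self._counters[name]
            else:
                entry[0] = 0
                entry[1] = 0
        return snapshot


stats = DynStats("user_msg_count", purge_after_reports=2)
stats.inc("foo")
print(stats.name_space, stats.report())
```

Because a name is kept for a few idle runs instead of being removed at the first quiet interval, a bursty sender does not cause its entry to be created and destroyed over and over.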
Here it's possible to increment counters locklessly in most cases, so the overhead is a lot lower than with global variables.

Recycle exists precisely to allow this lockless mechanism to work. It basically says that rsyslog will keep tracking the metric names it has seen in the last hour. If we stopped tracking a name as soon as we didn't see an increment (between two reporting runs of impstats), it would lead to unnecessary churn when low values are common or the load is not uniform over time.

Implementing this on top of global variables not only carries a very high performance penalty (it would be prohibitive for high-throughput scenarios), it also exposes too much complexity to the user (who then has to worry about reset etc.).

I don't plan to have a scheduler in this implementation. The GetAllStatsLines call will purge the tree at that interval, instead of just resetting it. It's basically a balance between freeing up memory occupied by stale metric names and performance (lockless handling of increments). So it will be governed by the impstats schedule.

Maybe I should change the name to something better (the equivalent of purge_known_keys_after_they_have_been_reported_N_times).

On Tue, Oct 6, 2015 at 4:30 PM, David Lang <[email protected]> wrote:
> On Tue, 6 Oct 2015, singh.janmejay wrote:
>
>> Hi,
>>
>> I am working on support for stats with dynamic-name. This comes handy
>> in situations where metric-name is dependent upon value of a certain
>> attribute of the message.
>>
>> Say, for a central log-aggregation service, it's valuable to know what
>> is inbound message-count distribution across application-clusters that
>> send logs to it, or for a shared-server, it's valuable to know what is
>> the log-volume generation across users etc.
>>
>> I'm thinking of using functions-like interface to support this. It may
>> look similar to this:
>>
>> ====================
>> dyn_stats("user_msg_count")
>>
>> ...
>>
>> ruleset(...) {
>> ...
>> dyn_inc("user_msg_count", $.user)
>> ...
>> }
>> ====================
>>
>> dyn_stats signature looks like:
>> dyn_stats(<name_space>, <resettable: default=true>, <max_cardinality:
>> default=10k>, <recycle_metric_names_after: default=1hr>)
>>
>> dyn_inc signature looks like:
>> dyn_inc(<name_space>, <metric_name>)
>>
>>
>> Reporting would work similar to static-metric via impstats. Mapping:
>> statsobj_s.name = name_space
>> statsobj_s.origin = "dyn"
>> ctr_s.name = "foo" (say $.user had value foo)
>>
>>
>> Thoughts / suggestions?
>
>
> how is this different/better than global variables? (although we may need to
> implement some functions, atomic inc/dec copy+clear) If you have pstats
> output in json format, you can even piggyback on its schedule to output the
> data.
>
>
> things like stats can very easily end up being expensive in terms of locking
> (something global variables already have figured out), and it sounds like
> you are proposing adding a scheduler of some sort to output the data.
>
> variables should not need to be 'recycled', either they contain data or they
> don't. If they contain data, you need to keep the data until you do
> something with it, if they don't, you don't have to track them.
>
>
> I am actually doing this sort of thing external to rsyslog in SEC
>
> I have a template in rsyslog that contains hostname, fromhost-ip,
> programname and I output it via improg to SEC. SEC accumulates counters and
> has scheduled outputs to files.
>
> before I started using SEC for this, I used the same template to output to a
> file and then for reports, used cut + sort + uniq -c to extract the data I
> need. When the files only contain the significant data, this is actually not
> bad to do, even at higher volumes.
>
> David Lang
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
> LIKE THAT.

--
Regards,
Janmejay
http://codehunk.wordpress.com

