Sure, sounds good. The ability to gather stats with a dynamic key is important. I am even willing to help with a rewrite if at some point we feel it's best implemented in a different way in light of the variable-implementation changes.
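
To make the semantics discussed in the quoted thread concrete, here is a rough, single-threaded Python sketch of the dyn_stats / dyn_inc idea. It is an illustration only and not the rsyslog implementation: the class name DynStats, the purge_after_idle_reports parameter (standing in for recycle_metric_names_after, counted in reporting runs rather than in time), and the tuple layout returned by get_all_stats_lines are all assumptions made for this example. It also does not attempt the lockless increment path; it only shows cardinality limiting, reset on report, and purging of stale metric names at reporting time.

```python
class DynStats:
    """Hypothetical model of one dyn_stats name-space (not rsyslog code)."""

    def __init__(self, name_space, resettable=True,
                 max_cardinality=10_000, purge_after_idle_reports=2):
        self.name_space = name_space
        self.resettable = resettable
        self.max_cardinality = max_cardinality
        # Assumption: staleness is counted in reporting runs, not wall-clock time.
        self.purge_after_idle_reports = purge_after_idle_reports
        self._counts = {}      # metric_name -> current count
        self._idle = {}        # metric_name -> consecutive reports without increment
        self._touched = set()  # metric names incremented since the last report

    def inc(self, metric_name):
        """Model of dyn_inc(name_space, metric_name); False if over cardinality."""
        if metric_name not in self._counts:
            if len(self._counts) >= self.max_cardinality:
                return False
            self._counts[metric_name] = 0
            self._idle[metric_name] = 0
        self._counts[metric_name] += 1
        self._touched.add(metric_name)
        return True

    def get_all_stats_lines(self):
        """Report every known metric, then reset counters and purge stale names."""
        rows = []
        for name in sorted(self._counts):
            # (statsobj name, origin, counter name, value), per the mapping in the thread
            rows.append((self.name_space, "dyn", name, self._counts[name]))
            self._idle[name] = 0 if name in self._touched else self._idle[name] + 1
            if self._idle[name] >= self.purge_after_idle_reports:
                del self._counts[name]
                del self._idle[name]
            elif self.resettable:
                self._counts[name] = 0
        self._touched.clear()
        return rows
```

In this sketch nothing runs on its own schedule: purging happens only when the reporter (impstats, in the real proposal) asks for the stats lines, which is the trade-off between freeing memory held by stale metric names and keeping the increment path cheap.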

On Tue, Oct 6, 2015 at 9:48 PM, Rainer Gerhards <[email protected]> wrote:
> 2015-10-06 18:04 GMT+02:00 singh.janmejay <[email protected]>:
>> Rainer,
>>
>> I see this as something completely outside the scope of variables.
>> Building stats collector over variables is possible, but then we are
>> talking about a general purpose language which allows building
>> such complex things. This increases the scope of Rainerscript and with
>> larger scope comes complexity. I feel this is in-line with the other
>> Lua discussion where you emphasized that Rainerscript should not
>> become a fully-general-purpose language?
>>
>> Eg. creating an atomic-increment function for variable requires that
>> we educate users about what can and can't be done if atomic-increment
>> function is used anywhere on a variable. What relationship they can
>> expect it to have with other atomic-incrementing variables (which gets
>> into memory model).
>
> Maybe I just feel overwhelmed in the moment with keeping track of
> everything that is going on. How about this: we can merge it BUT flag
> it as experimental. If all works out well, I am free starting early
> next year to have a deep look at the overall design and sticking
> together all those loose edges. I suspect that I would like to change
> a couple of things in the interest of tying it all well together (like
> I currently do in liblognorm).
>
> But if I need to carry all this legacy, that's really a burden (e.g.
> liblognorm now contains the full v1 code as a copy, which means it
> also needs to be somewhat maintained). I want to avoid this. As long
> as we document this as an *interim* solution that is not necessarily
> here to stay and as such "use at your own risk and it will probably
> break next year" I am sufficiently happy with that. We just must be
> aware that things may really break and there is a big chance they
> actually will.
> And I don't want to hear about potential vuln or
> compatibility issues or whatever when this code is changed/removed.
> Also keep in mind that I probably need to totally revamp the
> variable system, as json-c has many problematic parts for our use
> (what I learned when digging deep with liblognorm). So I *know* that
> there are big changes coming up next year!
>
> And, full ack: I want to limit the scope of RainerScript. Arrays was a
> good sample of why it may be a bad idea to go too boldly forward
> without thinking about the big picture.
>
> Rainer
>>
>>
>>
>> On Tue, Oct 6, 2015 at 8:49 PM, Rainer Gerhards
>> <[email protected]> wrote:
>>> I can't fully dig into this, but I think we must *very carefully*
>>> evaluate the overall design. Some time ago we introduced arrays for
>>> the limited liblognorm use case, and it hurts us every now and then
>>> when folks want to use arrays for other use cases. It may probably
>>> make sense to re-think how the variable engine etc behaves before
>>> adding more functionality. And make sure that everything works smooth
>>> in all use cases. While anything else may take care for some use
>>> cases, I fear we may get too fragmented. At least this is what I
>>> learned in the past months discussions.
>>>
>>> Anyone else?
>>>
>>> Rainer
>>>
>>> 2015-10-06 17:10 GMT+02:00 singh.janmejay <[email protected]>:
>>>> It is possible to use global-variables (it'll require some
>>>> enhancements, table-support etc), but it'll be very inefficient
>>>> compared to this approach. For instance, choice of data-structure etc
>>>> allows making the solution a lot more efficient.
>>>>
>>>> Here it's possible to locklessly increment counters in most cases, so
>>>> its overhead is a lot lower than global-variables.
>>>>
>>>> Recycle is precisely to allow this lockless mechanism to work. It's
>>>> basically saying it'll track metric-names it has seen in the last 1 hour.
>>>> If we kill tracking of it as soon as we don't see an increment
>>>> (between 2 reporting runs of impstats), it'll lead to unnecessary
>>>> churn when low-values are common or load is not uniform in time.
>>>>
>>>> Implementing it on top of global-variables not only has a very high
>>>> performance-penalty (it'll be prohibitive for high-throughput
>>>> scenarios), it also exposes too much complexity to the user (where
>>>> user has to worry about reset etc).
>>>>
>>>> I don't plan to have a scheduler in this implementation.
>>>> GetAllStatsLines call will purge the tree instead of reset at that
>>>> interval. It's basically a balance between freeing-up memory occupied
>>>> by stale-metric-names vs. performance (lockless handling of
>>>> increment). So it will be governed by the impstats schedule. Maybe I
>>>> should change the name to a better name (equivalent of
>>>> purge_known_keys_after_they_have_been_reported_N_times).
>>>>
>>>>
>>>> On Tue, Oct 6, 2015 at 4:30 PM, David Lang <[email protected]> wrote:
>>>>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am working on support for stats with dynamic-name. This comes handy
>>>>>> in situations where metric-name is dependent upon value of a certain
>>>>>> attribute of the message.
>>>>>>
>>>>>> Say, for a central log-aggregation service, it's valuable to know what
>>>>>> is inbound message-count distribution across application-clusters that
>>>>>> send logs to it, or for a shared-server, it's valuable to know what is
>>>>>> the log-volume generation across users etc.
>>>>>>
>>>>>> I'm thinking of using a function-like interface to support this. It may
>>>>>> look similar to this:
>>>>>>
>>>>>> ====================
>>>>>> dyn_stats("user_msg_count")
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> ruleset(...) {
>>>>>> ...
>>>>>> dyn_inc("user_msg_count", $.user)
>>>>>> ...
>>>>>> }
>>>>>> ====================
>>>>>>
>>>>>> dyn_stats signature looks like:
>>>>>> dyn_stats(<name_space>, <resettable: default=true>, <max_cardinality:
>>>>>> default=10k>, <recycle_metric_names_after: default=1hr>)
>>>>>>
>>>>>> dyn_inc signature looks like:
>>>>>> dyn_inc(<name_space>, <metric_name>)
>>>>>>
>>>>>>
>>>>>> Reporting would work similar to static-metric via impstats. Mapping:
>>>>>> statsobj_s.name = name_space
>>>>>> statsobj_s.origin = "dyn"
>>>>>> ctr_s.name = "foo" (say $.user had value foo)
>>>>>>
>>>>>>
>>>>>> Thoughts / suggestions?
>>>>>
>>>>>
>>>>> how is this different/better than global variables? (although we may need to
>>>>> implement some functions, atomic inc/dec copy+clear) If you have pstats
>>>>> output in json format, you can even piggyback on its schedule to output the
>>>>> data.
>>>>>
>>>>>
>>>>> things like stats can very easily end up being expensive in terms of locking
>>>>> (something global variables already have figured out), and it sounds like
>>>>> you are proposing adding a scheduler of some sort to output the data.
>>>>>
>>>>> variables should not need to be 'recycled', either they contain data or they
>>>>> don't. If they contain data, you need to keep the data until you do
>>>>> something with it, if they don't, you don't have to track them.
>>>>>
>>>>>
>>>>> I am actually doing this sort of thing external to rsyslog in SEC
>>>>>
>>>>> I have a template in rsyslog that contains hostname, fromhost-ip,
>>>>> programname and I output it via improg to SEC. SEC accumulates counters and
>>>>> has scheduled outputs to files.
>>>>>
>>>>> before I started using SEC for this, I used the same template to output to a
>>>>> file and then for reports, used cut + sort + uniq -c to extract the data I
>>>>> need. When the files only contain the significant data, this is actually not
>>>>> bad to do, even at higher volumes.
>>>>>
>>>>> David Lang
>>>>> _______________________________________________
>>>>> rsyslog mailing list
>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>> http://www.rsyslog.com/professional-services/
>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>> DON'T LIKE THAT.
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Janmejay
>>>> http://codehunk.wordpress.com
>>
>>
>>
>> --
>> Regards,
>> Janmejay
>> http://codehunk.wordpress.com

--
Regards,
Janmejay
http://codehunk.wordpress.com

