2015-10-06 18:04 GMT+02:00 singh.janmejay:
> Rainer,
>
> I see this as something completely outside the scope of variables.
> Building a stats collector over variables is possible, but then we are
> talking about a general-purpose language which allows building
> such complex things. This increases the scope of RainerScript, and with
> larger scope comes complexity. I feel this is in line with the other
> Lua discussion, where you emphasized that RainerScript should not
> become a fully general-purpose language?
>
> E.g. creating an atomic-increment function for variables requires that
> we educate users about what can and can't be done if the
> atomic-increment function is used anywhere on a variable, and what
> relationship they can expect it to have with other atomic-incrementing
> variables (which gets into the memory model).

Maybe I just feel overwhelmed at the moment with keeping track of
everything that is going on. How about this: we can merge it BUT flag it
as experimental. If all works out well, I am free starting early next
year to have a deep look at the overall design and to stick together all
those loose ends. I suspect that I would like to change a couple of
things in the interest of tying it all together well (like I currently
do in liblognorm). But if I need to carry all this legacy, that's really
a burden (e.g. liblognorm now contains the full v1 code as a copy, which
means it also needs to be somewhat maintained). I want to avoid this.

As long as we document this as an *interim* solution that is not
necessarily here to stay, and as such "use at your own risk and it will
probably break next year", I am sufficiently happy with that. We just
must be aware that things may really break, and there is a big chance
they actually will. And I don't want to hear about potential vuln or
compatibility issues or whatever when this code is changed/removed.

Also keep in mind that I probably need to totally revamp the variable
system, as json-c has many problematic parts for our use (which I
learned when digging deep with liblognorm). So I *know* that there are
big changes coming up next year!

And, full ack: I want to limit the scope of RainerScript. Arrays were a
good example of why it may be a bad idea to go too boldly forward
without thinking about the big picture.

Rainer

>
>
>
> On Tue, Oct 6, 2015 at 8:49 PM, Rainer Gerhards wrote:
>> I can't fully dig into this, but I think we must *very carefully*
>> evaluate the overall design. Some time ago we introduced arrays for
>> the limited liblognorm use case, and it hurts us every now and then
>> when folks want to use arrays for other use cases. It probably
>> makes sense to re-think how the variable engine etc. behaves before
>> adding more functionality.
>> And make sure that everything works smoothly
>> in all use cases. While anything else may take care of some use
>> cases, I fear we may get too fragmented. At least this is what I
>> learned in the past months' discussions.
>>
>> Anyone else?
>>
>> Rainer
>>
>> 2015-10-06 17:10 GMT+02:00 singh.janmejay:
>>> It is possible to use global-variables (it'll require some
>>> enhancements, table-support etc.), but it'll be very inefficient
>>> compared to this approach. For instance, the choice of data-structure
>>> etc. allows making the solution a lot more efficient.
>>>
>>> Here it's possible to locklessly increment counters in most cases, so
>>> its overhead is a lot lower than that of global-variables.
>>>
>>> Recycle is precisely to allow this lockless mechanism to work. It's
>>> basically saying it'll track metric-names it has seen in the last
>>> hour. If we kill tracking of a name as soon as we don't see an
>>> increment (between 2 reporting runs of impstats), it'll lead to
>>> unnecessary churn when low values are common or load is not uniform
>>> in time.
>>>
>>> Implementing it on top of global-variables not only has a very high
>>> performance penalty (it'll be prohibitive for high-throughput
>>> scenarios), it also exposes too much complexity to the user (where
>>> the user has to worry about reset etc.).
>>>
>>> I don't plan to have a scheduler in this implementation.
>>> The GetAllStatsLines call will purge the tree instead of resetting
>>> at that interval. It's basically a balance between freeing up memory
>>> occupied by stale metric-names vs. performance (lockless handling of
>>> increment). So it will be governed by the impstats schedule. Maybe I
>>> should change the name to a better one (the equivalent of
>>> purge_known_keys_after_they_have_been_reported_N_times).
>>>
>>>
>>> On Tue, Oct 6, 2015 at 4:30 PM, David Lang wrote:
>>>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am working on support for stats with dynamic names.
>>>>> This comes in handy
>>>>> in situations where the metric-name depends on the value of a
>>>>> certain attribute of the message.
>>>>>
>>>>> Say, for a central log-aggregation service, it's valuable to know
>>>>> the inbound message-count distribution across the
>>>>> application-clusters that send logs to it, or for a shared server,
>>>>> it's valuable to know the log volume generated per user, etc.
>>>>>
>>>>> I'm thinking of using a function-like interface to support this. It
>>>>> may look similar to this:
>>>>>
>>>>> ====================
>>>>> dyn_stats("user_msg_count")
>>>>>
>>>>> ...
>>>>>
>>>>> ruleset(...) {
>>>>>     ...
>>>>>     dyn_inc("user_msg_count", $.user)
>>>>>     ...
>>>>> }
>>>>> ====================
>>>>>
>>>>> dyn_stats signature looks like:
>>>>> dyn_stats(<name_space>, <resettable: default=true>,
>>>>>           <max_cardinality: default=10k>,
>>>>>           <recycle_metric_names_after: default=1hr>)
>>>>>
>>>>> dyn_inc signature looks like:
>>>>> dyn_inc(<name_space>, <metric_name>)
>>>>>
>>>>>
>>>>> Reporting would work similarly to static metrics via impstats.
>>>>> Mapping:
>>>>> statsobj_s.name = name_space
>>>>> statsobj_s.origin = "dyn"
>>>>> ctr_s.name = "foo" (say $.user had value foo)
>>>>>
>>>>>
>>>>> Thoughts / suggestions?
>>>>
>>>>
>>>> how is this different/better than global variables? (although we may
>>>> need to implement some functions, atomic inc/dec, copy+clear) If you
>>>> have pstats output in json format, you can even piggyback on its
>>>> schedule to output the data.
>>>>
>>>>
>>>> things like stats can very easily end up being expensive in terms of
>>>> locking (something global variables already have figured out), and
>>>> it sounds like you are proposing adding a scheduler of some sort to
>>>> output the data.
>>>>
>>>> variables should not need to be 'recycled', either they contain data
>>>> or they don't. If they contain data, you need to keep the data until
>>>> you do something with it, if they don't, you don't have to track
>>>> them.
>>>>
>>>>
>>>> I am actually doing this sort of thing external to rsyslog in SEC
>>>>
>>>> I have a template in rsyslog that contains hostname, fromhost-ip,
>>>> programname and I output it via improg to SEC. SEC accumulates
>>>> counters and has scheduled outputs to files.
>>>>
>>>> before I started using SEC for this, I used the same template to
>>>> output to a file and then for reports, used cut + sort + uniq -c to
>>>> extract the data I need. When the files only contain the significant
>>>> data, this is actually not bad to do, even at higher volumes.
>>>>
>>>> David Lang
>>>> _______________________________________________
>>>> rsyslog mailing list
>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>> http://www.rsyslog.com/professional-services/
>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
>>>> POST if you DON'T LIKE THAT.
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Janmejay
>>> http://codehunk.wordpress.com
>
>
>
> --
> Regards,
> Janmejay
> http://codehunk.wordpress.com
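
For readers following the thread, the behaviour proposed for dyn_stats / dyn_inc (per-namespace counters, a cap on the number of distinct metric names, and purging names once they have been reported N times without an increment) can be sketched roughly as below. This is only an illustration under stated assumptions, not rsyslog code: the class DynStats, its method names, the `dropped` counter and the `purge_after_reports` parameter are hypothetical, and the sketch is single-threaded, so it says nothing about the lockless increment discussed above.

```python
class DynStats:
    """Hypothetical model of one dyn_stats namespace (not rsyslog code)."""

    def __init__(self, name_space, resettable=True, max_cardinality=10000,
                 purge_after_reports=3):
        self.name_space = name_space
        self.resettable = resettable
        self.max_cardinality = max_cardinality
        # Assumption: purge is expressed in reporting runs, as in the
        # purge_known_keys_after_they_have_been_reported_N_times idea.
        self.purge_after_reports = purge_after_reports
        self.counters = {}      # metric_name -> count
        self.idle_reports = {}  # metric_name -> reports without increment
        self.touched = set()    # names incremented since the last report
        self.dropped = 0        # increments rejected by the cardinality cap

    def inc(self, metric_name):
        """Rough equivalent of dyn_inc(<name_space>, <metric_name>)."""
        if metric_name not in self.counters:
            if len(self.counters) >= self.max_cardinality:
                self.dropped += 1
                return False
            self.counters[metric_name] = 0
            self.idle_reports[metric_name] = 0
        self.counters[metric_name] += 1
        self.touched.add(metric_name)
        return True

    def report(self):
        """Return a snapshot, then reset counters and purge stale names."""
        snapshot = dict(self.counters)
        for name in list(self.counters):
            if name in self.touched:
                self.idle_reports[name] = 0
            else:
                self.idle_reports[name] += 1
            if self.idle_reports[name] >= self.purge_after_reports:
                # Stale name: stop tracking it to free memory.
                del self.counters[name]
                del self.idle_reports[name]
            elif self.resettable:
                self.counters[name] = 0
        self.touched.clear()
        return snapshot


if __name__ == "__main__":
    stats = DynStats("user_msg_count", max_cardinality=2,
                     purge_after_reports=2)
    for user in ["foo", "foo", "bar", "baz"]:
        stats.inc(user)
    print(stats.report())   # first reporting run
    stats.inc("foo")
    print(stats.report())   # "bar" is idle once, still tracked
    print(stats.report())   # "bar" is idle twice and purged afterwards
```

Keeping an idle name around for a few reporting runs, instead of dropping it at the first quiet interval, is what avoids the churn described above when low values are common or load is uneven over time.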

