Re: [rsyslog] RFC: dynamic-stats support

David Lang Tue, 06 Oct 2015 14:56:28 -0700

On Wed, 7 Oct 2015, singh.janmejay wrote:

--
Regards,
Janmejay


PS: Please blame the typos in this mail on my phone's uncivilized soft
keyboard sporting its not-so-smart-assist technology.

On Oct 6, 2015 10:32 PM, "David Lang" <[email protected]> wrote:


On Tue, 6 Oct 2015, singh.janmejay wrote:

It is possible to use global-variables (it'll require some
enhancements, table-support etc), but it'll be very inefficient
compared to this approach. For instance, choice of data-structure etc
allows making the solution a lot more efficient.



As for the data structures, Rainer has been identifying inefficiencies in
how json-c works and working to improve them


That optimizes the variable system. But it is still a general-purpose variable
system. It can't and shouldn't understand relationships between variables.


what relationships are there between the different metrics?

Here it's possible to locklessly increment counters in most cases, so
its overhead is a lot lower than that of global-variables.



how can you manage counters in multiple threads without locks? Especially
when dealing with batches.


Consider a trie-based implementation. With a bounded fanout-factor, it's O(1)
wrt metric-name cardinality. It also involves very little lock contention.
Operations usually work with read-locks; only when a new metric is
initialized does it require a write lock on the parent node. If recycles are
few and far between, lock contention would be negligible.
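
The trie idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not rsyslog code: the names `Node`, `MetricTrie`, `create_lock` and `count_lock` are hypothetical, plain mutexes stand in for the read/write locks, and the per-counter lock stands in for whatever atomic add a C implementation might use. The lock-free lookup relies on CPython's dict reads being safe under the interpreter lock; in C that step would be the read-locked one.

```python
import threading


class Node:
    """One trie node; any node may also carry a counter for a metric name."""

    def __init__(self):
        self.children = {}                   # next character -> Node
        self.create_lock = threading.Lock()  # taken only when adding a child
        self.count_lock = threading.Lock()   # stand-in for an atomic add
        self.count = 0


class MetricTrie:
    def __init__(self):
        self.root = Node()

    def _find_or_create(self, name):
        node = self.root
        for ch in name:  # one step per character, independent of how many names exist
            child = node.children.get(ch)
            if child is None:
                # Slow path: only a previously unseen prefix locks its parent.
                with node.create_lock:
                    child = node.children.setdefault(ch, Node())
            node = child
        return node

    def increment(self, name, delta=1):
        leaf = self._find_or_create(name)
        # Two threads bumping the same metric still have to serialize here.
        with leaf.count_lock:
            leaf.count += delta

    def value(self, name):
        node = self.root
        for ch in name:
            node = node.children.get(ch)
            if node is None:
                return 0
        return node.count
```

In this sketch, threads touching different metrics never contend once the names exist, while updates to the same metric are still serialized at the counter itself.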

if you have multiple threads that may need to update the same metric at the
same time, a tree doesn't eliminate the locking.

The current json-c locking is being made intentionally over-broad right now
because it appears that some json-c code is not thread-safe and we haven't
identified it yet. Once that's tracked down and fixed (or json-c replaced),
updating one item should not require locking any more than that item.

Recycle is precisely to allow this lockless mechanism to work. It's
basically saying it'll track metric-names it has seen in the last hour.
If we kill tracking of it as soon as we don't see an increment
(between 2 reporting runs of impstats), it'll lead to unnecessary
churn when low-values are common or load is not uniform in time.



that depends on the cost of initializing a metric vs the cost of tracking
the recycle mechanism.


0 value data-points can easily be filtered out, so they don't create any
processing overhead downstream. The cost of tracking for recycle is minimal
because it's a single counter being tracked; when it reaches zero it's
reset to its original starting value and the trie is killed after reporting
accumulated stats.

actually, filtering out 0 data-points can be a very bad thing. Far too many
monitoring tools produce straight-line graphs/estimates between reported
data-points, so it's very important to report 0 value data-points
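
To make the interpolation concern concrete, here is a small worked example in Python. The data points and the `interpolate` helper are made up for illustration; they only mimic what a tool that draws straight lines between reported points would estimate.

```python
def interpolate(points, t):
    """Linear interpolation between the two reported points surrounding t."""
    points = sorted(points)
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the reported range")


# Hypothetical per-interval counts for one metric: (minute, messages).
reported = [(0, 100), (1, 0), (2, 0), (3, 0), (4, 100)]

# The same series with the 0 value data-points filtered out.
without_zeros = [(t, v) for t, v in reported if v != 0]

print(interpolate(reported, 2))       # 0.0
print(interpolate(without_zeros, 2))  # 100.0
```

With the zeros dropped, the estimate for minute 2 is 100 messages even though nothing was logged in that interval.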

Implementing it on top of global-variables not only has a very high
performance-penalty (it'll be prohibitive for high-throughput
scenarios), it also exposes too much complexity to the user (where the
user has to worry about reset etc).

I don't plan to have a scheduler in this implementation. The
GetAllStatsLines call will purge the tree instead of resetting it at that
interval. It's basically a balance between freeing up memory occupied
by stale metric-names vs. performance (lockless handling of
increment). So it will be governed by the impstats schedule. Maybe I
should change the name to a better one (equivalent of
purge_known_keys_after_they_have_been_reported_N_times).
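
One way to read the purge-after-N-reports idea is sketched below in Python. It is an illustration under assumptions, not the proposed implementation: the class `DynStats`, the parameter `reports_before_purge` and the method `get_all_stats_lines` are hypothetical names, a flat dict stands in for the trie, and counters are assumed to be reset to zero after every report.

```python
class DynStats:
    """Dynamic counters whose names are forgotten after N reporting runs."""

    def __init__(self, reports_before_purge):
        self.reports_before_purge = reports_before_purge
        self.remaining = reports_before_purge  # countdown until the next purge
        self.counters = {}                     # metric name -> count

    def increment(self, name, delta=1):
        self.counters[name] = self.counters.get(name, 0) + delta

    def get_all_stats_lines(self):
        # Report everything currently tracked, including names sitting at 0.
        lines = [f"{name}={value}" for name, value in sorted(self.counters.items())]
        self.remaining -= 1
        if self.remaining == 0:
            # Countdown expired: drop all names and restart the countdown.
            self.counters.clear()
            self.remaining = self.reports_before_purge
        else:
            # Otherwise keep the names and only zero the values.
            for name in self.counters:
                self.counters[name] = 0
        return lines
```

With `reports_before_purge=2`, a name incremented once is reported with its count, then once more as 0, and disappears from the third report unless it is seen again.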



if this is just adding additional metrics to the impstats output, that
eliminates the scheduler/reset issue.


I think we should have a metric configuration be fairly static, allow
configuring custom metrics and add to them, but don't use data from the
message as part of the name of the metric, and continue reporting them
forever, even if they are 0 (so no need to 'recycle' names)

Dynamic metrics are a real use case for any shared system (utilisation across
several users, several hosts, several clusters, several subnets etc. is
easily reportable with this). The only way to report utilisation in many
scenarios is to have dyn-metric names. The alternative is to pre-declare
all keys, but that to me is a more indirect solution. It's not as
flexible/adaptive.

I think declarative static-key is a useful feature on its own, e.g. when
classifying reportable metrics into buckets known in advance, but dyn-key
and configurable-static-key are not interchangeable.

dynamic systems will have pathological failure conditions. Consider what
happens when someone uses hostname in a dynafile template and then some system
starts spewing malformed logs that put garbage data in that field. Creating
hundreds or thousands of metric variables is much worse.

I agree that pre-declared keys are less flexible, but they are also going to be
far safer and easier to deal with.
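
The failure mode can be illustrated with a short Python sketch. The hostnames and both helper functions are invented for the example; it only contrasts how many metrics exist after a burst of garbage values under each approach.

```python
from collections import Counter


def count_dynamic(hostnames):
    """Dynamic keys: every distinct field value becomes its own metric."""
    return Counter(hostnames)


def count_predeclared(hostnames, declared):
    """Pre-declared keys: only names configured in advance are counted."""
    stats = {name: 0 for name in declared}  # always present, even at 0
    for host in hostnames:
        if host in stats:
            stats[host] += 1
    return stats


good = ["web1", "web2", "web1"]
# Hypothetical malformed logs that put junk into the hostname field.
garbage = [f"junk-{i}" for i in range(1000)]

print(len(count_dynamic(good + garbage)))                        # 1002
print(len(count_predeclared(good + garbage, ["web1", "web2"])))  # 2
```

The dynamic table grows with whatever the senders put in the field, while the pre-declared one stays at its configured size.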


David Lang
