On 18/01/2019 19:25, Ken Gaillot wrote: > On Fri, 2019-01-18 at 18:53 +0100, Lars Ellenberg wrote: >> On Thu, Jan 17, 2019 at 09:09:11AM +1100, Andrew Beekhof wrote: >>> >>> >>>> On 17 Jan 2019, at 2:59 am, Ken Gaillot <kgail...@redhat.com> >>>> wrote: >>>> >>>> I'm not familiar with the reasoning for the current setup, but >>>> pacemaker's crm_crit(), crm_error(), etc. use qb_logt(), while >>>> crm_debug() and crm_trace() (which won't be used in ordinary >>>> runs) do >>>> something similar to what you propose. >>>> >>>> Pacemaker has about 1,700 logging calls that would be affected >>>> (not >>>> counting another 2,000 debug/trace). Presumably that means >>>> Pacemaker >>>> currently has about +16KB of memory overhead and binary size for >>>> debug/trace logging static pointers, and that would almost double >>>> using >>>> them for all logs. Not a big deal today? Or meaningful in an >>>> embedded >>>> context? >>>> >>>> Not sure if that overhead vs runtime trade-off is the original >>>> motivation or not, but that's the first thing that comes to mind. >>> >>> I believe my interest was the ability to turn them on dynamically >>> in a >>> running program (yes, i used it plenty back in the day) and have >>> the >>> overhead be minimal for the normal case when they weren't in use. >> >> Also, with libqb before the commit mentioned in the subject (633f262) >> and that is what pacemaker is using right now, >> you'd get one huge static array of "struct callsites" (in a special >> linker section; that's the linker magic that patch removes). >> >> Note: the whole struct was statically allocated, >> it is an array of structs, not just an array of pointers. >> >> sizeof(struct qb_log_callsite) is 40 > > OK, so prior to the change, pacemaker used ~66KB of static struct > allocation in the linker section for <=info logs, and ~16KB of normal > static pointer allocation for debug/trace. > > After the change, ~16KB is used statically, and an additional fraction > of 66KB is dynamically allocated according to what messages are hit. > > With your suggestion, an additional ~13KB would be used statically, and > some fraction of that would be saved from dynamic allocation (besides > the CPU efficiency). Could be worth it. > >> Now, those structs get dynamically allocated, >> and put in some lineno based lookup hash. >> (so already at least additional 16 bytes), >> not counting malloc overhead for all the tiny objects. > > true, but now only messages that are potentially logged get > (dynamically) allocated, whereas before, even messages in functions and > code blocks that will never be used in a given run were (statically) > allocated > >> The additional 8 byte static pointer >> is certainly not "doubling" that overhead. > > right, a better estimate is above > >> But can be used to skip the additional lookup, >> sprintf, memcpy and whatnot, and even the function call, >> if the callsite at hand is currently disabled, >> which is probably the case for most >= trace >> callsites most of the time. > > pacemaker already implements >=debug similar to what you propose, so it > likely wouldn't help much. other users of libqb might benefit though > >> Any volunteers to benchmark the cpu usage? >> I think we'd need >> (trace logging: {enabled, disabled}) > > maybe info logging for pacemaker; debug and trace won't be affected. > some other libqb app might be more significant. can also compare binary > sizes. > > I'm definitely not opposed to your suggestion. I think it probably > makes sense. > >> x ({before 633f262, >> after 633f262, >> after 633f262 + lars patch}) >> >> BTW, >> I think without the "linker magic" >> (static array of structs), >> the _log_filter_apply() becomes a no-op? >> That's qb_log_filter_ctl2() at runtime. >> It would have to iterate over all the collision lists in all the >> buckets >> of the dynamically allocated callsites, instead of iterating the >> (now non-existing) static array of callsites. >> >> One side-effect of not using a static pointer, >> but *always* doing the lookup (qb_log_calsite_get()) again, >> is that a potentially set _custom_filter_fn() would be called >> and that filter applied to the callsite, at each invocation. >> >> But I don't think that that is intentional? >>
(sorry for late reply, I've not been well lately) It looks like a good optimisation to me. As Pacemaker is the biggest used of libqb I'll take Ken's opinion very seriously :) Chrissie _______________________________________________ Developers mailing list Developers@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/developers