On Sat, Mar 19, 2022 at 12:10:11AM +0100, Alexander Bluhm wrote:
> On Thu, Mar 17, 2022 at 07:25:27AM +0000, Visa Hankala wrote:
> > On Thu, Mar 17, 2022 at 12:42:13AM +0100, Alexander Bluhm wrote:
> > > I would like to use btrace to debug reference counting.  The idea
> > > is to add a tracepoint for every type of refcnt we have.  When it
> > > changes, print the actual object, the current counter and the change
> > > value.
> > >
> > > Do we want that feature?
> >
> > I am against this in its current form. The code would become more
> > complex, and the trace points can affect timing. There is a risk that
> > the kernel behaves slightly differently when dt has been compiled in.
>
> On our main architectures dt(4) is in GENERIC.  I see your timing
> point for uvm structures.

In my opinion, having dt(4) enabled by default is another reason why
there should be no carte blanche for adding trace points. Each trace
point adds a tiny amount of bloat. Few users will use the tracing
facility.

Maybe high-rate trace points could be behind a build option...

> What do you think about this?  The check starts with a
> __predict_false(index > 0) in #define DT_INDEX_ENTER.  The r_traceidx
> is very likely in the same cache line as r_refs.  So the additional
> overhead of the branch should be small compared to the atomic
> operation.  The __predict_false(dt_tracing) might take longer as
> it is a global variable.

I have no hard data to back up my claim, but I think dt_tracing should
be checked first. This would make the situation easier for branch
prediction. It is likely that dt_tracing is already in cache.
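
To make the two orderings concrete, here is a minimal user-space sketch.
It is not the actual DT_INDEX_ENTER macro from the diff. The names
r_refs, r_traceidx, dt_tracing and __predict_false come from the text
above; probe_fire(), probe_hits, trace_index_first(),
trace_global_first() and the reduced struct layout are assumptions made
up for illustration, and the branch hint assumes GCC or Clang.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the kernel's branch hint; assumes GCC or Clang. */
#ifndef __predict_false
#define __predict_false(exp)	__builtin_expect(((exp) != 0), 0)
#endif

/* Global switch, modelled on the dt_tracing flag mentioned in the mail. */
static atomic_uint dt_tracing;

/* Hypothetical counter so the stub probe has an observable effect. */
static unsigned int probe_hits;

/* Reduced stand-in for struct refcnt; field names follow the mail. */
struct refcnt {
	atomic_uint	r_refs;
	int		r_traceidx;	/* 0 means no probe attached */
};

/* Hypothetical stub: a real probe would record object, count and delta. */
static void
probe_fire(int idx, struct refcnt *r, unsigned int refs, int delta)
{
	(void)idx; (void)r; (void)refs; (void)delta;
	probe_hits++;
}

/* Order described in the quoted text: per-object index first. */
static void
trace_index_first(struct refcnt *r, unsigned int refs, int delta)
{
	if (__predict_false(r->r_traceidx > 0) &&
	    __predict_false(atomic_load(&dt_tracing)))
		probe_fire(r->r_traceidx, r, refs, delta);
}

/* Order suggested in the reply: global flag first. */
static void
trace_global_first(struct refcnt *r, unsigned int refs, int delta)
{
	if (__predict_false(atomic_load(&dt_tracing)) &&
	    __predict_false(r->r_traceidx > 0))
		probe_fire(r->r_traceidx, r, refs, delta);
}

/* User-space stand-in for initializing a traced reference counter. */
static void
refcnt_init_trace(struct refcnt *r, int idx)
{
	atomic_init(&r->r_refs, 1);
	r->r_traceidx = idx;
}

/* Take a reference, then run the trace check on the new count. */
static void
refcnt_take(struct refcnt *r)
{
	unsigned int refs = atomic_fetch_add(&r->r_refs, 1) + 1;

	assert(refs != 0);	/* wrap-around check, in place of KASSERT */
	trace_global_first(r, refs, +1);
}
```

Both helpers fire the probe under exactly the same condition. They
differ only in which load and branch happen first while tracing is off,
which is the part the discussion above is about. The sketch does not
measure anything, so it says nothing about which order is faster.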