> suppose we want to calculate per-CPU CPU utilization of
> each process.
> In the meantime, suppose we are also interested in having a
> reverse statistic: per-process CPU utilization for each CPU.


> By using the current organization of the attribute tree, we may
> need to duplicate the data and store it twice in the history
> tree, a separate value for each attribute pair
> (e.g. cpu1 --> process1 and process1 --> cpu1).
> However, it may be useful to somehow relax the definition of 
> the attribute tree and let different applications define their 
> own organizations of the attributes. 
I agree that it is wasteful to keep everything symmetrical and thus 
compute/store some values more than once. This leaves two choices to make: 

- select an arbitrary canonical order for such values, for instance process / 
cpu / system call, and thus avoid storing the variants (cpu / process / system 
call, cpu / system call / process...). The best order to select depends on the 
typical branching pattern, as discussed below. 

- decide on the desired granularity level. For example, we could store for 
each process the execution time spent in "system calls" as a whole, or 
separately in each specific system call (read, write...), or even further 
broken down by CPU for each specific system call. Moreover, we may use 
different granularity levels for different resources, only looking at all 
system calls together for each process but accounting for each specific system 
call globally for the system. This is a compromise between granularity and 
memory (the sketch right after this list illustrates both points). 
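
To make those two points concrete, here is a small, purely illustrative Java
sketch; none of these class, method or attribute names exist in TMF today, they
only show the idea of emitting attribute paths in one canonical order (pid /
syscall / cpu), truncated to a chosen granularity:

    // Purely illustrative; names are hypothetical, not existing TMF API.
    enum Granularity { PER_PROCESS, PER_SYSCALL, PER_SYSCALL_AND_CPU }

    final class StatsPaths {
        private StatsPaths() {}

        // Always the same canonical order (pid / syscall / cpu), cut off at
        // the requested granularity, so the mirrored cpu / ... / pid branches
        // are never created and nothing is stored twice.
        static String[] syscallTimePath(Granularity g, int pid,
                                        String syscall, int cpu) {
            switch (g) {
            case PER_PROCESS:
                return new String[] { "Processes", Integer.toString(pid),
                                      "Syscalls" };
            case PER_SYSCALL:
                return new String[] { "Processes", Integer.toString(pid),
                                      "Syscalls", syscall };
            default: // PER_SYSCALL_AND_CPU
                return new String[] { "Processes", Integer.toString(pid),
                                      "Syscalls", syscall,
                                      "CPUs", Integer.toString(cpu) };
            }
        }
    }

The reverse views (per-CPU totals, for instance) would then be obtained by
summing over these subtrees at display time instead of being stored a second
time.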

For those interested, let's look in more detail at the cost and optimization 
of these computations. 

Assuming that we create an attribute leaf to store a statistic only when 
needed, the number of leaves will be the same irrespective of the canonical 
order. The number of intermediate nodes, which is smaller than the number of 
leaves, may be minimized by putting the category with the fewest members 
first. A typical case is to have 8 CPUs, hundreds of processes (let's say 
1000), each running on mostly 1 or 2 CPUs, and a few hundred system calls, but 
each process calling only about 10 different ones. 

The number of leaves will be about 1000 * 1.5 * 10 = 15000. If we go pid / 
syscall / cpu, there will be 1 + 1000 + 1000 * 10 = 11001 intermediate nodes 
and 1000 + 1000 * 10 + 1000 * 10 * 1.5 = 26000 edges. If we go the other way 
around, cpu / syscall / pid, and assume each CPU sees roughly 50 distinct 
system calls, there will be approximately 1 + 8 + 8 * 50 = 409 intermediate 
nodes and 8 + 8 * 50 + 1000 * 10 * 1.5 = 15408 edges. There is thus a slight 
advantage to selecting the smallest branching factors first. 
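
For the record, the same back-of-the-envelope arithmetic as a tiny,
self-contained Java snippet; the branching factors (1.5 CPUs and 10 distinct
syscalls per process, about 50 distinct syscalls per CPU) are the assumed
values from the example above, not measurements:

    public class TreeSizeEstimate {
        public static void main(String[] args) {
            double pids = 1000, cpusPerPid = 1.5, syscallsPerPid = 10;
            double cpus = 8, syscallsPerCpu = 50;

            // Leaves are the same regardless of the canonical order chosen.
            double leaves = pids * syscallsPerPid * cpusPerPid;            // 15000

            // pid / syscall / cpu ordering
            double nodesPidFirst = 1 + pids + pids * syscallsPerPid;       // 11001
            double edgesPidFirst = pids + pids * syscallsPerPid + leaves;  // 26000

            // cpu / syscall / pid ordering
            double nodesCpuFirst = 1 + cpus + cpus * syscallsPerCpu;       // 409
            double edgesCpuFirst = cpus + cpus * syscallsPerCpu + leaves;  // 15408

            System.out.printf("leaves=%.0f nodes=%.0f/%.0f edges=%.0f/%.0f%n",
                    leaves, nodesPidFirst, nodesCpuFirst,
                    edgesPidFirst, edgesCpuFirst);
        }
    }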

In terms of computation speed and state history storage, the situation is quite 
interesting. Each time a system call or scheduling switch event comes in, some 
value needs to be summed. Whether it is always summed into the same value (e.g. 
pid / user_space and pid / system) or into different branches (cpu / syscall / 
pid), the computation time is pretty much the same and should not be a concern. 
The state history currently stores an interval for each update to statistics 
(basically for each event, since event counts are kept). Thus, updating values 
and creating a state history entry for each scheduling switch or syscall entry 
/ exit is already much less than what is required for event counts. 
Furthermore, there will be one entry in the history for each change, whether 
the changes always go to the same entry (pid / system) or to detailed entries 
(cpu / syscall / pid). Of course, the partial history optimization is always 
applicable. 
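
As a rough sketch of what I mean by one history entry per change, here is the
shape of a syscall-exit handler against a minimal, hypothetical state-system
interface (the two methods below only stand in for the real TMF builder API,
whose actual names may differ):

    // Hypothetical, minimal stand-in for the state system; only the shape matters.
    interface StatsBackend {
        int quarkFor(String... path);               // get or create the attribute
        void increment(long timestamp, int quark);  // +1, recorded as one interval
    }

    final class SyscallStatsHandler {
        private final StatsBackend ss;

        SyscallStatsHandler(StatsBackend ss) {
            this.ss = ss;
        }

        // One syscall-exit event triggers exactly one increment, hence one
        // entry in the history, whether the path is coarse (pid / system) or
        // detailed (pid / syscall / cpu).
        void onSyscallExit(long ts, int pid, String syscall, int cpu) {
            int quark = ss.quarkFor("Processes", Integer.toString(pid),
                                    "Syscalls", syscall,
                                    "CPUs", Integer.toString(cpu));
            ss.increment(ts, quark);
        }
    }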

What happens when we want to interactively navigate the statistics? We may 
want to show a 100GB window within a 1TB trace. We can seek to the window 
beginning and end, t0 and t1, and gather the values for cpu / syscall / pid at 
each. We could also do the same thing 1000 times to show a histogram with one 
value computed per pixel in a 1000-pixel-wide window. Once we have the values 
for a specific time, we can dynamically make the needed sums (all syscalls / 
cpu for each process, all processes / syscall for a CPU...) relatively quickly. 
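
As a sketch of that aggregation step, assuming the two full queries at t0 and
t1 have already been flattened into maps keyed by the pid/syscall/cpu path
(which is not how the real query API returns results, just a convenient
stand-in for the example):

    import java.util.HashMap;
    import java.util.Map;

    final class WindowStats {

        // Per-window counts: cumulative value at t1 minus cumulative value at
        // t0, for every pid/syscall/cpu key present at t1.
        static Map<String, Long> delta(Map<String, Long> atT0,
                                       Map<String, Long> atT1) {
            Map<String, Long> d = new HashMap<>();
            for (Map.Entry<String, Long> e : atT1.entrySet()) {
                long before = atT0.getOrDefault(e.getKey(), 0L);
                d.put(e.getKey(), e.getValue() - before);
            }
            return d;
        }

        // Roll the fine-grained deltas up to one total per pid, at display time.
        static Map<String, Long> perProcess(Map<String, Long> delta) {
            Map<String, Long> totals = new HashMap<>();
            for (Map.Entry<String, Long> e : delta.entrySet()) {
                String pid = e.getKey().split("/")[0];
                totals.merge(pid, e.getValue(), Long::sum);
            }
            return totals;
        }
    }

The other roll-ups (per CPU, per syscall...) are the same loop keyed on a
different component of the path.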

My conclusion is that storing statistics with a fine granularity has little 
impact on the computation time and state history size. The largest impact is 
on the size of the "current state". Of course, we could always have an option 
to specify the desired level of granularity in the state computation. We had 
the cpu / syscall / pid granularity in LTTV and it was very useful (especially 
pid / syscall, but not so much pid / cpu). 