Hello

I want to mention a general case, in which we want to support "group by"
queries for different attributes and resources.

Regarding the problem mentioned by François, suppose we want to calculate the
*per-CPU CPU utilization of each process* (select the CPU usage on each CPU
separately, group by process).

> Process #1 : 25% total
>     -> CPU0 : 20%
>     -> CPU1 : 5%
>
> Process #2 : 10% total
>     -> CPU0 : 10%
>     -> CPU1 : 0%

At the same time, suppose we are also interested in the reverse
statistic: *per-process CPU utilization for each CPU* (select the CPU
usage of each process separately, group by CPU).

> CPU0 : 30% total
>     -> Process #1 : 20%
>     -> Process #2 : 10%
>
> CPU1 : 5% total
>     -> Process #1 : 5%
>     -> Process #2 : 0%
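
Both tables are two aggregations of the same four (process, CPU) values. A minimal Python sketch, using the numbers above; the `usage` dictionary and the `group_by` helper are illustrative names only, not part of the state system API:

```python
from collections import defaultdict

# One value per (process, CPU) pair, taken from the tables above (percent).
usage = {
    ("Process #1", "CPU0"): 20,
    ("Process #1", "CPU1"): 5,
    ("Process #2", "CPU0"): 10,
    ("Process #2", "CPU1"): 0,
}

def group_by(samples, key_index):
    """Sum the values, grouping on element `key_index` of each key tuple."""
    totals = defaultdict(int)
    for key, value in samples.items():
        totals[key[key_index]] += value
    return dict(totals)

per_process = group_by(usage, 0)  # group by process
per_cpu = group_by(usage, 1)      # group by CPU
```

Here `per_process` gives the totals of the first table and `per_cpu` those of the second, without any value being stored twice.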


As another example, suppose we want to calculate the IO throughput of
processes and files, grouped by each one separately:

Process #1 : 25% total
    -> File0 : 10%  (quark: 1)
    -> File1 : 5%   (quark: 2)

Process #2 : 15% total
    -> File0 : 5%    (quark: 5)
    -> File1 : 10%   (quark: 6)

and

File0 : 12% total
    -> Process #1 : 8%   (quark: 10)
    -> Process #2 : 4%   (quark: 11)

File1 : 20% total
    -> Process #1 : 10%  (quark: 15)
    -> Process #2 : 10%  (quark: 16)


With the current organization of the attribute tree, we may need to
duplicate the data and store it twice in the history tree, with a separate
value for each attribute pair (e.g. cpu1 --> process1 and process1 --> cpu1
have different quark values, so their equal statistics values must be stored
separately in different places of the history tree).
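
To make the duplication concrete, here is a rough sketch of a path-based attribute tree. `PathAttributeTree`, `get_quark` and the `history` dictionary are hypothetical stand-ins written for this example; they are not the real state system classes, and the sketch only assumes that each distinct attribute path gets its own quark:

```python
class PathAttributeTree:
    """Toy stand-in for a path-based attribute tree (not the real API)."""

    def __init__(self):
        self._quarks = {}  # full path tuple -> quark

    def get_quark(self, *path):
        """Return the quark for `path`, allocating a new one on first use."""
        if path not in self._quarks:
            self._quarks[path] = len(self._quarks)
        return self._quarks[path]

tree = PathAttributeTree()
history = {}  # quark -> stored value, standing in for the history tree

# The same 20% figure has to be written under both orderings of the path.
q_by_process = tree.get_quark("Processes", "Process #1", "CPU0")
q_by_cpu = tree.get_quark("CPUs", "CPU0", "Process #1")
history[q_by_process] = 20
history[q_by_cpu] = 20
```

The two paths describe the same (process, CPU) pair, yet they resolve to two different quarks and therefore two stored copies of the value.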

*It may therefore be useful to relax the definition of
the attribute tree and let different applications define their own
organization of the attributes.*


For instance, I suggest a new organization for managing the statistics:


1- We first create a hierarchy for each resource type separately.

Processes
    -> Process #1
    -> Process #2


CPUs
    -> CPU #1
    -> CPU #2


Files
    -> File #1
    -> File #2


2- Then, define the metric nodes between different resources and
assign each of them its own quark value. For example, we define a "CPU usage"
metric node between each process and each CPU:

    -> Process #2

              ---> CPU usage    (quark: 1)

    -> CPU #1

or an "IO" metric node between each file and each process:

    -> Process #1

              ---> IO           (quark: 2)

    -> File #3


This organization avoids duplication in the history tree: for each tuple
(e.g. process and CPU), it stores only one value in the history tree.
Furthermore, it supports different "group by" queries, aggregation
functions, etc.
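
A minimal sketch of this organization in Python, under my own assumptions: `MetricStore`, its methods, and the sample values are hypothetical and only illustrate the idea of one quark per metric node, keyed by the pair of resources it links (in either order):

```python
from collections import defaultdict

class MetricStore:
    """Hypothetical store: one quark per metric node linking two resources."""

    def __init__(self):
        self._quarks = {}  # (metric, frozenset of two resources) -> quark
        self._values = {}  # quark -> value, standing in for the history tree

    def quark(self, metric, res_a, res_b):
        """Return the quark of the metric node; resource order is irrelevant."""
        link = (metric, frozenset((res_a, res_b)))
        if link not in self._quarks:
            self._quarks[link] = len(self._quarks)
        return self._quarks[link]

    def set_value(self, metric, res_a, res_b, value):
        self._values[self.quark(metric, res_a, res_b)] = value

    def group_by(self, metric, resource_type):
        """Sum a metric over all links, grouped by resources of one type."""
        totals = defaultdict(int)
        for (name, link), quark in self._quarks.items():
            if name != metric:
                continue
            for res_type, res_name in link:
                if res_type == resource_type:
                    totals[res_name] += self._values.get(quark, 0)
        return dict(totals)

# Resources are (hierarchy, name) pairs; the values are made up for the example.
p1, p2 = ("Processes", "Process #1"), ("Processes", "Process #2")
f1, f2 = ("Files", "File #1"), ("Files", "File #2")

store = MetricStore()
store.set_value("IO", p1, f1, 10)
store.set_value("IO", p1, f2, 5)
store.set_value("IO", p2, f1, 5)
store.set_value("IO", p2, f2, 10)

io_per_process = store.group_by("IO", "Processes")
io_per_file = store.group_by("IO", "Files")
```

Each (process, file) pair is stored exactly once, and both "group by" views are computed from those same four values.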


Thanks,
Naser
_______________________________________________
linuxtools-dev mailing list
linuxtools-dev@eclipse.org
https://dev.eclipse.org/mailman/listinfo/linuxtools-dev
