At first glance:

this would push all of the libpfm-type of code into the kernel, where it's 
harder to maintain.
it would be slower to read multiple counters, requiring one syscall per 
counter.
how does it handle sampling and interrupt on overflow?
if scheduling of the counters is dynamic, I think that means there's more 
overhead on every task switch
there's a pile of TODO there, code which is already written and working in 
perfmon (full)

I'll look at this some more tomorrow.

Regards,

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR 
503-578-3507 
[EMAIL PROTECTED]

"stephane eranian" <[EMAIL PROTECTED]> wrote on 12/04/2008 11:14:39 
PM:

> Hello everyone,
> 
> Here is another competing perfmon API proposal posted just today on LKML
> by Ingo Molnar, and Thomas Gleixner (x86 kernel maintainers).
> 
> They think their design is superior to that of the perfmon API.
> Looks like there is yet another battle to fight...
> 
> I intend to respond tomorrow but feel free to contribute to the LKML 
thread
> if you have a strong opinion on the matter.
> 
> 
> ---------- Forwarded message ----------
> From: Thomas Gleixner <[EMAIL PROTECTED]>
> Date: Fri, Dec 5, 2008 at 12:44 AM
> Subject: [patch 0/3] [Announcement] Performance Counters for Linux
> To: LKML <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED], Andrew Morton
> <[EMAIL PROTECTED]>, Ingo Molnar <[EMAIL PROTECTED]>, Stephane
> Eranian <[EMAIL PROTECTED]>, Eric Dumazet <[EMAIL PROTECTED]>,
> Robert Richter <[EMAIL PROTECTED]>, Arjan van de Veen
> <[EMAIL PROTECTED]>, Peter Anvin <[EMAIL PROTECTED]>, Peter Zijlstra
> <[EMAIL PROTECTED]>, Steven Rostedt <[EMAIL PROTECTED]>, David
> Miller <[EMAIL PROTECTED]>, Paul Mackerras <[EMAIL PROTECTED]>
> 
> 
> Performance counters are special hardware registers available on most 
modern
> CPUs. These register count the number of certain types of hw events: 
such
> as instructions executed, cachemisses suffered, or branches 
mis-predicted,
> without slowing down the kernel or applications. These registers can 
also
> trigger interrupts when a threshold number of events have passed - and 
can
> thus be used to profile the code that runs on that CPU.
> 
> We'd like to announce a brand new implementation of performance counter
> support for Linux. It is a very simple and extensible design that has 
the
> potential to implement the full range of features we would expect from 
such
> a subsystem.
> 
> The Linux Performance Counter subsystem (implemented via the patches
> posted in this announcement) provides an abstraction of performance 
counter
> hardware capabilities. It provides per task and per CPU counters, and it
> provides event capabilities on top of those.
> 
> The code is far from complete - but the basic approach is already there
> and stable.
> 
> The biggest missing detail is lowlevel support for non-Intel CPUs and
> older Intel CPUs - right now the code is implemented for Intel Core2
> (and later) Intel CPUs that have the PERFMON CPU feature. (see below
> a wider list of missing/upcoming features)
> 
> We are aware of the perfmon3 patchset that has been submitted to lkml
> recently. Our patchset tries to achieve a similar end result, with
> a fundamentally different (and we believe, superior :-) design:
> 
>  - The API is based on a single counter abstraction
> 
>  - Only one single new system call is needed: sys_perf_counter_open().
>   All performance-counter operations are implemented via standard
>   VFS APIs such as read() / fcntl() and poll().
> 
>  - User-space is not exposed to lowlevel details like contexts or
>   arrays of counters. Opening and reading a basic counter is as simple
>   as 2 lines of C code:
> 
>   void main(void)
>   {
>      u64 count;
> 
>      fd = perf_counter_open(3 /* PERF_COUNT_CACHE_MISSES */, 0, 0, 0, 
-1);
>      ret = read(fd, &count, sizeof(count));
>      if (ret == sizeof(count))
>              printf("Current count: %Ld cachemisses!", count);
>   }
> 
>  - Events, blocking/sleep are natural built-in properties of counters.
> 
>  - No interaction with ptrace: any task (with sufficient permissions) 
can
>   monitor other tasks, without having to stop that task.
> 
>  - Mapping of counters to hw counters is not static - counters are
>   scheduled dynamically on each CPU where a task runs.
> 
>  - There's a /sys based reservation facility that allows the allocation
>   of a certain number of hw counters for guaranteed sysadmin access.
> 
>  - Generalized enumeration for common hw event types. Raw event codes
>   can be passed to the API too - but the most common (and most useful)
>   event codes are generalized into a hardware-independent registry
>   of events:
> 
>    enum hw_event_types {
>           PERF_COUNT_CYCLES,
>           PERF_COUNT_INSTRUCTIONS,
>           PERF_COUNT_CACHE_REFERENCES,
>           PERF_COUNT_CACHE_MISSES,
>           PERF_COUNT_BRANCH_INSTRUCTIONS,
>           PERF_COUNT_BRANCH_MISSES,
>    };
> 
>  - Simplified lowlevel/arch support. The x86 code for Intel CPUs (with
>   the PERFMON CPU feature) is 340 lines of code that implements
>   7 straightforward lowlevel API calls:
> 
>    int hw_perf_counter_init(struct perf_counter *counter, u32 
hw_event_type);
>    void hw_perf_counter_enable(struct perf_counter *counter);
>    void hw_perf_counter_disable(struct perf_counter *counter);
>    void hw_perf_counter_read(struct perf_counter *counter);
>    void hw_perf_counter_enable_config(struct perf_counter *counter);
>    void hw_perf_counter_disable_config(struct perf_counter *counter);
>    void hw_perf_counter_setup(void);
> 
>   There's one kernel/perf_counter.c core file, and a single
>   arch/x86/kernel/cpu/perf_counter.c architecture support file.
> 
>   The impact on the kernel tree is relatively moderate:
> 
>       27 files changed, 1641 insertions(+), 7 deletions(-)
> 
> TODO:
> 
>  - Non-Intel CPU support. Help is welcome :-)
> 
>  - Round-robin scheduling of counters, when there's more task counters
>   than hw counters available.
> 
>  - Support for extended record types such as PEBS.
> 
>  - Support for NMI events in the x86 code (the core design is ready)
> 
>  - Make sure it works well with OProfile and the x86 NMI watchdog
> 
> Short documentation is available in Documentation/perf-counters.txt
> 
> Find below the source of a simple monitoring demo.
> 
> We'd like to seek the feedback of perfmon developers and architecture
> maintainers - what do you think about this approach?
> 
> Comments, reports, suggestions, flames and other types of feedback
> is more than welcome,
> 
>        Thomas, Ingo
> ---
> 
> /*
>  * Performance counters monitoring test case
>  */
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <sys/time.h>
> #include <unistd.h>
> #include <stdint.h>
> #include <stdlib.h>
> #include <string.h>
> #include <getopt.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <errno.h>
> 
> #define __user
> 
> #include "sys.h"
> 
> static int count = 10000;
> static int eventid;
> static int tid;
> static char *debuginfo;
> 
> static void display_help(void)
> {
>        printf("monitor\n");
>        printf("Usage:\n"
>               "monitor options threadid\n\n"
>               "-e EID   --eventid=EID  eventid\n"
>               "-c CNT   --count=CNT    event count on which IP is 
sampled\n"
>               "-d FILE  --debug=FILE   path to binary file with 
> debug info\n");
>        exit(0);
> }
> 
> static void process_options (int argc, char *argv[])
> {
>        int error = 0;
> 
>        for (;;) {
>                int option_index = 0;
>                /** Options for getopt */
>                static struct option long_options[] = {
>                        {"count", required_argument, NULL, 'c'},
>                        {"debug", required_argument, NULL, 'd'},
>                        {"eventid", required_argument, NULL, 'e'},
>                        {"help", no_argument, NULL, 'h'},
>                        {NULL, 0, NULL, 0}
>                };
>                int c = getopt_long(argc, argv, "c:d:e:",
>                                    long_options, &option_index);
>                if (c == -1)
>                        break;
>                switch (c) {
>                case 'c': count = atoi(optarg); break;
>                case 'd': debuginfo = strdup(optarg); break;
>                case 'e': eventid = atoi(optarg); break;
>                default: error = 1; break;
>                }
>        }
>        if (error || optind == argc)
>                display_help ();
> 
>        tid = atoi(argv[optind]);
> }
> 
> int main(int argc, char *argv[])
> {
>        char str[256];
>        uint64_t ip;
>        ssize_t res;
>        int fd;
> 
>        process_options(argc, argv);
> 
>        fd = perf_counter_open(eventid, count, 1, tid, -1);
>        if (fd < 0) {
>                perror("Create counter");
>                exit(-1);
>        }
> 
>        while (1) {
>                res = read(fd, (char *) &ip, sizeof(ip));
>                if (res != sizeof(ip)) {
>                        perror("Read counter");
>                        break;
>                }
> 
>                if (!debuginfo) {
>                        printf("IP: 0x%016llx\n", (unsigned long 
long)ip);
>                } else {
>                        sprintf(str, "addr2line -e %s 0x%llx\n", 
debuginfo,
>                                (unsigned long long)ip);
>                        system(str);
>                }
>        }
> 
>        close(fd);
>        exit(0);
> }
> 
> 
------------------------------------------------------------------------------
> SF.Net email is Sponsored by MIX09, March 18-20, 2009 in Las Vegas, 
Nevada.
> The future of the web can't happen without you.  Join us at MIX09 to 
help
> pave the way to the Next Web now. Learn more and register at
> 
http://ad.doubleclick.net/clk;208669438;13503038;i?http://2009.visitmix.com/
> _______________________________________________
> perfmon2-devel mailing list
> perfmon2-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel


------------------------------------------------------------------------------
SF.Net email is Sponsored by MIX09, March 18-20, 2009 in Las Vegas, Nevada.
The future of the web can't happen without you.  Join us at MIX09 to help
pave the way to the Next Web now. Learn more and register at
http://ad.doubleclick.net/clk;208669438;13503038;i?http://2009.visitmix.com/
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel

Reply via email to