Martin -

Phil is correct that the L3 events on Shanghai are shared across all cores in a chip. I don't know whether perfmon2 specifically traps for this; I don't think it does. However, the AMD documents suggest that, by convention, only one core per chip should access these events. PAPI doesn't enforce this convention, so the counts are unpredictable when more than one core programs them. I'm surprised that you're seeing error messages here, since we've monitored these events on Barcelona chips, though not specifically on Shanghai. I guess an error is somewhat better than a silent but wrong result :(

- dan
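For illustration, the one-core-per-chip convention could be applied on the application side with something like the sketch below. It is only a sketch under stated assumptions: L3MonitorElection and chip_id are hypothetical names, not part of PAPI or perfmon2, and how a thread learns which chip it runs on is left to the caller.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>

// Hypothetical helper (not part of PAPI or perfmon2): grants the right to
// program the shared L3 events to at most one thread per chip at a time.
class L3MonitorElection {
public:
    explicit L3MonitorElection(std::size_t num_chips)
        : claimed_(new std::atomic<bool>[num_chips]), num_chips_(num_chips) {
        // Start with every chip unclaimed.
        for (std::size_t i = 0; i < num_chips_; ++i) {
            claimed_[i].store(false);
        }
    }

    // Returns true for exactly one caller per chip until release() is called.
    // chip_id is assumed to come from the caller's own topology lookup.
    // Only the winning thread would then call
    //   PAPI_add_event(event_set, PAPI_L3_TCM);
    // before PAPI_start().
    bool try_claim(std::size_t chip_id) {
        if (chip_id >= num_chips_) {
            return false;  // unknown chip: never monitor L3 from here
        }
        bool expected = false;
        return claimed_[chip_id].compare_exchange_strong(expected, true);
    }

    // Gives the slot back once the owning thread has stopped its event set.
    void release(std::size_t chip_id) {
        if (chip_id < num_chips_) {
            claimed_[chip_id].store(false);
        }
    }

private:
    std::unique_ptr<std::atomic<bool>[]> claimed_;
    std::size_t num_chips_;
};
```

A worker would call try_claim(chip_id) before building its event set and add PAPI_L3_TCM only if that returns true; the other threads on the same chip would leave the L3 event out of their event sets.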
> -----Original Message-----
> From: [email protected] [mailto:ptools-perfapi-
> [email protected]] On Behalf Of Philip Mucci
> Sent: Friday, July 03, 2009 9:21 AM
> To: Martin Vogt
> Cc: perfmon2-devel; papi list
> Subject: Re: [Ptools-perfapi] PAPI_L3_TCM error with threads
>
> Hi Martin,
>
> I believe the issue is that the L3 counter is shared by all the
> cores. perfmon2 unfortunately does not offer per-thread access
> to these counters, as far as I can tell.
>
> Stefane, can you confirm?
>
> Phil
>
> On Jul 2, 2009, at 6:18 AM, Martin Vogt wrote:
>
> > Hello list,
> >
> > I encounter an error code in my PAPI program:
> >
> > kernel 2.6.28 with perfmon2 patches on an 8 NUMA node Shanghai
> > machine.
> >
> > The error output is:
> >
> >> PAPI_ESYS
> >> Invalid argument
> >
> > The relevant code section for this is:
> >
> >> //start the thread specific event set
> >> if ( (retval = PAPI_start (thr_locals[my_id].event_set)) !=
> >>      PAPI_OK) {
> >>   PAPI_perror( retval, NULL, 0 );
> >>   if(retval == PAPI_ESYS)
> >>     perror(0);
> >>
> >>   std::runtime_error e("PAPI_start error!" );
> >>   throw e;
> >> }
> >
> > This only happens if I monitor PAPI_L3_TCM events from different
> > threads.
> >
> > This means:
> >
> > - one thread monitors PAPI_L3_TCM correctly
> > - two threads on different (or the same) NUMA nodes throw this error.
> >
> > Other event counters seem to work correctly (without errors).
> >
> > If I strace my demo program I get:
> >
> >> 24813 syscall_296(0x6, 0x7ff160ed9110, 0x2, 0x2, 0x7ff150d8f830,
> >> [.............]
> >> 0x7ff161d2e6c0, 0x7ff161d2e6c0, 0x7ff161d2e6c0) = 0
> >
> >> 24813 syscall_297(0x6, 0x7ff160edf110, 0x2, 0x7ff150d8f830,
> >> 0x7ff160e93028,
> >> [..............]
> >> 0x7ff161d2e6c0, 0x7ff161d2e6c0, 0x7ff161d2e6c0) = 0
> >
> >> 24813 syscall_299(0x6, 0x50fb38, 0xffffffffffffffff, 0x7ff160e93028,
> >> [...............]
> >> 0x7ff161d2e6c0, 0x7ff161d2e6c0, 0x7ff161d2e6c0, 0x7ff161d2e6c0)
> >> = -1 (errno 16)
> >
> >> 24813 syscall_301(0x6, 0xfffffffd, 0x50, 0x2, 0x7ff161447a80,
> >> [................]
> >> 161d2e6c0, 0x7ff161d2e6c0, 0x7ff161d2e6c0, 0x7ff161d2e6c0,
> >> 0x7ff161d2e6c0,
> >> 0x7ff161d2e6c0, 0x7ff161d2e6c0) = -1 (errno 22)
> >> 24813 gettid() = 24813
> >> 24813 write(2, "PAPI_ESYS\n", 10) = 10
> >> 24813 write(2, "Invalid argument\n", 17) = 17
> >
> > Should PAPI_L3_TCM events work from different threads?
> > At least the PAPI documentation suggests this.
> >
> >> According to the documentation:
> >> http://icl.cs.utk.edu/projects/papi/files/documentation/PAPI_USER_GUIDE_306.htm
> >> section "USING PAPI WITH PARALLEL PROGRAMS"
> >
> > Is this a bug?
> > At least PAPI should not throw an error.
> >
> > regards,
> >
> > Martin
> >
> > _______________________________________________
> > Ptools-perfapi mailing list
> > [email protected]
> > http://lists.cs.utk.edu/listinfo/ptools-perfapi

------------------------------------------------------------------------------
_______________________________________________
perfmon2-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
