See comments inline.

> -----Original Message-----
> From: Vince Weaver [mailto:vincent.wea...@maine.edu]
> Sent: Friday, August 29, 2014 8:58 PM
> To: Gary Mohr
> Cc: Stephane Eranian; Michel Brown; perfmon2-devel
> Subject: Re: [perfmon2] Error reporting when using invalid combination of
> umasks.
>
> On Thu, 28 Aug 2014, Gary Mohr wrote:
>
> > Given this sequence of events, I understand why neither PAPI or libpfm4
> > detected that these events cannot be used at the same time.  But it seems to
> > me like the kernel should have been able to detect this when the second
> > event was opened and return an error.  This would have been a much better
> > time to report the error than when trying to read the results of an event
> > that the kernel said was successfully opened.
>
> This is a long-standing issue with the linux-kernel interface, though
> I have to admit I've only seen it when dealing with general purpose
> counters when the watchdog is enabled (i.e., when the kernel has stolen a
> performance counter for its own use).
> I've complained about this off and on to the kernel developers for years
> without much luck, I think the consensus was it would be too much work
> to report a proper error.
>
> I do find it odd that this error is cropping up in the uncore and RAPL
> pmus though, it might be a different error path that can be fixed more
> easily.
>
We have only seen this when using two uncore events that have counter
constraint conflicts (we have not seen it with RAPL).  Either event works
fine by itself, but when we use them together the kernel opens both of
them successfully and then the attempt to read the results of the second
event fails.

> > Is there a way that PAPI or libpfm4 could have detected this invalid
> > combination of events before doing the kernel opens ?
>
> PAPI currently works around this problem in the perf_event component by,
> at event add time, doing a quick perf_event_open()/read()/close() sequence
> to see if the event is in fact valid.  This adds overhead, but it's the
> only reliable way of telling if an event is actually valid.
>
> I probably dropped this extra check in the uncore component thinking it
> was unnecessary, it could be added back in.

I presume that the code you referred to above is the function
check_scheduability(), which is in the core component but not in the
uncore component.  I will try putting this code into the uncore component
to see if it handles this condition any better.

It will need some adjustment because it currently only does the
start/stop/read sequence on the group leader.  That makes sense in the
core component, but the uncore component does not use grouping (so that we
can count events from multiple PMUs at the same time).  I will need to
loop through all of the events, doing a start/stop/read on each.

Thanks for your feedback.
Gary

>
> Vince
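
For reference, the per-event check described above could look roughly like
the sketch below.  This is not the PAPI check_scheduability() code; the names
sys_perf_event_open(), probe_fd(), probe_events() and MAX_PROBE_EVENTS are
made up for illustration.  It assumes each attr was set up by the caller with
disabled = 1 and read_format = 0 (so a read returns a single 64-bit count),
and that the events are opened system-wide on one cpu with no group leader.
Whether a start/stop/read on every event actually exposes the uncore
constraint conflict is the thing that still has to be tried.

```c
#include <errno.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_PROBE_EVENTS 16   /* arbitrary limit for this sketch */

/* glibc provides no wrapper for perf_event_open(), so go through syscall(2). */
static int sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
                               int cpu, int group_fd, unsigned long flags)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical helper: start, stop and read one already-open event.
 * Returns 0 if a full 64-bit count came back, a negative errno otherwise. */
int probe_fd(int fd)
{
    uint64_t value;
    ssize_t n;

    if (ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0)
        return -errno;
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) < 0)
        return -errno;

    n = read(fd, &value, sizeof(value));
    if (n < 0)
        return -errno;
    if (n != (ssize_t)sizeof(value))
        return -EIO;              /* short read, including a 0-byte read */
    return 0;
}

/* Hypothetical helper: open all n events ungrouped on one cpu, then do a
 * start/stop/read on each of them.  Returns 0 if every event could be read,
 * otherwise a negative errno with *bad_index set to the failing event. */
int probe_events(struct perf_event_attr *attrs, int n, int cpu, int *bad_index)
{
    int fds[MAX_PROBE_EVENTS];
    int opened = 0, rc = 0, i;

    *bad_index = -1;
    if (n < 0 || n > MAX_PROBE_EVENTS)
        return -EINVAL;

    /* Open everything first so all events exist before any is read. */
    for (i = 0; i < n; i++) {
        fds[i] = sys_perf_event_open(&attrs[i], -1, cpu, -1, 0);
        if (fds[i] < 0) {
            rc = -errno;
            *bad_index = i;
            break;
        }
        opened++;
    }

    /* No group leader here, so each event is exercised individually. */
    for (i = 0; rc == 0 && i < n; i++) {
        rc = probe_fd(fds[i]);
        if (rc != 0)
            *bad_index = i;
    }

    for (i = 0; i < opened; i++)
        close(fds[i]);
    return rc;
}
```

A caller would fill in one perf_event_attr per uncore event, call
probe_events() at event add time, and report an error for the event at
*bad_index if the return value is non-zero.  The extra opens and closes add
overhead, as noted above for the core component.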