Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Mon, Nov 13, 2017 at 12:02:56AM -0800, Milind Chabbi wrote: > SNIP > > On Sun, Nov 12, 2017 at 11:46 PM, Jiri Olsa wrote: > > > but you closed fd4 before openning fd5..? > > Yes, that is correct. I closed fd4. The reason is by closing fd4, we > are having a total of 3 hardware breakpoints active, but we are making > the software counting in the kernel think that four TYPE_DATA > breakpoints active. The counting should have disallowed us from > creating fd5 as per the following logic in the kernel: > > static int __reserve_bp_slot(struct perf_event *bp) > > { > > > /* Flexible counters need to keep at least one slot */ > if (slots.pinned + (!!slots.flexible) > nr_slots[type]) > return -ENOSPC; > > } So the issue is with the cpu pinned breakpoints, because we keep their slot counts for both breakpoint types. For task breakpoints we dont keep the slot count, we just count it every time we need it. The issue will not expose on x86, because both breakpoint types share same slot count (CONFIG_HAVE_MIXED_BREAKPOINTS_REGS). I'm seeing the issue on arm machine (with 4 watchpoints and 6 breakpoints) creating 4 watchpoints: 2028 perf_event_open(0xdb232bd0, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = 3 2028 perf_event_open(0xdb232c40, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = 4 2028 perf_event_open(0xdb232cb0, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = 5 2028 perf_event_open(0xdb232d20, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = 6 changing last one to breakpoint: 2028 ioctl(6, _IOC(_IOC_WRITE, 0x24, 0x0a, 0x08), 0xdb232e08) = 0 and trying to create one more watchpoint: 2028 perf_event_open(0xdb232d90, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOSPC (No space left on device) after this, we have slot counts: get_bp_info(0, TYPE_DATA)->cpu_pinned = 4 get_bp_info(0, TYPE_INST)->cpu_pinned = 0 now when we close all of it: close(3) close(4) close(5) close(6) we get the slot counts messed up, because fd 6 has different type now: get_bp_info(0, TYPE_DATA)->cpu_pinned = 1 get_bp_info(0, TYPE_INST)->cpu_pinned = -1 I put together some fix and put it in here: https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git perf/bp if you could please run your tests on it, and if it's all good I'll post it thanks, jirka
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
SNIP On Sun, Nov 12, 2017 at 11:46 PM, Jiri Olsa wrote: > but you closed fd4 before openning fd5..? Yes, that is correct. I closed fd4. The reason is by closing fd4, we are having a total of 3 hardware breakpoints active, but we are making the software counting in the kernel think that four TYPE_DATA breakpoints active. The counting should have disallowed us from creating fd5 as per the following logic in the kernel: static int __reserve_bp_slot(struct perf_event *bp) { /* Flexible counters need to keep at least one slot */ if (slots.pinned + (!!slots.flexible) > nr_slots[type]) return -ENOSPC; }
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Sun, Nov 12, 2017 at 11:09:23AM -0800, Milind Chabbi wrote: > , > > On Thu, Nov 9, 2017 at 10:59 AM, Milind Chabbi > wrote: > > SNIP > > > > On Thu, Nov 9, 2017 at 5:12 AM, Jiri Olsa wrote: > >> > >> > >> how about something like below (untested) > >> > >> looks like there's no irq caller for modify_user_hw_breakpoint, > >> so we should be fine with locking nr_bp_mutex > >> > >> jirka > >> > >> > >> --- > >> diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c > >> index 3f8cb1e14588..f062b68399ea 100644 > >> --- a/kernel/events/hw_breakpoint.c > >> +++ b/kernel/events/hw_breakpoint.c > >> @@ -448,6 +448,8 @@ int modify_user_hw_breakpoint(struct perf_event *bp, > >> struct perf_event_attr *att > >> else > >> perf_event_disable(bp); > >> > >> + release_bp_slot(bp); > >> + > >> bp->attr.bp_addr = attr->bp_addr; > >> bp->attr.bp_type = attr->bp_type; > >> bp->attr.bp_len = attr->bp_len; > >> @@ -455,9 +457,9 @@ int modify_user_hw_breakpoint(struct perf_event *bp, > >> struct perf_event_attr *att > >> if (attr->disabled) > >> goto end; > >> > >> - err = validate_hw_breakpoint(bp); > >> + err = reserve_bp_slot(bp); > >> if (!err) > >> - perf_event_enable(bp); > >> + err = validate_hw_breakpoint(bp); > >> > >> if (err) { > >> bp->attr.bp_addr = old_addr; > >> @@ -469,6 +471,7 @@ int modify_user_hw_breakpoint(struct perf_event *bp, > >> struct perf_event_attr *att > >> return err; > >> } > >> > >> + perf_event_enable(bp); > >> end: > >> bp->attr.disabled = attr->disabled; > >> > > > > We can do this accounting only if bp->attr.bp_type != attr->bp_type. > > > > -Milind > > > Jirka, > > Neither of us seems to fully understand the convoluted logic used in > breakpoint counting. yea, I was hoping some of the guys would take over ;-) the problem I have with the patch above is that we could fail to reserve the slot at the end, which is not what the caller might expect > > I tested the following sequence on an x86 machine, which has four > debug registers (without your suggested patch for counting > correction). > > fd1 = perf_event_open(...); //BP_TYPE= HW_BREAKPOINT_RW @ ADDR1 > fd2 = perf_event_open(...); //BP_TYPE= HW_BREAKPOINT_RW @ ADDR2 > fd3 = perf_event_open(...); //BP_TYPE= HW_BREAKPOINT_RW @ ADDR3 > fd4 = perf_event_open(...); //BP_TYPE= HW_BREAKPOINT_RW @ ADDR4 > ioctl(fd4, MODIFY, ...); // change fd4 to BP_TYPE= HW_BREAKPOINT_X @ ADDR5 > close(fd4); > fd5 = perf_event_open(); //BP_TYPE=RW @ ADDR6 > > We expected fd5 to fail because four BP_TYPE=TYPE_DATA are in use as > per the accounting, but in reality, fd5 was successfully opened. but you closed fd4 before openning fd5..? > > Is the accounting accidentally working on x86? > Is there another architecture where TYPE_DATA and TYPE_INS are counted > differently? [jolsa@krava linux-perf]$ grep -r HAVE_MIXED_BREAKPOINTS_REGS arch/* arch/Kconfig:config HAVE_MIXED_BREAKPOINTS_REGS arch/sh/Kconfig:select HAVE_MIXED_BREAKPOINTS_REGS arch/x86/Kconfig: select HAVE_MIXED_BREAKPOINTS_REGS I'll try to check on it this week jirka
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
, On Thu, Nov 9, 2017 at 10:59 AM, Milind Chabbi wrote: > SNIP > > On Thu, Nov 9, 2017 at 5:12 AM, Jiri Olsa wrote: >> >> >> how about something like below (untested) >> >> looks like there's no irq caller for modify_user_hw_breakpoint, >> so we should be fine with locking nr_bp_mutex >> >> jirka >> >> >> --- >> diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c >> index 3f8cb1e14588..f062b68399ea 100644 >> --- a/kernel/events/hw_breakpoint.c >> +++ b/kernel/events/hw_breakpoint.c >> @@ -448,6 +448,8 @@ int modify_user_hw_breakpoint(struct perf_event *bp, >> struct perf_event_attr *att >> else >> perf_event_disable(bp); >> >> + release_bp_slot(bp); >> + >> bp->attr.bp_addr = attr->bp_addr; >> bp->attr.bp_type = attr->bp_type; >> bp->attr.bp_len = attr->bp_len; >> @@ -455,9 +457,9 @@ int modify_user_hw_breakpoint(struct perf_event *bp, >> struct perf_event_attr *att >> if (attr->disabled) >> goto end; >> >> - err = validate_hw_breakpoint(bp); >> + err = reserve_bp_slot(bp); >> if (!err) >> - perf_event_enable(bp); >> + err = validate_hw_breakpoint(bp); >> >> if (err) { >> bp->attr.bp_addr = old_addr; >> @@ -469,6 +471,7 @@ int modify_user_hw_breakpoint(struct perf_event *bp, >> struct perf_event_attr *att >> return err; >> } >> >> + perf_event_enable(bp); >> end: >> bp->attr.disabled = attr->disabled; >> > > We can do this accounting only if bp->attr.bp_type != attr->bp_type. > > -Milind Jirka, Neither of us seems to fully understand the convoluted logic used in breakpoint counting. I tested the following sequence on an x86 machine, which has four debug registers (without your suggested patch for counting correction). fd1 = perf_event_open(...); //BP_TYPE= HW_BREAKPOINT_RW @ ADDR1 fd2 = perf_event_open(...); //BP_TYPE= HW_BREAKPOINT_RW @ ADDR2 fd3 = perf_event_open(...); //BP_TYPE= HW_BREAKPOINT_RW @ ADDR3 fd4 = perf_event_open(...); //BP_TYPE= HW_BREAKPOINT_RW @ ADDR4 ioctl(fd4, MODIFY, ...); // change fd4 to BP_TYPE= HW_BREAKPOINT_X @ ADDR5 close(fd4); fd5 = perf_event_open(); //BP_TYPE=RW @ ADDR6 We expected fd5 to fail because four BP_TYPE=TYPE_DATA are in use as per the accounting, but in reality, fd5 was successfully opened. Is the accounting accidentally working on x86? Is there another architecture where TYPE_DATA and TYPE_INS are counted differently? -Milind
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
SNIP On Thu, Nov 9, 2017 at 5:12 AM, Jiri Olsa wrote: > > > how about something like below (untested) > > looks like there's no irq caller for modify_user_hw_breakpoint, > so we should be fine with locking nr_bp_mutex > > jirka > > > --- > diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c > index 3f8cb1e14588..f062b68399ea 100644 > --- a/kernel/events/hw_breakpoint.c > +++ b/kernel/events/hw_breakpoint.c > @@ -448,6 +448,8 @@ int modify_user_hw_breakpoint(struct perf_event *bp, > struct perf_event_attr *att > else > perf_event_disable(bp); > > + release_bp_slot(bp); > + > bp->attr.bp_addr = attr->bp_addr; > bp->attr.bp_type = attr->bp_type; > bp->attr.bp_len = attr->bp_len; > @@ -455,9 +457,9 @@ int modify_user_hw_breakpoint(struct perf_event *bp, > struct perf_event_attr *att > if (attr->disabled) > goto end; > > - err = validate_hw_breakpoint(bp); > + err = reserve_bp_slot(bp); > if (!err) > - perf_event_enable(bp); > + err = validate_hw_breakpoint(bp); > > if (err) { > bp->attr.bp_addr = old_addr; > @@ -469,6 +471,7 @@ int modify_user_hw_breakpoint(struct perf_event *bp, > struct perf_event_attr *att > return err; > } > > + perf_event_enable(bp); > end: > bp->attr.disabled = attr->disabled; > We can do this accounting only if bp->attr.bp_type != attr->bp_type. -Milind
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Thu, Nov 09, 2017 at 08:46:58AM +0100, Jiri Olsa wrote: SNIP > > Jirka, > > > > I carefully looked at bp_cpuinfo[] and nr_slots[] data structures. > > nr_slots[] is an array of length two (one slot of TYPE_INST and > > another for TYPE_DATA). > > The accounting "thinks" that there is one limit on the number of > > instruction breakpoints and another limit on the number of data > > breakpoints. > > The assumption is clearly broken; for example, on x86 there exists a > > limit on the *total* number of all breakpoints disregarding their kind > > and the code has failed to capture this aspect. > > there's the CONFIG_HAVE_MIXED_BREAKPOINTS_REGS that puts DATA and INST > under one count on x86.. but that seems to be the enabled only for: > > arch/sh/Kconfig:select HAVE_MIXED_BREAKPOINTS_REGS > arch/x86/Kconfig: select HAVE_MIXED_BREAKPOINTS_REGS > > > > > As such, modify_user_hw_breakpoint() makes no attempt to keep the > > counts correct. Instead, it simply tries to change and install a new > > breakpoint and fails if the hardware disallows. > > This can lead to a situation where, say on x86, someone creates 4 > > TYPE_DATA breakpoints, then changes one of them to TYPE_INS via > > modify_user_hw_breakpoint() and then releases the TYPE_INS breakpoint. > > Since the accounting still thinks that there are four TYPE_DATA > > breakpoints, it will disallow creating a new TYPE_DATA breakpoint, > > although there is place for one TYPE_DATA breakpoint. > > > > This convinces me that the problem and the solution are outside of > > this current patch. > > Do you agree? > > I'll leave this decision to maintainer ;-) but seems better to fix > the interface before we add any new dependent function calls how about something like below (untested) looks like there's no irq caller for modify_user_hw_breakpoint, so we should be fine with locking nr_bp_mutex jirka --- diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c index 3f8cb1e14588..f062b68399ea 100644 --- a/kernel/events/hw_breakpoint.c +++ b/kernel/events/hw_breakpoint.c @@ -448,6 +448,8 @@ int modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *att else perf_event_disable(bp); + release_bp_slot(bp); + bp->attr.bp_addr = attr->bp_addr; bp->attr.bp_type = attr->bp_type; bp->attr.bp_len = attr->bp_len; @@ -455,9 +457,9 @@ int modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *att if (attr->disabled) goto end; - err = validate_hw_breakpoint(bp); + err = reserve_bp_slot(bp); if (!err) - perf_event_enable(bp); + err = validate_hw_breakpoint(bp); if (err) { bp->attr.bp_addr = old_addr; @@ -469,6 +471,7 @@ int modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *att return err; } + perf_event_enable(bp); end: bp->attr.disabled = attr->disabled;
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Wed, Nov 08, 2017 at 08:59:22AM -0800, Milind Chabbi wrote: > On Wed, Nov 8, 2017 at 7:57 AM, Jiri Olsa wrote: > > On Wed, Nov 08, 2017 at 07:51:10AM -0800, Milind Chabbi wrote: > >> On Wed, Nov 8, 2017 at 7:12 AM, Jiri Olsa wrote: > >> > >> > > I am not able to fully understand your concern. > >> > > Can you point to a code file and line related to your observation? > >> > > The patch is modeled after the existing modify_user_hw_breakpoint() > >> > > function > >> > > present in events/hw_breakpoint.c; don't you see this problem in that > >> > > code? > >> > > >> > the reserve_bp_slot/release_bp_slot functions manage > >> > counts for current breakpoints based on its type > >> > > >> > those counts are cumulated in here: > >> > static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]); > >> > > >> > you allow to change the breakpoint type, so I'd expect > >> > to see some code that release slot count for old type > >> > and take new one (if it's available) > >> > > >> > jirka > >> > >> > >> Why is this not a concern for modify_user_hw_breakpoint() function? > > > > I don't know ;-) > > > > jirka > > > Jirka, > > I carefully looked at bp_cpuinfo[] and nr_slots[] data structures. > nr_slots[] is an array of length two (one slot of TYPE_INST and > another for TYPE_DATA). > The accounting "thinks" that there is one limit on the number of > instruction breakpoints and another limit on the number of data > breakpoints. > The assumption is clearly broken; for example, on x86 there exists a > limit on the *total* number of all breakpoints disregarding their kind > and the code has failed to capture this aspect. there's the CONFIG_HAVE_MIXED_BREAKPOINTS_REGS that puts DATA and INST under one count on x86.. but that seems to be the enabled only for: arch/sh/Kconfig:select HAVE_MIXED_BREAKPOINTS_REGS arch/x86/Kconfig: select HAVE_MIXED_BREAKPOINTS_REGS > > As such, modify_user_hw_breakpoint() makes no attempt to keep the > counts correct. Instead, it simply tries to change and install a new > breakpoint and fails if the hardware disallows. > This can lead to a situation where, say on x86, someone creates 4 > TYPE_DATA breakpoints, then changes one of them to TYPE_INS via > modify_user_hw_breakpoint() and then releases the TYPE_INS breakpoint. > Since the accounting still thinks that there are four TYPE_DATA > breakpoints, it will disallow creating a new TYPE_DATA breakpoint, > although there is place for one TYPE_DATA breakpoint. > > This convinces me that the problem and the solution are outside of > this current patch. > Do you agree? I'll leave this decision to maintainer ;-) but seems better to fix the interface before we add any new dependent function calls jirka
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Wed, Nov 8, 2017 at 7:57 AM, Jiri Olsa wrote: > On Wed, Nov 08, 2017 at 07:51:10AM -0800, Milind Chabbi wrote: >> On Wed, Nov 8, 2017 at 7:12 AM, Jiri Olsa wrote: >> >> > > I am not able to fully understand your concern. >> > > Can you point to a code file and line related to your observation? >> > > The patch is modeled after the existing modify_user_hw_breakpoint() >> > > function >> > > present in events/hw_breakpoint.c; don't you see this problem in that >> > > code? >> > >> > the reserve_bp_slot/release_bp_slot functions manage >> > counts for current breakpoints based on its type >> > >> > those counts are cumulated in here: >> > static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]); >> > >> > you allow to change the breakpoint type, so I'd expect >> > to see some code that release slot count for old type >> > and take new one (if it's available) >> > >> > jirka >> >> >> Why is this not a concern for modify_user_hw_breakpoint() function? > > I don't know ;-) > > jirka Jirka, I carefully looked at bp_cpuinfo[] and nr_slots[] data structures. nr_slots[] is an array of length two (one slot of TYPE_INST and another for TYPE_DATA). The accounting "thinks" that there is one limit on the number of instruction breakpoints and another limit on the number of data breakpoints. The assumption is clearly broken; for example, on x86 there exists a limit on the *total* number of all breakpoints disregarding their kind and the code has failed to capture this aspect. As such, modify_user_hw_breakpoint() makes no attempt to keep the counts correct. Instead, it simply tries to change and install a new breakpoint and fails if the hardware disallows. This can lead to a situation where, say on x86, someone creates 4 TYPE_DATA breakpoints, then changes one of them to TYPE_INS via modify_user_hw_breakpoint() and then releases the TYPE_INS breakpoint. Since the accounting still thinks that there are four TYPE_DATA breakpoints, it will disallow creating a new TYPE_DATA breakpoint, although there is place for one TYPE_DATA breakpoint. This convinces me that the problem and the solution are outside of this current patch. Do you agree? -Milind
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Wed, Nov 08, 2017 at 07:51:10AM -0800, Milind Chabbi wrote: > On Wed, Nov 8, 2017 at 7:12 AM, Jiri Olsa wrote: > > > > I am not able to fully understand your concern. > > > Can you point to a code file and line related to your observation? > > > The patch is modeled after the existing modify_user_hw_breakpoint() > > > function > > > present in events/hw_breakpoint.c; don't you see this problem in that > > > code? > > > > the reserve_bp_slot/release_bp_slot functions manage > > counts for current breakpoints based on its type > > > > those counts are cumulated in here: > > static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]); > > > > you allow to change the breakpoint type, so I'd expect > > to see some code that release slot count for old type > > and take new one (if it's available) > > > > jirka > > > Why is this not a concern for modify_user_hw_breakpoint() function? I don't know ;-) jirka
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Wed, Nov 8, 2017 at 7:12 AM, Jiri Olsa wrote: > > I am not able to fully understand your concern. > > Can you point to a code file and line related to your observation? > > The patch is modeled after the existing modify_user_hw_breakpoint() function > > present in events/hw_breakpoint.c; don't you see this problem in that code? > > the reserve_bp_slot/release_bp_slot functions manage > counts for current breakpoints based on its type > > those counts are cumulated in here: > static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]); > > you allow to change the breakpoint type, so I'd expect > to see some code that release slot count for old type > and take new one (if it's available) > > jirka Why is this not a concern for modify_user_hw_breakpoint() function?
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Wed, Nov 08, 2017 at 07:02:22AM -0800, Milind Chabbi wrote: > On Wed, Nov 8, 2017 at 6:15 AM, Jiri Olsa wrote: > > On Mon, Nov 06, 2017 at 07:04:40AM -0800, Milind Chabbi wrote: > >> Hi Jirka, > >> > >> I see the tabs in my sent email, do you have suggestions on how best to > >> send this patch so that the tabs are preserved by the email client? > >> Can anybody else also check if they received with/without tabs? > >> > >> release_bp_slot/reserve_bp_slot majic is not necessary since > >> _IOC_MODIFY_BREAKPOINT ioctl modifies an already registered breakpoint > >> without affecting the count of breakpoints active. > > > > but AFAICS you allow to change the breakpoint type (bp_type) > > and slot counts are based on the breakpoint type > > > > jirka > > Jirka, > I am not able to fully understand your concern. > Can you point to a code file and line related to your observation? > The patch is modeled after the existing modify_user_hw_breakpoint() function > present in events/hw_breakpoint.c; don't you see this problem in that code? the reserve_bp_slot/release_bp_slot functions manage counts for current breakpoints based on its type those counts are cumulated in here: static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]); you allow to change the breakpoint type, so I'd expect to see some code that release slot count for old type and take new one (if it's available) jirka
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Wed, Nov 8, 2017 at 6:15 AM, Jiri Olsa wrote: > On Mon, Nov 06, 2017 at 07:04:40AM -0800, Milind Chabbi wrote: >> Hi Jirka, >> >> I see the tabs in my sent email, do you have suggestions on how best to >> send this patch so that the tabs are preserved by the email client? >> Can anybody else also check if they received with/without tabs? >> >> release_bp_slot/reserve_bp_slot majic is not necessary since >> _IOC_MODIFY_BREAKPOINT ioctl modifies an already registered breakpoint >> without affecting the count of breakpoints active. > > but AFAICS you allow to change the breakpoint type (bp_type) > and slot counts are based on the breakpoint type > > jirka Jirka, I am not able to fully understand your concern. Can you point to a code file and line related to your observation? The patch is modeled after the existing modify_user_hw_breakpoint() function present in events/hw_breakpoint.c; don't you see this problem in that code? -Milind
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Mon, Nov 06, 2017 at 07:04:40AM -0800, Milind Chabbi wrote: > Hi Jirka, > > I see the tabs in my sent email, do you have suggestions on how best to > send this patch so that the tabs are preserved by the email client? > Can anybody else also check if they received with/without tabs? > > release_bp_slot/reserve_bp_slot majic is not necessary since > _IOC_MODIFY_BREAKPOINT ioctl modifies an already registered breakpoint > without affecting the count of breakpoints active. but AFAICS you allow to change the breakpoint type (bp_type) and slot counts are based on the breakpoint type jirka
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
Hi Milind, Thank you for the patch! Yet something to improve: [auto build test ERROR on tip/perf/core] [also build test ERROR on v4.14-rc8 next-20171108] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Milind-Chabbi/perf-core-fast-breakpoint-modification-via-_IOC_MODIFY_BREAKPOINT/20171108-184959 config: arm-multi_v5_defconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm All errors (new ones prefixed by >>): kernel/events/core.o: In function `_perf_ioctl': >> core.c:(.text+0xa644): undefined reference to `validate_hw_breakpoint' --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
Hi Milind, Thank you for the patch! Yet something to improve: [auto build test ERROR on tip/perf/core] [also build test ERROR on v4.14-rc8 next-20171108] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Milind-Chabbi/perf-core-fast-breakpoint-modification-via-_IOC_MODIFY_BREAKPOINT/20171108-184959 config: parisc-c3000_defconfig (attached as .config) compiler: hppa-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=parisc All errors (new ones prefixed by >>): kernel/events/core.o: In function `_perf_ioctl': >> (.text._perf_ioctl+0x2c8): undefined reference to `validate_hw_breakpoint' --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
On Tue, Nov 7, 2017 at 11:01 AM, Peter Zijlstra wrote: > On Tue, Nov 07, 2017 at 09:42:25AM -0800, Milind Chabbi wrote: >> I am fine with Andi's suggestion. In summary, > > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? > A: Top-posting. > Q: What is the most annoying thing in e-mail? Point taken Peter. Posted below this time.
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
On Tue, Nov 07, 2017 at 09:42:25AM -0800, Milind Chabbi wrote: > I am fine with Andi's suggestion. In summary, A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing in e-mail?
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
I am fine with Andi's suggestion. In summary, 1. I will introduce an ioctl flag _IOC_MODIFY_ATTRIBUTES. (Yes, plural ATTRIBUTES not ATTRIBUTE) 2. Currently, implement only updates to breakpoints and all others will fail with -EOPNOTSUPP. 3. The implementation of breakpoint update shall check the following before modifying: (event->attr.type == PERF_TYPE_BREAKPOINT) && (new_attr.type == PERF_TYPE_BREAKPOINT) This ensures that both the passed in fd's event and the new_attr are PERF_TYPE_BREAKPOINT. Can we have a consensus on this? Now the question is what other attribute values to check in the implementation of the breakpoint update. Do you expect all fields other than the ones that we allow modification remain unchanged from the original creation time? and if anything changes should we fail with -EOPNOTSUPP? I think that is too strict. Expecting them to be zeros can be seen as a change from the original values, hence zero is not the right expectation. I am open to suggestions here and your help in listing a few attribute fields that need validation will be valuable. -Milind On Tue, Nov 7, 2017 at 9:24 AM, Andi Kleen wrote: > On Tue, Nov 07, 2017 at 07:43:35AM -0800, Milind Chabbi wrote: >> Peter, >> >> Generic update perf_event_attr interface is noble but impractical. >> It will cause a validation nightmare. >> Many of the behaviors or choices will become hard to reason. > > I don't think you would necessarily need to support modifying > all of this. Just define a general interface that could be used > to modify these things, but right now it would be only > implemented for the special case of breakpoints. > > Your ioctl is very near it anyways, just need to change > the name and do more sanity checking on the input values. > > -Andi
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
On Tue, Nov 07, 2017 at 07:43:35AM -0800, Milind Chabbi wrote: > Peter, > > Generic update perf_event_attr interface is noble but impractical. > It will cause a validation nightmare. > Many of the behaviors or choices will become hard to reason. I don't think you would necessarily need to support modifying all of this. Just define a general interface that could be used to modify these things, but right now it would be only implemented for the special case of breakpoints. Your ioctl is very near it anyways, just need to change the name and do more sanity checking on the input values. -Andi
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
On Tue, Nov 07, 2017 at 09:15:41AM +0100, Peter Zijlstra wrote: > On Mon, Nov 06, 2017 at 03:16:58PM -0800, Andi Kleen wrote: > > > +static int _perf_event_modify_breakpoint(struct perf_event *bp, > > > + struct perf_event_attr *attr) > > > +{ > > > + u64 old_addr = bp->attr.bp_addr; > > > + u64 old_len = bp->attr.bp_len; > > > + int old_type = bp->attr.bp_type; > > > + int err = 0; > > > + > > > + _perf_event_disable(bp); > > > + > > > + bp->attr.bp_addr = attr->bp_addr; > > > + bp->attr.bp_type = attr->bp_type; > > > + bp->attr.bp_len = attr->bp_len; > > > > You don't check any of the other fields, so user space is free > > to fill in junk. That means they can never be used for anything. > > It would be better to check at least some of them for being > > zero, and also that the type matches the break point. > > Yes, the values should at the very least get the exact same validation > they would get on creating an event with those values. In this case the ioctl could be also generalized. Not call it _BREAKPOINT, just _MODIFY. Just for now it would be only implemented for break points, but that could be potentially extended later. -Andi
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
Peter, Generic update perf_event_attr interface is noble but impractical. It will cause a validation nightmare. Many of the behaviors or choices will become hard to reason. If somebody changes "sample_period" it is unclear what to do it the number of samples collected so far exceeds the newly configured N events. If somebody changes sample_type, the type of data recorded in the mmap ring buffer will be a mix of two (or more) different kinds of data, which make it untenable. Uncommon but possible: somebody may like to change the "type" itself. The list goes on. When you proposed generic update perf_event_attr interface, do you have a clear use case in mind with measured performance impact? If so, one can consider that path, which is an entirely different project. I believe your proposal is to introduce a new system call perf_event_update_attr(). The changes proposed in the patch are motivated by a clear use case with a clear performance impact. Changing the address monitored by a breakpoint is a common operation by profilers, and hence it need not go through the whole process of unmapping the ring buffer, closing the fd, re-opening a perf event and remapping the ring buffer. -Milind On Tue, Nov 7, 2017 at 12:14 AM, Peter Zijlstra wrote: > > On Mon, Nov 06, 2017 at 05:09:15PM -0500, Milind Chabbi wrote: > > diff --git a/include/uapi/linux/perf_event.h > > b/include/uapi/linux/perf_event.h > > index 362493a..d458214 100644 > > --- a/include/uapi/linux/perf_event.h > > +++ b/include/uapi/linux/perf_event.h > > @@ -433,6 +433,8 @@ struct perf_event_attr { > > #define PERF_EVENT_IOC_ID_IOR('$', 7, __u64 *) > > #define PERF_EVENT_IOC_SET_BPF _IOW('$', 8, __u32) > > #define PERF_EVENT_IOC_PAUSE_OUTPUT _IOW('$', 9, __u32) > > +#define PERF_EVENT_IOC_MODIFY_BREAKPOINT \ > > + _IOW('$', 10, struct perf_event_attr *) > > > I really hate this thing. I would much rather see a more generic update > perf_event_attr interface. Where we allow modifying some of the > perf_event_attr fields.
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
On Mon, Nov 06, 2017 at 05:09:15PM -0500, Milind Chabbi wrote: > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index 362493a..d458214 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -433,6 +433,8 @@ struct perf_event_attr { > #define PERF_EVENT_IOC_ID_IOR('$', 7, __u64 *) > #define PERF_EVENT_IOC_SET_BPF _IOW('$', 8, __u32) > #define PERF_EVENT_IOC_PAUSE_OUTPUT _IOW('$', 9, __u32) > +#define PERF_EVENT_IOC_MODIFY_BREAKPOINT \ > + _IOW('$', 10, struct perf_event_attr *) I really hate this thing. I would much rather see a more generic update perf_event_attr interface. Where we allow modifying some of the perf_event_attr fields.
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
On Mon, Nov 06, 2017 at 03:16:58PM -0800, Andi Kleen wrote: > > +static int _perf_event_modify_breakpoint(struct perf_event *bp, > > +struct perf_event_attr *attr) > > +{ > > + u64 old_addr = bp->attr.bp_addr; > > + u64 old_len = bp->attr.bp_len; > > + int old_type = bp->attr.bp_type; > > + int err = 0; > > + > > + _perf_event_disable(bp); > > + > > + bp->attr.bp_addr = attr->bp_addr; > > + bp->attr.bp_type = attr->bp_type; > > + bp->attr.bp_len = attr->bp_len; > > You don't check any of the other fields, so user space is free > to fill in junk. That means they can never be used for anything. > It would be better to check at least some of them for being > zero, and also that the type matches the break point. Yes, the values should at the very least get the exact same validation they would get on creating an event with those values.
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
> +static int _perf_event_modify_breakpoint(struct perf_event *bp, > + struct perf_event_attr *attr) > +{ > + u64 old_addr = bp->attr.bp_addr; > + u64 old_len = bp->attr.bp_len; > + int old_type = bp->attr.bp_type; > + int err = 0; > + > + _perf_event_disable(bp); > + > + bp->attr.bp_addr = attr->bp_addr; > + bp->attr.bp_type = attr->bp_type; > + bp->attr.bp_len = attr->bp_len; You don't check any of the other fields, so user space is free to fill in junk. That means they can never be used for anything. It would be better to check at least some of them for being zero, and also that the type matches the break point. -Andi
[PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT.
Problem and motivation: Once a breakpoint perf event (PERF_TYPE_BREAKPOINT) is created, there is no flexibility to change the breakpoint type (bp_type), breakpoint address (bp_addr), or breakpoint length (bp_len). The only option is to close the perf event and configure a new breakpoint event. This inflexibility has a significant performance overhead. For example, sampling-based, lightweight performance profilers (and also concurrency bug detection tools), monitor different addresses for a short duration using PERF_TYPE_BREAKPOINT and change the address (bp_addr) to another address or change the kind of breakpoint (bp_type) from "write" to a "read" or vice-versa or change the length (bp_len) of the address being monitored. The cost of these modifications is prohibitive since it involves unmapping the circular buffer associated with the perf event, closing the perf event, opening another perf event and mmaping another circular buffer. Solution: The new ioctl flag for perf events, PERF_EVENT_IOC_MODIFY_BREAKPOINT, introduced in this patch takes a pointer to a struct perf_event_attr as an argument to update an old breakpoint event with new address, type, and size. This facility allows retaining a previous mmaped perf events ring buffer and avoids having to close and reopen another perf event. The patch replicates some of its functionality of modify_user_hw_breakpoint() in kernel/events/hw_breakpoint.c. modify_user_hw_breakpoint cannot be called directed since perf_event_ctx_lock() is already held in _perf_ioctl(). Evidence: Experiments show that the baseline (not able to modify an already created breakpoint) costs an order of magnitude (~10x) more than the suggested optimization (having the ability to dynamically modifying a configured breakpoint via ioctl). When the breakpoints typically do not trap, the speedup due to the suggested optimization is ~10x; even when the breakpoints always trap, the speedup is ~4x due to the suggested optimization. Testing: tests posted at https://github.com/linux-contrib/perf_event_modify_bp demonstrate the performance significance of this patch. Tests also check the functional correctness of the patch. Signed-off-by: Milind Chabbi --- include/uapi/linux/perf_event.h | 2 ++ kernel/events/core.c | 46 +++ kernel/events/hw_breakpoint.c | 2 +- kernel/events/internal.h | 1 + tools/include/uapi/linux/perf_event.h | 2 ++ 5 files changed, 52 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 362493a..d458214 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -433,6 +433,8 @@ struct perf_event_attr { #define PERF_EVENT_IOC_ID _IOR('$', 7, __u64 *) #define PERF_EVENT_IOC_SET_BPF _IOW('$', 8, __u32) #define PERF_EVENT_IOC_PAUSE_OUTPUT_IOW('$', 9, __u32) +#define PERF_EVENT_IOC_MODIFY_BREAKPOINT \ + _IOW('$', 10, struct perf_event_attr *) enum perf_event_ioc_flags { PERF_IOC_FLAG_GROUP = 1U << 0, diff --git a/kernel/events/core.c b/kernel/events/core.c index 9d93db8..85718da 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -2746,6 +2746,41 @@ int perf_event_refresh(struct perf_event *event, int refresh) } EXPORT_SYMBOL_GPL(perf_event_refresh); +static int _perf_event_modify_breakpoint(struct perf_event *bp, +struct perf_event_attr *attr) +{ + u64 old_addr = bp->attr.bp_addr; + u64 old_len = bp->attr.bp_len; + int old_type = bp->attr.bp_type; + int err = 0; + + _perf_event_disable(bp); + + bp->attr.bp_addr = attr->bp_addr; + bp->attr.bp_type = attr->bp_type; + bp->attr.bp_len = attr->bp_len; + + if (attr->disabled) + goto end; + + err = validate_hw_breakpoint(bp); + if (!err) { + _perf_event_enable(bp); + } else { + bp->attr.bp_addr = old_addr; + bp->attr.bp_type = old_type; + bp->attr.bp_len = old_len; + if (!bp->attr.disabled) + _perf_event_enable(bp); + + return err; + } +end: + bp->attr.disabled = attr->disabled; + return 0; +} + + static void ctx_sched_out(struct perf_event_context *ctx, struct perf_cpu_context *cpuctx, enum event_type_t event_type) @@ -4731,6 +4766,8 @@ static int perf_event_set_output(struct perf_event *event, struct perf_event *output_event); static int perf_event_set_filter(struct perf_event *event, void __user *arg); static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd); +static int perf_copy_attr(struct perf_event_attr __user *uattr, + struct perf_event_attr *attr); static long _perf_ioctl(struct perf_ev
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
Em Mon, Nov 06, 2017 at 07:00:29AM -0800, Milind Chabbi escreveu: > Hi Jirka, > > I see the tabs in my sent email, do you have suggestions on how best to > send this patch so that the tabs are preserved by the email client? Documentation/process/email-clients.rst - Arnaldo > Can anybody else also check if they received with/without tabs? > > release_bp_slot/reserve_bp_slot majic is not necessary since > _IOC_MODIFY_BREAKPOINT ioctl modifies an already registered breakpoint > without affecting the count of breakpoints active. > > -Milind > > On Mon, Nov 6, 2017 at 1:23 AM, Jiri Olsa wrote: > > > On Sun, Nov 05, 2017 at 02:35:34PM -0800, Milind Chabbi wrote: > > > > SNIP > > > > > +static int _perf_event_modify_breakpoint(struct perf_event *bp, > > > + struct perf_event_attr *attr) > > > +{ > > > + u64 old_addr = bp->attr.bp_addr; > > > + u64 old_len = bp->attr.bp_len; > > > + int old_type = bp->attr.bp_type; > > > + int err = 0; > > > + > > > + _perf_event_disable(bp); > > > + > > > + bp->attr.bp_addr = attr->bp_addr; > > > + bp->attr.bp_type = attr->bp_type; > > > + bp->attr.bp_len = attr->bp_len; > > > + > > > + if (attr->disabled) > > > + goto end; > > > + > > > + err = validate_hw_breakpoint(bp); > > > > thre patch is mangled.. seems like you've lost all your tabs somehow > > > > anyway, should you also do the release_bp_slot/reserve_bp_slot > > magic to keep the slot accounting corrent? > > > > jirka > >
Re: [PATCH] perf/core: fast breakpoint modification via _IOC_MODIFY_BREAKPOINT
On Sun, Nov 05, 2017 at 02:35:34PM -0800, Milind Chabbi wrote: SNIP > +static int _perf_event_modify_breakpoint(struct perf_event *bp, > + struct perf_event_attr *attr) > +{ > + u64 old_addr = bp->attr.bp_addr; > + u64 old_len = bp->attr.bp_len; > + int old_type = bp->attr.bp_type; > + int err = 0; > + > + _perf_event_disable(bp); > + > + bp->attr.bp_addr = attr->bp_addr; > + bp->attr.bp_type = attr->bp_type; > + bp->attr.bp_len = attr->bp_len; > + > + if (attr->disabled) > + goto end; > + > + err = validate_hw_breakpoint(bp); thre patch is mangled.. seems like you've lost all your tabs somehow anyway, should you also do the release_bp_slot/reserve_bp_slot magic to keep the slot accounting corrent? jirka