Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25, Jacob Shin wrote: > > Right now perf > userland tool hard codes bp_len to 4, so I need to modify it to allow > user to override the length if desired. imho this itself looks like a good change... > Oleg, Frederic, et al. > > Which syntax do you prefer? > > $ perf stat -e mem:0x1000/16 Personally I like this a bit more, but I won't argue in any case. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Fri, Apr 26, 2013 at 05:20:44PM +0100, Will Deacon wrote: > Hi Jacob, > > On Fri, Apr 26, 2013 at 12:19:11AM +0100, Jacob Shin wrote: > > On Thu, Apr 25, 2013 at 10:17:35AM -0700, H. Peter Anvin wrote: > > > On 04/25/2013 10:06 AM, Oleg Nesterov wrote: > > > >> > > > >> The downside is that in userland perf tool we need differing > > > >> documentation > > > >> on what the mask syntax means for each architecture. > > > > > > > > Personally I think this is acceptable. > > > > > > > > But I am new to this code, so... > > > > > > > > > > That would seem really, really awkward. Yes, perf has a bunch of > > > low-level stuff, but it would seem highly undesirable to force the user > > > to deal with something like that. > > > > > > It would be good to have a user-friendly syntax that covers most of what > > > users may want to do and perhaps a longer form that can express > > > everything including ARM's byte selects; if the system can't honor the > > > request it should return an error. > > > > Okay, > > > > If arch specific masks are a no go, then I think I'm convinced that > > Oleg's idea of using bp_len is the right thing to do. Right now perf > > userland tool hard codes bp_len to 4, so I need to modify it to allow > > user to override the length if desired. > > So what value ends up in the bp_len field: a length or a mask? I just want > to make sure it's flexible enough that, if we add another user interface for > the byte-select stuff, we don't need to butcher the attr in another way. It is length. Just as it is today, userland API does not change at all. > > > Oleg, Frederic, et al. > > > > Which syntax do you prefer? > > > > If we want to set bp_len to 16: > > > > $ perf stat -e mem:0x1000:rw:16 > > > > Or > > > > $ perf stat -e mem:0x1000:16 > > > > Or > > > > $ perf stat -e mem:0x1000/16 > > > > If no bp_len value is specified, it will still default to 4 as it did > > before. > > I certainly like the ability to change the length of an execute breakpoint, > as that helps for halfword instructions on ARM (not sure if your first > syntax precludes that, but just checking...). All three are equivalent, I just need to pick one and implement/document. -Jacob -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
Hi Jacob, On Fri, Apr 26, 2013 at 12:19:11AM +0100, Jacob Shin wrote: > On Thu, Apr 25, 2013 at 10:17:35AM -0700, H. Peter Anvin wrote: > > On 04/25/2013 10:06 AM, Oleg Nesterov wrote: > > >> > > >> The downside is that in userland perf tool we need differing > > >> documentation > > >> on what the mask syntax means for each architecture. > > > > > > Personally I think this is acceptable. > > > > > > But I am new to this code, so... > > > > > > > That would seem really, really awkward. Yes, perf has a bunch of > > low-level stuff, but it would seem highly undesirable to force the user > > to deal with something like that. > > > > It would be good to have a user-friendly syntax that covers most of what > > users may want to do and perhaps a longer form that can express > > everything including ARM's byte selects; if the system can't honor the > > request it should return an error. > > Okay, > > If arch specific masks are a no go, then I think I'm convinced that > Oleg's idea of using bp_len is the right thing to do. Right now perf > userland tool hard codes bp_len to 4, so I need to modify it to allow > user to override the length if desired. So what value ends up in the bp_len field: a length or a mask? I just want to make sure it's flexible enough that, if we add another user interface for the byte-select stuff, we don't need to butcher the attr in another way. > Oleg, Frederic, et al. > > Which syntax do you prefer? > > If we want to set bp_len to 16: > > $ perf stat -e mem:0x1000:rw:16 > > Or > > $ perf stat -e mem:0x1000:16 > > Or > > $ perf stat -e mem:0x1000/16 > > If no bp_len value is specified, it will still default to 4 as it did > before. I certainly like the ability to change the length of an execute breakpoint, as that helps for halfword instructions on ARM (not sure if your first syntax precludes that, but just checking...). Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
Hi Jacob, On Fri, Apr 26, 2013 at 12:19:11AM +0100, Jacob Shin wrote: On Thu, Apr 25, 2013 at 10:17:35AM -0700, H. Peter Anvin wrote: On 04/25/2013 10:06 AM, Oleg Nesterov wrote: The downside is that in userland perf tool we need differing documentation on what the mask syntax means for each architecture. Personally I think this is acceptable. But I am new to this code, so... That would seem really, really awkward. Yes, perf has a bunch of low-level stuff, but it would seem highly undesirable to force the user to deal with something like that. It would be good to have a user-friendly syntax that covers most of what users may want to do and perhaps a longer form that can express everything including ARM's byte selects; if the system can't honor the request it should return an error. Okay, If arch specific masks are a no go, then I think I'm convinced that Oleg's idea of using bp_len is the right thing to do. Right now perf userland tool hard codes bp_len to 4, so I need to modify it to allow user to override the length if desired. So what value ends up in the bp_len field: a length or a mask? I just want to make sure it's flexible enough that, if we add another user interface for the byte-select stuff, we don't need to butcher the attr in another way. Oleg, Frederic, et al. Which syntax do you prefer? If we want to set bp_len to 16: $ perf stat -e mem:0x1000:rw:16 Or $ perf stat -e mem:0x1000:16 Or $ perf stat -e mem:0x1000/16 If no bp_len value is specified, it will still default to 4 as it did before. I certainly like the ability to change the length of an execute breakpoint, as that helps for halfword instructions on ARM (not sure if your first syntax precludes that, but just checking...). Will -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Fri, Apr 26, 2013 at 05:20:44PM +0100, Will Deacon wrote: Hi Jacob, On Fri, Apr 26, 2013 at 12:19:11AM +0100, Jacob Shin wrote: On Thu, Apr 25, 2013 at 10:17:35AM -0700, H. Peter Anvin wrote: On 04/25/2013 10:06 AM, Oleg Nesterov wrote: The downside is that in userland perf tool we need differing documentation on what the mask syntax means for each architecture. Personally I think this is acceptable. But I am new to this code, so... That would seem really, really awkward. Yes, perf has a bunch of low-level stuff, but it would seem highly undesirable to force the user to deal with something like that. It would be good to have a user-friendly syntax that covers most of what users may want to do and perhaps a longer form that can express everything including ARM's byte selects; if the system can't honor the request it should return an error. Okay, If arch specific masks are a no go, then I think I'm convinced that Oleg's idea of using bp_len is the right thing to do. Right now perf userland tool hard codes bp_len to 4, so I need to modify it to allow user to override the length if desired. So what value ends up in the bp_len field: a length or a mask? I just want to make sure it's flexible enough that, if we add another user interface for the byte-select stuff, we don't need to butcher the attr in another way. It is length. Just as it is today, userland API does not change at all. Oleg, Frederic, et al. Which syntax do you prefer? If we want to set bp_len to 16: $ perf stat -e mem:0x1000:rw:16 Or $ perf stat -e mem:0x1000:16 Or $ perf stat -e mem:0x1000/16 If no bp_len value is specified, it will still default to 4 as it did before. I certainly like the ability to change the length of an execute breakpoint, as that helps for halfword instructions on ARM (not sure if your first syntax precludes that, but just checking...). All three are equivalent, I just need to pick one and implement/document. -Jacob -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25, Jacob Shin wrote: Right now perf userland tool hard codes bp_len to 4, so I need to modify it to allow user to override the length if desired. imho this itself looks like a good change... Oleg, Frederic, et al. Which syntax do you prefer? $ perf stat -e mem:0x1000/16 Personally I like this a bit more, but I won't argue in any case. Oleg. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Thu, Apr 25, 2013 at 10:17:35AM -0700, H. Peter Anvin wrote: > On 04/25/2013 10:06 AM, Oleg Nesterov wrote: > >> > >> The downside is that in userland perf tool we need differing documentation > >> on what the mask syntax means for each architecture. > > > > Personally I think this is acceptable. > > > > But I am new to this code, so... > > > > That would seem really, really awkward. Yes, perf has a bunch of > low-level stuff, but it would seem highly undesirable to force the user > to deal with something like that. > > It would be good to have a user-friendly syntax that covers most of what > users may want to do and perhaps a longer form that can express > everything including ARM's byte selects; if the system can't honor the > request it should return an error. Okay, If arch specific masks are a no go, then I think I'm convinced that Oleg's idea of using bp_len is the right thing to do. Right now perf userland tool hard codes bp_len to 4, so I need to modify it to allow user to override the length if desired. Oleg, Frederic, et al. Which syntax do you prefer? If we want to set bp_len to 16: $ perf stat -e mem:0x1000:rw:16 Or $ perf stat -e mem:0x1000:16 Or $ perf stat -e mem:0x1000/16 If no bp_len value is specified, it will still default to 4 as it did before. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25, Jacob Shin wrote: > > On Thu, Apr 25, 2013 at 05:10:35PM +0200, Oleg Nesterov wrote: > > On 04/25, Frederic Weisbecker wrote: > > > > > > Do we need len and mask to work at the same time? I can't think of a > > > situation when len and mask mix up together in a useful way to define > > > a range. > > Okay, we can make it: > > union { > __u64 bp_len; > __u64 bp_addr_mask; > __config2; > }; > > And in x86, bp_len != HW_BREAKPOINT_LEN_1,2,4,8 will be interpreted as > bp_addr_mask. I think this can work too. And this needs almost the same changes as extending ->bp_len. > > Well. Another option is to extend bp_len. Fortunately HW_BREAKPOINT_LEN_* > > match the length, so we can simply allow any 2^n length and amd.c can > > translate it into the mask. > > Okay, this is nice because we can just ride on top of what already exits, > but ... Yes, yes, I agree with your "but". As I said from the very beginning I am not sure about this idea. > addr of 0x1000 and mask of 0xf0 will count accesses to: > > 0x1000, 0x1010, 0x1020, .. 0x10e0, 0x10f0 > > Maybe there is some big blob of data and user wants to see how many times > 16 byte aligned addresses get hit. This might be not as common, but it is > plausible no? I'd say this is certainly uncommon ;) But in any case we should not limit a user, so I agree. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25/2013 10:06 AM, Oleg Nesterov wrote: >> >> The downside is that in userland perf tool we need differing documentation >> on what the mask syntax means for each architecture. > > Personally I think this is acceptable. > > But I am new to this code, so... > That would seem really, really awkward. Yes, perf has a bunch of low-level stuff, but it would seem highly undesirable to force the user to deal with something like that. It would be good to have a user-friendly syntax that covers most of what users may want to do and perhaps a longer form that can express everything including ARM's byte selects; if the system can't honor the request it should return an error. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/24, Jacob Shin wrote: > > And only x86 > hw_breakpoint will know what .config = 0xf means for x86 and do the right > thing. For ARM, 0xf will mean something different. > > The downside is that in userland perf tool we need differing documentation > on what the mask syntax means for each architecture. Personally I think this is acceptable. But I am new to this code, so... Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Thu, Apr 25, 2013 at 05:10:35PM +0200, Oleg Nesterov wrote: > On 04/25, Frederic Weisbecker wrote: > > > > 2013/4/23 Jacob Shin : > > > @@ -286,7 +286,10 @@ struct perf_event_attr { > > > __u64 config1; /* extension of config */ > > > }; > > > union { > > > - __u64 bp_len; > > > + struct { > > > + __u32 bp_len; > > > + __u32 bp_addr_mask; > > > + }; > > > > Do we need len and mask to work at the same time? I can't think of a > > situation when len and mask mix up together in a useful way to define > > a range. Okay, we can make it: union { __u64 bp_len; __u64 bp_addr_mask; __config2; }; And in x86, bp_len != HW_BREAKPOINT_LEN_1,2,4,8 will be interpreted as bp_addr_mask. > > And it would be nice (I think) if we could simply turn bp_len into > bp_mask. It is already the mask actually, bp_addr should be aligned. > > But I do not see how we can do this, so I guess we need another field. > > Well. Another option is to extend bp_len. Fortunately HW_BREAKPOINT_LEN_* > match the length, so we can simply allow any 2^n length and amd.c can > translate it into the mask. Okay, this is nice because we can just ride on top of what already exits, but ... > > Of course, this doesn't allow to use, say, mask=0xF0. But perhaps this > is not really useful? Exactly .. Right I think most of the time we are trying to trap on range of contiguous addresses, but .. mask of 0xf0 allows us to trap on 16 byte aligned addresses: addr of 0x1000 and mask of 0xf0 will count accesses to: 0x1000, 0x1010, 0x1020, .. 0x10e0, 0x10f0 Maybe there is some big blob of data and user wants to see how many times 16 byte aligned addresses get hit. This might be not as common, but it is plausible no? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25, Oleg Nesterov wrote: > > On 04/25, Frederic Weisbecker wrote: > > > > 2013/4/23 Jacob Shin : > > > @@ -286,7 +286,10 @@ struct perf_event_attr { > > > __u64 config1; /* extension of config */ > > > }; > > > union { > > > - __u64 bp_len; > > > + struct { > > > + __u32 bp_len; > > > + __u32 bp_addr_mask; > > > + }; > > > > Do we need len and mask to work at the same time? I can't think of a > > situation when len and mask mix up together in a useful way to define > > a range. > > And it would be nice (I think) if we could simply turn bp_len into > bp_mask. It is already the mask actually, bp_addr should be aligned. > > But I do not see how we can do this, so I guess we need another field. > > Well. Another option is to extend bp_len. Fortunately HW_BREAKPOINT_LEN_* > match the length, so we can simply allow any 2^n length and amd.c can > translate it into the mask. > > Of course, this doesn't allow to use, say, mask=0xF0. But perhaps this > is not really useful? > > I dunno. I leave this to you and Jacob ;) IOW, I meant something like the incomplete patch below. But let me repeat I am not sure this is the good idea. Hmm. And note the change arch_check_bp_in_kernelspace(). Whatever we do it should be updated if we have a mask... And this reminds me that arch_check_bp_in_kernelspace() is buggy but my trivial fix was ignored ;) see http://marc.info/?t=13524859301 Oleg. --- x/arch/x86/include/asm/hw_breakpoint.h +++ x/arch/x86/include/asm/hw_breakpoint.h @@ -12,6 +12,7 @@ */ struct arch_hw_breakpoint { unsigned long address; + unsigned long mask; u8 len; u8 type; }; --- x/arch/x86/kernel/hw_breakpoint.c +++ x/arch/x86/kernel/hw_breakpoint.c @@ -128,6 +128,8 @@ int arch_install_hw_breakpoint(struct pe *dr7 |= encode_dr7(i, info->len, info->type); set_debugreg(*dr7, 7); + if (info->mask) + set_dr_addr_mask(...); return 0; } @@ -165,29 +167,6 @@ void arch_uninstall_hw_breakpoint(struct set_debugreg(*dr7, 7); } -static int get_hbp_len(u8 hbp_len) -{ - unsigned int len_in_bytes = 0; - - switch (hbp_len) { - case X86_BREAKPOINT_LEN_1: - len_in_bytes = 1; - break; - case X86_BREAKPOINT_LEN_2: - len_in_bytes = 2; - break; - case X86_BREAKPOINT_LEN_4: - len_in_bytes = 4; - break; -#ifdef CONFIG_X86_64 - case X86_BREAKPOINT_LEN_8: - len_in_bytes = 8; - break; -#endif - } - return len_in_bytes; -} - /* * Check for virtual address in kernel space. */ @@ -198,7 +177,7 @@ int arch_check_bp_in_kernelspace(struct struct arch_hw_breakpoint *info = counter_arch_bp(bp); va = info->address; - len = get_hbp_len(info->len); + len = bp->attr.bp_len; return (va >= TASK_SIZE) && ((va + len - 1) >= TASK_SIZE); } @@ -278,7 +257,8 @@ static int arch_build_bp_info(struct per return -EINVAL; } - /* Len */ + /* info->len == info->mask == 0 */ + switch (bp->attr.bp_len) { case HW_BREAKPOINT_LEN_1: info->len = X86_BREAKPOINT_LEN_1; @@ -295,7 +275,10 @@ static int arch_build_bp_info(struct per break; #endif default: - return -EINVAL; + if (!is_power_of_2(bp->attr.bp_len)) + return -EINVAL; + info->mask = bp->attr.bp_len - 1; + info->len = X86_BREAKPOINT_LEN_1; } return 0; @@ -314,11 +297,15 @@ int arch_validate_hwbkpt_settings(struct if (ret) return ret; - ret = -EINVAL; - switch (info->len) { case X86_BREAKPOINT_LEN_1: align = 0; + if (info->mask) { + align = info->mask; + ret = arch_validate_hwbkpt_addr_mask(...); + if (ret) + return ret; + } break; case X86_BREAKPOINT_LEN_2: align = 1; @@ -332,7 +319,7 @@ int arch_validate_hwbkpt_settings(struct break; #endif default: - return ret; + BUG(); } /* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25, Frederic Weisbecker wrote: > > 2013/4/23 Jacob Shin : > > @@ -286,7 +286,10 @@ struct perf_event_attr { > > __u64 config1; /* extension of config */ > > }; > > union { > > - __u64 bp_len; > > + struct { > > + __u32 bp_len; > > + __u32 bp_addr_mask; > > + }; > > Do we need len and mask to work at the same time? I can't think of a > situation when len and mask mix up together in a useful way to define > a range. And it would be nice (I think) if we could simply turn bp_len into bp_mask. It is already the mask actually, bp_addr should be aligned. But I do not see how we can do this, so I guess we need another field. Well. Another option is to extend bp_len. Fortunately HW_BREAKPOINT_LEN_* match the length, so we can simply allow any 2^n length and amd.c can translate it into the mask. Of course, this doesn't allow to use, say, mask=0xF0. But perhaps this is not really useful? I dunno. I leave this to you and Jacob ;) Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25, Frederic Weisbecker wrote: 2013/4/23 Jacob Shin jacob.s...@amd.com: @@ -286,7 +286,10 @@ struct perf_event_attr { __u64 config1; /* extension of config */ }; union { - __u64 bp_len; + struct { + __u32 bp_len; + __u32 bp_addr_mask; + }; Do we need len and mask to work at the same time? I can't think of a situation when len and mask mix up together in a useful way to define a range. And it would be nice (I think) if we could simply turn bp_len into bp_mask. It is already the mask actually, bp_addr should be aligned. But I do not see how we can do this, so I guess we need another field. Well. Another option is to extend bp_len. Fortunately HW_BREAKPOINT_LEN_* match the length, so we can simply allow any 2^n length and amd.c can translate it into the mask. Of course, this doesn't allow to use, say, mask=0xF0. But perhaps this is not really useful? I dunno. I leave this to you and Jacob ;) Oleg. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25, Oleg Nesterov wrote: On 04/25, Frederic Weisbecker wrote: 2013/4/23 Jacob Shin jacob.s...@amd.com: @@ -286,7 +286,10 @@ struct perf_event_attr { __u64 config1; /* extension of config */ }; union { - __u64 bp_len; + struct { + __u32 bp_len; + __u32 bp_addr_mask; + }; Do we need len and mask to work at the same time? I can't think of a situation when len and mask mix up together in a useful way to define a range. And it would be nice (I think) if we could simply turn bp_len into bp_mask. It is already the mask actually, bp_addr should be aligned. But I do not see how we can do this, so I guess we need another field. Well. Another option is to extend bp_len. Fortunately HW_BREAKPOINT_LEN_* match the length, so we can simply allow any 2^n length and amd.c can translate it into the mask. Of course, this doesn't allow to use, say, mask=0xF0. But perhaps this is not really useful? I dunno. I leave this to you and Jacob ;) IOW, I meant something like the incomplete patch below. But let me repeat I am not sure this is the good idea. Hmm. And note the change arch_check_bp_in_kernelspace(). Whatever we do it should be updated if we have a mask... And this reminds me that arch_check_bp_in_kernelspace() is buggy but my trivial fix was ignored ;) see http://marc.info/?t=13524859301 Oleg. --- x/arch/x86/include/asm/hw_breakpoint.h +++ x/arch/x86/include/asm/hw_breakpoint.h @@ -12,6 +12,7 @@ */ struct arch_hw_breakpoint { unsigned long address; + unsigned long mask; u8 len; u8 type; }; --- x/arch/x86/kernel/hw_breakpoint.c +++ x/arch/x86/kernel/hw_breakpoint.c @@ -128,6 +128,8 @@ int arch_install_hw_breakpoint(struct pe *dr7 |= encode_dr7(i, info-len, info-type); set_debugreg(*dr7, 7); + if (info-mask) + set_dr_addr_mask(...); return 0; } @@ -165,29 +167,6 @@ void arch_uninstall_hw_breakpoint(struct set_debugreg(*dr7, 7); } -static int get_hbp_len(u8 hbp_len) -{ - unsigned int len_in_bytes = 0; - - switch (hbp_len) { - case X86_BREAKPOINT_LEN_1: - len_in_bytes = 1; - break; - case X86_BREAKPOINT_LEN_2: - len_in_bytes = 2; - break; - case X86_BREAKPOINT_LEN_4: - len_in_bytes = 4; - break; -#ifdef CONFIG_X86_64 - case X86_BREAKPOINT_LEN_8: - len_in_bytes = 8; - break; -#endif - } - return len_in_bytes; -} - /* * Check for virtual address in kernel space. */ @@ -198,7 +177,7 @@ int arch_check_bp_in_kernelspace(struct struct arch_hw_breakpoint *info = counter_arch_bp(bp); va = info-address; - len = get_hbp_len(info-len); + len = bp-attr.bp_len; return (va = TASK_SIZE) ((va + len - 1) = TASK_SIZE); } @@ -278,7 +257,8 @@ static int arch_build_bp_info(struct per return -EINVAL; } - /* Len */ + /* info-len == info-mask == 0 */ + switch (bp-attr.bp_len) { case HW_BREAKPOINT_LEN_1: info-len = X86_BREAKPOINT_LEN_1; @@ -295,7 +275,10 @@ static int arch_build_bp_info(struct per break; #endif default: - return -EINVAL; + if (!is_power_of_2(bp-attr.bp_len)) + return -EINVAL; + info-mask = bp-attr.bp_len - 1; + info-len = X86_BREAKPOINT_LEN_1; } return 0; @@ -314,11 +297,15 @@ int arch_validate_hwbkpt_settings(struct if (ret) return ret; - ret = -EINVAL; - switch (info-len) { case X86_BREAKPOINT_LEN_1: align = 0; + if (info-mask) { + align = info-mask; + ret = arch_validate_hwbkpt_addr_mask(...); + if (ret) + return ret; + } break; case X86_BREAKPOINT_LEN_2: align = 1; @@ -332,7 +319,7 @@ int arch_validate_hwbkpt_settings(struct break; #endif default: - return ret; + BUG(); } /* -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Thu, Apr 25, 2013 at 05:10:35PM +0200, Oleg Nesterov wrote: On 04/25, Frederic Weisbecker wrote: 2013/4/23 Jacob Shin jacob.s...@amd.com: @@ -286,7 +286,10 @@ struct perf_event_attr { __u64 config1; /* extension of config */ }; union { - __u64 bp_len; + struct { + __u32 bp_len; + __u32 bp_addr_mask; + }; Do we need len and mask to work at the same time? I can't think of a situation when len and mask mix up together in a useful way to define a range. Okay, we can make it: union { __u64 bp_len; __u64 bp_addr_mask; __config2; }; And in x86, bp_len != HW_BREAKPOINT_LEN_1,2,4,8 will be interpreted as bp_addr_mask. And it would be nice (I think) if we could simply turn bp_len into bp_mask. It is already the mask actually, bp_addr should be aligned. But I do not see how we can do this, so I guess we need another field. Well. Another option is to extend bp_len. Fortunately HW_BREAKPOINT_LEN_* match the length, so we can simply allow any 2^n length and amd.c can translate it into the mask. Okay, this is nice because we can just ride on top of what already exits, but ... Of course, this doesn't allow to use, say, mask=0xF0. But perhaps this is not really useful? Exactly .. Right I think most of the time we are trying to trap on range of contiguous addresses, but .. mask of 0xf0 allows us to trap on 16 byte aligned addresses: addr of 0x1000 and mask of 0xf0 will count accesses to: 0x1000, 0x1010, 0x1020, .. 0x10e0, 0x10f0 Maybe there is some big blob of data and user wants to see how many times 16 byte aligned addresses get hit. This might be not as common, but it is plausible no? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/24, Jacob Shin wrote: And only x86 hw_breakpoint will know what .config = 0xf means for x86 and do the right thing. For ARM, 0xf will mean something different. The downside is that in userland perf tool we need differing documentation on what the mask syntax means for each architecture. Personally I think this is acceptable. But I am new to this code, so... Oleg. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25/2013 10:06 AM, Oleg Nesterov wrote: The downside is that in userland perf tool we need differing documentation on what the mask syntax means for each architecture. Personally I think this is acceptable. But I am new to this code, so... That would seem really, really awkward. Yes, perf has a bunch of low-level stuff, but it would seem highly undesirable to force the user to deal with something like that. It would be good to have a user-friendly syntax that covers most of what users may want to do and perhaps a longer form that can express everything including ARM's byte selects; if the system can't honor the request it should return an error. -hpa -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/25, Jacob Shin wrote: On Thu, Apr 25, 2013 at 05:10:35PM +0200, Oleg Nesterov wrote: On 04/25, Frederic Weisbecker wrote: Do we need len and mask to work at the same time? I can't think of a situation when len and mask mix up together in a useful way to define a range. Okay, we can make it: union { __u64 bp_len; __u64 bp_addr_mask; __config2; }; And in x86, bp_len != HW_BREAKPOINT_LEN_1,2,4,8 will be interpreted as bp_addr_mask. I think this can work too. And this needs almost the same changes as extending -bp_len. Well. Another option is to extend bp_len. Fortunately HW_BREAKPOINT_LEN_* match the length, so we can simply allow any 2^n length and amd.c can translate it into the mask. Okay, this is nice because we can just ride on top of what already exits, but ... Yes, yes, I agree with your but. As I said from the very beginning I am not sure about this idea. addr of 0x1000 and mask of 0xf0 will count accesses to: 0x1000, 0x1010, 0x1020, .. 0x10e0, 0x10f0 Maybe there is some big blob of data and user wants to see how many times 16 byte aligned addresses get hit. This might be not as common, but it is plausible no? I'd say this is certainly uncommon ;) But in any case we should not limit a user, so I agree. Oleg. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Thu, Apr 25, 2013 at 10:17:35AM -0700, H. Peter Anvin wrote: On 04/25/2013 10:06 AM, Oleg Nesterov wrote: The downside is that in userland perf tool we need differing documentation on what the mask syntax means for each architecture. Personally I think this is acceptable. But I am new to this code, so... That would seem really, really awkward. Yes, perf has a bunch of low-level stuff, but it would seem highly undesirable to force the user to deal with something like that. It would be good to have a user-friendly syntax that covers most of what users may want to do and perhaps a longer form that can express everything including ARM's byte selects; if the system can't honor the request it should return an error. Okay, If arch specific masks are a no go, then I think I'm convinced that Oleg's idea of using bp_len is the right thing to do. Right now perf userland tool hard codes bp_len to 4, so I need to modify it to allow user to override the length if desired. Oleg, Frederic, et al. Which syntax do you prefer? If we want to set bp_len to 16: $ perf stat -e mem:0x1000:rw:16 Or $ perf stat -e mem:0x1000:16 Or $ perf stat -e mem:0x1000/16 If no bp_len value is specified, it will still default to 4 as it did before. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
2013/4/23 Jacob Shin : > From: Jacob Shin > Date: Mon, 22 Apr 2013 17:02:37 -0500 > Subject: [PATCH 1/4] perf: Add hardware breakpoint address mask > > Some architectures (for us, AMD Family 16h) allow for "don't care" bit > mask to further qualify a hardware breakpoint address, in order to > trap on range of addresses. Update perf uapi to add bp_addr_mask field. It would be nice to describe a bit what this "don't care" bit mask is about. Is it a range of address/bitmask to ignore in the middle of the range we want to breakpoint in? I mean, that's confusing. > > Signed-off-by: Jacob Shin > --- > include/uapi/linux/perf_event.h |5 - > kernel/events/hw_breakpoint.c |9 + > 2 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index fb104e5..e22e1d1 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -286,7 +286,10 @@ struct perf_event_attr { > __u64 config1; /* extension of config */ > }; > union { > - __u64 bp_len; > + struct { > + __u32 bp_len; > + __u32 bp_addr_mask; > + }; Do we need len and mask to work at the same time? I can't think of a situation when len and mask mix up together in a useful way to define a range. Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Wed, Apr 24, 2013 at 10:48:53AM +0100, Will Deacon wrote: > On Tue, Apr 23, 2013 at 04:18:46PM +0100, Jacob Shin wrote: > > On Tue, Apr 23, 2013 at 04:02:40PM +0100, Will Deacon wrote: > > > On Tue, Apr 23, 2013 at 03:40:57PM +0100, Jacob Shin wrote: > > > > > perf stat -e mem:0x1000/0xf:w a.out > > > > > > Are you saying that this command would count any write to: > > > > > > 0x1000 > > > 0x1001 > > > ... > > > 0x100e > > > 0x100f > > > > > > ? > > > > > > If so, that differs from the ARM debug architecture in that the mask is > > > called > > > `byte-address-select', so a mask of 0b1001 would count accesses at +0 > > > bytes > > > and +3 bytes from the base address. Is that possible to describe with your > > > masking scheme and a single watchpoint? > > > > > > A mask of 0xf, would count +0, +1, +2 and +3 (essentially bp_len == 4). > > > > > > Unfortunately, that means I can't just invert the mask like I originally > > > thought. > > > > Ah, .. that is different . > > > > Our hardware matches on the breakpoint if: > > > > (physical_address & ~bp_addr_mask) == (bp_addr & ~bp_addr_mask) > > > > In other words, the mask says which of the bp_addr bits hardware should > > ignore when matching. > > > > .. it would be great if we can come up with userland interface that works > > for both archs. I'm coming up empty at the moment .. > > After a bit of thought, I *think* the ARM mechanism is more expressive, and > you could describe your masking in terms of byte-address-select. The > downside is that we'd end up with much larger masks, which could be argued > as counter-intuitive from a user's point of view. Hm .. Oleg, any thoughts on this? bp_addr is encoded into attr.config1 and bp_len in attr.config2. I'm not sure if attr.config is being used or not, but if not, we could say that architecture specific "stuff" goes into attr.config. And in perf tools: perf stat -e mem:0x1000/0xf:w a.out Will just translate into setting attr.config = 0xf. And only x86 hw_breakpoint will know what .config = 0xf means for x86 and do the right thing. For ARM, 0xf will mean something different. The downside is that in userland perf tool we need differing documentation on what the mask syntax means for each architecture. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 04:18:46PM +0100, Jacob Shin wrote: > On Tue, Apr 23, 2013 at 04:02:40PM +0100, Will Deacon wrote: > > On Tue, Apr 23, 2013 at 03:40:57PM +0100, Jacob Shin wrote: > > > > perf stat -e mem:0x1000/0xf:w a.out > > > > Are you saying that this command would count any write to: > > > > 0x1000 > > 0x1001 > > ... > > 0x100e > > 0x100f > > > > ? > > > > If so, that differs from the ARM debug architecture in that the mask is > > called > > `byte-address-select', so a mask of 0b1001 would count accesses at +0 bytes > > and +3 bytes from the base address. Is that possible to describe with your > > masking scheme and a single watchpoint? > > > > A mask of 0xf, would count +0, +1, +2 and +3 (essentially bp_len == 4). > > > > Unfortunately, that means I can't just invert the mask like I originally > > thought. > > Ah, .. that is different . > > Our hardware matches on the breakpoint if: > > (physical_address & ~bp_addr_mask) == (bp_addr & ~bp_addr_mask) > > In other words, the mask says which of the bp_addr bits hardware should > ignore when matching. > > .. it would be great if we can come up with userland interface that works > for both archs. I'm coming up empty at the moment .. After a bit of thought, I *think* the ARM mechanism is more expressive, and you could describe your masking in terms of byte-address-select. The downside is that we'd end up with much larger masks, which could be argued as counter-intuitive from a user's point of view. Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 04:18:46PM +0100, Jacob Shin wrote: On Tue, Apr 23, 2013 at 04:02:40PM +0100, Will Deacon wrote: On Tue, Apr 23, 2013 at 03:40:57PM +0100, Jacob Shin wrote: perf stat -e mem:0x1000/0xf:w a.out Are you saying that this command would count any write to: 0x1000 0x1001 ... 0x100e 0x100f ? If so, that differs from the ARM debug architecture in that the mask is called `byte-address-select', so a mask of 0b1001 would count accesses at +0 bytes and +3 bytes from the base address. Is that possible to describe with your masking scheme and a single watchpoint? A mask of 0xf, would count +0, +1, +2 and +3 (essentially bp_len == 4). Unfortunately, that means I can't just invert the mask like I originally thought. Ah, .. that is different . Our hardware matches on the breakpoint if: (physical_address ~bp_addr_mask) == (bp_addr ~bp_addr_mask) In other words, the mask says which of the bp_addr bits hardware should ignore when matching. .. it would be great if we can come up with userland interface that works for both archs. I'm coming up empty at the moment .. After a bit of thought, I *think* the ARM mechanism is more expressive, and you could describe your masking in terms of byte-address-select. The downside is that we'd end up with much larger masks, which could be argued as counter-intuitive from a user's point of view. Will -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Wed, Apr 24, 2013 at 10:48:53AM +0100, Will Deacon wrote: On Tue, Apr 23, 2013 at 04:18:46PM +0100, Jacob Shin wrote: On Tue, Apr 23, 2013 at 04:02:40PM +0100, Will Deacon wrote: On Tue, Apr 23, 2013 at 03:40:57PM +0100, Jacob Shin wrote: perf stat -e mem:0x1000/0xf:w a.out Are you saying that this command would count any write to: 0x1000 0x1001 ... 0x100e 0x100f ? If so, that differs from the ARM debug architecture in that the mask is called `byte-address-select', so a mask of 0b1001 would count accesses at +0 bytes and +3 bytes from the base address. Is that possible to describe with your masking scheme and a single watchpoint? A mask of 0xf, would count +0, +1, +2 and +3 (essentially bp_len == 4). Unfortunately, that means I can't just invert the mask like I originally thought. Ah, .. that is different . Our hardware matches on the breakpoint if: (physical_address ~bp_addr_mask) == (bp_addr ~bp_addr_mask) In other words, the mask says which of the bp_addr bits hardware should ignore when matching. .. it would be great if we can come up with userland interface that works for both archs. I'm coming up empty at the moment .. After a bit of thought, I *think* the ARM mechanism is more expressive, and you could describe your masking in terms of byte-address-select. The downside is that we'd end up with much larger masks, which could be argued as counter-intuitive from a user's point of view. Hm .. Oleg, any thoughts on this? bp_addr is encoded into attr.config1 and bp_len in attr.config2. I'm not sure if attr.config is being used or not, but if not, we could say that architecture specific stuff goes into attr.config. And in perf tools: perf stat -e mem:0x1000/0xf:w a.out Will just translate into setting attr.config = 0xf. And only x86 hw_breakpoint will know what .config = 0xf means for x86 and do the right thing. For ARM, 0xf will mean something different. The downside is that in userland perf tool we need differing documentation on what the mask syntax means for each architecture. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
2013/4/23 Jacob Shin jacob.s...@amd.com: From: Jacob Shin jacob.s...@amd.com Date: Mon, 22 Apr 2013 17:02:37 -0500 Subject: [PATCH 1/4] perf: Add hardware breakpoint address mask Some architectures (for us, AMD Family 16h) allow for don't care bit mask to further qualify a hardware breakpoint address, in order to trap on range of addresses. Update perf uapi to add bp_addr_mask field. It would be nice to describe a bit what this don't care bit mask is about. Is it a range of address/bitmask to ignore in the middle of the range we want to breakpoint in? I mean, that's confusing. Signed-off-by: Jacob Shin jacob.s...@amd.com --- include/uapi/linux/perf_event.h |5 - kernel/events/hw_breakpoint.c |9 + 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index fb104e5..e22e1d1 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -286,7 +286,10 @@ struct perf_event_attr { __u64 config1; /* extension of config */ }; union { - __u64 bp_len; + struct { + __u32 bp_len; + __u32 bp_addr_mask; + }; Do we need len and mask to work at the same time? I can't think of a situation when len and mask mix up together in a useful way to define a range. Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 04:02:40PM +0100, Will Deacon wrote: > On Tue, Apr 23, 2013 at 03:40:57PM +0100, Jacob Shin wrote: > > On Tue, Apr 23, 2013 at 09:34:23AM -0500, Jacob Shin wrote: > > > On Tue, Apr 23, 2013 at 10:54:37AM +0100, Will Deacon wrote: > > > > Can you see a problem if I simply invert the mask? > > > > > > That's great! No, I don't see a problem at all. > > Great! The GDB folks have been asking for this, so I can finally make them > go away now :) > > > > I guess now it can be debated if the mask coming in from userland should > > > be include or exclude mask. But I think exclude makes syntax easier: > > > > > > To count writes to [0x1000 ~ 0x1010) > > > > > > Include mask (my current patchset): > > ^^^ > > Exclude (I mean ..) > > > > > > perf stat -e mem:0x1000/0xf:w a.out > > Are you saying that this command would count any write to: > > 0x1000 > 0x1001 > ... > 0x100e > 0x100f > > ? > > If so, that differs from the ARM debug architecture in that the mask is called > `byte-address-select', so a mask of 0b1001 would count accesses at +0 bytes > and +3 bytes from the base address. Is that possible to describe with your > masking scheme and a single watchpoint? > > A mask of 0xf, would count +0, +1, +2 and +3 (essentially bp_len == 4). > > Unfortunately, that means I can't just invert the mask like I originally > thought. Ah, .. that is different . Our hardware matches on the breakpoint if: (physical_address & ~bp_addr_mask) == (bp_addr & ~bp_addr_mask) In other words, the mask says which of the bp_addr bits hardware should ignore when matching. .. it would be great if we can come up with userland interface that works for both archs. I'm coming up empty at the moment .. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 03:40:57PM +0100, Jacob Shin wrote: > On Tue, Apr 23, 2013 at 09:34:23AM -0500, Jacob Shin wrote: > > On Tue, Apr 23, 2013 at 10:54:37AM +0100, Will Deacon wrote: > > > Can you see a problem if I simply invert the mask? > > > > That's great! No, I don't see a problem at all. Great! The GDB folks have been asking for this, so I can finally make them go away now :) > > I guess now it can be debated if the mask coming in from userland should > > be include or exclude mask. But I think exclude makes syntax easier: > > > > To count writes to [0x1000 ~ 0x1010) > > > > Include mask (my current patchset): > ^^^ > Exclude (I mean ..) > > > > perf stat -e mem:0x1000/0xf:w a.out Are you saying that this command would count any write to: 0x1000 0x1001 ... 0x100e 0x100f ? If so, that differs from the ARM debug architecture in that the mask is called `byte-address-select', so a mask of 0b1001 would count accesses at +0 bytes and +3 bytes from the base address. Is that possible to describe with your masking scheme and a single watchpoint? A mask of 0xf, would count +0, +1, +2 and +3 (essentially bp_len == 4). Unfortunately, that means I can't just invert the mask like I originally thought. Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 09:34:23AM -0500, Jacob Shin wrote: > On Tue, Apr 23, 2013 at 10:54:37AM +0100, Will Deacon wrote: > > Hi Jacob, > > > > On Tue, Apr 23, 2013 at 08:57:02AM +0100, Jacob Shin wrote: > > > Some architectures (for us, AMD Family 16h) allow for "don't care" bit > > > mask to further qualify a hardware breakpoint address, in order to > > > trap on range of addresses. Update perf uapi to add bp_addr_mask field. > > > > arm and arm64 have a similar feature to this, whereby we currently have to > > translate the bp_len field into a mask, which is all the hardware > > understands. Unlike what you describe, our mask indicates the bytes we *are* > > interested in, but I think we could make use of the same functionality that > > you're introducing here. > > > > There are some funky restrictions on the alignment of the base address, but > > we can detect those and tell userspace where to go if it tries any funny > > stuff. > > > > Can you see a problem if I simply invert the mask? > > Hi, > > That's great! No, I don't see a problem at all. > > I guess now it can be debated if the mask coming in from userland should > be include or exclude mask. But I think exclude makes syntax easier: > > To count writes to [0x1000 ~ 0x1010) > > Include mask (my current patchset): ^^^ Exclude (I mean ..) > > perf stat -e mem:0x1000/0xf:w a.out > > Exclude mask: ^^^ Include > > perf stat -e mem:0x1000/0xfff0:w a.out > > Thanks! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 10:54:37AM +0100, Will Deacon wrote: > Hi Jacob, > > On Tue, Apr 23, 2013 at 08:57:02AM +0100, Jacob Shin wrote: > > Some architectures (for us, AMD Family 16h) allow for "don't care" bit > > mask to further qualify a hardware breakpoint address, in order to > > trap on range of addresses. Update perf uapi to add bp_addr_mask field. > > arm and arm64 have a similar feature to this, whereby we currently have to > translate the bp_len field into a mask, which is all the hardware > understands. Unlike what you describe, our mask indicates the bytes we *are* > interested in, but I think we could make use of the same functionality that > you're introducing here. > > There are some funky restrictions on the alignment of the base address, but > we can detect those and tell userspace where to go if it tries any funny > stuff. > > Can you see a problem if I simply invert the mask? Hi, That's great! No, I don't see a problem at all. I guess now it can be debated if the mask coming in from userland should be include or exclude mask. But I think exclude makes syntax easier: To count writes to [0x1000 ~ 0x1010) Include mask (my current patchset): perf stat -e mem:0x1000/0xf:w a.out Exclude mask: perf stat -e mem:0x1000/0xfff0:w a.out Thanks! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 03:18:44PM +0200, Oleg Nesterov wrote: > On 04/23, Jacob Shin wrote: > > > > +__weak int arch_validate_hwbkpt_addr_mask(struct perf_event *bp) > > +{ > > + return bp->attr.bp_addr_mask == 0; > > +} > > + > > static int validate_hw_breakpoint(struct perf_event *bp) > > { > > int ret; > > @@ -393,6 +398,10 @@ static int validate_hw_breakpoint(struct perf_event > > *bp) > > if (ret) > > return ret; > > > > + ret = arch_validate_hwbkpt_addr_mask(bp); > > + if (ret) > > + return ret; > > Well, this looks obviously wrong? > > arch_validate_hwbkpt_addr_mask() fails if bp_addr_mask == 0? and returns > "1" as the error code. > > Either it should returns something like "bp_addr_mask ? -ENOTSUPP : 0" > or the caller should do "if (!validate_hw_breakpoint()) return -ERR". You are absolutely right. I have mixed up "does this arch support address masks?" vs "is this a valid address mask?" So sorry about that. Here is the corrected patch: >From f2054e8af979f024aa2da93f80646db18492479e Mon Sep 17 00:00:00 2001 From: Jacob Shin Date: Mon, 22 Apr 2013 17:02:37 -0500 Subject: [PATCH 1/4] perf: Add hardware breakpoint address mask Some architectures (for us, AMD Family 16h) allow for "don't care" bit mask to further qualify a hardware breakpoint address, in order to trap on range of addresses. Update perf uapi to add bp_addr_mask field. Signed-off-by: Jacob Shin --- include/uapi/linux/perf_event.h |5 - kernel/events/hw_breakpoint.c |9 + 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index fb104e5..e22e1d1 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -286,7 +286,10 @@ struct perf_event_attr { __u64 config1; /* extension of config */ }; union { - __u64 bp_len; + struct { + __u32 bp_len; + __u32 bp_addr_mask; + }; __u64 config2; /* extension of config1 */ }; __u64 branch_sample_type; /* enum perf_branch_sample_type */ diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c index a64f8ae..dde8273 100644 --- a/kernel/events/hw_breakpoint.c +++ b/kernel/events/hw_breakpoint.c @@ -385,6 +385,11 @@ int dbg_release_bp_slot(struct perf_event *bp) return 0; } +__weak int arch_validate_hwbkpt_addr_mask(struct perf_event *bp) +{ + return bp->attr.bp_addr_mask ? -EOPNOTSUPP : 0; +} + static int validate_hw_breakpoint(struct perf_event *bp) { int ret; @@ -393,6 +398,10 @@ static int validate_hw_breakpoint(struct perf_event *bp) if (ret) return ret; + ret = arch_validate_hwbkpt_addr_mask(bp); + if (ret) + return ret; + if (arch_check_bp_in_kernelspace(bp)) { if (bp->attr.exclude_kernel) return -EINVAL; -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/23, Jacob Shin wrote: > > +__weak int arch_validate_hwbkpt_addr_mask(struct perf_event *bp) > +{ > + return bp->attr.bp_addr_mask == 0; > +} > + > static int validate_hw_breakpoint(struct perf_event *bp) > { > int ret; > @@ -393,6 +398,10 @@ static int validate_hw_breakpoint(struct perf_event *bp) > if (ret) > return ret; > > + ret = arch_validate_hwbkpt_addr_mask(bp); > + if (ret) > + return ret; Well, this looks obviously wrong? arch_validate_hwbkpt_addr_mask() fails if bp_addr_mask == 0? and returns "1" as the error code. Either it should returns something like "bp_addr_mask ? -ENOTSUPP : 0" or the caller should do "if (!validate_hw_breakpoint()) return -ERR". Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
Hi Jacob, On Tue, Apr 23, 2013 at 08:57:02AM +0100, Jacob Shin wrote: > Some architectures (for us, AMD Family 16h) allow for "don't care" bit > mask to further qualify a hardware breakpoint address, in order to > trap on range of addresses. Update perf uapi to add bp_addr_mask field. arm and arm64 have a similar feature to this, whereby we currently have to translate the bp_len field into a mask, which is all the hardware understands. Unlike what you describe, our mask indicates the bytes we *are* interested in, but I think we could make use of the same functionality that you're introducing here. There are some funky restrictions on the alignment of the base address, but we can detect those and tell userspace where to go if it tries any funny stuff. Can you see a problem if I simply invert the mask? Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH V2 1/4] perf: Add hardware breakpoint address mask
Some architectures (for us, AMD Family 16h) allow for "don't care" bit mask to further qualify a hardware breakpoint address, in order to trap on range of addresses. Update perf uapi to add bp_addr_mask field. Signed-off-by: Jacob Shin --- include/uapi/linux/perf_event.h |5 - kernel/events/hw_breakpoint.c |9 + 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index fb104e5..e22e1d1 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -286,7 +286,10 @@ struct perf_event_attr { __u64 config1; /* extension of config */ }; union { - __u64 bp_len; + struct { + __u32 bp_len; + __u32 bp_addr_mask; + }; __u64 config2; /* extension of config1 */ }; __u64 branch_sample_type; /* enum perf_branch_sample_type */ diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c index a64f8ae..c454880 100644 --- a/kernel/events/hw_breakpoint.c +++ b/kernel/events/hw_breakpoint.c @@ -385,6 +385,11 @@ int dbg_release_bp_slot(struct perf_event *bp) return 0; } +__weak int arch_validate_hwbkpt_addr_mask(struct perf_event *bp) +{ + return bp->attr.bp_addr_mask == 0; +} + static int validate_hw_breakpoint(struct perf_event *bp) { int ret; @@ -393,6 +398,10 @@ static int validate_hw_breakpoint(struct perf_event *bp) if (ret) return ret; + ret = arch_validate_hwbkpt_addr_mask(bp); + if (ret) + return ret; + if (arch_check_bp_in_kernelspace(bp)) { if (bp->attr.exclude_kernel) return -EINVAL; -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH V2 1/4] perf: Add hardware breakpoint address mask
Some architectures (for us, AMD Family 16h) allow for don't care bit mask to further qualify a hardware breakpoint address, in order to trap on range of addresses. Update perf uapi to add bp_addr_mask field. Signed-off-by: Jacob Shin jacob.s...@amd.com --- include/uapi/linux/perf_event.h |5 - kernel/events/hw_breakpoint.c |9 + 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index fb104e5..e22e1d1 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -286,7 +286,10 @@ struct perf_event_attr { __u64 config1; /* extension of config */ }; union { - __u64 bp_len; + struct { + __u32 bp_len; + __u32 bp_addr_mask; + }; __u64 config2; /* extension of config1 */ }; __u64 branch_sample_type; /* enum perf_branch_sample_type */ diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c index a64f8ae..c454880 100644 --- a/kernel/events/hw_breakpoint.c +++ b/kernel/events/hw_breakpoint.c @@ -385,6 +385,11 @@ int dbg_release_bp_slot(struct perf_event *bp) return 0; } +__weak int arch_validate_hwbkpt_addr_mask(struct perf_event *bp) +{ + return bp-attr.bp_addr_mask == 0; +} + static int validate_hw_breakpoint(struct perf_event *bp) { int ret; @@ -393,6 +398,10 @@ static int validate_hw_breakpoint(struct perf_event *bp) if (ret) return ret; + ret = arch_validate_hwbkpt_addr_mask(bp); + if (ret) + return ret; + if (arch_check_bp_in_kernelspace(bp)) { if (bp-attr.exclude_kernel) return -EINVAL; -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
Hi Jacob, On Tue, Apr 23, 2013 at 08:57:02AM +0100, Jacob Shin wrote: Some architectures (for us, AMD Family 16h) allow for don't care bit mask to further qualify a hardware breakpoint address, in order to trap on range of addresses. Update perf uapi to add bp_addr_mask field. arm and arm64 have a similar feature to this, whereby we currently have to translate the bp_len field into a mask, which is all the hardware understands. Unlike what you describe, our mask indicates the bytes we *are* interested in, but I think we could make use of the same functionality that you're introducing here. There are some funky restrictions on the alignment of the base address, but we can detect those and tell userspace where to go if it tries any funny stuff. Can you see a problem if I simply invert the mask? Will -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On 04/23, Jacob Shin wrote: +__weak int arch_validate_hwbkpt_addr_mask(struct perf_event *bp) +{ + return bp-attr.bp_addr_mask == 0; +} + static int validate_hw_breakpoint(struct perf_event *bp) { int ret; @@ -393,6 +398,10 @@ static int validate_hw_breakpoint(struct perf_event *bp) if (ret) return ret; + ret = arch_validate_hwbkpt_addr_mask(bp); + if (ret) + return ret; Well, this looks obviously wrong? arch_validate_hwbkpt_addr_mask() fails if bp_addr_mask == 0? and returns 1 as the error code. Either it should returns something like bp_addr_mask ? -ENOTSUPP : 0 or the caller should do if (!validate_hw_breakpoint()) return -ERR. Oleg. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 03:18:44PM +0200, Oleg Nesterov wrote: On 04/23, Jacob Shin wrote: +__weak int arch_validate_hwbkpt_addr_mask(struct perf_event *bp) +{ + return bp-attr.bp_addr_mask == 0; +} + static int validate_hw_breakpoint(struct perf_event *bp) { int ret; @@ -393,6 +398,10 @@ static int validate_hw_breakpoint(struct perf_event *bp) if (ret) return ret; + ret = arch_validate_hwbkpt_addr_mask(bp); + if (ret) + return ret; Well, this looks obviously wrong? arch_validate_hwbkpt_addr_mask() fails if bp_addr_mask == 0? and returns 1 as the error code. Either it should returns something like bp_addr_mask ? -ENOTSUPP : 0 or the caller should do if (!validate_hw_breakpoint()) return -ERR. You are absolutely right. I have mixed up does this arch support address masks? vs is this a valid address mask? So sorry about that. Here is the corrected patch: From f2054e8af979f024aa2da93f80646db18492479e Mon Sep 17 00:00:00 2001 From: Jacob Shin jacob.s...@amd.com Date: Mon, 22 Apr 2013 17:02:37 -0500 Subject: [PATCH 1/4] perf: Add hardware breakpoint address mask Some architectures (for us, AMD Family 16h) allow for don't care bit mask to further qualify a hardware breakpoint address, in order to trap on range of addresses. Update perf uapi to add bp_addr_mask field. Signed-off-by: Jacob Shin jacob.s...@amd.com --- include/uapi/linux/perf_event.h |5 - kernel/events/hw_breakpoint.c |9 + 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index fb104e5..e22e1d1 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -286,7 +286,10 @@ struct perf_event_attr { __u64 config1; /* extension of config */ }; union { - __u64 bp_len; + struct { + __u32 bp_len; + __u32 bp_addr_mask; + }; __u64 config2; /* extension of config1 */ }; __u64 branch_sample_type; /* enum perf_branch_sample_type */ diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c index a64f8ae..dde8273 100644 --- a/kernel/events/hw_breakpoint.c +++ b/kernel/events/hw_breakpoint.c @@ -385,6 +385,11 @@ int dbg_release_bp_slot(struct perf_event *bp) return 0; } +__weak int arch_validate_hwbkpt_addr_mask(struct perf_event *bp) +{ + return bp-attr.bp_addr_mask ? -EOPNOTSUPP : 0; +} + static int validate_hw_breakpoint(struct perf_event *bp) { int ret; @@ -393,6 +398,10 @@ static int validate_hw_breakpoint(struct perf_event *bp) if (ret) return ret; + ret = arch_validate_hwbkpt_addr_mask(bp); + if (ret) + return ret; + if (arch_check_bp_in_kernelspace(bp)) { if (bp-attr.exclude_kernel) return -EINVAL; -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 10:54:37AM +0100, Will Deacon wrote: Hi Jacob, On Tue, Apr 23, 2013 at 08:57:02AM +0100, Jacob Shin wrote: Some architectures (for us, AMD Family 16h) allow for don't care bit mask to further qualify a hardware breakpoint address, in order to trap on range of addresses. Update perf uapi to add bp_addr_mask field. arm and arm64 have a similar feature to this, whereby we currently have to translate the bp_len field into a mask, which is all the hardware understands. Unlike what you describe, our mask indicates the bytes we *are* interested in, but I think we could make use of the same functionality that you're introducing here. There are some funky restrictions on the alignment of the base address, but we can detect those and tell userspace where to go if it tries any funny stuff. Can you see a problem if I simply invert the mask? Hi, That's great! No, I don't see a problem at all. I guess now it can be debated if the mask coming in from userland should be include or exclude mask. But I think exclude makes syntax easier: To count writes to [0x1000 ~ 0x1010) Include mask (my current patchset): perf stat -e mem:0x1000/0xf:w a.out Exclude mask: perf stat -e mem:0x1000/0xfff0:w a.out Thanks! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 09:34:23AM -0500, Jacob Shin wrote: On Tue, Apr 23, 2013 at 10:54:37AM +0100, Will Deacon wrote: Hi Jacob, On Tue, Apr 23, 2013 at 08:57:02AM +0100, Jacob Shin wrote: Some architectures (for us, AMD Family 16h) allow for don't care bit mask to further qualify a hardware breakpoint address, in order to trap on range of addresses. Update perf uapi to add bp_addr_mask field. arm and arm64 have a similar feature to this, whereby we currently have to translate the bp_len field into a mask, which is all the hardware understands. Unlike what you describe, our mask indicates the bytes we *are* interested in, but I think we could make use of the same functionality that you're introducing here. There are some funky restrictions on the alignment of the base address, but we can detect those and tell userspace where to go if it tries any funny stuff. Can you see a problem if I simply invert the mask? Hi, That's great! No, I don't see a problem at all. I guess now it can be debated if the mask coming in from userland should be include or exclude mask. But I think exclude makes syntax easier: To count writes to [0x1000 ~ 0x1010) Include mask (my current patchset): ^^^ Exclude (I mean ..) perf stat -e mem:0x1000/0xf:w a.out Exclude mask: ^^^ Include perf stat -e mem:0x1000/0xfff0:w a.out Thanks! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 03:40:57PM +0100, Jacob Shin wrote: On Tue, Apr 23, 2013 at 09:34:23AM -0500, Jacob Shin wrote: On Tue, Apr 23, 2013 at 10:54:37AM +0100, Will Deacon wrote: Can you see a problem if I simply invert the mask? That's great! No, I don't see a problem at all. Great! The GDB folks have been asking for this, so I can finally make them go away now :) I guess now it can be debated if the mask coming in from userland should be include or exclude mask. But I think exclude makes syntax easier: To count writes to [0x1000 ~ 0x1010) Include mask (my current patchset): ^^^ Exclude (I mean ..) perf stat -e mem:0x1000/0xf:w a.out Are you saying that this command would count any write to: 0x1000 0x1001 ... 0x100e 0x100f ? If so, that differs from the ARM debug architecture in that the mask is called `byte-address-select', so a mask of 0b1001 would count accesses at +0 bytes and +3 bytes from the base address. Is that possible to describe with your masking scheme and a single watchpoint? A mask of 0xf, would count +0, +1, +2 and +3 (essentially bp_len == 4). Unfortunately, that means I can't just invert the mask like I originally thought. Will -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask
On Tue, Apr 23, 2013 at 04:02:40PM +0100, Will Deacon wrote: On Tue, Apr 23, 2013 at 03:40:57PM +0100, Jacob Shin wrote: On Tue, Apr 23, 2013 at 09:34:23AM -0500, Jacob Shin wrote: On Tue, Apr 23, 2013 at 10:54:37AM +0100, Will Deacon wrote: Can you see a problem if I simply invert the mask? That's great! No, I don't see a problem at all. Great! The GDB folks have been asking for this, so I can finally make them go away now :) I guess now it can be debated if the mask coming in from userland should be include or exclude mask. But I think exclude makes syntax easier: To count writes to [0x1000 ~ 0x1010) Include mask (my current patchset): ^^^ Exclude (I mean ..) perf stat -e mem:0x1000/0xf:w a.out Are you saying that this command would count any write to: 0x1000 0x1001 ... 0x100e 0x100f ? If so, that differs from the ARM debug architecture in that the mask is called `byte-address-select', so a mask of 0b1001 would count accesses at +0 bytes and +3 bytes from the base address. Is that possible to describe with your masking scheme and a single watchpoint? A mask of 0xf, would count +0, +1, +2 and +3 (essentially bp_len == 4). Unfortunately, that means I can't just invert the mask like I originally thought. Ah, .. that is different . Our hardware matches on the breakpoint if: (physical_address ~bp_addr_mask) == (bp_addr ~bp_addr_mask) In other words, the mask says which of the bp_addr bits hardware should ignore when matching. .. it would be great if we can come up with userland interface that works for both archs. I'm coming up empty at the moment .. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/