#2 is what I really don't understand.
I worry something else is going on there
On February 25, 2014 6:07:51 AM PST, Vince Weaver
wrote:
>On Mon, 24 Feb 2014, H. Peter Anvin wrote:
>
>> On 02/24/2014 11:30 AM, Peter Zijlstra wrote:
>> > On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt
On Mon, 24 Feb 2014, H. Peter Anvin wrote:
> On 02/24/2014 11:30 AM, Peter Zijlstra wrote:
> > On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
> >> Ah, and x86_64 saves off the cr2 register when entering NMI and restores
> >> it before returning. But it seems to be missing from
On Mon, 24 Feb 2014, H. Peter Anvin wrote:
On 02/24/2014 11:30 AM, Peter Zijlstra wrote:
On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
Ah, and x86_64 saves off the cr2 register when entering NMI and restores
it before returning. But it seems to be missing from the i386
#2 is what I really don't understand.
I worry something else is going on there
On February 25, 2014 6:07:51 AM PST, Vince Weaver vincent.wea...@maine.edu
wrote:
On Mon, 24 Feb 2014, H. Peter Anvin wrote:
On 02/24/2014 11:30 AM, Peter Zijlstra wrote:
On Mon, Feb 24, 2014 at 02:13:29PM -0500,
On Tue, 25 Feb 2014 06:34:55 -0800
H. Peter Anvin h...@zytor.com wrote:
#2 is what I really don't understand.
I worry something else is going on there
Yeah, me too.
-- Steve
While the missing cr2 issue made debugging frustrating, I find the
other
aspects of the bug more serious:
On Tue, 25 Feb 2014, Steven Rostedt wrote:
On Tue, 25 Feb 2014 06:34:55 -0800
H. Peter Anvin h...@zytor.com wrote:
#2 is what I really don't understand.
I worry something else is going on there
Yeah, me too.
OK, well I'll work on isolating that next, I was hoping the segfault
On 02/24/2014 11:30 AM, Peter Zijlstra wrote:
> On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
>> Ah, and x86_64 saves off the cr2 register when entering NMI and restores
>> it before returning. But it seems to be missing from the i386 code.
>
> arch/x86/kernel/nmi.c:
>
>
On Mon, 24 Feb 2014 20:30:43 +0100
Peter Zijlstra wrote:
> On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
> > Ah, and x86_64 saves off the cr2 register when entering NMI and restores
> > it before returning. But it seems to be missing from the i386 code.
>
>
On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
> Ah, and x86_64 saves off the cr2 register when entering NMI and restores
> it before returning. But it seems to be missing from the i386 code.
arch/x86/kernel/nmi.c:
#define nmi_nesting_preprocess(regs)
On 02/24/2014 11:13 AM, Steven Rostedt wrote:
>>
>> Either way, it really seems like we have a case of CR2 leakage out of
>> the NMI context.
>
> Ah, and x86_64 saves off the cr2 register when entering NMI and restores
> it before returning. But it seems to be missing from the i386 code.
>
OK,
On Mon, 24 Feb 2014 10:34:13 -0800
"H. Peter Anvin" wrote:
> On 02/24/2014 10:07 AM, Vince Weaver wrote:
> >>
> >> Anyway I've attached the full tail end of the trace if you want to see
> >> everything that happens.
> >
> > and then I note there are *two* kernel page faults.
> >
> >
On 02/24/2014 10:07 AM, Vince Weaver wrote:
>>
>> Anyway I've attached the full tail end of the trace if you want to see
>> everything that happens.
>
> and then I note there are *two* kernel page faults.
>
> perf_fuzzer-2979 [000] 161.475924: page_fault_kernel:
>
On Mon, 24 Feb 2014, Vince Weaver wrote:
> On Mon, 24 Feb 2014, H. Peter Anvin wrote:
>
> > On 02/24/2014 09:32 AM, Vince Weaver wrote:
> > >>
> > >> Peter, does x32 have a slightly different ABI/calling convention that
> > >> would make any of these patches just slightly 'off'?
> > >
> > > I
On 02/24/2014 09:25 AM, Peter Zijlstra wrote:
>>
>> What is likely happening is the user page fault is triggering
>> code to do a "perf_callchain" dump, which is calling copy_from_user_nmi()
>> which calls copy_user_generic_string() which is somehow getting the user
>> RBP in the RDI register
On 02/24/2014 09:41 AM, Vince Weaver wrote:
> On Mon, 24 Feb 2014, Vince Weaver wrote:
>
>> I do note that
>> perf_callchain_user();
>>
>> Does
>> fp = (void __user *)regs->bp;
>>
>> ...
>>
>> bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
>>
>>
>> And in my
On Mon, Feb 24, 2014 at 12:32:39PM -0500, Vince Weaver wrote:
> I do note that
> perf_callchain_user();
>
> Does
> fp = (void __user *)regs->bp;
>
> ...
>
> bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
>
>
> And in my particular executable RBP has nothing to
On 02/24/2014 09:32 AM, Vince Weaver wrote:
>>
>> Peter, does x32 have a slightly different ABI/calling convention that
>> would make any of these patches just slightly 'off'?
>
> I do note that
> perf_callchain_user();
>
> Does
> fp = (void __user *)regs->bp;
>
> ...
>
On Mon, 24 Feb 2014, Vince Weaver wrote:
> I do note that
> perf_callchain_user();
>
> Does
> fp = (void __user *)regs->bp;
>
> ...
>
> bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
>
>
> And in my particular executable RBP has nothing to do with a frame
>
On Mon, 24 Feb 2014, Peter Zijlstra wrote:
> On Mon, Feb 24, 2014 at 12:10:44PM -0500, Vince Weaver wrote:
> > On Mon, 24 Feb 2014, H. Peter Anvin wrote:
> >
> > > On February 24, 2014 8:34:30 AM PST, Vince Weaver
> > > wrote:
> > > >On Mon, 24 Feb 2014, Vince Weaver wrote:
> > > >
> > > >>
On Mon, Feb 24, 2014 at 12:10:44PM -0500, Vince Weaver wrote:
> On Mon, 24 Feb 2014, H. Peter Anvin wrote:
>
> > On February 24, 2014 8:34:30 AM PST, Vince Weaver
> > wrote:
> > >On Mon, 24 Feb 2014, Vince Weaver wrote:
> > >
> > >> Just touching the mmap page with a write of a single byte (it
On Mon, 24 Feb 2014, H. Peter Anvin wrote:
> On February 24, 2014 8:34:30 AM PST, Vince Weaver
> wrote:
> >On Mon, 24 Feb 2014, Vince Weaver wrote:
> >
> >> Just touching the mmap page with a write of a single byte (it doesn't
> >
> >> matter where) is enough to trigger the bug.
> >
> >OK,
Ok, so the obvious question is what is at that kernel address?
On February 24, 2014 8:34:30 AM PST, Vince Weaver
wrote:
>On Mon, 24 Feb 2014, Vince Weaver wrote:
>
>> Just touching the mmap page with a write of a single byte (it doesn't
>
>> matter where) is enough to trigger the bug.
>
>OK,
On Mon, 24 Feb 2014, Vince Weaver wrote:
> Just touching the mmap page with a write of a single byte (it doesn't
> matter where) is enough to trigger the bug.
OK, investigating this more.
perf_fuzzer-2971 [000] 154.944114: page_fault_user:
address=0xf7729000 ip=0x41efab
On Sun, 23 Feb 2014, H. Peter Anvin wrote:
> So we do a write to the buffer rather immediately before this happens,
> and in particular that will update the head:
>
> rb->user_page->data_head = head;
>
> However, that doesn't explain what is going on and in particular the
> write to
On Sun, 23 Feb 2014, H. Peter Anvin wrote:
So we do a write to the buffer rather immediately before this happens,
and in particular that will update the head:
rb->user_page->data_head = head;
However, that doesn't explain what is going on and in particular the
write to whatever
On Mon, 24 Feb 2014, Vince Weaver wrote:
Just touching the mmap page with a write of a single byte (it doesn't
matter where) is enough to trigger the bug.
OK, investigating this more.
perf_fuzzer-2971 [000] 154.944114: page_fault_user:
address=0xf7729000 ip=0x41efab
Ok, so the obvious question is what is at that kernel address?
On February 24, 2014 8:34:30 AM PST, Vince Weaver vincent.wea...@maine.edu
wrote:
On Mon, 24 Feb 2014, Vince Weaver wrote:
Just touching the mmap page with a write of a single byte (it doesn't
matter where) is enough to trigger
On Mon, 24 Feb 2014, H. Peter Anvin wrote:
On February 24, 2014 8:34:30 AM PST, Vince Weaver vincent.wea...@maine.edu
wrote:
On Mon, 24 Feb 2014, Vince Weaver wrote:
Just touching the mmap page with a write of a single byte (it doesn't
matter where) is enough to trigger the bug.
On Mon, Feb 24, 2014 at 12:10:44PM -0500, Vince Weaver wrote:
On Mon, 24 Feb 2014, H. Peter Anvin wrote:
On February 24, 2014 8:34:30 AM PST, Vince Weaver
vincent.wea...@maine.edu wrote:
On Mon, 24 Feb 2014, Vince Weaver wrote:
Just touching the mmap page with a write of a single
On Mon, 24 Feb 2014, Peter Zijlstra wrote:
On Mon, Feb 24, 2014 at 12:10:44PM -0500, Vince Weaver wrote:
On Mon, 24 Feb 2014, H. Peter Anvin wrote:
On February 24, 2014 8:34:30 AM PST, Vince Weaver
vincent.wea...@maine.edu wrote:
On Mon, 24 Feb 2014, Vince Weaver wrote:
On Mon, 24 Feb 2014, Vince Weaver wrote:
I do note that
perf_callchain_user();
Does
fp = (void __user *)regs->bp;
...
bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
And in my particular executable RBP has nothing to do with a frame
pointer,
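To make the frame-pointer walk being discussed concrete, here is a minimal user-space sketch. It is not the kernel's perf_callchain_user(): user_mem, struct stack_frame_sim, copy_from_user_sim() and walk_frames() are all hypothetical stand-ins, and the copy helper simply returns the number of bytes copied, which may differ from the real copy_from_user_nmi() contract. It only shows why a base pointer that is not a frame pointer has to end the walk through a failed, bounds-checked copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the user address space: offsets 0..255 only. */
#define USER_MEM_SIZE 256
unsigned char user_mem[USER_MEM_SIZE];

/* Layout assumed for illustration: saved frame pointer, then return address. */
struct stack_frame_sim {
    uint64_t next_frame;
    uint64_t return_address;
};

/* Bounds-checked copy: returns bytes copied, or 0 if the range is invalid. */
size_t copy_from_user_sim(void *dst, uint64_t src, size_t n)
{
    if (src > USER_MEM_SIZE || n > USER_MEM_SIZE - src)
        return 0;
    memcpy(dst, user_mem + src, n);
    return n;
}

/* Follow the frame chain from bp; returns how many return addresses were stored. */
int walk_frames(uint64_t bp, uint64_t *out, int max)
{
    int depth = 0;

    while (depth < max) {
        struct stack_frame_sim frame;

        /* A bogus bp (not a frame pointer at all) fails here and ends the walk. */
        if (copy_from_user_sim(&frame, bp, sizeof(frame)) != sizeof(frame))
            break;
        out[depth++] = frame.return_address;

        /* Frames must move towards higher addresses, otherwise stop. */
        if (frame.next_frame <= bp)
            break;
        bp = frame.next_frame;
    }
    return depth;
}

/* Usage: a value such as 0xdeadbeef in bp yields an empty callchain. */
void example_bogus_bp(void)
{
    uint64_t out[4];

    assert(walk_frames(0xdeadbeef, out, 4) == 0);
}
```

The point of the sketch is only that the walker trusts nothing about bp: every dereference goes through the checked copy, so garbage in bp should cost an empty callchain and nothing else.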
On 02/24/2014 09:32 AM, Vince Weaver wrote:
Peter, does x32 have a slightly different ABI/calling convention that
would make any of these patches just slightly 'off'?
I do note that
perf_callchain_user();
Does
fp = (void __user *)regs->bp;
...
bytes =
On Mon, Feb 24, 2014 at 12:32:39PM -0500, Vince Weaver wrote:
I do note that
perf_callchain_user();
Does
fp = (void __user *)regs->bp;
...
bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
And in my particular executable RBP has nothing to do with
On 02/24/2014 09:41 AM, Vince Weaver wrote:
On Mon, 24 Feb 2014, Vince Weaver wrote:
I do note that
perf_callchain_user();
Does
fp = (void __user *)regs->bp;
...
bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
And in my particular executable RBP has
On 02/24/2014 09:25 AM, Peter Zijlstra wrote:
What is likely happening is the user page fault is triggering
code to do a perf_callchain dump, which is calling copy_from_user_nmi()
which calls copy_user_generic_string() which is somehow getting the user
RBP in the RDI register somehow?
So
On Mon, 24 Feb 2014, Vince Weaver wrote:
On Mon, 24 Feb 2014, H. Peter Anvin wrote:
On 02/24/2014 09:32 AM, Vince Weaver wrote:
Peter, does x32 have a slightly different ABI/calling convention that
would make any of these patches just slightly 'off'?
I do note that
On 02/24/2014 10:07 AM, Vince Weaver wrote:
Anyway I've attached the full tail end of the trace if you want to see
everything that happens.
and then I note there are *two* kernel page faults.
perf_fuzzer-2979 [000] 161.475924: page_fault_kernel:
address=irq_stack_union
On Mon, 24 Feb 2014 10:34:13 -0800
H. Peter Anvin h...@zytor.com wrote:
On 02/24/2014 10:07 AM, Vince Weaver wrote:
Anyway I've attached the full tail end of the trace if you want to see
everything that happens.
and then I note there are *two* kernel page faults.
On 02/24/2014 11:13 AM, Steven Rostedt wrote:
Either way, it really seems like we have a case of CR2 leakage out of
the NMI context.
Ah, and x86_64 saves off the cr2 register when entering NMI and restores
it before returning. But it seems to be missing from the i386 code.
OK, that might
On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
Ah, and x86_64 saves off the cr2 register when entering NMI and restores
it before returning. But it seems to be missing from the i386 code.
arch/x86/kernel/nmi.c:
#define nmi_nesting_preprocess(regs)
On Mon, 24 Feb 2014 20:30:43 +0100
Peter Zijlstra pet...@infradead.org wrote:
On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
Ah, and x86_64 saves off the cr2 register when entering NMI and restores
it before returning. But it seems to be missing from the i386 code.
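As an illustration of the save/restore idea described above, here is a small user-space simulation. It is a sketch under stated assumptions, not the kernel's NMI entry code: fake_cr2 stands in for the real CR2 register, and fault_inside_nmi(), nmi_unprotected() and nmi_protected() are hypothetical names. It shows how a page fault taken inside an NMI overwrites the fault address that the interrupted context was about to read, unless the NMI path puts the old value back.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the CR2 register (holds the last page-fault address). */
uintptr_t fake_cr2;

/* A page fault taken while the NMI handler runs overwrites CR2. */
void fault_inside_nmi(uintptr_t addr)
{
    fake_cr2 = addr;
}

/* NMI path with no protection: the interrupted context's CR2 is lost. */
void nmi_unprotected(uintptr_t nmi_fault_addr)
{
    fault_inside_nmi(nmi_fault_addr);
}

/* NMI path that saves CR2 on entry and restores it before returning. */
void nmi_protected(uintptr_t nmi_fault_addr)
{
    uintptr_t saved = fake_cr2;

    fault_inside_nmi(nmi_fault_addr);

    /* Only write the register back if the handler actually changed it. */
    if (fake_cr2 != saved)
        fake_cr2 = saved;
}

/* Usage: the interrupted fault address survives only the protected path. */
void example_cr2_leak(void)
{
    fake_cr2 = 0x1000;                 /* user fault address, not yet read */
    nmi_unprotected(0xdead0000);
    assert(fake_cr2 == 0xdead0000);    /* leaked out of the NMI context */

    fake_cr2 = 0x1000;
    nmi_protected(0xdead0000);
    assert(fake_cr2 == 0x1000);        /* preserved */
}
```

In the unprotected case the page-fault handler that was interrupted would go on to report the NMI's fault address instead of its own, which is the kind of leakage being described.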
On 02/24/2014 11:30 AM, Peter Zijlstra wrote:
On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
Ah, and x86_64 saves off the cr2 register when entering NMI and restores
it before returning. But it seems to be missing from the i386 code.
arch/x86/kernel/nmi.c:
#define
On 02/23/2014 07:02 PM, Vince Weaver wrote:
> On Sun, 23 Feb 2014, Vince Weaver wrote:
>>
>> and as far as I can tell nothing touches rbp again until the segfault.
>> Nothing in _memset_sse2 does as far as I can tell.
>
> I only know enough about ftrace to be dangerous, but here is what I think
On Sun, 23 Feb 2014, Vince Weaver wrote:
>
> and as far as I can tell nothing touches rbp again until the segfault.
> Nothing in _memset_sse2 does as far as I can tell.
I only know enough about ftrace to be dangerous, but here is what I think
is the trace of the problem:
perf_fuzzer-11492
On Sat, 22 Feb 2014, H. Peter Anvin wrote:
> I'd be interested in how rbp gets set, too. It might just be a
> coincidence and the value in rbp has some other meaning here.
The code in question does this:
i=find_random_active_event();
if (i<0) return;
if
On Sat, 22 Feb 2014, H. Peter Anvin wrote:
I'd be interested in how rbp gets set, too. It might just be a
coincidence and the value in rbp has some other meaning here.
The code in question does this:
i=find_random_active_event();
if (i<0) return;
if
On Sun, 23 Feb 2014, Vince Weaver wrote:
and as far as I can tell nothing touches rbp again until the segfault.
Nothing in _memset_sse2 does as far as I can tell.
I only know enough about ftrace to be dangerous, but here is what I think
is the trace of the problem:
perf_fuzzer-11492
On 02/23/2014 07:02 PM, Vince Weaver wrote:
On Sun, 23 Feb 2014, Vince Weaver wrote:
and as far as I can tell nothing touches rbp again until the segfault.
Nothing in _memset_sse2 does as far as I can tell.
I only know enough about ftrace to be dangerous, but here is what I think
is the
I'd be interested in how rbp gets set, too. It might just be a coincidence and
the value in rbp has some other meaning here.
On February 22, 2014 9:18:17 PM PST, Vince Weaver
wrote:
>On Fri, 21 Feb 2014, H. Peter Anvin wrote:
>
>> Error 6 reflects a write in userspace to a not-present page.
What are the instructions around it, by any chance?
On February 22, 2014 9:18:17 PM PST, Vince Weaver
wrote:
>On Fri, 21 Feb 2014, H. Peter Anvin wrote:
>
>> Error 6 reflects a write in userspace to a not-present page.
>>
>> Since your previous trace indicates that the value of the register in
On Fri, 21 Feb 2014, H. Peter Anvin wrote:
> Error 6 reflects a write in userspace to a not-present page.
>
> Since your previous trace indicates that the value of the register in question
> is a different one, I'm guessing that what we have here is PEBS getting
> activated. 0x120 is 2*0x90,
On Fri, 21 Feb 2014, H. Peter Anvin wrote:
Error 6 reflects a write in userspace to a not-present page.
Since your previous trace indicates that the value of the register in question
is a different one, I'm guessing that what we have here is PEBS getting
activated. 0x120 is 2*0x90, and
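For reference, a minimal sketch of how "error 6" decodes, assuming the architectural layout of the low three bits of the x86 page-fault error code. The PF_* names follow the usual convention, but struct pf_info and decode_pf_error() are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Low bits of the x86 page-fault error code. */
#define PF_PROT  (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1)  /* 0: read access,      1: write access */
#define PF_USER  (1u << 2)  /* 0: kernel-mode access, 1: user-mode access */

/* Hypothetical decoded view of the three bits above. */
struct pf_info {
    bool present;
    bool write;
    bool user;
};

struct pf_info decode_pf_error(unsigned int code)
{
    struct pf_info info = {
        .present = (code & PF_PROT) != 0,
        .write   = (code & PF_WRITE) != 0,
        .user    = (code & PF_USER) != 0,
    };
    return info;
}

/* Usage: 6 == PF_USER | PF_WRITE, with PF_PROT clear. */
void example_error_6(void)
{
    struct pf_info info = decode_pf_error(6);

    assert(info.user);      /* fault came from user mode */
    assert(info.write);     /* it was a write */
    assert(!info.present);  /* to a page that was not present */
}
```

So 6 is binary 110: user and write set, present clear, which matches the reading given above.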
What are the instructions around it, by any chance?
On February 22, 2014 9:18:17 PM PST, Vince Weaver vincent.wea...@maine.edu
wrote:
On Fri, 21 Feb 2014, H. Peter Anvin wrote:
Error 6 reflects a write in userspace to a not-present page.
Since your previous trace indicates that the value of
I'd be interested in how rbp gets set, too. It might just be a coincidence and
the value in rbp has some other meaning here.
On February 22, 2014 9:18:17 PM PST, Vince Weaver vincent.wea...@maine.edu
wrote:
On Fri, 21 Feb 2014, H. Peter Anvin wrote:
Error 6 reflects a write in userspace to a
On 02/21/2014 08:50 PM, Vince Weaver wrote:
So I changed the perf_fuzzer so when it randomly stomps all over the
perf_event_mmap_page, it uses a constant value of 0xdeadbeef rather
than a random value.
The result is below. The segfaults make a bit more sense now, it
almost looks like what is
Those are segfaults in user space, though?
On February 21, 2014 8:50:38 PM PST, Vince Weaver
wrote:
>
>So I changed the perf_fuzzer so when it randomly stomps all over the
>perf_event_mmap_page, it uses a constant value of 0xdeadbeef rather
>than a random value.
>
>The result is below. The
So I changed the perf_fuzzer so when it randomly stomps all over the
perf_event_mmap_page, it uses a constant value of 0xdeadbeef rather
than a random value.
The result is below. The segfaults make a bit more sense now, it
almost looks like what is happening is we are corrupting an address
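A rough sketch of why a constant stomp value helps, assuming nothing about the actual perf_fuzzer source: stomp_words() and looks_stomped() are hypothetical helpers. Once every overwritten word is 0xdeadbeef, any later fault address or register value containing that pattern can be traced straight back to the stomp, which a random value would not allow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STOMP_PATTERN 0xdeadbeefu

/* Overwrite a buffer with the constant pattern instead of random data. */
void stomp_words(uint32_t *buf, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        buf[i] = STOMP_PATTERN;
}

/* True if either 32-bit half of a value matches the stomp pattern. */
bool looks_stomped(uint64_t value)
{
    return (uint32_t)value == STOMP_PATTERN ||
           (uint32_t)(value >> 32) == STOMP_PATTERN;
}

/* Usage: a value read back from a stomped buffer is recognisable later. */
void example_stomp(void)
{
    uint32_t page[4] = { 0 };

    stomp_words(page, 4);
    assert(looks_stomped(page[2]));
}
```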
On Fri, 21 Feb 2014, Vince Weaver wrote:
> On Fri, 21 Feb 2014, Vince Weaver wrote:
>
> > So I'm not sure who exactly to report this to. Some perf people CC'd as
> > I trigger it while using the perf_fuzzer.
> >
> > This is with 3.14-rc3 on a core2 machine, although I've had the reboots
> >
cc'ing x32 people
On Fri, 21 Feb 2014, Vince Weaver wrote:
> So I'm not sure who exactly to report this to. Some perf people CC'd as
> I trigger it while using the perf_fuzzer.
>
> This is with 3.14-rc3 on a core2 machine, although I've had the reboots
> happen throughout at least 3.14-rc*
>
cc'ing x32 people
On Fri, 21 Feb 2014, Vince Weaver wrote:
So I'm not sure who exactly to report this to. Some perf people CC'd as
I trigger it while using the perf_fuzzer.
This is with 3.14-rc3 on a core2 machine, although I've had the reboots
happen throughout at least 3.14-rc*
I'm
On Fri, 21 Feb 2014, Vince Weaver wrote:
On Fri, 21 Feb 2014, Vince Weaver wrote:
So I'm not sure who exactly to report this to. Some perf people CC'd as
I trigger it while using the perf_fuzzer.
This is with 3.14-rc3 on a core2 machine, although I've had the reboots
happen
So I changed the perf_fuzzer so when it randomly stomps all over the
perf_event_mmap_page, it uses a constant value of 0xdeadbeef rather
than a random value.
The result is below. The segfaults make a bit more sense now, it
almost looks like what is happening is we are corrupting an address
Those are segfaults in user space, though?
On February 21, 2014 8:50:38 PM PST, Vince Weaver vincent.wea...@maine.edu
wrote:
So I changed the perf_fuzzer so when it randomly stomps all over the
perf_event_mmap_page, it uses a constant value of 0xdeadbeef rather
than a random value.
The result
On 02/21/2014 08:50 PM, Vince Weaver wrote:
So I changed the perf_fuzzer so when it randomly stomps all over the
perf_event_mmap_page, it uses a constant value of 0xdeadbeef rather
than a random value.
The result is below. The segfaults make a bit more sense now, it
almost looks like what is