Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 16:32, Jan Kiszka wrote: > On 2014-06-29 16:27, Gleb Natapov wrote: >> On Sun, Jun 29, 2014 at 04:01:04PM +0200, Borislav Petkov wrote: >>> On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote: Please do so and let us know. >>> >>> Yep, just did. Reverting ae9fedc793

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 16:27, Gleb Natapov wrote: > On Sun, Jun 29, 2014 at 04:01:04PM +0200, Borislav Petkov wrote: >> On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote: >>> Please do so and let us know. >> >> Yep, just did. Reverting ae9fedc793 fixes the issue. >> >>> reinj:1 means that

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sun, Jun 29, 2014 at 04:01:04PM +0200, Borislav Petkov wrote: > On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote: > > Please do so and let us know. > > Yep, just did. Reverting ae9fedc793 fixes the issue. > > > reinj:1 means that previous injection failed due to another #PF that >

Re: __schedule #DF splat

2014-06-29 Thread Borislav Petkov
On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote: > Please do so and let us know. Yep, just did. Reverting ae9fedc793 fixes the issue. > reinj:1 means that previous injection failed due to another #PF that > happened during the event injection itself This may happen if GDT or fist >

Re: __schedule #DF splat

2014-06-29 Thread Borislav Petkov
On Sun, Jun 29, 2014 at 03:14:43PM +0200, Borislav Petkov wrote: > I better go and revert that one and check whether it fixes things. Yahaaa, that was some good bisection work Jan! :-) > 20 guest restart cycles and all is fine - it used to trigger after 5 max. Phew, we have it right in time

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sun, Jun 29, 2014 at 03:14:43PM +0200, Borislav Petkov wrote: > On Sun, Jun 29, 2014 at 02:22:35PM +0200, Jan Kiszka wrote: > > OK, looks like I won ;): > > I gladly let you win. :-P > > > The issue was apparently introduced with "KVM: x86: get CPL from > > SS.DPL" (ae9fedc793). Maybe we are

Re: __schedule #DF splat

2014-06-29 Thread Borislav Petkov
On Sun, Jun 29, 2014 at 02:22:35PM +0200, Jan Kiszka wrote: > OK, looks like I won ;): I gladly let you win. :-P > The issue was apparently introduced with "KVM: x86: get CPL from > SS.DPL" (ae9fedc793). Maybe we are not properly saving or restoring > this state on SVM since then. I wonder if

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 13:51, Borislav Petkov wrote: > On Sun, Jun 29, 2014 at 12:59:30PM +0200, Jan Kiszka wrote: >> Will see what I can do regarding bisecting. That host is a bit slow >> (netbook), so it may take a while. Boris will probably beat me in >> this. > > Nah, I was about to instrument

Re: __schedule #DF splat

2014-06-29 Thread Borislav Petkov
On Sun, Jun 29, 2014 at 12:59:30PM +0200, Jan Kiszka wrote: > Will see what I can do regarding bisecting. That host is a bit slow > (netbook), so it may take a while. Boris will probably beat me in > this. Nah, I was about to instrument kvm_multiple_exception() first and am slow anyway so... :-)

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 12:53, Gleb Natapov wrote: > On Sun, Jun 29, 2014 at 12:31:50PM +0200, Jan Kiszka wrote: >> On 2014-06-29 12:24, Gleb Natapov wrote: >>> On Sun, Jun 29, 2014 at 11:56:03AM +0200, Jan Kiszka wrote: On 2014-06-29 08:46, Gleb Natapov wrote: > On Sat, Jun 28, 2014 at 01:44:31PM

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sun, Jun 29, 2014 at 12:31:50PM +0200, Jan Kiszka wrote: > On 2014-06-29 12:24, Gleb Natapov wrote: > > On Sun, Jun 29, 2014 at 11:56:03AM +0200, Jan Kiszka wrote: > >> On 2014-06-29 08:46, Gleb Natapov wrote: > >>> On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: >

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 12:24, Gleb Natapov wrote: > On Sun, Jun 29, 2014 at 11:56:03AM +0200, Jan Kiszka wrote: >> On 2014-06-29 08:46, Gleb Natapov wrote: >>> On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: qemu-system-x86-20240 [006] ...1 9406.484134: kvm_page_fault: address

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sun, Jun 29, 2014 at 11:56:03AM +0200, Jan Kiszka wrote: > On 2014-06-29 08:46, Gleb Natapov wrote: > > On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: > >> qemu-system-x86-20240 [006] ...1 9406.484134: kvm_page_fault: address > >> 7fffb62ba318 error_code 2 > >>

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 08:46, Gleb Natapov wrote: > On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: >> qemu-system-x86-20240 [006] ...1 9406.484134: kvm_page_fault: address >> 7fffb62ba318 error_code 2 >> qemu-system-x86-20240 [006] ...1 9406.484136: kvm_inj_exception: #PF (0x2)a >>

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: > qemu-system-x86-20240 [006] ...1 9406.484134: kvm_page_fault: address > 7fffb62ba318 error_code 2 > qemu-system-x86-20240 [006] ...1 9406.484136: kvm_inj_exception: #PF (0x2)a > > kvm injects the #PF into the guest. > >

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: qemu-system-x86-20240 [006] ...1 9406.484134: kvm_page_fault: address 7fffb62ba318 error_code 2 qemu-system-x86-20240 [006] ...1 9406.484136: kvm_inj_exception: #PF (0x2)a kvm injects the #PF into the guest.

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 08:46, Gleb Natapov wrote: On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: qemu-system-x86-20240 [006] ...1 9406.484134: kvm_page_fault: address 7fffb62ba318 error_code 2 qemu-system-x86-20240 [006] ...1 9406.484136: kvm_inj_exception: #PF (0x2)a kvm

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sun, Jun 29, 2014 at 11:56:03AM +0200, Jan Kiszka wrote: On 2014-06-29 08:46, Gleb Natapov wrote: On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: qemu-system-x86-20240 [006] ...1 9406.484134: kvm_page_fault: address 7fffb62ba318 error_code 2 qemu-system-x86-20240

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 12:24, Gleb Natapov wrote: On Sun, Jun 29, 2014 at 11:56:03AM +0200, Jan Kiszka wrote: On 2014-06-29 08:46, Gleb Natapov wrote: On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: qemu-system-x86-20240 [006] ...1 9406.484134: kvm_page_fault: address 7fffb62ba318

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sun, Jun 29, 2014 at 12:31:50PM +0200, Jan Kiszka wrote: On 2014-06-29 12:24, Gleb Natapov wrote: On Sun, Jun 29, 2014 at 11:56:03AM +0200, Jan Kiszka wrote: On 2014-06-29 08:46, Gleb Natapov wrote: On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav Petkov wrote: qemu-system-x86-20240

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 12:53, Gleb Natapov wrote: On Sun, Jun 29, 2014 at 12:31:50PM +0200, Jan Kiszka wrote: On 2014-06-29 12:24, Gleb Natapov wrote: On Sun, Jun 29, 2014 at 11:56:03AM +0200, Jan Kiszka wrote: On 2014-06-29 08:46, Gleb Natapov wrote: On Sat, Jun 28, 2014 at 01:44:31PM +0200, Borislav

Re: __schedule #DF splat

2014-06-29 Thread Borislav Petkov
On Sun, Jun 29, 2014 at 12:59:30PM +0200, Jan Kiszka wrote: Will see what I can do regarding bisecting. That host is a bit slow (netbook), so it may take a while. Boris will probably beat me in this. Nah, I was about to instrument kvm_multiple_exception() first and am slow anyway so... :-)

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 13:51, Borislav Petkov wrote: On Sun, Jun 29, 2014 at 12:59:30PM +0200, Jan Kiszka wrote: Will see what I can do regarding bisecting. That host is a bit slow (netbook), so it may take a while. Boris will probably beat me in this. Nah, I was about to instrument

Re: __schedule #DF splat

2014-06-29 Thread Borislav Petkov
On Sun, Jun 29, 2014 at 02:22:35PM +0200, Jan Kiszka wrote: OK, looks like I won ;): I gladly let you win. :-P The issue was apparently introduced with KVM: x86: get CPL from SS.DPL (ae9fedc793). Maybe we are not properly saving or restoring this state on SVM since then. I wonder if this

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sun, Jun 29, 2014 at 03:14:43PM +0200, Borislav Petkov wrote: On Sun, Jun 29, 2014 at 02:22:35PM +0200, Jan Kiszka wrote: OK, looks like I won ;): I gladly let you win. :-P The issue was apparently introduced with KVM: x86: get CPL from SS.DPL (ae9fedc793). Maybe we are not properly

Re: __schedule #DF splat

2014-06-29 Thread Borislav Petkov
On Sun, Jun 29, 2014 at 03:14:43PM +0200, Borislav Petkov wrote: I better go and revert that one and check whether it fixes things. Yahaaa, that was some good bisection work Jan! :-) 20 guest restart cycles and all is fine - it used to trigger after 5 max. Phew, we have it right in time

Re: __schedule #DF splat

2014-06-29 Thread Borislav Petkov
On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote: Please do so and let us know. Yep, just did. Reverting ae9fedc793 fixes the issue. reinj:1 means that previous injection failed due to another #PF that happened during the event injection itself This may happen if GDT or fist

Re: __schedule #DF splat

2014-06-29 Thread Gleb Natapov
On Sun, Jun 29, 2014 at 04:01:04PM +0200, Borislav Petkov wrote: On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote: Please do so and let us know. Yep, just did. Reverting ae9fedc793 fixes the issue. reinj:1 means that previous injection failed due to another #PF that

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 16:27, Gleb Natapov wrote: On Sun, Jun 29, 2014 at 04:01:04PM +0200, Borislav Petkov wrote: On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote: Please do so and let us know. Yep, just did. Reverting ae9fedc793 fixes the issue. reinj:1 means that previous injection

Re: __schedule #DF splat

2014-06-29 Thread Jan Kiszka
On 2014-06-29 16:32, Jan Kiszka wrote: On 2014-06-29 16:27, Gleb Natapov wrote: On Sun, Jun 29, 2014 at 04:01:04PM +0200, Borislav Petkov wrote: On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote: Please do so and let us know. Yep, just did. Reverting ae9fedc793 fixes the issue.

Re: __schedule #DF splat

2014-06-28 Thread Borislav Petkov
Ok, I rebuilt the host kernel with latest linus+tip/master and my queue. The guest kernel is v3.15-8992-g08f7cc749389 with a bunch of RAS patches. Before I start doing the coarse-grained bisection by testing -rcs and major numbers, I wanted to catch a #DF and try to analyze at least why it

Re: __schedule #DF splat

2014-06-27 Thread Borislav Petkov
On Fri, Jun 27, 2014 at 02:01:43PM +0200, Paolo Bonzini wrote: > Can you try gathering a trace? (and since those things get huge, you > can send it to me offlist) Also try without ept and see what happens. Yeah, Joerg just sent me a diff on how to intercept #DF. I'll add a tracepoint so that it

Re: __schedule #DF splat

2014-06-27 Thread Paolo Bonzini
Il 27/06/2014 13:55, Borislav Petkov ha scritto: On Fri, Jun 27, 2014 at 01:41:30PM +0200, Paolo Bonzini wrote: Il 27/06/2014 12:18, Borislav Petkov ha scritto: Joerg says I should bisect but I'm busy with other stuff. If people are interested in chasing this further, I could free up some time

Re: __schedule #DF splat

2014-06-27 Thread Borislav Petkov
On Fri, Jun 27, 2014 at 01:41:30PM +0200, Paolo Bonzini wrote: > Il 27/06/2014 12:18, Borislav Petkov ha scritto: > >Joerg says I should bisect but I'm busy with other stuff. If people are > >interested in chasing this further, I could free up some time to do > >so... > > Please first try "-M

Re: __schedule #DF splat

2014-06-27 Thread Paolo Bonzini
Il 27/06/2014 12:18, Borislav Petkov ha scritto: Joerg says I should bisect but I'm busy with other stuff. If people are interested in chasing this further, I could free up some time to do so... Please first try "-M pc-1.7" on the 2.0 QEMU. If it fails, please do bisect it. A QEMU bisection

Re: __schedule #DF splat

2014-06-27 Thread Borislav Petkov
On Wed, Jun 25, 2014 at 10:26:50PM +0200, Borislav Petkov wrote: > On Wed, Jun 25, 2014 at 05:32:28PM +0200, Borislav Petkov wrote: > > Hi guys, > > > > so I'm looking at this splat below when booting current linus+tip/master > > in a kvm guest. Initially I thought this is something related to

Re: __schedule #DF splat

2014-06-27 Thread Borislav Petkov
On Wed, Jun 25, 2014 at 10:26:50PM +0200, Borislav Petkov wrote: On Wed, Jun 25, 2014 at 05:32:28PM +0200, Borislav Petkov wrote: Hi guys, so I'm looking at this splat below when booting current linus+tip/master in a kvm guest. Initially I thought this is something related to the

Re: __schedule #DF splat

2014-06-27 Thread Paolo Bonzini
Il 27/06/2014 12:18, Borislav Petkov ha scritto: Joerg says I should bisect but I'm busy with other stuff. If people are interested in chasing this further, I could free up some time to do so... Please first try -M pc-1.7 on the 2.0 QEMU. If it fails, please do bisect it. A QEMU bisection

Re: __schedule #DF splat

2014-06-27 Thread Borislav Petkov
On Fri, Jun 27, 2014 at 01:41:30PM +0200, Paolo Bonzini wrote: Il 27/06/2014 12:18, Borislav Petkov ha scritto: Joerg says I should bisect but I'm busy with other stuff. If people are interested in chasing this further, I could free up some time to do so... Please first try -M pc-1.7 on the

Re: __schedule #DF splat

2014-06-27 Thread Paolo Bonzini
Il 27/06/2014 13:55, Borislav Petkov ha scritto: On Fri, Jun 27, 2014 at 01:41:30PM +0200, Paolo Bonzini wrote: Il 27/06/2014 12:18, Borislav Petkov ha scritto: Joerg says I should bisect but I'm busy with other stuff. If people are interested in chasing this further, I could free up some time

Re: __schedule #DF splat

2014-06-27 Thread Borislav Petkov
On Fri, Jun 27, 2014 at 02:01:43PM +0200, Paolo Bonzini wrote: Can you try gathering a trace? (and since those things get huge, you can send it to me offlist) Also try without ept and see what happens. Yeah, Joerg just sent me a diff on how to intercept #DF. I'll add a tracepoint so that it all

Re: __schedule #DF splat

2014-06-25 Thread Borislav Petkov
On Wed, Jun 25, 2014 at 05:32:28PM +0200, Borislav Petkov wrote: > Hi guys, > > so I'm looking at this splat below when booting current linus+tip/master > in a kvm guest. Initially I thought this is something related to the > PARAVIRT gunk but it happens with and without it. Ok, here's a cleaner

__schedule #DF splat

2014-06-25 Thread Borislav Petkov
Hi guys, so I'm looking at this splat below when booting current linus+tip/master in a kvm guest. Initially I thought this is something related to the PARAVIRT gunk but it happens with and without it. So, from what I can see, we first #DF and then lockdep fires a deadlock warning. That I can

Re: __schedule #DF splat

2014-06-25 Thread Borislav Petkov
On Wed, Jun 25, 2014 at 05:32:28PM +0200, Borislav Petkov wrote: Hi guys, so I'm looking at this splat below when booting current linus+tip/master in a kvm guest. Initially I thought this is something related to the PARAVIRT gunk but it happens with and without it. Ok, here's a cleaner