Re: Lag after resume culprit found

2018-05-17 Thread Konstantin Belousov
On Thu, May 17, 2018 at 11:06:42AM +0300, Andriy Gapon wrote:
> On 17/05/2018 10:56, Johannes Lundberg wrote:
> > 
> > 
> > On Thu, May 17, 2018 at 8:46 AM, Johannes Lundberg wrote:
> > 
> > 
> > 
> > On Thu, May 17, 2018 at 7:43 AM, Andriy Gapon wrote:
> > 
> > On 17/05/2018 02:07, Johannes Lundberg wrote:
> > > https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
> > > x86 cpususpend_handler: call wbinvd after setting suspend state bits
> > 
> > That's very interesting and surprising.
> > That commit changes something that happens before suspend, it should not have
> > any effect on the system state after resume.
> > 
> > Does anyone have a theory of what could be wrong?
> > 
> > 
> > Nope but moving
> >         CPU_CLR_ATOMIC(cpu, _cpus);
> > back to the end of that scope fixes it.
> >  
> > 
> > 
> > I did some further testing.
> > Calling
> > CPU_CLR_ATOMIC(cpu, _cpus);
> > before
> > pmap_init_pat();
> >  is what "breaks" resume.
> > 
> > Is this Intel only, or does it happen on AMD as well (which this patch was
> > intended for)?
> 
> Not sure about the PAT part, but fpuresume/npxresume would affect all platforms.
> It's a bit puzzling that doing PAT manipulations on one AP while another AP is
> being brought up is problematic.  Probably there is something that I am missing.

Manipulating the PAT might affect cache consistency, since conflicting
caching attributes are then applied to the cache line holding the
suspended_cpus variable, which is already cached.  It might not be the
variable itself that causes the eventual misbehavior, but some other data
sharing that line.
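
To make the "other data sharing the line" point concrete, here is a minimal
user-space sketch.  It is not kernel code and not a proposed fix.  The 64-byte
line size, both struct layouts and the same_cache_line() helper are
assumptions made up for illustration only.

```c
#include <stdalign.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on common x86 CPUs. */
#define	SKETCH_CACHE_LINE	64

/* Return 1 if both addresses fall within the same cache line. */
static int
same_cache_line(const void *a, const void *b)
{
	return ((uintptr_t)a / SKETCH_CACHE_LINE ==
	    (uintptr_t)b / SKETCH_CACHE_LINE);
}

/* Hypothetical layout: a flag word packed next to unrelated data. */
struct packed_state {
	alignas(SKETCH_CACHE_LINE) unsigned long flags;
	unsigned long neighbour;
};

/* Hypothetical layout: each member starts a cache line of its own. */
struct padded_state {
	alignas(SKETCH_CACHE_LINE) unsigned long flags;
	alignas(SKETCH_CACHE_LINE) unsigned long neighbour;
};
```

In the packed layout, anything that disturbs the line holding flags also
disturbs neighbour.  In the padded layout the two members are independent.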
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Lag after resume culprit found

2018-05-17 Thread Andriy Gapon
On 17/05/2018 10:56, Johannes Lundberg wrote:
> 
> 
> On Thu, May 17, 2018 at 8:46 AM, Johannes Lundberg wrote:
> 
> 
> 
> On Thu, May 17, 2018 at 7:43 AM, Andriy Gapon wrote:
> 
> On 17/05/2018 02:07, Johannes Lundberg wrote:
> > https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
> > x86 cpususpend_handler: call wbinvd after setting suspend state bits
> 
> That's very interesting and surprising.
> That commit changes something that happens before suspend, it should not have
> any effect on the system state after resume.
> 
> Does anyone have a theory of what could be wrong?
> 
> 
> Nope but moving
>         CPU_CLR_ATOMIC(cpu, _cpus);
> back to the end of that scope fixes it.
>  
> 
> 
> I did some further testing.
> Calling
> CPU_CLR_ATOMIC(cpu, _cpus);
> before
> pmap_init_pat();
>  is what "breaks" resume.
> 
> Is this Intel only, or does it happen on AMD as well (which this patch was
> intended for)?

Not sure about the PAT part, but fpuresume/npxresume would affect all platforms.
It's a bit puzzling that doing PAT manipulations on one AP while another AP is
being brought up is problematic.  Probably there is something that I am missing.

Thank you very much again for zeroing in on it.

> > How to test (i915kms)
> > 
> > Start X with glxgears
> > Confirm running stable at 60 fps
> > suspend/resume (S3)
> > glxgears is now fluctuating between 10-40 fps.


-- 
Andriy Gapon


Re: Lag after resume culprit found

2018-05-17 Thread Andriy Gapon
On 17/05/2018 10:56, Andriy Gapon wrote:
> On 17/05/2018 10:46, Johannes Lundberg wrote:
>> On Thu, May 17, 2018 at 7:43 AM, Andriy Gapon wrote:
>>
>> On 17/05/2018 02:07, Johannes Lundberg wrote:
>> > https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
>> > x86 cpususpend_handler: call wbinvd after setting suspend state bits
>>
>> That's very interesting and surprising.
>> That commit changes something that happens before suspend, it should not have
>> any effect on the system state after resume.
>>
>> Does anyone have a theory of what could be wrong?
>>
>> Nope but moving
>>         CPU_CLR_ATOMIC(cpu, _cpus);
>> back to the end of that scope fixes it.
> 
> That's interesting.
> Thank you for testing it!
> And let me think about it.

Oh, I am stupid.
I intended that operation to be right after the CPU is done with restoring its
saved context, which means that it has to be after the fpuresume/npxresume block.
Could you please re-test with CPU_CLR_ATOMIC(cpu, _cpus) at that position?
And my apologies.
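
For readers following along, here is a rough user-space sketch of the ordering
described above: restore the saved context first, and only then clear the
CPU's bit.  Everything in it is a mock and an assumption.  The bitmask, the
mock_* names and the stubbed FPU restore are hypothetical stand-ins, not the
actual cpususpend_handler code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's set of suspended CPUs. */
static _Atomic uint64_t mock_suspended;

/* Records whether the bit was still set while the restore stub ran. */
static bool bit_set_during_restore;

/* cpu must be below 64 in this sketch. */
static bool
mock_is_suspended(unsigned cpu)
{
	return (((atomic_load(&mock_suspended) >> cpu) & 1) != 0);
}

static void
mock_mark_suspended(unsigned cpu)
{
	atomic_fetch_or(&mock_suspended, UINT64_C(1) << cpu);
}

/* Stub for the FPU restore step; the real routine reloads hardware state. */
static void
mock_fpuresume(unsigned cpu)
{
	bit_set_during_restore = mock_is_suspended(cpu);
}

/* Resume path: restore the saved context, then announce it by clearing. */
static void
mock_resume(unsigned cpu)
{
	mock_fpuresume(cpu);
	atomic_fetch_and(&mock_suspended, ~(UINT64_C(1) << cpu));
}
```

With this ordering, an observer polling the mask never sees the bit cleared
while the context restore is still in progress.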

>> > How to test (i915kms)
>> >
>> > Start X with glxgears
>> > Confirm running stable at 60 fps
>> > suspend/resume (S3)
>> > glxgears is now fluctuating between 10-40 fps.
> 


-- 
Andriy Gapon


Re: Lag after resume culprit found

2018-05-17 Thread Andriy Gapon
On 17/05/2018 10:46, Johannes Lundberg wrote:
> On Thu, May 17, 2018 at 7:43 AM, Andriy Gapon wrote:
> 
> On 17/05/2018 02:07, Johannes Lundberg wrote:
> > https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
> > x86 cpususpend_handler: call wbinvd after setting suspend state bits
> 
> That's very interesting and surprising.
> That commit changes something that happens before suspend, it should not have
> any effect on the system state after resume.
> 
> Does anyone have a theory of what could be wrong?
> 
> Nope but moving
>         CPU_CLR_ATOMIC(cpu, _cpus);
> back to the end of that scope fixes it.

That's interesting.
Thank you for testing it!
And let me think about it.

> > How to test (i915kms)
> >
> > Start X with glxgears
> > Confirm running stable at 60 fps
> > suspend/resume (S3)
> > glxgears is now fluctuating between 10-40 fps.

-- 
Andriy Gapon


Re: Lag after resume culprit found

2018-05-17 Thread Johannes Lundberg
On Thu, May 17, 2018 at 8:46 AM, Johannes Lundberg wrote:

>
>
> On Thu, May 17, 2018 at 7:43 AM, Andriy Gapon wrote:
>
>> On 17/05/2018 02:07, Johannes Lundberg wrote:
>> > https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
>> > x86 cpususpend_handler: call wbinvd after setting suspend state bits
>>
>> That's very interesting and surprising.
>> That commit changes something that happens before suspend, it should not have
>> any effect on the system state after resume.
>>
>> Does anyone have a theory of what could be wrong?
>>
>
> Nope but moving
> CPU_CLR_ATOMIC(cpu, _cpus);
> back to the end of that scope fixes it.
>
>

I did some further testing.
Calling
CPU_CLR_ATOMIC(cpu, _cpus);
before
pmap_init_pat();
 is what "breaks" resume.

Is this Intel only, or does it happen on AMD as well (which this patch was
intended for)?



>> > How to test (i915kms)
>> >
>> > Start X with glxgears
>> > Confirm running stable at 60 fps
>> > suspend/resume (S3)
>> > glxgears is now fluctuating between 10-40 fps.
>>
>>
>>
>> --
>> Andriy Gapon
>>
>
>


Re: Lag after resume culprit found

2018-05-17 Thread Johannes Lundberg
On Thu, May 17, 2018 at 7:43 AM, Andriy Gapon wrote:

> On 17/05/2018 02:07, Johannes Lundberg wrote:
> > https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
> > x86 cpususpend_handler: call wbinvd after setting suspend state bits
>
> That's very interesting and surprising.
> That commit changes something that happens before suspend, it should not have
> any effect on the system state after resume.
>
> Does anyone have a theory of what could be wrong?
>

Nope but moving
CPU_CLR_ATOMIC(cpu, _cpus);
back to the end of that scope fixes it.


>
> > How to test (i915kms)
> >
> > Start X with glxgears
> > Confirm running stable at 60 fps
> > suspend/resume (S3)
> > glxgears is now fluctuating between 10-40 fps.
>
>
>
> --
> Andriy Gapon
>


Re: Lag after resume culprit found

2018-05-17 Thread Andriy Gapon
On 17/05/2018 02:07, Johannes Lundberg wrote:
> https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
> x86 cpususpend_handler: call wbinvd after setting suspend state bits

That's very interesting and surprising.
That commit changes something that happens before suspend, it should not have
any effect on the system state after resume.

Does anyone have a theory of what could be wrong?

> How to test (i915kms)
> 
> Start X with glxgears
> Confirm running stable at 60 fps
> suspend/resume (S3)
> glxgears is now fluctuating between 10-40 fps.



-- 
Andriy Gapon