Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-09-09 Thread Pavel Machek
On Tue 2020-09-01 13:57:55, Harald Arnesen wrote:
> Still (rc3) doesn't work without the three reverts.
> 
> I'm not sure how to proceed, I cannot capture any oops, and see nothing
> obvious in any logs.

I believe this is the place when you ask Linus for reverts...

Best regards,

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-09-01 Thread Harald Arnesen
Still (rc3) doesn't work without the three reverts.

I'm not sure how to proceed, I cannot capture any oops, and see nothing
obvious in any logs.
-- 
Hilsen Harald
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-27 Thread Pavel Machek
Hi!

> >> It's a Thinkpad T520.
> > 
> > Oh, so this is a 64-bit machine? Yeah, that patch to flush vmalloc
> > ranges won't make any difference on x86-64.
> > 
> > Or are you for some reason running a 32-bit kernel on that thing? Have
> > you tried building a 64-bit one (user-space can be 32-bit, it should
> > all just work. Knock wood).
> 
> No, I run a 64-bit kernel with 64-bit userspace (Void Linux).
> Config is attached, in case anything is obvious from that.

For the record, I'm running 5.9.0-rc2-next-20200825 w/o further
patches, and it behaves okay on that 32-bit thinkpad x60.

BTW... could we get the test farms to occassionaly boot in 32-bit
mode? Those modern CPUs can still do that :-).

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-26 Thread Linus Torvalds
On Wed, Aug 26, 2020 at 1:53 PM Harald Arnesen  wrote:
>
> It's a Thinkpad T520.

Oh, so this is a 64-bit machine? Yeah, that patch to flush vmalloc
ranges won't make any difference on x86-64.

Or are you for some reason running a 32-bit kernel on that thing? Have
you tried building a 64-bit one (user-space can be 32-bit, it should
all just work. Knock wood).

   Linus
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-26 Thread Harald Arnesen
Dave Airlie [26.08.2020 22:47]:

> On Thu, 27 Aug 2020 at 06:44, Harald Arnesen  wrote:
>>
>> Linus Torvalds [26.08.2020 20:04]:
>>
>> > On Wed, Aug 26, 2020 at 2:30 AM Harald Arnesen  wrote:
>> >> Somehow related to lightdm or xfce4? However, it is a regression, since
>> >> kernel 5.8 works.
>> > Yeah, apparently there's something else wrong with the relocation changes 
>> > too.
>> >
>> > That said, does that patch at
>> >
>> >   https://lore.kernel.org/intel-gfx/20200821123746.16904-1-j...@8bytes.org/
>> >
>> > change things at all? If there are two independent bugs, maybe
>> > applying that patch might at least give you an oops that gets saved in
>> > the logs?
>> >
>> > (it might be worth waiting a bit after the machine locks up in case
>> > the machine is alive enough so sync logs after a bit.. If ssh works,
>> > that's obviously better yet)
>>
>> No, doesn't help. And I was wrong, ssh does not work at all when the
>> display locks up.
> 
> Did you say what hw you had? is it the same hw as Pavel or different?
> 
> Dave.
> 

It's a Thinkpad T520.
Output from 'lspci' attached.

-- 
Hilsen Harald
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family 
DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core 
Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series 
Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network 
Connection (Lewisville) (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family 
USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family 
High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI 
Express Root Port 1 (rev b4)
00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI 
Express Root Port 2 (rev b4)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI 
Express Root Port 4 (rev b4)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI 
Express Root Port 5 (rev b4)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family 
USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation QM67 Express Chipset LPC Controller (rev 
04)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 
6 port Mobile SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus 
Controller (rev 04)
03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
0d:00.0 System peripheral: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 08)
0d:00.3 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 PCIe IEEE 1394 Controller 
(rev 04)
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-26 Thread Dave Airlie
On Thu, 27 Aug 2020 at 06:44, Harald Arnesen  wrote:
>
> Linus Torvalds [26.08.2020 20:04]:
>
> > On Wed, Aug 26, 2020 at 2:30 AM Harald Arnesen  wrote:
> >> Somehow related to lightdm or xfce4? However, it is a regression, since
> >> kernel 5.8 works.
> > Yeah, apparently there's something else wrong with the relocation changes 
> > too.
> >
> > That said, does that patch at
> >
> >   https://lore.kernel.org/intel-gfx/20200821123746.16904-1-j...@8bytes.org/
> >
> > change things at all? If there are two independent bugs, maybe
> > applying that patch might at least give you an oops that gets saved in
> > the logs?
> >
> > (it might be worth waiting a bit after the machine locks up in case
> > the machine is alive enough so sync logs after a bit.. If ssh works,
> > that's obviously better yet)
>
> No, doesn't help. And I was wrong, ssh does not work at all when the
> display locks up.

Did you say what hw you had? is it the same hw as Pavel or different?

Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-26 Thread Harald Arnesen
Linus Torvalds [26.08.2020 20:04]:

> On Wed, Aug 26, 2020 at 2:30 AM Harald Arnesen  wrote:
>> Somehow related to lightdm or xfce4? However, it is a regression, since
>> kernel 5.8 works.
> Yeah, apparently there's something else wrong with the relocation changes too.
> 
> That said, does that patch at
> 
>   https://lore.kernel.org/intel-gfx/20200821123746.16904-1-j...@8bytes.org/
> 
> change things at all? If there are two independent bugs, maybe
> applying that patch might at least give you an oops that gets saved in
> the logs?
> 
> (it might be worth waiting a bit after the machine locks up in case
> the machine is alive enough so sync logs after a bit.. If ssh works,
> that's obviously better yet)

No, doesn't help. And I was wrong, ssh does not work at all when the
display locks up.
-- 
Hilsen Harald
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-26 Thread Linus Torvalds
On Wed, Aug 26, 2020 at 2:30 AM Harald Arnesen  wrote:
>
> Somehow related to lightdm or xfce4? However, it is a regression, since
> kernel 5.8 works.

Yeah, apparently there's something else wrong with the relocation changes too.

That said, does that patch at

  https://lore.kernel.org/intel-gfx/20200821123746.16904-1-j...@8bytes.org/

change things at all? If there are two independent bugs, maybe
applying that patch might at least give you an oops that gets saved in
the logs?

(it might be worth waiting a bit after the machine locks up in case
the machine is alive enough so sync logs after a bit.. If ssh works,
that's obviously better yet)

  Linus
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-26 Thread Harald Arnesen
Harald Arnesen [26.08.2020 10:36]:

> I was wrong about ssh working. The whole machine locks up when X starts.
> 
> A strange thing, sometimes I can log in from lightdm before it locks up,
> sometimes I cannot even use the login screen. Timing related?
> 
> If I don't start X, console login seems to work fine, and I see nothing
> obvious in the logs or kernel messages.
> 
> I will try to start just a window manager with startx instead of going
> through lightdm.

Disabled lightdm, started DE or WM from .xinitrc:

xfce4-session: Machine locks up
enlightenment: Machine works

Somehow related to lightdm or xfce4? However, it is a regression, since
kernel 5.8 works.
-- 
Hilsen Harald
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-26 Thread Harald Arnesen
Linus Torvalds [25.08.2020 20:19]:

> On Tue, Aug 25, 2020 at 9:32 AM Harald Arnesen  wrote:
>>
>> > For posterity, I'm told the fix is [1].
>> >
>> > [1] 
>> > https://lore.kernel.org/intel-gfx/20200821123746.16904-1-j...@8bytes.org/
>>
>> Doesn't fix it for me. As soon as I start XFCE, the mouse and keyboard
>> freeezes. I can still ssh into the machine
>>
>> The three reverts (763fedd6a216, 7ac2d2536dfa and 9e0f9464e2ab) fixes
>> the bug for me.
> 
> Do you get any oops or other indication of what ends up going wrong?
> Since ssh works that should be fairly easy to see.
I was wrong about ssh working. The whole machine locks up when X starts.

A strange thing, sometimes I can log in from lightdm before it locks up,
sometimes I cannot even use the login screen. Timing related?

If I don't start X, console login seems to work fine, and I see nothing
obvious in the logs or kernel messages.

I will try to start just a window manager with startx instead of going
through lightdm.
-- 
Hilsen Harald
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-25 Thread Harald Arnesen
Linus Torvalds [25.08.2020 20:19]:

>> Doesn't fix it for me. As soon as I start XFCE, the mouse and keyboard
>> freeezes. I can still ssh into the machine
>>
>> The three reverts (763fedd6a216, 7ac2d2536dfa and 9e0f9464e2ab) fixes
>> the bug for me.
> Do you get any oops or other indication of what ends up going wrong?
> Since ssh works that should be fairly easy to see.

Away from the machine now, will check tomorrow morning (CET).
-- 
Hilsen Harald
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-25 Thread Linus Torvalds
On Tue, Aug 25, 2020 at 9:32 AM Harald Arnesen  wrote:
>
> > For posterity, I'm told the fix is [1].
> >
> > [1] 
> > https://lore.kernel.org/intel-gfx/20200821123746.16904-1-j...@8bytes.org/
>
> Doesn't fix it for me. As soon as I start XFCE, the mouse and keyboard
> freeezes. I can still ssh into the machine
>
> The three reverts (763fedd6a216, 7ac2d2536dfa and 9e0f9464e2ab) fixes
> the bug for me.

Do you get any oops or other indication of what ends up going wrong?
Since ssh works that should be fairly easy to see.

Linus
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-25 Thread Harald Arnesen
Jani Nikula [25.08.2020 11:55]:

> On Fri, 21 Aug 2020, Pavel Machek  wrote:
>> On Thu 2020-08-20 09:16:18, Linus Torvalds wrote:
>>> On Thu, Aug 20, 2020 at 2:23 AM Pavel Machek  wrote:
>>> >
>>> > Yes, it seems they make things work. (Chris asked for new patch to be
>>> > tested, so I am switching to his kernel, but it survived longer than
>>> > it usually does.)
>>> 
>>> Ok, so at worst we know how to solve it, at best the reverts won't be
>>> needed because Chris' patch will fix the issue properly.
>>> 
>>> So I'll archive this thread, but remind me if this hasn't gotten
>>> sorted out in the later rc's.
>>
>> Yes, thank you, it seems we have a solution w/o the revert.
> 
> For posterity, I'm told the fix is [1].
> 
> BR,
> Jani.
> 
> 
> [1] https://lore.kernel.org/intel-gfx/20200821123746.16904-1-j...@8bytes.org/

Doesn't fix it for me. As soon as I start XFCE, the mouse and keyboard
freeezes. I can still ssh into the machine

The three reverts (763fedd6a216, 7ac2d2536dfa and 9e0f9464e2ab) fixes
the bug for me.
-- 
Hilsen Harald
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-25 Thread Jani Nikula
On Fri, 21 Aug 2020, Pavel Machek  wrote:
> On Thu 2020-08-20 09:16:18, Linus Torvalds wrote:
>> On Thu, Aug 20, 2020 at 2:23 AM Pavel Machek  wrote:
>> >
>> > Yes, it seems they make things work. (Chris asked for new patch to be
>> > tested, so I am switching to his kernel, but it survived longer than
>> > it usually does.)
>> 
>> Ok, so at worst we know how to solve it, at best the reverts won't be
>> needed because Chris' patch will fix the issue properly.
>> 
>> So I'll archive this thread, but remind me if this hasn't gotten
>> sorted out in the later rc's.
>
> Yes, thank you, it seems we have a solution w/o the revert.

For posterity, I'm told the fix is [1].

BR,
Jani.


[1] https://lore.kernel.org/intel-gfx/20200821123746.16904-1-j...@8bytes.org/


-- 
Jani Nikula, Intel Open Source Graphics Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-21 Thread Pavel Machek
On Thu 2020-08-20 09:16:18, Linus Torvalds wrote:
> On Thu, Aug 20, 2020 at 2:23 AM Pavel Machek  wrote:
> >
> > Yes, it seems they make things work. (Chris asked for new patch to be
> > tested, so I am switching to his kernel, but it survived longer than
> > it usually does.)
> 
> Ok, so at worst we know how to solve it, at best the reverts won't be
> needed because Chris' patch will fix the issue properly.
> 
> So I'll archive this thread, but remind me if this hasn't gotten
> sorted out in the later rc's.

Yes, thank you, it seems we have a solution w/o the revert.

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-20 Thread Linus Torvalds
On Thu, Aug 20, 2020 at 2:23 AM Pavel Machek  wrote:
>
> Yes, it seems they make things work. (Chris asked for new patch to be
> tested, so I am switching to his kernel, but it survived longer than
> it usually does.)

Ok, so at worst we know how to solve it, at best the reverts won't be
needed because Chris' patch will fix the issue properly.

So I'll archive this thread, but remind me if this hasn't gotten
sorted out in the later rc's.

 Linus
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-20 Thread Pavel Machek
Hi!

> > I think there's been some discussion about reverting that change for
> > other reasons, but it's quite likely the culprit.
> 
> Hmm. It reverts cleanly, but the end result doesn't work, because of
> other changes.
> 
> Reverting all of
> 
>763fedd6a216 ("drm/i915: Remove i915_gem_object_get_dirty_page()")
>7ac2d2536dfa ("drm/i915/gem: Delete unused code")
>9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only")
> 
> seems to at least build.
> 
> Pavel, does doing those three reverts make things work for you?

Yes, it seems they make things work. (Chris asked for new patch to be
tested, so I am switching to his kernel, but it survived longer than
it usually does.)

Thanks and best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-19 Thread Pavel Machek
On Tue 2020-08-18 18:59:27, Linus Torvalds wrote:
> On Tue, Aug 18, 2020 at 6:13 PM Dave Airlie  wrote:
> >
> > I think there's been some discussion about reverting that change for
> > other reasons, but it's quite likely the culprit.
> 
> Hmm. It reverts cleanly, but the end result doesn't work, because of
> other changes.
> 
> Reverting all of
> 
>763fedd6a216 ("drm/i915: Remove i915_gem_object_get_dirty_page()")
>7ac2d2536dfa ("drm/i915/gem: Delete unused code")
>9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only")
> 
> seems to at least build.
> 
> Pavel, does doing those three reverts make things work for you?

Ok, so Chris' patches resulted in (less severe?) crash, let me try this.

pavel@amd:/data/l/linux-next-32$ git reset --hard 
8eb858df0a5f6bcd371b5d5637255c987278b8c9
HEAD is now at 8eb858df0a5f Add linux-next specific files for 20200819
pavel@amd:/data/l/linux-next-32$ git revert 763fedd6a216
Performing inexact rename detection: 100% (1212316/1212316), done.
hint: Waiting for your editor to close the file... Editing file: 
/data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
/home/pavel/bin/emacsf: line 3: ed: command not found
[detached HEAD 261cbba627b7] Revert "drm/i915: Remove 
i915_gem_object_get_dirty_page()"
 2 files changed, 18 insertions(+)
pavel@amd:/data/l/linux-next-32$ git revert 7ac2d2536dfa
warning: inexact rename detection was skipped due to too many files.
warning: you may want to set your merge.renamelimit variable to at least 3877 
and retry the command.
hint: Waiting for your editor to close the file... Editing file: 
/data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
/home/pavel/bin/emacsf: line 3: ed: command not found
[detached HEAD 526af90ea811] Revert "drm/i915/gem: Delete unused code"
 1 file changed, 19 insertions(+)
pavel@amd:/data/l/linux-next-32$ git revert 9e0f9464e2ab
warning: inexact rename detection was skipped due to too many files.
warning: you may want to set your merge.renamelimit variable to at least 3877 
and retry the command.
hint: Waiting for your editor to close the file... Editing file: 
/data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
/home/pavel/bin/emacsf: line 3: ed: command not found
[detached HEAD 173e46213949] Revert "drm/i915/gem: Async GPU relocations only"
 2 files changed, 289 insertions(+), 27 deletions(-)
pavel@amd:/data/l/linux-next-32$ 

It is now running, it seems unison is the thing that usually triggers
this (due to memory pressure?). This time it survived unison (but
without chromium). I'll really know if it works in day or two.

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-19 Thread Pavel Machek
Hi!

> > I think there's been some discussion about reverting that change for
> > other reasons, but it's quite likely the culprit.
> 
> Hmm. It reverts cleanly, but the end result doesn't work, because of
> other changes.
> 
> Reverting all of
> 
>763fedd6a216 ("drm/i915: Remove i915_gem_object_get_dirty_page()")
>7ac2d2536dfa ("drm/i915/gem: Delete unused code")
>9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only")
> 
> seems to at least build.
> 
> Pavel, does doing those three reverts make things work for you?

Thanks.

I got "[PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator
on..." in my inbox; I believe that's related. Let me try those, first.

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-18 Thread Linus Torvalds
On Tue, Aug 18, 2020 at 6:13 PM Dave Airlie  wrote:
>
> I think there's been some discussion about reverting that change for
> other reasons, but it's quite likely the culprit.

Hmm. It reverts cleanly, but the end result doesn't work, because of
other changes.

Reverting all of

   763fedd6a216 ("drm/i915: Remove i915_gem_object_get_dirty_page()")
   7ac2d2536dfa ("drm/i915/gem: Delete unused code")
   9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only")

seems to at least build.

Pavel, does doing those three reverts make things work for you?

   Linus
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-18 Thread Dave Airlie
On Wed, 19 Aug 2020 at 10:38, Linus Torvalds
 wrote:
>
> Ping on this?
>
> The code disassembles to
>
>   24: 8b 85 d0 fd ff ffmov-0x230(%ebp),%eax
>   2a:* c7 03 01 00 40 10movl   $0x1041,(%ebx) <-- trapping instruction
>   30: 89 43 04  mov%eax,0x4(%ebx)
>   33: 8b 85 b4 fd ff ffmov-0x24c(%ebp),%eax
>   39: 89 43 08  mov%eax,0x8(%ebx)
>   3c: e9jmp ...
>
> which looks like is one of the cases in __reloc_entry_gpu(). I *think*
> it's this one:
>
> } else if (gen >= 3 &&
>!(IS_I915G(eb->i915) || IS_I915GM(eb->i915))) {
> *batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
> *batch++ = addr;
> *batch++ = target_addr;
>
> where that "batch" pointer is 0xf8601000, so it looks like it just
> overflowed into the next page that isn't there.
>
> The cleaned-up call trace is
>
>   drm_ioctl+0x1f4/0x38b ->
> drm_ioctl_kernel+0x87/0xd0 ->
>   i915_gem_execbuffer2_ioctl+0xdd/0x360 ->
> i915_gem_do_execbuffer+0xaab/0x2780 ->
>   eb_relocate_vma
>
> but there's a lot of inling going on, so..
>
> The obvious suspect is commit 9e0f9464e2ab ("drm/i915/gem: Async GPU
> relocations only") but that's going purely by "that seems to be the
> main relocation change this mmrge window".

I think there's been some discussion about reverting that change for
other reasons, but it's quite likely the culprit.

Maybe we can push for a revert sooner, (cc'ing more of i915 team).

Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-18 Thread Linus Torvalds
Ping on this?

The code disassembles to

  24: 8b 85 d0 fd ff ffmov-0x230(%ebp),%eax
  2a:* c7 03 01 00 40 10movl   $0x1041,(%ebx) <-- trapping instruction
  30: 89 43 04  mov%eax,0x4(%ebx)
  33: 8b 85 b4 fd ff ffmov-0x24c(%ebp),%eax
  39: 89 43 08  mov%eax,0x8(%ebx)
  3c: e9jmp ...

which looks like is one of the cases in __reloc_entry_gpu(). I *think*
it's this one:

} else if (gen >= 3 &&
   !(IS_I915G(eb->i915) || IS_I915GM(eb->i915))) {
*batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
*batch++ = addr;
*batch++ = target_addr;

where that "batch" pointer is 0xf8601000, so it looks like it just
overflowed into the next page that isn't there.

The cleaned-up call trace is

  drm_ioctl+0x1f4/0x38b ->
drm_ioctl_kernel+0x87/0xd0 ->
  i915_gem_execbuffer2_ioctl+0xdd/0x360 ->
i915_gem_do_execbuffer+0xaab/0x2780 ->
  eb_relocate_vma

but there's a lot of inling going on, so..

The obvious suspect is commit 9e0f9464e2ab ("drm/i915/gem: Async GPU
relocations only") but that's going purely by "that seems to be the
main relocation change this mmrge window".

 Linus

On Mon, Aug 17, 2020 at 9:11 AM Pavel Machek  wrote:
>
> Hi!
>
> After about half an hour of uptime, screen starts blinking on thinkpad
> x60 and machine becomes unusable.
>
> I already reported this in -next, and now it is in mainline. It is
> 32-bit x86 system.
>
>
> Pavel
>
>
> Aug 17 17:36:04 amd ovpn-castor[2828]: UDPv4 link local (bound):
> [undef]
> Aug 17 17:36:04 amd ovpn-castor[2828]: UDPv4 link remote:
> [AF_INET]87.138.219.28:1194
> Aug 17 17:36:23 amd kernel: BUG: unable to handle page fault for
> address: f8601000
> Aug 17 17:36:23 amd kernel: #PF: supervisor write access in kernel
> mode
> Aug 17 17:36:23 amd kernel: #PF: error_code(0x0002) - not-present page
> Aug 17 17:36:23 amd kernel: *pdpt = 318f2001 *pde =
> 
> Aug 17 17:36:23 amd kernel: Oops: 0002 [#1] PREEMPT SMP PTI
> Aug 17 17:36:23 amd kernel: CPU: 1 PID: 3004 Comm: Xorg Not tainted
> 5.9.0-rc1+ #86
> Aug 17 17:36:23 amd kernel: Hardware name: LENOVO 17097HU/17097HU,
> BIOS 7BETD8WW (2.19 ) 03/31
> /2011
> Aug 17 17:36:23 amd kernel: EIP: eb_relocate_vma+0xcf6/0xf20
> Aug 17 17:36:23 amd kernel: Code: e9 ff f7 ff ff c7 85 c0 fd ff ff ed
> ff ff ff c7 85 c4 fd ff
> ff ff ff ff ff 8b 85 c0 fd ff ff e9 a5 f8 ff ff 8b 85 d0 fd ff ff 
> 03 01 00 40 10 89 43 04
>  8b 85 b4 fd ff ff 89 43 08 e9 9f f7 ff
>  Aug 17 17:36:23 amd kernel: EAX: 003c306c EBX: f8601000 ECX: 00847000
>  EDX: 
>  Aug 17 17:36:23 amd kernel: ESI: 00847000 EDI:  EBP: f1947c68
>  ESP: f19479fc
>  Aug 17 17:36:23 amd kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS:
>  0068 EFLAGS: 00210246
>  Aug 17 17:36:23 amd kernel: CR0: 80050033 CR2: f8601000 CR3: 31a1e000
>  CR4: 06b0
>  Aug 17 17:36:23 amd kernel: Call Trace:
>  Aug 17 17:36:23 amd kernel: ? i915_vma_pin+0xc5/0x8c0
>  Aug 17 17:36:23 amd kernel: ? __mutex_unlock_slowpath+0x2b/0x280
>  Aug 17 17:36:23 amd kernel: ? __active_retire+0x7e/0xd0
>  Aug 17 17:36:23 amd kernel: ? mutex_unlock+0xb/0x10
>  Aug 17 17:36:23 amd kernel: ? i915_vma_pin+0xc5/0x8c0
>  Aug 17 17:36:23 amd kernel: ? __lock_acquire.isra.31+0x261/0x530
>  Aug 17 17:36:23 amd kernel: ? eb_lookup_vmas+0x1f5/0x9e0
>  Aug 17 17:36:23 amd kernel: i915_gem_do_execbuffer+0xaab/0x2780
>  Aug 17 17:36:23 amd kernel: ? _raw_spin_unlock_irqrestore+0x27/0x40
>  Aug 17 17:36:23 amd kernel: ? __lock_acquire.isra.31+0x261/0x530
>  Aug 17 17:36:23 amd kernel: ? __lock_acquire.isra.31+0x261/0x530
>  Aug 17 17:36:23 amd kernel: ? kvmalloc_node+0x69/0x70
>  Aug 17 17:36:23 amd kernel: i915_gem_execbuffer2_ioctl+0xdd/0x360
>  Aug 17 17:36:23 amd kernel: ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0
>  Aug 17 17:36:23 amd kernel: drm_ioctl_kernel+0x87/0xd0
>  Aug 17 17:36:23 amd kernel: drm_ioctl+0x1f4/0x38b
>  Aug 17 17:36:23 amd kernel: ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0
>  Aug 17 17:36:23 amd kernel: ? posix_get_monotonic_timespec+0x1c/0x90
>  Aug 17 17:36:23 amd kernel: ? ktime_get_ts64+0x7a/0x1e0
>  Aug 17 17:36:23 amd kernel: ? drm_ioctl_kernel+0xd0/0xd0
>  Aug 17 17:36:23 amd kernel: __ia32_sys_ioctl+0x1ad/0x799
>  Aug 17 17:36:23 amd kernel: ? debug_smp_processor_id+0x12/0x20
>  Aug 17 17:36:23 amd kernel: ? exit_to_user_mode_prepare+0x4f/0x100
>  Aug 17 17:36:23 amd kernel: do_int80_syscall_32+0x2c/0x40
>  Aug 17 17:36:23 amd kernel: entry_INT80_32+0x111/0x111
>  Aug 17 17:36:23 amd kernel: EIP: 0xb7fbc092
>  Aug 17 17:36:23 amd kernel: Code: 00 00 00 e9 90 ff ff ff ff a3 24 00
>  00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00
>  00 00 00 00 00 cd 80  8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b
>  1c 24 c3 8d b4 26 00
>  Aug 17 17:36:23 amd kernel: EAX: ffda EBX: 000a ECX: c0406469
>  

[Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

2020-08-17 Thread Pavel Machek
Hi!

After about half an hour of uptime, screen starts blinking on thinkpad
x60 and machine becomes unusable.

I already reported this in -next, and now it is in mainline. It is
32-bit x86 system.


Pavel


Aug 17 17:36:04 amd ovpn-castor[2828]: UDPv4 link local (bound):
[undef]
Aug 17 17:36:04 amd ovpn-castor[2828]: UDPv4 link remote:
[AF_INET]87.138.219.28:1194
Aug 17 17:36:23 amd kernel: BUG: unable to handle page fault for
address: f8601000
Aug 17 17:36:23 amd kernel: #PF: supervisor write access in kernel
mode
Aug 17 17:36:23 amd kernel: #PF: error_code(0x0002) - not-present page
Aug 17 17:36:23 amd kernel: *pdpt = 318f2001 *pde =

Aug 17 17:36:23 amd kernel: Oops: 0002 [#1] PREEMPT SMP PTI
Aug 17 17:36:23 amd kernel: CPU: 1 PID: 3004 Comm: Xorg Not tainted
5.9.0-rc1+ #86
Aug 17 17:36:23 amd kernel: Hardware name: LENOVO 17097HU/17097HU,
BIOS 7BETD8WW (2.19 ) 03/31
/2011
Aug 17 17:36:23 amd kernel: EIP: eb_relocate_vma+0xcf6/0xf20
Aug 17 17:36:23 amd kernel: Code: e9 ff f7 ff ff c7 85 c0 fd ff ff ed
ff ff ff c7 85 c4 fd ff
ff ff ff ff ff 8b 85 c0 fd ff ff e9 a5 f8 ff ff 8b 85 d0 fd ff ff 
03 01 00 40 10 89 43 04
 8b 85 b4 fd ff ff 89 43 08 e9 9f f7 ff
 Aug 17 17:36:23 amd kernel: EAX: 003c306c EBX: f8601000 ECX: 00847000
 EDX: 
 Aug 17 17:36:23 amd kernel: ESI: 00847000 EDI:  EBP: f1947c68
 ESP: f19479fc
 Aug 17 17:36:23 amd kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS:
 0068 EFLAGS: 00210246
 Aug 17 17:36:23 amd kernel: CR0: 80050033 CR2: f8601000 CR3: 31a1e000
 CR4: 06b0
 Aug 17 17:36:23 amd kernel: Call Trace:
 Aug 17 17:36:23 amd kernel: ? i915_vma_pin+0xc5/0x8c0
 Aug 17 17:36:23 amd kernel: ? __mutex_unlock_slowpath+0x2b/0x280
 Aug 17 17:36:23 amd kernel: ? __active_retire+0x7e/0xd0
 Aug 17 17:36:23 amd kernel: ? mutex_unlock+0xb/0x10
 Aug 17 17:36:23 amd kernel: ? i915_vma_pin+0xc5/0x8c0
 Aug 17 17:36:23 amd kernel: ? __lock_acquire.isra.31+0x261/0x530
 Aug 17 17:36:23 amd kernel: ? eb_lookup_vmas+0x1f5/0x9e0
 Aug 17 17:36:23 amd kernel: i915_gem_do_execbuffer+0xaab/0x2780
 Aug 17 17:36:23 amd kernel: ? _raw_spin_unlock_irqrestore+0x27/0x40
 Aug 17 17:36:23 amd kernel: ? __lock_acquire.isra.31+0x261/0x530
 Aug 17 17:36:23 amd kernel: ? __lock_acquire.isra.31+0x261/0x530
 Aug 17 17:36:23 amd kernel: ? kvmalloc_node+0x69/0x70
 Aug 17 17:36:23 amd kernel: i915_gem_execbuffer2_ioctl+0xdd/0x360
 Aug 17 17:36:23 amd kernel: ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0
 Aug 17 17:36:23 amd kernel: drm_ioctl_kernel+0x87/0xd0
 Aug 17 17:36:23 amd kernel: drm_ioctl+0x1f4/0x38b
 Aug 17 17:36:23 amd kernel: ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0
 Aug 17 17:36:23 amd kernel: ? posix_get_monotonic_timespec+0x1c/0x90
 Aug 17 17:36:23 amd kernel: ? ktime_get_ts64+0x7a/0x1e0
 Aug 17 17:36:23 amd kernel: ? drm_ioctl_kernel+0xd0/0xd0
 Aug 17 17:36:23 amd kernel: __ia32_sys_ioctl+0x1ad/0x799
 Aug 17 17:36:23 amd kernel: ? debug_smp_processor_id+0x12/0x20
 Aug 17 17:36:23 amd kernel: ? exit_to_user_mode_prepare+0x4f/0x100
 Aug 17 17:36:23 amd kernel: do_int80_syscall_32+0x2c/0x40
 Aug 17 17:36:23 amd kernel: entry_INT80_32+0x111/0x111
 Aug 17 17:36:23 amd kernel: EIP: 0xb7fbc092
 Aug 17 17:36:23 amd kernel: Code: 00 00 00 e9 90 ff ff ff ff a3 24 00
 00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00
 00 00 00 00 00 cd 80  8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b
 1c 24 c3 8d b4 26 00
 Aug 17 17:36:23 amd kernel: EAX: ffda EBX: 000a ECX: c0406469
 EDX: bff0ae3c
 Aug 17 17:36:23 amd kernel: ESI: b73aa000 EDI: c0406469 EBP: 000a
 ESP: bff0adb4
 Aug 17 17:36:23 amd kernel: DS: 007b ES: 007b FS:  GS: 0033 SS:
 007b EFLAGS: 00200296
 Aug 17 17:36:23 amd kernel: ? asm_exc_nmi+0xcc/0x2bc
 Aug 17 17:36:23 amd kernel: Modules linked in:
 Aug 17 17:36:23 amd kernel: CR2: f8601000
 Aug 17 17:36:23 amd kernel: ---[ end trace 2ca9775068bbac06 ]---
 
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx