Re: [regression from v4.19] Re: 4.20.0-rc6-next-20181210, v4.20-rc1: list_del corruption on thinkpad x220, graphics related?
Hi!

> > > So if you could please try drm-tip reproducing AND open a bug in Bugzilla.
> > > If you are unwilling to do that, it is very difficult to help you
> > > more.
> >
> > Website says I have to read and agree to two different pieces of
> > legalesee, and I'd need to keep track of yet another password... so
> > you can "communicate" with me.
> >
> > But you can already communicate with me, over email.
>
> I've listed all the reasons why our bug handling process is what it is.
>
> If registering to the Bugzilla is too much of an effort for you, then I
> won't be able to help you further on this.

Actually I did register at the bugzilla. Only useful help there was
that CONFIG_DRM_I915_DEBUG_GEM might be useful. Unfortunately that one
seems to make it panic() and impossible to get anything useful.

https://bugs.freedesktop.org/show_bug.cgi?id=109175

Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [regression from v4.19] Re: 4.20.0-rc6-next-20181210, v4.20-rc1: list_del corruption on thinkpad x220, graphics related?
Quoting Pavel Machek (2018-12-27 10:34:39)
> Hi!
>
> > > > > If you think it is useful, I can try to update my machine to
> > > > > linux-next.
> > > >
> > > > linux-next is closer to drm-tip, so it's better. Do you have some
> > > > specific reason for not wanting to run drm-tip (but linux-next is still
> > > > ok)?
> > >
> > > I already have build/update scripts for -next, and I trust -next not
> > > to store screenshots of my desktop in my master boot record :-).
> > >
> > > Anyway, it does happen with -next. This time, chromiums were running,
> > > and crash happened minute? after I exited flightgear. It can be seen
> > > in the logs.
> > >
> > > Oh and I might want to mention -- machine was rather deep in swap this
> > > time, as in "mouse jumping when starting fgfs" and "could feel the
> > > chromium being swapped back in". I might have had this situation
> > > before, and just powercycled the machine "because it is so deep in
> > > swap that it will not recover".
> > >
> > > top says:
> > >
> > > top - 19:18:24 up 2 days, 8:03, 2 users, load average: 3.02, 3.45,
> > > 3.21
> > > Tasks: 141 total, 1 running, 86 sleeping, 0 stopped, 2 zombie
> > > %Cpu(s): 18.8 us, 7.6 sy, 3.0 ni, 68.4 id, 1.3 wa, 0.0 hi, 0.9
> > > si, 0.0 st
> > > KiB Mem: 5967968 total, 663244 used, 5304724 free,48876
> > > buffers
> > > KiB Swap: 1681428 total, 170904 used, 1510524 free. 446280
> > > cached Mem
> > >
> > > but of course that memory is free once everything died.
> > >
> > > Any ideas? Should I go back to v4.19 to see if it happens there, too?
> >
> > linux-next includes very much the same code as drm-tip. There's nobody
> > magically reviewing the code more than it is reviewed for inclusion into
> > drm-tip, when it is fed into linux-next. So thinking linux-next would be
> > some way safer is an illusion.
> >
> > It sounds like having memory pressure expedites the corruption, which
> > should make it easier to reproduce and thus fix.
> >
> > So if you could please try drm-tip reproducing AND open a bug in Bugzilla.
> > If you are unwilling to do that, it is very difficult to help you
> > more.
>
> Website says I have to read and agree to two different pieces of
> legalesee, and I'd need to keep track of yet another password... so
> you can "communicate" with me.
>
> But you can already communicate with me, over email.

I've listed all the reasons why our bug handling process is what it is.

If registering to the Bugzilla is too much of an effort for you, then I
won't be able to help you further on this.

Regards, Joonas

> I verified v4.19 is stable -- it worked ok for way more than two days
> it usually takes to crash.
>
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures)
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[regression from v4.19] Re: 4.20.0-rc6-next-20181210, v4.20-rc1: list_del corruption on thinkpad x220, graphics related?
Hi!

> > > > If you think it is useful, I can try to update my machine to
> > > > linux-next.
> > >
> > > linux-next is closer to drm-tip, so it's better. Do you have some
> > > specific reason for not wanting to run drm-tip (but linux-next is still
> > > ok)?
> >
> > I already have build/update scripts for -next, and I trust -next not
> > to store screenshots of my desktop in my master boot record :-).
> >
> > Anyway, it does happen with -next. This time, chromiums were running,
> > and crash happened minute? after I exited flightgear. It can be seen
> > in the logs.
> >
> > Oh and I might want to mention -- machine was rather deep in swap this
> > time, as in "mouse jumping when starting fgfs" and "could feel the
> > chromium being swapped back in". I might have had this situation
> > before, and just powercycled the machine "because it is so deep in
> > swap that it will not recover".
> >
> > top says:
> >
> > top - 19:18:24 up 2 days, 8:03, 2 users, load average: 3.02, 3.45,
> > 3.21
> > Tasks: 141 total, 1 running, 86 sleeping, 0 stopped, 2 zombie
> > %Cpu(s): 18.8 us, 7.6 sy, 3.0 ni, 68.4 id, 1.3 wa, 0.0 hi, 0.9
> > si, 0.0 st
> > KiB Mem: 5967968 total, 663244 used, 5304724 free,48876
> > buffers
> > KiB Swap: 1681428 total, 170904 used, 1510524 free. 446280
> > cached Mem
> >
> > but of course that memory is free once everything died.
> >
> > Any ideas? Should I go back to v4.19 to see if it happens there, too?
>
> linux-next includes very much the same code as drm-tip. There's nobody
> magically reviewing the code more than it is reviewed for inclusion into
> drm-tip, when it is fed into linux-next. So thinking linux-next would be
> some way safer is an illusion.
>
> It sounds like having memory pressure expedites the corruption, which
> should make it easier to reproduce and thus fix.
>
> So if you could please try drm-tip reproducing AND open a bug in Bugzilla.
> If you are unwilling to do that, it is very difficult to help you
> more.

Website says I have to read and agree to two different pieces of
legalesee, and I'd need to keep track of yet another password... so
you can "communicate" with me.

But you can already communicate with me, over email.

I verified v4.19 is stable -- it worked ok for way more than two days
it usually takes to crash.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: 4.20.0-rc6-next-20181210, v4.20-rc1: list_del corruption on thinkpad x220, graphics related?
Quoting Pavel Machek (2018-12-12 20:29:02)
> Hi!
>
> > > > > > > > There's one similar for nouveau in Bugzilla, but it seems like
> > > > > > > > a genuine
> > > > > > > > memory corruption (1 bit flipped):
> > > > > > > >
> > > > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=84880
> > > > > > > >
> > > > > > > > Any extra information would be of use :)
> > > > > > > >
> > > > > > > > Regards, Joonas
> > > > > > > >
> > > > > > > > PS. Could you open a bug to Bugzilla, it'll help to collect the
> > > > > > > > information in one consolidated place:
> > > > > > > >
> > > > > > > > https://01.org/linuxgraphics/documentation/how-report-bugs
> > > > > > >
> > > > > > > I prefer email... certainly for bugs that can't be reproduced.
> > > > > >
> > > > > > By adding it to the Bugzilla it may be recognized by somebody else
> > > > > > who is experiencing a similar issue. Internet points are not
> > > > > > deducted
> > > > > > for submitting bugs in good faith, even if they get closed as
> > > > > > NOTABUG.
> > > >
> > > > Well, your documentation suggests you'll deduce my internet points:
> > > >
> > > > Before filing the bug, please try to reproduce your issue with the
> > > > latest kernel. Use the latest drm-tip branch from
> > > > http://cgit.freedesktop.org/drm-tip and build as instructed on our
> > > > Build Guide.
> > > >
> > > > :-)
> > >
> > > I'd prefer not to run drm-tip. I'll update to 2.6.20-rc5+ and see if
> > > it re-appears (but it takes long time to reproduce :-().
> >
> > If we can or can not reproduce the issue with drm-tip, is a very useful
> > datapoint for us. If we can not reproduce, it'll be possible to bisect
> > which commit fixed it, and backport that. On the other hand, if it's
> > still reproducible, we know we're not spending time on something we
> > already fixed, and the priority gets a bump.
>
> bisect ... is not practical on something that takes 2 days to reproduce.
>
> > > If you think it is useful, I can try to update my machine to
> > > linux-next.
> >
> > linux-next is closer to drm-tip, so it's better. Do you have some
> > specific reason for not wanting to run drm-tip (but linux-next is still
> > ok)?
>
> I already have build/update scripts for -next, and I trust -next not
> to store screenshots of my desktop in my master boot record :-).
>
> Anyway, it does happen with -next. This time, chromiums were running,
> and crash happened minute? after I exited flightgear. It can be seen
> in the logs.
>
> Oh and I might want to mention -- machine was rather deep in swap this
> time, as in "mouse jumping when starting fgfs" and "could feel the
> chromium being swapped back in". I might have had this situation
> before, and just powercycled the machine "because it is so deep in
> swap that it will not recover".
>
> top says:
>
> top - 19:18:24 up 2 days, 8:03, 2 users, load average: 3.02, 3.45,
> 3.21
> Tasks: 141 total, 1 running, 86 sleeping, 0 stopped, 2 zombie
> %Cpu(s): 18.8 us, 7.6 sy, 3.0 ni, 68.4 id, 1.3 wa, 0.0 hi, 0.9
> si, 0.0 st
> KiB Mem: 5967968 total, 663244 used, 5304724 free,48876
> buffers
> KiB Swap: 1681428 total, 170904 used, 1510524 free. 446280
> cached Mem
>
> but of course that memory is free once everything died.
>
> Any ideas? Should I go back to v4.19 to see if it happens there, too?

linux-next includes very much the same code as drm-tip. There's nobody
magically reviewing the code more than it is reviewed for inclusion into
drm-tip, when it is fed into linux-next. So thinking linux-next would be
some way safer is an illusion.

It sounds like having memory pressure expedites the corruption, which
should make it easier to reproduce and thus fix.

So if you could please try drm-tip reproducing AND open a bug in Bugzilla.
If you are unwilling to do that, it is very difficult to help you
more.

Regards, Joonas

>
>
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures)
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

4.20.0-rc6-next-20181210, v4.20-rc1: list_del corruption on thinkpad x220, graphics related?
Hi!

> > > > > > > There's one similar for nouveau in Bugzilla, but it seems like a
> > > > > > > genuine
> > > > > > > memory corruption (1 bit flipped):
> > > > > > >
> > > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=84880
> > > > > > >
> > > > > > > Any extra information would be of use :)
> > > > > > >
> > > > > > > Regards, Joonas
> > > > > > >
> > > > > > > PS. Could you open a bug to Bugzilla, it'll help to collect the
> > > > > > > information in one consolidated place:
> > > > > > >
> > > > > > > https://01.org/linuxgraphics/documentation/how-report-bugs
> > > > > >
> > > > > > I prefer email... certainly for bugs that can't be reproduced.
> > > > >
> > > > > By adding it to the Bugzilla it may be recognized by somebody else
> > > > > who is experiencing a similar issue. Internet points are not deducted
> > > > > for submitting bugs in good faith, even if they get closed as
> > > > > NOTABUG.
> > >
> > > Well, your documentation suggests you'll deduce my internet points:
> > >
> > > Before filing the bug, please try to reproduce your issue with the
> > > latest kernel. Use the latest drm-tip branch from
> > > http://cgit.freedesktop.org/drm-tip and build as instructed on our
> > > Build Guide.
> > >
> > > :-)
> >
> > I'd prefer not to run drm-tip. I'll update to 2.6.20-rc5+ and see if
> > it re-appears (but it takes long time to reproduce :-().
>
> If we can or can not reproduce the issue with drm-tip, is a very useful
> datapoint for us. If we can not reproduce, it'll be possible to bisect
> which commit fixed it, and backport that. On the other hand, if it's
> still reproducible, we know we're not spending time on something we
> already fixed, and the priority gets a bump.

bisect ... is not practical on something that takes 2 days to reproduce.

> > If you think it is useful, I can try to update my machine to
> > linux-next.
>
> linux-next is closer to drm-tip, so it's better. Do you have some
> specific reason for not wanting to run drm-tip (but linux-next is still
> ok)?

I already have build/update scripts for -next, and I trust -next not
to store screenshots of my desktop in my master boot record :-).

Anyway, it does happen with -next. This time, chromiums were running,
and crash happened minute? after I exited flightgear. It can be seen
in the logs.

Oh and I might want to mention -- machine was rather deep in swap this
time, as in "mouse jumping when starting fgfs" and "could feel the
chromium being swapped back in". I might have had this situation
before, and just powercycled the machine "because it is so deep in
swap that it will not recover".

top says:

top - 19:18:24 up 2 days, 8:03, 2 users, load average: 3.02, 3.45, 3.21
Tasks: 141 total, 1 running, 86 sleeping, 0 stopped, 2 zombie
%Cpu(s): 18.8 us, 7.6 sy, 3.0 ni, 68.4 id, 1.3 wa, 0.0 hi, 0.9 si, 0.0 st
KiB Mem: 5967968 total, 663244 used, 5304724 free,48876 buffers
KiB Swap: 1681428 total, 170904 used, 1510524 free. 446280 cached Mem

but of course that memory is free once everything died.

Any ideas? Should I go back to v4.19 to see if it happens there, too?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Attachment: delme.gz (application/gzip)