Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On 12/21/2011 05:24 AM, Rafael J. Wysocki wrote:
> [Evidently, vger has blocked the previous attempts to post this message
> for some reason. I wonder why?]
>
> This message contains a list of some regressions from 3.0 and 3.1
> for which there are no fixes in the mainline known to the tracking team.
> If any of them have been fixed already, please let us know.
>
> If you know of any other unresolved regressions from 3.0 and 3.1, please
> let us know either and we'll add them to the list. Also, please let us
> know if any of the entries below are invalid.
>
> The entries below are simplified and the statistics are not present due
> to the continuing Bugzilla outage.
>
> [...]
>
> Subject    : Freezing of tasks failed after 20.01 seconds in kernel 3.2.0-rc2
> Submitter  : Belisko Marek marek.beli...@gmail.com
> Date       : 2011-11-22 21:20
> Message-ID : caafyv36nxyq1ea2pchi5wr1bdwyhp2sqty1mma4q3jjcog5...@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132199691706100&w=2

I believe we already have a fix for this, but not yet in mainline:
http://thread.gmane.org/gmane.linux.nfs/45336

Regards,
Srivatsa S. Bhat
--
To unsubscribe from this list: send the line "unsubscribe kernel-testers" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On 12/21/2011 05:24 AM, Rafael J. Wysocki wrote:
> [Evidently, vger has blocked the previous attempts to post this message
> for some reason. I wonder why?]
>
> This message contains a list of some regressions from 3.0 and 3.1
> for which there are no fixes in the mainline known to the tracking team.
> If any of them have been fixed already, please let us know.
>
> If you know of any other unresolved regressions from 3.0 and 3.1, please
> let us know either and we'll add them to the list. Also, please let us
> know if any of the entries below are invalid.
>
> The entries below are simplified and the statistics are not present due
> to the continuing Bugzilla outage.
>
> Subject    : 3.2-rc2: regression after hibernate: lots of warnings, broken system
> Submitter  : Pavel Machek pa...@ucw.cz
> Date       : 2011-11-24 15:40
> Message-ID : 2024154014.ga2...@elf.ucw.cz
> References : http://marc.info/?l=linux-kernel&m=132214929818015&w=2

This one looks a bit similar to the one reported at
http://thread.gmane.org/gmane.linux.kernel/1230509/
which is fixed by commit 29495aa04a30c21565243c5b9c028510446d242c.

Regards,
Srivatsa S. Bhat
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On Thu, 22 Dec 2011 16:28:19 +0530
Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com wrote:

> On 12/21/2011 05:24 AM, Rafael J. Wysocki wrote:
> > [Evidently, vger has blocked the previous attempts to post this message
> > for some reason. I wonder why?]
> >
> > This message contains a list of some regressions from 3.0 and 3.1
> > for which there are no fixes in the mainline known to the tracking team.
> > If any of them have been fixed already, please let us know.
> >
> > If you know of any other unresolved regressions from 3.0 and 3.1, please
> > let us know either and we'll add them to the list. Also, please let us
> > know if any of the entries below are invalid.
> >
> > The entries below are simplified and the statistics are not present due
> > to the continuing Bugzilla outage.
> >
> > [...]
> >
> > Subject    : Freezing of tasks failed after 20.01 seconds in kernel 3.2.0-rc2
> > Submitter  : Belisko Marek marek.beli...@gmail.com
> > Date       : 2011-11-22 21:20
> > Message-ID : caafyv36nxyq1ea2pchi5wr1bdwyhp2sqty1mma4q3jjcog5...@mail.gmail.com
> > References : http://marc.info/?l=linux-kernel&m=132199691706100&w=2
>
> I believe we already have a fix for this, but not yet in mainline:
> http://thread.gmane.org/gmane.linux.nfs/45336

I don't believe this is a regression at all. As far as I know, it's
always been the case that these sleeps weren't freezable. I even have a
similar bug open for RHEL6 which has a 2.6.32-based kernel...

If people are seeing this more now, then it might be some subtle
difference in timing or differences in what userspace is doing at the
time the freezer runs.

In any case, the final fix didn't make 3.2, but should go in early for
3.3.

--
Jeff Layton jlay...@redhat.com
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On 2011-12-21 00:54 +0100, Rafael J. Wysocki wrote:
> This message contains a list of some regressions from 3.0 and 3.1
> for which there are no fixes in the mainline known to the tracking team.
> If any of them have been fixed already, please let us know.
>
> If you know of any other unresolved regressions from 3.0 and 3.1, please
> let us know either and we'll add them to the list. Also, please let us
> know if any of the entries below are invalid.

i915 HDMI log spam, reported against -rc1:

  http://permalink.gmane.org/gmane.linux.kernel/1212638

remains present in mainline. The latest patches (afaik) were posted
earlier this month:

  Subject: Intel HDMI ELD fixes v2
  http://permalink.gmane.org/gmane.linux.kernel/1226920

Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On Wed 21-12-11 00:54:46, Rafael J. Wysocki wrote:
> Subject    : Reiserfs.c bug in 3.2-rc5
> Submitter  : Jorge Bastos mysql.jo...@decimal.pt
> Date       : 2011-12-10 23:48
> Message-ID : 43556.213.228.140.150.1323560920.squir...@webmail.decimal.pt
> References : http://marc.info/?l=linux-kernel&m=132356156914296&w=2

Well, it's not clear this is a regression. Also I didn't get any reply
from the reporter for a week so I'm inclined to declare it ENORESPONSE...

Honza
--
Jan Kara j...@suse.cz
SUSE Labs, CR
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On Tue, Dec 20, 2011 at 3:54 PM, Rafael J. Wysocki r...@sisk.pl wrote:
>
> Subject    : Regression: irqpoll hasn't been working for me since March 16 IRQ
> Submitter  : Edward Donovan edward.dono...@numble.net
> Date       : 2011-10-19 22:09
> Message-ID : CADdbW+HXdCPfJu2RTF6zz+ujCmiu_dmZwL2iScuF53p=aaz...@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=131914554220679&w=2

Edward fixed this in commit 52553ddffad76ccf192d4dd9ce88d5818f57f62a.

> Subject    : Linus GIT - INFO: possible circular locking dependency detected
> Submitter  : Miles Lane miles.l...@gmail.com
> Date       : 2011-11-03 15:57
> Message-ID : CAHFgRy8S0xLfhZxTUOEH5A0PL_Fb79-0-gmbQ=9h2d-xmqt...@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132033587908426&w=2

I *think* this is fixed by the revert in commit 5e442a493fc5.

> Subject    : Sparc-32 doesn't work in 3.1.
> Submitter  : Rob Landley r...@landley.net
> Date       : 2011-11-12 11:22
> Message-ID : 4ebeab5a.5020...@landley.net
> References : http://www.spinics.net/lists/kernel/msg1260383.html

I'm pretty sure this is fixed by commit b1f44e13a525.

> Subject    : WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
> Submitter  : Markus Trippelsdorf mar...@trippelsdorf.de
> Date       : 2011-11-18 7:25
> Message-ID : 2018072519.ga1...@x4.trippels.de
> References : http://marc.info/?l=linux-kernel&m=132160119031794&w=2

This is a combination one, I think. There's some kexec trouble with
DRI and Radeons, and there on PPC the SLUB case was done incorrectly.
The PPC case was fixed, the DRI/Radeon/kexec thing is pending for next
release, I'm afraid.

> Subject    : Bisected regression: hang on i915 between 3.1.0-rc9 and 3.1.0
> Submitter  : Meelis Roos mr...@linux.ee
> Date       : 2011-11-22 10:15
> Message-ID : alpine.soc.1.00.221207090.6...@math.ut.ee
> References : http://marc.info/?l=linux-kernel&m=132195700023709&w=2

This is i915 vs VT-d. It may be fixed in current -git, but basically
people should try to avoid using VT-d with i915, there seem to be
hardware bugs wrt the graphics semaphores and power management code.

> Subject    : 3.2-rc2 regression: floppy driver breaks boot
> Submitter  : Pavel Machek pa...@ucw.cz
> Date       : 2011-11-22 11:14
> Message-ID : 2022111405.ga28...@elf.ucw.cz
> References : http://marc.info/?l=linux-kernel&m=132196052124801&w=2

Hmm. I'd love to get more info. Turning off floppy support may work
around it, but I don't think we've actually seen the oops.

> Subject    : i915: 3.2 rc1/2 KMS regression
> Submitter  : Patrik Kullman patrik.kull...@gmail.com
> Date       : 2011-11-23 23:43
> Message-ID : CAGPN=9THOv-m4td4hae94udcsajw3egmg7ioleu+p8xuobp...@mail.gmail.com
> References : http://marc.info/?l=linux-kernel&m=132209186403288&w=2

Fixed by commit ed4a51842a9d which reverted the problematic commit.

> Subject    : [regression] WARNING: at drivers/block/floppy.c:2929 do_fd_request+0xb7/0xb9() in 3.2.0-rc2 and 3
> Submitter  : Ralf Hildebrandt ralf.hildebra...@charite.de
> Date       : 2011-11-25 10:34
> Message-ID : 2025103420.go4...@charite.de
> References : http://marc.info/?l=linux-kernel&m=132221799501685&w=2

This should be fixed by commit 4eabc941259f.

I wonder if that's related to the floppy issue above? Nothing really
changed in the floppy driver itself, so it should be something about
the block layer..

> Subject    : 3.2-rc2 regression due to commit USB: EHCI: fix HUB TT scheduling issue with iso transfer 811c926c538f7e8d3c08b630dd5844efd7e000f6
> Submitter  : Sander Eikelenboom li...@eikelenboom.it
> Date       : 2011-11-26 15:47
> Message-ID : 1001209018.2026164...@eikelenboom.it
> References : http://marc.info/?l=linux-kernel&m=132232295425393&w=2

Fixed by commit e3420901eba6.

> Subject    : 3.2-rc3+: [drm:i915_hangcheck_elapsed] ERROR Hangcheck timer elapsed... GPU hung
> Submitter  : Sergei Trofimovich sly...@gmail.com
> Date       : 2011-12-02 17:56
> Message-ID : 20111202205601.11552...@sf.home
> References : http://marc.info/?l=linux-kernel&m=132284845705156&w=2

I think this is the same i915 issue above, fixed by the same commit
ed4a51842a9d.

> Subject    : [BUG] deadlock: jfs (3.2.0-rc4-00154-g8e8da02)
> Submitter  : Nico Schottelius nico-linux-20111...@schottelius.org
> Date       : 2011-12-06 10:05
> Message-ID : 20111206100533.gb6...@schottelius.org
> References : http://marc.info/?l=linux-kernel&m=132317917827825&w=2

That's an odd bug-report. I think Nico should try to cut-and-paste
more of the relevant problem.. It's all there in the attached xz-file,
but I doubt anybody followed up on it because it's so hidden..

Unpacked, and added Dave and jfs-discussion to the cc:

[ 6281.127353] =
[ 6281.127355] [ INFO: inconsistent lock state ]
[ 6281.127358] 3.2.0-rc4-00154-g8e8da02 #91
[ 6281.127360] -
[ 6281.127363] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 6281.127366] kswapd0/30 [HC0[0]:SC0[0]:HE1:SE1] takes:
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On 12/20/2011 08:31 PM, Linus Torvalds wrote:
> On Tue, Dec 20, 2011 at 3:54 PM, Rafael J. Wysocki r...@sisk.pl wrote:
> > Subject    : [BUG] deadlock: jfs (3.2.0-rc4-00154-g8e8da02)
> > Submitter  : Nico Schottelius nico-linux-20111...@schottelius.org
> > Date       : 2011-12-06 10:05
> > Message-ID : 20111206100533.gb6...@schottelius.org
> > References : http://marc.info/?l=linux-kernel&m=132317917827825&w=2
>
> That's an odd bug-report. I think Nico should try to cut-and-paste
> more of the relevant problem.. It's all there in the attached xz-file,
> but I doubt anybody followed up on it because it's so hidden..
>
> Unpacked, and added Dave and jfs-discussion to the cc:
>
> [ 6281.127353] =
> [ 6281.127355] [ INFO: inconsistent lock state ]
> [ 6281.127358] 3.2.0-rc4-00154-g8e8da02 #91
> [ 6281.127360] -
> [ 6281.127363] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> [ 6281.127366] kswapd0/30 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [ 6281.127368] (jfs_ip->rdwrlock#2){?+}, at: [a01958d7] jfs_get_block+0x57/0x220 [jfs]
> [ 6281.127381] {RECLAIM_FS-ON-W} state was registered at:
> [ 6281.127383] [810a2c71] mark_held_locks+0x61/0x140
> [ 6281.127392] [810a3401] lockdep_trace_alloc+0x71/0xd0
> [ 6281.127399] [8115daed] kmem_cache_alloc+0x2d/0x170
> [ 6281.127406] [8124d7d6] radix_tree_preload+0x66/0xf0
> [ 6281.127414] [81110e93] add_to_page_cache_locked+0x73/0x170
> [ 6281.127422] [81110fb1] add_to_page_cache_lru+0x21/0x50
> [ 6281.127428] [812a] do_read_cache_page+0x6a/0x170
> [ 6281.127434] [827c] read_cache_page_async+0x1c/0x20
> [ 6281.127441] [828e] read_cache_page+0xe/0x20
> [ 6281.127446] [a01ae406] __get_metapage+0x1c6/0x5c0 [jfs]
> [ 6281.127455] [a01a018a] diWrite+0xea/0x7f0 [jfs]
> [ 6281.127461] [a01b3b04] txCommit+0x1d4/0xe40 [jfs]
> [ 6281.127468] [a01982e3] jfs_unlink+0x2a3/0x390 [jfs]
> [ 6281.127474] [8118255f] vfs_unlink+0x9f/0x110
> [ 6281.127479] [8118277a] do_unlinkat+0x1aa/0x1d0
> [ 6281.127482] [81184236] sys_unlink+0x16/0x20
> [ 6281.127486] [8143e202] system_call_fastpath+0x16/0x1b
> [ 6281.127491] irq event stamp: 26965295
> [ 6281.127493] hardirqs last enabled at (26965295): [8111a3d5] clear_page_dirty_for_io+0x105/0x130
> [ 6281.127498] hardirqs last disabled at (26965294): [8111a378] clear_page_dirty_for_io+0xa8/0x130
> [ 6281.127503] softirqs last enabled at (26964300): [8106cda7] __do_softirq+0x137/0x2a0
> [ 6281.127508] softirqs last disabled at (26964283): [814404fc] call_softirq+0x1c/0x30
> [ 6281.127513]
> [ 6281.127514] other info that might help us debug this:
> [ 6281.127516] Possible unsafe locking scenario:
> [ 6281.127517]
> [ 6281.127518] CPU0
> [ 6281.127519]
> [ 6281.127521] lock(jfs_ip->rdwrlock);
> [ 6281.127524] Interrupt
> [ 6281.127525] lock(jfs_ip->rdwrlock);
> [ 6281.127528]
> [ 6281.127529] *** DEADLOCK ***
> [ 6281.127529]
> [ 6281.127531] no locks held by kswapd0/30.
> [ 6281.127533]
> [ 6281.127533] stack backtrace:
> [ 6281.127536] Pid: 30, comm: kswapd0 Tainted: G C 3.2.0-rc4-00154-g8e8da02 #91
> [ 6281.127539] Call Trace:
> [ 6281.127545] [8143374c] print_usage_bug.part.34+0x285/0x294
> [ 6281.127552] [8102494f] ? save_stack_trace+0x2f/0x50
> [ 6281.127559] [8109ffe0] mark_lock+0x540/0x600
> [ 6281.127564] [8109ef60] ? print_irq_inversion_bug.part.31+0x1f0/0x1f0
> [ 6281.127568] [810a0677] __lock_acquire+0x5d7/0x1d10
> [ 6281.127573] [81118394] ? free_pcppages_bulk+0x34/0x430
> [ 6281.127580] [a01958d7] ? jfs_get_block+0x57/0x220 [jfs]
> [ 6281.127584] [810a23a2] lock_acquire+0x92/0x160
> [ 6281.127590] [a01958d7] ? jfs_get_block+0x57/0x220 [jfs]
> [ 6281.127595] [811a5253] ? create_empty_buffers+0x53/0xe0
> [ 6281.127600] [8108e77f] down_write_nested+0x2f/0x60
> [ 6281.127606] [a01958d7] ? jfs_get_block+0x57/0x220 [jfs]
> [ 6281.127612] [a01958d7] jfs_get_block+0x57/0x220 [jfs]
> [ 6281.127616] [8143d24b] ? _raw_spin_unlock+0x2b/0x60
> [ 6281.127620] [811a65d1] __block_write_full_page+0x101/0x3a0
> [ 6281.127625] [811a5fe0] ? block_read_full_page+0x3d0/0x3d0
> [ 6281.127631] [a0195880] ? jfs_writepage+0x20/0x20 [jfs]
> [ 6281.127637] [811a6954] block_write_full_page_endio+0xe4/0x130
> [ 6281.127642] [811a69b5] block_write_full_page+0x15/0x20
> [ 6281.127651] [a0195878] jfs_writepage+0x18/0x20 [jfs]
> [ 6281.127657] [8112427c]
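The "inconsistent lock state" line in the report above says that one lock class has been seen in two usage states that must not be combined: held while an allocation was allowed to enter filesystem reclaim (RECLAIM_FS-ON), and acquired from inside reclaim (IN-RECLAIM_FS). As a rough illustration of that idea only, here is a minimal user-space sketch. It is not lockdep's implementation; the TOY_* bits, struct toy_lock_class and toy_mark_usage() are hypothetical names made up for this example.

```c
#include <stdbool.h>

/* Hypothetical usage bits, loosely named after the two states in the report. */
#define TOY_RECLAIM_FS_ON 0x1u /* lock held while an allocation permitted FS reclaim */
#define TOY_IN_RECLAIM_FS 0x2u /* lock acquired from within FS reclaim */
#define TOY_CONFLICT      (TOY_RECLAIM_FS_ON | TOY_IN_RECLAIM_FS)

/* One entry per lock class; usage accumulates over every acquisition seen. */
struct toy_lock_class {
	const char *name;
	unsigned int usage;
};

/*
 * Record a newly observed usage for the class.
 * Returns true while the accumulated usage is still consistent, and
 * false once the class has been seen in both conflicting states.
 */
static bool toy_mark_usage(struct toy_lock_class *cls, unsigned int usage)
{
	cls->usage |= usage;
	return (cls->usage & TOY_CONFLICT) != TOY_CONFLICT;
}
```

In this toy model, marking a class with TOY_RECLAIM_FS_ON (the unlink path in the trace) and later with TOY_IN_RECLAIM_FS (the kswapd path) makes the second call return false, even though no deadlock has actually happened yet. That matches the spirit of the report: it flags a possible scenario from observed usage, not an actual hang.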
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On Tue, Dec 20, 2011 at 8:23 PM, Dave Kleikamp dave.kleik...@oracle.com wrote:
>
> I don't think this is a regression. It's been seen before, but the patch
> never got submitted, or was lost somewhere. I believe this will fix it.

Hmm. This patch looks obviously correct. But it looks *so* obviously
correct that it just makes me suspicious - this is not new or seldom
used code, it's been this way for ages and used all the time.

That line literally goes back to 2007, commit eb2be189317d0. And it
looks like even before that we had a GFP_KERNEL for the
add_to_page_cache() case and that goes back to before the git history.
So this is *ancient*.

Maybe almost nobody uses __read_cache_page() with a non-GFP_KERNEL gfp
and as a result we've not noticed. Or maybe there is some crazy reason
why it calls add_to_page_cache() with GFP_KERNEL.

Adding the usual suspects for mm/filemap.c to the cc line (Andrew is
already cc'd, but Al and Hugh should comment).

Ack's, people? Is it really as obvious as it looks, and we've just had
this bug forever?

Linus

--- snip snip ---
vfs: __read_cache_page should use gfp argument rather than GFP_KERNEL

lockdep reports a deadlock in jfs because a special inode's rw semaphore
is taken recursively. The mapping's gfp mask is GFP_NOFS, but is not used
when __read_cache_page() calls add_to_page_cache_lru().

Signed-off-by: Dave Kleikamp dave.kleik...@oracle.com

diff --git a/mm/filemap.c b/mm/filemap.c
index c106d3b..c9ea3df 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1828,7 +1828,7 @@ repeat:
 		page = __page_cache_alloc(gfp | __GFP_COLD);
 		if (!page)
 			return ERR_PTR(-ENOMEM);
-		err = add_to_page_cache_lru(page, mapping, index, GFP_KERNEL);
+		err = add_to_page_cache_lru(page, mapping, index, gfp);
 		if (unlikely(err)) {
 			page_cache_release(page);
 			if (err == -EEXIST)
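The effect of the one-line change in the patch can be shown outside the kernel. The following is a minimal user-space sketch, not kernel code. The TOY_* flag values and the toy_* functions are invented for illustration; the only thing meant to mirror the kernel is the relationship that a NOFS mask is the KERNEL mask without the "may enter the filesystem" bit.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; NOT the kernel's real values. */
#define TOY_GFP_WAIT 0x1u /* allocation may sleep */
#define TOY_GFP_IO   0x2u /* allocation may start block I/O */
#define TOY_GFP_FS   0x4u /* reclaim may call back into the filesystem */

#define TOY_GFP_KERNEL (TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS)
#define TOY_GFP_NOFS   (TOY_GFP_WAIT | TOY_GFP_IO)

/* Stand-in for an allocation: could reclaim re-enter the filesystem? */
static bool toy_alloc_may_enter_fs(unsigned int gfp)
{
	return (gfp & TOY_GFP_FS) != 0;
}

/* Models the pre-patch behaviour: the caller's mask is ignored on insert. */
static bool toy_insert_old(unsigned int gfp)
{
	(void)gfp; /* deliberately unused, as in the line the patch removes */
	return toy_alloc_may_enter_fs(TOY_GFP_KERNEL);
}

/* Models the patched behaviour: the caller's mask is passed through. */
static bool toy_insert_fixed(unsigned int gfp)
{
	return toy_alloc_may_enter_fs(gfp);
}
```

With a caller that asks for TOY_GFP_NOFS, toy_insert_old() still reports that reclaim may re-enter the filesystem, which is the situation the lockdep report complains about, while toy_insert_fixed() does not. For a TOY_GFP_KERNEL caller the two behave the same, which may be part of why the problem went unnoticed for so long.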
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
From: Linus Torvalds torva...@linux-foundation.org
Date: Tue, 20 Dec 2011 18:31:15 -0800

> On Tue, Dec 20, 2011 at 3:54 PM, Rafael J. Wysocki r...@sisk.pl wrote:
> > Subject    : Sparc-32 doesn't work in 3.1.
> > Submitter  : Rob Landley r...@landley.net
> > Date       : 2011-11-12 11:22
> > Message-ID : 4ebeab5a.5020...@landley.net
> > References : http://www.spinics.net/lists/kernel/msg1260383.html
>
> I'm pretty sure this is fixed by commit b1f44e13a525.

It is.
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On Tue, 20 Dec 2011, Linus Torvalds wrote:
> On Tue, Dec 20, 2011 at 8:23 PM, Dave Kleikamp dave.kleik...@oracle.com wrote:
> >
> > I don't think this is a regression. It's been seen before, but the patch
> > never got submitted, or was lost somewhere. I believe this will fix it.
>
> Hmm. This patch looks obviously correct. But it looks *so* obviously
> correct that it just makes me suspicious - this is not new or seldom
> used code, it's been this way for ages and used all the time.
>
> That line literally goes back to 2007, commit eb2be189317d0. And it
> looks like even before that we had a GFP_KERNEL for the
> add_to_page_cache() case and that goes back to before the git history.
> So this is *ancient*.
>
> Maybe almost nobody uses __read_cache_page() with a non-GFP_KERNEL gfp
> and as a result we've not noticed. Or maybe there is some crazy reason
> why it calls add_to_page_cache() with GFP_KERNEL.
>
> Adding the usual suspects for mm/filemap.c to the cc line (Andrew is
> already cc'd, but Al and Hugh should comment).
>
> Ack's, people? Is it really as obvious as it looks, and we've just had
> this bug forever?

Certainly
Acked-by: Hugh Dickins hu...@google.com
from me (and add_to_page_cache_locked does the masking of inappropriate
bits when passing on down, so no need to worry about that aspect).

I agree that it's odd that we've never noticed it before, but I don't
think the GFP_KERNEL there has any more significance than oversight.
Nick cleaned up some similar instances in filemap.c a few years back,
I guess ones he hit in testing, but this just got left over.

page_cache_read()'s GFP_KERNEL looks similarly worrying, but as it's
only called by filemap_fault(), I suppose it's actually okay.

Ooh, maybe you should also update that comment on GFP_KERNEL
above read_cache_page_gfp()...

Hugh

> Linus
>
> --- snip snip ---
> vfs: __read_cache_page should use gfp argument rather than GFP_KERNEL
>
> lockdep reports a deadlock in jfs because a special inode's rw semaphore
> is taken recursively. The mapping's gfp mask is GFP_NOFS, but is not used
> when __read_cache_page() calls add_to_page_cache_lru().
>
> Signed-off-by: Dave Kleikamp dave.kleik...@oracle.com
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index c106d3b..c9ea3df 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1828,7 +1828,7 @@ repeat:
>  		page = __page_cache_alloc(gfp | __GFP_COLD);
>  		if (!page)
>  			return ERR_PTR(-ENOMEM);
> -		err = add_to_page_cache_lru(page, mapping, index, GFP_KERNEL);
> +		err = add_to_page_cache_lru(page, mapping, index, gfp);
>  		if (unlikely(err)) {
>  			page_cache_release(page);
>  			if (err == -EEXIST)
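Hugh's remark about "masking of inappropriate bits when passing on down" can be pictured with a small user-space sketch. This is an assumption-laden toy, not the body of add_to_page_cache_locked(): the thread does not say which bits are masked, so the sketch simply assumes a hypothetical TOY_GFP_HIGHMEM placement bit that makes sense for page allocations but not for the bookkeeping allocations made further down. All TOY_* values and toy_mask_for_metadata() are made up for illustration.

```c
/* Hypothetical flag bits for illustration only; NOT the kernel's real values. */
#define TOY_GFP_WAIT    0x1u /* allocation may sleep */
#define TOY_GFP_IO      0x2u /* allocation may start block I/O */
#define TOY_GFP_FS      0x4u /* reclaim may call back into the filesystem */
#define TOY_GFP_HIGHMEM 0x8u /* assumed placement hint, valid for data pages only */

/*
 * Strip the bits that are inappropriate for internal bookkeeping
 * allocations, while keeping the caller's reclaim restrictions intact.
 */
static unsigned int toy_mask_for_metadata(unsigned int gfp)
{
	return gfp & ~TOY_GFP_HIGHMEM;
}
```

The point of the sketch is that masking only removes bits: a caller that passed a mask without TOY_GFP_FS still has it cleared afterwards, so forwarding the caller's gfp does not weaken the NOFS guarantee the patch is meant to restore.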
Re: [Resend] 3.2-rc6+: Reported regressions from 3.0 and 3.1
On Tue, Dec 20, 2011 at 10:15:00PM -0800, Hugh Dickins wrote:
> Acked-by: Hugh Dickins hu...@google.com
> from me (and add_to_page_cache_locked does the masking of inappropriate
> bits when passing on down, so no need to worry about that aspect).

I was grepping for possibilities of that hitting us right now... OK,
right you are.

Acked-by: Al Viro v...@zeniv.linux.org.uk