Re: delayed allocatiou result in Oops
On 16:46 Срд 20 Июн , Mingming Cao wrote: > On Wed, 2007-06-20 at 12:15 +0400, Alex Tomas wrote: > > Mingming Cao wrote: > > > Hmm, PageMappedToDisk is probably not sufficient enough for pagesize! > > > =blocksize. Is that the reason we need page->private to pass the > > > request? > > > > PageMappedToDisk isn't enough in that case, definitely. bh is the way > > to track state of each block (this is what i'm implementing now), but > > I think current nobh version (per-page flags are used) is valuable. > > > > > That's good to know, thanks for the update. So probably above error case > > > handling will be addressed in the new version? > > > > well, you actually can move that SetPagePrivate() few lines above > > (Dmitriy already tested this). > > Like this? Of course not, we may set page private bit only if space was successfully reserved by ext4_wb_reserve_space_page(). So patch looks like this: diff --git a/fs/ext4/writeback.c b/fs/ext4/writeback.c index 3e669d6..84e5029 100644 --- a/fs/ext4/writeback.c +++ b/fs/ext4/writeback.c @@ -904,9 +904,6 @@ int ext4_wb_commit_write(struct file *file, struct page *page, wb_debug("commit page %lu (%u-%u) for inode %lu\n", page->index, from, to, inode->i_ino); - /* mark page private so that we get -* called to invalidate/release page */ - SetPagePrivate(page); if (!PageBooked(page) && !PageMappedToDisk(page)) { /* ->prepare_write() observed that block for this @@ -918,6 +915,9 @@ int ext4_wb_commit_write(struct file *file, struct page *page, if (err) return err; } + /* mark page private so that we get +* called to invalidate/release page */ + SetPagePrivate(page); /* ok. block for this page is allocated already or it has * been reserved succesfully. so, user may use it */ -- 1.5.2 > > Index: linux-2.6.22-rc5/fs/ext4/writeback.c > === > --- linux-2.6.22-rc5.orig/fs/ext4/writeback.c 2007-06-20 16:41:26.0 > -0700 > +++ linux-2.6.22-rc5/fs/ext4/writeback.c 2007-06-20 16:44:10.0 > -0700 > @@ -918,16 +918,16 @@ int ext4_wb_commit_write(struct file *fi > wb_debug("commit page %lu (%u-%u) for inode %lu\n", > page->index, from, to, inode->i_ino); > > - /* mark page private so that we get > - * called to invalidate/release page */ > - SetPagePrivate(page); > - > if (!PageBooked(page) && !PageMappedToDisk(page)) { > /* ->prepare_write() observed that block for this >* page hasn't been allocated yet. there fore it >* asked to reserve block for later allocation */ > BUG_ON(page->private == 0); > page->private = 0; > + /* mark page private so that we get > + * called to invalidate/release page */ > + SetPagePrivate(page); > + > err = ext4_wb_reserve_space_page(page, 1); > if (err) > return err; > > > - > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
On Wed, 2007-06-20 at 12:15 +0400, Alex Tomas wrote: > Mingming Cao wrote: > > Hmm, PageMappedToDisk is probably not sufficient enough for pagesize! > > =blocksize. Is that the reason we need page->private to pass the > > request? > > PageMappedToDisk isn't enough in that case, definitely. bh is the way > to track state of each block (this is what i'm implementing now), but > I think current nobh version (per-page flags are used) is valuable. > > > That's good to know, thanks for the update. So probably above error case > > handling will be addressed in the new version? > > well, you actually can move that SetPagePrivate() few lines above > (Dmitriy already tested this). Like this? Index: linux-2.6.22-rc5/fs/ext4/writeback.c === --- linux-2.6.22-rc5.orig/fs/ext4/writeback.c 2007-06-20 16:41:26.0 -0700 +++ linux-2.6.22-rc5/fs/ext4/writeback.c2007-06-20 16:44:10.0 -0700 @@ -918,16 +918,16 @@ int ext4_wb_commit_write(struct file *fi wb_debug("commit page %lu (%u-%u) for inode %lu\n", page->index, from, to, inode->i_ino); - /* mark page private so that we get -* called to invalidate/release page */ - SetPagePrivate(page); - if (!PageBooked(page) && !PageMappedToDisk(page)) { /* ->prepare_write() observed that block for this * page hasn't been allocated yet. there fore it * asked to reserve block for later allocation */ BUG_ON(page->private == 0); page->private = 0; + /* mark page private so that we get +* called to invalidate/release page */ + SetPagePrivate(page); + err = ext4_wb_reserve_space_page(page, 1); if (err) return err; - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
Mingming Cao wrote: Hmm, PageMappedToDisk is probably not sufficient enough for pagesize! =blocksize. Is that the reason we need page->private to pass the request? PageMappedToDisk isn't enough in that case, definitely. bh is the way to track state of each block (this is what i'm implementing now), but I think current nobh version (per-page flags are used) is valuable. That's good to know, thanks for the update. So probably above error case handling will be addressed in the new version? well, you actually can move that SetPagePrivate() few lines above (Dmitriy already tested this). BTW, can you point me your latest and greatest mballoc patch? I am trying to forward port and merge that patch to ext4 patch queue I don't have version for mainline yet. will prepare soon. thanks for your interest. thanks, Alex - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
On Jun 19, 2007 13:16 +0400, Dmitry Monakhov wrote: > But whole approach based on using PagePrivate bit and page->private > ptr for special purposes is realy dengerous, and imho wrong, > because avery fs-related code assume that page->private points to > page_buffers. This is actually a misconception. page->private can be anything. While many fs helper routines use page->private as buffers, that only happens when the fs itself causes those helper routines to be called. Lustre, for example, has no buffer heads on the clients, and uses page->private to track RPC batching information and data checksums. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
Mingming Cao wrote: O BTW, can you point me your latest and greatest mballoc patch? I am trying to forward port and merge that patch to ext4 patch queue I am looking at the one found at. ftp://ftp.clusterfs.com/pub/people/alex/2.6.19-rc6 -aneesh - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
On Tue, 2007-06-19 at 11:11 +0400, Alex Tomas wrote: > Mingming Cao wrote: > >>From the comments it says the page->private is set to 1 to letting > > commit_write know that it needs block reservation, but I don't see the > > page->private value being checked in ext4_wb_commit_write(). Instead, > > the PageMappedToDisk(page) flag is being checked. > > > > Alex, can you clarify the use of page->private and PagePrivate flag > > here? Do we still need the page->private for delayed allocation, with > > PageBooked flag and PageMappedToDisk page flag? > > sorry for confusion, we need PagePrivate so that truncate calls our > ->invalidatepage(). in turn this call is needed to drop unused reservation. One more question, do we still need page->private? I don't see page- >private value being checked anywhere. I see you cleared PageMappedToDisk page flag at prepare_write, and checked PageMappedToDisk in commit_write()... Hmm, PageMappedToDisk is probably not sufficient enough for pagesize! =blocksize. Is that the reason we need page->private to pass the request? > block_read_full_page() must not happen to pages being delayed allocated > in the first place - it's uptodate by definition. I think the problem > began when ext4_wb_commit_write() exited due to -ENOSPC but left not > uptodate *and* with PG_private. then subsequent access to page turned to > block_read_full_page() which relies on PG_private and meaningful private > field. Dmitry, could you repeat the test with SetPagePrivate(page) moved > to after that if() with ext4_wb_reserve_space_page(), please? > > > as i'm here ... status update: I've been reworking delayed allocation > patches to support blocksize < PAGE_CACHE_SIZE (and address akpm's request > for more generic implementation). the patch isn't ready for review, but > hopefully will be in few days. > That's good to know, thanks for the update. So probably above error case handling will be addressed in the new version? BTW, can you point me your latest and greatest mballoc patch? I am trying to forward port and merge that patch to ext4 patch queue Thanks, Mingming > > thanks, Alex > > - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
Dmitry Monakhov wrote: But whole approach based on using PagePrivate bit and page->private ptr for special purposes is realy dengerous, and imho wrong, because avery fs-related code assume that page->private points to page_buffers. Especially this approach not work for blksize < pgsize. The best way to store/pass any block related info is buffer_head flags. This approach works for blksize < pgsize case too. it's not dangerous as buffer_heads is optional facility and any filesystem is free to use them or not. or even use own mechanism like reiser4 does. this is why struct page got private field instead of buffers (in 2.4). buffer_heads have cost - 56 bytes/page (on 32bit arch) + cpu cycles to maintain them. current delayed allocation isn't supposed to work with bsize != PAGE_CACHE_SIZE. this will be resolved by the upcoming patch, but bsize == PAGE_CACHE_SIZE is worthwhile to optimize, I believe. thanks, Alex - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
On 11:11 Втр 19 Июн , Alex Tomas wrote: > Mingming Cao wrote: > >>From the comments it says the page->private is set to 1 to letting > >commit_write know that it needs block reservation, but I don't see the > >page->private value being checked in ext4_wb_commit_write(). Instead, > >the PageMappedToDisk(page) flag is being checked. > > > >Alex, can you clarify the use of page->private and PagePrivate flag > >here? Do we still need the page->private for delayed allocation, with > >PageBooked flag and PageMappedToDisk page flag? > > sorry for confusion, we need PagePrivate so that truncate calls our > ->invalidatepage(). in turn this call is needed to drop unused reservation. > > block_read_full_page() must not happen to pages being delayed allocated > in the first place - it's uptodate by definition. I think the problem > began when ext4_wb_commit_write() exited due to -ENOSPC but left not > uptodate *and* with PG_private. then subsequent access to page turned to > block_read_full_page() which relies on PG_private and meaningful private > field. Dmitry, could you repeat the test with SetPagePrivate(page) moved > to after that if() with ext4_wb_reserve_space_page(), please? Yes it works. Bug not triggered now. But whole approach based on using PagePrivate bit and page->private ptr for special purposes is realy dengerous, and imho wrong, because avery fs-related code assume that page->private points to page_buffers. Especially this approach not work for blksize < pgsize. The best way to store/pass any block related info is buffer_head flags. This approach works for blksize < pgsize case too. > > > as i'm here ... status update: I've been reworking delayed allocation > patches to support blocksize < PAGE_CACHE_SIZE (and address akpm's request > for more generic implementation). the patch isn't ready for review, but > hopefully will be in few days. > > > thanks, Alex > > > - > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
Mingming Cao wrote: From the comments it says the page->private is set to 1 to letting commit_write know that it needs block reservation, but I don't see the page->private value being checked in ext4_wb_commit_write(). Instead, the PageMappedToDisk(page) flag is being checked. Alex, can you clarify the use of page->private and PagePrivate flag here? Do we still need the page->private for delayed allocation, with PageBooked flag and PageMappedToDisk page flag? sorry for confusion, we need PagePrivate so that truncate calls our ->invalidatepage(). in turn this call is needed to drop unused reservation. block_read_full_page() must not happen to pages being delayed allocated in the first place - it's uptodate by definition. I think the problem began when ext4_wb_commit_write() exited due to -ENOSPC but left not uptodate *and* with PG_private. then subsequent access to page turned to block_read_full_page() which relies on PG_private and meaningful private field. Dmitry, could you repeat the test with SetPagePrivate(page) moved to after that if() with ext4_wb_reserve_space_page(), please? as i'm here ... status update: I've been reworking delayed allocation patches to support blocksize < PAGE_CACHE_SIZE (and address akpm's request for more generic implementation). the patch isn't ready for review, but hopefully will be in few days. thanks, Alex - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
On Sat, 2007-06-16 at 12:14 +0400, Dmitriy Monakhov wrote: > On 16:16 Птн 15 Июн , Mingming Cao wrote: > > I hit almost the same issue today also, but with different error #, and > > one more kernel oops, when run fsstress on x86_64. > > > > EXT4-fs: writeback error = -2 > > EXT4-fs: writeback error = -2 > This error never happens in writeback in my case, only ENOSPC. I see why, error -2 is ENOENT, your kernel already patched with your previus patch to fix the return error from ext4_reserve_global() from ENOENT to ENOSPC. > > > > I've digged this a litle bit with folowig results: > > > > > > > > int block_read_full_page(struct page *page, get_block_t *get_block) > > > > { > > > > ... > > > > 1914: if (!page_has_buffers(page)) <<< page_has_buffers(page) == true > > > > create_empty_buffers(page, blocksize, 0); > > > > head = page_buffers(page); page_buffers(page) == NULL > > > > << > > > <<< page->flags == 1000821 > > > > <<< PagePrivate(page) == 1, (page)->private == NULL > > > > <<< So we have private page without buffers, it is WRONG. > > > > Right, I think the problem is the PagePrivate() is being set without a valid page->private pointer. The PagePrivate flag is being set by delayed allocation in the ext4_wb_commit_write(). Looking at ext4-delayed-allocation.patch +int ext4_wb_commit_write(struct file *file, struct page *page, +unsigned from, unsigned to) +{ + loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to; + struct inode *inode = page->mapping->host; + int err = 0; + + wb_debug("commit page %lu (%u-%u) for inode %lu\n", + page->index, from, to, inode->i_ino); + + /* mark page private so that we get +* called to invalidate/release page */ + SetPagePrivate(page); + + if (!PageBooked(page) && !PageMappedToDisk(page)) { + /* ->prepare_write() observed that block for this +* page hasn't been allocated yet. there fore it +* asked to reserve block for later allocation */ + BUG_ON(page->private == 0); + page->private = 0; + err = ext4_wb_reserve_space_page(page, 1); + if (err) + return err; + } It sets the PagePrivate flag at the commit_write() time. And the page- >private is cleared to 0. page->private is being set to 1 at ext4_wb_prepare_write(): +int ext4_wb_prepare_write(struct file *file, struct page *page, + unsigned from, unsigned to) +{ + struct inode *inode = page->mapping->host; + struct buffer_head bh, *bhw = &bh; + int err = 0; + + wb_debug("prepare page %lu (%u-%u) for inode %lu\n", + page->index, from, to, page->mapping->host->i_ino); + + /* if page is uptodate this means that ->prepare_write() has +* been called on page before and page is mapped to disk or +* we did reservation. page is protected and nobody can +* access it. hence, it safe to use page->private to pass +* flag that ->commit_write() has to reserve blocks. because +* an error may occur after ->prepare_write() we should not +* reserve block here. it's better to do in ->commit_write() +* when we're sure page is to be written */ + page->private = 0; + if (!PageUptodate(page)) { + /* first write to this page */ + bh.b_state = 0; + err = ext4_get_block(inode, page->index, bhw, 0); + if (err) + return err; + if (!buffer_mapped(bhw)) { + /* this block isn't allocated yet, reserve space */ + wb_debug("reserve space for new block\n"); + page->private = 1; + ext4_wb_clear_page(page, from, to); + ClearPageMappedToDisk(page); >From the comments it says the page->private is set to 1 to letting commit_write know that it needs block reservation, but I don't see the page->private value being checked in ext4_wb_commit_write(). Instead, the PageMappedToDisk(page) flag is being checked. Alex, can you clarify the use of page->private and PagePrivate flag here? Do we still need the page->private for delayed allocation, with PageBooked flag and PageMappedToDisk page flag? Thanks, Mingming - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delayed allocatiou result in Oops
On 16:16 Птн 15 Июн , Mingming Cao wrote: > I hit almost the same issue today also, but with different error #, and > one more kernel oops, when run fsstress on x86_64. > > EXT4-fs: writeback error = -2 > EXT4-fs: writeback error = -2 This error never happens in writeback in my case, only ENOSPC. Btw: i've send one more micro fix (see in this thread " PATCH] ext4:fix invariant checking in ext4_rebalance_reservation") > > Unable to handle kernel NULL pointer dereference at RIP: > [] block_read_full_page+0xb5/0x267 > PGD 1f9842067 PUD 1f9843067 PMD 0 > Oops: [5] SMP > CPU 3 > Modules linked in: > Pid: 10900, comm: fsstress Not tainted 2.6.22-rc4-autokern1 #1 > RIP: 0010:[] [] > block_read_full_page+0xb5/0x267 > RSP: :8101f984fa48 EFLAGS: 00010213 > RAX: 0179 RBX: RCX: 000c > RDX: 000c RSI: 802e0f7b RDI: 81017ff578c8 > RBP: 81017ff578c8 R08: 8101f984fbe8 R09: 8101f984fbe0 > R10: R11: R12: 00e5 > R13: R14: 1000 R15: > FS: () GS:8101803ec5c0(0063) knlGS:f7dec460 > CS: 0010 DS: 002b ES: 002b CR0: 8005003b > CR2: CR3: 0001f9841000 CR4: 06e0 > Process fsstress (pid: 10900, threadinfo 8101f984e000, task > 8101f9824280) > Stack: 1000f9ad4080 0001 0179 > 8100de7c4100 802e0f7b 03363e3761bf 804eac2d > 8101f984fb48 0082 81017e9bc550 81017e9bc588 > Call Trace: > [] ext4_get_block+0x0/0x104 > [] thread_return+0x0/0xd5 > [] do_mpage_readpage+0x411/0x430 > [] io_schedule+0x26/0x32 > [] __wait_on_bit_lock+0x5f/0x6d > [] mpage_readpage+0x42/0x5b > [] ext4_get_block+0x0/0x104 > [] wake_bit_function+0x0/0x23 > [] file_read_actor+0x89/0xf4 > [] find_get_page+0x1e/0x4d > [] do_generic_mapping_read+0x20e/0x3df > [] file_read_actor+0x0/0xf4 > [] generic_file_aio_read+0x11d/0x154 > [] do_sync_read+0xc8/0x10b > [] permission+0xbb/0xbd > [] autoremove_wake_function+0x0/0x2e > [] nameidata_to_filp+0x25/0x34 > [] do_filp_open+0x2d/0x3d > [] vfs_getattr+0x2b/0x2f > [] vfs_fstat+0x33/0x3a > [] vfs_read+0xab/0x12e > [] sys_read+0x45/0x6e > [] ia32_sysret+0x0/0xa > [skip] > [ cut here ] > kernel BUG at fs/ext4/writeback.c:266! Yepp. I've saw this oops too, But IMHO it may be artefact caused by previous one. > invalid opcode: [6] SMP > CPU 3 > Modules linked in: > Pid: 10851, comm: fsstress Not tainted 2.6.22-rc4-autokern1 #1 > RIP: 0010:[] [] > ext4_wb_submit_extent+0x1ef/0x3d9 > RSP: :8101e47cfab8 EFLAGS: 00010246 > RAX: 0001182c RBX: 8100c6709ca0 RCX: 000c > RDX: 0001 RSI: RDI: 8101e8de5000 > RBP: 8100c6709a48 R08: 8101b1056338 R09: > R10: 8101b1056338 R11: 8100c6709a48 R12: 0040 > R13: 81017eaa5b98 R14: 0040 R15: 0001 > FS: () GS:8101803ec5c0(0063) knlGS:f7dec460 > CS: 0010 DS: 002b ES: 002b CR0: 8005003b > CR2: f7dcb004 CR3: 0001e47c1000 CR4: 06e0 > Process fsstress (pid: 10851, threadinfo 8101e47ce000, task > 8101e47a6b30) > Stack: 8101cf22c9b8 0001 000c0001 > 8100c6709a48 00018028938e 8101e47cfb68 > 8101e47cfd28 8100c6709ca0 8100c6709a48 8100c6709990 > Call Trace: > [] ext4_wb_handle_extent+0x3b5/0x48c > [] ext4_ext_walk_space+0x18a/0x20c > [] ext4_wb_handle_extent+0x0/0x48c > [] ext4_wb_flush+0x5b/0x153 > [] ext4_wb_writepages+0x34b/0x398 > [] do_writepages+0x20/0x2d > [] __writeback_single_inode+0x1df/0x3a7 > [] find_get_pages_tag+0x34/0x89 > [] pagevec_lookup_tag+0x1a/0x24 > [] wait_on_page_writeback_range+0xc7/0x10d > [] sync_sb_inodes+0x1cb/0x2a0 > [] sync_inodes_sb+0xa5/0xb9 > [] __up_read+0x10/0x8a > [] __sync_inodes+0x6a/0xb1 > [] sync_inodes+0x11/0x29 > [] do_sync+0x2c/0x50 > [] sys_sync+0xb/0xf > [] ia32_sysret+0x0/0xa > > > Code: 0f 0b eb fe f0 41 0f ba 75 00 14 48 8b 4c 24 40 01 51 10 48 > RIP [] ext4_wb_submit_extent+0x1ef/0x3d9 > RSP > > I will try the patch below...Alex, any hint about the second oops? > > Mingming > Alex please > On Fri, 2007-06-15 at 09:14 +0400, Alex Tomas wrote: > > looks like an error in error handling path (notice -28 (ENOSPC) before) > > > > thanks for the report, Alex > > > > Dmitriy Monakhov wrote: > > > ) > > > > > > Simple test failed on ext4 when delayed allocation was used. > > > #mkfs.ext3 -b4096 /dev/vzvg/test2 > > > #mount -text4dev /dev/vzvg/test2 /mnt/test -odelalloc > > > #fsstress -d /mnt/test/ -l100 -n10 -p20 -f dwrite=0 > > > > > > > > > EXT4-fs: writeback error = -28 > > > .. > > > EXT4-fs: writeba
Re: delayed allocatiou result in Oops
I hit almost the same issue today also, but with different error #, and one more kernel oops, when run fsstress on x86_64. EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 Unable to handle kernel NULL pointer dereference at RIP: [] block_read_full_page+0xb5/0x267 PGD 1f9842067 PUD 1f9843067 PMD 0 Oops: [5] SMP CPU 3 Modules linked in: Pid: 10900, comm: fsstress Not tainted 2.6.22-rc4-autokern1 #1 RIP: 0010:[] [] block_read_full_page+0xb5/0x267 RSP: :8101f984fa48 EFLAGS: 00010213 RAX: 0179 RBX: RCX: 000c RDX: 000c RSI: 802e0f7b RDI: 81017ff578c8 RBP: 81017ff578c8 R08: 8101f984fbe8 R09: 8101f984fbe0 R10: R11: R12: 00e5 R13: R14: 1000 R15: FS: () GS:8101803ec5c0(0063) knlGS:f7dec460 CS: 0010 DS: 002b ES: 002b CR0: 8005003b CR2: CR3: 0001f9841000 CR4: 06e0 Process fsstress (pid: 10900, threadinfo 8101f984e000, task 8101f9824280) Stack: 1000f9ad4080 0001 0179 8100de7c4100 802e0f7b 03363e3761bf 804eac2d 8101f984fb48 0082 81017e9bc550 81017e9bc588 Call Trace: [] ext4_get_block+0x0/0x104 [] thread_return+0x0/0xd5 [] do_mpage_readpage+0x411/0x430 [] io_schedule+0x26/0x32 [] __wait_on_bit_lock+0x5f/0x6d [] mpage_readpage+0x42/0x5b [] ext4_get_block+0x0/0x104 [] wake_bit_function+0x0/0x23 [] file_read_actor+0x89/0xf4 [] find_get_page+0x1e/0x4d [] do_generic_mapping_read+0x20e/0x3df [] file_read_actor+0x0/0xf4 [] generic_file_aio_read+0x11d/0x154 [] do_sync_read+0xc8/0x10b [] permission+0xbb/0xbd [] autoremove_wake_function+0x0/0x2e [] nameidata_to_filp+0x25/0x34 [] do_filp_open+0x2d/0x3d [] vfs_getattr+0x2b/0x2f [] vfs_fstat+0x33/0x3a [] vfs_read+0xab/0x12e [] sys_read+0x45/0x6e [] ia32_sysret+0x0/0xa Code: 8b 03 a8 01 0f 85 e1 00 00 00 8b 03 a8 20 0f 85 cc 00 00 00 RIP [] block_read_full_page+0xb5/0x267 RSP CR2: EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 EXT4-fs: writeback error = -2 [ cut here ] kernel BUG at fs/ext4/writeback.c:266! invalid opcode: [6] SMP CPU 3 Modules linked in: Pid: 10851, comm: fsstress Not tainted 2.6.22-rc4-autokern1 #1 RIP: 0010:[] [] ext4_wb_submit_extent+0x1ef/0x3d9 RSP: :8101e47cfab8 EFLAGS: 00010246 RAX: 0001182c RBX: 8100c6709ca0 RCX: 000c RDX: 0001 RSI: RDI: 8101e8de5000 RBP: 8100c6709a48 R08: 8101b1056338 R09: R10: 8101b1056338 R11: 8100c6709a48 R12: 0040 R13: 81017eaa5b98 R14: 0040 R15: 0001 FS: () GS:8101803ec5c0(0063) knlGS:f7dec460 CS: 0010 DS: 002b ES: 002b CR0: 8005003b CR2: f7dcb004 CR3: 0001e47c1000 CR4: 06e0 Process fsstress (pid: 10851, threadinfo 8101e47ce000, task 8101e47a6b30) Stack: 8101cf22c9b8 0001 000c0001 8100c6709a48 00018028938e 8101e47cfb68 8101e47cfd28 8100c6709ca0 8100c6709a48 8100c6709990 Call Trace: [] ext4_wb_handle_extent+0x3b5/0x48c [] ext4_ext_walk_space+0x18a/0x20c [] ext4_wb_handle_extent+0x0/0x48c [] ext4_wb_flush+0x5b/0x153 [] ext4_wb_writepages+0x34b/0x398 [] do_writepages+0x20/0x2d [] __writeback_single_inode+0x1df/0x3a7 [] find_get_pages_tag+0x34/0x89 [] pagevec_lookup_tag+0x1a/0x24 [] wait_on_page_writeback_range+0xc7/0x10d [] sync_sb_inodes+0x1cb/0x2a0 [] sync_inodes_sb+0xa5/0xb9 [] __up_read+0x10/0x8a [] __sync_inodes+0x6a/0xb1 [] sync_inodes+0x11/0x29 [] do_sync+0x2c/0x50 [] sys_sync+0xb/0xf [] ia32_sysret+0x0/0xa Code: 0f 0b eb fe f0 41 0f ba 75 00 14 48 8b 4c 24 40 01 51 10 48 RIP [] ext4_wb_submit_extent+0x1ef/0x3d9 RSP I will try the patch below...Alex, any hint about the second oops? Mingming Alex please On Fri, 2007-06-15 at 09:14 +0400, Alex Tomas wrote: > looks like an error in error handling path (notice -28 (ENOSPC) before) > > thanks for the report, Alex > > Dmitriy Monakhov wrote: > > ) > > > > Simple test failed on ext4 when delayed allocation was used. > > #mkfs.ext3 -b4096 /dev/vzvg/tes
Re: delayed allocatiou result in Oops
looks like an error in error handling path (notice -28 (ENOSPC) before) thanks for the report, Alex Dmitriy Monakhov wrote: ) Simple test failed on ext4 when delayed allocation was used. #mkfs.ext3 -b4096 /dev/vzvg/test2 #mount -text4dev /dev/vzvg/test2 /mnt/test -odelalloc #fsstress -d /mnt/test/ -l100 -n10 -p20 -f dwrite=0 EXT4-fs: writeback error = -28 .. EXT4-fs: writeback error = -28 Unable to handle kernel NULL pointer dereference at RIP: [] block_read_full_page+0xab/0x25f PGD 44c1067 PUD 44fd067 PMD 0 Oops: [2] SMP CPU 0 Modules linked in: ext4dev jbd2 Pid: 4833, comm: fsstress Not tainted 2.6.22-rc4-mm2 #9 RIP: 0010:[] [] block_read_full_page+0xab/0x25f RSP: 0018:810004df9a58 EFLAGS: 00010203 RAX: 1000 RBX: 8100cf4256f8 RCX: 000c RDX: 0001 RSI: 000c RDI: 8100cf4256f8 RBP: R08: 810004df9be8 R09: 810004df9c58 R10: R11: R12: 0052 R13: 1000 R14: R15: 00d3 FS: 2adfe3f7d6f0() GS:8073() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: CR3: 04362000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process fsstress (pid: 4833, threadinfo 810004df8000, task 810004c867a0) Stack: 880180f2 8100054a23f0 810005dcbb80 81000549bf00 810005def8b0 88029e60 8025b2be 8100054da540 8100cf496fb0 Call Trace: [] :ext4dev:ext4_get_block+0x0/0x109 [] find_get_page+0x21/0x51 [] do_mpage_readpage+0x45f/0x480 [] :ext4dev:ext4_get_block+0x0/0x109 [] :jbd2:jbd2_journal_dirty_metadata+0x197/0x1be [] bit_waitqueue+0x1c/0x99 [] mpage_readpage+0x4e/0x67 [] :ext4dev:ext4_get_block+0x0/0x109 [] do_lookup+0x63/0x1ae [] file_read_actor+0x8d/0xf6 [] find_get_page+0x21/0x51 [] do_generic_mapping_read+0x23c/0x3da [] file_read_actor+0x0/0xf6 [] generic_file_aio_read+0x119/0x156 [] do_sync_read+0xc9/0x10c [] cp_new_stat+0xe5/0xfd [] autoremove_wake_function+0x0/0x2e [] vfs_read+0xaa/0x132 [] sys_read+0x45/0x6e [] system_call+0x7e/0x83 Code: 8b 45 00 a8 01 0f 85 e6 00 00 00 8b 45 00 a8 20 0f 85 c9 00 I've digged this a litle bit with folowig results: int block_read_full_page(struct page *page, get_block_t *get_block) { ... 1914: if (!page_has_buffers(page)) <<< page_has_buffers(page) == true create_empty_buffers(page, blocksize, 0); head = page_buffers(page); page_buffers(page) == NULL << <<< page->flags == 1000821 <<< PagePrivate(page) == 1, (page)->private == NULL <<< So we have private page without buffers, it is WRONG. iblock = (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits); lblock = (i_size_read(inode)+blocksize-1) >> inode->i_blkbits; bh = head; nr = 0; i = 0; do { if (buffer_uptodate(bh)) << Null pointer deref here result in oops ... } - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
delayed allocatiou result in Oops
Simple test failed on ext4 when delayed allocation was used. #mkfs.ext3 -b4096 /dev/vzvg/test2 #mount -text4dev /dev/vzvg/test2 /mnt/test -odelalloc #fsstress -d /mnt/test/ -l100 -n10 -p20 -f dwrite=0 EXT4-fs: writeback error = -28 .. EXT4-fs: writeback error = -28 Unable to handle kernel NULL pointer dereference at RIP: [] block_read_full_page+0xab/0x25f PGD 44c1067 PUD 44fd067 PMD 0 Oops: [2] SMP CPU 0 Modules linked in: ext4dev jbd2 Pid: 4833, comm: fsstress Not tainted 2.6.22-rc4-mm2 #9 RIP: 0010:[] [] block_read_full_page+0xab/0x25f RSP: 0018:810004df9a58 EFLAGS: 00010203 RAX: 1000 RBX: 8100cf4256f8 RCX: 000c RDX: 0001 RSI: 000c RDI: 8100cf4256f8 RBP: R08: 810004df9be8 R09: 810004df9c58 R10: R11: R12: 0052 R13: 1000 R14: R15: 00d3 FS: 2adfe3f7d6f0() GS:8073() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: CR3: 04362000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process fsstress (pid: 4833, threadinfo 810004df8000, task 810004c867a0) Stack: 880180f2 8100054a23f0 810005dcbb80 81000549bf00 810005def8b0 88029e60 8025b2be 8100054da540 8100cf496fb0 Call Trace: [] :ext4dev:ext4_get_block+0x0/0x109 [] find_get_page+0x21/0x51 [] do_mpage_readpage+0x45f/0x480 [] :ext4dev:ext4_get_block+0x0/0x109 [] :jbd2:jbd2_journal_dirty_metadata+0x197/0x1be [] bit_waitqueue+0x1c/0x99 [] mpage_readpage+0x4e/0x67 [] :ext4dev:ext4_get_block+0x0/0x109 [] do_lookup+0x63/0x1ae [] file_read_actor+0x8d/0xf6 [] find_get_page+0x21/0x51 [] do_generic_mapping_read+0x23c/0x3da [] file_read_actor+0x0/0xf6 [] generic_file_aio_read+0x119/0x156 [] do_sync_read+0xc9/0x10c [] cp_new_stat+0xe5/0xfd [] autoremove_wake_function+0x0/0x2e [] vfs_read+0xaa/0x132 [] sys_read+0x45/0x6e [] system_call+0x7e/0x83 Code: 8b 45 00 a8 01 0f 85 e6 00 00 00 8b 45 00 a8 20 0f 85 c9 00 I've digged this a litle bit with folowig results: int block_read_full_page(struct page *page, get_block_t *get_block) { ... 1914: if (!page_has_buffers(page)) <<< page_has_buffers(page) == true create_empty_buffers(page, blocksize, 0); head = page_buffers(page); page_buffers(page) == NULL <private == NULL <<< So we have private page without buffers, it is WRONG. iblock = (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits); lblock = (i_size_read(inode)+blocksize-1) >> inode->i_blkbits; bh = head; nr = 0; i = 0; do { if (buffer_uptodate(bh)) << Null pointer deref here result in oops ... } - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html