On Thu, May 03, 2012 at 09:46:15AM -0600, Jim Schutt wrote:
> On 05/03/2012 08:53 AM, Josef Bacik wrote:
> >On Thu, May 03, 2012 at 08:43:32AM -0600, Jim Schutt wrote:
> >>On 05/01/2012 10:41 AM, Jim Schutt wrote:
> >>>On 05/01/2012 10:00 AM, Josef Bacik wrote:
> >>>>On Wed, Apr 11, 2012 at 02:24:30PM -0600, Jim Schutt wrote:
> >>>>>On 04/11/2012 01:09 PM, Josef Bacik wrote:
> >>>>>>On Tue, Apr 10, 2012 at 01:39:14PM -0600, Jim Schutt wrote:
> >>>>>>>Hi,
> >>>>>>>
> >>>>>>>I hit this BUG today.
> >>>>>>>
> >>>>>>>I'm running 3.3.1 merged with the ceph and btrfs bits for 3.4,
> >>>>>>>i.e. 3.3.1 +
> >>>>>>>commit bc3f116fec194 "Btrfs: update the checks for mixed block
> >>>>>>>groups with big metadata blocks"
> >>>>>>>commit c666601a935b9 "rbd: move snap_rwsem to the device, rename
> >>>>>>>to header_rwsem"
> >>>>>>>
> >>>>>>>The btrfs filesystem in question is backing a Ceph OSD under
> >>>>>>>a heavy write load.
> >>>>>>>
> >>>>>>>Here's the bug:
> >>>>>>>
> >>>>>>
> >>>>>>Can you give this a whirl and let me know how it goes? If I'm right
> >>>>>>you should see a warning pop up in your messages. Thanks,
> >>>>>
> >>>>>OK, I've got my test running with your patch applied
> >>>>>to my previous kernel.
> >>>>>
> >>>>>Do you expect your warning to only fire when my
> >>>>>previous kernel would have BUGged? I ask because I've
> >>>>>only seen the BUG once, so it may be a low-probability
> >>>>>occurrence.
> >>>>>
> >>>>>It seems like I should keep testing until I see either
> >>>>>your new warning or the BUG, right?
> >>>>
> >>>>Hey Jim,
> >>>>
> >>>>I just sent a patch to the list
> >>>>
> >>>>[PATCH] Btrfs: fix page leak when allocing extent buffers
> >>>>
> >>>>Could you try that and see if you can reproduce your problem?
> >>>
> >>>Taking it for a spin now...
> >>>
> >>
> >>Hit it again:
> >>
> >
> >Argh ok it's time to stop hopping around the problem and see what exactly
> >the state is when this happens so I know where to look. Can you run with
> >this patch and give me the dmesg? The important information will be above
> >the --- cut here --- line so make sure to grab that part. Thanks,
>
> Working on it...
>
> BTW, when I recompiled, I noticed this warning:
>
>   CC [M]  fs/btrfs/extent_io.o
> fs/btrfs/extent_io.c: In function ‘write_one_eb’:
> fs/btrfs/extent_io.c:3195: warning: ‘ret’ may be used uninitialized in
> this function
>
> Is there ever any chance at all that write_one_eb() can be
> called by mistake for an eb with zero pages? If so, could
> that be part of the problem?
>

It shouldn't happen but really neither should this bug sooooo go ahead and set
ret = 0 and put a

BUG_ON(!num_pages);

in write_one_eb after the

num_pages = num_extent_pages(eb->start, eb->len);

and let it ride. Thanks,

Josef
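
For anyone following along, here is a rough userspace sketch of what that change amounts to. It is not the real write_one_eb() body, which isn't reproduced in this thread. Everything with a _sketch suffix is a hypothetical stand-in, the 4 KiB page size is an assumption, and SKETCH_BUG_ON() just wraps assert() in place of the kernel's BUG_ON(). The point is only to show why a loop that runs zero times would return an uninitialized ret, and how initializing ret plus the check on num_pages covers that case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4 KiB pages for this sketch; the kernel uses its own page-size macros. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Userspace stand-in for the kernel's BUG_ON(): abort if cond is true. */
#define SKETCH_BUG_ON(cond) assert(!(cond))

/*
 * Hypothetical stand-in for num_extent_pages(): how many pages the
 * byte range [start, start + len) touches. A zero-length range
 * yields zero pages.
 */
static unsigned long num_extent_pages_sketch(uint64_t start, uint64_t len)
{
	uint64_t last = (start + len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	uint64_t first = start >> SKETCH_PAGE_SHIFT;

	return (unsigned long)(last - first);
}

/* Hypothetical stand-in for writing out one page; always succeeds here. */
static int submit_page_sketch(unsigned long index)
{
	(void)index;
	return 0;
}

/* Sketch of the pattern only, not the real function. */
static int write_one_eb_sketch(uint64_t start, uint64_t len)
{
	unsigned long i, num_pages;
	int ret = 0;	/* initialized, so a zero-iteration loop cannot return garbage */

	num_pages = num_extent_pages_sketch(start, len);
	SKETCH_BUG_ON(!num_pages);	/* a zero-page buffer should never get here */

	for (i = 0; i < num_pages; i++) {
		ret = submit_page_sketch(i);	/* ret is only assigned inside the loop */
		if (ret)
			break;
	}
	return ret;
}
```

With these stand-ins, write_one_eb_sketch(0, 16384) walks four pages and returns 0, while a zero-length range would trip SKETCH_BUG_ON() instead of silently returning whatever happened to be in ret. Note that assert() compiles away under NDEBUG, which the kernel's BUG_ON() does not do.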