On Thu, May 03, 2012 at 09:46:15AM -0600, Jim Schutt wrote:
> On 05/03/2012 08:53 AM, Josef Bacik wrote:
> >On Thu, May 03, 2012 at 08:43:32AM -0600, Jim Schutt wrote:
> >>On 05/01/2012 10:41 AM, Jim Schutt wrote:
> >>>On 05/01/2012 10:00 AM, Josef Bacik wrote:
> >>>>On Wed, Apr 11, 2012 at 02:24:30PM -0600, Jim Schutt wrote:
> >>>>>On 04/11/2012 01:09 PM, Josef Bacik wrote:
> >>>>>>On Tue, Apr 10, 2012 at 01:39:14PM -0600, Jim Schutt wrote:
> >>>>>>>Hi,
> >>>>>>>
> >>>>>>>I hit this BUG today.
> >>>>>>>
> >>>>>>>I'm running 3.3.1 merged with the ceph and btrfs bits for 3.4,
> >>>>>>>i.e. 3.3.1 +
> >>>>>>>commit bc3f116fec194 "Btrfs: update the checks for mixed block groups with big metadata blocks"
> >>>>>>>commit c666601a935b9 "rbd: move snap_rwsem to the device, rename to header_rwsem"
> >>>>>>>
> >>>>>>>The btrfs filesystem in question is backing a Ceph OSD under
> >>>>>>>a heavy write load.
> >>>>>>>
> >>>>>>>Here's the bug:
> >>>>>>>
> >>>>>>
> >>>>>>Can you give this a whirl and let me know how it goes? If I'm right you
> >>>>>>should see a warning pop up in your messages. Thanks,
> >>>>>
> >>>>>OK, I've got my test running with your patch applied
> >>>>>to my previous kernel.
> >>>>>
> >>>>>Do you expect your warning to only fire when my
> >>>>>previous kernel would have BUGged? I ask because I've
> >>>>>only seen the BUG once, so it may be a low-probability
> >>>>>occurrence.
> >>>>>
> >>>>>It seems like I should keep testing until I see either
> >>>>>your new warning or the BUG, right?
> >>>>
> >>>>Hey Jim,
> >>>>
> >>>>I just sent a patch to the list
> >>>>
> >>>>[PATCH] Btrfs: fix page leak when allocing extent buffers
> >>>>
> >>>>Could you try that and see if you can reproduce your problem?
> >>>
> >>>Taking it for a spin now...
> >>>
> >>
> >>Hit it again:
> >>
> >
> >Argh ok it's time to stop hopping around the problem and see what exactly the
> >state is when this happens so I know where to look.  Can you run with this
> >patch and give me the dmesg?  The important information will be above the
> >--- cut here --- line so make sure to grab that part.  Thanks,
> 
> Working on it...
> 
> BTW, when I recompiled, I noticed this warning:
> 
>   CC [M]  fs/btrfs/extent_io.o
> fs/btrfs/extent_io.c: In function ‘write_one_eb’:
> fs/btrfs/extent_io.c:3195: warning: ‘ret’ may be used uninitialized in this function
> 
> Is there ever any chance at all that write_one_eb() can be
> called by mistake for an eb with zero pages?  If so, could
> that be part of the problem?
> 

It shouldn't happen, but really neither should this bug, so go ahead and
initialize ret = 0 and put a BUG_ON(!num_pages); in write_one_eb after the

        num_pages = num_extent_pages(eb->start, eb->len);

and let it ride.  Thanks,

Josef
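
For readers following the thread, the suggested change can be sketched in user space roughly as below. This is only an illustration, not the actual fs/btrfs/extent_io.c code: it assumes write_one_eb() loops over the buffer's pages and returns the status of the last submission, which is why ret can be returned uninitialized when there are zero pages. The names write_one_eb_sketch and submit_page_sketch are hypothetical, and BUG_ON is mapped to assert() here as a stand-in for the kernel macro.

```c
#include <assert.h>

/* User-space stand-in for the kernel's BUG_ON(): trap if the condition holds. */
#define BUG_ON(cond) assert(!(cond))

/* Hypothetical per-page submission; always succeeds in this sketch. */
static int submit_page_sketch(unsigned long index)
{
	(void)index;
	return 0;
}

/*
 * Hypothetical analogue of write_one_eb(): submits each page in turn and
 * returns the first non-zero status, or 0 if every page was submitted.
 */
static int write_one_eb_sketch(unsigned long num_pages)
{
	unsigned long i;
	int ret = 0;	/* initialized, so a zero-iteration loop cannot return garbage */

	/* An extent buffer with no pages should never reach this point. */
	BUG_ON(!num_pages);

	for (i = 0; i < num_pages; i++) {
		ret = submit_page_sketch(i);
		if (ret)
			break;
	}
	return ret;
}
```

With ret initialized, the compiler warning goes away, and the BUG_ON would turn a zero-page call into an immediate, visible failure instead of an undefined return value.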