Re: [ceph-devel] kernel bug with v0.15

2009-09-24 Thread Sage Weil
Hi Brian, The actual deadlock was in btrfs, a conflict between the new space management code and the transaction ioctl that Ceph uses. A fix for that is posted to linux-btrfs: http://www.mail-archive.com/linux-bt...@vger.kernel.org/msg03073.html Still, that just means you'll get ENOSP

Re: [ceph-devel] kernel bug with v0.15

2009-09-22 Thread Sage Weil
On Tue, 22 Sep 2009, Brian Koebbe wrote: > Quick update. > > I've wiped the slate clean (brand new fs) and rerun the rsync... at about 104 > of 114GB, rsync hung (that node being the only client). > > Can still move around (cd, ls) the fs, with other clients... and the one > with the hanging rsyn

Re: [ceph-devel] kernel bug with v0.15

2009-09-22 Thread Brian Koebbe
Quick update. I've wiped the slate clean (brand new fs) and rerun the rsync... at about 104 of 114GB, rsync hung (that node being the only client). Can still move around (cd, ls) the fs, with other clients... and the one with the hanging rsync, but it looks like all reads are hanging. It seems li

Re: [ceph-devel] kernel bug with v0.15

2009-09-22 Thread Brian Koebbe
I'll give it a try. One other thing I should mention in case it sheds light: Initially, my rsync hung after about 10 minutes (the other bug I need to report on). I CTRL-C'd rsync... stopped mds,osds,mon - started mon,osds,mds and reran the rsync... then got that BUG. On Tue, Sep 22, 2009 at 1:3

Re: [ceph-devel] kernel bug with v0.15

2009-09-22 Thread Sage Weil
On Tue, 22 Sep 2009, Brian Koebbe wrote: > I'll give it a try. One other thing I should mention in case it sheds > light: Initially, my rsync hung after about 10 minutes (the other bug I > need to report on). I CTRL-C'd rsync... stopped mds,osds,mon - started > mon,osds,mds and reran the rsync..

Re: [ceph-devel] kernel bug with v0.15

2009-09-22 Thread Sage Weil
Hi Brian, Hmm, it's not immediately obvious what's wrong with that code, but it's only useful for debugging anyway, so I'll just rip it out. Can you let me know if this fixes it for you? sage Subject: [PATCH] kclient: kill out_qlen This is apparently buggy (not immediately obvious why) but i