On Thu, Jul 08, 2010 at 01:26:39PM +0200, Benny Löfgren wrote:
> Hi,
>
> Sorry about the tab mangling. How should I best send in diffs and
> communicate with developers in the future? (Tried to find something
> about best practice in the usual places, but failed so I winged it
> this time.)
Inline unified diffs using a mail agent that does not mangle tabs...
>
> Tested your diff, afaict it works fine. Thanks!
>
> Regarding your comment about large filesystems - yes I usually do
> this as well. (Although the time to check is still negligible
> compared to the time it takes to reconstruct parity on a 10 TB
> RAIDframe partition - we're talking days... :-) ) A more serious
> problem is of course that fsck_ffs might run out of address space,
> but since I'm only running amd64 these days, that problem has been
> postponed to some distant future. :-)
Recent newfs warns if its estimate of the memory needed to fsck the
filesystem is larger than MAXDSIZ or the amount of physical memory.
>
> However, when I tried to newfs an even larger partition (8.2 TB)
> than the one I used for this bug (which, by the way, wasn't >4 TB
> but 3 TB, so the bug heading should probably have stated >2 TB
> instead. Sorry about that!), I managed to combine fragment and block
> sizes in such a way that I got a panic the first time I was creating
> a directory on the new file system.
>
> I'll try to reproduce that problem and send a PR about that too. I
> probably did something stupid, but either newfs should warn about
> that or there is actually a bug in the newfs or ffs2 code, in which
> case it needs to be more resilient.
Yes, please try to gather more info on this. You might be creating an
fs with more than 2^32 inodes. newfs should catch that.
>
> Still have some time to play with this particular system until it is
> supposed to go into production, so I'll try to weed out as many
> problems with very large arrays/partitions/file systems/files, if
> any, that I can.
In the meantime I also confirmed this on a smaller system (2T) but
with 4k blocks and 512 bytes fragments.
I suspect the hang you mentioned, which I also saw on an unpatched
system, is caused by the kernel trying to coredump the giant process,
which hangs the machine. This is of course a completely different bug.
BTW, keep Cc:'ing bugs@, more people are interested in this.
-Otto