On 01/25/2014 04:47 PM, Dan Merillat wrote:
I'm trying to track this down - this started happening without changing
the kernel in use, so it's probably a corrupted filesystem. The symptoms
are that all memory is suddenly used by no apparent source. The OOM killer
is invoked on every task, but still can't free up enough memory to continue.
When it goes wrong, it's extremely rapid - system goes from stable to dead in
less than 30 seconds.
Tested on 3.9.0, 3.12.0, and 3.12.8. Limited testing on 3.13 shows what I
think is the same problem, but I need to double-check that it's not a
different issue. It blows up the exact same way on a real kernel or in UML.
All sorts of things can trigger it: defrag, random writes to files. Balance
and scrub don't, and a read-only mount doesn't either.
I can reproduce this trivially: mount the filesystem read-write and perform
some activity. It only takes a few minutes. The other btrfs filesystems on
the same machine don't show similar problems.
Unfortunately, the output of btrfs-image -c9 is 75 GB, much more than I can
reasonably share. I've got a reliable reproducer in UML, using UML-COW to
always start with the same base image: defrag a file with 33,000 extents
and the system explodes within a minute.
Here's the OOM report, the formatting is a bit off due to being delivered via
netconsole.
Swap was disabled on this run, but it makes no difference. I get insta-OOM
issues out of the blue with very little memory swapped out.
Don't defrag right now; the snapshot-aware defrag is horribly broken and
will OOM the box. Thanks,
Josef