On Sun, Mar 27, 2016 at 03:25:40AM -0000, DD Park wrote:
> Hello, I need your help. This bug seems to have been placed offline due to
> inactivity. It is still a problem; I have been moving things around
> to get a testing platform. I've been getting new hardware, and started
> another build process to get me to a point of testing. I plan on doing a
> little more testing before going into production based on the thought that
> this problem was fixed, but initial testing shows I'm still having some
> similar problems. I've built an 18TB ext4 file system on raid5, and I was
> crossing my fingers that it would be stable, but I'm seeing all kinds of
> corruptions and doing fsck early I see that the file system doesn't stay
> clean for long. I've built 3 systems so far. Two of them have gone into
> production and I've limited my ext4 to 16TB. I built another system with
> 18TB, and once I start copying large amounts of files onto the system, I
> start seeing some warning messages indicating some forms of corruption, and
> I stop the copy, run fsck, and I find I do not have a clean file system.
> I'm running ubuntu 14.04.04 LTS on this test system. I've got another near
> identical setup with ubuntu-14.04 and 16TB or less, and it works fine (this was
> the original system where I saw my corruption. After downsizing, I'm good).
> I've got another with 18TB, but split into a 16TB partition and 2TB
> partition, on a ubuntu-15.04 system and that is working fine. I go back to
> a hybrid system I built to do this test. It is running ubuntu14.04.04 and
> I built this one with 18TB. This was an older file server that did not have
> problems, which I decommissioned recently so I could do this testing. I
> started my burn in tests and started seeing corruption of the file system.
> As expected the only thing I can determine is that it doesn't seem to like
> >16TB. Please let me know how I can help get this debugged.

So the original problem was about fsck crashing with a seg fault.

That's different from it finding corruptions.  So the first question
is what exactly are you seeing?  Corruptions?   Fsck crashing?   Both?

The next question is are you using resize2fs or not?  There are known
problems with using resize2fs with large partitions, especially if you
aren't using the very latest version of e2fsprogs.  In general on-line
resizing is going to be much safer than off-line resizing (the bugs
were in resize2fs's off-line resizing code).
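
For context on why 16TB keeps coming up as a boundary: with 4 KiB
blocks, 32-bit block numbers can address at most 2^32 blocks, which is
16 TiB, so a larger ext4 file system has to use the 64bit feature.
Below is a minimal sketch of that arithmetic. It assumes a 4 KiB block
size, and the helper name needs_64bit is made up for illustration; it
is not part of e2fsprogs.

```python
# Sketch of the 32-bit block-number limit for ext4.
# Assumption: 4 KiB blocks (the usual default); other block sizes move the limit.
BLOCK_SIZE = 4096
MAX_BLOCKS_32BIT = 2 ** 32   # number of blocks addressable with 32-bit block numbers
TIB = 2 ** 40                # bytes in one tebibyte

# Largest file system size addressable without the 64bit feature.
limit_bytes = MAX_BLOCKS_32BIT * BLOCK_SIZE


def needs_64bit(fs_bytes, block_size=BLOCK_SIZE):
    """Hypothetical helper: True if fs_bytes needs more than 2^32 blocks."""
    blocks = -(-fs_bytes // block_size)  # ceiling division
    return blocks > MAX_BLOCKS_32BIT


if __name__ == "__main__":
    print("limit in TiB:", limit_bytes // TIB)
    # Sizes below are decimal terabytes, as drive vendors count them.
    for tb in (16, 18, 24):
        print(tb, "TB needs 64bit:", needs_64bit(tb * 10 ** 12))
```

An 18TB array lands past that limit even when counted in decimal
terabytes, so the sizes reported as failing are all ones that exercise
the 64-bit code paths.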

If you are seeing it crash, the best thing to do is to get the very
latest version of e2fsprogs, and build it, and then run it from there,
so we can get a stack trace with line numbers.  Since I'm about to
release 1.43, ideally you would do this with both 1.42.13 as well as
the tip of the e2fsprogs git's "master" branch.

(Sorry, I don't provide support for distro versions of the kernel and
e2fsprogs.  If you want that, you need to pay $$$ to Canonical and get
their enterprise support product offering.)

                                                        - Ted

P.S.  Also, to be clear, you are using software raid?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1345682

Title:
  fsck on 24TB ext4 keeps crashing

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/e2fsprogs/+bug/1345682/+subscriptions
