Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-08 Thread Chandan Rajendra
On Wednesday, January 04, 2017 10:28:37 AM Theodore Ts'o wrote:
> On Wed, Jan 04, 2017 at 11:32:42AM +0530, Chandan Rajendra wrote:
> > On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
> > > I'm consistently seeing ext4 filesystem corruption using a mainline
> > > kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> > > cloud image, boot it in KVM and run:
> > > 
> > > sudo apt-get update
> > > sudo apt-get dist-upgrade
> > > sudo reboot
> > > 
> > > And it never makes it back up, dying with rather severe filesystem
> > > corruption.
> > 
> > The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
> > bug.
> 
> It looks like this patch is already queued up on the "for-linus"
> branch on the linux-block.git tree.
> 
> Chandan, thanks for pointing this out!  I had missed your e-mail from
> Christmas day, and it was on my todo list to figure out why I was
> seeing lots of 1k block regressions on gce-xfstests post-merge window
> that weren't showing up on the ext4.git tree before I sent my pull
> request to Linus.
> 
> Jens, could you expedite a pull request to Linus?  This is affecting
> ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
> regression.  
> 
> Anton or Chandan, could you do me a favor and verify whether or not
> 64k block sizes are working for you on ppcle on ext4 by running
> xfstests?  Light duty testing works for me but when I stress ext4 with
> pagesize==blocksize on ppcle64 via xfstests, it blows up.  I suspect
> (but am not sure) it's due to (non-upstream) device driver issues, and
> a verification that you can run xfstests on your ppcle64 systems using
> standard upstream device drivers would be very helpful, since I don't
> have easy console access on the machines I have access to at $WORK.  :-(

Hi Ted,

I found one regression w.r.t 64k blocksize. I posted a patch
(http://marc.info/?l=linux-block&m=148388687722745&w=2) to fix the issue.

-- 
chandan



Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-05 Thread Anton Blanchard
Hi Ted,

> Anton or Chandan, could you do me a favor and verify whether or not
> 64k block sizes are working for you on ppcle on ext4 by running
> xfstests?  Light duty testing works for me but when I stress ext4 with
> pagesize==blocksize on ppcle64 via xfstests, it blows up.  I suspect
> (but am not sure) it's due to (non-upstream) device driver issues, and
> a verification that you can run xfstests on your ppcle64 systems using
> standard upstream device drivers would be very helpful, since I don't
> have easy console access on the machines I have access to at
> $WORK.  :-(

I fired off an xfstests run, and it looks good. There are 3 failures,
but they seem to be setup issues on my part. I also double-checked that
those same three failed on 4.8.

Chandan has been running the test suite regularly, and plans to do a
run against mainline too.

Anton


Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-04 Thread Linus Torvalds
On Wed, Jan 4, 2017 at 8:23 AM, Jens Axboe wrote:
> On 01/04/2017 08:28 AM, Theodore Ts'o wrote:
>>
>> Jens, could you expedite a pull request to Linus?  This is affecting
>> ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
>> regression.
>
> Yes, it'll go out this morning.

It's merged and out there in my tree now.

 Linus


Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-04 Thread Jens Axboe
On 01/04/2017 08:28 AM, Theodore Ts'o wrote:
> On Wed, Jan 04, 2017 at 11:32:42AM +0530, Chandan Rajendra wrote:
>> On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
>>> I'm consistently seeing ext4 filesystem corruption using a mainline
>>> kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
>>> cloud image, boot it in KVM and run:
>>>
>>> sudo apt-get update
>>> sudo apt-get dist-upgrade
>>> sudo reboot
>>>
>>> And it never makes it back up, dying with rather severe filesystem
>>> corruption.
>>
>> The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
>> bug.
> 
> It looks like this patch is already queued up on the "for-linus"
> branch on the linux-block.git tree.
> 
> Chandan, thanks for pointing this out!  I had missed your e-mail from
> Christmas day, and it was on my todo list to figure out why I was
> seeing lots of 1k block regressions on gce-xfstests post-merge window
> that weren't showing up on the ext4.git tree before I sent my pull
> request to Linus.
> 
> Jens, could you expedite a pull request to Linus?  This is affecting
> ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
> regression.  

Yes, it'll go out this morning.

-- 
Jens Axboe



Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-04 Thread Theodore Ts'o
On Wed, Jan 04, 2017 at 11:32:42AM +0530, Chandan Rajendra wrote:
> On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
> > I'm consistently seeing ext4 filesystem corruption using a mainline
> > kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> > cloud image, boot it in KVM and run:
> > 
> > sudo apt-get update
> > sudo apt-get dist-upgrade
> > sudo reboot
> > 
> > And it never makes it back up, dying with rather severe filesystem
> > corruption.
> 
> The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
> bug.

It looks like this patch is already queued up on the "for-linus"
branch on the linux-block.git tree.

Chandan, thanks for pointing this out!  I had missed your e-mail from
Christmas day, and it was on my todo list to figure out why I was
seeing lots of 1k block regressions on gce-xfstests post-merge window
that weren't showing up on the ext4.git tree before I sent my pull
request to Linus.

Jens, could you expedite a pull request to Linus?  This is affecting
ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
regression.  

Anton or Chandan, could you do me a favor and verify whether or not
64k block sizes are working for you on ppcle on ext4 by running
xfstests?  Light duty testing works for me but when I stress ext4 with
pagesize==blocksize on ppcle64 via xfstests, it blows up.  I suspect
(but am not sure) it's due to (non-upstream) device driver issues, and
a verification that you can run xfstests on your ppcle64 systems using
standard upstream device drivers would be very helpful, since I don't
have easy console access on the machines I have access to at $WORK.  :-(

And of course, if there are still blocksize==pagesize issues on ext4
on ppc64le, it would be good to know that too.

Many thanks!!
- Ted

P.S.  And for those people who are doing storage work, let me put in a
plug for "gce-xfstests full".  It's cheap and finds lots of problems
before I and others have to.  And if the $1.50 USD is the problem, let
me know and I'll try to work something out.  :-) :-)
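
For the 64k-block verification Ted asks for above, an xfstests run is
usually driven by a local.config file in the xfstests checkout. The
fragment below is a minimal sketch under stated assumptions: the device
paths and mount points are placeholders invented here, not values from
the thread, and the 64k block size assumes a kernel with 64k pages (as
is typical for ppc64le distro kernels).

```sh
# local.config for xfstests (sketch; devices and mount points are placeholders)
export FSTYP=ext4
export TEST_DEV=/dev/vdb          # hypothetical; assumed already formatted with mkfs.ext4 -b 65536
export TEST_DIR=/mnt/test         # hypothetical mount point for TEST_DEV
export SCRATCH_DEV=/dev/vdc       # hypothetical scratch device, reformatted by the tests
export SCRATCH_MNT=/mnt/scratch   # hypothetical mount point for SCRATCH_DEV
export MKFS_OPTIONS="-b 65536"    # blocksize == pagesize on a 64k-page kernel
```

With a file like that in place, running "./check -g auto" from the
xfstests directory should exercise the auto group against those devices.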


Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-04 Thread Jens Axboe
On 01/03/2017 10:18 PM, Anton Blanchard wrote:
> Hi,
> 
> I'm consistently seeing ext4 filesystem corruption using a mainline
> kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> cloud image, boot it in KVM and run:
> 
> sudo apt-get update
> sudo apt-get dist-upgrade
> sudo reboot
> 
> And it never makes it back up, dying with rather severe filesystem
> corruption.
> 
> I've narrowed it down to:
> 
> 64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
> e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
> ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")
> 
> Backing these patches out fixes the issue.

Fix is going out today, I see Chandan already pointed you at it. For the
other reporter, it's not an LE vs BE thing, it's a fs blocksize < page
size problem.

-- 
Jens Axboe
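
Jens's diagnosis above ties the corruption to filesystems whose block
size is smaller than the page size. That would cover both 1k-block ext4
on x86_64 (4k pages) and default 4k-block ext4 on a ppc64le kernel with
64k pages. As a rough way to check whether a given mount falls in that
class, here is a minimal sketch, assuming GNU coreutils stat and a
getconf that knows PAGESIZE. The helper names blocks_per_page and
check_mount are made up for illustration and do not come from the
thread or the kernel.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): prints how many filesystem
# blocks fit in one page, given page size and block size in bytes.
blocks_per_page() {
    echo $(( $1 / $2 ))
}

# Hypothetical helper: reports whether the filesystem at a mount point
# (default /) has a block size smaller than the page size.
# Assumes GNU stat, where "-f -c %S" prints the fundamental block size.
check_mount() {
    mnt=${1:-/}
    page_size=$(getconf PAGESIZE)
    block_size=$(stat -f -c %S "$mnt")
    n=$(blocks_per_page "$page_size" "$block_size")
    if [ "$n" -gt 1 ]; then
        echo "$mnt: blocksize $block_size < pagesize $page_size ($n blocks per page)"
    else
        echo "$mnt: blocksize $block_size >= pagesize $page_size"
    fi
}
```

Calling "check_mount /" after sourcing the snippet prints one line for
the root filesystem; more than one block per page means the mount is in
the configuration Jens describes.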



Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-04 Thread luigi burdo
Hi,

It is present on non-LE ppc too.

I found it on Ubuntu MATE 16.10 PPC with kernel 4.9-rc6 (PPC64) on P5020/P5040.


Thanks

Luigi



From: Linuxppc-dev 
<linuxppc-dev-bounces+intermediadc=hotmail@lists.ozlabs.org> on behalf of 
Anton Blanchard <an...@samba.org>
Sent: Wednesday, January 4, 2017 06:18
To: j...@suse.cz; Michael Ellerman; Benjamin Herrenschmidt; Paul Mackerras; 
Stephen Rothwell; ax...@fb.com
Cc: linux-fsde...@vger.kernel.org; linux-e...@vger.kernel.org; 
linuxppc-dev@lists.ozlabs.org; linux-ker...@vger.kernel.org
Subject: ext4 filesystem corruption with 4.10-rc2 on ppc64le

Hi,

I'm consistently seeing ext4 filesystem corruption using a mainline
kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
cloud image, boot it in KVM and run:

sudo apt-get update
sudo apt-get dist-upgrade
sudo reboot

And it never makes it back up, dying with rather severe filesystem
corruption.

I've narrowed it down to:

64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")

Backing these patches out fixes the issue.

Anton


Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-03 Thread Chandan Rajendra
On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
> Hi,
> 
> I'm consistently seeing ext4 filesystem corruption using a mainline
> kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> cloud image, boot it in KVM and run:
> 
> sudo apt-get update
> sudo apt-get dist-upgrade
> sudo reboot
> 
> And it never makes it back up, dying with rather severe filesystem
> corruption.

Hi,

The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
bug.

> 
> I've narrowed it down to:
> 
> 64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
> e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
> ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")
> 
> Backing these patches out fixes the issue.
> 
> Anton

-- 
chandan



ext4 filesystem corruption with 4.10-rc2 on ppc64le

2017-01-03 Thread Anton Blanchard
Hi,

I'm consistently seeing ext4 filesystem corruption using a mainline
kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
cloud image, boot it in KVM and run:

sudo apt-get update
sudo apt-get dist-upgrade
sudo reboot

And it never makes it back up, dying with rather severe filesystem
corruption.

I've narrowed it down to:

64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")

Backing these patches out fixes the issue.

Anton