Re: ffs_newvnode: inode has non zero blocks

2016-11-08 Thread Jaromír Doleček
> | There are some further changes needed to cover a possible dup alloc,
> | and to keep the !wapbl case recoverable by fsck. There is ongoing
> | discussion on source-changes about that; we hope to finalise the fix
> | later in the week.
>
> Leaving a filesystem problem committed on head that can cause filesystem
> corruption for a week is not considerate to people who use current.

There shouldn't be anything causing filesystem corruption any more. I
will resolve this soon.

Jaromir


Re: ffs_newvnode: inode has non zero blocks

2016-11-08 Thread Christos Zoulas
On Nov 8, 11:51am, jaromir.dole...@gmail.com (Jaromír Doleček) wrote:
-- Subject: Re: ffs_newvnode: inode has non zero blocks

| Yes, that problem is related to the wapbl change. I've committed a bug
| fix, so newer kernels shouldn't trigger the panic any more.
| 
| There are some further changes needed to cover a possible dup alloc,
| and to keep the !wapbl case recoverable by fsck. There is ongoing
| discussion on source-changes about that; we hope to finalise the fix
| later in the week.

Leaving a filesystem problem committed on head that can cause filesystem
corruption for a week is not considerate to people who use current.

christos


Re: ffs_newvnode: inode has non zero blocks

2016-11-08 Thread Jaromír Doleček
Yes, that problem is related to the wapbl change. I've committed a bug
fix, so newer kernels shouldn't trigger the panic any more.

There are some further changes needed to cover a possible dup alloc,
and to keep the !wapbl case recoverable by fsck. There is ongoing
discussion on source-changes about that; we hope to finalise the fix
later in the week.

Jaromir

2016-11-07 23:07 GMT+01:00 Andreas Gustafsson :
> Earlier, I wrote:
>> Also, I have now narrowed down the appearance of the problem on the
>> testbed to the following commit:
>>
>>   2016.10.30.15.01.46 christos src/sys/ufs/ffs/ffs_alloc.c 1.154
>>
>> The mystery remains because the commit message says there should be no
>> functional change, and I also did a quick review of the diff and did
>> not spot anything that could be a cause of the panic.
>
> I have now also run a bisection of this on my own testbed, and it
> identified a different set of commits than the TNF testbed did:
>
>   2016.10.28.20.38.12 jdolecek src/sys/kern/vfs_wapbl.c 1.85
>   2016.10.28.20.38.12 jdolecek src/sys/sys/wapbl.h 1.19
>   2016.10.28.20.38.12 jdolecek src/sys/ufs/ffs/ffs_alloc.c 1.153
>   2016.10.28.20.38.12 jdolecek src/sys/ufs/ffs/ffs_inode.c 1.118
>   2016.10.28.20.38.12 jdolecek src/sys/ufs/ffs/ffs_snapshot.c 1.143
>   2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_extern.h 1.83
>   2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_inode.c 1.97
>   2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_rename.c 1.13
>   2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_vnops.c 1.233
>   2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_wapbl.h 1.12
>
> More data at:
>
>   http://releng.netbsd.org/b5reports/sparc/commits-2016.10.html#2016.10.30.15.01.46
>   http://www.gson.org/netbsd/bugs/build/sparc/commits-2016.10.html#2016.10.28.20.38.12
>
> --
> Andreas Gustafsson, g...@gson.org


Re: ffs_newvnode: inode has non zero blocks

2016-11-07 Thread Andreas Gustafsson
Jaromír Doleček wrote:
> There was recently a change in FFS in the general area of WAPBL.
> Can you try the attached patch and check whether the following
> KASSERT() triggers?

I'm afraid I took your request literally and tested a release build
with your patch and no other changes.  Since KASSERT does nothing
unless DIAGNOSTIC is also defined, and it is not defined by default,
your patch made no difference.
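
The effect described above can be illustrated with a small user-space
sketch. SKETCH_KASSERT is a hypothetical, simplified stand-in for the
kernel's KASSERT() and is not its real definition; all sketch_* names are
assumptions made for illustration only. Without DIAGNOSTIC the macro
expands to nothing, so the condition is never even evaluated.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Simplified stand-in for KASSERT(); NOT the kernel's actual definition.
 * The check only exists when DIAGNOSTIC is defined at compile time.
 */
#ifdef DIAGNOSTIC
static void
sketch_assert_fail(const char *expr, const char *file, int line)
{
	fprintf(stderr, "assertion \"%s\" failed: %s:%d\n", expr, file, line);
	abort();
}
#define SKETCH_KASSERT(e) \
	((e) ? (void)0 : sketch_assert_fail(#e, __FILE__, __LINE__))
#else
#define SKETCH_KASSERT(e) ((void)0)
#endif

/* Counts how many times the asserted condition was evaluated. */
int sketch_evaluations;

int
sketch_condition(int value)
{
	sketch_evaluations++;
	return value;
}

/* Returns 1 when built with -DDIAGNOSTIC, 0 otherwise. */
int
sketch_run(void)
{
	sketch_evaluations = 0;
	SKETCH_KASSERT(sketch_condition(1));
	return sketch_evaluations;
}
```

Building the same file with and without -DDIAGNOSTIC changes the return
value of sketch_run(), which is why a patch consisting only of an added
assertion has no effect on a build where DIAGNOSTIC is not defined.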

If you still need me to test a patch, please send one that includes
all necessary changes.
-- 
Andreas Gustafsson, g...@gson.org


Re: ffs_newvnode: inode has non zero blocks

2016-11-07 Thread Andreas Gustafsson
Earlier, I wrote:
> Also, I have now narrowed down the appearance of the problem on the
> testbed to the following commit:
> 
>   2016.10.30.15.01.46 christos src/sys/ufs/ffs/ffs_alloc.c 1.154
> 
> The mystery remains because the commit message says there should be no
> functional change, and I also did a quick review of the diff and did
> not spot anything that could be a cause of the panic.

I have now also run a bisection of this on my own testbed, and it
identified a different set of commits than the TNF testbed did:

  2016.10.28.20.38.12 jdolecek src/sys/kern/vfs_wapbl.c 1.85
  2016.10.28.20.38.12 jdolecek src/sys/sys/wapbl.h 1.19
  2016.10.28.20.38.12 jdolecek src/sys/ufs/ffs/ffs_alloc.c 1.153
  2016.10.28.20.38.12 jdolecek src/sys/ufs/ffs/ffs_inode.c 1.118
  2016.10.28.20.38.12 jdolecek src/sys/ufs/ffs/ffs_snapshot.c 1.143
  2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_extern.h 1.83
  2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_inode.c 1.97
  2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_rename.c 1.13
  2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_vnops.c 1.233
  2016.10.28.20.38.12 jdolecek src/sys/ufs/ufs/ufs_wapbl.h 1.12

More data at:

  http://releng.netbsd.org/b5reports/sparc/commits-2016.10.html#2016.10.30.15.01.46
  http://www.gson.org/netbsd/bugs/build/sparc/commits-2016.10.html#2016.10.28.20.38.12

-- 
Andreas Gustafsson, g...@gson.org


Re: ffs_newvnode: inode has non zero blocks

2016-11-06 Thread Andreas Gustafsson
Jaromír,

Earlier, I wrote:
> I can't do automated installs of sparc under qemu>0 because of
> https://bugs.launchpad.net/qemu/+bug/1399943,

After some discussion with the qemu maintainers, it looks like bug
1399943 has been fixed, so I should be able to test your patch after
all, as soon as the bisection that's now running on my test machine
has finished.
-- 
Andreas Gustafsson, g...@gson.org


Re: ffs_newvnode: inode has non zero blocks

2016-11-04 Thread Andreas Gustafsson
Jaromír Doleček wrote:
> There was recently a change in FFS in the general area of WAPBL.
> Can you try the attached patch and check whether the following
> KASSERT() triggers?

I'm afraid I don't have an easy way to test it.  The TNF testbed does
not have an easy way of testing patches, I can't do automated installs of
sparc under qemu>0 because of https://bugs.launchpad.net/qemu/+bug/1399943,
qemu0 is no longer in pkgsrc, and my physical SPARCs are in storage.

If your patch is safe to commit, the easiest way to get it tested would
probably be to commit it and let the testbed do its thing.

Also, I have now narrowed down the appearance of the problem on the
testbed to the following commit:

  2016.10.30.15.01.46 christos src/sys/ufs/ffs/ffs_alloc.c 1.154

The mystery remains because the commit message says there should be no
functional change, and I also did a quick review of the diff and did
not spot anything that could be a cause of the panic.
-- 
Andreas Gustafsson, g...@gson.org


Re: ffs_newvnode: inode has non zero blocks

2016-11-02 Thread Jaromír Doleček
There was recently a change in FFS in the general area of WAPBL.
Can you try the attached patch and check whether the following
KASSERT() triggers?

2016-11-02 18:39 GMT+01:00 Andreas Gustafsson :
> co...@sdf.org wrote:
>> I'm pretty 'abusive' to my machine. unsurprisingly, I've managed to 
>> accumulate a problem:
>>
>>   ffs_newvnode: ino=20681997 on /: gen 5ae8a721/5ae8a721 has non zero blocks 
>> 980 or size 0
>>   panic: ffs_newvnode: dirty filesystem?
>
> The TNF sparc testbed recently started panicking with a similar error in
> every test run:
>
>   sbin/resize_ffs/t_grow_swapped (445/663): 4 test cases
>   grow_16M_v0_65536: ffs_newvnode: ino=45826 on /: gen 65327e67/65327e67 
> has non zero blocks 180 or size 0
>   panic: ffs_newvnode: dirty filesystem?
>   cpu0: Begin traceback...
>   0x0(0xf04010b8, 0xf4538a50, 0xf04a3800, 0xf04a4400, 0xf04a45c0, 0x104) at 
> netbsd:panic+0x20
>   panic(0xf04010b8, 0xf03c39a0, 0x0, 0xb302, 0xf07578d4, 0xf047e000) at 
> netbsd:ffs_newvnode+0x444
>   ffs_newvnode(0xf073, 0xf0970328, 0x81a4, 0xf4538cb0, 0xf069cb28, 
> 0xf0984810) at netbsd:vcache_new+0x5c
>   vcache_new(0xf073, 0xf0970328, 0xf4538cb0, 0xf069cb28, 0xf4538b74, 0x0) 
> at netbsd:ufs_makeinode+0x14
>   ufs_makeinode(0xf4538cb0, 0xf0970328, 0xf096ef2c, 0xf4538dcc, 0xf4538de0, 
> 0xf0926460) at netbsd:ufs_create+0x30
>   ufs_create(0xf4538c3c, 0xfff8, 0x0, 0x0, 0xf096ef2c, 0xf0970328) at 
> netbsd:VOP_CREATE+0x28
>   VOP_CREATE(0xf0970328, 0xf4538dcc, 0xf4538de0, 0xf4538cb0, 0xf0002000, 
> 0xf0785150) at netbsd:vn_open+0x24c
>   vn_open(0x0, 0x602, 0x1a4, 0xf069cb28, 0xf0851000, 0xf4538db8) at 
> netbsd:do_open+0x90
>   do_open(0x0, 0x0, 0xf0785150, 0x602, 0x1a4, 0xf4538ec4) at 
> netbsd:do_sys_openat+0x60
>   do_sys_openat(0xf0aa05a0, 0xff9c, 0xeda08080, 0x601, 0x1a4, 0xf4538ec4) 
> at netbsd:sys_open+0x18
>   sys_open(0xf0aa05a0, 0xf4538f30, 0xf4538f28, 0xeda08080, 0x0, 0x169b04f) at 
> netbsd:syscall+0x248
>   syscall(0xc05, 0xf4538fb0, 0xedc06b58, 0x5, 0x4e, 0xf0aa05a0) at 
> netbsd:memfault_sun4m+0x3f4
>   cpu0: End traceback...
>
> More logs at:
>
>   http://releng.netbsd.org/b5reports/sparc/commits-2016.10.html#2016.10.30.19.33.49
>
> The strange thing is that this problem seems to have started soon
> after your report, not before it as I would expect if it were also the
> cause of your crash.  The filesystems involved are all newly created
> in each test run.
> --
> Andreas Gustafsson, g...@gson.org
Index: ffs_inode.c
===
RCS file: /cvsroot/src/sys/ufs/ffs/ffs_inode.c,v
retrieving revision 1.118
diff -u -r1.118 ffs_inode.c
--- ffs_inode.c 28 Oct 2016 20:38:12 -  1.118
+++ ffs_inode.c 2 Nov 2016 21:15:11 -
@@ -543,6 +543,7 @@
 	oip->i_size = length;
 	DIP_ASSIGN(oip, size, length);
 	DIP_ADD(oip, blocks, -blocksreleased);
+	KASSERT((DIP(oip, size) == 0) == (DIP(oip, blocks) == 0));
 	genfs_node_unlock(ovp);
 	oip->i_flag |= IN_CHANGE;
 	UFS_WAPBL_UPDATE(ovp, NULL, NULL, 0);
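
The condition asserted by the patch can be read as "size is zero exactly
when the block count is zero". A minimal user-space sketch of that
predicate follows; struct sketch_inode and the function name are
hypothetical stand-ins for the fields accessed through DIP(), not kernel
types, and the sketch does not claim the condition holds for every kind
of inode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the two inode fields the patch checks. */
struct sketch_inode {
	uint64_t size;		/* file length in bytes */
	uint64_t blocks;	/* blocks accounted to the inode */
};

/*
 * Same boolean shape as the KASSERT in the patch:
 * (size == 0) must equal (blocks == 0).
 */
bool
sketch_size_blocks_consistent(const struct sketch_inode *ip)
{
	return (ip->size == 0) == (ip->blocks == 0);
}
```

The state reported in the panic message ("non zero blocks 980 or size 0")
corresponds to size 0 with 980 blocks, for which this predicate is false.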


Re: ffs_newvnode: inode has non zero blocks

2016-11-02 Thread Andreas Gustafsson
co...@sdf.org wrote:
> The lengthy problem description is not very important, as it was fixed.
> I am just wondering about the setup of the tests, and whether there's a
> possibility of bad data being left over (reuse of image, etc.)

I don't think so; the image is not reused because the ATF framework
runs each test case in a clean work directory.
-- 
Andreas Gustafsson, g...@gson.org


Re: ffs_newvnode: inode has non zero blocks

2016-11-02 Thread coypu
On Wed, Nov 02, 2016 at 10:50:51PM +0200, Andreas Gustafsson wrote:
> co...@sdf.org wrote:
> > There was recently an issue with resize_ffs mishandling a non zero
> > filled expansion of FFSv2 (PR 51116). I wonder if this is similar.
> > 
> > i.e., does zeroing out the disk help?
> 
> If I read the code of the test case correctly, it involves growing a
> file system image stored in a file, causing the file to grow
> correspondingly.  Since the expansion does not even exist in the image
> > file before resize_ffs is run, there is no way for it to have
> non-zero content.
> -- 
> Andreas Gustafsson, g...@gson.org

The lengthy problem description is not very important, as it was fixed.
I am just wondering about the setup of the tests, and whether there's a
possibility of bad data being left over (reuse of image, etc.)


Re: ffs_newvnode: inode has non zero blocks

2016-11-02 Thread Andreas Gustafsson
co...@sdf.org wrote:
> There was recently an issue with resize_ffs mishandling a non zero
> filled expansion of FFSv2 (PR 51116). I wonder if this is similar.
> 
> i.e., does zeroing out the disk help?

If I read the code of the test case correctly, it involves growing a
file system image stored in a file, causing the file to grow
correspondingly.  Since the expansion does not even exist in the image
file before resize_ffs is run, there is no way for it to have
non-zero content.
-- 
Andreas Gustafsson, g...@gson.org


Re: ffs_newvnode: inode has non zero blocks

2016-11-02 Thread coypu
On Wed, Nov 02, 2016 at 07:39:01PM +0200, Andreas Gustafsson wrote:
> 
> The strange thing is that this problem seems to have started soon
> after your report, not before it as I would expect if it were also the
> cause of your crash.  The filesystems involved are all newly created
> in each test run.
> -- 
> Andreas Gustafsson, g...@gson.org

Hi,

There was recently an issue with resize_ffs mishandling a non zero
filled expansion of FFSv2 (PR 51116). I wonder if this is similar.

i.e., does zeroing out the disk help?


Re: ffs_newvnode: inode has non zero blocks

2016-11-02 Thread Andreas Gustafsson
co...@sdf.org wrote:
> I'm pretty 'abusive' to my machine. unsurprisingly, I've managed to 
> accumulate a problem:
> 
>   ffs_newvnode: ino=20681997 on /: gen 5ae8a721/5ae8a721 has non zero blocks 
> 980 or size 0
>   panic: ffs_newvnode: dirty filesystem?

The TNF sparc testbed recently started panicking with a similar error in
every test run:

  sbin/resize_ffs/t_grow_swapped (445/663): 4 test cases
  grow_16M_v0_65536: ffs_newvnode: ino=45826 on /: gen 65327e67/65327e67 has non zero blocks 180 or size 0
  panic: ffs_newvnode: dirty filesystem?
  cpu0: Begin traceback...
  0x0(0xf04010b8, 0xf4538a50, 0xf04a3800, 0xf04a4400, 0xf04a45c0, 0x104) at netbsd:panic+0x20
  panic(0xf04010b8, 0xf03c39a0, 0x0, 0xb302, 0xf07578d4, 0xf047e000) at netbsd:ffs_newvnode+0x444
  ffs_newvnode(0xf073, 0xf0970328, 0x81a4, 0xf4538cb0, 0xf069cb28, 0xf0984810) at netbsd:vcache_new+0x5c
  vcache_new(0xf073, 0xf0970328, 0xf4538cb0, 0xf069cb28, 0xf4538b74, 0x0) at netbsd:ufs_makeinode+0x14
  ufs_makeinode(0xf4538cb0, 0xf0970328, 0xf096ef2c, 0xf4538dcc, 0xf4538de0, 0xf0926460) at netbsd:ufs_create+0x30
  ufs_create(0xf4538c3c, 0xfff8, 0x0, 0x0, 0xf096ef2c, 0xf0970328) at netbsd:VOP_CREATE+0x28
  VOP_CREATE(0xf0970328, 0xf4538dcc, 0xf4538de0, 0xf4538cb0, 0xf0002000, 0xf0785150) at netbsd:vn_open+0x24c
  vn_open(0x0, 0x602, 0x1a4, 0xf069cb28, 0xf0851000, 0xf4538db8) at netbsd:do_open+0x90
  do_open(0x0, 0x0, 0xf0785150, 0x602, 0x1a4, 0xf4538ec4) at netbsd:do_sys_openat+0x60
  do_sys_openat(0xf0aa05a0, 0xff9c, 0xeda08080, 0x601, 0x1a4, 0xf4538ec4) at netbsd:sys_open+0x18
  sys_open(0xf0aa05a0, 0xf4538f30, 0xf4538f28, 0xeda08080, 0x0, 0x169b04f) at netbsd:syscall+0x248
  syscall(0xc05, 0xf4538fb0, 0xedc06b58, 0x5, 0x4e, 0xf0aa05a0) at netbsd:memfault_sun4m+0x3f4
  cpu0: End traceback...

More logs at:

  http://releng.netbsd.org/b5reports/sparc/commits-2016.10.html#2016.10.30.19.33.49

The strange thing is that this problem seems to have started soon
after your report, not before it as I would expect if it were also the
cause of your crash.  The filesystems involved are all newly created
in each test run.
-- 
Andreas Gustafsson, g...@gson.org


Re: ffs_newvnode: inode has non zero blocks

2016-10-30 Thread Michael van Elst
co...@sdf.org writes:

>I do appear to have some franken-filesystem, half FFSv2, half FFSv1.

>file system: /dev/rwd0a
>format  FFSv1
>endian  little-endian
>magic   11954   time    Sat Oct 29 21:16:21 2016
>superblock location 8192    id  [ 56f77746 7e87473b ]
>cylgrp  dynamic inodes  4.4BSD  sblock  FFSv2   fslevel 4
>...

That's FFSv1 using a superblock following the FFSv2 layout which
is pretty standard nowadays. The old superblock layout only exists
for compatibility.



-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: ffs_newvnode: inode has non zero blocks

2016-10-29 Thread coypu
On Sat, Oct 29, 2016 at 06:06:18PM +, Christos Zoulas wrote:
> In article <20161029175845.ga6...@sdf.org>,   wrote:
> >Hi,
> >
> >I'm pretty 'abusive' to my machine. unsurprisingly, I've managed to
> >accumulate a problem:
> >
> >  ffs_newvnode: ino=20681997 on /: gen 5ae8a721/5ae8a721 has non zero
> >blocks 980 or size 0
> >  panic: ffs_newvnode: dirty filesystem?
> >
> >It appears that fsck is not able to clear it.
> >
> >Do we have a tool for such circumstances?
> 
> This means that you are on ffsv1, which makes it weird. There is clri
> and fsdb for that.
> 
> christos

Cool, thanks!

I do appear to have some franken-filesystem, half FFSv2, half FFSv1.

file system: /dev/rwd0a
format  FFSv1
endian  little-endian
magic   11954   time    Sat Oct 29 21:16:21 2016
superblock location 8192    id  [ 56f77746 7e87473b ]
cylgrp  dynamic inodes  4.4BSD  sblock  FFSv2   fslevel 4
...


Re: ffs_newvnode: inode has non zero blocks

2016-10-29 Thread Christos Zoulas
In article <20161029175845.ga6...@sdf.org>,   wrote:
>Hi,
>
>I'm pretty 'abusive' to my machine. unsurprisingly, I've managed to
>accumulate a problem:
>
>  ffs_newvnode: ino=20681997 on /: gen 5ae8a721/5ae8a721 has non zero
>blocks 980 or size 0
>  panic: ffs_newvnode: dirty filesystem?
>
>It appears that fsck is not able to clear it.
>
>Do we have a tool for such circumstances?

This means that you are on ffsv1, which makes it weird. There is clri
and fsdb for that.

christos



ffs_newvnode: inode has non zero blocks

2016-10-29 Thread coypu
Hi,

I'm pretty 'abusive' to my machine. Unsurprisingly, I've managed to
accumulate a problem:

  ffs_newvnode: ino=20681997 on /: gen 5ae8a721/5ae8a721 has non zero blocks 980 or size 0
  panic: ffs_newvnode: dirty filesystem?

It appears that fsck is not able to clear it.

Do we have a tool for such circumstances?