Re: lots of "no such file or directory" errors in zfs filesystem

2021-02-23 Thread Chris Anderson
On Tue, Feb 23, 2021 at 4:53 AM Andriy Gapon  wrote:

> On 23/02/2021 05:25, Chris Anderson wrote:
> > so I can't ls -i the file since that triggers the no such file warning.
> > if I run zdb - on the inode of a directory which contains one of those
> > missing files, I can get the inode of the file from that, but I don't get
> > anything particularly interesting in the output.
> >
> > most of the files that are missing are in directories with a large number
> > of files (the largest has 180k) but I managed to find a directory which
> > had a single file entry that is missing:
> >
> > Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects,
> > rootbp DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU
> > objset] fletcher4 uncompressed LE contiguous unique double size=800L/800P
> > birth=46916371L/46916371P fill=908537
> > cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09
> >
> >
> > Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
> >  38268    1   128K     1K      0     512     1K  100.00  ZFS directory
> >                                      264   bonus  ZFS znode
> > dnode flags: USED_BYTES USERUSED_ACCOUNTED
> > dnode maxblkid: 0
> > uid     1001
> > gid     1001
> > atime   Sun Aug  6 02:00:41 2017
> > mtime   Wed Apr 15 12:12:42 2020
> > ctime   Wed Apr 15 12:12:42 2020
> > crtime  Sat Aug  5 15:10:07 2017
> > gen     23881023
> > mode    40755
> > size    3
> > parent  38176
> > links   2
> > pflags  4080144
> > xattr   0
> > rdev    0x
> > microzap: 1024 bytes, 1 entries
> >
> >     hash_test.go = 38274 (type: Regular File)
> >
> >
> > # zdb - tank/home/cva 38274
> >
> > Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects,
> > rootbp DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU
> > objset] fletcher4 uncompressed LE contiguous unique double size=800L/800P
> > birth=46916371L/46916371P fill=908537
> > cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09
> >
> >
> > Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
> >
> > zdb: dmu_bonus_hold(38274) failed, errno 2
>
> So, this looks like a "simple" problem.
> Unfortunately, it is very hard to tell retrospectively what bug caused it.
> The directory has an entry for the file, but the file does not actually
> exist
> (or has a different ID).
> This is a logical inconsistency, not a data integrity issue.
> So, a scrub, being a data integrity check, would not detect such an issue.
> Hypothetical zfs_fsck is needed to find and repair such logical problems.
>

ah, I see. that makes sense.


> Does that pool and filesystem have any special history?
> I mean upgrades, replication via send/recv, moving between OS-s, etc.
>

nope, it led a pretty boring life. that zfs filesystem was created on that
server and has been on the same two mirrored disks for its lifetime. it has
had freebsd upgrades applied as they became available. zfs upgrades were
for the most part avoided until quite recently (though the problem existed
prior to the upgrades). the server does have a relatively modest amount of
ram (2GB). dunno if that makes it more likely that these kinds of issues
get triggered.
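
[Editor's sketch, not part of the original message.] The symptom discussed in this thread is a name that still appears in its directory while the object it points to cannot be found. Assuming that is how affected entries behave (listed by the directory, but lstat() fails with ENOENT), a minimal Python sketch for enumerating them could look like the following. The function name find_dangling_entries is hypothetical, and this only reports entries; it repairs nothing.

```python
import errno
import os
import stat


def find_dangling_entries(root):
    """Return paths that a directory lists but lstat() rejects with ENOENT.

    Assumption: an affected entry is still returned by the directory
    listing, while lstat() on it fails with "No such file or directory".
    """
    dangling = []
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            names = os.listdir(current)
        except OSError:
            # Unreadable directory: skip it rather than abort the scan.
            continue
        for name in names:
            path = os.path.join(current, name)
            try:
                st = os.lstat(path)
            except OSError as exc:
                if exc.errno == errno.ENOENT:
                    dangling.append(path)
                continue
            # lstat() does not follow symlinks, so only real
            # directories are descended into.
            if stat.S_ISDIR(st.st_mode):
                stack.append(path)
    return dangling
```

Running it periodically against the same tree and comparing the results would be one way to track whether the number of affected files keeps growing. A file deleted between the listing and the lstat() call would also be reported, so the scan is best run on a quiet filesystem.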
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: lots of "no such file or directory" errors in zfs filesystem

2021-02-22 Thread Chris Anderson
On Mon, Feb 22, 2021 at 9:13 AM Andriy Gapon  wrote:

> On 22/02/2021 16:20, Chris Anderson wrote:
> > On Mon, Feb 22, 2021 at 1:36 AM Andriy Gapon <a...@freebsd.org> wrote:
> > > On 22/02/2021 09:31, Chris Anderson wrote:
> > > > None of these files are especially important to me, however I was
> > > > wondering if there would be any benefit to the community from trying
> > > > to debug this issue further to understand what might be going wrong.
> > >
> > > Yes.
> >
> > Could you offer any guidance about what kind of debugging information I
> > could collect that would be of use?
>
> You can start with picking a single file that demonstrates the problem.
> Then,
> ls -li the-file
> zdb - file's-filesystem file's-inode-number
> The filesystem can be found out from df output, the inode number is in
> ls -li output -- if the command prints anything at all.
> If it does not, then do ls -lid on the file's directory and then zdb -
> for the directory's inode number.  In the output there should be the file
> name and its number (I think that it's in hex, but not sure).
>

so I can't ls -i the file since that triggers the no such file warning. if
I run zdb - on the inode of a directory which contains one of those
missing files, I can get the inode of the file from that, but I don't get
anything particularly interesting in the output.

most of the files that are missing are in directories with a large number
of files (the largest has 180k) but I managed to find a directory which had
a single file entry that is missing:

Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects,
rootbp DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU
objset] fletcher4 uncompressed LE contiguous unique double size=800L/800P
birth=46916371L/46916371P fill=908537
cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09


Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
 38268    1   128K     1K      0     512     1K  100.00  ZFS directory
                                     264   bonus  ZFS znode
dnode flags: USED_BYTES USERUSED_ACCOUNTED
dnode maxblkid: 0
uid     1001
gid     1001
atime   Sun Aug  6 02:00:41 2017
mtime   Wed Apr 15 12:12:42 2020
ctime   Wed Apr 15 12:12:42 2020
crtime  Sat Aug  5 15:10:07 2017
gen     23881023
mode    40755
size    3
parent  38176
links   2
pflags  4080144
xattr   0
rdev    0x
microzap: 1024 bytes, 1 entries

    hash_test.go = 38274 (type: Regular File)


# zdb - tank/home/cva 38274

Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects,
rootbp DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU
objset] fletcher4 uncompressed LE contiguous unique double size=800L/800P
birth=46916371L/46916371P fill=908537
cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09


Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type

zdb: dmu_bonus_hold(38274) failed, errno 2
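
[Editor's sketch, not part of the original message.] For directories with many entries, the "name = number (type: ...)" lines in a saved zdb directory dump can be pulled out with a small script. This is a minimal Python sketch that assumes the entry lines look exactly like the hash_test.go line above and that the object number is printed in decimal, as it appears there. The function name parse_directory_entries is hypothetical.

```python
import re

# Matches entry lines such as:
#   hash_test.go = 38274 (type: Regular File)
# Assumption: the object number is decimal, as in the output above.
ENTRY_RE = re.compile(
    r"^\s*(?P<name>.+) = (?P<obj>\d+) \(type: (?P<type>[^)]+)\)\s*$"
)


def parse_directory_entries(zdb_output):
    """Map each entry name to (object number, type) from zdb text output."""
    entries = {}
    for line in zdb_output.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries[match.group("name")] = (
                int(match.group("obj")),
                match.group("type"),
            )
    return entries
```

Each object number found this way can then be passed back to zdb, as in the second command above, to see which ones fail with errno 2.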


>
> --
> Andriy Gapon
>


Re: lots of "no such file or directory" errors in zfs filesystem

2021-02-22 Thread Chris Anderson
On Mon, Feb 22, 2021 at 1:36 AM Andriy Gapon  wrote:

> On 22/02/2021 09:31, Chris Anderson wrote:
> > None of these files are especially important to me, however I was
> > wondering if there would be any benefit to the community from trying to
> > debug this issue further to understand what might be going wrong.
>
> Yes.
>

Could you offer any guidance about what kind of debugging information I
could collect that would be of use?


>
> --
> Andriy Gapon


lots of "no such file or directory" errors in zfs filesystem

2021-02-21 Thread Chris Anderson
I'm in the process of decommissioning an old zfs based file server and I
noticed around a dozen files whose directory entries fail with
"No such file or directory" when I try to read them.

I can't remember what the original version of freebsd installed was, but
it's been in production for at least 7 years and has been upgraded with
freebsd-update as new FreeBSD releases became available (it is currently on
12.2-RELEASE-p3).

The behavior is perplexing since I've never had any scrub failures on the
pool those files reside in, yet from looking at old security run outputs,
the number of files in that state has increased over time.

None of these files are especially important to me, however I was wondering
if there would be any benefit to the community from trying to debug this
issue further to understand what might be going wrong.


libopie problems after upgrade to 10.2

2015-08-15 Thread Chris Anderson
just upgraded from 10.1-RELEASE-p16 to 10.2-RELEASE using freebsd-update.

after the upgrade, I began getting errors because pam_opie.so.5 has an
unsatisfied link to libopie.so.7 (my system only has libopie.so.8).

I notice a fresh install of 10.2-RELEASE does indeed contain libopie.so.7,
so I'm curious how I managed to get into this state in the first place and
whether it is anything I should worry about. This machine has only been
upgraded using freebsd-update and I'm pretty sure it started from
10.0-RELEASE.

I have temporarily worked around it with an entry in libmap.

Here are the files involved:

# ls -l /usr/lib/pam_opie*
lrwxr-xr-x  1 root  wheel    13 Sep 27  2013 /usr/lib/pam_opie.so -> pam_opie.so.5
-r--r--r--  1 root  wheel  7000 Aug 14 11:56 /usr/lib/pam_opie.so.5
lrwxr-xr-x  1 root  wheel    19 Sep 27  2013 /usr/lib/pam_opieaccess.so -> pam_opieaccess.so.5
-r--r--r--  1 root  wheel  5568 Aug 14 11:56 /usr/lib/pam_opieaccess.so.5

# ls -l /usr/lib/libopie*
-r--r--r--  1 root  wheel  84582 Aug 14 11:57 /usr/lib/libopie.a
lrwxr-xr-x  1 root  wheel     12 Sep 29  2014 /usr/lib/libopie.so -> libopie.so.8
-r--r--r--  1 root  wheel  38280 Oct  5  2014 /usr/lib/libopie.so.8
-r--r--r--  1 root  wheel  88048 Aug 14 11:57 /usr/lib/libopie_p.a