Re: zfs file names (inodes) without files (ENOENT)

2011-11-07 Thread Martin von Gagern
On 07.11.2011 22:01, David Brodbeck wrote:
 On Sun, Nov 6, 2011 at 3:32 AM, Martin von Gagern
 martin.vgag...@gmx.net wrote:
 Makes me wonder whether I'd be better off with either some OpenSolaris
 descendant (hoping that the problem only lies in the FreeBSD port of
 ZFS) or with Linux (and either btrfs or some more mature fs).
 
 Both of those solutions are guaranteed to contain a different set of
 bugs. ;)

No doubt about that, but I have the hope that bugs I don't encounter
within the first 48 hours of system usage might be less likely to cause
severe trouble in the long run, too.

 I've looked at both FreeBSD and OpenIndiana, and could give
 you some thoughts on what I see as their respective strong and weak
 points, if you're interested.

Thanks for the offer, but see below.

 It *would* be interesting to see what OpenIndiana made of that
 filesystem.  You could boot a LiveCD and then import the pool and see
 if you had the same issue.  If nothing else that might let you delete
 the problematic file entry.

Thought the same, and gave it a try. zpool claims there is no pool of
that name. zpool -f doesn't help. Looking at the device nodes, it
appears as though OI would only recognize 3 of my 4 HDDs, which seems
really strange, given the fact that they're all wired the same way. So I
didn't even get to looking at the dir in question. That, combined with
the fact that the OI boot process provides too little information for my
taste, and the additional fact that Backspace doesn't seem to work out
of the box, has led me to develop a dislike for the system upon initial
use. I don't fancy having to configure such basic things.

So I'm heading towards Linux now, as no solution to the FreeBSD ZFS
problems seems to be forthcoming. Will probably be running some tandem
of btrfs and Ext4 for now, until btrfs becomes more mature or space
requirements force me to drop one of those file systems.

Thanks for your input, David,
 Martin



signature.asc
Description: OpenPGP digital signature


Re: zfs file names (inodes) without files (ENOENT)

2011-11-06 Thread Martin von Gagern
On 05.11.2011 23:13, Martin von Gagern wrote:
 Is there any
 tool to check or rebuild the inode data structures of zfs? zpool scrub
 doesn't seem to fit the bill, as its manpage indicates a computation of
 file content checksums.

Ran a scrub anyway. No errors were reported, but the problem persists.

Makes me wonder whether I'd be better off with either some OpenSolaris
descendant (hoping that the problem only lies in the FreeBSD port of
ZFS) or with Linux (and either btrfs or some more mature fs).

Unless I can find out what's happening here and how to prevent it from
happening again in the future, that is. Short of paying for ECC RAM and
fancy top-grade HDDs. I still doubt it's hardware failure, and even if
it were I'd rather be able to fix any problems that occur than cling to
the vain hope that I could ever completely avoid them by spending money.

Martin





zfs file names (inodes) without files (ENOENT)

2011-11-05 Thread Martin von Gagern
Hi!


A. SUMMARY

Long story short: I have a file name on my zfs with no file behind it.
ls includes it in the directory listing, but stat-ing that file results
in an ENOENT error: No such file or directory.


B. HISTORY

So how did I come to this situation? I've recently had to kill the
sending side of an rsync, with the receiving side on FreeBSD. For
reasons yet unknown, the next run of rsync started deleting stuff it
shouldn't. Details on this are in PR 162318 [1], but quoting the most
important things:

Logging into the receiving FreeBSD as root, I found that large parts of
the user's home directory content had disappeared, even outside the
subdirectory used as the rsync destination!
- All the .* config files in the home directory were gone
- The .ssh directory was still present, but its content was gone as well
- Both the home dir and the .ssh subdir contained a file rsync.%stat,
  which should be the name of an extattr instead, used to implement the
  rsync --fake-super command line option.

[1] http://www.freebsd.org/cgi/query-pr.cgi?pr=162318


C. SYMPTOMS

I first assumed a problem in the binary rsync build for FreeBSD, but
devs on the above bug report favored RAM failure or an upstream source
code bug. So I gave it another try, and paid closer attention to the
error messages. Among them was the following:

 rsync: stat /home/name/backup/etc/ca-certificates failed: No such file or directory (2)

Strange thing is, this isn't specific to rsync at all: it can be
reproduced using simple command line tools like ls:

 # ls /home/name/backup/etc/ | grep ca-cert
 ca-certificates
 ca-certificates.conf
 ca-certificates.conf~
 # ls /home/name/backup/etc/ca-*
 ls: /home/name/backup/etc/ca-certificates: No such file or directory
 /home/name/backup/etc/ca-certificates.conf
 /home/name/backup/etc/ca-certificates.conf~

So as you see, the name is returned by readdir(3), where both ls for the
dir and the wildcard expansion find it. But anything that stat(2)s the
file will encounter an ENOENT error. zpool status says everything's
fine, so zfs isn't aware of any corruption.
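
The mismatch described above (a name returned by readdir(3) that stat(2) then refuses) can be searched for across a whole directory. The following is only a minimal sketch in Python, not a ZFS tool: the function name find_orphaned_names and its lstat parameter are made up for illustration, and a file deleted by another process between the listing and the lstat call would be reported as well.

```python
import errno
import os


def find_orphaned_names(directory, lstat=os.lstat):
    """Return names that the directory listing reports but lstat cannot find.

    The lstat parameter is a hypothetical hook so the check can be exercised
    without a damaged file system; by default the real os.lstat is used.
    """
    orphans = []
    for name in os.listdir(directory):      # names as delivered by readdir
        path = os.path.join(directory, name)
        try:
            lstat(path)                      # does not follow symlinks
        except OSError as exc:
            if exc.errno == errno.ENOENT:    # listed, yet "No such file"
                orphans.append(name)
            else:
                raise                        # other errors are not our case
    return sorted(orphans)
```

Calling find_orphaned_names("/home/name/backup/etc") on the affected system would presumably list ca-certificates; on a healthy directory the result is an empty list.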

I believe that no matter what errors user space programs might make, the
kernel zfs driver should never allow the above to happen. Either a file
is there or it isn't; there should be no such mixture. So what do you
think, is this likely to be a bug in the zfs implementation?

I found one other person describing problems like this: in threads
titled "file lose inode in Memory-Based file system.", lisen1001
described pretty much the same thing, except on ramdisk on 8.2 instead
of my own hdd-based raidz on 9.0-RC1 [2,3].

[2] http://thread.gmane.org/gmane.os.freebsd.questions/280183
[3] http://thread.gmane.org/gmane.os.freebsd.devel.file-systems/13153


D. NEXT STEPS

As I'm new to FreeBSD, I'm not yet sure how bug reports are handled
around here. As I said, I've filed a bug report against rsync, and it
has been closed on the grounds that this appears to be an upstream
problem. Would it make sense to include the above information in the bug
report for reference? Would replying to the gnats address be enough to
accomplish that? Should the bug be reopened, as I assume all my problems
to be related, and as the zfs corruption at least is specific to
FreeBSD? If so, how does one reopen a report? Or who can do that?

Do you agree that this looks like a problem in the ZFS implementation?
Should I file a new problem report for that?

Can you suggest any way I could resolve the corruption on my local ZFS
pool, short of destroying and recreating the whole file system? rm for
the file doesn't work, as it, too, encounters the ENOENT. Is there any
tool to check or rebuild the inode data structures of zfs? zpool scrub
doesn't seem to fit the bill, as its manpage indicates a computation of
file content checksums.


Greetings,
 Martin von Gagern





Re: zfs file names (inodes) without files (ENOENT)

2011-11-05 Thread Martin von Gagern
On 06.11.2011 00:27, David Brodbeck wrote:
 I'm curious if you've tried ls -B to see if there are any
 non-printable characters in the filename.

Hadn't tried yet, did try now, nothing strange there.
Nevertheless, thanks for the suggestion, David.

By the way, even ls -l of the whole directory will stat the file and
thus result in an error. There can hardly be any strange characters
involved there, as the name should be straight from readdir.
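
For anyone wanting to double-check the names independently of ls, the raw bytes of each directory entry can be inspected directly. This is a rough sketch under stated assumptions: the function name suspicious_names is invented here, and "suspicious" simply means any byte outside printable ASCII, so legitimate UTF-8 names would be flagged too.

```python
import os


def suspicious_names(directory):
    """Return the raw byte names that contain bytes outside printable ASCII."""
    found = []
    # Passing bytes to os.listdir makes it return the names as raw bytes,
    # so nothing is decoded or replaced before we look at it.
    for raw in os.listdir(os.fsencode(directory)):
        if any(b < 0x20 or b > 0x7E for b in raw):
            found.append(raw)
    return sorted(found)
```

Printing repr() of each returned name shows the offending bytes as escapes such as \x01. An empty result is consistent with the observation that nothing strange turned up.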

Martin


