Russ Allbery <[email protected]> wrote:
Chris, to check, are you currently using --enable-fast-restart or
--enable-bitmap-later?
Yes, both of them.
Please understand that neither of those options is recommended now,
whether you have DAFS enabled or not. I consider --enable-fast-restart in
particular to be dangerous and likely to cause or propagate file
corruption and would not feel comfortable ever running it in production.
I know that some people are using the existing implementation and taking
their chances, and if they're expert AFS administrators and know what
they're risking, that's fine, but, as I understand it, it's pretty much
equivalent to disabling fsck and journaling on your file systems after
crashes and just trusting that there won't be any damage or that, if there
is, you'll fsck when you notice it.
I have heard that, but I have never experienced any problems myself in many
years of running that way. In general, the way I see it, if the power
goes out, my server stays up a little longer on its UPS, but the
network dies immediately, so the AFS processes are not doing anything when
the power finally dies and the server goes down a few minutes later. (This
is, of course, assuming no actual server crashes, and luckily I haven't had
any of those.)
It's fine to not have it enabled by default, but I can't see why one would
remove the functionality from the source tree.
If you want to require a --yes-i-know-i-can-corrupt-data configure option,
that is also fine, but requiring source code patches sounds like a major
annoyance.
-----
I guess I don't understand the particulars of what could happen, but if one
is really worried about sending corrupt data, wouldn't the best thing be to
check the data as it is being sent, return errors at that point, and log
that something is wrong, rather than require an ENTIRE VOLUME to be
salvaged, leaving all of the files inaccessible for a potentially long
period of time? I assume that such a thing is not possible to do?
I mean, I occasionally see NTFS errors in the event log on Windows servers.
Windows doesn't take the disk offline and run a chkdsk for me to prevent
potential errors; it allows me to try to access other data. If that works,
there are no problems, and it denies access only to the specific files or
directories where there is corruption.
At the same time, I'd be happy to start doing more testing of the
various DAFS features, although I'm not quite sure what version I should
be using for testing,
If you want to test DAFS, you need to use a 1.5 series server or (coming
soon) a 1.6 release candidate.
Ah, excellent. I will wait for a 1.6 release candidate.
Will DAFS be enabled by default in 1.6? Or is that still being determined?
nor am I completely sure how to actually migrate an existing file server
to use DAFS or if there is a reverse path to downgrade if I encounter
problems.
Migration is documented in the bos_create(8) man page as one of the
examples. You can do the inverse procedure to downgrade, although of
course you'll also need to replace the server binaries with a version
compiled without demand-attach.
Ok, so http://docs.openafs.org/Reference/8/bos_create.html is the only
documentation on openafs.org on demand attach?
Ah, I see a http://docs.openafs.org/Reference/8/salvageserver.html as well.
Perhaps a generic dafs man page is in order to bring us non-developer types
up to speed on what DAFS is, what the benefits are, and how to use it
correctly?
<<CDC