Rob van der Heij writes:
> I'm about to program changes to critical files. How does
> one do that in a reliable way?
>
> On CMS one would do it like this:
> - create a new file out of the current one
> - rename current to backup with NOUPDIR option
> - rename new to current
> The NOUPDIR prevents update on disk, so the next rename
> effectively does both in one go. When writing the directory
> to disk CMS itself creates the new one, and does a final
> write to swap between the old and new one. If anytime during
> this process the light would go out, I would still have a
> consistent disk (either with or without the change).
>
> How do you do this with Linux? I suppose I should minimize
> the window by creating a new file and then doing the renames.
> But given the way dirty pages are written to disk, could I end
> up with a disk that has the new directory but not the new file?
> I don't think I can tell Linux to commit the change to disk,
> so should I do a sync before the renames and assume that the
> two renames shortly after each other will be written out in a
> single I/O operation?

No: "sync" and "committing to disk" are low-level things that
happen behind the scenes, and you don't need to worry about them
here. Linux (like any Unix) presents a file system namespace with
reference counting and atomic operations, so you can do this stuff
without any race conditions or windows at all. Behold; I have
nothing up my sleeves:

    ln foo foo.bak

Creates a new link to the same file that foo refers to: both
foo and foo.bak now name exactly the same underlying file.

    cp foo newfoo

Creates a copy, newfoo, which you then edit, change, modify and
do whatever you want to in order to prepare your new version.

    mv newfoo foo

Atomically replaces the directory entry foo: before the
command (specifically: before the system call "rename" that
mv does for you), opening "foo" refers to the old file; after
it, opening "foo" refers to the new file. At no time is there
a window where no file named "foo" exists, and at no time can
opening "foo" see both files or some mixture of the two.
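
Put together, the three steps can be tried out safely in a scratch directory. This is only a sketch: the directory from mktemp and the file contents are made up for illustration, and it assumes newfoo and foo live on the same filesystem, which is what lets mv use a plain rename.

```shell
#!/bin/sh
# Sketch of the three steps in a throwaway directory (illustrative names).
set -e
dir=$(mktemp -d)
cd "$dir"

printf 'old contents\n' > foo       # stand-in for the critical file

ln foo foo.bak                      # second name for the same underlying file
cp foo newfoo                       # independent copy to work on
printf 'new contents\n' > newfoo    # stand-in for whatever edits you make
mv newfoo foo                       # rename: foo now names the new file

cat foo                             # prints: new contents
cat foo.bak                         # prints: old contents
```

After the mv, the name newfoo is gone, foo holds the new version, and foo.bak still names the old one.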

Processes which already have the old foo open continue happily
onwards with the old underlying file: they have reference counts on
the file in the same way that the directory entries do and they can
modify the underlying file however they like.

That initial "ln foo foo.bak" you did also means you can access the old
file under the name "foo.bak". Note that creating that foo.bak link
is not a necessary part of letting those existing processes continue
to access the file. If you don't create that extra foo.bak link, those
processes still have a reference count on the underlying file and can
access/modify/map it. The only difference is that there is no longer
any name in the filesystem by which that file can be newly opened.
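
The reference-counting point is easy to see from a shell, which can play the part of a process that already has the old file open. A small sketch, again in a scratch directory with made-up contents, and deliberately without the foo.bak link:

```shell
#!/bin/sh
# Sketch: an already-open descriptor keeps the old file alive after the rename.
set -e
dir=$(mktemp -d)
cd "$dir"

printf 'old contents\n' > foo
exec 3< foo                         # this shell now holds the old file open

printf 'new contents\n' > newfoo
mv newfoo foo                       # the old file no longer has any name

old=$(cat <&3)                      # still readable via the open descriptor
new=$(cat foo)                      # a fresh open by name gets the new file
exec 3<&-                           # last reference dropped: old file is freed

echo "via open descriptor: $old"
echo "via name: $new"
```

The first echo shows the old contents and the second shows the new ones. Once descriptor 3 is closed, nothing refers to the old file any more and its space is released.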

However, creating that hard link is useful so that you still have a
name for the old file (in case your new one turns out not to have
been such a good idea after all). It also avoids making a full copy
just for the backup, which would have left you with three copies of
the data at one point (old one, backup of old one, new one). That
might be inconvenient with big files.
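
To check that foo.bak really is the same file rather than a copy, compare inode numbers. The sketch below also shows one way to back out the change, by renaming foo.bak over foo. That rollback step is my assumption about how you might use the backup name, not part of the recipe above, and the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: foo.bak shares an inode with the old foo; rollback is one more rename.
set -e
dir=$(mktemp -d)
cd "$dir"

printf 'old contents\n' > foo
ln foo foo.bak
ino_before=$(ls -i foo | awk '{print $1}')
ino_bak=$(ls -i foo.bak | awk '{print $1}')     # same number as ino_before

printf 'new contents\n' > newfoo
mv newfoo foo
ino_after=$(ls -i foo | awk '{print $1}')       # a different inode now

# Assumed rollback: put the old file back under the name foo.
mv foo.bak foo
ino_rolled=$(ls -i foo | awk '{print $1}')      # the original inode again

echo "before=$ino_before bak=$ino_bak after=$ino_after rolled=$ino_rolled"
```

Note that rolling back this way uses up the foo.bak name; make another link first if you want to keep it.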
--Malcolm
--
Malcolm Beattie <[EMAIL PROTECTED]>
Linux Technical Consultant
IBM EMEA Enterprise Server Group...
...from home, speaking only for myself