Everything that Mark says is true. I'll add that some shops optimize their read operations under certain conditions, and such optimizations would break if the RCS files were updated in place.

If the version of every file can be identified in advance (by version number, tag, or branch/timestamp pair), then those shops can invoke RCS directly to fetch file versions, read metadata, and so on. This sidesteps CVS's overhead and can increase performance by as much as 50%. Such operations also succeed without interfering with write operations to the repository, such as commits and the creation of new tags. Moving tags or using "cvs admin" may sometimes cause race conditions that produce incorrect results, but that depends on the nature of the changes being made at the time and on how the versions to be read have been identified.
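As a rough illustration of reading a known revision straight from RCS, here is a minimal Python sketch. It assumes the RCS "co" program is on the PATH; the helper names co_command and fetch_revision are made up for this example and are not part of CVS or RCS.

```python
import subprocess


def co_command(rcs_file, revision):
    """Build an RCS `co` command line that prints one revision to stdout.

    `revision` is whatever was identified in advance, for example a
    revision number such as "1.5" or a symbolic tag name.
    """
    # -q: quiet, -p: print to stdout instead of creating a working file,
    # -r<rev>: select the revision to check out.
    return ["co", "-q", "-p", "-r" + revision, rcs_file]


def fetch_revision(rcs_file, revision):
    """Run `co` and return the revision's contents as bytes.

    Hypothetical helper: it only reads the ,v file and never goes
    through CVS, which is the optimization described above.
    """
    result = subprocess.run(
        co_command(rcs_file, revision),
        check=True,
        stdout=subprocess.PIPE,
    )
    return result.stdout
```

Metadata could be read the same way with "rlog" in place of "co".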

Such an optimization works for three reasons: RCS writes the updated RCS file into the lock file rather than into the original, filesystem semantics always keep the complete RCS file intact while it's being read, and pre-existing data in the RCS file are not changed during write operations (except for the race conditions I've identified above, which can be avoided).
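The filesystem behaviour this relies on can be seen in a small Python sketch. It assumes a POSIX filesystem, where a reader that already has the file open keeps seeing the old contents after a rename replaces the name. The function name read_across_rename and the file contents are invented for the demonstration.

```python
import os
import tempfile


def read_across_rename(directory):
    """Show that an open reader still sees the old file after a rename."""
    rcs_path = os.path.join(directory, "foo.c,v")
    tmp_path = os.path.join(directory, ",foo.c,")

    with open(rcs_path, "wb") as f:
        f.write(b"old contents\n")

    # A reader opens the file and reads only part of it.
    reader = open(rcs_path, "rb")
    first = reader.read(4)

    # Meanwhile a writer builds the new version under the lock-file
    # name and renames it over the original.
    with open(tmp_path, "wb") as f:
        f.write(b"new contents\n")
    os.rename(tmp_path, rcs_path)

    # The reader finishes from the file it originally opened.
    rest = reader.read()
    reader.close()

    # A reader that opens the name now gets the new version.
    with open(rcs_path, "rb") as f:
        fresh = f.read()
    return first + rest, fresh


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(read_across_rename(d))
```

An in-place update would break exactly this property, because the reader and the writer would then share the same file data.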

On Mar 20, 2005, at 8:28 AM, [EMAIL PROTECTED] wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dr. David Alan Gilbert <[EMAIL PROTECTED]> writes:

So - here are my questions/ideas - I'd appreciate comments to tell
me whether I'm on the right lines:
  1) As I understand it the tag data is the
  first of the 3 main data structures in the RCS
  file (tag, comments, diffs) and that when I do
  pretty much any CVS operation I rewrite the
  whole file - is this correct?

CVS write operations on a foo.c,v repository file will write ,foo.c, and then, when the write operation completes without any errors, do a rename (",foo.c,", "foo.c,v"); to make the new version the official version. While the ,foo.c, file exists, RCS commands will consider the file locked.
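The write-then-rename sequence above can be sketched in Python as follows. This is only an illustration of the pattern, not CVS's actual code; the helper names lock_path_for and replace_rcs_file are hypothetical, and exclusive creation of the lock file is an assumption about how one would keep two writers apart.

```python
import os
import tempfile


def lock_path_for(rcs_path):
    """Map .../foo.c,v to .../,foo.c, (the lock-file name)."""
    directory, name = os.path.split(rcs_path)
    if not name.endswith(",v"):
        raise ValueError("not an RCS file name: %r" % name)
    return os.path.join(directory, "," + name[:-1])


def replace_rcs_file(rcs_path, new_bytes):
    """Write the new version to the lock file, then rename it into place."""
    lock_path = lock_path_for(rcs_path)
    # O_EXCL makes this fail with FileExistsError if another writer
    # already holds the lock file, so the file is treated as locked.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_bytes)
            f.flush()
            os.fsync(f.fileno())
        # Only a completely written file ever becomes the official version.
        os.rename(lock_path, rcs_path)
    except BaseException:
        # On any failure, drop our lock file and leave the old ,v file alone.
        try:
            os.unlink(lock_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "foo.c,v")
        with open(path, "wb") as f:
            f.write(b"old\n")
        replace_rcs_file(path, b"new\n")
        print(open(path, "rb").read())
```

If the write is interrupted, the original foo.c,v is untouched, which is the robustness that an in-place update gives up.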

It is desirable to use RCS write semantics, as many
other tools out there (e.g., ViewCVS) use RCS on the
repository and want to obey RCS locking.

  2) White space appears to be irrelevant in RCS
  files; so adding arbitrary amounts in between
  sections should leave files still fully
  compatible with existing RCS/cvs tools.

Tools such as CVSup by default will canonicalize the whitespace between sections (although this may be configured). So, yes, whitespace is mostly irrelevant between sections.

  3) So the idea is that when I add a tag I add
  a bunch of white space after the tag (let's say
  1KB of spaces split into 64 byte lines or
  similar); when I come to add the next tag I
  check if there is plenty of white space, if
  there is then instead of rewriting the file I
  just overwrite the white space with my new tag
  data; if there is no space then as I rewrite
  the file I add another lump of white space.

This has the potential to more easily corrupt the RCS file if the operation is interrupted for any reason.
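To make the proposal in 3) and its risk concrete, here is a toy Python sketch. It assumes a simplified layout in which the padding is a run of spaces and newlines sitting just before the ";" that ends the symbols list; the function name try_add_tag_in_place is hypothetical and none of this is how CVS actually writes tags.

```python
import os
import tempfile

KEYWORD = b"symbols"


def try_add_tag_in_place(path, tag, revision):
    """Overwrite padding with a new tag entry if there is enough room.

    Returns True when the entry fit into the existing padding, and
    False when the caller would have to rewrite the whole file.
    """
    entry = ("\n\t%s:%s" % (tag, revision)).encode("ascii")
    with open(path, "r+b") as f:
        data = f.read()
        start = data.find(KEYWORD)
        if start < 0:
            return False
        end = data.find(b";", start)
        if end < 0:
            return False
        # Walk backwards from the ';' over the whitespace padding.
        pad_start = end
        floor = start + len(KEYWORD)
        while pad_start > floor and data[pad_start - 1:pad_start] in (b" ", b"\n"):
            pad_start -= 1
        if end - pad_start < len(entry):
            return False
        # This write is not atomic: an interruption here can leave a
        # half-written entry inside the only copy of the file.
        f.seek(pad_start)
        f.write(entry)
        return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "foo.c,v")
        with open(path, "wb") as f:
            f.write(b"head\t1.2;\nsymbols\n\tREL_1:1.1" + b" " * 32 + b";\n")
        print(try_add_tag_in_place(path, "REL_2", "1.2"))
```

The file size never changes, which is where the speed-up would come from, but there is no point at which a complete old copy and a complete new copy both exist.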

  4) Whether dummy white space is added and how
  much is controlled by the existing size of the
  RCS file; so an RCS file that is only a few KB
  won't have any space added; that way this
  mechanism doesn't slow down/bloat small
  repositories. The amount of white space might
  be chosen to align data structures with disk
  block boundaries.

  5) My main concern is to do with
  concurrency/consistency requirements; is the
  file rewrite essential to ensure consistency,
  or is the locking that is carried out
  sufficient?

Does this make sense?

It would be more robust to enhance CVS to use an external database for tagging information instead of putting the tagging information into the RCS files directly than to rewrite parts of the RCS file and hope that the operation didn't corrupt the file along the way.

You may wish to consider looking at Meta-CVS as I
believe that Kaz keeps a lot of the branching
information outside of the RCS files already.

See http://users.footprints.net/~kaz/mcvs.html
for more details on Meta-CVS.

        Good luck,
        -- Mark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQFCPaS23x41pRYZE/gRAjULAJ9RzLHw+gUDoMCbF0zjgmStBJIT9gCfUU83
K/TZMZdXbJx+BWVFaXGS0Jk=
=fz6n
-----END PGP SIGNATURE-----


_______________________________________________ Info-cvs mailing list [email protected] http://lists.gnu.org/mailman/listinfo/info-cvs

--
Paul Sander | "When a true genius appears in the world, you may
[EMAIL PROTECTED] | know him by this sign: that all the dunces are in
| confederacy against him." -- Jonathan Swift, writer.




