Hello,

I'm working on adding symlink support to Fossil, mostly because it is
very important for my Mac development: I put frameworks into the
repository, and a valid framework on Mac contains at least 3 symlinks.
Plus, there were requests for this feature in the past. Currently
Fossil just follows links and adds the target files to the repository,
which results in duplicates.

I'm posting this to collect your thoughts and opinions, discuss
improvements, and, possibly, get Richard's blessing :) 

Please let me know if I missed something.


FILE FORMAT
-----------

Symlink support doesn't modify Fossil's built-for-centuries file
format. According to the Fossil file format docs for F-cards in a
manifest:

  "The optional 3rd argument defines any special access permissions
  associated with the file. The only special code currently defined is
  "x" which means that the file is executable. All files are always
  readable and writable. This can be expressed by "w" permission if
  desired but is optional. The file format might be extended with new
  permission letters in the future."

I use this opportunity to add symlink support: to indicate that a file
is a link, Fossil just adds "l" to the access permissions (Mercurial
and Git do the same, except that they use octal format for
permissions).

Example manifest entry:

F symlink.txt 53be9689d6ff8975e12c28b769da34d459f0965c l

According to readlink man page:

  "Unlike other filesystem objects, symbolic links may not have an
  owner, group, access mode, times, etc. Instead, these attributes may
  be taken from the directory that contains the link."

So other permission flags are not required if we have "l".
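
As an illustration only (this is not code from the branch), here is a
minimal sketch of how a manifest reader could detect the flag. The
function name fcard_is_symlink() is hypothetical, and the sketch
assumes the F-card fields are separated by single spaces, with the
permission letters in the optional third argument:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not Fossil's own manifest parser.
** Return 1 if the F-card in zLine carries the "l" permission letter
** in its optional third argument, or 0 otherwise. */
int fcard_is_symlink(const char *zLine){
  const char *z;
  int i;
  assert( zLine!=0 );
  if( zLine[0]!='F' || zLine[1]!=' ' ) return 0;  /* not an F-card */
  z = zLine + 2;                   /* start of the filename field */
  for(i=0; i<2; i++){              /* skip the filename, then the hash */
    z = strchr(z, ' ');
    if( z==0 ) return 0;           /* no permission field present */
    z++;
  }
  /* z now points at the permission letters; scan to the field's end */
  while( *z && *z!=' ' && *z!='\n' ){
    if( *z=='l' ) return 1;
    z++;
  }
  return 0;
}
```

For the example entry above this returns 1; for an entry with only
"x", or with no third argument at all, it returns 0.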

Inside the repo, a symlink artifact is represented as a simple text
file containing the link destination. If symlink.txt points to
"originals/original.txt", that path is the content of the artifact.


SETTINGS FLAG
-------------

To avoid confusing current users, to maintain compatibility, and for
those users who don't want symlinks inside their repos, there's a new
global and per-repository option:

allow-symlinks   If enabled, don't follow symlinks, and instead treat
                 them as symlinks on Unix. Has no effect on Windows
                 (existing links in repository created on Unix become 
                 plain-text files with link destination path inside).
                 Default: off

By default, it's off, so you have to explicitly turn on this option
(except for some import cases, read on).


WINDOWS
-------

Symlink support is, obviously, available only on Unix-like OSes, so
here's how Fossil handles symlinks on Windows:

  * "allow-symlinks" is treated as always off, even if it's on.
   
  * If you check out a repo with symlinks, Fossil will create
    plain-text files with the original symlink destination inside.
    Unless you modify such a file, the "isLink" flag is preserved
    between commits. This allows almost smooth cross-platform
    development as long as Windows users don't touch symlinks.

I heard that Windows 7 supports symlinks, but I don't have this
version (I test on XP). So there's a possibility that some Windows
developer could add full symlink support to Fossil on Windows 7.

Another way would be to add some kind of "fossil symlink" command to
allow creating and modifying symlinks on Windows, but that may be
overkill (I haven't done this).


MERGING AND DIFFS
-----------------

Symlinks are merged the same way as binary files, that is, not at
all:

  ***** Cannot merge symlink symlink.txt.
  
Diffs between symlinks are shown like diffs between text files, e.g.
if your link pointed to "originals/original.txt", but you changed it
to point to "originals/new.txt", the diff would be:

  - originals/original.txt
  + originals/new.txt

When you try to diff a link against a regular file, Fossil will show
an error:

  cannot compute difference between symlink and regular file 
    

CODE/SCHEMA CHANGES
-------------------

(you can skip this part if you're not a/the Fossil developer :)

The whole diff between trunk and the symlink branch is not that big,
but it touches a lot of places: most of the commands that handle files
have to be modified to allow symlinks. Here's a summary:

DB schema changes:

vfile:
   + islink BOOLEAN
stash:
   + isLink BOOLEAN  (I followed the "isExec" flag format)
%s.undo:
   + islink BOOLEAN
   
Repository changes are handled by rebuild, but there's no need to
rebuild if you don't intend to use symlinks. Stash and undo tables
are modified on-the-fly, no actions required.

file.c now contains a file_isfile_or_link() function, and uses it
everywhere file_isfile() was used when handling files inside the
working directory (I initially modified file_isfile() to return true
for symlinks, but it turned out to be a bad idea because some
webserver-related places use this function -- we don't want symlinks
there.)

Various functions check if the file is a symlink with file_islink().
Its result depends on "allow-symlinks" option.

There's now blob_read_link() function to use for symlinks instead of
blob_read_from_file() where appropriate. It returns link target.
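
To illustrate the idea behind blob_read_link(), here is a rough sketch
built directly on POSIX readlink(). It is not the function from the
branch: the name read_link_target() and the fixed caller-supplied
buffer are assumptions made for this example (the real function works
on a Blob).

```c
#define _XOPEN_SOURCE 700   /* ask for the POSIX declaration of readlink() */
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-in for the idea behind blob_read_link().
** Copy the target of the symlink zPath into zBuf, NUL-terminate it,
** and return its length. Return -1 if zPath is not a symlink or
** cannot be read. A target longer than nBuf-1 bytes is truncated. */
long read_link_target(const char *zPath, char *zBuf, size_t nBuf){
  ssize_t n;
  assert( zPath!=0 && zBuf!=0 && nBuf>0 );
  n = readlink(zPath, zBuf, nBuf-1);  /* readlink() adds no terminator */
  if( n<0 ) return -1;
  zBuf[n] = 0;
  return (long)n;
}
```

Called on a link that points to "originals/original.txt", this fills
the buffer with exactly that string, which is also what gets stored as
the artifact content.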

Fossil normally just overwrites file contents when you make
destructive changes to your working directory. However, a symlink
cannot be overwritten in place, so when a) there's a symlink in your
working directory and Fossil needs to update it or replace it with a
regular file, or b) there's a regular file which is being replaced
with a symlink, Fossil unlink()s the existing entry and then creates
the symlink or writes the new regular file there.
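
The unlink-then-create step can be sketched roughly as follows. This
is an illustration under assumptions, not the code from the branch:
the helper name replace_with_symlink() is hypothetical, and it covers
only the case where the new entry is a symlink.

```c
#define _XOPEN_SOURCE 700   /* ask for the POSIX declaration of symlink() */
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: make zPath a symlink pointing at zTarget,
** first removing whatever regular file or symlink was there before.
** Return 0 on success, -1 on failure. */
int replace_with_symlink(const char *zTarget, const char *zPath){
  assert( zTarget!=0 && zPath!=0 );
  /* unlink() removes the directory entry itself, so an existing
  ** symlink is deleted rather than followed. A missing entry is fine. */
  if( unlink(zPath)!=0 && errno!=ENOENT ){
    return -1;
  }
  return symlink(zTarget, zPath);
}
```

Note that this is not atomic: between unlink() and symlink() the path
briefly does not exist.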


PERFORMANCE
-----------

I think there's a tiny performance penalty (one "if" branch) for every
getStat operation: based on the "allow-symlinks" option it decides
whether to call stat() or lstat(). This option is cached in
g.allowSymlinks, so the penalty should be negligible. I'll measure
this later.
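
For illustration, the branch in question looks roughly like the sketch
below. The function names and the explicit allowSymlinks parameter are
assumptions made for this example; in Fossil the value would come from
the cached g.allowSymlinks.

```c
#define _XOPEN_SOURCE 700   /* ask for the POSIX declaration of lstat() */
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical wrapper: when allowSymlinks is nonzero, examine the
** link itself with lstat(); otherwise follow it with stat(). */
int get_stat(const char *zPath, int allowSymlinks, struct stat *pBuf){
  assert( zPath!=0 && pBuf!=0 );
  return allowSymlinks ? lstat(zPath, pBuf) : stat(zPath, pBuf);
}

/* Return 1 if zPath is a symlink and symlink handling is enabled.
** With allowSymlinks off, stat() follows the link, so S_ISLNK() is
** never true and the path is treated like its target. */
int path_islink(const char *zPath, int allowSymlinks){
  struct stat buf;
  if( get_stat(zPath, allowSymlinks, &buf)!=0 ) return 0;
  return S_ISLNK(buf.st_mode) ? 1 : 0;
}
```

The extra cost per call is the single conditional in get_stat().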


CHECKSUMS
---------

Fossil computes an MD5 checksum over filenames and file contents to
verify integrity. The MD5 of a link is calculated the following way:

  sqlite3_snprintf(sizeof(zBuf), zBuf, " %ld\n", 
                 blob_read_link(&pathBuf, zFullpath));
  md5sum_step_text(zBuf, -1);
  md5sum_step_text(blob_str(&pathBuf), -1);

That is, for this purpose the MD5 of a link is indistinguishable from
the MD5 of a plain-text file with the link target path inside. This is
needed for cases where "allow-symlinks" is off and to ensure that the
Windows version will correctly verify checksums.

SHA1 of a link "content" is, again, SHA1 of its target path.
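
To make the hashed bytes concrete, here is a small sketch that builds
the size header and content as one string instead of feeding them to
MD5. The helper name checksum_input() is hypothetical, and it uses the
standard snprintf() rather than sqlite3_snprintf(); it covers only the
size and content steps shown in the snippet above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write into zOut the bytes that the snippet
** above passes to md5sum_step_text() for content zContent of length
** n, namely a " <size>\n" header followed by the content itself.
** Return the number of bytes written, or -1 if zOut is too small. */
int checksum_input(const char *zContent, long n, char *zOut, size_t nOut){
  int nHdr;
  assert( zContent!=0 && zOut!=0 && n>=0 );
  nHdr = snprintf(zOut, nOut, " %ld\n", n);
  if( nHdr<0 || (size_t)nHdr + (size_t)n >= nOut ) return -1;
  memcpy(zOut + nHdr, zContent, (size_t)n);
  zOut[nHdr + n] = 0;
  return nHdr + (int)n;
}
```

Since the result depends only on the target path string, a symlink to
"originals/original.txt" and a plain-text file containing that same
path produce identical input, and therefore the same checksum.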


IMPORT
------

When you're importing from a Git repository using the "fossil import"
command, if there are symlinks inside this repo, Fossil will import
them as symlinks and turn on the "allow-symlinks" option for the repo.


WEB INTERFACE
-------------

Currently there are no changes in the web interface related to
symlinks. They are displayed as text (as I said earlier, inside the
repo a symlink is just text with the link target path), and there's no
indication that it's a symlink.

We can implement some interesting things here, for example, display
the content of a symlink as an HTML link if it points to a file or a
directory inside the repository.

But for now, I think the next step would be to add some indication
that you're looking at a symlink.


FINALLY
-------

Support for symlinks is in a very early alpha stage. I'm sure I missed
some cases where I don't handle symlinks properly. If you notice such
a case, please let me know (but even though I'm dogfooding it, don't
try it "in production", please!)

It's available from my clone of Fossil repo (I keep it for experiments
with Fossil) inside "symlinks" branch:

https://codingrobots.org/fossil/timeline?n=200&r=symlinks

I can prepare binaries for OS X, Linux (i386 and x86_64), Windows if
anyone is interested in trying it out without compiling.

Note that while developing it, I tested it on Windows from time to
time, but the most recent changes have not been tested there yet, so
it may not even compile on Windows.

Everything written here is how *I* implemented symlink support, but
*you* might have better ideas, so let me know what you think (even if
it's "I don't want symlink support in Fossil" :).

--
Dmitry Chestnykh

_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
