Hello,

I'm working on adding symlink support to Fossil, mostly because I need it for my Mac development: I put frameworks into the repository, and a valid framework on the Mac contains at least 3 symlinks. Plus, there were requests for this feature in the past. Currently Fossil just follows links and adds the target files to the repository, which results in duplicates.
I'm posting this to collect your thoughts and opinions, discuss improvements, and, possibly, get Richard's blessing :) Please let me know if I missed something.

FILE FORMAT
-----------

Symlink support doesn't modify Fossil's built-for-centuries file format. According to the Fossil File Format docs for F-cards in a manifest:

    "The optional 3rd argument defines any special access permissions associated with the file. The only special code currently defined is "x" which means that the file is executable. All files are always readable and writable. This can be expressed by "w" permission if desired but is optional. The file format might be extended with new permission letters in the future."

I use this opportunity to add symlink support: to indicate that a file is a link, Fossil just adds "l" to the access permissions (Mercurial and Git do the same, except that they use an octal format for permissions).

Example manifest entry:

    F symlink.txt 53be9689d6ff8975e12c28b769da34d459f0965c l

According to the readlink man page:

    "Unlike other filesystem objects, symbolic links may not have an owner, group, access mode, times, etc. Instead, these attributes may be taken from the directory that contains the link."

So other permission flags are not required if we have "l".

Inside the repo, a symlink artifact is stored as a simple text file whose content is the link destination. If symlink.txt points to "originals/original.txt", that path is the whole content of the artifact.

SETTINGS FLAG
-------------

To avoid confusing current users, to maintain compatibility, and for those users who don't want symlinks inside their repos, there's a new global and per-repository option:

    allow-symlinks
    If enabled, don't follow symlinks, and instead treat them as symlinks on Unix. Has no effect on Windows (existing links in a repository created on Unix become plain-text files with the link destination path inside).
    Default: off

By default, it's off, so you have to turn this option on explicitly (except for some import cases, read on).

WINDOWS
-------

Symlink support is, obviously, available only on Unix-like OSes, so here's how Fossil handles symlinks on Windows:

* "allow-symlinks" is treated as always off, even if it's on.
* If you check out a repo with symlinks, Fossil will create plain-text files with the original symlink destination inside. Unless you modify such a file, its "isLink" flag is preserved between commits.

This allows almost smooth cross-platform development as long as Windows users don't touch symlinks.

I heard that Windows 7 supports symlinks, but I don't have this version (I test on XP). So there's a possibility that some Windows developer could add full symlink support to Fossil on Windows 7. Another way would be to add some kind of "fossil symlink" command for creating and modifying symlinks on Windows, but that may be overkill (I haven't done this).

MERGING AND DIFFS
------------------

Symlinks are merged the same way as binary files, which is to say they are not merged at all:

    ***** Cannot merge symlink symlink.txt.

Diffs between symlinks are shown like diffs between text files. For example, if your link pointed to "originals/original.txt", but you changed it to point to "originals/new.txt", the diff would be:

    - originals/original.txt
    + originals/new.txt

When you try to diff a link against a regular file, Fossil will show an error:

    cannot compute difference between symlink and regular file

CODE/SCHEMA CHANGES
-------------------

(you can skip this part if you're not a/the Fossil developer :)

The whole diff between trunk and the symlink branch is not that big, but it touches a lot of places: most of the commands that handle files have to be modified to allow symlinks.
Here's a summary:

DB schema changes:

    vfile:   + islink BOOLEAN
    stash:   + isLink BOOLEAN (I followed the "isExec" flag format)
    %s.undo: + islink BOOLEAN

Repository changes are handled by rebuild, but there's no need to rebuild if you don't intend to use symlinks. The stash and undo tables are modified on the fly, no action required.

file.c now contains a file_isfile_or_link() function, which is used everywhere file_isfile() was used for files inside the working directory. (I initially modified file_isfile() to return true for symlinks, but it turned out to be a bad idea because some webserver-related places use this function, and we don't want symlinks there.)

Various functions check whether a file is a symlink with file_islink(). Its result depends on the "allow-symlinks" option.

There's now a blob_read_link() function to use for symlinks instead of blob_read_from_file() where appropriate. It returns the link target.

Fossil normally just overwrites file contents when you make destructive changes to your working directory. However, overwriting a symlink in place is not possible. So in cases when

a) there's a symlink in your working directory and Fossil needs to update this symlink or replace it with a regular file, or
b) there's a regular file which is being replaced with a symlink,

Fossil first unlink()s the existing file and then creates the symlink or writes the new regular file in its place.

PERFORMANCE
-----------

I think there's a tiny performance penalty (one "if" branch) for every getStat operation: based on the "allow-symlinks" option it decides whether to do stat() or lstat(). This option is cached in g.allowSymlinks, so the penalty should be negligible. I'll measure this later.

CHECKSUMS
---------

Fossil collects an MD5 checksum of file names plus file contents to verify integrity.
The MD5 of a link is calculated the following way:

    sqlite3_snprintf(sizeof(zBuf), zBuf, " %ld\n",
                     blob_read_link(&pathBuf, zFullpath));
    md5sum_step_text(zBuf, -1);
    md5sum_step_text(blob_str(&pathBuf), -1);

That is, for this purpose the MD5 of a link is indistinguishable from the MD5 of a plain-text file with the link target path inside. This is needed for cases where "allow-symlinks" is off, and to ensure that the Windows version will verify checksums correctly.

The SHA1 of a link's "content" is, again, the SHA1 of its target path.

IMPORT
------

When you're importing from a Git repository using the "fossil import" command, if there are symlinks inside this repo, Fossil will import them as symlinks and turn on the "allow-symlinks" option for the repo.

WEB INTERFACE
-------------

Currently there are no changes in the web interface related to symlinks. They are displayed as text (as I said earlier, inside the repo a symlink is just text with the link target path), and there's no indication that it's a symlink.

We can implement some interesting things here, for example, display the content of a symlink as an HTML link if it points to a file or a directory inside the repository. But for now, I think the next step would be to add some indication that you're looking at a symlink.

FINALLY
-------

Support for symlinks is in a very early alpha stage. I'm sure I missed some cases where I don't handle symlinks properly. If you notice such a case, please let me know (but even though I'm dogfooding it, don't try it "in production", please!)

It's available from my clone of the Fossil repo (I keep it for experiments with Fossil) inside the "symlinks" branch:

https://codingrobots.org/fossil/timeline?n=200&r=symlinks

I can prepare binaries for OS X, Linux (i386 and x86_64), and Windows if anyone is interested in trying it out without compiling. Note that while developing it I tested on Windows from time to time, but the most recent changes have not been tested there yet, so it may not even compile on Windows.
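For anyone who wants to look at the unlink-then-recreate behavior from CODE/SCHEMA CHANGES in isolation, here is a minimal standalone sketch in C. It is not code from the symlinks branch: the demo_* helper names are hypothetical, and it only assumes the POSIX calls lstat(), readlink(), unlink() and symlink(). Directories are deliberately not handled.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose lstat(), readlink(), symlink() */
#endif
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if zPath is a symlink, 0 otherwise.
** lstat() examines the link itself; stat() would follow it. */
static int demo_islink(const char *zPath){
  struct stat st;
  if( lstat(zPath, &st)!=0 ) return 0;
  return S_ISLNK(st.st_mode) ? 1 : 0;
}

/* Hypothetical helper: copy the link target of zPath into zBuf and
** NUL-terminate it. Returns the target length, or -1 on error. */
static int demo_read_link(const char *zPath, char *zBuf, size_t nBuf){
  ssize_t n;
  assert( nBuf>0 );
  n = readlink(zPath, zBuf, nBuf-1);   /* readlink() adds no NUL itself */
  if( n<0 ) return -1;
  zBuf[n] = 0;
  return (int)n;
}

/* Hypothetical helper: make zPath a symlink to zTarget, replacing any
** regular file or symlink already there. Returns 0 on success. */
static int demo_replace_with_symlink(const char *zTarget, const char *zPath){
  if( unlink(zPath)!=0 && errno!=ENOENT ) return -1;
  return symlink(zTarget, zPath);
}

/* Hypothetical helper: make zPath a regular file containing zContent,
** replacing any regular file or symlink already there. */
static int demo_replace_with_file(const char *zContent, const char *zPath){
  FILE *f;
  if( unlink(zPath)!=0 && errno!=ENOENT ) return -1;
  f = fopen(zPath, "wb");
  if( f==0 ) return -1;
  fwrite(zContent, 1, strlen(zContent), f);
  return fclose(f)==0 ? 0 : -1;
}
```

The unlink() has to come first in both directions: symlink() fails if the path already exists, and opening an existing symlink for writing follows the link, so the new content would land in the link's target instead of replacing the link.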
Everything written here is how *I* implemented symlink support, but *you* might have better ideas, so let me know what you think (even if it's "I don't want symlink support in Fossil" :).

--
Dmitry Chestnykh

_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users