On Thu, Nov 02, 2017 at 08:11:59PM +0100, Michał Górny wrote:
> Next version. Now without MISC/OPTIONAL, and with many clarifications.

Huge improvements in this version, I found it much easier to understand.

Nits:
- please stick to ASCII ellipsis. The Unicode ellipsis is unreadable in
  some monospace fonts.

Further items inline:

> Directory tree coverage
> -----------------------
...
> The file entries (except for ``IGNORE``) can be specified for regular
> files only. Symbolic links are followed when opening files
> and traversing directories. It is an error to specify an entry for
> a different file type. If the tree contain files of other types
> that are not otherwise ignored, they need to be covered by an explicit
> ``IGNORE``.
>
> All the local (non-``DIST``) files covered by a Manifest tree must
> reside on the same filesystem. It is an error to specify entries
> applying to files on another filesystem. If subdirectories
> that are not otherwise ignored reside on a different filesystem, they
> must be explicitly excluded via ``IGNORE``.

I would prefer this to say:
'If files that are not otherwise ignored reside on a different filesystem',
i.e. broadened from subdirectories to files in general.

This implicitly forbids following a symlink that crosses a filesystem
boundary, and so matches the similar part of 'Tree layout restrictions'.

> Rationale
> =========
...
> Tree layout restrictions
> ------------------------
>
> The algorithm is meant to work primarily with ebuild repositories which
> normally contain only files and directories. Directories provide
> no useful metadata for verification, and specifying special entries
> for additional file types is purposeless. Therefore, the specification
> is restricted to dealing with regular files.
>
> The Gentoo repository does not use symbolic links. Some Gentoo
> repositories do, however. To provide a simple solution for dealing with
> symlinks without having to take care to implement special handling for
> them, the common behavior of implicitly resolving them is used.
> Therefore, symbolic links to files are stored as if they were regular
> files, and symbolic links to directories are followed as if they were
> regular directories.
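
To make the filesystem point above concrete, here is a rough sketch of how
a verification tool could find entries that end up on another filesystem
once symlinks are resolved. It is only an illustration, not part of the
spec: the choice of Python and the function name `cross_filesystem_paths`
are assumptions, and it relies on comparing `st_dev` as reported by
`os.stat()`.

```python
import os


def cross_filesystem_paths(root):
    """Yield paths under `root` that resolve to a different filesystem.

    Hypothetical helper, for illustration only. Symlinks are followed
    (os.stat rather than os.lstat), as in the quoted spec text.
    Caveat: with followlinks=True, a symlink loop would recurse without
    end; a real tool would need to guard against that.
    """
    root_dev = os.stat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        kept = []
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                dev = os.stat(path).st_dev
            except OSError:
                continue  # vanished or unreadable; skip and do not descend
            if dev != root_dev:
                yield path  # would need an explicit IGNORE
            else:
                kept.append(name)
        # Prune in place so os.walk does not descend into foreign mounts.
        dirnames[:] = kept
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                dev = os.stat(path).st_dev
            except OSError:
                continue  # dangling symlink; out of scope for this sketch
            if dev != root_dev:
                yield path
```

With the wording proposed above, every path this yields would have to be
covered by an ``IGNORE`` entry, whether it is a directory or a single file
reached through a symlink.
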
>
> Dotfiles are implicitly ignored as that is a common notion used
> in software written for POSIX systems. All other common filenames
> require explicit ``IGNORE`` lines.

'common' in the second sentence seems odd. What about uncommon filenames?
Maybe just s/other common filenames/other filenames/.

> An ability to inject additional ignore entries is provided to account
> for site configuration affecting the repository tree — placing
> additional files in it, skipping some of the categories from syncing.

Please mention that the package manager may support wildcards or regular
expressions in the additional entries. E.g.: 'IGNORE **/metadata.xml'

> Non-strict Manifest verification
> --------------------------------
...
> The cases for stripping unnecessary files mostly focused around space
> savings. For this purpose, stripping ``metadata.xml`` and similar files
> has little value. It is much more common for users to strip whole
> categories which can not be handled via the ``MISC`` type, and needs
> a dedicated package manager mechanism. The same mechanism can also
> handle files that used the ``MISC`` type.

Exclusion by package does happen as well. A single list of categories or
packages can be used for both the rsync exclusion and the IGNORE.

> Splitting distfile checksums from file checksums
> ------------------------------------------------
>
> Another problem with the current Manifest format is that the checksums
> for fetched files are combined with checksums for local files
> in a single file inside the package directory. It has been specifically
> pointed out that:
>
> - since distfiles are sometimes reused across different packages,
>   the repeating checksums are redundant,

Comment: 8.4% of all DIST entries are duplicates, representing a 2MiB
saving in tree size (25MiB of DIST entries altogether).

> - mirror admins were interested in the possibility of verifying all
>   the distfiles with a single tool.
>
> This specification does not provide a clean solution to this problem.
> It technically permits moving ``DIST`` entries to higher-level Manifests
> but the usefulness of such a solution is doubtful.

This solution would require the package manager to consider higher-level
Manifests, or all Manifests in the tree, when searching for the DIST entry.

The most useful implementation of this would be for the git->rsync
process to move all DIST entries elsewhere (metadata/ maybe).

Either way, this would have many downsides, and make manual work on the
Manifest DIST entries painful.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Asst. Treasurer
E-Mail   : [email protected]
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
