On Thu, 02.11.2017 at 23:43 +0000, Robin H. Johnson wrote:
> On Thu, Nov 02, 2017 at 08:11:59PM +0100, Michał Górny wrote:
> > Next version. Now without MISC/OPTIONAL, and with many clarifications.
>
> Huge improvements in this version, I found it much easier to understand.
>
> Nits:
> - please stick to ASCII ellipsis. The unicode ellipsis is unreadable in
>   some monospace fonts.

Done. Also replaced '—' for consistency.

>
> Further items inline:
> > Directory tree coverage
> > -----------------------
> > ...
> > The file entries (except for ``IGNORE``) can be specified for regular
> > files only. Symbolic links are followed when opening files
> > and traversing directories. It is an error to specify an entry for
> > a different file type. If the tree contains files of other types
> > that are not otherwise ignored, they need to be covered by an explicit
> > ``IGNORE``.
> >
> > All the local (non-``DIST``) files covered by a Manifest tree must
> > reside on the same filesystem. It is an error to specify entries
> > applying to files on another filesystem. If subdirectories
> > that are not otherwise ignored reside on a different filesystem, they
> > must be explicitly excluded via ``IGNORE``.
>
> I would prefer this to say:
> 'If files that are not otherwise ignored reside on a different
> filesystem', as expanded from sub-directories.
> This implicitly forbids following a symlink that crosses a filesystem
> boundary, and then matches the similar part of 'Tree layout
> restrictions'.

I've gone for something even more explicit:

| If files or directories that are not otherwise ignored reside
| on a different filesystem, or symbolic links point to targets
| on a different filesystem, they must be explicitly excluded
| via ``IGNORE``.

>
> > Rationale
> > =========
> > ...
> > Tree layout restrictions
> > ------------------------
> >
> > The algorithm is meant to work primarily with ebuild repositories which
> > normally contain only files and directories. Directories provide
> > no useful metadata for verification, and specifying special entries
> > for additional file types is purposeless. Therefore, the specification
> > is restricted to dealing with regular files.
> >
> > The Gentoo repository does not use symbolic links. Some Gentoo
> > repositories do, however.
> > To provide a simple solution for dealing with
> > symlinks without having to take care to implement special handling for
> > them, the common behavior of implicitly resolving them is used.
> > Therefore, symbolic links to files are stored as if they were regular
> > files, and symbolic links to directories are followed as if they were
> > regular directories.
> >
> > Dotfiles are implicitly ignored as that is a common notion used
> > in software written for POSIX systems. All other common filenames
> > require explicit ``IGNORE`` lines.
>
> 'common' in the second sentence seems odd. What about uncommon
> filenames? Maybe just s/other common filenames/other filenames/.

Done. The idea was to say 'do not put IGNORE for corner cases which are
better handled via PM config' but I guess it's not necessary here.

>
> > An ability to inject additional ignore entries is provided to account
> > for site configuration affecting the repository tree — placing
> > additional files in it, skipping some of the categories from syncing.
>
> Mention that the package manager may provide wildcards or regex in the
> additional entries. Eg: 'IGNORE **/metadata.xml'

Done.

| This configuration can extend beyond the limits of this GLEP,
| e.g. by allowing wildcards or regular expressions.

>
> > Non-strict Manifest verification
> > --------------------------------
> > ...
> > The cases for stripping unnecessary files mostly focused around space
> > savings. For this purpose, stripping ``metadata.xml`` and similar files
> > has little value. It is much more common for users to strip whole
> > categories which can not be handled via the ``MISC`` type, and needs
> > a dedicated package manager mechanism. The same mechanism can also
> > handle files that used the ``MISC`` type.
>
> Exclusion by package does happen as well. A list of categories or
> packages can be used for both the rsync exclusion and the IGNORE.

Rewritten to:

| It is much more common for users to strip whole packages
| or categories. The ``MISC`` type is not suitable for that,
| and so a dedicated package manager mechanism needs to be developed
| instead; possibly combining it with rsync exclusion list. The same
| mechanism can also handle files that historically used the ``MISC``
| type.

But it's merely a rationale, so I'd rather not spend another hour trying
to cover every corner case in it.

>
> > Splitting distfile checksums from file checksums
> > ------------------------------------------------
> >
> > Another problem with the current Manifest format is that the checksums
> > for fetched files are combined with checksums for local files
> > in a single file inside the package directory. It has been specifically
> > pointed out that:
> >
> > - since distfiles are sometimes reused across different packages,
> >   the repeating checksums are redundant,
>
> Comment: 8.4% of all DIST entries are duplicate, representing a 2MiB
> saving in tree size (25MiB of DIST entries altogether).

Included as footnote:

.. [#DIST] According to Robin H. Johnson, 8.4% of all DIST entries
   at the time of writing are duplicate, representing 2 MiB out of
   25 MiB of DIST entries altogether.

>
> > - mirror admins were interested in the possibility of verifying all
> >   the distfiles with a single tool.
> >
> > This specification does not provide a clean solution to this problem.
> > It technically permits moving ``DIST`` entries to higher-level Manifests
> > but the usefulness of such a solution is doubtful.
>
> This solution would require the package manager to consider
> higher-level Manifests or all Manifests in the tree when searching for
> the DIST entry. The most useful implementation of this would be for the
> git->rsync process to move all DIST entries elsewhere (metadata/ maybe).

Technically speaking, the package manager needs to consider parent
Manifests anyway in order to verify the deeper Manifests, and I think we
can reasonably assume it will keep them cached.

>
> Either way, this would have many downsides, and make manual work on the
> Manifest DIST entries painful.

That's what 'doubtful usefulness' means ;-P.

-- 
Best regards,
Michał Górny
