On Sun, May 19, 2013 at 02:24:30PM +0200, Raphael Hertzog wrote:
> On Fri, 17 May 2013, Aaron M. Ucko wrote:
> > Per http://dedup.debian.net/compare/publican/publican, publican ships
> > many copies of common resources (images, CSS files, etc.) under
> > /usr/share/publican/Common_Content and /usr/share/publican/doc,
> > accounting for most of its massive size increase from 2.8-3.  (It's
> > gone up from 6.4 MiB to 50.6 MiB.)
> > 
> > Could you please arrange to ship only one copy of each duplicated
> > file, at least within /usr/share/publican/Common_Content?

Wow. People are now reporting bugs based on dedup.d.n. That's what I
wrote it for! \o/ Next time, please include a link to
http://wiki.debian.org/dedup.debian.net, because it contains useful
information for the maintainer.

> It doesn't look trivial. Each set of language files ought to be
> self-contained so that any generated document is independent. So replacing
> with symlinks is not satisfactory (unless we modify the publican build
> logic to replace symlinks with the corresponding file).

I would advise against any manual solution. It just creates work for
little benefit.

> Replacing with hardlinks is better but is quite uncommon in Debian
> packages (there's a lintian warning suggesting it's a bad idea).

I have discussed the question about hard link usage a number of times
now. Conclusions so far:

 * Hard links to files in the same directory (not subdirectory) are
   always ok. (Example: bzip2)
 * When you have many small files, hard links reduce the installed
   size compared to symlinks, because they save inodes.
 * Hard links that cross directories should be ok if the hierarchy is
   completely owned by the package in question. This includes
   /usr/lib/$package and /usr/share/$package. Of course, this does not
   cover hard links from /usr/lib/$package/foo to
   /usr/share/$package/bar. As a rule of thumb: if a package is the only
   package to create a directory, you can use hard links therein.
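
To illustrate the inode point, here is a minimal Python sketch (not
part of any Debian tooling; the function names and file names are made
up for the example). It creates one duplicate as a hard link and one as
a symlink, then counts the distinct inodes used in each directory.

```python
import os
import tempfile


def count_inodes(directory):
    """Count distinct inodes among the entries directly inside directory."""
    return len({
        os.lstat(os.path.join(directory, name)).st_ino
        for name in os.listdir(directory)
    })


def inode_usage_demo():
    """Return (inodes used with a hard link, inodes used with a symlink)."""
    with tempfile.TemporaryDirectory() as top:
        hard_dir = os.path.join(top, "hard")
        sym_dir = os.path.join(top, "sym")
        for directory in (hard_dir, sym_dir):
            os.mkdir(directory)
            # Hypothetical resource file standing in for a duplicated asset.
            with open(os.path.join(directory, "original.css"), "w") as fh:
                fh.write("body { margin: 0 }\n")

        # Hard link: a second name for the same inode.
        os.link(os.path.join(hard_dir, "original.css"),
                os.path.join(hard_dir, "copy.css"))
        # Symlink: a separate inode that merely stores the target name.
        os.symlink("original.css", os.path.join(sym_dir, "copy.css"))

        return count_inodes(hard_dir), count_inodes(sym_dir)


if __name__ == "__main__":
    print(inode_usage_demo())
```

Both directories contain two names, but the hard-linked one uses a
single inode while the symlinked one uses two.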

> Last but not least, I'm not going to manually deduplicate all those
> files so someone should really create a helper script that would
> deduplicate a sub-directory.

The wiki page above explains how to achieve this using rdfind and
symlinks. A helper utility does not exist.

In your case, I'd suggest running the following command as part of the
build process:

rdfind -makehardlinks true -outputname /dev/null debian/publican/usr/share/publican
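
For readers who want to see what such a deduplication step amounts to,
here is a rough Python sketch of the idea. It is not rdfind's actual
algorithm, just an approximation under stated assumptions: files are
compared by size and SHA-256 digest, the first copy encountered is
kept, empty files are skipped, and differences in permissions or
ownership are ignored. The function names are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate regular files below root with hard links.

    Returns the number of files that were replaced by a hard link.
    """
    first_seen = {}  # (size, digest) -> path of the copy that is kept
    replaced = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Only plain files are candidates; leave symlinks alone.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.stat(path).st_size
            if size == 0:
                continue  # skipping empty files is a choice of this sketch
            key = (size, file_digest(path))
            original = first_seen.get(key)
            if original is None:
                first_seen[key] = path
                continue
            if os.path.samefile(original, path):
                continue  # already hard-linked to the kept copy
            # Link under a temporary name, then rename over the duplicate,
            # so the path never disappears if something fails midway.
            temporary = path + ".dedup-tmp"
            os.link(original, temporary)
            os.replace(temporary, path)
            replaced += 1
    return replaced
```

Calling hardlink_duplicates("debian/publican/usr/share/publican")
would then play roughly the same role as the rdfind invocation, though
rdfind is the tested tool and the better choice in a real build.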

Should you have any questions, just ask. In any case feedback on the
usability and documentation of dedup.d.n is welcome.

Helmut

