Hi,

On Sun, Aug 11, 2013 at 04:59:47PM +0200, Ian Hinder wrote:
> Taking a set of source trees which are very similar and compressing
> each of them into a tarball makes it impossible to efficiently
> delta-compress them.

That's not completely true. 'tar' isn't really the problem. The
compression we use is.

> This is what SVN does in its repository. While most users probably
> don't care about what happens hidden inside the server-hosted SVN
> repository, the mere fact that there is this duplication suggests that
> it is an inelegant, and hence probably wrong, solution to the problem.

We knew from the beginning that it is not the most elegant solution.
However, it is the most convenient one for a user.

> Note that users will have to check out the whole new version of the
> library each time it changes, rather than just downloading the
> differences. This also applies to syncing to remote machines. So
> users will see a performance hit with the current SVN approach anyway.

Yes. But we argued back then that the external libraries aren't going to
change that often - and when you actually do have an update, you are
probably connected to a high-speed network at work anyway.

> This is a valid concern. Erik, with your testing of boost, could you
> measure the number of files in the working tree, and the number of
> files in the built tree, for just the boost library? I imagine that
> the number of files in the built tree is a factor of a few larger than
> the number in the source tree. However, as Frank said, if the user is
> not building this library, then they do have a much larger inode cost
> if we store the source tree. We discussed skipping syncing of certain
> external libraries before; maybe that is a better solution here.

Also, the files in the build tree are deleted once the library is built,
which leads to much lower overhead, especially with many configurations
that might each need a differently built version of the libraries.

> Note that I am already opposed to using a large library such as Boost
> in the ET. It requires a GB of space to uncompress, and more to
> build, for little benefit that I can see (though I have not looked).

I didn't look either, but I hear almost every time how bad a decision
some people think it was to have certain other projects rely so heavily
on boost.

> I assume Erik planned to keep the original version on a branch, and
> have an ET branch with our changes.

That would limit the problem somewhat. But still - as a user I might be
interested to see which changes are necessary to make library X work
with the ET without digging into the documentation of a VCS to find the
patches with their descriptions - especially if they evolved over time
and I am not at all interested in that evolution, just the actual patch
and the reason for it.

> A developer would not have to work with patches. You would commit
> your changes on the ET branch of the library, and use the version
> control tool to see differences, merge in the new version, etc. This
> is much better than messing about with patches, which are just a hack
> used in the absence of proper version control.

When tracking externally developed software I would always like to
minimize the changes I carry, dropping patches over time when they are
no longer necessary. I would like to keep patches with distinct purposes
apart. I am usually not very interested in how these patches evolve over
time. All I really care about is a minimal set of patches, each with a
distinct, stated reason, for each upstream version. Correct me if I am
wrong, but what you describe is something different, isn't it?

> * Updating an external library requires the whole library to be downloaded,
>   rather than just what has changed
> * Syncing to a remote cluster requires the whole library to be sent, rather
>   than what has changed (often on a residential internet connection with
>   limited upstream bandwidth)

These two are really the same, single issue. Also - don't sync when you
are on low bandwidth. The same would apply to an initial clone of a
giant git repository, including history you don't care about.

Frank
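
To illustrate the point made at the top of this mail (that the compression, not 'tar', is what gets in the way of delta-compression), here is a rough Python sketch. Everything in it is an assumption made for illustration: the "source tree" is just generated text, `make_tree` and `delta_size` are hypothetical helpers, and compressing with zlib's preset dictionary is only a crude stand-in for whatever delta algorithm a real VCS uses.

```python
import random
import zlib


def delta_size(old: bytes, new: bytes) -> int:
    """Size of `new` when deflated with `old` as a preset dictionary.

    Crude proxy for a delta: byte runs shared with `old` cost very little.
    (Assumption: this is NOT how SVN or git compute deltas.)
    """
    comp = zlib.compressobj(level=9, zdict=old)
    return len(comp.compress(new) + comp.flush())


def make_tree(seed: int, lines: int = 400) -> bytes:
    """Hypothetical 'source tree': text lines concatenated, as tar would do."""
    rng = random.Random(seed)
    text = "".join(
        f"line {i}: {rng.getrandbits(64):016x}\n" for i in range(lines)
    )
    return text.encode()


v1 = make_tree(seed=1)
# Version 2 is version 1 with one small change near the beginning.
v2 = v1.replace(b"line 3:", b"line three:", 1)

# Delta between the two uncompressed streams.
raw_delta = delta_size(v1, v2)
# Delta between the two streams after each was compressed on its own.
packed_delta = delta_size(zlib.compress(v1, 9), zlib.compress(v2, 9))

print("uncompressed size:      ", len(v2))
print("delta, uncompressed:    ", raw_delta)
print("delta, after compression:", packed_delta)
```

Under these assumptions the delta between the uncompressed streams should come out far smaller than the delta between the individually compressed ones, because a small change early in the input alters most of the compressed byte stream that follows it.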
_______________________________________________ Users mailing list [email protected] http://lists.einsteintoolkit.org/mailman/listinfo/users
