I don't think storing tarballs in the Mesos git repository is a good idea, even in a separate 3rdparty repo.
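To make the proposal that follows a bit more concrete, here is a rough Python sketch of the bootstrap steps it describes. It is only an illustration under several assumptions: the names `Dep`, `parse_config`, `plan_commands` and `bootstrap_3rdparty` are made up for this example, the `#` comment syntax is not part of the proposed format, and the .patch files are assumed to live under support/ next to the config file.

```python
import os
import subprocess
from typing import List, NamedTuple


class Dep(NamedTuple):
    name: str        # library_name
    repo_url: str    # git_repo_url
    ref: str         # git tag, branch, or commit id
    patch_file: str  # .patch file name


def parse_config(text: str) -> List[Dep]:
    """Parse whitespace-separated 4-column lines; skip blanks and '#' comments."""
    deps = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"line {lineno}: expected 4 fields, got {len(fields)}")
        deps.append(Dep(*fields))
    return deps


def plan_commands(dep: Dep, root: str = "3rdparty") -> List[List[str]]:
    """Return the git commands for one dependency, or [] if it is already present."""
    dest = os.path.join(root, dep.name)
    if os.path.isdir(dest):
        # Step 2: an existing (possibly user-supplied) checkout is left alone.
        return []
    # Assumption: patches sit under support/. The path is made absolute because
    # `git -C dest` changes directory before the patch is read.
    patch = os.path.abspath(os.path.join("support", dep.patch_file))
    return [
        ["git", "clone", dep.repo_url, dest],       # step 3: clone
        ["git", "-C", dest, "checkout", dep.ref],   # step 3: switch to the ref
        ["git", "-C", dest, "apply", patch],        # step 3: apply the patch
    ]


def bootstrap_3rdparty(config_path: str = "support/3rdparty.config") -> None:
    """Step 1: traverse the config and run the planned commands for each entry."""
    with open(config_path) as f:
        deps = parse_config(f.read())
    for dep in deps:
        for cmd in plan_commands(dep):
            subprocess.run(cmd, check=True)
```

Splitting the planning from the execution keeps the skip logic easy to check without touching the network; a real bootstrap would likely do the same thing in a few lines of shell.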
*So my opinion is:*

Add a 3rdparty config file (support/3rdparty.config). The file format can be:

library_name  git_repo_url                             git_tag/branch/commit_id  .patch_file
zookeeper     https://github.com/apache/zookeeper.git  release-3.4.8             zookeeper.patch
leveldb       https://github.com/google/leveldb.git    v1.4                      leveldb.patch
...           ...                                      ...                       ...

In the bootstrap file:

1. Traverse support/3rdparty.config.
2. If there is already a 3rdparty/library_name directory, skip it and continue with the next item.
3. Clone the library code, switch to the specified tag/branch/commit_id, and apply the .patch file.
4. Do the remaining steps.

So if users want to use their own 3rdparty libraries, they can check out the code to 3rdparty/library_name and the bootstrap script will skip that library.

And in the Mesos repository, we only need to maintain the support/3rdparty.config file and the .patch files.

On Tue, Mar 8, 2016 at 2:05 AM, Alex Clemmer <clemmer.alexan...@gmail.com> wrote:
> So, at this point we have a bunch of different reviews open for
> this[1], and I'd like to use this as an opportunity to start nudging
> people towards thinking about possibly transitioning to a scheme where
> the tarballs that are currently held in the `3rdparty/` directory are
> moved to some external place, and retrieved for users out-of-band, as
> necessary to build Mesos.
>
> In particular, doing this (as you all likely know) is very expensive
> because git stores a complete copy of the entire tarball, for each
> different revision in history, so if you have updated a tarball twice,
> you have two complete copies rolling around in the `.git/` folder. It
> seems like there are not many benefits for keeping this scheme, other
> than the fact that it's pretty easy to implement.
>
> I'm not sure what it would take to transition the autotools build
> system, but just recapping earlier what I said about the CMake build
> system: The easiest thing to do (which we've already mostly done) is
> to allow people to rope in tarballs from some mirror of the `3rdparty`
> github repository[2]. Right now we have facilities that let you host
> it either on your local FS or on a remote URL, and we'll download (if
> necessary) and untar into the familiar place in the `build/` folder.
> Easy! We could even have `bootstrap` clone the repository and make
> CMake automatically pull in that repository if it's out of date.
>
> Thoughts? I recognize that this might be overcomplicating the problem
> a bit, but I figured I'd throw the hat in the ring because this has
> always kind of bothered me. :)
>
>
> [1] They are:
> https://reviews.apache.org/r/44252/
> https://reviews.apache.org/r/44382/
> https://reviews.apache.org/r/44372/
> https://reviews.apache.org/r/44378/
> https://reviews.apache.org/r/44376/
> https://reviews.apache.org/r/44257/
>
> [2] https://github.com/3rdparty/mesos-3rdparty
>
> On Tue, Mar 1, 2016 at 10:48 AM, Alex Clemmer <clemmer.alexan...@gmail.com> wrote:
> > It doesn't seem to be the case that these things are mutually
> > exclusive -- it is well within our purview to accept only a specific
> > range of versions for any particular dependency, and error out if
> > someone tries to select a version outside that range. The only thing
> > these commits add is more fine-grained control over which of the
> > supported versions you are allowed to select.
> >
> > At this point, there are no such guards, but that is certainly
> > something that can be added.
> >
> > On Tue, Mar 1, 2016 at 10:17 AM, Neil Conway <neil.con...@gmail.com> wrote:
> >> The prospect of downloading dependencies from "rando" locations is
> >> concerning to me :)
> >>
> >> Mesos can easily come to depend on implementation details of a
> >> dependency that might change in a minor release. For example, a recent
> >> change [1] depends on the connection retry logic in the Zk client
> >> library in a fairly delicate way. I also wouldn't want users to
> >> randomly upgrade to, say, protobuf 2.6.1 without it being thoroughly
> >> tested. Increasing the support matrix of different users on different
> >> platforms running arbitrarily different versions of third-party
> >> dependencies doesn't seem like a net improvement to me.
> >>
> >> My two cents: if Windows requires additional dependencies that we
> >> aren't currently vendoring, I would personally opt for (a) vendoring
> >> those additional dependencies and (b) ensuring that the vendored versions
> >> we ship are modern enough to support all the platforms we care about.
> >> Are there important use-cases that aren't supported by this scheme?
> >>
> >> Neil
> >>
> >> [1] https://github.com/apache/mesos/commit/c2d496ed430eaf7daee3e57edefa39c25af2aa43
> >>
> >> On Tue, Mar 1, 2016 at 10:00 AM, Alex Clemmer <clemmer.alexan...@gmail.com> wrote:
> >>> I guess a tl;dr might be in order.
> >>>
> >>> Basically: the CMake build system already supports roping in tarballs
> >>> from rando places on the filesystem or Internet, so I think it makes
> >>> sense to rope them in at configure time, and so I'm proposing we
> >>> re-appropriate the sophisticated tools we already have to do this for
> >>> Windows, into a more general solution that is useful to other exotic
> >>> platforms, rather than just Windows.
> >>>
> >>> As always, super interested to hear feedback, I'd love to know if I
> >>> missed something.
> >>>
> >>> On Tue, Mar 1, 2016 at 9:58 AM, Alex Clemmer <clemmer.alexan...@gmail.com> wrote:
> >>>> This is a great time to discuss the Mesos dependency channel story in
> >>>> general, because it has had to evolve a bit to fit the requirements of
> >>>> Windows, and some of the issues you describe are issues we had to
> >>>> resolve (at least partially) to support the Windows integration work.
> >>>>
> >>>> More particularly, our problems are: first, Windows frequently
> >>>> requires newer versions of dependencies (due to poor support of MSVC
> >>>> 1900), so we have had to develop reasonably robust version-selection
> >>>> mechanisms, so that Windows can get specific versions of different
> >>>> packages. This means that the Mesos project does not have to evolve
> >>>> the dependency support story in lock step, which in the long term may
> >>>> actually be required, as some platforms (e.g., those run by
> >>>> governmental organizations) are more conservative about what
> >>>> dependencies are introduced on their clusters.
> >>>>
> >>>> Second, because Windows does not have a package manager, it became
> >>>> necessary for the CMake build system to support actually hitting some
> >>>> remote (possibly the internet) to rope in the tarballs of arbitrary
> >>>> (and arbitrarily-versioned) dependencies that we normally expect to
> >>>> already be installed (such as APR or cURL).
> >>>>
> >>>> This last point is actually more convenient than it seems. Our CMake
> >>>> implementation recently[1][2] introduced a flag that lets you specify
> >>>> something like `cmake .. -D3RDPARTY_DEPENDENCIES=/some/path/or/url`
> >>>> and it will proactively look for tarballs in the location you give it
> >>>> -- and that location can be either a path on your filesystem, or a
> >>>> URI, like the 3rdparty remote in github[3] that is owned by the GitHub
> >>>> community.
> >>>> From the "exotic platform" perspective this is great
> >>>> because it makes it trivial for people building (say) Windows to
> >>>> upgrade to a version not supported by CMake:
> >>>>
> >>>> * Put a tarball of a new version somewhere on the filesystem. Say, we
> >>>> decide to use glog 0.3.4 instead of 0.3.3, so we just put that tarball
> >>>> for 0.3.4 in a well-known place in the filesystem.
> >>>> * Update the version of glog in Versions.cmake.
> >>>> * When you run cmake, just run
> >>>> `cmake .. -D3RDPARTY_DEPENDENCIES=/my/fancy/3rdparty/path`
> >>>> * Builds against new dep! Magic!
> >>>>
> >>>> Much of this was developed out of expediency, but going forward I
> >>>> think a reasonable approach to dealing with the third-party channel
> >>>> might be (and I would LOVE feedback on this):
> >>>>
> >>>> WORKFLOW THAT ASSUMES INTERNET ACCESS ON BUILD MACHINE:
> >>>> * Clone a copy of mesos.
> >>>> * (When we do a normal clone of Mesos, there are no tarballs in the
> >>>> `3rdparty/` directory.)
> >>>> * Run `bootstrap`.
> >>>> * `mkdir build && cd build && cmake ..`. Part of the CMake
> >>>> configuration step will be to `git clone` a copy of
> >>>> `https://github.com/3rdparty/mesos-3rdparty`. (If you don't know, the
> >>>> 3rdparty account is owned by the Mesos community, and the
> >>>> `mesos-3rdparty` is where we store canonical copies of all our
> >>>> third-party tarballs.)
> >>>> * This dumps all the tarballs into a folder, `mesos-3rdparty`.
> >>>> * We build against the tarballs we retrieved. Optionally you are
> >>>> allowed to set the versions in `Versions.cmake` and mesos will "just
> >>>> build" against those versions (as long as they are supported, and we
> >>>> will complain if they're not).
> >>>> * If you `git pull` and find that Mesos has upgraded its dependencies,
> >>>> and a version is out of date, then when you next build, CMake will
> >>>> explode automatically (even if you've built before) and ask you to
> >>>> `git pull` to update your `mesos-3rdparty` repository.
> >>>>
> >>>>
> >>>> WORKFLOW THAT DOES NOT ASSUME INTERNET ACCESS ON BUILD MACHINE:
> >>>> Much like the above, except when you run cmake, you do
> >>>> `cmake .. -D3RDPARTY_DEPENDENCIES="path/to/mesos-3rdparty/mirror"`. This will
> >>>> tell CMake to not clone the mirror itself, but to look for an existing
> >>>> mirror at the location specified.
> >>>>
> >>>>
> >>>> WHAT WE'VE IMPLEMENTED:
> >>>> We obviously haven't deleted the tarballs in 3rdparty, and the error
> >>>> reporting around `Versions.cmake` and asking people to re-pull when a
> >>>> version has been upgraded are not there, but a lot of the rest of this
> >>>> is already in place. For example, yesterday we checked in an
> >>>> implementation of the `-D3RDPARTY_DEPENDENCIES` flag[1][2], which
> >>>> allows you to tell CMake to build against third-party dependencies
> >>>> mirrored either at a local path (e.g.,
> >>>> `-D3RDPARTY_DEPENDENCIES="/your/path/here"`) or at a remote URI (e.g.,
> >>>> `-D3RDPARTY_DEPENDENCIES=https://github.com/3rdparty/mesos-3rdparty`).
> >>>>
> >>>>
> >>>> [1] https://github.com/apache/mesos/commit/6306b7d62dd5cbb34fa82636dfbb46cee46d0bf8
> >>>> [2] https://github.com/apache/mesos/commit/3f7501b818662097f41b2d756b2389f6ed9fa5eb
> >>>> [3] https://github.com/3rdparty/mesos-3rdparty
> >>>>
> >>>> On Tue, Mar 1, 2016 at 7:56 AM, Kapil Arya <ka...@mesosphere.io> wrote:
> >>>>>>
> >>>>>> *3. 3rdparty/libprocess/3rdparty/stout/tests/protobuf_tests.pb.cc/h files.*
> >>>>>>
> >>>>>> Can anyone tell me why hardcode these two files in Mesos repo?
> >>>>>> I think these two files can be dynamically generated during make check, this will
> >>>>>> make it not depend on protoc version.
> >>>>>>
> >>>>>
> >>>>> I think it's just due to the nature of the way dependencies are structured
> >>>>> in 3rdparty. Alex Rukletsov and I thought about fixing it but at that time,
> >>>>> there was some complication due to protoc related dependency paths not
> >>>>> being resolved properly or something like that (I don't remember exactly).
> >>>>> I think there is a way to do it in the current structure, but I strongly
> >>>>> suspect that this will get much better if/when we go ahead with 3rdparty
> >>>>> flattening.
> >>>>>
> >>>>> It will be great if you have any other comments, thanks.
> >>>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Alex
> >>>>
> >>>> Theory is the first term in the Taylor series of practice. -- Thomas M
> >>>> Cover (1992)