So, at this point we have a bunch of different reviews open for
this[1], and I'd like to use this as an opportunity to start nudging
people towards thinking about possibly transitioning to a scheme where
the tarballs that are currently held in the `3rdparty/` directory are
moved to some external place, and retrieved for users out-of-band, as
necessary to build Mesos.

In particular, keeping the tarballs in the repository (as you all
likely know) is very expensive, because git stores a complete copy of
the entire tarball for each revision in history. If you have updated
a tarball twice, you have two complete copies rolling around in the
`.git/` folder. It seems like there are not many benefits to keeping
this scheme, other than the fact that it's pretty easy to implement.
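
As a rough illustration of the storage point above (a sketch, not
anything from the Mesos tree): git names each file's contents by a
hash of those contents, so every distinct revision of a tarball is its
own object in history. The byte strings below are hypothetical
stand-ins for two releases of a vendored tarball. Packfiles can
delta-compress similar objects, but already-compressed archives
typically delta poorly, so the savings are usually small.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID git assigns to a file's contents (SHA-1 object format)."""
    # git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical stand-ins for two versions of a tarball, not real archives.
tarball_v1 = b"pretend these are the bytes of dep-1.0.tar.gz"
tarball_v2 = b"pretend these are the bytes of dep-1.1.tar.gz"

# Different contents give different object IDs, so both stay in history.
both_kept = git_blob_id(tarball_v1) != git_blob_id(tarball_v2)
print(both_kept)
```

Running `git hash-object` on a file should report the same ID that
`git_blob_id` computes for that file's bytes in a SHA-1 repository.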

I'm not sure what it would take to transition the autotools build
system, but just recapping what I said earlier about the CMake build
system: The easiest thing to do (which we've already mostly done) is
to allow people to rope in tarballs from some mirror of the `3rdparty`
GitHub repository[2]. Right now we have facilities that let you host
it either on your local FS or at a remote URL, and we'll download (if
necessary) and untar into the familiar place in the `build/` folder.
Easy! We could even have `bootstrap` clone the repository and make
CMake automatically pull in that repository if it's out of date.
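
The "local FS or remote URL, download if necessary, then untar" flow
described above can be sketched in a few lines of Python. This is only
an illustration of the idea, not the CMake implementation: the names
`fetch_tarball` and `unpack` are hypothetical, and the sketch assumes
a remote mirror serves each tarball directly at `<root>/<tarball name>`,
which may not match how a given mirror is laid out.

```python
import os
import shutil
import tarfile
import urllib.request
from urllib.parse import urlparse


def fetch_tarball(source_root, tarball_name, download_dir):
    """Return a local path to the tarball, downloading only for URL roots."""
    if urlparse(source_root).scheme in ("http", "https"):
        os.makedirs(download_dir, exist_ok=True)
        local_path = os.path.join(download_dir, tarball_name)
        # Download only if necessary: reuse a copy fetched earlier.
        if not os.path.exists(local_path):
            url = source_root.rstrip("/") + "/" + tarball_name
            with urllib.request.urlopen(url) as response:
                with open(local_path, "wb") as out:
                    shutil.copyfileobj(response, out)
        return local_path

    # Otherwise treat the root as a directory on the local filesystem.
    local_path = os.path.join(source_root, tarball_name)
    if not os.path.isfile(local_path):
        raise FileNotFoundError("no such tarball: " + local_path)
    return local_path


def unpack(tarball_path, build_dir):
    """Untar the archive into the build directory and return that directory."""
    os.makedirs(build_dir, exist_ok=True)
    # Only unpack archives from a mirror you trust; extraction writes
    # whatever paths the archive contains.
    with tarfile.open(tarball_path) as archive:
        archive.extractall(build_dir)
    return build_dir
```

With a local mirror, `unpack(fetch_tarball("/my/mirror",
"dep-1.0.tar.gz", "downloads"), "build/3rdparty")` would untar a
(hypothetical) `dep-1.0.tar.gz` without touching the network; passing
an `https://` root instead takes the download branch.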

Thoughts? I recognize that this might be overcomplicating the problem
a bit, but I figured I'd throw my hat in the ring because this has
always kind of bothered me. :)


[1] They are:
https://reviews.apache.org/r/44252/
https://reviews.apache.org/r/44382/
https://reviews.apache.org/r/44372/
https://reviews.apache.org/r/44378/
https://reviews.apache.org/r/44376/
https://reviews.apache.org/r/44257/

[2] https://github.com/3rdparty/mesos-3rdparty

On Tue, Mar 1, 2016 at 10:48 AM, Alex Clemmer
<[email protected]> wrote:
> It doesn't seem to be the case that these things are mutually
> exclusive -- it is well within our purview to accept only a specific
> range of versions for any particular dependency, and error out if
> someone tries to select a version outside that range. The only thing
> these commits add is more fine-grained control over which of the
> supported versions you are allowed to select.
>
> At this point, there are no such guards, but that is certainly
> something that can be added.
>
> On Tue, Mar 1, 2016 at 10:17 AM, Neil Conway <[email protected]> wrote:
>> The prospect of downloading dependencies from "rando" locations is
>> concerning to me :)
>>
>> Mesos can easily come to depend on implementation details of a
>> dependency that might change in a minor release. For example, a recent
>> change [1] depends on the connection retry logic in the Zk client
>> library in a fairly delicate way. I also wouldn't want users to
>> randomly upgrade to, say, protobuf 2.6.1 without it being thoroughly
>> tested. Increasing the support matrix of different users on different
>> platforms running arbitrarily different versions of third-party
>> dependencies doesn't seem like a net improvement to me.
>>
>> My two cents: if Windows requires additional dependencies that we
>> aren't currently vendoring, I would personally opt for (a) vendoring
>> those additional dependencies and (b) ensuring that the vendored versions
>> we ship are modern enough to support all the platforms we care about.
>> Are there important use-cases that aren't supported by this scheme?
>>
>> Neil
>>
>> [1] https://github.com/apache/mesos/commit/c2d496ed430eaf7daee3e57edefa39c25af2aa43
>>
>> On Tue, Mar 1, 2016 at 10:00 AM, Alex Clemmer
>> <[email protected]> wrote:
>>> I guess a tl;dr might be in order.
>>>
>>> Basically: the CMake build system already supports roping in tarballs
>>> from rando places on the filesystem or Internet, so I think it makes
>>> sense to rope them in at configure time, and so I'm proposing we
>>> re-appropriate the sophisticated tools we already have to do this for
>>> Windows, into a more general solution that is useful to other exotic
>>> platforms, rather than just Windows.
>>>
>>> As always, super interested to hear feedback, I'd love to know if I
>>> missed something.
>>>
>>> On Tue, Mar 1, 2016 at 9:58 AM, Alex Clemmer
>>> <[email protected]> wrote:
>>>> This is a great time to discuss the Mesos dependency channel story in
>>>> general, because it has had to evolve a bit to fit the requirements of
>>>> Windows, and some of the issues you describe are issues we had to
>>>> resolve (at least partially) to support the Windows integration work.
>>>>
>>>> More particularly, our problems are: first, Windows frequently
>>>> requires newer versions of dependencies (due to poor support of MSVC
>>>> 1900), so we have had to develop reasonably robust version-selection
>>>> mechanisms, so that Windows can get specific versions of different
>>>> packages. This means that the Mesos project does not have to evolve
>>>> the dependency support story in lock step, which in the long term may
>>>> actually be required, as some platforms (e.g., those run by
>>>> governmental organizations) are more conservative about what
>>>> dependencies are introduced on their clusters.
>>>>
>>>> Second, because Windows does not have a package manager, it became
>>>> necessary for the CMake build system to support actually hitting some
>>>> remote (possibly the Internet) to rope in the tarballs of arbitrary
>>>> (and arbitrarily-versioned) dependencies that we normally expect to
>>>> already be installed (such as APR or cURL).
>>>>
>>>> This last point is actually more convenient than it seems. Our CMake
>>>> implementation recently[1][2] introduced a flag that lets you specify
>>>> something like `cmake .. -D3RDPARTY_DEPENDENCIES=/some/path/or/url`
>>>> and it will proactively look for tarballs in the location you give it
>>>> -- and that location can be either a path on your filesystem, or a
>>>> URI, like the 3rdparty remote on GitHub[3] that is owned by the Mesos
>>>> community. From the "exotic platform" perspective this is great
>>>> because it makes it trivial for people building on (say) Windows to
>>>> upgrade to a version not supported by CMake:
>>>>
>>>> * Put a tarball of a new version somewhere on the filesystem. Say, we
>>>> decide to use glog 0.3.4 instead of 0.3.3, so we just put that tarball
>>>> for 0.3.4 in a well-known place in the filesystem.
>>>> * Update the version of glog in Versions.cmake.
>>>> * When you run cmake, just run `cmake ..
>>>> -D3RDPARTY_DEPENDENCIES=/my/fancy/3rdparty/path`
>>>> * Builds against new dep! Magic!
>>>>
>>>> Much of this was developed out of expediency, but going forward I
>>>> think a reasonable approach to dealing with the third-party channel
>>>> might be (and I would LOVE feedback on this):
>>>>
>>>> WORKFLOW THAT ASSUMES INTERNET ACCESS ON BUILD MACHINE:
>>>> * Clone a copy of mesos.
>>>> * (When we do a normal clone of Mesos, there are no tarballs in the
>>>> `3rdparty/` directory.)
>>>> * Run `bootstrap`.
>>>> * `mkdir build && cd build && cmake ..`. Part of the CMake
>>>> configuration step will be to `git clone` a copy of
>>>> `https://github.com/3rdparty/mesos-3rdparty`. (If you don't know, the
>>>> 3rdparty account is owned by the Mesos community, and the
>>>> `mesos-3rdparty` is where we store canonical copies of all our
>>>> third-party tarballs.)
>>>> * This dumps all the tarballs into a folder, `mesos-3rdparty`.
>>>> * We build against the tarballs we retrieved. Optionally you are
>>>> allowed to set the versions in `Versions.cmake` and mesos will "just
>>>> build" against those versions (as long as they are supported, and we
>>>> will complain if they're not).
>>>> * If you `git pull` and find that Mesos has upgraded its dependencies,
>>>> and a version is out of date, then when you next build, CMake will
>>>> explode automatically (even if you've built before) and ask you to
>>>> `git pull` to update your `mesos-3rdparty` repository.
>>>>
>>>>
>>>> WORKFLOW THAT DOES NOT ASSUME INTERNET ACCESS ON BUILD MACHINE:
>>>> Much like the above, except when you run cmake, you do `cmake ..
>>>> -D3RDPARTY_DEPENDENCIES="path/to/mesos-3rdparty/mirror"`. This will
>>>> tell CMake to not clone the mirror itself, but to look for an existing
>>>> mirror at the location specified.
>>>>
>>>>
>>>> WHAT WE'VE IMPLEMENTED:
>>>> We obviously haven't deleted the tarballs in 3rdparty, and the error
>>>> reporting around `Versions.cmake` and asking people to re-pull when a
>>>> version has been upgraded are not there, but a lot of the rest of this
>>>> is already in place. For example, yesterday we checked in an
>>>> implementation of the `-D3RDPARTY_DEPENDENCIES` flag[1][2], which
>>>> allows you to tell CMake to build against third-party dependencies
>>>> mirrored either at a local path (e.g.,
>>>> `-D3RDPARTY_DEPENDENCIES="/your/path/here"`) or at a remote URI (e.g.,
>>>> `-D3RDPARTY_DEPENDENCIES=https://github.com/3rdparty/mesos-3rdparty`).
>>>>
>>>>
>>>> [1] https://github.com/apache/mesos/commit/6306b7d62dd5cbb34fa82636dfbb46cee46d0bf8
>>>> [2] https://github.com/apache/mesos/commit/3f7501b818662097f41b2d756b2389f6ed9fa5eb
>>>> [3] https://github.com/3rdparty/mesos-3rdparty
>>>>
>>>> On Tue, Mar 1, 2016 at 7:56 AM, Kapil Arya <[email protected]> wrote:
>>>>>>
>>>>>> *3. 3rdparty/libprocess/3rdparty/stout/tests/protobuf_tests.pb.cc/h files.*
>>>>>>
>>>>>>     Can anyone tell me why these two files are hardcoded in the Mesos
>>>>>> repo? I think they can be dynamically generated during make check,
>>>>>> which would make them independent of the protoc version.
>>>>>>
>>>>>
>>>>> I think it's just due to the nature of the way dependencies are structured
>>>>> in 3rdparty. Alex Rukletsov and I thought about fixing it but at that time,
>>>>> there was some complication due to protoc related dependency paths not
>>>>> being resolved properly or something like that (I don't remember exactly).
>>>>> I think there is a way to do it in the current structure, but I strongly
>>>>> suspect that this will get much better if/when we go ahead with 3rdparty
>>>>> flattening.
>>>>>
>>>>> It will be great if you have any other comments, thanks.
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Alex
>>>>
>>>> Theory is the first term in the Taylor series of practice. -- Thomas M
>>>> Cover (1992)



-- 
Alex

Theory is the first term in the Taylor series of practice. -- Thomas M
Cover (1992)
