As Matt said, I have a patch that implements build ID-based versioning at https://gerrit.cloudera.org/#/c/6166/2.
Does anyone want to take a look? If we could get this in soon it would help smooth over the LZ4 change which is going in shortly.

On 27 February 2017 at 14:21, Henry Robinson <[email protected]> wrote:

> I agree that that might be useful, and that it's a separately addressable
> problem.
>
> On 27 February 2017 at 14:18, Matthew Jacobs <[email protected]> wrote:
>
>> Just catching up to this e-mail, though I had seen your code reviews
>> and I think this approach makes sense. An additional concern would be
>> how to identify how a toolchain package was built, and AFAIK this is
>> tricky now if only the 'toolchain ID' is known. Before I saw this
>> e-mail I was thinking about this problem (which I think we can address
>> separately), and that we might want to write the native-toolchain git
>> hash with every toolchain build so that the exact build scripts are
>> associated with those build artifacts. I filed
>> https://issues.cloudera.org/browse/IMPALA-5002 for this related
>> problem.
>>
>> On Sat, Feb 25, 2017 at 10:22 PM, Henry Robinson <[email protected]>
>> wrote:
>> > As written, the toolchain can't apparently deal with the possibility of
>> > build flags changing, but a dependency version remaining the same.
>> >
>> > LZ4 has never (afaict) been built with optimization enabled. I have a
>> > commit that enables -O3, but that continues to produce artifacts for
>> > lz4-1.7.5 with no version change. This is a problem because bootstrapping
>> > the toolchain will fail to pick up the new binaries - because the
>> > previously downloaded version is still in the local cache, and won't be
>> > overwritten because of the version change.
>> >
>> > I think the simplest way to fix this is to write the toolchain build ID to
>> > the dependency version file (that's in the local cache only) when it's
>> > downloaded. If that ID changes, the dependency will be re-downloaded.
>> >
>> > This has the disadvantage that any bump in IMPALA_TOOLCHAIN_BUILD_ID will
>> > invalidate all dependencies, and bin/bootstrap_toolchain.py will
>> > re-download all of them. My feeling is that that cost is better than trying
>> > to individually determine whether a dependency has changed between
>> > toolchain builds.
>> >
>> > Any thoughts on whether this is the right way to go?
>> >
>> > Henry
>> >
>
> --
> Henry Robinson
> Software Engineer
> Cloudera
> 415-994-6679

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
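
For anyone skimming the thread, the cache check proposed in the quoted message could look roughly like the sketch below. It is an illustration only and is not taken from the Gerrit patch: the marker file name `toolchain_build_id` and the helpers `needs_download` and `record_download` are hypothetical, and the sketch assumes each dependency is unpacked into its own directory in the local cache.

```python
import os
import tempfile

# Hypothetical name for the per-dependency marker file kept in the local cache.
VERSION_FILE = "toolchain_build_id"


def needs_download(dep_dir, build_id):
    """Return True if dep_dir has no marker or was fetched for another build ID."""
    marker = os.path.join(dep_dir, VERSION_FILE)
    if not os.path.isfile(marker):
        # Never downloaded, or downloaded before build IDs were recorded.
        return True
    with open(marker) as f:
        return f.read().strip() != build_id


def record_download(dep_dir, build_id):
    """Write the build ID next to the unpacked dependency after a download."""
    os.makedirs(dep_dir, exist_ok=True)
    with open(os.path.join(dep_dir, VERSION_FILE), "w") as f:
        f.write(build_id)


if __name__ == "__main__":
    cache = tempfile.mkdtemp()
    dep = os.path.join(cache, "lz4-1.7.5")
    print(needs_download(dep, "100"))  # nothing cached yet
    record_download(dep, "100")
    print(needs_download(dep, "100"))  # same build ID, cache is reused
    print(needs_download(dep, "101"))  # build ID bumped, fetch again
```

In this sketch the build ID would come from IMPALA_TOOLCHAIN_BUILD_ID. Because the comparison is against a single ID, any bump marks every cached dependency as stale, which is the trade-off described in the quoted proposal.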
