Just catching up on this e-mail, though I had seen your code reviews, and I think this approach makes sense. A further concern is how to identify how a toolchain package was built, which AFAIK is tricky today if only the 'toolchain ID' is known. Before I saw this e-mail I had been thinking about that problem (which I think we can address separately): we might want to record the native-toolchain git hash with every toolchain build, so that the exact build scripts are associated with the resulting build artifacts. I filed https://issues.cloudera.org/browse/IMPALA-5002 for this related problem.
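To make the git hash idea a bit more concrete, here is a rough sketch of how a build could record it next to its artifacts. This is only an illustration under my own assumptions: the file name build_info.json, the JSON keys, and both function names are hypothetical and are not part of native-toolchain today.

```python
import json
import os
import subprocess


def current_git_hash(repo_dir):
    """Return the commit hash currently checked out in repo_dir."""
    out = subprocess.check_output(["git", "rev-parse", "HEAD"], cwd=repo_dir)
    return out.decode("utf-8").strip()


def write_build_info(artifact_dir, git_hash, toolchain_build_id):
    """Write a small metadata file alongside the build artifacts.

    The file name and keys are hypothetical; any stable format would do.
    """
    info = {
        "native_toolchain_git_hash": git_hash,
        "toolchain_build_id": toolchain_build_id,
    }
    path = os.path.join(artifact_dir, "build_info.json")
    with open(path, "w") as f:
        json.dump(info, f, indent=2, sort_keys=True)
    return path
```

A build script could call write_build_info(artifact_dir, current_git_hash(repo_dir), build_id) just before packaging, so the hash travels with the package.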
On Sat, Feb 25, 2017 at 10:22 PM, Henry Robinson <[email protected]> wrote:

> As written, the toolchain apparently can't deal with the possibility of
> build flags changing while a dependency version remains the same.
>
> LZ4 has never (afaict) been built with optimization enabled. I have a
> commit that enables -O3, but that continues to produce artifacts for
> lz4-1.7.5 with no version change. This is a problem because bootstrapping
> the toolchain will fail to pick up the new binaries: the previously
> downloaded version is still in the local cache, and won't be overwritten
> because the version hasn't changed.
>
> I think the simplest way to fix this is to write the toolchain build ID to
> the dependency version file (that's in the local cache only) when it's
> downloaded. If that ID changes, the dependency will be re-downloaded.
>
> This has the disadvantage that any bump in IMPALA_TOOLCHAIN_BUILD_ID will
> invalidate all dependencies, and bin/bootstrap_toolchain.py will
> re-download all of them. My feeling is that that cost is better than trying
> to individually determine whether a dependency has changed between
> toolchain builds.
>
> Any thoughts on whether this is the right way to go?
>
> Henry
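
A minimal sketch of the cache check described above, assuming a per-dependency version file whose first line is the dependency version and whose second line is the toolchain build ID. The file layout and the function names are assumptions for illustration; this is not the actual code in bin/bootstrap_toolchain.py.

```python
import os


def needs_download(version_file, expected_version, toolchain_build_id):
    """Return True if the cached dependency must be (re-)downloaded."""
    if not os.path.isfile(version_file):
        # Nothing cached yet.
        return True
    with open(version_file) as f:
        lines = f.read().splitlines()
    # Hypothetical layout: line 1 = dependency version, line 2 = build ID.
    # A file with only a version line (no build ID) also counts as stale.
    return lines[:2] != [expected_version, toolchain_build_id]


def record_download(version_file, version, toolchain_build_id):
    """Record what was downloaded so that later runs can skip it."""
    with open(version_file, "w") as f:
        f.write(version + "\n" + toolchain_build_id + "\n")
```

With this shape, rebuilding lz4-1.7.5 with -O3 under a new build ID makes needs_download() return True even though the version string is unchanged, at the cost that every dependency is fetched again whenever the ID is bumped.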
