Michael Ho has posted comments on this change.

Change subject: IMPALA-3223: Supports download of CDH components from S3.
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3333/2/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

Line 288: def download_cdh_components(toolchain_root, cdh_components):
> bootstrap_toolchain can be run every time buildall.sh is run (for clean or 
I share your concern, and I also tried to come up with a way to avoid 
downloading the md5sum file for every incremental build.

The major difference between the CDH components and other pre-existing binaries 
in the toolchain directory is that we don't really have a good versioning 
system for the CDH components. Up to this point, the way it works is that the 
integration Jenkins job will check the latest approved version of the CDH 
components into our git repos. A git fetch will pick up the latest version. 
However, the CDH components will have the same version string, so it's hard to 
tell whether the version cached locally is stale.

It's unclear to me if we really need to get the latest CDH components for our 
day-to-day development. Maybe it's already sufficient to have our Jenkins jobs 
do that as they always bootstrap the toolchain from scratch for each run. It 
would be great if others can chime in on this point.

If we can agree that it's unnecessary to always download the latest version of 
CDH components, this function can just skip downloading the component if it 
exists locally already. On the other hand, if we want to preserve the existing 
behavior, we may consider storing a versioning file in the CDH components 
directory and downloading only if we are behind.
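
To make the two options concrete, here is a rough sketch of the check each one 
implies. This is not the code in bootstrap_toolchain.py: the helper name 
needs_download, the "VERSION" file name, and the remote_version argument are 
all hypothetical, and "behind" is approximated as "local version string differs 
from the remote one".

```python
import os


def needs_download(component_dir, remote_version=None, version_file="VERSION"):
    """Return True if a CDH component should be (re)downloaded.

    Hypothetical helper, for illustration only.
    Option 1 (remote_version is None): download only if the directory is missing.
    Option 2 (remote_version given): also download if the locally stored
    version file is absent or does not match remote_version.
    """
    # Both options agree: a missing component must be downloaded.
    if not os.path.isdir(component_dir):
        return True
    # Option 1: anything already present locally is considered good enough.
    if remote_version is None:
        return False
    # Option 2: compare against the version recorded at the last download.
    path = os.path.join(component_dir, version_file)
    if not os.path.isfile(path):
        return True
    with open(path) as f:
        local_version = f.read().strip()
    return local_version != remote_version
```

With option 1 an incremental build needs no network access once the component 
directory exists. Option 2 still has to learn remote_version from somewhere, so 
it keeps at least one small fetch per build.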

Another way to work around the repeated downloading problem is to set 
SKIP_TOOLCHAIN_BOOTSTRAP to true. That may be particularly useful in a 
disconnected environment.
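
For illustration, a guard on that variable could look roughly like the 
following. The function name should_skip_bootstrap is hypothetical, and the 
assumption that only the value "true" (case-insensitive) counts as set may not 
match how the script actually reads the variable.

```python
import os


def should_skip_bootstrap(environ=os.environ):
    """Return True if SKIP_TOOLCHAIN_BOOTSTRAP asks us to skip all downloads.

    Hypothetical helper; assumes "true" (any case) is the only enabling value.
    """
    value = environ.get("SKIP_TOOLCHAIN_BOOTSTRAP", "")
    return value.strip().lower() == "true"
```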

With all the above said, I prefer the first option, which is to download only 
if it's missing.


-- 
To view, visit http://gerrit.cloudera.org:8080/3333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16fa79db0005554cc0a116e74775647ba99f8dda
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
