[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..

IMPALA-8586: Support download URLs for CDP

bin/bootstrap_toolchain.py has accumulated complexity over time.
CDH, CDP, and the native toolchain all use different download
machinery and naming. One feature that is needed on the CDP side
is the ability to specify the download URL in an IMPALA_*_URL
environment variable.

This adds that support and refactors CDH and native toolchain
downloads to use the new system. This is essentially a rewrite
of bin/bootstrap_toolchain.py.

Currently, there are multiple phases of downloads, each with their
own download functions and peculiarities to account for package
names and destinations for downloads. This changes the logic
so that a package will generate a DownloadUnpackTarball that is
completely resolved. It contains everything about what to download
and where to put it as well as a needs_download() function and a
download() function. Once there is a list of DownloadUnpackTarball
objects, they can all be downloaded and unpacked in a single phase.
This implements different types of packages as subclasses of
DownloadUnpackTarball. Since most subclasses want to be able to
construct URLs and archive names using templates, the
TemplatedDownloadUnpackTarball takes the same arguments as
DownloadUnpackTarball along with a map of template substitutions,
which are applied to all string arguments.

Kudu requires special handling and gets its own set of subclasses
to handle various subtleties like toolchain vs CDH Kudu, the Kudu
stub, and making sure that the "kudu" package and the "kudu-java"
package don't confuse each other.

As part of this change, USE_CDP_HIVE=true now uses the CDP version
of HBase rather than always using the CDH version.

Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Reviewed-on: http://gerrit.cloudera.org:8080/13432
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M impala-parent/pom.xml
3 files changed, 522 insertions(+), 366 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 13 Sep 2019 01:33:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4945/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 12 Sep 2019 21:22:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@129
PS3, Line 129:
> Added a brief description. If it would be useful to have more method commen
They're about as close to self explanatory as things can be. I think its ok 
either way.



--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 12 Sep 2019 21:20:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-11 Thread Joe McDonnell (Code Review)
Hello Fredy Wijaya, Todd Lipcon, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13432

to look at the new patch set (#4).

Change subject: IMPALA-8586: Support download URLs for CDP
..

IMPALA-8586: Support download URLs for CDP

bin/bootstrap_toolchain.py has accumulated complexity over time.
CDH, CDP, and the native toolchain all use different download
machinery and naming. One feature that is needed on the CDP side
is the ability to specify the download URL in an IMPALA_*_URL
environment variable.

This adds that support and refactors CDH and native toolchain
downloads to use the new system. This is essentially a rewrite
of bin/bootstrap_toolchain.py.

Currently, there are multiple phases of downloads, each with their
own download functions and peculiarities to account for package
names and destinations for downloads. This changes the logic
so that a package will generate a DownloadUnpackTarball that is
completely resolved. It contains everything about what to download
and where to put it as well as a needs_download() function and a
download() function. Once there is a list of DownloadUnpackTarball
objects, they can all be downloaded and unpacked in a single phase.
This implements different types of packages as subclasses of
DownloadUnpackTarball. Since most subclasses want to be able to
construct URLs and archive names using templates, the
TemplatedDownloadUnpackTarball takes the same arguments as
DownloadUnpackTarball along with a map of template substitutions,
which are applied to all string arguments.

Kudu requires special handling and gets its own set of subclasses
to handle various subtleties like toolchain vs CDH Kudu, the Kudu
stub, and making sure that the "kudu" package and the "kudu-java"
package don't confuse each other.

As part of this change, USE_CDP_HIVE=true now uses the CDP version
of HBase rather than always using the CDH version.

Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M impala-parent/pom.xml
3 files changed, 522 insertions(+), 366 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/13432/4
--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-11 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@61
PS3, Line 61: # python bootstrap_toolchain.py
> Not your change, but this is incorrect, with the #! we should call it direc
Changed this comment to tell people to execute it directly.


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@129
PS3, Line 129: class DownloadUnpackTarball(object):
> Brief class comment explaining what the abstraction is?
Added a brief description. If it would be useful to have more method comments, 
I could add those.


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@254
PS3, Line 254: version_file = os.path.join(unpack_dir, 
"toolchain_package_version.txt")
> I would be ok with removing the toolchain_package_version.txt logic and jus
Yeah, I think it might somewhat work today. If commit X has compiler 1 and 
commit Y has compiler 2, each time you switch between commit X and Y, the file 
contents would be different and it would delete the directory, download it, and 
unpack it.

I think I'm going to leave that as a change we can make in the future.


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@584
PS3, Line 584:   # The LLVM and GCC packages are the large in the toolchain 
(Kudu is handled
> Garbled?
Fixed this up.


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@719
PS3, Line 719:   download_cluster_components(downloads)
> Should this be download_and_unpack_tarballs or similar? Since it's also dow
It is pretty small, so I inlined it here.


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@727
PS3, Line 727: if __name__ == "__main__": main()
> flake8: E305 expected 2 blank lines after class or function definition, fou
Done



--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 12 Sep 2019 00:57:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 3:

(5 comments)

This is a big improvement. Suggested some tweaks but i think we should try to 
get this in soon.

http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@61
PS3, Line 61: # python bootstrap_toolchain.py
Not your change, but this is incorrect, with the #! we should call it directly 
instead of using system python


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@129
PS3, Line 129: class DownloadUnpackTarball(object):
Brief class comment explaining what the abstraction is?


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@254
PS3, Line 254: version_file = os.path.join(unpack_dir, 
"toolchain_package_version.txt")
I would be ok with removing the toolchain_package_version.txt logic and just 
assuming that if the directory is present, it's an acceptable version. I added 
it in https://gerrit.cloudera.org/#/c/2147/ with the idea that it would be 
useful for allowing for changes to compiler versions, but I don't think it 
solves that problem.


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@584
PS3, Line 584:   # The LLVM and GCC packages are the large in the toolchain 
(Kudu is handled
Garbled?


http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@719
PS3, Line 719:   download_cluster_components(downloads)
Should this be download_and_unpack_tarballs or similar? Since it's also 
downloading toolchain packages.

Or it's probably simple enough you could just inline the logic instead of 
calling a separate function.



--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 11 Sep 2019 21:36:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 10 Sep 2019 06:32:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4515/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 10 Sep 2019 02:42:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4917/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 10 Sep 2019 01:14:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: IMPALA-8586: Support download URLs for CDP
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/13432/3/bin/bootstrap_toolchain.py@727
PS3, Line 727: if __name__ == "__main__": main()
flake8: E305 expected 2 blank lines after class or function definition, found 1



--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 10 Sep 2019 01:12:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8586: Support download URLs for CDP

2019-09-09 Thread Joe McDonnell (Code Review)
Hello Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13432

to look at the new patch set (#3).

Change subject: IMPALA-8586: Support download URLs for CDP
..

IMPALA-8586: Support download URLs for CDP

bin/bootstrap_toolchain.py has accumulated complexity over time.
CDH, CDP, and the native toolchain all use different download
machinery and naming. One feature that is needed on the CDP side
is the ability to specify the download URL in an IMPALA_*_URL
environment variable.

This adds that support and refactors CDH and native toolchain
downloads to use the new system. This is essentially a rewrite
of bin/bootstrap_toolchain.py.

Currently, there are multiple phases of downloads, each with their
own download functions and peculiarities to account for package
names and destinations for downloads. This changes the logic
so that a package will generate a DownloadUnpackTarball that is
completely resolved. It contains everything about what to download
and where to put it as well as a needs_download() function and a
download() function. Once there is a list of DownloadUnpackTarball
objects, they can all be downloaded and unpacked in a single phase.
This implements different types of packages as subclasses of
DownloadUnpackTarball. Since most subclasses want to be able to
construct URLs and archive names using templates, the
TemplatedDownloadUnpackTarball takes the same arguments as
DownloadUnpackTarball along with a map of template substitutions,
which are applied to all string arguments.

Kudu requires special handling and gets its own set of subclasses
to handle various subtleties like toolchain vs CDH Kudu, the Kudu
stub, and making sure that the "kudu" package and the "kudu-java"
package don't confuse each other.

As part of this change, USE_CDP_HIVE=true now uses the CDP version
of HBase rather than always using the CDH version.

Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M impala-parent/pom.xml
3 files changed, 512 insertions(+), 362 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/13432/3
--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon