Re: RFC: Remove thirdparty
As mentioned in the original email, after step 3, we can start pushing
changes to the build scripts to the ASF repos and push some released version
of CDH components to S3. By then, ASF repos should be buildable (probably
with some flags such as IMPALA_ASF_BUILD=1).

On Thu, May 26, 2016 at 11:08 AM, Jim Apple wrote:
> And, when that is done, Clouderans would be able to build the ASF
> repo, but non-Clouderans would not?

--
Thanks,
Michael
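A minimal sketch of how a flag like the IMPALA_ASF_BUILD=1 floated above might select where the CDH component snapshots come from. This is an illustration only, not Impala's actual build logic: the function name `cdh_components_url`, the variable `IMPALA_PUBLIC_COMPONENTS_URL`, and the placeholder public URL are all assumptions, since the thread does not name the S3 location. The internal URL is the one proposed in this thread.

```shell
#!/bin/sh
# Hypothetical helper; not taken from Impala's build scripts.

# Print the base URL to download CDH component snapshots from.
# With IMPALA_ASF_BUILD=1, use a public location (placeholder default,
# overridable through the hypothetical IMPALA_PUBLIC_COMPONENTS_URL);
# otherwise use the internal Jenkins repo proposed in this thread.
cdh_components_url() {
  if [ "${IMPALA_ASF_BUILD:-0}" = "1" ]; then
    echo "${IMPALA_PUBLIC_COMPONENTS_URL:-https://example.invalid/cdh-components}"
  else
    echo "http://repos.jenkins.cloudera.com/impala-repos/"
  fi
}
```

Keeping the choice in one function would let the rest of a download script stay identical for both kinds of build.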
Re: RFC: Remove thirdparty
And, when that is done, Clouderans would be able to build the ASF repo, but
non-Clouderans would not?

On Thu, May 26, 2016 at 11:05 AM, Michael Ho wrote:
> The jenkins job is this one:
> http://sandbox.jenkins.cloudera.com/job/impala-cdh5-trunk-core-integration
> Harrison probably knows if there are other related jobs too.
>
> I am thinking of using the following location to host the golden snapshot
> of CDH components.
> http://repos.jenkins.cloudera.com/impala-repos/
>
> Michael
Re: RFC: Remove thirdparty
The jenkins job is this one:
http://sandbox.jenkins.cloudera.com/job/impala-cdh5-trunk-core-integration
Harrison probably knows if there are other related jobs too.

I am thinking of using the following location to host the golden snapshot of
CDH components.
http://repos.jenkins.cloudera.com/impala-repos/

Michael

On Thu, May 26, 2016 at 8:22 AM, Jim Apple wrote:
> > 5. Update integration jenkins job to copy the snapshots of the
> > components above to internal jenkins repo in addition to checking
> > them in to github. Update bootstrap_toolchain to point to internal
> > repos.
>
> Which Jenkins job(s), exactly? Which internal Jenkins repo?

--
Thanks,
Michael
Re: RFC: Remove thirdparty
On Thu, May 26, 2016 at 10:13 AM, Henry Robinson wrote:
> What about LLVM / GCC? Are those hosted in S3 as well for Kudu?

Yes, though we don't currently rebuild GCC. We do rebuild libstdcxx for the
purposes of TSAN builds. It does make our initial build pretty long, so
caching built artifacts for different platforms would be nice, but we don't
do that today.

-Todd
Re: RFC: Remove thirdparty
On 26 May 2016 at 10:06, Todd Lipcon wrote:
> In terms of Apache policies, it's OK to require some "Impala" toolchain,
> so long as the ability to regenerate that toolchain is public.
>
> For example, in Kudu, we use thirdparty tarballs hosted on S3. The actual
> bucket is owned by Cloudera (someone has to pay for it), but the tarballs
> are exactly the upstream source releases of the dependencies, so if someone
> wanted to use their own copies, it could be done with a bit of work.

What about LLVM / GCC? Are those hosted in S3 as well for Kudu?

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
Re: RFC: Remove thirdparty
In terms of Apache policies, it's OK to require some "Impala" toolchain, so
long as the ability to regenerate that toolchain is public.

For example, in Kudu, we use thirdparty tarballs hosted on S3. The actual
bucket is owned by Cloudera (someone has to pay for it), but the tarballs are
exactly the upstream source releases of the dependencies, so if someone
wanted to use their own copies, it could be done with a bit of work.

I think Impala depending upon pre-built thirdparty deps is also fine, so long
as they can be re-built from source using publicly available scripts. Making
it trivial to do so isn't a strict requirement IMO -- so long as if someone
asked for help to do that work, they got the appropriate assistance.

In terms of depending upon vendor packages (CDH) vs upstream releases, again
I think it's reasonable to continue to use the current dependencies for the
time being until some contributor steps forward and volunteers to make some
change. Projects like Apache Ambari already do this (they deploy HDP) so
there's precedent.

-Todd

On Thu, May 26, 2016 at 9:40 AM, Michael Ho wrote:
> Also adding mentors.

--
Todd Lipcon
Software Engineer, Cloudera
Re: RFC: Remove thirdparty
Also adding mentors.

--
Thanks,
Michael
Re: RFC: Remove thirdparty
(Actually adding mentors this time)

On 26 May 2016 at 09:19, Henry Robinson wrote:
> (+Impala's podling mentors for advice)

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
Re: RFC: Remove thirdparty
I guess point number 1 is more about requiring all the thirdparty binaries
needed to build and run Impala to be located at the location specified by the
environment variable $IMPALA_TOOLCHAIN.

It's not strictly necessary for users to use exactly the version of the
toolchain we provide. For instance, a user can check out a copy of our
native-toolchain (which is public) and tinker with it, or they can create
their own version of IMPALA_TOOLCHAIN as long as it has all the necessary
binaries we expect.

The user can also feel free to create a symlink to the system library of
their choice in the $IMPALA_TOOLCHAIN directory if they choose to do so.

My question is more about whether we should clean up our build script so that
we expect to find everything we need to build in $IMPALA_TOOLCHAIN?

Michael

On Thu, May 26, 2016 at 8:53 AM, Tim Armstrong wrote:
> I think we probably need to make a firm decision about whether we're going
> to try to support non-toolchain builds. In the past we've said that it
> would be nice to allow building Impala with system libraries (even if we
> don't put special effort into supporting it), but I don't think we've
> committed to the idea, or committed to toolchain builds only.
>
> If we're going to support non-toolchain builds we would need some kind of
> testing to prevent it breaking all the time.
>
> It would be nice to have, but I'm not sure anyone has the time/motivation
> to do it. What do people think?

--
Thanks,
Michael
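The two ideas in the message above, a build that refuses to start unless $IMPALA_TOOLCHAIN is set and a user symlinking a system library into that directory, could be sketched roughly as follows. The function names `require_toolchain` and `link_system_lib` are hypothetical and not taken from Impala's build scripts, and the layout Impala expects inside the toolchain directory is not described in this thread.

```shell
#!/bin/sh
# Hypothetical helpers; not taken from Impala's actual build scripts.

# Fail early unless IMPALA_TOOLCHAIN names an existing directory.
require_toolchain() {
  if [ -z "${IMPALA_TOOLCHAIN:-}" ]; then
    echo "IMPALA_TOOLCHAIN must be set to build Impala" >&2
    return 1
  fi
  if [ ! -d "$IMPALA_TOOLCHAIN" ]; then
    echo "IMPALA_TOOLCHAIN is not a directory: $IMPALA_TOOLCHAIN" >&2
    return 1
  fi
}

# Expose a system-installed library inside the toolchain via a symlink.
# $1: path of the system library directory
# $2: name the build is assumed to look for under $IMPALA_TOOLCHAIN
link_system_lib() {
  require_toolchain || return 1
  ln -s "$1" "$IMPALA_TOOLCHAIN/$2"
}
```

A user who prefers a system copy of a dependency would call something like `link_system_lib /usr/lib/somelib somelib-1.0` (both arguments made up here) before building, leaving the build script itself to look only in $IMPALA_TOOLCHAIN.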
Re: RFC: Remove thirdparty
(+Impala's podling mentors for advice)

On 26 May 2016 at 08:57, Jim Apple wrote:
> I agree that it would be nice to support non-toolchain builds, and I
> agree that we don't have the time for this right now.
>
> I would call this a lower priority than most of the other ASF infra
> transition work.

Is it (or will it be) possible to build Impala without downloading source or
binary packages from Cloudera's managed S3 bucket? Is the situation different
at all for link-time dependencies compared to system tools like gcc? Both of
these are managed through the toolchain.

My concern is that people might balk at being forced to use compiler binaries
from a non-ASF source, and that if they want to at least verify for
themselves that the compiler binaries are built from a clean source tarball
they have to rebuild the toolchain themselves, which takes hours. Looking at
this from the perspective of a fresh user it's not very user-friendly to say
you can't use the system compiler that you already have installed from a
trusted source. However, if it's easy to override the compiler location in
the toolchain, that point is moot.

We should ask the podling mentors for guidance once the technical details are
clear.
Re: RFC: Remove thirdparty
> I think we probably need to make a firm decision about whether we're going
> to try to support non-toolchain builds. In the past we've said that it would
> be nice to allow building Impala with system libraries (even if we don't put
> special effort into supporting it), but I don't think we've committed to the
> idea, or committed to toolchain builds only.
>
> If we're going to support non-toolchain builds we would need some kind of
> testing to prevent it breaking all the time.
>
> It would be nice to have, but I'm not sure anyone has the time/motivation to
> do it. What do people think?

I agree that it would be nice to support non-toolchain builds, and I
agree that we don't have the time for this right now.

I would call this a lower priority than most of the other ASF infra
transition work.
Re: RFC: Remove thirdparty
On Wed, May 25, 2016 at 8:42 PM, Michael Ho wrote:
> Hi,
>
> Following up on the discussion about IMPALA-3223, I'd like to send out
> an email about the removal of thirdparty. In particular, the following
> changes will happen in stages. Please voice your comments before I commit
> to any action.
>
> 1. Require $IMPALA_TOOLCHAIN to be set in order to build Impala.
> In other words, all the logic in the build script to build thirdparty
> components if $IMPALA_TOOLCHAIN is not set will be removed.

I think we probably need to make a firm decision about whether we're going to
try to support non-toolchain builds. In the past we've said that it would be
nice to allow building Impala with system libraries (even if we don't put
special effort into supporting it), but I don't think we've committed to the
idea, or committed to toolchain builds only.

If we're going to support non-toolchain builds we would need some kind of
testing to prevent it breaking all the time.

It would be nice to have, but I'm not sure anyone has the time/motivation to
do it. What do people think?

> 2. Remove build_thirdparty.sh
>
> 3. Move postgresql-jdbc and maybe llama-minikdc (?) to toolchain and
> update scripts about it.
>
> 4. Remove everything in thirdparty directory except for the following
> components: hadoop, hbase, hive, llama and sentry.
>
> 5. Update integration jenkins job to copy the snapshots of the components
> above to internal jenkins repo in addition to checking them in to github.
> Update bootstrap_toolchain to point to internal repos.
>
> 6. Remove thirdparty directory and update integration job to not check in
> to git repo.
>
> After step (3) is done, we can already push the changes of the build
> script to ASF tree and check in snapshots of hadoop, hbase, llama and
> sentry to S3 and hopefully get the build to work.

We can probably test this out as we go by manually copying the artifacts to
the impala-incubator repo. I did a test of this yesterday (running
download_requirements and copying thirdparty) and it built ok.

> --
> Thanks,
> Michael
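Step 4 of the plan quoted above keeps only hadoop, hbase, hive, llama and sentry in the thirdparty directory. A rough way to preview what that step would delete is sketched below. It assumes entries are named by component prefix (for example `hadoop-<version>`), which this thread does not confirm, and it treats llama-minikdc as removable because step 3 tentatively moves it to the toolchain. The function name `list_removable` is made up for the illustration.

```shell
#!/bin/sh
# Hypothetical sketch of step 4; the naming scheme is an assumption.

# Print the entries of a thirdparty directory that step 4 would remove,
# i.e. everything that is not hadoop, hbase, hive, llama or sentry.
# $1: path to the thirdparty directory
list_removable() {
  for entry in "$1"/*; do
    [ -e "$entry" ] || continue   # empty directory: glob did not expand
    name=$(basename "$entry")
    case "$name" in
      llama-minikdc*) echo "$name" ;;           # tentatively moved in step 3
      hadoop*|hbase*|hive*|llama*|sentry*) ;;   # kept by step 4
      *) echo "$name" ;;
    esac
  done
}
```

Printing the list first, rather than deleting directly, makes it easy to review the result before anything is removed from the repo.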
Re: RFC: Remove thirdparty
> 5. Update integration jenkins job to copy the snapshots of the components
> above to internal jenkins repo in addition to checking them in to github.
> Update bootstrap_toolchain to point to internal repos.

Which Jenkins job(s), exactly? Which internal Jenkins repo?
