[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846400#comment-17846400 ]

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

steveloughran commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2110965419

thanks, in trunk. can you do a pr cherrypicking to branch-3.4 so we can keep that in sync. No need for more reviews, just a yetus test run.

> Hadoop build depends on archives.apache.org
> -------------------------------------------
>
>                 Key: HADOOP-18786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18786
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 3.3.6
>            Reporter: Christopher Tubbs
>            Priority: Critical
>              Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, as that contributes to end users being subject to throttling and blocking by INFRA, for "abuse" of the archives, even though they are merely building a current ASF release from source. This is particularly problematic for downstream packagers who must build from Hadoop's source, or for CI/CD situations that depend on Hadoop's source, and particularly problematic for those end users behind a NAT gateway, because even if Hadoop's use of the archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed versions in the archives (or should work with the upstream of those dependent projects to publish their releases elsewhere, for routine consumptions). In the interim, the source could be updated to point to the current dependency versions available on downloads.apache.org.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
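The issue description quoted above says the ASF archive is referenced in several places in Hadoop's source. As a rough illustration of how such references could be located, here is a minimal sketch, not the method used in the PR. The function name `find_archive_references` is hypothetical, and matching both `archive.apache.org` and `archives.apache.org` is an assumption (the issue title uses the latter spelling).

```python
import re
from pathlib import Path

# Hostnames treated as "the ASF archive" (assumption: both spellings are matched).
ARCHIVE_HOST = re.compile(r"\barchives?\.apache\.org\b")


def find_archive_references(root):
    """Return (relative path, line number, stripped line) for each match under root."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if ARCHIVE_HOST.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

Running it against a checkout, for example `find_archive_references("hadoop")`, would list the candidate files and lines to review by hand.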
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846399#comment-17846399 ] ASF GitHub Bot commented on HADOOP-18786: - steveloughran merged PR #5789: URL: https://github.com/apache/hadoop/pull/5789
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846398#comment-17846398 ] ASF GitHub Bot commented on HADOOP-18786: - steveloughran commented on PR #5789: URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2110961454 ok, let's merge and see what happens.
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846044#comment-17846044 ]

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2108580048

> The main thing I want to be sure is from this build, what gets into the distro? only stuff from the maven repo, right? that is: this PR MUST NOT force updates in the binaries we ship.

I don't quite understand the question. The premise seems to be that the current build is only grabbing stuff from the Maven repo. However, that's not true currently, and that's part of the problem. The build currently grabs stuff from the archives, and not just from the Maven repo. Those are the URLs that this PR changes... to use the ASF CDN instead of the archives.

The only change that might affect the distro is a couple of tools do not have that version available in the CDN anymore, so a version bump was necessary to be able to grab it from the CDN instead of from the archives. However, I don't know if those affect the binaries in the distro either, or if those are only used as unshipped build tools.

But even if it does change the binaries in some way, the current situation of automatically going to the ASF archives cannot continue... it makes offline builds very hard, and the download of things from the archives causes frequent builds to trigger automated bans of ASF services, because the archives aren't meant to be used this way (for routine builds).
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846024#comment-17846024 ] ASF GitHub Bot commented on HADOOP-18786: - steveloughran commented on PR #5789: URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2108352558 The main thing I want to be sure is from this build, what gets into the distro? only stuff from the maven repo, right? that is: this PR MUST NOT force updates in the binaries we ship.
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844718#comment-17844718 ] ASF GitHub Bot commented on HADOOP-18786: - ctubbsii commented on PR #5789: URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2101037371 This was previously approved, and I've answered all the questions raised. I just resolved the merge conflicts from upstream, where some lines got moved around in the Dockerfile for Windows. Is anybody willing to merge this?
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748283#comment-17748283 ]

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1654127004

@steveloughran wrote:

> @ctubbsii do we need to bump up the versions for the move to downloads to work?

The old versions only exist in the archives, so yes. But, as previously said, this is really only a temporary fix, as it forces the build to depend on a transient state on downloads.a.o. The Hadoop devs need to figure out a more permanent solution. This is just a stop-gap.
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748069#comment-17748069 ] ASF GitHub Bot commented on HADOOP-18786: - steveloughran commented on PR #5789: URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1653336705 reviewing this. @ctubbsii do we need to bump up the versions for the move to downloads to work?
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738791#comment-17738791 ] ASF GitHub Bot commented on HADOOP-18786: - ctubbsii commented on PR #5789: URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1613822991 The version bumps are merely incidental to the actual issue this PR and JIRA are intending to expose: that the use of the archives should be avoided. If committers want to bump the versions first, I can rebase this PR once that is done. However, these changes are pretty trivial, so I'm not really needed for that part. Once the committers decide to act on this, they can rebase or merge however they see fit. Up to now, these "should" comments about what to do about the version bumps have been very unclear to me... it looks like discussion among yourselves... but if it's a request for me to change something in this PR, please state the request clearly.
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738568#comment-17738568 ]

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1613163547

> yetus is special compared to gradle as it is asf, but then so would thriftc and mvn be.

From https://www.apache.org/legal/release-policy.html#source-packages:

`A source release SHOULD not contain compiled code`

So, it's fine if you're just bundling and redistributing Yetus scripts, which are also source. I'm not sure if Yetus contains compiled code, or if it's just scripts, and I'm not sure which parts Hadoop's build uses.
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738566#comment-17738566 ] ASF GitHub Bot commented on HADOOP-18786: - steveloughran commented on PR #5789: URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1613156364 yetus is special compared to gradle as it is asf, but then so would thriftc and mvn be.
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738301#comment-17738301 ]

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1612109177

> * yetus in source tarballs is interesting. I'm happy for a source bundle to include it as it makes for more of a "all the tools you need are here" world. (ignoring maven artifacts...)

Unfortunately, I'm pretty sure it's against policy. This is the same issue that people face when they want to include the gradlew or mvnw jars in their sources.
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738287#comment-17738287 ]

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

steveloughran commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1612045152

* solr change should be pulled out as its own thing, as it includes a version change.
* yetus in source tarballs is interesting. I'm happy for a source bundle to include it as it makes for more of a "all the tools you need are here" world. (ignoring maven artifacts...)
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737886#comment-17737886 ]

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1610266483

> Also, I guess we want to this change in all active branches? i.e. branch-3.3 branch-3.2 and branch-2.10.

As a stop-gap, it would be good to prevent any downstream users from hitting this on all new releases, as it can trigger a site-wide ban if they've used the archives too much. Building Hadoop releases from source shouldn't cause users to be banned from access to the ASF.

But, this is only a stop-gap solution, regardless of where it is applied. Once these dependencies roll over to the archives, then you'll have the problem of users being unable to build Hadoop releases from source without first patching its build in some way. So, a more complete solution still needs to be created.

The Dockerfile changes are probably okay (as I imagine those are optional, and users can derive their own Dockerfiles easily enough from these as reference), and the generic message about JVSC is certainly okay. You could probably get away with just applying those changes to the trunk.

The main problem to address across all branches is the yetus-wrapper's use of the archives. Perhaps one of the following would work?

1. yetus can be made an optional part of the build (a dev-only profile that is inactive by default when users build from source)?
2. Or you can bundle yetus into the release as a build tool so it doesn't need to go to the archives (might be against ASF policy for source releases, but perhaps there's an exception to the rule)?
3. Or perhaps yetus is stable enough that you can just reference downloads.apache.org/yetus/latest instead of a specific version?
4. Or maybe the build instructions should just tell the user that they need to download or install it as a build prerequisite, rather than have the Hadoop scripts download it?
5. Or perhaps Yetus can publish its releases to Maven Central or another place, from which these can be downloaded?

I'm not sure what the best solution is, but I definitely think this part should be fixed in all branches, somehow.
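Option 4 in the comment above (treat the tool as a build prerequisite instead of downloading it) could look roughly like the following. This is only a sketch under stated assumptions: `require_tool` and `MissingPrerequisite` are hypothetical names, the real yetus-wrapper is a shell script and is not shown here, and the executable name passed in is up to the caller.

```python
import shutil


class MissingPrerequisite(RuntimeError):
    """Raised when a required build tool is not installed locally."""


def require_tool(name, search_path=None):
    """Return the full path of an installed tool, or fail without downloading.

    search_path is an optional PATH-style string; None means the process PATH.
    """
    found = shutil.which(name, path=search_path)
    if found is None:
        raise MissingPrerequisite(
            f"{name} was not found; install it as a build prerequisite "
            "(no download from the ASF archives is attempted)"
        )
    return found
```

The design choice here is that a missing tool is reported to the user as an error, so the build never contacts the archives on their behalf.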
[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org
[ https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737829#comment-17737829 ]

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

ctubbsii opened a new pull request, #5789:
URL: https://github.com/apache/hadoop/pull/5789

### Description of PR

* Use Yetus 0.14.1 from downloads.apache.org in yetus-wrapper
* Use Maven 3.8.8 from downloads.apache.org in Win 10 Dockerfile
* Point users to downloads.apache.org for JVSC
* Use Solr 8.11.2 from downloads.apache.org in YARN Dockerfile

### How was this patch tested?

Download links verified to work.

### For code changes:

- [x] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
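The PR description above replaces archive URLs with downloads.apache.org URLs. A minimal sketch of that kind of rewrite follows. It is not code from the PR: `to_downloads_url` is a hypothetical helper, the assumption that archive paths carry a leading `/dist` segment that the downloads host omits should be checked against the real links, and the example file name is illustrative only.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts treated as the ASF archive (assumption: both spellings are accepted).
ARCHIVE_HOSTS = {"archive.apache.org", "archives.apache.org"}


def to_downloads_url(url):
    """Rewrite an ASF archive URL to downloads.apache.org; leave other URLs unchanged."""
    parts = urlsplit(url)
    if parts.hostname not in ARCHIVE_HOSTS:
        return url
    path = parts.path
    if path.startswith("/dist/"):
        path = path[len("/dist"):]  # assumed layout difference between the two hosts
    return urlunsplit(("https", "downloads.apache.org", path, parts.query, parts.fragment))


# Illustrative input only; the file name is an assumption.
print(to_downloads_url(
    "https://archive.apache.org/dist/yetus/0.14.1/apache-yetus-0.14.1-bin.tar.gz"
))
```

As noted elsewhere in the thread, old versions exist only in the archives, so a rewritten link resolves only while that version is still published on downloads.apache.org.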