[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846400#comment-17846400
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

steveloughran commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2110965419

   thanks, in trunk. can you do a pr cherrypicking to branch-3.4 so we can keep 
that in sync. No need for more reviews, just a yetus test run.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846399#comment-17846399
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

steveloughran merged PR #5789:
URL: https://github.com/apache/hadoop/pull/5789




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846398#comment-17846398
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

steveloughran commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2110961454

   ok, let's merge and see what happens.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846044#comment-17846044
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2108580048

   > The main thing I want to be sure is from this build, what gets into the 
distro? only stuff from the maven repo, right? that is: this PR MUST NOT force 
updates in the binaries we ship.
   
   I don't quite understand the question. The premise seems to be that the 
current build is only grabbing stuff from the Maven repo. However, that's not 
true currently, and that's part of the problem. The build currently grabs stuff 
from the archives, and not just from the Maven repo. Those are the URLs that 
this PR changes... to use the ASF CDN instead of the archives. The only change 
that might affect the distro is a couple of tools do not have that version 
available in the CDN anymore, so a version bump was necessary to be able to 
grab it from the CDN instead of from the archives. However, I don't know if 
those affect the binaries in the distro either, or if those are only used as 
unshipped build tools. But even if it does change the binaries in some way, the 
current situation of automatically going to the ASF archives cannot continue... 
it makes offline builds very hard, and the download of things from the archives 
causes frequent builds to trigger automated bans of ASF services, because the 
archives aren't meant to be used this way (for routine builds).




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846024#comment-17846024
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

steveloughran commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2108352558

   The main thing I want to be sure is from this build, what gets into the 
distro? only stuff from the maven repo, right? that is: this PR MUST NOT force 
updates in the binaries we ship.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844718#comment-17844718
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-2101037371

   This was previously approved, and I've answered all the questions raised. I 
just resolved the merge conflicts from upstream, where some lines got moved 
around in the Dockerfile for Windows. Is anybody willing to merge this?




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748283#comment-17748283
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1654127004

   @steveloughran wrote: 
   > @ctubbsii do we need to bump up the versions for the move to downloads to 
work?
   
   The old versions only exist in the archives, so yes. But, as previously 
said, this is really only a temporary fix, as it forces the build to depend on 
a transient state on downloads.a.o. The Hadoop devs need to figure out a more 
permanent solution. This is just a stop-gap.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748069#comment-17748069
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

steveloughran commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1653336705

   reviewing this. 
   
   @ctubbsii do we need to bump up the versions for the move to downloads to 
work? 




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-06-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738791#comment-17738791
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1613822991

   The version bumps are merely incidental to the actual issue this PR and JIRA 
are intending to expose: that the use of the archives should be avoided.
   
   If committers want to bump the versions first, I can rebase this PR once 
that is done. However, these changes are pretty trivial, so I'm not really 
needed for that part. Once the committers decide to act on this, they can 
rebase or merge however they see fit. Up to now, these "should" comments about 
what to do about the version bumps have been very unclear to me... it looks 
like discussion among yourselves... but if it's a request for me to change 
something in this PR, please state the request clearly.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-06-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738568#comment-17738568
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1613163547

   > yetus is special compared to gradle as it is asf, but then so would 
thriftc and mvn be.
   
   From https://www.apache.org/legal/release-policy.html#source-packages: `A 
source release SHOULD not contain compiled code`
   
   So, it's fine if you're just bundling and redistributing Yetus scripts, 
which are also source. I'm not sure if Yetus contains compiled code, or if it's 
just scripts, and I'm not sure which parts Hadoop's build uses.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-06-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738566#comment-17738566
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

steveloughran commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1613156364

   yetus is special compared to gradle as it is asf, but then so would thriftc 
and mvn be.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738301#comment-17738301
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1612109177

   > * yetus in source tarballs is interesting. I'm happy for a source bundle 
to include it as it makes for more of a "all the tools you need are here" 
world. (ignoring maven artifacts...)
   
   Unfortunately, I'm pretty sure it's against policy. This is the same issue 
that people face when they want to include the gradlew or mvnw jars in their 
sources.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738287#comment-17738287
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

steveloughran commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1612045152

   * solr change should be pulled out as its own thing, as it includes a 
version change.
   * yetus in source tarballs is interesting. I'm happy for a source bundle to 
include it as it makes for more of a "all the tools you need are here" world. 
(ignoring maven artifacts...)




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-06-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737886#comment-17737886
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1610266483

   > Also, I guess we want to this change in all active branches? i.e. 
branch-3.3 branch-3.2 and branch-2.10.
   
   As a stop-gap, it would be good to prevent any downstream users from hitting 
this on all new releases, as it can trigger a site-wide ban if they've used the 
archives too much. Building Hadoop releases from source shouldn't cause users 
to be banned from access to the ASF.
   
   But, this is only a stop-gap solution, regardless of where it is applied. 
Once these dependencies roll over to the archives, then you'll have the problem 
of users being unable to build Hadoop releases from source without first 
patching its build in some way. So, a more complete solution still needs to be 
created.
   
   The Dockerfile changes are probably okay (as I imagine those are optional, 
and users can derive their own Dockerfiles easily enough from these as 
reference), and the generic message about JVSC is certainly okay. You could 
probably get away with just applying those changes to the trunk.
   
   The main problem to address across all branches is the yetus-wrapper's use 
of the archives. Perhaps one of the following would work?
   1. yetus can be made an optional part of the build (a dev-only profile that 
is inactive by default when users build from source)?
   2. Or you can bundle yetus into the release as a build tool so it doesn't 
need to go to the archives (might be against ASF policy for source releases, 
but perhaps there's an exception to the rule)?
   3. Or perhaps yetus is stable enough that you can just reference 
downloads.apache.org/yetus/latest instead of a specific version?
   4. Or maybe the build instructions should just tell the user that they need 
to download or install it as a build prerequisite, rather than have the Hadoop 
scripts download it?
   5. Or perhaps Yetus can publish its releases to Maven Central or another 
place, from which these can be downloaded?
   I'm not sure what the best solution is, but I definitely think this part 
should be fixed in all branches, somehow.




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18786) Hadoop build depends on archives.apache.org

2023-06-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737829#comment-17737829
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-

ctubbsii opened a new pull request, #5789:
URL: https://github.com/apache/hadoop/pull/5789

   ### Description of PR
   
   * Use Yetus 0.14.1 from downloads.apache.org in yetus-wrapper
   * Use Maven 3.8.8 from downloads.apache.org in Win 10 Dockerfile
   * Point users to downloads.apache.org for JVSC
   * Use Solr 8.11.2 from downloads.apache.org in YARN Dockerfile
   
   ### How was this patch tested?
   
   Download links verified to work.
   
   ### For code changes:
   
   - [x] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   




> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified, so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumptions). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org