[ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17737886#comment-17737886
 ] 

ASF GitHub Bot commented on HADOOP-18786:
-----------------------------------------

ctubbsii commented on PR #5789:
URL: https://github.com/apache/hadoop/pull/5789#issuecomment-1610266483

   > Also, I guess we want to apply this change in all active branches? i.e. 
branch-3.3, branch-3.2, and branch-2.10.
   
   As a stop-gap, it would be good to prevent any downstream users from hitting 
this on all new releases, as it can trigger a site-wide ban if they've used the 
archives too much. Building Hadoop releases from source shouldn't cause users 
to be banned from access to the ASF.
   
   But, this is only a stop-gap solution, regardless of where it is applied. 
Once these dependencies roll over to the archives, then you'll have the problem 
of users being unable to build Hadoop releases from source without first 
patching its build in some way. So, a more complete solution still needs to be 
created.
   
   The Dockerfile changes are probably okay (as I imagine those are optional, 
and users can derive their own Dockerfiles easily enough from these as 
reference), and the generic message about JSVC is certainly okay. You could 
probably get away with just applying those changes to trunk.
   
   The main problem to address across all branches is the yetus-wrapper's use 
of the archives. Perhaps one of the following would work?
   1. yetus can be made an optional part of the build (a dev-only profile that 
is inactive by default when users build from source)?
   2. Or you can bundle yetus into the release as a build tool so it doesn't 
need to go to the archives (might be against ASF policy for source releases, 
but perhaps there's an exception to the rule)?
   3. Or perhaps yetus is stable enough that you can just reference 
downloads.apache.org/yetus/latest instead of a specific version?
   4. Or maybe the build instructions should just tell the user that they need 
to download or install it as a build prerequisite, rather than have the Hadoop 
scripts download it?
   5. Or perhaps Yetus can publish its releases to Maven Central or another 
place, from which these can be downloaded?
   
   I'm not sure what the best solution is, but I definitely think this part 
should be fixed in all branches, somehow.
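   
   As a rough illustration of how options 3 and 4 could combine, here is a 
minimal sketch, not the actual yetus-wrapper logic. It prefers a tool the user 
has already installed and otherwise points at downloads.apache.org rather than 
the archive. The function names are hypothetical, and whether a `latest` path 
actually exists under downloads.apache.org/yetus is an unverified assumption 
carried over from option 3.

```python
import shutil
from typing import Callable, Optional
from urllib.parse import urlparse

# Assumption: base URL taken from option 3; the "latest" path is unverified.
DOWNLOADS_BASE = "https://downloads.apache.org/yetus/latest"
ARCHIVE_HOST = "archive.apache.org"


def locate_tool(name: str,
                which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return a local path if the tool is on PATH, else a non-archive URL.

    Hypothetical helper: `name` is whatever executable the build needs.
    """
    local = which(name)
    if local is not None:
        # Option 4: the user installed the tool as a build prerequisite.
        return local
    # Option 3: fall back to the current-release download area, not the archive.
    return DOWNLOADS_BASE


def uses_archive(url: str) -> bool:
    """Report whether a URL points at the ASF archive host."""
    return urlparse(url).hostname == ARCHIVE_HOST
```

   A check like `uses_archive` could also run in CI to flag any newly added 
archive references before a release is cut.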




> Hadoop build depends on archives.apache.org
> -------------------------------------------
>
>                 Key: HADOOP-18786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18786
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 3.3.6
>            Reporter: Christopher Tubbs
>            Priority: Critical
>              Labels: pull-request-available
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified so that it does not require access to fixed 
> versions in the archives (or should work with the upstream of those dependent 
> projects to publish their releases elsewhere, for routine consumption). In 
> the interim, the source could be updated to point to the current dependency 
> versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
