Oh, sweet! For example: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz?asjson=1
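In case it helps anyone scripting this, here is a rough sketch of how an install script might use that URL. It relies only on the 'preferred' field Shivaram mentioned. The helper names (`mirror_query_url`, `preferred_download_url`, `resolve_mirror`) are made up for illustration, and I'm assuming the preferred mirror is a base URL that the same relative path can be appended to.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Mirror resolver endpoint, as used in the example URL above.
CLOSER = "http://www.apache.org/dyn/closer.cgi"


def mirror_query_url(path, base=CLOSER):
    """Build the closer.cgi URL that asks for a JSON answer."""
    return f"{base}/{path.lstrip('/')}?{urlencode({'asjson': 1})}"


def preferred_download_url(response_text, path):
    """Join the 'preferred' mirror from the JSON reply with the artifact path.

    Assumption: 'preferred' is a mirror base URL (with or without a
    trailing slash) and the artifact lives at the same relative path.
    """
    data = json.loads(response_text)
    return data["preferred"].rstrip("/") + "/" + path.lstrip("/")


def resolve_mirror(path):
    """Query closer.cgi over the network and return a download URL."""
    with urlopen(mirror_query_url(path)) as resp:
        return preferred_download_url(resp.read().decode("utf-8"), path)


# Usage (needs network access):
#   resolve_mirror("hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz")
```

Keeping the URL building and JSON handling separate from the network call means the first two functions can be checked offline.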
Thanks for sharing that tip. Looks like you can also use as_json
<https://svn.apache.org/repos/asf/infrastructure/site/trunk/content/dyn/mirrors/mirrors.cgi>
(vs. asjson).

Nick

On Sun, Nov 1, 2015 at 5:32 PM Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

> On Sun, Nov 1, 2015 at 2:16 PM, Nicholas Chammas
> <nicholas.cham...@gmail.com> wrote:
> > OK, I’ll focus on the Apache mirrors going forward.
> >
> > The problem with the Apache mirrors, if I am not mistaken, is that you
> > cannot use a single URL that automatically redirects you to a working
> > mirror to download Hadoop. You have to pick a specific mirror and pray
> > it doesn’t disappear tomorrow.
> >
> > They don’t go away, especially http://mirror.ox.ac.uk , and in the us
> > the apache.osuosl.org, osu being a where a lot of the ASF servers are
> > kept.
> >
> > So does Apache offer no way to query a URL and automatically get the
> > closest working mirror? If I’m installing HDFS onto servers in various
> > EC2 regions, the best mirror will vary depending on my location.
> >
>
> Not sure if this is officially documented somewhere but if you pass
> '&asjson=1' you will get back a JSON which has a 'preferred' field set
> to the closest mirror.
>
> Shivaram
>
> > Nick
> >
> > On Sun, Nov 1, 2015 at 12:25 PM Shivaram Venkataraman
> > <shiva...@eecs.berkeley.edu> wrote:
> >>
> >> I think that getting them from the ASF mirrors is a better strategy in
> >> general as it'll remove the overhead of keeping the S3 bucket up to
> >> date. It works in the spark-ec2 case because we only support a limited
> >> number of Hadoop versions from the tool. FWIW I don't have write
> >> access to the bucket and also haven't heard of any plans to support
> >> newer versions in spark-ec2.
> >>
> >> Thanks
> >> Shivaram
> >>
> >> On Sun, Nov 1, 2015 at 2:30 AM, Steve Loughran <ste...@hortonworks.com>
> >> wrote:
> >> >
> >> > On 1 Nov 2015, at 03:17, Nicholas Chammas <nicholas.cham...@gmail.com>
> >> > wrote:
> >> >
> >> > https://s3.amazonaws.com/spark-related-packages/
> >> >
> >> > spark-ec2 uses this bucket to download and install HDFS on clusters.
> >> > Is it owned by the Spark project or by the AMPLab?
> >> >
> >> > Anyway, it looks like the latest Hadoop install available on there is
> >> > Hadoop 2.4.0.
> >> >
> >> > Are there plans to add newer versions of Hadoop for use by spark-ec2
> >> > and similar tools, or should we just be getting that stuff via an
> >> > Apache mirror?
> >> > The latest version is 2.7.1, by the way.
> >> >
> >> >
> >> > you should be grabbing the artifacts off the ASF and then verifying
> >> > their SHA1 checksums as published on the ASF HTTPS web site
> >> >
> >> >
> >> > The problem with the Apache mirrors, if I am not mistaken, is that you
> >> > cannot use a single URL that automatically redirects you to a working
> >> > mirror to download Hadoop. You have to pick a specific mirror and pray
> >> > it doesn't disappear tomorrow.
> >> >
> >> >
> >> > They don't go away, especially http://mirror.ox.ac.uk , and in the us
> >> > the apache.osuosl.org, osu being a where a lot of the ASF servers are
> >> > kept.
> >> >
> >> > full list with availability stats
> >> >
> >> > http://www.apache.org/mirrors/
> >> >
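
On Steve's point about verifying the SHA1 checksum after downloading from a mirror, a minimal sketch of that check might look like the following. The function names are hypothetical, and it assumes you have already fetched the published digest from the ASF HTTPS site and pass it in as a hex string (whitespace and case are ignored, since published digests are sometimes grouped or upper-cased).

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha1(path, published_hex):
    """Compare a local file against a published SHA1 hex digest.

    Assumption: published_hex contains only the digest itself,
    possibly split into groups or upper-cased.
    """
    normalized = "".join(published_hex.split()).lower()
    return sha1_of_file(path) == normalized


# Usage:
#   matches_published_sha1("hadoop-2.7.1.tar.gz", digest_from_asf_site)
```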