That test case tests the backward compatibility of `HiveExternalCatalog`. It downloads official Spark releases, creates tables with them, and then reads these tables with the current Spark.
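To make the download step concrete, here is a minimal sketch of how the artifact URL for a given release could be built. It is not the code in `HiveExternalCatalogVersionsSuite`; the function names are hypothetical, and the `archive.apache.org` layout is my assumption about a possible Apache-hosted alternative rather than something the test uses. Only the Cloudfront pattern comes from the link discussed in this thread.

```python
# Hypothetical sketch: builds candidate download URLs for a Spark release.
# Nothing here touches the network; it only formats strings.

# Pattern of the "direct download" link quoted in this thread.
CLOUDFRONT_BASE = "https://d3kbcqa49mib13.cloudfront.net"
# Assumption: Apache archive layout, shown only as a possible alternative.
APACHE_ARCHIVE_BASE = "https://archive.apache.org/dist/spark"


def artifact_name(version: str, hadoop: str = "2.7") -> str:
    """Return the tarball file name for a binary Spark release."""
    return f"spark-{version}-bin-hadoop{hadoop}.tgz"


def download_candidates(version: str) -> list[str]:
    """Return candidate URLs for one release, in no particular order of preference."""
    name = artifact_name(version)
    return [
        f"{CLOUDFRONT_BASE}/{name}",
        f"{APACHE_ARCHIVE_BASE}/spark-{version}/{name}",
    ]
```

Calling `download_candidates("2.2.0")` gives one Cloudfront URL and one Apache-hosted URL for the same tarball, so switching the source would be a change to the base URL only.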
About the download link, I just picked it from the Spark website, and this link is the default one when you choose "direct download". Do we have a better choice?

On Thu, Sep 14, 2017 at 3:05 AM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

> Mark, I agree with your point on the risks of using Cloudfront while
> building Spark. I was only trying to provide background on when we
> started using Cloudfront.
>
> Personally, I don't have enough about context about the test case in
> question (e.g. Why are we downloading Spark in a test case ?).
>
> Thanks
> Shivaram
>
> On Wed, Sep 13, 2017 at 11:50 AM, Mark Hamstra <m...@clearstorydata.com> wrote:
> > Yeah, but that discussion and use case is a bit different -- providing a
> > different route to download the final released and approved artifacts that
> > were built using only acceptable artifacts and sources vs. building and
> > checking prior to release using something that is not from an Apache mirror.
> > This new use case puts us in the position of approving spark artifacts that
> > weren't built entirely from canonical resources located in presumably secure
> > and monitored repositories. Incorporating something that is not completely
> > trusted or approved into the process of building something that we are then
> > going to approve as trusted is different from the prior use of cloudfront.
> >
> > On Wed, Sep 13, 2017 at 10:26 AM, Shivaram Venkataraman
> > <shiva...@eecs.berkeley.edu> wrote:
> >>
> >> The bucket comes from Cloudfront, a CDN thats part of AWS.
> >> There was a bunch of discussion about this back in 2013
> >>
> >> https://lists.apache.org/thread.html/9a72ff7ce913dd85a6b112b1b2de536dcda74b28b050f70646aba0ac@1380147885@%3Cdev.spark.apache.org%3E
> >>
> >> Shivaram
> >>
> >> On Wed, Sep 13, 2017 at 9:30 AM, Sean Owen <so...@cloudera.com> wrote:
> >> > Not a big deal, but Mark noticed that this test now downloads Spark
> >> > artifacts from the same 'direct download' link available on the
> >> > downloads page:
> >> >
> >> > https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala#L53
> >> >
> >> > https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz
> >> >
> >> > I don't know of any particular problem with this, which is a parallel
> >> > download option in addition to the Apache mirrors. It's also the
> >> > default.
> >> >
> >> > Does anyone know what this bucket is and if there's a strong reason we
> >> > can't just use mirrors?
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org