That test case is trying to test the backward compatibility of
`HiveExternalCatalog`. It downloads official Spark releases and creates
tables with them, and then reads these tables via the current Spark.

About the download link, I just picked it from the Spark website, and this
link is the default one when you choose "direct download". Do we have a
better choice?

On Thu, Sep 14, 2017 at 3:05 AM, Shivaram Venkataraman <
shiva...@eecs.berkeley.edu> wrote:

> Mark, I agree with your point on the risks of using Cloudfront while
> building Spark. I was only trying to provide background on when we
> started using Cloudfront.
>
> Personally, I don't have enough context about the test case in
> question (e.g. why are we downloading Spark in a test case?).
>
> Thanks
> Shivaram
>
> On Wed, Sep 13, 2017 at 11:50 AM, Mark Hamstra <m...@clearstorydata.com>
> wrote:
> > Yeah, but that discussion and use case is a bit different -- providing a
> > different route to download the final released and approved artifacts that
> > were built using only acceptable artifacts and sources vs. building and
> > checking prior to release using something that is not from an Apache mirror.
> > This new use case puts us in the position of approving spark artifacts that
> > weren't built entirely from canonical resources located in presumably secure
> > and monitored repositories. Incorporating something that is not completely
> > trusted or approved into the process of building something that we are then
> > going to approve as trusted is different from the prior use of cloudfront.
> >
> > On Wed, Sep 13, 2017 at 10:26 AM, Shivaram Venkataraman
> > <shiva...@eecs.berkeley.edu> wrote:
> >>
> >> The bucket comes from Cloudfront, a CDN that's part of AWS. There was a
> >> bunch of discussion about this back in 2013:
> >>
> >> https://lists.apache.org/thread.html/9a72ff7ce913dd85a6b112b1b2de536dcda74b28b050f70646aba0ac@1380147885@%3Cdev.spark.apache.org%3E
> >>
> >> Shivaram
> >>
> >> On Wed, Sep 13, 2017 at 9:30 AM, Sean Owen <so...@cloudera.com> wrote:
> >> > Not a big deal, but Mark noticed that this test now downloads Spark
> >> > artifacts from the same 'direct download' link available on the
> >> > downloads page:
> >> >
> >> > https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala#L53
> >> >
> >> > https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz
> >> >
> >> > I don't know of any particular problem with this, which is a parallel
> >> > download option in addition to the Apache mirrors. It's also the default.
> >> >
> >> > Does anyone know what this bucket is and if there's a strong reason we
> >> > can't just use mirrors?
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >>
> >
>
