The problem is that it's not really an "official" download link, but rather
just a supplemental convenience. While that may be ok when distributing
artifacts, it's more of a problem when actually building and testing
artifacts. In the latter case, the download should really only be from an
Apache mirror.
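
To make that concrete, here is a minimal Scala sketch of what resolving an
Apache mirror at test time could look like. It assumes the closer.lua mirror
resolver and the shape of its as_json output, and falls back to
archive.apache.org for releases that have aged off the mirrors; none of this
is the suite's actual code, just an illustration:

  import scala.io.Source

  // Sketch only: assumes the ASF mirror resolver at /dyn/closer.lua returns
  // JSON containing a "preferred" mirror when called with as_json=1.
  def preferredMirrorUrl(path: String): String = {
    val resolver = s"https://www.apache.org/dyn/closer.lua?path=$path&as_json=1"
    val json = Source.fromURL(resolver).mkString
    // Crude extraction of the "preferred" field; a real test would use a JSON parser.
    val mirror = """"preferred"\s*:\s*"([^"]+)""".r
      .findFirstMatchIn(json)
      .map(_.group(1))
      .getOrElse("https://archive.apache.org/dist/") // fallback for old releases
    mirror.stripSuffix("/") + "/" + path
  }

  // e.g. preferredMirrorUrl("spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz")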

On Thu, Sep 14, 2017 at 1:20 AM, Wenchen Fan <cloud0...@gmail.com> wrote:

> That test case is trying to test the backward compatibility of
> `HiveExternalCatalog`. It downloads official Spark releases and creates
> tables with them, and then reads those tables via the current Spark.
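>
> (For anyone skimming the thread, the flow is roughly the following. This is
> an illustrative sketch, not the suite's actual code: the version list, the
> warehouse path, and the table names are made up, while the CloudFront URL is
> the one discussed below.)
>
>   import scala.sys.process._
>
>   val oldVersions = Seq("2.0.2", "2.1.1")        // hypothetical list of past releases
>   val warehouse   = "/tmp/compat-test-warehouse" // shared Hive warehouse dir
>
>   oldVersions.foreach { v =>
>     val dist = s"spark-$v-bin-hadoop2.7"
>     // 1. Fetch and unpack the released binary (the download under discussion).
>     s"wget https://d3kbcqa49mib13.cloudfront.net/$dist.tgz".!
>     s"tar -xzf $dist.tgz".!
>     // 2. Use the old release to create a table in the shared warehouse.
>     Seq(s"$dist/bin/spark-sql",
>         "--conf", s"spark.sql.warehouse.dir=$warehouse",
>         "-e", s"CREATE TABLE t_${v.replace(".", "_")} AS SELECT 1 AS id").!
>   }
>   // 3. The current build is then started against the same warehouse/metastore
>   //    and the suite checks that every t_<version> table is still readable.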
>
> About the download link, I just picked it from the Spark website, and this
> link is the default one when you choose "direct download". Do we have a
> better choice?
>
> On Thu, Sep 14, 2017 at 3:05 AM, Shivaram Venkataraman <
> shiva...@eecs.berkeley.edu> wrote:
>
>> Mark, I agree with your point on the risks of using Cloudfront while
>> building Spark. I was only trying to provide background on when we
>> started using Cloudfront.
>>
>> Personally, I don't have enough context about the test case in
>> question (e.g., why are we downloading Spark in a test case?).
>>
>> Thanks
>> Shivaram
>>
>> On Wed, Sep 13, 2017 at 11:50 AM, Mark Hamstra <m...@clearstorydata.com>
>> wrote:
>> > Yeah, but that discussion and use case are a bit different: providing an
>> > alternate route to download the final, released and approved artifacts,
>> > which were themselves built using only acceptable artifacts and sources,
>> > is not the same as building and checking artifacts prior to release using
>> > something that is not from an Apache mirror. This new use case puts us in
>> > the position of approving Spark artifacts that weren't built entirely from
>> > canonical resources located in presumably secure and monitored
>> > repositories. Incorporating something that is not completely trusted or
>> > approved into the process of building something that we are then going to
>> > approve as trusted is different from the prior use of CloudFront.
>> >
>> > On Wed, Sep 13, 2017 at 10:26 AM, Shivaram Venkataraman
>> > <shiva...@eecs.berkeley.edu> wrote:
>> >>
>> >> The bucket comes from CloudFront, a CDN that's part of AWS. There was a
>> >> bunch of discussion about this back in 2013
>> >>
>> >> https://lists.apache.org/thread.html/9a72ff7ce913dd85a6b112b1b2de536dcda74b28b050f70646aba0ac@1380147885@%3Cdev.spark.apache.org%3E
>> >>
>> >> Shivaram
>> >>
>> >> On Wed, Sep 13, 2017 at 9:30 AM, Sean Owen <so...@cloudera.com> wrote:
>> >> > Not a big deal, but Mark noticed that this test now downloads Spark
>> >> > artifacts from the same 'direct download' link available on the
>> >> > downloads
>> >> > page:
>> >> >
>> >> >
>> >> > https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala#L53
>> >> >
>> >> > https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz
>> >> >
>> >> > I don't know of any particular problem with this; it's a parallel
>> >> > download option offered in addition to the Apache mirrors, and it's
>> >> > also the default.
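>> >> >
>> >> > Concretely, the pattern amounts to something like this (paraphrased;
>> >> > the variable names are not the literal source):
>> >> >
>> >> >   val version = "2.2.0" // example value
>> >> >   val url = s"https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz"
>> >> >   // i.e. every test run fetches the binary from this CloudFront bucket
>> >> >   // rather than from an Apache mirror.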
>> >> >
>> >> > Does anyone know what this bucket is and if there's a strong reason we
>> >> > can't just use mirrors?
>> >>
>> >>
>> >
>>
>>
>>
>
