Ideally, yes, that list is updated with each release. Non-current releases
will now always be downloaded from archive.apache.org, though, and we run
into rate-limiting problems if that gets hit too often. So yes, it's good
to limit the list to current branches.

It looks like the download is cached in /tmp/test-spark, for what it's
worth.
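
To make the cache-then-fallback idea concrete, here is a rough sketch in
Python. It is not the suite's actual code (that is Scala). The mirror host,
the tarball naming, and the names fetch_release, candidate_urls and
urllib_download are all made up for illustration, and the archive path
layout is an assumption.

```python
import os
import urllib.request

# Assumed URL templates. The mirror host is a placeholder and the archive
# path layout is a guess for illustration only.
MIRROR_TEMPLATE = "https://mirror.example.org/spark/spark-{v}/spark-{v}-bin.tgz"
ARCHIVE_TEMPLATE = "https://archive.apache.org/dist/spark/spark-{v}/spark-{v}-bin.tgz"


def candidate_urls(version):
    """Mirror first, Apache archive second."""
    return [MIRROR_TEMPLATE.format(v=version), ARCHIVE_TEMPLATE.format(v=version)]


def urllib_download(url, dest):
    """Default downloader; HTTP errors such as 503 surface as OSError subclasses."""
    urllib.request.urlretrieve(url, dest)


def fetch_release(version, cache_dir="/tmp/test-spark", download=urllib_download):
    """Return the path to a cached tarball, downloading only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    dest = os.path.join(cache_dir, "spark-{}.tgz".format(version))
    if os.path.exists(dest):
        # Cache hit: no request goes to the mirrors or the archive.
        return dest

    last_error = None
    for url in candidate_urls(version):
        try:
            download(url, dest)
            return dest
        except OSError as err:
            last_error = err
            # Remove any partial file so a later run does not treat it as cached.
            if os.path.exists(dest):
                os.remove(dest)
    raise RuntimeError("could not download Spark " + version) from last_error
```

The point of checking the cache first is that repeated runs on the same
machine never touch archive.apache.org, which is where the rate limiting
bites.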

On Thu, Jul 19, 2018 at 11:06 AM Felix Cheung <felixcheun...@hotmail.com>
wrote:

> +1 this has been problematic.
>
> Also, this list needs to be updated every time we make a new release?
>
> Plus can we cache them on Jenkins, maybe we can avoid downloading the same
> thing from Apache archive every test run.
>
>
> ------------------------------
> *From:* Marco Gaido <marcogaid...@gmail.com>
> *Sent:* Monday, July 16, 2018 11:12 PM
> *To:* Hyukjin Kwon
> *Cc:* Sean Owen; dev
> *Subject:* Re: Cleaning Spark releases from mirrors, and the flakiness of
> HiveExternalCatalogVersionsSuite
>
> +1 too
>
> On Tue, 17 Jul 2018, 05:38 Hyukjin Kwon, <gurwls...@gmail.com> wrote:
>
>> +1
>>
>> On Tue, Jul 17, 2018 at 7:34 AM, Sean Owen <sro...@apache.org> wrote:
>>
>>> Fix is committed to branches back through 2.2.x, where this test was
>>> added.
>>>
>>> There is still some issue; I'm seeing that archive.apache.org is
>>> rate-limiting downloads and frequently returning 503 errors.
>>>
>>> We can help, I guess, by avoiding testing against non-current releases.
>>> Right now we should be testing against 2.3.1, 2.2.2, 2.1.3, right? 2.0.x is
>>> now effectively EOL right?
>>>
>>> I can make that quick change too if everyone's amenable, in order to
>>> prevent more failures in this test from master.
>>>
>>> On Sun, Jul 15, 2018 at 3:51 PM Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> Yesterday I cleaned out old Spark releases from the mirror system --
>>>> we're supposed to only keep the latest release from active branches out on
>>>> mirrors. (All releases are available from the Apache archive site.)
>>>>
>>>> Having done so I realized quickly that the
>>>> HiveExternalCatalogVersionsSuite relies on the versions it downloads being
>>>> available from mirrors. It has been flaky, as sometimes mirrors are
>>>> unreliable. I think now it will not work for any versions except 2.3.1,
>>>> 2.2.2, 2.1.3.
>>>>
>>>> Because we do need to clean those releases out of the mirrors soon
>>>> anyway, and because they're flaky sometimes, I propose adding logic to the
>>>> test to fall back on downloading from the Apache archive site.
>>>>
>>>> ... and I'll do that right away to unblock
>>>> HiveExternalCatalogVersionsSuite runs. I think it needs to be backported to
>>>> other branches as they will still be testing against potentially
>>>> non-current Spark releases.
>>>>
>>>> Sean
>>>>
>>>
