Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-16 Thread Xiao Li
-1

Let us include the correctness fix:
https://github.com/apache/spark/pull/27229

Thanks,

Xiao

On Thu, Jan 16, 2020 at 8:46 AM Dongjoon Hyun 
wrote:

> Thank you, Jungtaek!
>
> Bests,
> Dongjoon.
>
>
> On Wed, Jan 15, 2020 at 8:57 PM Jungtaek Lim 
> wrote:
>
>> Now that we have decided to cancel RC1, how about including SPARK-29450 (
>> https://github.com/apache/spark/pull/27209) in RC2?
>>
>> SPARK-29450 was merged into master, and Xiao figured out that it fixes a
>> long-standing regression (broken since 2.3.0). The link refers to the PR
>> for the 2.4 branch.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Thu, Jan 16, 2020 at 12:56 PM Dongjoon Hyun 
>> wrote:
>>
>>> Sure, Wenchen and Hyukjin.
>>>
>>> I have seen all of the issues reported above and have been waiting to
>>> collect more information before cancelling the RC1 vote.
>>>
>>> I have also seen that Marcelo and Sean requested reverting an existing
>>> commit:
>>> - https://github.com/apache/spark/pull/24732 (spark.shuffle.io.backLog
>>> change)
>>>
>>> To all:
>>> We want your explicit feedback. Please reply on this thread.
>>>
>>> Although we have received enough positive feedback here, I'll cancel this
>>> RC1. I want to address at least the negative feedback above and roll RC2
>>> next Monday.
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>>
>>> On Wed, Jan 15, 2020 at 7:47 PM Hyukjin Kwon 
>>> wrote:
>>>
 If we go for RC2, we should include both:

 https://github.com/apache/spark/pull/27210
 https://github.com/apache/spark/pull/27184

 just for the sake of being complete and making the maintenance simple.


 On Thu, Jan 16, 2020 at 12:38 PM, Wenchen Fan wrote:

> Recently we merged several fixes to 2.4:
> https://issues.apache.org/jira/browse/SPARK-30325   a driver hang
> issue
> https://issues.apache.org/jira/browse/SPARK-30246   a memory leak
> issue
> https://issues.apache.org/jira/browse/SPARK-29708   a correctness
> issue(for a rarely used feature, so not merged to 2.4 yet)
>
> Shall we include them?
>
>
> On Wed, Jan 15, 2020 at 9:51 PM Hyukjin Kwon 
> wrote:
>
>> +1
>>
>> On Wed, 15 Jan 2020, 08:24 Takeshi Yamamuro, 
>> wrote:
>>
>>> +1;
>>>
>>> I checked the links and materials, then I run the tests with
>>> `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes
>>> -Psparkr`
>>> on macOS (Java 8).
>>> All the things look fine and I didn't see the error on my env
>>> that Sean said above.
>>>
>>> Thanks, Dongjoon!
>>>
>>> Bests,
>>> Takeshi
>>>
>>> On Wed, Jan 15, 2020 at 4:09 AM DB Tsai  wrote:
>>>
 +1 Thanks.

 Sincerely,

 DB Tsai
 --
 Web: https://www.dbtsai.com
 PGP Key ID: 42E5B25A8F7A82C1

 On Tue, Jan 14, 2020 at 11:08 AM Sean Owen 
 wrote:
 >
 > Yeah it's something about the env I spun up, but I don't know
 what. It
 > happens frequently when I test, but not on Jenkins.
 > The Kafka error comes up every now and then and a clean rebuild
 fixes
 > it, but not in my case. I don't know why.
 > But if nobody else sees it, I'm pretty sure it's just an artifact
 of
 > the local VM.
 >
 > On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun <
 dongjoon.h...@gmail.com> wrote:
 > >
 > > Thank you, Sean.
 > >
 > > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
 > >
 > >
 https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
 > >
 > > For the failures,
 > >1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is
 a known one.
 > >2. For `HDFSMetadataLogSuite` failure, I also observed a few
 time before in CentOS too.
 > >3. Kafka build error is new to me. Does it happen on `Maven`
 clean build?
 > >
 > > Bests,
 > > Dongjoon.
 > >
 > >
 > > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen 
 wrote:
 > >>
 > >> +1 from me. I checked sigs/licenses, and built/tested from
 source on
 > >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
 > >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I
 do get
 > >> test failures, but, these are some I have always seen on
 Ubuntu, and I
 > >> do not know why they happen. They don't seem to affect others,
 but,
 > >> let me know if anyone else sees these?
 > >>
 > >>
 > >> Always happens for me:
 > >>
 > >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
>>

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-16 Thread Dongjoon Hyun
Thank you, Jungtaek!

Bests,
Dongjoon.


On Wed, Jan 15, 2020 at 8:57 PM Jungtaek Lim 
wrote:

> Now that we have decided to cancel RC1, how about including SPARK-29450 (
> https://github.com/apache/spark/pull/27209) in RC2?
>
> SPARK-29450 was merged into master, and Xiao figured out that it fixes a
> long-standing regression (broken since 2.3.0). The link refers to the PR
> for the 2.4 branch.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Thu, Jan 16, 2020 at 12:56 PM Dongjoon Hyun 
> wrote:
>
>> Sure, Wenchen and Hyukjin.
>>
>> I have seen all of the issues reported above and have been waiting to
>> collect more information before cancelling the RC1 vote.
>>
>> I have also seen that Marcelo and Sean requested reverting an existing
>> commit:
>> - https://github.com/apache/spark/pull/24732 (spark.shuffle.io.backLog
>> change)
>>
>> To all:
>> We want your explicit feedback. Please reply on this thread.
>>
>> Although we have received enough positive feedback here, I'll cancel this
>> RC1. I want to address at least the negative feedback above and roll RC2
>> next Monday.
>>
>> Bests,
>> Dongjoon.
>>
>>
>> On Wed, Jan 15, 2020 at 7:47 PM Hyukjin Kwon  wrote:
>>
>>> If we go for RC2, we should include both:
>>>
>>> https://github.com/apache/spark/pull/27210
>>> https://github.com/apache/spark/pull/27184
>>>
>>> just for the sake of being complete and making the maintenance simple.
>>>
>>>
>>> On Thu, Jan 16, 2020 at 12:38 PM, Wenchen Fan wrote:
>>>
 Recently we merged several fixes to 2.4:
 https://issues.apache.org/jira/browse/SPARK-30325   a driver hang issue
 https://issues.apache.org/jira/browse/SPARK-30246   a memory leak issue
 https://issues.apache.org/jira/browse/SPARK-29708   a correctness
 issue(for a rarely used feature, so not merged to 2.4 yet)

 Shall we include them?


 On Wed, Jan 15, 2020 at 9:51 PM Hyukjin Kwon 
 wrote:

> +1
>
> On Wed, 15 Jan 2020, 08:24 Takeshi Yamamuro, 
> wrote:
>
>> +1;
>>
>> I checked the links and materials, then I run the tests with
>> `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes
>> -Psparkr`
>> on macOS (Java 8).
>> All the things look fine and I didn't see the error on my env
>> that Sean said above.
>>
>> Thanks, Dongjoon!
>>
>> Bests,
>> Takeshi
>>
>> On Wed, Jan 15, 2020 at 4:09 AM DB Tsai  wrote:
>>
>>> +1 Thanks.
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> --
>>> Web: https://www.dbtsai.com
>>> PGP Key ID: 42E5B25A8F7A82C1
>>>
>>> On Tue, Jan 14, 2020 at 11:08 AM Sean Owen 
>>> wrote:
>>> >
>>> > Yeah it's something about the env I spun up, but I don't know
>>> what. It
>>> > happens frequently when I test, but not on Jenkins.
>>> > The Kafka error comes up every now and then and a clean rebuild
>>> fixes
>>> > it, but not in my case. I don't know why.
>>> > But if nobody else sees it, I'm pretty sure it's just an artifact
>>> of
>>> > the local VM.
>>> >
>>> > On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun <
>>> dongjoon.h...@gmail.com> wrote:
>>> > >
>>> > > Thank you, Sean.
>>> > >
>>> > > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
>>> > >
>>> > >
>>> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
>>> > >
>>> > > For the failures,
>>> > >1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a
>>> known one.
>>> > >2. For `HDFSMetadataLogSuite` failure, I also observed a few
>>> time before in CentOS too.
>>> > >3. Kafka build error is new to me. Does it happen on `Maven`
>>> clean build?
>>> > >
>>> > > Bests,
>>> > > Dongjoon.
>>> > >
>>> > >
>>> > > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen 
>>> wrote:
>>> > >>
>>> > >> +1 from me. I checked sigs/licenses, and built/tested from
>>> source on
>>> > >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
>>> > >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do
>>> get
>>> > >> test failures, but, these are some I have always seen on
>>> Ubuntu, and I
>>> > >> do not know why they happen. They don't seem to affect others,
>>> but,
>>> > >> let me know if anyone else sees these?
>>> > >>
>>> > >>
>>> > >> Always happens for me:
>>> > >>
>>> > >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
>>> > >>   The await method on Waiter timed out.
>>> (HDFSMetadataLogSuite.scala:178)
>>> > >>
>>> > >> This one has been flaky at times due to external dependencies:
>>> > >>
>>> > >> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite ***
>>> ABORTED ***
>>> > >>   Exception encountered wh

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-15 Thread Jungtaek Lim
Now that we have decided to cancel RC1, how about including SPARK-29450 (
https://github.com/apache/spark/pull/27209) in RC2?

SPARK-29450 was merged into master, and Xiao figured out that it fixes a
long-standing regression (broken since 2.3.0). The link refers to the PR
for the 2.4 branch.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Thu, Jan 16, 2020 at 12:56 PM Dongjoon Hyun 
wrote:

> Sure, Wenchen and Hyukjin.
>
> I have seen all of the issues reported above and have been waiting to
> collect more information before cancelling the RC1 vote.
>
> I have also seen that Marcelo and Sean requested reverting an existing
> commit:
> - https://github.com/apache/spark/pull/24732 (spark.shuffle.io.backLog
> change)
>
> To all:
> We want your explicit feedback. Please reply on this thread.
>
> Although we have received enough positive feedback here, I'll cancel this
> RC1. I want to address at least the negative feedback above and roll RC2
> next Monday.
>
> Bests,
> Dongjoon.
>
>
> On Wed, Jan 15, 2020 at 7:47 PM Hyukjin Kwon  wrote:
>
>> If we go for RC2, we should include both:
>>
>> https://github.com/apache/spark/pull/27210
>> https://github.com/apache/spark/pull/27184
>>
>> just for the sake of being complete and making the maintenance simple.
>>
>>
>> On Thu, Jan 16, 2020 at 12:38 PM, Wenchen Fan wrote:
>>
>>> Recently we merged several fixes to 2.4:
>>> https://issues.apache.org/jira/browse/SPARK-30325   a driver hang issue
>>> https://issues.apache.org/jira/browse/SPARK-30246   a memory leak issue
>>> https://issues.apache.org/jira/browse/SPARK-29708   a correctness
>>> issue(for a rarely used feature, so not merged to 2.4 yet)
>>>
>>> Shall we include them?
>>>
>>>
>>> On Wed, Jan 15, 2020 at 9:51 PM Hyukjin Kwon 
>>> wrote:
>>>
 +1

 On Wed, 15 Jan 2020, 08:24 Takeshi Yamamuro, 
 wrote:

> +1;
>
> I checked the links and materials, then I run the tests with
> `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes
> -Psparkr`
> on macOS (Java 8).
> All the things look fine and I didn't see the error on my env
> that Sean said above.
>
> Thanks, Dongjoon!
>
> Bests,
> Takeshi
>
> On Wed, Jan 15, 2020 at 4:09 AM DB Tsai  wrote:
>
>> +1 Thanks.
>>
>> Sincerely,
>>
>> DB Tsai
>> --
>> Web: https://www.dbtsai.com
>> PGP Key ID: 42E5B25A8F7A82C1
>>
>> On Tue, Jan 14, 2020 at 11:08 AM Sean Owen  wrote:
>> >
>> > Yeah it's something about the env I spun up, but I don't know what.
>> It
>> > happens frequently when I test, but not on Jenkins.
>> > The Kafka error comes up every now and then and a clean rebuild
>> fixes
>> > it, but not in my case. I don't know why.
>> > But if nobody else sees it, I'm pretty sure it's just an artifact of
>> > the local VM.
>> >
>> > On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun <
>> dongjoon.h...@gmail.com> wrote:
>> > >
>> > > Thank you, Sean.
>> > >
>> > > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
>> > >
>> > >
>> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
>> > >
>> > > For the failures,
>> > >1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a
>> known one.
>> > >2. For `HDFSMetadataLogSuite` failure, I also observed a few
>> time before in CentOS too.
>> > >3. Kafka build error is new to me. Does it happen on `Maven`
>> clean build?
>> > >
>> > > Bests,
>> > > Dongjoon.
>> > >
>> > >
>> > > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen 
>> wrote:
>> > >>
>> > >> +1 from me. I checked sigs/licenses, and built/tested from
>> source on
>> > >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
>> > >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do
>> get
>> > >> test failures, but, these are some I have always seen on Ubuntu,
>> and I
>> > >> do not know why they happen. They don't seem to affect others,
>> but,
>> > >> let me know if anyone else sees these?
>> > >>
>> > >>
>> > >> Always happens for me:
>> > >>
>> > >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
>> > >>   The await method on Waiter timed out.
>> (HDFSMetadataLogSuite.scala:178)
>> > >>
>> > >> This one has been flaky at times due to external dependencies:
>> > >>
>> > >> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite ***
>> ABORTED ***
>> > >>   Exception encountered when invoking run on a nested suite -
>> > >> spark-submit returned with exit code 1.
>> > >>   Command line: './bin/spark-submit' '--name' 'prepare testing
>> tables'
>> > >> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
>> > >> 

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-15 Thread Dongjoon Hyun
Sure, Wenchen and Hyukjin.

I have seen all of the issues reported above and have been waiting to
collect more information before cancelling the RC1 vote.

I have also seen that Marcelo and Sean requested reverting an existing
commit:
- https://github.com/apache/spark/pull/24732 (spark.shuffle.io.backLog
change)

To all:
We want your explicit feedback. Please reply on this thread.

Although we have received enough positive feedback here, I'll cancel this
RC1. I want to address at least the negative feedback above and roll RC2
next Monday.

Bests,
Dongjoon.


On Wed, Jan 15, 2020 at 7:47 PM Hyukjin Kwon  wrote:

> If we go for RC2, we should include both:
>
> https://github.com/apache/spark/pull/27210
> https://github.com/apache/spark/pull/27184
>
> just for the sake of being complete and making the maintenance simple.
>
>
> On Thu, Jan 16, 2020 at 12:38 PM, Wenchen Fan wrote:
>
>> Recently we merged several fixes to 2.4:
>> https://issues.apache.org/jira/browse/SPARK-30325   a driver hang issue
>> https://issues.apache.org/jira/browse/SPARK-30246   a memory leak issue
>> https://issues.apache.org/jira/browse/SPARK-29708   a correctness
>> issue(for a rarely used feature, so not merged to 2.4 yet)
>>
>> Shall we include them?
>>
>>
>> On Wed, Jan 15, 2020 at 9:51 PM Hyukjin Kwon  wrote:
>>
>>> +1
>>>
>>> On Wed, 15 Jan 2020, 08:24 Takeshi Yamamuro, 
>>> wrote:
>>>
 +1;

 I checked the links and materials, then I run the tests with
 `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes
 -Psparkr`
 on macOS (Java 8).
 All the things look fine and I didn't see the error on my env
 that Sean said above.

 Thanks, Dongjoon!

 Bests,
 Takeshi

 On Wed, Jan 15, 2020 at 4:09 AM DB Tsai  wrote:

> +1 Thanks.
>
> Sincerely,
>
> DB Tsai
> --
> Web: https://www.dbtsai.com
> PGP Key ID: 42E5B25A8F7A82C1
>
> On Tue, Jan 14, 2020 at 11:08 AM Sean Owen  wrote:
> >
> > Yeah it's something about the env I spun up, but I don't know what.
> It
> > happens frequently when I test, but not on Jenkins.
> > The Kafka error comes up every now and then and a clean rebuild fixes
> > it, but not in my case. I don't know why.
> > But if nobody else sees it, I'm pretty sure it's just an artifact of
> > the local VM.
> >
> > On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun <
> dongjoon.h...@gmail.com> wrote:
> > >
> > > Thank you, Sean.
> > >
> > > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
> > >
> > >
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
> > >
> > > For the failures,
> > >1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a
> known one.
> > >2. For `HDFSMetadataLogSuite` failure, I also observed a few
> time before in CentOS too.
> > >3. Kafka build error is new to me. Does it happen on `Maven`
> clean build?
> > >
> > > Bests,
> > > Dongjoon.
> > >
> > >
> > > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen 
> wrote:
> > >>
> > >> +1 from me. I checked sigs/licenses, and built/tested from source
> on
> > >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
> > >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do
> get
> > >> test failures, but, these are some I have always seen on Ubuntu,
> and I
> > >> do not know why they happen. They don't seem to affect others,
> but,
> > >> let me know if anyone else sees these?
> > >>
> > >>
> > >> Always happens for me:
> > >>
> > >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
> > >>   The await method on Waiter timed out.
> (HDFSMetadataLogSuite.scala:178)
> > >>
> > >> This one has been flaky at times due to external dependencies:
> > >>
> > >> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite ***
> ABORTED ***
> > >>   Exception encountered when invoking run on a nested suite -
> > >> spark-submit returned with exit code 1.
> > >>   Command line: './bin/spark-submit' '--name' 'prepare testing
> tables'
> > >> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
> > >> 'spark.master.rest.enabled=false' '--conf'
> > >>
> 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
> > >> '--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
> > >>
> '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
> > >> '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'
> > >>
> > >> Kafka doesn't build with this weird error. I tried a clean build

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-15 Thread Hyukjin Kwon
If we go for RC2, we should include both:

https://github.com/apache/spark/pull/27210
https://github.com/apache/spark/pull/27184

for the sake of completeness and to keep maintenance simple.


On Thu, Jan 16, 2020 at 12:38 PM, Wenchen Fan wrote:

> Recently we merged several fixes to 2.4:
> https://issues.apache.org/jira/browse/SPARK-30325   a driver hang issue
> https://issues.apache.org/jira/browse/SPARK-30246   a memory leak issue
> https://issues.apache.org/jira/browse/SPARK-29708   a correctness
> issue(for a rarely used feature, so not merged to 2.4 yet)
>
> Shall we include them?
>
>
> On Wed, Jan 15, 2020 at 9:51 PM Hyukjin Kwon  wrote:
>
>> +1
>>
>> On Wed, 15 Jan 2020, 08:24 Takeshi Yamamuro, 
>> wrote:
>>
>>> +1;
>>>
>>> I checked the links and materials, then I run the tests with
>>> `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes
>>> -Psparkr`
>>> on macOS (Java 8).
>>> All the things look fine and I didn't see the error on my env
>>> that Sean said above.
>>>
>>> Thanks, Dongjoon!
>>>
>>> Bests,
>>> Takeshi
>>>
>>> On Wed, Jan 15, 2020 at 4:09 AM DB Tsai  wrote:
>>>
 +1 Thanks.

 Sincerely,

 DB Tsai
 --
 Web: https://www.dbtsai.com
 PGP Key ID: 42E5B25A8F7A82C1

 On Tue, Jan 14, 2020 at 11:08 AM Sean Owen  wrote:
 >
 > Yeah it's something about the env I spun up, but I don't know what. It
 > happens frequently when I test, but not on Jenkins.
 > The Kafka error comes up every now and then and a clean rebuild fixes
 > it, but not in my case. I don't know why.
 > But if nobody else sees it, I'm pretty sure it's just an artifact of
 > the local VM.
 >
 > On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun <
 dongjoon.h...@gmail.com> wrote:
 > >
 > > Thank you, Sean.
 > >
 > > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
 > >
 > >
 https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
 > >
 > > For the failures,
 > >1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a
 known one.
 > >2. For `HDFSMetadataLogSuite` failure, I also observed a few
 time before in CentOS too.
 > >3. Kafka build error is new to me. Does it happen on `Maven`
 clean build?
 > >
 > > Bests,
 > > Dongjoon.
 > >
 > >
 > > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen 
 wrote:
 > >>
 > >> +1 from me. I checked sigs/licenses, and built/tested from source
 on
 > >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
 > >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get
 > >> test failures, but, these are some I have always seen on Ubuntu,
 and I
 > >> do not know why they happen. They don't seem to affect others, but,
 > >> let me know if anyone else sees these?
 > >>
 > >>
 > >> Always happens for me:
 > >>
 > >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
 > >>   The await method on Waiter timed out.
 (HDFSMetadataLogSuite.scala:178)
 > >>
 > >> This one has been flaky at times due to external dependencies:
 > >>
 > >> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite ***
 ABORTED ***
 > >>   Exception encountered when invoking run on a nested suite -
 > >> spark-submit returned with exit code 1.
 > >>   Command line: './bin/spark-submit' '--name' 'prepare testing
 tables'
 > >> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
 > >> 'spark.master.rest.enabled=false' '--conf'
 > >>
 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
 > >> '--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
 > >>
 '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
 > >> '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'
 > >>
 > >> Kafka doesn't build with this weird error. I tried a clean build. I
 > >> think we've seen this before.
 > >>
 > >> [error] This symbol is required by 'method
 > >> org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
 > >> [error] Make sure that term eclipse is in your classpath and check
 for
 > >> conflicting dependencies with `-Ylog-classpath`.
 > >> [error] A full rebuild may help if 'MetricsSystem.class' was
 compiled
 > >> against an incompatible version of org.
 > >> [error] testUtils.sendMessages(topic, data.toArray)
 > >> [error]
 > >>
 > >> On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun <
 dongjoon.h...@gmail.com> wrote:
 > >> >
 > >> > Please vote on releasing the following candidate as Apache Spark
 version 2.4.5.
 > >> >

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-15 Thread Wenchen Fan
Recently we merged several fixes to 2.4:
https://issues.apache.org/jira/browse/SPARK-30325   a driver hang issue
https://issues.apache.org/jira/browse/SPARK-30246   a memory leak issue
https://issues.apache.org/jira/browse/SPARK-29708   a correctness issue (for
a rarely used feature, so not merged to 2.4 yet)

Shall we include them?


On Wed, Jan 15, 2020 at 9:51 PM Hyukjin Kwon  wrote:

> +1
>
> On Wed, 15 Jan 2020, 08:24 Takeshi Yamamuro, 
> wrote:
>
>> +1;
>>
>> I checked the links and materials, then I run the tests with
>> `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes
>> -Psparkr`
>> on macOS (Java 8).
>> All the things look fine and I didn't see the error on my env
>> that Sean said above.
>>
>> Thanks, Dongjoon!
>>
>> Bests,
>> Takeshi
>>
>> On Wed, Jan 15, 2020 at 4:09 AM DB Tsai  wrote:
>>
>>> +1 Thanks.
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> --
>>> Web: https://www.dbtsai.com
>>> PGP Key ID: 42E5B25A8F7A82C1
>>>
>>> On Tue, Jan 14, 2020 at 11:08 AM Sean Owen  wrote:
>>> >
>>> > Yeah it's something about the env I spun up, but I don't know what. It
>>> > happens frequently when I test, but not on Jenkins.
>>> > The Kafka error comes up every now and then and a clean rebuild fixes
>>> > it, but not in my case. I don't know why.
>>> > But if nobody else sees it, I'm pretty sure it's just an artifact of
>>> > the local VM.
>>> >
>>> > On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun <
>>> dongjoon.h...@gmail.com> wrote:
>>> > >
>>> > > Thank you, Sean.
>>> > >
>>> > > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
>>> > >
>>> > >
>>> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
>>> > >
>>> > > For the failures,
>>> > >1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a
>>> known one.
>>> > >2. For `HDFSMetadataLogSuite` failure, I also observed a few time
>>> before in CentOS too.
>>> > >3. Kafka build error is new to me. Does it happen on `Maven`
>>> clean build?
>>> > >
>>> > > Bests,
>>> > > Dongjoon.
>>> > >
>>> > >
>>> > > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen  wrote:
>>> > >>
>>> > >> +1 from me. I checked sigs/licenses, and built/tested from source on
>>> > >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
>>> > >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get
>>> > >> test failures, but, these are some I have always seen on Ubuntu,
>>> and I
>>> > >> do not know why they happen. They don't seem to affect others, but,
>>> > >> let me know if anyone else sees these?
>>> > >>
>>> > >>
>>> > >> Always happens for me:
>>> > >>
>>> > >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
>>> > >>   The await method on Waiter timed out.
>>> (HDFSMetadataLogSuite.scala:178)
>>> > >>
>>> > >> This one has been flaky at times due to external dependencies:
>>> > >>
>>> > >> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite ***
>>> ABORTED ***
>>> > >>   Exception encountered when invoking run on a nested suite -
>>> > >> spark-submit returned with exit code 1.
>>> > >>   Command line: './bin/spark-submit' '--name' 'prepare testing
>>> tables'
>>> > >> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
>>> > >> 'spark.master.rest.enabled=false' '--conf'
>>> > >>
>>> 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
>>> > >> '--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
>>> > >>
>>> '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
>>> > >> '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'
>>> > >>
>>> > >> Kafka doesn't build with this weird error. I tried a clean build. I
>>> > >> think we've seen this before.
>>> > >>
>>> > >> [error] This symbol is required by 'method
>>> > >> org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
>>> > >> [error] Make sure that term eclipse is in your classpath and check
>>> for
>>> > >> conflicting dependencies with `-Ylog-classpath`.
>>> > >> [error] A full rebuild may help if 'MetricsSystem.class' was
>>> compiled
>>> > >> against an incompatible version of org.
>>> > >> [error] testUtils.sendMessages(topic, data.toArray)
>>> > >> [error]
>>> > >>
>>> > >> On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun <
>>> dongjoon.h...@gmail.com> wrote:
>>> > >> >
>>> > >> > Please vote on releasing the following candidate as Apache Spark
>>> version 2.4.5.
>>> > >> >
>>> > >> > The vote is open until January 16th 5AM PST and passes if a
>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>> > >> >
>>> > >> > [ ] +1 Release this package as Apache Spark 2.4.5
>>> > >> > [ ] -1 Do not release this package because ...
>>> > >> >
>>> > >> > To learn more about Apache Spark, please see
>>> http://spark.apache.org/
>>> > >> >
>>> > >> > The tag to b

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-15 Thread Hyukjin Kwon
+1

On Wed, 15 Jan 2020, 08:24 Takeshi Yamamuro,  wrote:

> +1;
>
> I checked the links and materials, then ran the tests with
> `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes
> -Psparkr`
> on macOS (Java 8).
> Everything looks fine, and I didn't see the error in my environment
> that Sean mentioned above.
>
> Thanks, Dongjoon!
>
> Bests,
> Takeshi
>
> On Wed, Jan 15, 2020 at 4:09 AM DB Tsai  wrote:
>
>> +1 Thanks.
>>
>> Sincerely,
>>
>> DB Tsai
>> --
>> Web: https://www.dbtsai.com
>> PGP Key ID: 42E5B25A8F7A82C1
>>
>> On Tue, Jan 14, 2020 at 11:08 AM Sean Owen  wrote:
>> >
>> > Yeah it's something about the env I spun up, but I don't know what. It
>> > happens frequently when I test, but not on Jenkins.
>> > The Kafka error comes up every now and then and a clean rebuild fixes
>> > it, but not in my case. I don't know why.
>> > But if nobody else sees it, I'm pretty sure it's just an artifact of
>> > the local VM.
>> >
>> > On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun 
>> wrote:
>> > >
>> > > Thank you, Sean.
>> > >
>> > > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
>> > >
>> > >
>> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
>> > >
>> > > For the failures,
>> > >1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a
>> known one.
>> > >2. For `HDFSMetadataLogSuite` failure, I also observed a few time
>> before in CentOS too.
>> > >3. Kafka build error is new to me. Does it happen on `Maven` clean
>> build?
>> > >
>> > > Bests,
>> > > Dongjoon.
>> > >
>> > >
>> > > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen  wrote:
>> > >>
>> > >> +1 from me. I checked sigs/licenses, and built/tested from source on
>> > >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
>> > >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get
>> > >> test failures, but, these are some I have always seen on Ubuntu, and
>> I
>> > >> do not know why they happen. They don't seem to affect others, but,
>> > >> let me know if anyone else sees these?
>> > >>
>> > >>
>> > >> Always happens for me:
>> > >>
>> > >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
>> > >>   The await method on Waiter timed out.
>> (HDFSMetadataLogSuite.scala:178)
>> > >>
>> > >> This one has been flaky at times due to external dependencies:
>> > >>
>> > >> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite ***
>> ABORTED ***
>> > >>   Exception encountered when invoking run on a nested suite -
>> > >> spark-submit returned with exit code 1.
>> > >>   Command line: './bin/spark-submit' '--name' 'prepare testing
>> tables'
>> > >> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
>> > >> 'spark.master.rest.enabled=false' '--conf'
>> > >>
>> 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
>> > >> '--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
>> > >>
>> '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
>> > >> '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'
>> > >>
>> > >> Kafka doesn't build with this weird error. I tried a clean build. I
>> > >> think we've seen this before.
>> > >>
>> > >> [error] This symbol is required by 'method
>> > >> org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
>> > >> [error] Make sure that term eclipse is in your classpath and check
>> for
>> > >> conflicting dependencies with `-Ylog-classpath`.
>> > >> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
>> > >> against an incompatible version of org.
>> > >> [error] testUtils.sendMessages(topic, data.toArray)
>> > >> [error]
>> > >>
>> > >> On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>> > >> >
>> > >> > Please vote on releasing the following candidate as Apache Spark version 2.4.5.
>> > >> >
>> > >> > The vote is open until January 16th 5AM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>> > >> >
>> > >> > [ ] +1 Release this package as Apache Spark 2.4.5
>> > >> > [ ] -1 Do not release this package because ...
>> > >> >
>> > >> > To learn more about Apache Spark, please see http://spark.apache.org/
>> > >> >
>> > >> > The tag to be voted on is v2.4.5-rc1 (commit 33bd2beee5e3772a9af1d782f195e6a678c54cf0):
>> > >> > https://github.com/apache/spark/tree/v2.4.5-rc1
>> > >> >
>> > >> > The release files, including signatures, digests, etc. can be found at:
>> > >> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-bin/
>> > >> >
>> > >> > Signatures used for Spark RCs can be found in this file:
>> > >> > https://dist.apache.org/repos/dist/dev/spark/KEYS
>> > >> >
>> > >> > The staging repository for this release can be found at:
>> > >> >
>> https://repositor

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-14 Thread Takeshi Yamamuro
+1;

I checked the links and materials, then ran the tests with
`-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes -Psparkr`
on macOS (Java 8).
Everything looks fine, and I didn't see the errors in my environment
that Sean reported above.

Thanks, Dongjoon!

Bests,
Takeshi

On Wed, Jan 15, 2020 at 4:09 AM DB Tsai  wrote:

> +1 Thanks.
>
> Sincerely,
>
> DB Tsai
> --
> Web: https://www.dbtsai.com
> PGP Key ID: 42E5B25A8F7A82C1
>
> On Tue, Jan 14, 2020 at 11:08 AM Sean Owen  wrote:
> >
> > Yeah it's something about the env I spun up, but I don't know what. It
> > happens frequently when I test, but not on Jenkins.
> > The Kafka error comes up every now and then and a clean rebuild fixes
> > it, but not in my case. I don't know why.
> > But if nobody else sees it, I'm pretty sure it's just an artifact of
> > the local VM.
> >
> > On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun wrote:
> > >
> > > Thank you, Sean.
> > >
> > > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
> > >
> > >
> > > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
> > >
> > > For the failures,
> > >    1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a known one.
> > >    2. I have also observed the `HDFSMetadataLogSuite` failure a few times
> > >       before on CentOS.
> > >    3. The Kafka build error is new to me. Does it also happen on a clean
> > >       `Maven` build?
> > >
> > > Bests,
> > > Dongjoon.
> > >
> > >
> > > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen  wrote:
> > >>
> > >> +1 from me. I checked sigs/licenses, and built/tested from source on
> > >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
> > >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get
> > >> test failures, but, these are some I have always seen on Ubuntu, and I
> > >> do not know why they happen. They don't seem to affect others, but,
> > >> let me know if anyone else sees these?
> > >>
> > >>
> > >> Always happens for me:
> > >>
> > >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
> > >>   The await method on Waiter timed out. (HDFSMetadataLogSuite.scala:178)
> > >>
> > >> This one has been flaky at times due to external dependencies:
> > >>
> > >> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***
> > >>   Exception encountered when invoking run on a nested suite -
> > >> spark-submit returned with exit code 1.
> > >>   Command line: './bin/spark-submit' '--name' 'prepare testing tables'
> > >> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
> > >> 'spark.master.rest.enabled=false' '--conf'
> > >> 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
> > >> '--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
> > >> '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
> > >> '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'
> > >>
> > >> Kafka doesn't build with this weird error. I tried a clean build. I
> > >> think we've seen this before.
> > >>
> > >> [error] This symbol is required by 'method
> > >> org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
> > >> [error] Make sure that term eclipse is in your classpath and check for
> > >> conflicting dependencies with `-Ylog-classpath`.
> > >> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
> > >> against an incompatible version of org.
> > >> [error] testUtils.sendMessages(topic, data.toArray)
> > >> [error]
> > >>
> > >> On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
> > >> >
> > >> > Please vote on releasing the following candidate as Apache Spark version 2.4.5.
> > >> >
> > >> > The vote is open until January 16th 5AM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
> > >> >
> > >> > [ ] +1 Release this package as Apache Spark 2.4.5
> > >> > [ ] -1 Do not release this package because ...
> > >> >
> > >> > To learn more about Apache Spark, please see http://spark.apache.org/
> > >> >
> > >> > The tag to be voted on is v2.4.5-rc1 (commit 33bd2beee5e3772a9af1d782f195e6a678c54cf0):
> > >> > https://github.com/apache/spark/tree/v2.4.5-rc1
> > >> >
> > >> > The release files, including signatures, digests, etc. can be found at:
> > >> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-bin/
> > >> >
> > >> > Signatures used for Spark RCs can be found in this file:
> > >> > https://dist.apache.org/repos/dist/dev/spark/KEYS
> > >> >
> > >> > The staging repository for this release can be found at:
> > >> > https://repository.apache.org/content/repositories/orgapachespark-1339/
> > >> >
> > >> > The documentation corresponding to this release can be found at:
> > >> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-docs

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-14 Thread DB Tsai
+1 Thanks.

Sincerely,

DB Tsai
--
Web: https://www.dbtsai.com
PGP Key ID: 42E5B25A8F7A82C1

On Tue, Jan 14, 2020 at 11:08 AM Sean Owen  wrote:
>
> Yeah it's something about the env I spun up, but I don't know what. It
> happens frequently when I test, but not on Jenkins.
> The Kafka error comes up every now and then and a clean rebuild fixes
> it, but not in my case. I don't know why.
> But if nobody else sees it, I'm pretty sure it's just an artifact of
> the local VM.
>
> On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun  
> wrote:
> >
> > Thank you, Sean.
> >
> > First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
> >
> > 
> > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
> >
> > For the failures,
> >    1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a known one.
> >    2. I have also observed the `HDFSMetadataLogSuite` failure a few times
> >       before on CentOS.
> >    3. The Kafka build error is new to me. Does it also happen on a clean
> >       `Maven` build?
> >
> > Bests,
> > Dongjoon.
> >
> >
> > On Tue, Jan 14, 2020 at 6:40 AM Sean Owen  wrote:
> >>
> >> +1 from me. I checked sigs/licenses, and built/tested from source on
> >> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
> >> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get
> >> test failures, but, these are some I have always seen on Ubuntu, and I
> >> do not know why they happen. They don't seem to affect others, but,
> >> let me know if anyone else sees these?
> >>
> >>
> >> Always happens for me:
> >>
> >> - HDFSMetadataLog: metadata directory collision *** FAILED ***
> >>   The await method on Waiter timed out. (HDFSMetadataLogSuite.scala:178)
> >>
> >> This one has been flaky at times due to external dependencies:
> >>
> >> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***
> >>   Exception encountered when invoking run on a nested suite -
> >> spark-submit returned with exit code 1.
> >>   Command line: './bin/spark-submit' '--name' 'prepare testing tables'
> >> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
> >> 'spark.master.rest.enabled=false' '--conf'
> >> 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
> >> '--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
> >> '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
> >> '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'
> >>
> >> Kafka doesn't build with this weird error. I tried a clean build. I
> >> think we've seen this before.
> >>
> >> [error] This symbol is required by 'method
> >> org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
> >> [error] Make sure that term eclipse is in your classpath and check for
> >> conflicting dependencies with `-Ylog-classpath`.
> >> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
> >> against an incompatible version of org.
> >> [error] testUtils.sendMessages(topic, data.toArray)
> >> [error]
> >>
> >> On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun  
> >> wrote:
> >> >
> >> > Please vote on releasing the following candidate as Apache Spark version 
> >> > 2.4.5.
> >> >
> >> > The vote is open until January 16th 5AM PST and passes if a majority +1 
> >> > PMC votes are cast, with a minimum of 3 +1 votes.
> >> >
> >> > [ ] +1 Release this package as Apache Spark 2.4.5
> >> > [ ] -1 Do not release this package because ...
> >> >
> >> > To learn more about Apache Spark, please see http://spark.apache.org/
> >> >
> >> > The tag to be voted on is v2.4.5-rc1 (commit 
> >> > 33bd2beee5e3772a9af1d782f195e6a678c54cf0):
> >> > https://github.com/apache/spark/tree/v2.4.5-rc1
> >> >
> >> > The release files, including signatures, digests, etc. can be found at:
> >> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-bin/
> >> >
> >> > Signatures used for Spark RCs can be found in this file:
> >> > https://dist.apache.org/repos/dist/dev/spark/KEYS
> >> >
> >> > The staging repository for this release can be found at:
> >> > https://repository.apache.org/content/repositories/orgapachespark-1339/
> >> >
> >> > The documentation corresponding to this release can be found at:
> >> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-docs/
> >> >
> >> > The list of bug fixes going into 2.4.5 can be found at the following URL:
> >> > https://issues.apache.org/jira/projects/SPARK/versions/12346042
> >> >
> >> > This release is using the release script of the tag v2.4.5-rc1.
> >> >
> >> > FAQ
> >> >
> >> > =
> >> > How can I help test this release?
> >> > =
> >> >
> >> > If you are a Spark user, you can help us test this release by taking
> >> > an existing Spark workload and running on this release candidate, then
> >> > re

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-14 Thread Sean Owen
Yeah it's something about the env I spun up, but I don't know what. It
happens frequently when I test, but not on Jenkins.
The Kafka error comes up every now and then and a clean rebuild fixes
it, but not in my case. I don't know why.
But if nobody else sees it, I'm pretty sure it's just an artifact of
the local VM.

On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun  wrote:
>
> Thank you, Sean.
>
> First of all, the `Ubuntu` job on Amplab Jenkins farm is green.
>
> 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/
>
> For the failures,
>    1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a known one.
>    2. I have also observed the `HDFSMetadataLogSuite` failure a few times
>       before on CentOS.
>    3. The Kafka build error is new to me. Does it also happen on a clean
>       `Maven` build?
>
> Bests,
> Dongjoon.
>
>
> On Tue, Jan 14, 2020 at 6:40 AM Sean Owen  wrote:
>>
>> +1 from me. I checked sigs/licenses, and built/tested from source on
>> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
>> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get
>> test failures, but, these are some I have always seen on Ubuntu, and I
>> do not know why they happen. They don't seem to affect others, but,
>> let me know if anyone else sees these?
>>
>>
>> Always happens for me:
>>
>> - HDFSMetadataLog: metadata directory collision *** FAILED ***
>>   The await method on Waiter timed out. (HDFSMetadataLogSuite.scala:178)
>>
>> This one has been flaky at times due to external dependencies:
>>
>> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***
>>   Exception encountered when invoking run on a nested suite -
>> spark-submit returned with exit code 1.
>>   Command line: './bin/spark-submit' '--name' 'prepare testing tables'
>> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
>> 'spark.master.rest.enabled=false' '--conf'
>> 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
>> '--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
>> '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
>> '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'
>>
>> Kafka doesn't build with this weird error. I tried a clean build. I
>> think we've seen this before.
>>
>> [error] This symbol is required by 'method
>> org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
>> [error] Make sure that term eclipse is in your classpath and check for
>> conflicting dependencies with `-Ylog-classpath`.
>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
>> against an incompatible version of org.
>> [error] testUtils.sendMessages(topic, data.toArray)
>> [error]
>>
>> On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun  
>> wrote:
>> >
>> > Please vote on releasing the following candidate as Apache Spark version 
>> > 2.4.5.
>> >
>> > The vote is open until January 16th 5AM PST and passes if a majority +1 
>> > PMC votes are cast, with a minimum of 3 +1 votes.
>> >
>> > [ ] +1 Release this package as Apache Spark 2.4.5
>> > [ ] -1 Do not release this package because ...
>> >
>> > To learn more about Apache Spark, please see http://spark.apache.org/
>> >
>> > The tag to be voted on is v2.4.5-rc1 (commit 
>> > 33bd2beee5e3772a9af1d782f195e6a678c54cf0):
>> > https://github.com/apache/spark/tree/v2.4.5-rc1
>> >
>> > The release files, including signatures, digests, etc. can be found at:
>> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-bin/
>> >
>> > Signatures used for Spark RCs can be found in this file:
>> > https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >
>> > The staging repository for this release can be found at:
>> > https://repository.apache.org/content/repositories/orgapachespark-1339/
>> >
>> > The documentation corresponding to this release can be found at:
>> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-docs/
>> >
>> > The list of bug fixes going into 2.4.5 can be found at the following URL:
>> > https://issues.apache.org/jira/projects/SPARK/versions/12346042
>> >
>> > This release is using the release script of the tag v2.4.5-rc1.
>> >
>> > FAQ
>> >
>> > =
>> > How can I help test this release?
>> > =
>> >
>> > If you are a Spark user, you can help us test this release by taking
>> > an existing Spark workload and running on this release candidate, then
>> > reporting any regressions.
>> >
>> > If you're working in PySpark, you can set up a virtual env, install
>> > the current RC, and see if anything important breaks. In Java/Scala,
>> > you can add the staging repository to your project's resolvers and test
>> > with the RC (make sure to clean up the artifact cache before/after so
>> > you don't end up building with an out-of-date RC going forward).
>> >
>> > =

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-14 Thread Dongjoon Hyun
Thank you, Sean.

First of all, the `Ubuntu` job on Amplab Jenkins farm is green.


https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/

For the failures,
   1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a known one.
   2. I have also observed the `HDFSMetadataLogSuite` failure a few times
      before on CentOS.
   3. The Kafka build error is new to me. Does it also happen on a clean
      `Maven` build?

Bests,
Dongjoon.


On Tue, Jan 14, 2020 at 6:40 AM Sean Owen  wrote:

> +1 from me. I checked sigs/licenses, and built/tested from source on
> Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
> -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get
> test failures, but, these are some I have always seen on Ubuntu, and I
> do not know why they happen. They don't seem to affect others, but,
> let me know if anyone else sees these?
>
>
> Always happens for me:
>
> - HDFSMetadataLog: metadata directory collision *** FAILED ***
>   The await method on Waiter timed out. (HDFSMetadataLogSuite.scala:178)
>
> This one has been flaky at times due to external dependencies:
>
> org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***
>   Exception encountered when invoking run on a nested suite -
> spark-submit returned with exit code 1.
>   Command line: './bin/spark-submit' '--name' 'prepare testing tables'
> '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
> 'spark.master.rest.enabled=false' '--conf'
> 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
> '--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
> '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
> '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'
>
> Kafka doesn't build with this weird error. I tried a clean build. I
> think we've seen this before.
>
> [error] This symbol is required by 'method
> org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
> [error] Make sure that term eclipse is in your classpath and check for
> conflicting dependencies with `-Ylog-classpath`.
> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
> against an incompatible version of org.
> [error] testUtils.sendMessages(topic, data.toArray)
> [error]
>
> On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun 
> wrote:
> >
> > Please vote on releasing the following candidate as Apache Spark version
> > 2.4.5.
> >
> > The vote is open until January 16th 5AM PST and passes if a majority +1
> > PMC votes are cast, with a minimum of 3 +1 votes.
> >
> > [ ] +1 Release this package as Apache Spark 2.4.5
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache Spark, please see http://spark.apache.org/
> >
> > The tag to be voted on is v2.4.5-rc1 (commit
> > 33bd2beee5e3772a9af1d782f195e6a678c54cf0):
> > https://github.com/apache/spark/tree/v2.4.5-rc1
> >
> > The release files, including signatures, digests, etc. can be found at:
> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-bin/
> >
> > Signatures used for Spark RCs can be found in this file:
> > https://dist.apache.org/repos/dist/dev/spark/KEYS
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1339/
> >
> > The documentation corresponding to this release can be found at:
> > https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-docs/
> >
> > The list of bug fixes going into 2.4.5 can be found at the following URL:
> > https://issues.apache.org/jira/projects/SPARK/versions/12346042
> >
> > This release is using the release script of the tag v2.4.5-rc1.
> >
> > FAQ
> >
> > =
> > How can I help test this release?
> > =
> >
> > If you are a Spark user, you can help us test this release by taking
> > an existing Spark workload and running on this release candidate, then
> > reporting any regressions.
> >
> > If you're working in PySpark, you can set up a virtual env, install
> > the current RC, and see if anything important breaks. In Java/Scala,
> > you can add the staging repository to your project's resolvers and test
> > with the RC (make sure to clean up the artifact cache before/after so
> > you don't end up building with an out-of-date RC going forward).
> >
> > ===
> > What should happen to JIRA tickets still targeting 2.4.5?
> > ===
> >
> > The current list of open tickets targeted at 2.4.5 can be found at:
> > https://issues.apache.org/jira/projects/SPARK and search for "Target
> > Version/s" = 2.4.5
> >
> > Committers should look at those and triage. Extremely important bug
> > fixes, documentation, and API tweaks that impact compatibility should
> > be worked on immediately. Everything else pl

Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-14 Thread Sean Owen
+1 from me. I checked sigs/licenses, and built/tested from source on
Java 8 + Ubuntu 18.04 with " -Pyarn -Phive -Phive-thriftserver
-Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get
test failures, but, these are some I have always seen on Ubuntu, and I
do not know why they happen. They don't seem to affect others, but,
let me know if anyone else sees these?


Always happens for me:

- HDFSMetadataLog: metadata directory collision *** FAILED ***
  The await method on Waiter timed out. (HDFSMetadataLogSuite.scala:178)

This one has been flaky at times due to external dependencies:

org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***
  Exception encountered when invoking run on a nested suite -
spark-submit returned with exit code 1.
  Command line: './bin/spark-submit' '--name' 'prepare testing tables'
'--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf'
'spark.master.rest.enabled=false' '--conf'
'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
'--conf' 'spark.sql.test.version.index=0' '--driver-java-options'
'-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98'
'/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'

Kafka doesn't build with this weird error. I tried a clean build. I
think we've seen this before.

[error] This symbol is required by 'method
org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
[error] Make sure that term eclipse is in your classpath and check for
conflicting dependencies with `-Ylog-classpath`.
[error] A full rebuild may help if 'MetricsSystem.class' was compiled
against an incompatible version of org.
[error] testUtils.sendMessages(topic, data.toArray)
[error]
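
For anyone who wants to compare their own run against the failures listed above, here is a minimal Python sketch that pulls the `*** FAILED ***` and `*** ABORTED ***` marker lines out of a saved test log. The function name `summarize_failures` and the regular expression are illustrative assumptions based only on the two marker lines shown in this mail; they are not part of Spark or ScalaTest tooling.

```python
import re

# Matches marker lines shaped like the two quoted in this thread, e.g.
#   "- HDFSMetadataLog: metadata directory collision *** FAILED ***"
#   "org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***"
# The pattern is an assumption derived from those examples only.
STATUS_LINE = re.compile(
    r"^(?:- )?(?P<name>.+?) \*\*\* (?P<status>FAILED|ABORTED) \*\*\*$"
)


def summarize_failures(log_text):
    """Return (name, status) pairs for every FAILED/ABORTED marker line."""
    results = []
    for line in log_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            results.append((match.group("name"), match.group("status")))
    return results
```

Feeding it the text of a local test log gives a short list that is easier to paste into a reply on this thread than the full output.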

On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun  wrote:
>
> Please vote on releasing the following candidate as Apache Spark version 
> 2.4.5.
>
> The vote is open until January 16th 5AM PST and passes if a majority +1 PMC 
> votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 2.4.5
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.4.5-rc1 (commit 
> 33bd2beee5e3772a9af1d782f195e6a678c54cf0):
> https://github.com/apache/spark/tree/v2.4.5-rc1
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1339/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-docs/
>
> The list of bug fixes going into 2.4.5 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12346042
>
> This release is using the release script of the tag v2.4.5-rc1.
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 2.4.5?
> ===
>
> The current list of open tickets targeted at 2.4.5 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target 
> Version/s" = 2.4.5
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.




Re: [VOTE] Release Apache Spark 2.4.5 (RC1)

2020-01-13 Thread Dongjoon Hyun
+1.

I verified with GPG and tested RC1 with the following.

  - Profile: -Pyarn -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
  - Java: OpenJDK 1.8.0_232
  - OS: CentOS (7.5.1804)
  - All Scala/Java UTs and JDBC IT passed.
  - Test with Amazon EKS
    Client Version: v1.17.0
    Server Version: v1.14.9-eks-c0eccc

Bests,
Dongjoon.


On Mon, Jan 13, 2020 at 4:27 AM Dongjoon Hyun 
wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.4.5.
>
> The vote is open until January 16th 5AM PST and passes if a majority +1
> PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 2.4.5
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.4.5-rc1 (commit
> 33bd2beee5e3772a9af1d782f195e6a678c54cf0):
> https://github.com/apache/spark/tree/v2.4.5-rc1
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1339/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-docs/
>
> The list of bug fixes going into 2.4.5 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12346042
>
> This release is using the release script of the tag v2.4.5-rc1.
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 2.4.5?
> ===
>
> The current list of open tickets targeted at 2.4.5 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 2.4.5
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
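
The pass condition quoted above (a majority of +1 PMC votes, with a minimum of 3 +1 votes) can be sketched as a small Python check. This is one reading of that sentence, not an official tool: the function name `vote_passes` and its parameters are hypothetical, and it assumes that only binding PMC votes are counted and that "majority" means more +1 than -1 votes.

```python
def vote_passes(binding_plus_ones, binding_minus_ones, minimum_plus_ones=3):
    """Return True if a release vote passes under the assumed rule.

    Assumption: only binding (PMC) votes are passed in, and "majority"
    means strictly more +1 votes than -1 votes.
    """
    has_minimum = binding_plus_ones >= minimum_plus_ones
    has_majority = binding_plus_ones > binding_minus_ones
    return has_minimum and has_majority
```

For example, `vote_passes(4, 1)` is true under these assumptions, while `vote_passes(2, 0)` is false because the minimum of three +1 votes is not met. Note that a release manager may still cancel an RC that would pass on the numbers, as happened with this one.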