-1 Let us include the correctness fix: https://github.com/apache/spark/pull/27229
Thanks,
Xiao

On Thu, Jan 16, 2020 at 8:46 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

Thank you, Jungtaek!

Bests,
Dongjoon.

On Wed, Jan 15, 2020 at 8:57 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

If we end up cancelling RC1, what about including SPARK-29450 (https://github.com/apache/spark/pull/27209) in RC2?

SPARK-29450 was merged into master, and Xiao found that it fixes a long-standing regression (broken since 2.3.0). The link points to the PR for the 2.4 branch.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Thu, Jan 16, 2020 at 12:56 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

Sure, Wenchen and Hyukjin.

I observed all of the issues reported above and have been waiting to collect more information before cancelling the RC1 vote.

I've also seen that Marcelo and Sean requested reverting an existing commit:
- https://github.com/apache/spark/pull/24732 (spark.shuffle.io.backLog change)

To all: we want your explicit feedback. Please reply on this thread.

Even though we have enough positive feedback here, I'll cancel this RC1. I want to address at least the negative feedback above and roll RC2 next Monday.

Bests,
Dongjoon.

On Wed, Jan 15, 2020 at 7:47 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:

If we go for RC2, we should include both:

https://github.com/apache/spark/pull/27210
https://github.com/apache/spark/pull/27184

just for the sake of completeness and to keep maintenance simple.
>>>> >>>> >>>> 2020년 1월 16일 (목) 오후 12:38, Wenchen Fan <cloud0...@gmail.com>님이 작성: >>>> >>>>> Recently we merged several fixes to 2.4: >>>>> https://issues.apache.org/jira/browse/SPARK-30325 a driver hang >>>>> issue >>>>> https://issues.apache.org/jira/browse/SPARK-30246 a memory leak >>>>> issue >>>>> https://issues.apache.org/jira/browse/SPARK-29708 a correctness >>>>> issue(for a rarely used feature, so not merged to 2.4 yet) >>>>> >>>>> Shall we include them? >>>>> >>>>> >>>>> On Wed, Jan 15, 2020 at 9:51 PM Hyukjin Kwon <gurwls...@gmail.com> >>>>> wrote: >>>>> >>>>>> +1 >>>>>> >>>>>> On Wed, 15 Jan 2020, 08:24 Takeshi Yamamuro, <linguin....@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> +1; >>>>>>> >>>>>>> I checked the links and materials, then I run the tests with >>>>>>> `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes >>>>>>> -Psparkr` >>>>>>> on macOS (Java 8). >>>>>>> All the things look fine and I didn't see the error on my env >>>>>>> that Sean said above. >>>>>>> >>>>>>> Thanks, Dongjoon! >>>>>>> >>>>>>> Bests, >>>>>>> Takeshi >>>>>>> >>>>>>> On Wed, Jan 15, 2020 at 4:09 AM DB Tsai <dbt...@dbtsai.com> wrote: >>>>>>> >>>>>>>> +1 Thanks. >>>>>>>> >>>>>>>> Sincerely, >>>>>>>> >>>>>>>> DB Tsai >>>>>>>> ---------------------------------------------------------- >>>>>>>> Web: https://www.dbtsai.com >>>>>>>> PGP Key ID: 42E5B25A8F7A82C1 >>>>>>>> >>>>>>>> On Tue, Jan 14, 2020 at 11:08 AM Sean Owen <sro...@apache.org> >>>>>>>> wrote: >>>>>>>> > >>>>>>>> > Yeah it's something about the env I spun up, but I don't know >>>>>>>> what. It >>>>>>>> > happens frequently when I test, but not on Jenkins. >>>>>>>> > The Kafka error comes up every now and then and a clean rebuild >>>>>>>> fixes >>>>>>>> > it, but not in my case. I don't know why. >>>>>>>> > But if nobody else sees it, I'm pretty sure it's just an artifact >>>>>>>> of >>>>>>>> > the local VM. 
On Tue, Jan 14, 2020 at 12:57 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

Thank you, Sean.

First of all, the `Ubuntu` job on the Amplab Jenkins farm is green:

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7-ubuntu-testing/

For the failures:
1. Yes, the `HiveExternalCatalogVersionsSuite` flakiness is a known one.
2. I have also observed the `HDFSMetadataLogSuite` failure a few times before, on CentOS too.
3. The Kafka build error is new to me. Does it happen on a clean `Maven` build?

Bests,
Dongjoon.

On Tue, Jan 14, 2020 at 6:40 AM Sean Owen <sro...@apache.org> wrote:

+1 from me. I checked sigs/licenses, and built/tested from source on Java 8 + Ubuntu 18.04 with "-Pyarn -Phive -Phive-thriftserver -Phadoop-2.7 -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl". I do get test failures, but these are ones I have always seen on Ubuntu, and I do not know why they happen. They don't seem to affect others, but let me know if anyone else sees them.

Always happens for me:

- HDFSMetadataLog: metadata directory collision *** FAILED ***
  The await method on Waiter timed out. (HDFSMetadataLogSuite.scala:178)

This one has been flaky at times due to external dependencies:

org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***
  Exception encountered when invoking run on a nested suite - spark-submit returned with exit code 1.
  Command line: './bin/spark-submit' '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 'spark.sql.warehouse.dir=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98' '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' '-Dderby.system.home=/data/spark-2.4.5/sql/hive/target/tmp/warehouse-c2f762fd-688e-42b7-a822-06823a6bbd98' '/data/spark-2.4.5/sql/hive/target/tmp/test7297526474581770293.py'

Kafka doesn't build, with this weird error. I tried a clean build, and I think we've seen this before:

[error] This symbol is required by 'method org.apache.spark.metrics.MetricsSystem.getServletHandlers'.
[error] Make sure that term eclipse is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
[error] A full rebuild may help if 'MetricsSystem.class' was compiled against an incompatible version of org.
[error] testUtils.sendMessages(topic, data.toArray)
[error]

On Mon, Jan 13, 2020 at 6:28 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

Please vote on releasing the following candidate as Apache Spark version 2.4.5.
The vote is open until January 16th 5AM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 2.4.5
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v2.4.5-rc1 (commit 33bd2beee5e3772a9af1d782f195e6a678c54cf0):
https://github.com/apache/spark/tree/v2.4.5-rc1

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1339/

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v2.4.5-rc1-docs/

The list of bug fixes going into 2.4.5 can be found at the following URL:
https://issues.apache.org/jira/projects/SPARK/versions/12346042

This release is using the release script of the tag v2.4.5-rc1.

FAQ

=========================
How can I help test this release?
=========================

If you are a Spark user, you can help us test this release by taking an existing Spark workload, running it on this release candidate, and reporting any regressions.

If you're working in PySpark, you can set up a virtual env and install the current RC to see if anything important breaks. In Java/Scala, you can add the staging repository to your project's resolvers and test with the RC (make sure to clean up the artifact cache before and after so you don't end up building with an out-of-date RC going forward).

===========================================
What should happen to JIRA tickets still targeting 2.4.5?
===========================================

The current list of open tickets targeted at 2.4.5 can be found at:
https://issues.apache.org/jira/projects/SPARK (search for "Target Version/s" = 2.4.5)

Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else, please retarget to an appropriate release.

==================
But my bug isn't fixed?
==================

In order to make timely releases, we will typically not hold the release unless the bug in question is a regression from the previous release.
That being said, if there is something which is a regression that has not been correctly targeted, please ping me or a committer to help target the issue.

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
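For readers who want to reproduce the source build-and-test runs Sean and Takeshi describe in the thread, a minimal invocation might look like the sketch below. This is an assumption-laden sketch, not an official recipe: it presumes a checkout of the v2.4.5-rc1 tag, and the profile list is copied from Sean's message; the MAVEN_OPTS values are typical suggestions for building Spark, not taken from this thread.

```shell
# Sketch: build and run tests on the RC source with the profiles used above.
# Assumes the current directory is a checkout of the v2.4.5-rc1 tag.
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"  # heap sizes are an assumption

./build/mvn -Pyarn -Phive -Phive-thriftserver -Phadoop-2.7 \
    -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl \
    -DskipTests clean package   # compile first

./build/mvn -Pyarn -Phive -Phive-thriftserver -Phadoop-2.7 \
    -Pmesos -Pkubernetes -Psparkr -Pkinesis-asl \
    test                        # then run the test suites
```

This is a command-line fragment that requires a full Spark source tree and takes hours to run, so it is not directly verifiable here.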
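The vote email links the release binaries together with their signatures and digests, and checking those is part of the verification several voters mention ("I checked sigs/licenses"). A self-contained sketch of the SHA-512 digest check follows; `artifact.tgz` is a stand-in name, since the real artifact names under v2.4.5-rc1-bin/ are not reproduced here, and the shipped `.sha512` files may use a different layout (e.g. `gpg --print-md` output) than plain `sha512sum` format.

```shell
# Sketch of the SHA-512 digest check voters perform on the RC artifacts.
# 'artifact.tgz' is a hypothetical stand-in for a real release file.
printf 'stand-in release artifact' > artifact.tgz
sha512sum artifact.tgz > artifact.tgz.sha512   # stands in for the published digest file
sha512sum -c artifact.tgz.sha512               # prints "artifact.tgz: OK" on success

# The GPG signature check against the KEYS file linked above would look like:
#   gpg --import KEYS
#   gpg --verify artifact.tgz.asc artifact.tgz
```

In a real verification you would download the `.tgz`, `.asc`, and `.sha512` files from the v2.4.5-rc1-bin/ directory rather than generating the digest locally.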