I'm sorry to post a -1 on this, but there is a non-trivial correctness
issue that I believe we should fix in 2.3.

TL;DR of the issue: a certain pattern of shuffle+repartition in a query
may produce wrong results if a downstream stage fails and triggers a retry
of the repartition. The cause is that the current implementation of
`repartition()` doesn't generate deterministic output. The JIRA task:
https://issues.apache.org/jira/browse/SPARK-23207
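
To make the failure mode concrete, here is a toy model in plain Python. It
is not Spark code: the function name, the row values, and the round-robin
assignment are assumptions made for illustration, and the real code path is
more involved. It only shows how an order-dependent repartition plus a
partial retry can drop and duplicate rows.

```python
def round_robin_repartition(rows, num_partitions):
    """Hypothetical helper: assign rows to partitions in arrival order."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


# The same four rows, but the upstream shuffle delivers them in a
# different order when the repartition is re-run.
first_attempt = ["a", "b", "c", "d"]
retry_attempt = ["b", "a", "d", "c"]

first = round_robin_repartition(first_attempt, 2)
retry = round_robin_repartition(retry_attempt, 2)

# Downstream task 0 consumed partition 0 of the first attempt. Task 1
# failed and is re-run against partition 1 of the retried output.
combined = first[0] + retry[1]
print(sorted(combined))  # ['a', 'a', 'c', 'c']: "b" and "d" are lost

# Fixing the row order before assignment makes the output independent
# of arrival order, so a partial retry sees consistent partitions.
stable_first = round_robin_repartition(sorted(first_attempt), 2)
stable_retry = round_robin_repartition(sorted(retry_attempt), 2)
stable_combined = stable_first[0] + stable_retry[1]
print(sorted(stable_combined))  # ['a', 'b', 'c', 'd']
```

Sorting is just one way to get a deterministic assignment in this toy; I'm
not claiming it is what the eventual patch does.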

This is NOT a regression, but since it's a non-trivial correctness issue,
we'd better ship the patch along with 2.3.

2018-01-24 11:42 GMT-08:00 Marcelo Vanzin <van...@cloudera.com>:

> Given that the bugs I was worried about have been dealt with, I'm
> upgrading to +1.
>
> On Mon, Jan 22, 2018 at 5:09 PM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
> > +0
> >
> > Signatures check out. Code compiles, although I see the errors in [1]
> > when untarring the source archive; perhaps we should add "use GNU tar"
> > to the RM checklist?
> >
> > Also ran our internal tests and they seem happy.
> >
> > My concern is the list of open bugs targeted at 2.3.0 (ignoring the
> > documentation ones). It is not long, but it seems some of those need
> > to be looked at. It would be nice for the committers who are involved
> > in those bugs to take a look.
> >
> > [1] https://superuser.com/questions/318809/linux-os-x-tar-incompatibility-tarballs-created-on-os-x-give-errors-when-unt
> >
> >
> > On Mon, Jan 22, 2018 at 1:36 PM, Sameer Agarwal <samee...@apache.org> wrote:
> >> Please vote on releasing the following candidate as Apache Spark version
> >> 2.3.0. The vote is open until Friday January 26, 2018 at 8:00:00 am UTC and
> >> passes if a majority of at least 3 PMC +1 votes are cast.
> >>
> >>
> >> [ ] +1 Release this package as Apache Spark 2.3.0
> >>
> >> [ ] -1 Do not release this package because ...
> >>
> >>
> >> To learn more about Apache Spark, please see https://spark.apache.org/
> >>
> >> The tag to be voted on is v2.3.0-rc2:
> >> https://github.com/apache/spark/tree/v2.3.0-rc2
> >> (489ecb0ef23e5d9b705e5e5bae4fa3d871bdac91)
> >>
> >> List of JIRA tickets resolved in this release can be found here:
> >> https://issues.apache.org/jira/projects/SPARK/versions/12339551
> >>
> >> The release files, including signatures, digests, etc. can be found at:
> >> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc2-bin/
> >>
> >> Release artifacts are signed with the following key:
> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
> >>
> >> The staging repository for this release can be found at:
> >> https://repository.apache.org/content/repositories/orgapachespark-1262/
> >>
> >> The documentation corresponding to this release can be found at:
> >> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc2-docs/_site/index.html
> >>
> >>
> >> FAQ
> >>
> >> =======================================
> >> What are the unresolved issues targeted for 2.3.0?
> >> =======================================
> >>
> >> Please see https://s.apache.org/oXKi. At the time of writing, there are
> >> currently no known release blockers.
> >>
> >> =========================
> >> How can I help test this release?
> >> =========================
> >>
> >> If you are a Spark user, you can help us test this release by taking an
> >> existing Spark workload and running on this release candidate, then
> >> reporting any regressions.
> >>
> >> If you're working in PySpark, you can set up a virtual env, install the
> >> current RC, and see if anything important breaks. In Java/Scala, you can
> >> add the staging repository to your project's resolvers and test with the
> >> RC (make sure to clean up the artifact cache before/after so you don't end
> >> up building with an out-of-date RC going forward).
> >>
> >> ===========================================
> >> What should happen to JIRA tickets still targeting 2.3.0?
> >> ===========================================
> >>
> >> Committers should look at those and triage. Extremely important bug fixes,
> >> documentation, and API tweaks that impact compatibility should be worked on
> >> immediately. Everything else please retarget to 2.3.1 or 2.4.0 as
> >> appropriate.
> >>
> >> ===================
> >> Why is my bug not fixed?
> >> ===================
> >>
> >> In order to make timely releases, we will typically not hold the release
> >> unless the bug in question is a regression from 2.2.0. That being said, if
> >> there is something which is a regression from 2.2.0 and has not been
> >> correctly targeted please ping me or a committer to help target the issue
> >> (you can see the open issues listed as impacting Spark 2.3.0 at
> >> https://s.apache.org/WmoI).
> >>
> >>
> >> Regards,
> >> Sameer
> >
> >
> >
> > --
> > Marcelo
>
>
>
> --
> Marcelo