I'm sorry to post -1 on this, since there is a non-trivial correctness issue that I believe we should fix in 2.3.

TL;DR of the issue: a certain pattern of shuffle + repartition in a query may produce wrong results if some downstream stages fail and trigger a retry of the repartition stage. The cause of the bug is that the current implementation of `repartition()` doesn't generate deterministic output, so a retried task may distribute rows differently from the original attempt.

The JIRA task: https://issues.apache.org/jira/browse/SPARK-23207

This is NOT a regression, but since it's a non-trivial correctness issue, we'd better ship the patch along with 2.3.

2018-01-24 11:42 GMT-08:00 Marcelo Vanzin <van...@cloudera.com>:

> Given that the bugs I was worried about have been dealt with, I'm
> upgrading to +1.
>
> On Mon, Jan 22, 2018 at 5:09 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
> > +0
> >
> > Signatures check out. Code compiles, although I see the errors in [1]
> > when untarring the source archive; perhaps we should add "use GNU tar"
> > to the RM checklist?
> >
> > Also ran our internal tests and they seem happy.
> >
> > My concern is the list of open bugs targeted at 2.3.0 (ignoring the
> > documentation ones). It is not long, but it seems some of those need
> > to be looked at. It would be nice for the committers who are involved
> > in those bugs to take a look.
> >
> > [1] https://superuser.com/questions/318809/linux-os-x-tar-incompatibility-tarballs-created-on-os-x-give-errors-when-unt
> >
> >
> > On Mon, Jan 22, 2018 at 1:36 PM, Sameer Agarwal <samee...@apache.org> wrote:
> >> Please vote on releasing the following candidate as Apache Spark version
> >> 2.3.0. The vote is open until Friday January 26, 2018 at 8:00:00 am UTC and
> >> passes if a majority of at least 3 PMC +1 votes are cast.
> >>
> >>
> >> [ ] +1 Release this package as Apache Spark 2.3.0
> >>
> >> [ ] -1 Do not release this package because ...
> >>
> >>
> >> To learn more about Apache Spark, please see https://spark.apache.org/
> >>
> >> The tag to be voted on is v2.3.0-rc2:
> >> https://github.com/apache/spark/tree/v2.3.0-rc2
> >> (489ecb0ef23e5d9b705e5e5bae4fa3d871bdac91)
> >>
> >> List of JIRA tickets resolved in this release can be found here:
> >> https://issues.apache.org/jira/projects/SPARK/versions/12339551
> >>
> >> The release files, including signatures, digests, etc. can be found at:
> >> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc2-bin/
> >>
> >> Release artifacts are signed with the following key:
> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
> >>
> >> The staging repository for this release can be found at:
> >> https://repository.apache.org/content/repositories/orgapachespark-1262/
> >>
> >> The documentation corresponding to this release can be found at:
> >> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc2-docs/_site/index.html
> >>
> >>
> >> FAQ
> >>
> >> =======================================
> >> What are the unresolved issues targeted for 2.3.0?
> >> =======================================
> >>
> >> Please see https://s.apache.org/oXKi. At the time of writing, there are
> >> currently no known release blockers.
> >>
> >> =========================
> >> How can I help test this release?
> >> =========================
> >>
> >> If you are a Spark user, you can help us test this release by taking an
> >> existing Spark workload and running on this release candidate, then
> >> reporting any regressions.
> >>
> >> If you're working in PySpark you can set up a virtual env and install the
> >> current RC and see if anything important breaks, in the Java/Scala you can
> >> add the staging repository to your projects resolvers and test with the RC
> >> (make sure to clean up the artifact cache before/after so you don't end up
> >> building with a out of date RC going forward).
> >>
> >> ===========================================
> >> What should happen to JIRA tickets still targeting 2.3.0?
> >> ===========================================
> >>
> >> Committers should look at those and triage. Extremely important bug fixes,
> >> documentation, and API tweaks that impact compatibility should be worked on
> >> immediately. Everything else please retarget to 2.3.1 or 2.3.0 as
> >> appropriate.
> >>
> >> ===================
> >> Why is my bug not fixed?
> >> ===================
> >>
> >> In order to make timely releases, we will typically not hold the release
> >> unless the bug in question is a regression from 2.2.0. That being said, if
> >> there is something which is a regression from 2.2.0 and has not been
> >> correctly targeted please ping me or a committer to help target the issue
> >> (you can see the open issues listed as impacting Spark 2.3.0 at
> >> https://s.apache.org/WmoI).
> >>
> >>
> >> Regards,
> >> Sameer
> >
> >
> > --
> > Marcelo
>
>
> --
> Marcelo
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
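
P.S. For anyone who wants to see the failure mode from the TL;DR in isolation, here is a minimal sketch in plain Python (no Spark involved). It only models the idea: rows are dealt round-robin to output partitions, the upstream order is not stable across attempts, and one downstream partition is read from the first attempt while the other is read from a retry. The function names (`round_robin`, `simulate_partial_retry`) and the two-partition setup are assumptions made up for illustration; they are not Spark APIs and this is not how the scheduler is implemented.

```python
def round_robin(rows, num_partitions):
    """Deal rows to partitions by position: row i goes to partition i % n."""
    parts = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        parts[i % num_partitions].append(row)
    return parts


def simulate_partial_retry(first_order, retry_order, num_partitions=2):
    """Hypothetical model of a partial retry.

    Partition 0 was already consumed from the first attempt; the remaining
    partitions are consumed from the retried attempt, which saw the same
    rows in a different order.
    """
    first = round_robin(first_order, num_partitions)
    retry = round_robin(retry_order, num_partitions)
    consumed = list(first[0])
    for part in retry[1:]:
        consumed.extend(part)
    return consumed


# Same four rows, but the upstream order differs between the two attempts.
first_order = ["a", "b", "c", "d"]
retry_order = ["b", "a", "d", "c"]

broken = simulate_partial_retry(first_order, retry_order)
print(broken)  # ['a', 'c', 'a', 'c'] -> "a" and "c" duplicated, "b" and "d" lost

# Imposing a fixed order before dealing makes both attempts agree.
# (Sorting is just one way to get determinism in this toy model.)
stable = simulate_partial_retry(sorted(first_order), sorted(retry_order))
print(stable)  # ['a', 'c', 'b', 'd'] -> every row exactly once
```

The row count stays the same in the broken case, which is why this kind of bug can go unnoticed: nothing fails, the result is just wrong.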