Re: [VOTE] Spark 2.3.0 (RC3)
In addition to the issues mentioned above, Wenchen and Xiao have flagged two other regressions (https://issues.apache.org/jira/browse/SPARK-23316 and https://issues.apache.org/jira/browse/SPARK-23388) that were merged after RC3 was cut. Due to these, this vote fails. I'll follow up with an RC4 in a day (this will probably also give us enough time to resolve https://issues.apache.org/jira/browse/SPARK-23381 and https://issues.apache.org/jira/browse/SPARK-23410).
Re: [VOTE] Spark 2.3.0 (RC3)
I agree that this is not a blocker for RC3, and it was not appropriate to raise it as a vote against RC3. There is no problem as long as the fix makes it into release 2.3.0.
Re: [VOTE] Spark 2.3.0 (RC3)
I agree that SPARK-23413 should be considered a blocker. It isn't unreasonable to run a history server that is used for several versions of Spark.

--
Ryan Blue
Software Engineer
Netflix
Re: [VOTE] Spark 2.3.0 (RC3)
SPARK-23381 is probably not a blocker IMHO; it's a nice-to-have to make some returned values match an external implementation, for code that hasn't been published yet.

However, I think it's OK to add it to the 2.3.0 release if there's going to be another RC.
Re: [VOTE] Spark 2.3.0 (RC3)
Since it seems there are other issues to fix, I raised SPARK-23413 to blocker status to avoid having to change the disk format of history data in a minor release.

--
Marcelo
Re: [VOTE] Spark 2.3.0 (RC3)
-1 for me, as we elevated https://issues.apache.org/jira/browse/SPARK-23377 to a Blocker. It should be fixed before release.
Re: [VOTE] Spark 2.3.0 (RC3)
If this is a blocker in your view, then the vote thread is an important place to mention it. I'm not entirely sure of all the places these methods are used, so I'll defer to srowen and folks, but for the ML-related implications, in the past we've allowed people to set the hashing function when we've introduced changes.
Re: [VOTE] Spark 2.3.0 (RC3)
In the discussion on GitHub, I was advised to post here. I'm not sure what to do about the problem of the discussion being split across two places.
Re: [VOTE] Spark 2.3.0 (RC3)
So it's currently tagged as minor and under consideration for 2.4.0. Do you think this priority is incorrect? This doesn't seem like a regression or a correctness issue, so normally we wouldn't hold the release. Of course you're free to vote how you choose; I'm just providing some additional context around how we tend to do releases.
Re: [VOTE] Spark 2.3.0 (RC3)
I'm -1 because of this issue. I want to fix the hashing implementation in FeatureHasher before FeatureHasher is released in 2.3.0.

https://issues.apache.org/jira/browse/SPARK-23381
https://github.com/apache/spark/pull/20568

I will fix it soon.
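For readers following along: FeatureHasher is built on the "hashing trick", where each feature is hashed straight to a fixed-size vector index so no vocabulary needs to be stored. The sketch below is illustrative only, not Spark's code: Spark hashes with MurmurHash3 (whose exact behavior is what SPARK-23381 concerns), while md5 is used here purely so the example is self-contained; the default of 262144 slots mirrors FeatureHasher's default numFeatures of 2^18.

```python
import hashlib

def feature_index(feature, num_features=262144):
    """Map a feature name (or name=value for categoricals) to a vector slot."""
    # md5 stands in for MurmurHash3 to keep this sketch dependency-free.
    digest = hashlib.md5(feature.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_features

def hash_features(features, num_features=262144):
    """Accumulate features into a sparse {index: count} representation.

    Distinct features can collide on the same index; that collision is the
    accepted trade-off of the hashing trick.
    """
    vec = {}
    for name in features:
        idx = feature_index(name, num_features)
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec
```

Because the index is a pure function of the hash, changing the hash function (as the PR above does) changes every produced vector, which is why the change needed to land before FeatureHasher first shipped.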
Re: [VOTE] Spark 2.3.0 (RC3)
The issue with SPARK-23292 is that we currently run the Python tests related to pandas and pyarrow with Python 3 (which is already installed on all amplab Jenkins machines). Since the code path is fully tested, we decided not to mark it as a blocker; I've reworded the title to better indicate that.

--
Sameer Agarwal
Computer Science | UC Berkeley
http://cs.berkeley.edu/~sameerag
Re: [VOTE] Spark 2.3.0 (RC3)
+1 from me. Again, licenses and sigs look fine. I built the source distribution with "-Phive -Phadoop-2.7 -Pyarn -Pkubernetes" and all tests passed.

Remaining issues for 2.3.0, none of which are a Blocker:

SPARK-22797 Add multiple column support to PySpark Bucketizer
SPARK-23083 Adding Kubernetes as an option to https://spark.apache.org/
SPARK-23292 python tests related to pandas are skipped
SPARK-23309 Spark 2.3 cached query performance 20-30% worse then spark 2.2
SPARK-23316 AnalysisException after max iteration reached for IN query

... though the pandas tests issue is "Critical".

(SPARK-23083 is an update to the main site that should happen as the artifacts are released, so it's OK.)
Re: [VOTE] Spark 2.3.0 (RC3)
I'll start the vote with a +1. As of today, all known release blockers and QA tasks have been resolved, and the Jenkins builds are healthy: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/
[VOTE] Spark 2.3.0 (RC3)
Now that all known blockers have once again been resolved, please vote on releasing the following candidate as Apache Spark version 2.3.0. The vote is open until Friday February 16, 2018 at 8:00:00 am UTC and passes if a majority of at least 3 PMC +1 votes are cast.

[ ] +1 Release this package as Apache Spark 2.3.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see https://spark.apache.org/

The tag to be voted on is v2.3.0-rc3: https://github.com/apache/spark/tree/v2.3.0-rc3 (89f6fcbafcfb0a7aeb897fba6036cb085bd35121)

List of JIRA tickets resolved in this release can be found here: https://issues.apache.org/jira/projects/SPARK/versions/12339551

The release files, including signatures, digests, etc. can be found at: https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc3-bin/

Release artifacts are signed with the following key: https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1264/

The documentation corresponding to this release can be found at: https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc3-docs/_site/index.html

FAQ

===
What are the unresolved issues targeted for 2.3.0?
===

Please see https://s.apache.org/oXKi. At the time of writing, there are currently no known release blockers.

===
How can I help test this release?
===

If you are a Spark user, you can help us test this release by taking an existing Spark workload and running it on this release candidate, then reporting any regressions.

If you're working in PySpark, you can set up a virtual env and install the current RC to see if anything important breaks; in Java/Scala, you can add the staging repository to your project's resolvers and test with the RC (make sure to clean up the artifact cache before/after so you don't end up building with an out-of-date RC going forward).

===
What should happen to JIRA tickets still targeting 2.3.0?
===

Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else, please retarget to 2.3.1 or 2.4.0 as appropriate.

===
Why is my bug not fixed?
===

In order to make timely releases, we will typically not hold the release unless the bug in question is a regression from 2.2.0. That being said, if there is something which is a regression from 2.2.0 and has not been correctly targeted, please ping me or a committer to help target the issue (you can see the open issues listed as impacting Spark 2.3.0 at https://s.apache.org/WmoI).

Regards,
Sameer
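Since the release files above ship with digests alongside the artifacts, testers can verify a downloaded tarball before running workloads against it. A minimal sketch of that check, assuming a SHA-512 digest file in the common "HEX  filename" format (the exact digest formats published for this RC are not specified here):

```python
import hashlib

def sha512_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-512 so large tarballs need not fit in memory."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published_digest(path, published_line):
    """Compare a local file against a published digest line ("HEX  filename")."""
    # Take the first whitespace-separated token and normalize its case.
    return sha512_of(path) == published_line.split()[0].lower()
```

Signature (.asc) verification with the KEYS file linked above is a separate step done with gpg; this sketch covers only the checksum half.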