Hi all,

+1 for the tickets proposed by Ryan Blue.
Is there any chance of https://issues.apache.org/jira/browse/SPARK-23406 also getting into 2.3.0? It is a very important feature for us; if it doesn't make the cut, I will have to cherry-pick the commit and build from source for our production release.

Thanks!

On Wed, Feb 21, 2018 at 9:01 AM, Ryan Blue <rb...@netflix.com.invalid> wrote:

> What does everyone think about getting some of the newer DataSourceV2
> improvements in? It should be low risk because it is a new code path, and
> v2 isn't very usable without things like support for using the output
> commit coordinator to deconflict writes.
>
> The ones I'd like to get in are:
> * Use the output commit coordinator:
> https://issues.apache.org/jira/browse/SPARK-23323
> * Use immutable trees and the same push-down logic as other read paths:
> https://issues.apache.org/jira/browse/SPARK-23203
> * Don't allow users to supply schemas when they aren't supported:
> https://issues.apache.org/jira/browse/SPARK-23418
>
> I think it would make the 2.3.0 release more usable for anyone interested
> in the v2 read and write paths.
>
> Thanks!
>
> On Tue, Feb 20, 2018 at 7:07 PM, Weichen Xu <weichen...@databricks.com>
> wrote:
>
>> +1
>>
>> On Wed, Feb 21, 2018 at 10:07 AM, Marcelo Vanzin <van...@cloudera.com>
>> wrote:
>>
>>> Done, thanks!
>>>
>>> On Tue, Feb 20, 2018 at 6:05 PM, Sameer Agarwal <samee...@apache.org>
>>> wrote:
>>> > Sure, please feel free to backport.
>>> >
>>> > On 20 February 2018 at 18:02, Marcelo Vanzin <van...@cloudera.com> wrote:
>>> >>
>>> >> Hey Sameer,
>>> >>
>>> >> Mind including https://github.com/apache/spark/pull/20643
>>> >> (SPARK-23468) in the new RC? It's a minor bug since I've only hit it
>>> >> with older shuffle services, but it's pretty safe.
>>> >>
>>> >> On Tue, Feb 20, 2018 at 5:58 PM, Sameer Agarwal <samee...@apache.org>
>>> >> wrote:
>>> >> > This RC has failed due to
>>> >> > https://issues.apache.org/jira/browse/SPARK-23470.
>>> >> > Now that the fix has been merged in 2.3 (thanks Marcelo!), I'll follow up
>>> >> > with an RC5 soon.
>>> >> >
>>> >> > On 20 February 2018 at 16:49, Ryan Blue <rb...@netflix.com> wrote:
>>> >> >>
>>> >> >> +1
>>> >> >>
>>> >> >> Build & tests look fine, checked signature and checksums for src
>>> >> >> tarball.
>>> >> >>
>>> >> >> On Tue, Feb 20, 2018 at 12:54 PM, Shixiong(Ryan) Zhu
>>> >> >> <shixi...@databricks.com> wrote:
>>> >> >>>
>>> >> >>> I'm -1 because of the UI regression
>>> >> >>> https://issues.apache.org/jira/browse/SPARK-23470 : the All Jobs page
>>> >> >>> may be too slow and cause "read timeout" when there are lots of jobs and
>>> >> >>> stages. This is one of the most important pages because when it's broken,
>>> >> >>> it's pretty hard to use Spark Web UI.
>>> >> >>>
>>> >> >>>
>>> >> >>> On Tue, Feb 20, 2018 at 4:37 AM, Marco Gaido <marcogaid...@gmail.com>
>>> >> >>> wrote:
>>> >> >>>>
>>> >> >>>> +1
>>> >> >>>>
>>> >> >>>> 2018-02-20 12:30 GMT+01:00 Hyukjin Kwon <gurwls...@gmail.com>:
>>> >> >>>>>
>>> >> >>>>> +1 too
>>> >> >>>>>
>>> >> >>>>> 2018-02-20 14:41 GMT+09:00 Takuya UESHIN <ues...@happy-camper.st>:
>>> >> >>>>>>
>>> >> >>>>>> +1
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> On Tue, Feb 20, 2018 at 2:14 PM, Xingbo Jiang
>>> >> >>>>>> <jiangxb1...@gmail.com>
>>> >> >>>>>> wrote:
>>> >> >>>>>>>
>>> >> >>>>>>> +1
>>> >> >>>>>>>
>>> >> >>>>>>>
>>> >> >>>>>>> On Tue, Feb 20, 2018 at 1:09 PM, Wenchen Fan <cloud0...@gmail.com> wrote:
>>> >> >>>>>>>>
>>> >> >>>>>>>> +1
>>> >> >>>>>>>>
>>> >> >>>>>>>> On Tue, Feb 20, 2018 at 12:53 PM, Reynold Xin
>>> >> >>>>>>>> <r...@databricks.com>
>>> >> >>>>>>>> wrote:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> +1
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> On Feb 20, 2018, 5:51 PM +1300, Sameer Agarwal
>>> >> >>>>>>>>> <sameer.a...@gmail.com>, wrote:
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> this file shouldn't be included?
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc4-bin/spark-parent_2.11.iml
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I've now deleted this file
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>> From: Sameer Agarwal <sameer.a...@gmail.com>
>>> >> >>>>>>>>>> Sent: Saturday, February 17, 2018 1:43:39 PM
>>> >> >>>>>>>>>> To: Sameer Agarwal
>>> >> >>>>>>>>>> Cc: dev
>>> >> >>>>>>>>>> Subject: Re: [VOTE] Spark 2.3.0 (RC4)
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> I'll start with a +1 once again.
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> All blockers reported against RC3 have been resolved and the
>>> >> >>>>>>>>>> builds are healthy.
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> On 17 February 2018 at 13:41, Sameer Agarwal
>>> >> >>>>>>>>>> <samee...@apache.org>
>>> >> >>>>>>>>>> wrote:
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>> >> >>>>>>>>>>> version 2.3.0. The vote is open until Thursday February 22,
>>> >> >>>>>>>>>>> 2018 at 8:00:00 am UTC and passes if a majority of at least 3 PMC +1
>>> >> >>>>>>>>>>> votes are cast.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> [ ] +1 Release this package as Apache Spark 2.3.0
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> [ ] -1 Do not release this package because ...
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> To learn more about Apache Spark, please see
>>> >> >>>>>>>>>>> https://spark.apache.org/
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> The tag to be voted on is v2.3.0-rc4:
>>> >> >>>>>>>>>>> https://github.com/apache/spark/tree/v2.3.0-rc4
>>> >> >>>>>>>>>>> (44095cb65500739695b0324c177c19dfa1471472)
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> List of JIRA tickets resolved in this release can be found here:
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> https://issues.apache.org/jira/projects/SPARK/versions/12339551
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> The release files, including signatures, digests, etc. can be
>>> >> >>>>>>>>>>> found at:
>>> >> >>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc4-bin/
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> Release artifacts are signed with the following key:
>>> >> >>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> The staging repository for this release can be found at:
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1265/
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> The documentation corresponding to this release can be found at:
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc4-docs/_site/index.html
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> FAQ
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> =======================================
>>> >> >>>>>>>>>>> What are the unresolved issues targeted for 2.3.0?
>>> >> >>>>>>>>>>> =======================================
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> Please see https://s.apache.org/oXKi. At the time of writing,
>>> >> >>>>>>>>>>> there are currently no known release blockers.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> =========================
>>> >> >>>>>>>>>>> How can I help test this release?
>>> >> >>>>>>>>>>> =========================
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> If you are a Spark user, you can help us test this release by
>>> >> >>>>>>>>>>> taking an existing Spark workload and running it on this release
>>> >> >>>>>>>>>>> candidate, then reporting any regressions.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> If you're working in PySpark, you can set up a virtual env,
>>> >> >>>>>>>>>>> install the current RC, and see if anything important breaks. In
>>> >> >>>>>>>>>>> Java/Scala, you can add the staging repository to your project's
>>> >> >>>>>>>>>>> resolvers and test with the RC (make sure to clean up the artifact
>>> >> >>>>>>>>>>> cache before/after so you don't end up building with an out-of-date
>>> >> >>>>>>>>>>> RC going forward).
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> ===========================================
>>> >> >>>>>>>>>>> What should happen to JIRA tickets still targeting 2.3.0?
>>> >> >>>>>>>>>>> ===========================================
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> Committers should look at those and triage. Extremely important
>>> >> >>>>>>>>>>> bug fixes, documentation, and API tweaks that impact compatibility
>>> >> >>>>>>>>>>> should be worked on immediately. Everything else please retarget to
>>> >> >>>>>>>>>>> 2.3.1 or 2.4.0 as appropriate.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> ===================
>>> >> >>>>>>>>>>> Why is my bug not fixed?
>>> >> >>>>>>>>>>> ===================
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> In order to make timely releases, we will typically not hold the
>>> >> >>>>>>>>>>> release unless the bug in question is a regression from 2.2.0.
>>> >> >>>>>>>>>>> That being said, if there is something which is a regression from 2.2.0
>>> >> >>>>>>>>>>> and has not been correctly targeted please ping me or a committer to help
>>> >> >>>>>>>>>>> target the issue (you can see the open issues listed as impacting Spark
>>> >> >>>>>>>>>>> 2.3.0 at https://s.apache.org/WmoI).
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> --
>>> >> >>>>>>>>>> Sameer Agarwal
>>> >> >>>>>>>>>> Computer Science | UC Berkeley
>>> >> >>>>>>>>>> http://cs.berkeley.edu/~sameerag
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> --
>>> >> >>>>>>>>> Sameer Agarwal
>>> >> >>>>>>>>> Computer Science | UC Berkeley
>>> >> >>>>>>>>> http://cs.berkeley.edu/~sameerag
>>> >> >>>>>>>>
>>> >> >>>>>>>>
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> --
>>> >> >>>>>> Takuya UESHIN
>>> >> >>>>>> Tokyo, Japan
>>> >> >>>>>>
>>> >> >>>>>> http://twitter.com/ueshin
>>> >> >>>>>
>>> >> >>>>>
>>> >> >>>>
>>> >> >>>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Ryan Blue
>>> >> >> Software Engineer
>>> >> >> Netflix
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Marcelo
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Marcelo
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>