Thank you, Shane!

Xiao
On Tue, Feb 4, 2020 at 2:16 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Thank you, Shane! :D
>
> Bests,
> Dongjoon
>
> On Tue, Feb 4, 2020 at 13:28 shane knapp ☠ <skn...@berkeley.edu> wrote:
>
>> all the 3.0 builds have been created and are currently churning away!
>>
>> (the failed builds were due to a silly bug in the build scripts sneaking
>> its way back in, but that's resolved now)
>>
>> shane
>>
>> On Sat, Feb 1, 2020 at 6:16 PM Reynold Xin <r...@databricks.com> wrote:
>>
>>> Note that branch-3.0 was cut. Please focus on testing and polish, and
>>> let's get the release out!
>>>
>>> On Wed, Jan 29, 2020 at 3:41 PM, Reynold Xin <r...@databricks.com> wrote:
>>>
>>>> Just a reminder - code freeze is coming this Fri!
>>>>
>>>> There can always be exceptions, but those should be exceptions and
>>>> discussed on a case-by-case basis rather than becoming the norm.
>>>>
>>>> On Tue, Dec 24, 2019 at 4:55 PM, Jungtaek Lim <
>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> Jan 31 sounds good to me.
>>>>>
>>>>> Just curious: do we allow any exceptions to the code freeze? One case
>>>>> that came to mind is a feature with multiple subtasks, where some
>>>>> subtasks have been merged and the remaining ones are still in review.
>>>>> In that case, do we allow those subtasks a few more days to get
>>>>> reviewed and merged later?
>>>>>
>>>>> Happy Holidays!
>>>>>
>>>>> Thanks,
>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>
>>>>> On Wed, Dec 25, 2019 at 8:36 AM Takeshi Yamamuro <
>>>>> linguin....@gmail.com> wrote:
>>>>>
>>>>>> Looks nice. Happy holidays, all!
>>>>>>
>>>>>> Bests,
>>>>>> Takeshi
>>>>>>
>>>>>> On Wed, Dec 25, 2019 at 3:56 AM Dongjoon Hyun <
>>>>>> dongjoon.h...@gmail.com> wrote:
>>>>>>
>>>>>>> +1 for January 31st.
>>>>>>>
>>>>>>> Bests,
>>>>>>> Dongjoon.
>>>>>>>
>>>>>>> On Tue, Dec 24, 2019 at 7:11 AM Xiao Li <lix...@databricks.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Jan 31 is pretty reasonable. Happy Holidays!
>>>>>>>>
>>>>>>>> Xiao
>>>>>>>>
>>>>>>>> On Tue, Dec 24, 2019 at 5:52 AM Sean Owen <sro...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Yep, always happens. Is earlier realistic, like Jan 15? It's all
>>>>>>>>> arbitrary, but indeed this has been in progress for a while, and
>>>>>>>>> there's a downside to not releasing it: it makes the gap to 3.0
>>>>>>>>> larger. On my end I don't know of anything that's holding up a
>>>>>>>>> release; is it basically DSv2?
>>>>>>>>>
>>>>>>>>> BTW, these are the items still targeted to 3.0.0, some of which
>>>>>>>>> may not have been legitimately tagged. It may be worth reviewing
>>>>>>>>> what's still open and necessary, and what should be untargeted.
>>>>>>>>>
>>>>>>>>> SPARK-29768 nondeterministic expression fails column pruning
>>>>>>>>> SPARK-29345 Add an API that allows a user to define and observe arbitrary metrics on streaming queries
>>>>>>>>> SPARK-29348 Add observable metrics
>>>>>>>>> SPARK-29429 Support Prometheus monitoring natively
>>>>>>>>> SPARK-29577 Implement p-value simulation and unit tests for chi2 test
>>>>>>>>> SPARK-28900 Test Pyspark, SparkR on JDK 11 with run-tests
>>>>>>>>> SPARK-28883 Fix a flaky test: ThriftServerQueryTestSuite
>>>>>>>>> SPARK-28717 Update SQL ALTER TABLE RENAME to use TableCatalog API
>>>>>>>>> SPARK-28588 Build a SQL reference doc
>>>>>>>>> SPARK-28629 Capture the missing rules in HiveSessionStateBuilder
>>>>>>>>> SPARK-28684 Hive module support JDK 11
>>>>>>>>> SPARK-28548 explain() shows wrong result for persisted DataFrames after some operations
>>>>>>>>> SPARK-28264 Revisiting Python / pandas UDF
>>>>>>>>> SPARK-28301 fix the behavior of table name resolution with multi-catalog
>>>>>>>>> SPARK-28155 do not leak SaveMode to file source v2
>>>>>>>>> SPARK-28103 Cannot infer filters from union table with empty local relation table properly
>>>>>>>>> SPARK-27986 Support Aggregate Expressions with filter
>>>>>>>>> SPARK-28024 Incorrect numeric values when out of range
>>>>>>>>> SPARK-27936 Support local dependency uploading from --py-files
>>>>>>>>> SPARK-27780 Shuffle server & client should be versioned to enable smoother upgrade
>>>>>>>>> SPARK-27714 Support Join Reorder based on Genetic Algorithm when the # of joined tables > 12
>>>>>>>>> SPARK-27471 Reorganize public v2 catalog API
>>>>>>>>> SPARK-27520 Introduce a global config system to replace hadoopConfiguration
>>>>>>>>> SPARK-24625 put all the backward compatible behavior change configs under spark.sql.legacy.*
>>>>>>>>> SPARK-24941 Add RDDBarrier.coalesce() function
>>>>>>>>> SPARK-25017 Add test suite for ContextBarrierState
>>>>>>>>> SPARK-25083 remove the type erasure hack in data source scan
>>>>>>>>> SPARK-25383 Image data source supports sample pushdown
>>>>>>>>> SPARK-27272 Enable blacklisting of node/executor on fetch failures by default
>>>>>>>>> SPARK-27296 Efficient User Defined Aggregators
>>>>>>>>> SPARK-25128 multiple simultaneous job submissions against k8s backend cause driver pods to hang
>>>>>>>>> SPARK-26664 Make DecimalType's minimum adjusted scale configurable
>>>>>>>>> SPARK-21559 Remove Mesos fine-grained mode
>>>>>>>>> SPARK-24942 Improve cluster resource management with jobs containing barrier stage
>>>>>>>>> SPARK-25914 Separate projection from grouping and aggregate in logical Aggregate
>>>>>>>>> SPARK-20964 Make some keywords reserved along with the ANSI/SQL standard
>>>>>>>>> SPARK-26221 Improve Spark SQL instrumentation and metrics
>>>>>>>>> SPARK-26425 Add more constraint checks in file streaming source to avoid checkpoint corruption
>>>>>>>>> SPARK-25843 Redesign rangeBetween API
>>>>>>>>> SPARK-25841 Redesign window function rangeBetween API
>>>>>>>>> SPARK-25752 Add trait to easily whitelist logical operators that produce named output from CleanupAliases
>>>>>>>>> SPARK-25640 Clarify/Improve EvalType for grouped aggregate and window aggregate
>>>>>>>>> SPARK-25531 new write APIs for data source v2
>>>>>>>>> SPARK-25547 Pluggable jdbc connection factory
>>>>>>>>> SPARK-20845 Support specification of column names in INSERT INTO
>>>>>>>>> SPARK-24724 Discuss necessary info and access in barrier mode + Kubernetes
>>>>>>>>> SPARK-24725 Discuss necessary info and access in barrier mode + Mesos
>>>>>>>>> SPARK-25074 Implement maxNumConcurrentTasks() in MesosFineGrainedSchedulerBackend
>>>>>>>>> SPARK-23710 Upgrade the built-in Hive to 2.3.5 for hadoop-3.2
>>>>>>>>> SPARK-25186 Stabilize Data Source V2 API
>>>>>>>>> SPARK-25376 Scenarios we should handle but missed in 2.4 for barrier execution mode
>>>>>>>>> SPARK-7768 Make user-defined type (UDT) API public
>>>>>>>>> SPARK-14922 Alter Table Drop Partition Using Predicate-based Partition Spec
>>>>>>>>> SPARK-15694 Implement ScriptTransformation in sql/core
>>>>>>>>> SPARK-18134 SQL: MapType in Group BY and Joins not working
>>>>>>>>> SPARK-19842 Informational Referential Integrity Constraints Support in Spark
>>>>>>>>> SPARK-22231 Support of map, filter, withColumn, dropColumn in nested list of structures
>>>>>>>>> SPARK-22386 Data Source V2 improvements
>>>>>>>>> SPARK-24723 Discuss necessary info and access in barrier mode + YARN
>>>>>>>>>
>>>>>>>>> On Mon, Dec 23, 2019 at 5:48 PM Reynold Xin <r...@databricks.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> We've pushed out 3.0 multiple times. The latest release window
>>>>>>>>>> documented on the website
>>>>>>>>>> <http://spark.apache.org/versioning-policy.html> says we'd code
>>>>>>>>>> freeze and cut branch-3.0 in early December. It looks like we are
>>>>>>>>>> suffering a bit from the tragedy of the commons: nobody is
>>>>>>>>>> pushing to get the release out. I understand that each
>>>>>>>>>> individual's natural tendency is to finish or extend the feature
>>>>>>>>>> or bug fix they have been working on. At some point we need to
>>>>>>>>>> say "this is it" and get the release out. I'm happy to help drive
>>>>>>>>>> this process.
>>>>>>>>>>
>>>>>>>>>> To be realistic, I don't think we should just code freeze *today*.
>>>>>>>>>> Although we have updated the website, contributors have all been
>>>>>>>>>> operating under the assumption that all active development is
>>>>>>>>>> still going on. I propose we *cut the branch on Jan 31, code
>>>>>>>>>> freeze and switch over to bug-squashing mode, and try to get the
>>>>>>>>>> 3.0 official release out in Q1*. That is, by default no new
>>>>>>>>>> features can go into the branch starting Jan 31.
>>>>>>>>>>
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> And happy holidays, everybody.
>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> <https://databricks.com/sparkaisummit/north-america>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> ---
>>>>>> Takeshi Yamamuro
>>>>>
>>>
>>
>> --
>> Shane Knapp
>> Computer Guy / Voice of Reason
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
--
<https://databricks.com/sparkaisummit/north-america>
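
Sean's "still targeted to 3.0.0" list above can be regenerated against the
ASF JIRA at any time. The following is a minimal Python sketch, not taken
from the thread: it assumes the standard JIRA REST v2 search endpoint and
the "Target Version/s" custom field the Spark project uses for targeting,
so the JQL string and status names may need adjusting.

    # Regenerate the list of open SPARK issues still targeted at 3.0.0.
    # Assumptions (not from the thread): the JQL below and the
    # "Target Version/s" field name match the ASF JIRA configuration.
    import requests

    JIRA_SEARCH_URL = "https://issues.apache.org/jira/rest/api/2/search"
    JQL = ('project = SPARK AND "Target Version/s" = 3.0.0 '
           'AND status in (Open, Reopened, "In Progress")')

    def targeted_issues(page_size: int = 200):
        """Yield (key, summary) pairs for issues still targeted at 3.0.0."""
        start = 0
        while True:
            resp = requests.get(
                JIRA_SEARCH_URL,
                params={
                    "jql": JQL,
                    "fields": "summary",
                    "startAt": start,
                    "maxResults": page_size,
                },
                timeout=30,
            )
            resp.raise_for_status()
            data = resp.json()
            for issue in data["issues"]:
                yield issue["key"], issue["fields"]["summary"]
            start += len(data["issues"])
            # Stop when the paged results are exhausted.
            if not data["issues"] or start >= data["total"]:
                break

    if __name__ == "__main__":
        for key, summary in targeted_issues():
            print(f"{key} {summary}")

Running it prints "SPARK-xxxxx <summary>" lines in the same shape as the
list quoted above, which makes it easy to diff against what was posted.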