Just a reminder - code freeze is coming this Friday!

There can always be exceptions, but those should remain exceptions, discussed on
a case-by-case basis rather than becoming the norm.

On Tue, Dec 24, 2019 at 4:55 PM, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

> 
> Jan 31 sounds good to me.
> 
> 
> Just curious: do we allow any exceptions to the code freeze? One case that
> came to mind is a feature with multiple subtasks, where some of the subtasks
> have been merged and the other subtask(s) are still in review. In this case,
> do we allow those subtasks a few more days to get reviewed and merged later?
> 
> 
> Happy Holidays!
> 
> 
> Thanks,
> Jungtaek Lim (HeartSaVioR)
> 
> On Wed, Dec 25, 2019 at 8:36 AM Takeshi Yamamuro <linguin.m.s@gmail.com> wrote:
> 
> 
>> Looks nice. Happy holidays, all!
>> 
>> 
>> Bests,
>> Takeshi
>> 
>> On Wed, Dec 25, 2019 at 3:56 AM Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:
>> 
>> 
>>> +1 for January 31st.
>>> 
>>> 
>>> Bests,
>>> Dongjoon.
>>> 
>>> On Tue, Dec 24, 2019 at 7:11 AM Xiao Li <lixiao@databricks.com> wrote:
>>> 
>>> 
>>>> Jan 31 is pretty reasonable. Happy Holidays! 
>>>> 
>>>> 
>>>> Xiao
>>>> 
>>>> On Tue, Dec 24, 2019 at 5:52 AM Sean Owen <srowen@gmail.com> wrote:
>>>> 
>>>> 
>>>>> Yep, always happens. Is earlier realistic, like Jan 15? It's all arbitrary,
>>>>> but indeed this has been in progress for a while, and there's a downside
>>>>> to not releasing it: it makes the gap to 3.0 larger.
>>>>> On my end I don't know of anything that's holding up a release; is it
>>>>> basically DSv2?
>>>>> 
>>>>> BTW these are the items still targeted to 3.0.0, some of which may not
>>>>> have been legitimately tagged. It may be worth reviewing what's still open
>>>>> and necessary, and what should be untargeted.
>>>>> 
>>>>> 
>>>>> SPARK-29768 nondeterministic expression fails column pruning
>>>>> SPARK-29345 Add an API that allows a user to define and observe arbitrary
>>>>> metrics on streaming queries
>>>>> SPARK-29348 Add observable metrics
>>>>> SPARK-29429 Support Prometheus monitoring natively
>>>>> SPARK-29577 Implement p-value simulation and unit tests for chi2 test
>>>>> SPARK-28900 Test Pyspark, SparkR on JDK 11 with run-tests
>>>>> SPARK-28883 Fix a flaky test: ThriftServerQueryTestSuite
>>>>> SPARK-28717 Update SQL ALTER TABLE RENAME to use TableCatalog API
>>>>> SPARK-28588 Build a SQL reference doc
>>>>> SPARK-28629 Capture the missing rules in HiveSessionStateBuilder
>>>>> SPARK-28684 Hive module support JDK 11
>>>>> SPARK-28548 explain() shows wrong result for persisted DataFrames after
>>>>> some operations
>>>>> SPARK-28264 Revisiting Python / pandas UDF
>>>>> SPARK-28301 fix the behavior of table name resolution with multi-catalog
>>>>> SPARK-28155 do not leak SaveMode to file source v2
>>>>> SPARK-28103 Cannot infer filters from union table with empty local
>>>>> relation table properly
>>>>> SPARK-27986 Support Aggregate Expressions with filter
>>>>> SPARK-28024 Incorrect numeric values when out of range
>>>>> SPARK-27936 Support local dependency uploading from --py-files
>>>>> SPARK-27780 Shuffle server & client should be versioned to enable smoother
>>>>> upgrade
>>>>> SPARK-27714 Support Join Reorder based on Genetic Algorithm when the # of
>>>>> joined tables > 12
>>>>> SPARK-27471 Reorganize public v2 catalog API
>>>>> SPARK-27520 Introduce a global config system to replace
>>>>> hadoopConfiguration
>>>>> SPARK-24625 put all the backward compatible behavior change configs under
>>>>> spark.sql.legacy.*
>>>>> SPARK-24941 Add RDDBarrier.coalesce() function
>>>>> SPARK-25017 Add test suite for ContextBarrierState
>>>>> SPARK-25083 remove the type erasure hack in data source scan
>>>>> SPARK-25383 Image data source supports sample pushdown
>>>>> SPARK-27272 Enable blacklisting of node/executor on fetch failures by
>>>>> default
>>>>> SPARK-27296 Efficient User Defined Aggregators
>>>>> SPARK-25128 multiple simultaneous job submissions against k8s backend
>>>>> cause driver pods to hang
>>>>> SPARK-26664 Make DecimalType's minimum adjusted scale configurable
>>>>> SPARK-21559 Remove Mesos fine-grained mode
>>>>> SPARK-24942 Improve cluster resource management with jobs containing
>>>>> barrier stage
>>>>> SPARK-25914 Separate projection from grouping and aggregate in logical
>>>>> Aggregate
>>>>> SPARK-20964 Make some keywords reserved along with the ANSI/SQL standard
>>>>> SPARK-26221 Improve Spark SQL instrumentation and metrics
>>>>> SPARK-26425 Add more constraint checks in file streaming source to avoid
>>>>> checkpoint corruption
>>>>> SPARK-25843 Redesign rangeBetween API
>>>>> SPARK-25841 Redesign window function rangeBetween API
>>>>> SPARK-25752 Add trait to easily whitelist logical operators that produce
>>>>> named output from CleanupAliases
>>>>> SPARK-25640 Clarify/Improve EvalType for grouped aggregate and window
>>>>> aggregate
>>>>> SPARK-25531 new write APIs for data source v2
>>>>> SPARK-25547 Pluggable jdbc connection factory
>>>>> SPARK-20845 Support specification of column names in INSERT INTO
>>>>> SPARK-24724 Discuss necessary info and access in barrier mode + Kubernetes
>>>>> SPARK-24725 Discuss necessary info and access in barrier mode + Mesos
>>>>> SPARK-25074 Implement maxNumConcurrentTasks() in
>>>>> MesosFineGrainedSchedulerBackend
>>>>> SPARK-23710 Upgrade the built-in Hive to 2.3.5 for hadoop-3.2
>>>>> SPARK-25186 Stabilize Data Source V2 API
>>>>> SPARK-25376 Scenarios we should handle but missed in 2.4 for barrier
>>>>> execution mode
>>>>> SPARK-7768 Make user-defined type (UDT) API public
>>>>> SPARK-14922 Alter Table Drop Partition Using Predicate-based Partition
>>>>> Spec
>>>>> SPARK-15694 Implement ScriptTransformation in sql/core
>>>>> SPARK-18134 SQL: MapType in Group BY and Joins not working
>>>>> SPARK-19842 Informational Referential Integrity Constraints Support in
>>>>> Spark
>>>>> SPARK-22231 Support of map, filter, withColumn, dropColumn in nested list
>>>>> of structures
>>>>> SPARK-22386 Data Source V2 improvements
>>>>> SPARK-24723 Discuss necessary info and access in barrier mode + YARN
>>>>> 
>>>>> On Mon, Dec 23, 2019 at 5:48 PM Reynold Xin <rxin@databricks.com> wrote:
>>>>> 
>>>>> 
>>>>>> We've pushed out 3.0 multiple times. The latest release window documented
>>>>>> on the website ( http://spark.apache.org/versioning-policy.html ) says
>>>>>> we'd code freeze and cut branch-3.0 in early Dec. It looks like we are
>>>>>> suffering a bit from the tragedy of the commons, in that nobody is pushing
>>>>>> for getting the release out. I understand the natural tendency for each
>>>>>> individual is to finish or extend the feature/bug that the person has been
>>>>>> working on. At some point we need to say "this is it" and get the release
>>>>>> out. I'm happy to help drive this process.
>>>>>> 
>>>>>> To be realistic, I don't think we should just code freeze *today*.
>>>>>> Although we have updated the website, contributors have all been operating
>>>>>> under the assumption that all active development is still going on. I
>>>>>> propose we *cut the branch on Jan 31, code freeze and switch over to
>>>>>> bug-squashing mode, and try to get the 3.0 official release out in Q1*.
>>>>>> That is, by default no new features can go into the branch starting Jan 31.
>>>>>> 
>>>>>> What do you think?
>>>>>> 
>>>>>> And happy holidays everybody.
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> --
>>>> Databricks Summit - Watch the talks ( https://databricks.com/sparkaisummit/north-america )
>>>> 
>>>> 
>>> 
>>> 
>> 
>> --
>> ---
>> Takeshi Yamamuro
>> 
> 
>
