This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git


The following commit(s) were added to refs/heads/main by this push:
     new ad0dc2fb43 chore: Improve change log generator (#10841)
ad0dc2fb43 is described below

commit ad0dc2fb43fa6a7ea3116075511a5bc9c39851b6
Author: Andy Grove <[email protected]>
AuthorDate: Sun Jun 9 09:59:19 2024 -0600

    chore: Improve change log generator (#10841)
    
    * Improve change log generator
    
    * prettier
    
    * prettier
---
 dev/changelog/39.0.0.md           | 146 +++++++++++++++++++++++---------------
 dev/release/README.md             |  51 ++++++-------
 dev/release/generate-changelog.py |  64 ++++++++++++++---
 3 files changed, 164 insertions(+), 97 deletions(-)
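
Note for reviewers: the `generate-changelog.py` change below adds an `other` bucket, so pull requests that match no label or title prefix are still listed instead of being dropped. A minimal sketch of that categorization order follows. It is not the script's actual code: the function name `categorize`, the title regex `TITLE_RE`, and the performance condition are assumptions, because those parts fall outside the diff context shown here.

```python
import re

# Hypothetical pattern for conventional-commit style titles such as
# "feat(scope)!: message". The script's real regex is not visible in this diff.
TITLE_RE = re.compile(r"^(\w+)(\([^)]*\))?(!)?:")


def categorize(title, labels):
    """Return the changelog section name for one pull request."""
    cc_type, cc_breaking = "", False
    match = TITLE_RE.match(title)
    if match:
        cc_type = match.group(1)
        cc_breaking = match.group(3) == "!"

    # Same order as the if/elif chain in the diff: first match wins.
    if "api change" in labels or cc_breaking:
        return "breaking"
    if "bug" in labels or cc_type == "fix":
        return "bugs"
    # Assumption: the real condition for this branch is outside the diff context.
    if "performance" in labels or cc_type == "perf":
        return "performance"
    if "enhancement" in labels or cc_type == "feat":
        return "enhancements"
    if "documentation" in labels or cc_type in ("docs", "doc"):
        return "docs"
    # New in this commit: everything else is collected rather than skipped.
    return "other"
```

Under these assumptions, a title such as `doc: fix old master branch references to main` lands in the documentation section, while an unprefixed, unlabeled title such as `Improve ContextProvider` falls through to `other`.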

diff --git a/dev/changelog/39.0.0.md b/dev/changelog/39.0.0.md
index f94e34592c..ff27b4ba24 100644
--- a/dev/changelog/39.0.0.md
+++ b/dev/changelog/39.0.0.md
@@ -1,23 +1,25 @@
-<!---
-  Licensed to the Apache Software Foundation (ASF) under one
-  or more contributor license agreements.  See the NOTICE file
-  distributed with this work for additional information
-  regarding copyright ownership.  The ASF licenses this file
-  to you under the Apache License, Version 2.0 (the
-  "License"); you may not use this file except in compliance
-  with the License.  You may obtain a copy of the License at
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
 
-    http://www.apache.org/licenses/LICENSE-2.0
+  http://www.apache.org/licenses/LICENSE-2.0
 
-  Unless required by applicable law or agreed to in writing,
-  software distributed under the License is distributed on an
-  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-  KIND, either express or implied.  See the License for the
-  specific language governing permissions and limitations
-  under the License.
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
 -->
 
-## [39.0.0](https://github.com/apache/datafusion/tree/39.0.0) (2024-06-07)
+# Apache DataFusion 39.0.0 Changelog
+
+This release consists of 234 commits from 59 contributors. See credits at the 
end of this changelog for more information.
 
 **Breaking changes:**
 
@@ -72,16 +74,12 @@
 - docs: add documents to substrait type variation consts 
[#10719](https://github.com/apache/datafusion/pull/10719) (waynexia)
 - Minor: (Doc) Enable rt-multi-thread feature for sample code 
[#10770](https://github.com/apache/datafusion/pull/10770) (hsiang-c)
 
-**Merged pull requests:**
+**Other:**
 
-- Prepare 38.0.0 release candidate 1 
[#10407](https://github.com/apache/datafusion/pull/10407) (andygrove)
 - Minor: Add more docs and examples for `Expr::unalias` 
[#10406](https://github.com/apache/datafusion/pull/10406) (alamb)
 - minor: Remove [RUST][datafusion] from release vote email subject line 
[#10411](https://github.com/apache/datafusion/pull/10411) (andygrove)
-- Remove ScalarFunctionDefinition 
[#10325](https://github.com/apache/datafusion/pull/10325) (lewiszlw)
-- chore(docs): update subquery documentation with more information 
[#10361](https://github.com/apache/datafusion/pull/10361) (sanderson)
 - fix dml logical plan output schema 
[#10394](https://github.com/apache/datafusion/pull/10394) (leoyvens)
 - [MINOR]: Move transpose code to under common 
[#10409](https://github.com/apache/datafusion/pull/10409) (mustafasrepo)
-- minor: Remove docs archive 
[#10416](https://github.com/apache/datafusion/pull/10416) (andygrove)
 - Fix incorrect Schema over aggregate function, Remove unnecessary 
`exprlist_to_fields_aggregate` 
[#10408](https://github.com/apache/datafusion/pull/10408) (jonahgao)
 - Enable user defined display_name for ScalarUDF 
[#10417](https://github.com/apache/datafusion/pull/10417) (yyy1000)
 - Fix and improve `CommonSubexprEliminate` rule 
[#10396](https://github.com/apache/datafusion/pull/10396) (peter-toth)
@@ -94,16 +92,10 @@
 - Improve flight sql examples 
[#10432](https://github.com/apache/datafusion/pull/10432) (lewiszlw)
 - Move Covariance (Population) covar_pop to be a User Defined Aggregate 
Function [#10418](https://github.com/apache/datafusion/pull/10418) (yyy1000)
 - Stop copying LogicalPlan and Exprs in `OptimizeProjections` (2% faster 
planning) [#10405](https://github.com/apache/datafusion/pull/10405) (alamb)
-- Minor: format comments in `PushDownFilter` rule 
[#10437](https://github.com/apache/datafusion/pull/10437) (alamb)
 - chore: Improve release process for next time 
[#10447](https://github.com/apache/datafusion/pull/10447) (andygrove)
-- Minor: Add usecase to comments in `LogicalPlan::recompute_schema` 
[#10443](https://github.com/apache/datafusion/pull/10443) (alamb)
-- doc: fix old master branch references to main 
[#10458](https://github.com/apache/datafusion/pull/10458) (Jefffrey)
 - Move bit_and_or_xor unit tests to slt 
[#10457](https://github.com/apache/datafusion/pull/10457) (NoeB)
-- Introduce user-defined signature 
[#10439](https://github.com/apache/datafusion/pull/10439) (jayzhan211)
-- Remove `AggregateFunctionDefinition::Name` 
[#10441](https://github.com/apache/datafusion/pull/10441) (lewiszlw)
 - Remove some Expr clones in `EliminateCrossJoin`(3%-5% faster planning) 
[#10430](https://github.com/apache/datafusion/pull/10430) (alamb)
 - refactor: Reduce string allocations in Expr::display_name (use write instead 
of format!) [#10454](https://github.com/apache/datafusion/pull/10454) 
(erratic-pattern)
-- Make `CREATE EXTERNAL TABLE` format options consistent, remove special 
syntax for `HEADER ROW`, `DELIMITER` and `COMPRESSION` 
[#10404](https://github.com/apache/datafusion/pull/10404) (berkaysynnada)
 - Add `simplify` method to aggregate function 
[#10354](https://github.com/apache/datafusion/pull/10354) (milenkovicm)
 - Add cast array test to sqllogictest 
[#10474](https://github.com/apache/datafusion/pull/10474) (viirya)
 - Add `Expr::try_as_col`, deprecate `Expr::try_into_col` (speed up optimizer) 
[#10448](https://github.com/apache/datafusion/pull/10448) (alamb)
@@ -113,21 +105,13 @@
 - Stop copying LogicalPlan and Exprs in `ReplaceDistinctWithAggregate` 
[#10460](https://github.com/apache/datafusion/pull/10460) (ClSlaid)
 - Stop copying LogicalPlan and Exprs in `EliminateCrossJoin` (4% faster 
planning) [#10431](https://github.com/apache/datafusion/pull/10431) (alamb)
 - Improved ergonomy for `CREATE EXTERNAL TABLE OPTIONS`: Don't require 
quotations for simple namespaced keys like `foo.bar` 
[#10483](https://github.com/apache/datafusion/pull/10483) (ozankabak)
-- feat: allow `array_slice` to take an optional stride parameter 
[#10469](https://github.com/apache/datafusion/pull/10469) (jonahgao)
 - Replace `GetFieldAccess` with indexing function in `SqlToRel ` 
[#10375](https://github.com/apache/datafusion/pull/10375) (jayzhan211)
-- fix: make `columnize_expr` resistant to display_name collisions 
[#10459](https://github.com/apache/datafusion/pull/10459) (jonahgao)
 - Fix values with different data types caused failure 
[#10445](https://github.com/apache/datafusion/pull/10445) (b41sh)
-- fix: avoid compressed json files repartitioning 
[#10470](https://github.com/apache/datafusion/pull/10470) (korowa)
-- Minor: Improved document string for `LogicalPlanBuilder` 
[#10496](https://github.com/apache/datafusion/pull/10496) (AbrarNitk)
 - Fix SortMergeJoin with join filter filtering all rows out 
[#10495](https://github.com/apache/datafusion/pull/10495) (viirya)
 - chore: use fullpath in macro to avoid declaring in other module 
[#10503](https://github.com/apache/datafusion/pull/10503) (jayzhan211)
-- Minor: Extend more style of udaf `expr_fn`, Remove order args 
for`covar_samp` and `covar_pop` 
[#10492](https://github.com/apache/datafusion/pull/10492) (jayzhan211)
 - Minor: remove unused source file `udf.rs` 
[#10497](https://github.com/apache/datafusion/pull/10497) (jonahgao)
-- feat: optional args for regexp\_\* UDFs 
[#10514](https://github.com/apache/datafusion/pull/10514) (Michael-J-Ward)
 - Support UDAF to align Builtin aggregate function 
[#10493](https://github.com/apache/datafusion/pull/10493) (jayzhan211)
-- Remove `file_type()` from `FileFormat` 
[#10499](https://github.com/apache/datafusion/pull/10499) (Jefffrey)
 - Minor: add a test for `current_time` (no args) 
[#10509](https://github.com/apache/datafusion/pull/10509) (alamb)
-- fix: parsing timestamp with date format 
[#10476](https://github.com/apache/datafusion/pull/10476) (shanretoo)
 - [MINOR]: Move pipeline checker rule to the end 
[#10502](https://github.com/apache/datafusion/pull/10502) (mustafasrepo)
 - Minor: Extract parent/child limit calculation into a function, improve docs 
[#10501](https://github.com/apache/datafusion/pull/10501) (alamb)
 - Fix window expr deserialization 
[#10506](https://github.com/apache/datafusion/pull/10506) (lewiszlw)
@@ -135,7 +119,6 @@
 - Stop copying LogicalPlan and Exprs in `TypeCoercion` (10% faster planning) 
[#10356](https://github.com/apache/datafusion/pull/10356) (alamb)
 - Implement unparse `IS_NULL` to String and enhance the tests 
[#10529](https://github.com/apache/datafusion/pull/10529) (goldmedal)
 - Fix panic in array_agg(distinct) query 
[#10526](https://github.com/apache/datafusion/pull/10526) (jayzhan211)
-- UDAF: Extend more args to `state_fields` and `groups_accumulator_supported` 
and introduce `ReversedUDAF` 
[#10525](https://github.com/apache/datafusion/pull/10525) (jayzhan211)
 - Move min_max unit tests to slt 
[#10539](https://github.com/apache/datafusion/pull/10539) (xinlifoobar)
 - Implement unparse `IsNotFalse` to String 
[#10538](https://github.com/apache/datafusion/pull/10538) (goldmedal)
 - Implement Unparse TryCast Expr --> String Support 
[#10542](https://github.com/apache/datafusion/pull/10542) (xinlifoobar)
@@ -145,38 +128,30 @@
 - Stop most copying LogicalPlan and Exprs in `ScalarSubqueryToJoin` 
[#10489](https://github.com/apache/datafusion/pull/10489) (alamb)
 - Example for simple Expr --> SQL conversion 
[#10528](https://github.com/apache/datafusion/pull/10528) (edmondop)
 - fix `null_count` on `compute_record_batch_statistics` to report null counts 
across partitions [#10468](https://github.com/apache/datafusion/pull/10468) 
(samuelcolvin)
-- fix: `array_slice` panics 
[#10547](https://github.com/apache/datafusion/pull/10547) (jonahgao)
 - Minor: Add `PullUpCorrelatedExpr::new` and improve documentation 
[#10500](https://github.com/apache/datafusion/pull/10500) (alamb)
 - Stop copying LogicalPlan and Exprs in `PushDownLimit` 
[#10508](https://github.com/apache/datafusion/pull/10508) (alamb)
 - Break up contributing guide into smaller pages 
[#10533](https://github.com/apache/datafusion/pull/10533) (alamb)
 - PhysicalExpr Orderings with Range Information 
[#10504](https://github.com/apache/datafusion/pull/10504) (berkaysynnada)
 - Implement unparse `ScalarVariable` to String 
[#10541](https://github.com/apache/datafusion/pull/10541) (reswqa)
-- feat: Expose Parquet Schema Adapter 
[#10515](https://github.com/apache/datafusion/pull/10515) (HawaiianSpork)
 - Handle dictionary values in ScalarValue serde 
[#10563](https://github.com/apache/datafusion/pull/10563) (thinkharderdev)
 - Improve signature of `get_field` function 
[#10569](https://github.com/apache/datafusion/pull/10569) (lewiszlw)
 - Implement Unparse `GroupingSet` Expr --> String Support sql 
[#10555](https://github.com/apache/datafusion/pull/10555) (xinlifoobar)
 - Minor: Move proxy to datafusion common 
[#10561](https://github.com/apache/datafusion/pull/10561) (jayzhan211)
 - Update prost-build requirement from =0.12.4 to =0.12.6 
[#10578](https://github.com/apache/datafusion/pull/10578) (dependabot[bot])
 - Add examples of how to convert logical plan to/from sql strings 
[#10558](https://github.com/apache/datafusion/pull/10558) (xinlifoobar)
-- feat: API for collecting statistics/index for metadata of a parquet file + 
tests [#10537](https://github.com/apache/datafusion/pull/10537) (NGA-TRAN)
 - Fix: Sort Merge Join LeftSemi issues when JoinFilter is set 
[#10304](https://github.com/apache/datafusion/pull/10304) (comphead)
-- Remove `Expr::GetIndexedField`, replace `Expr::{field,index,range}` with 
`FieldAccessor`, `IndexAccessor`, and `SliceAccessor` 
[#10568](https://github.com/apache/datafusion/pull/10568) (jayzhan211)
 - Minor: Fix `ArrayFunctionRewriter` name reporting 
[#10581](https://github.com/apache/datafusion/pull/10581) (alamb)
 - Improve `UserDefinedLogicalNode::from_template` API to return `Result` 
[#10575](https://github.com/apache/datafusion/pull/10575) (lewiszlw)
 - Migrate testing optimizer rules to use `rewrite` API 
[#10576](https://github.com/apache/datafusion/pull/10576) (lewiszlw)
-- Improve ContextProvider 
[#10577](https://github.com/apache/datafusion/pull/10577) (lewiszlw)
 - test: add more tests for statistics reading 
[#10592](https://github.com/apache/datafusion/pull/10592) (NGA-TRAN)
 - refactor: reduce allocations in push down filter 
[#10567](https://github.com/apache/datafusion/pull/10567) (erratic-pattern)
 - Fix compilation of datafusion-cli on 32bit targets 
[#10594](https://github.com/apache/datafusion/pull/10594) (nathaniel-daniel)
-- Add to_date function to scalar functions doc 
[#10601](https://github.com/apache/datafusion/pull/10601) (Omega359)
 - Rename monotonicity as output_ordering in ScalarUDF's 
[#10596](https://github.com/apache/datafusion/pull/10596) (berkaysynnada)
 - Implement Unparser for `UNION ALL` 
[#10603](https://github.com/apache/datafusion/pull/10603) (phillipleblanc)
 - Improve `UserDefinedLogicalNodeCore::from_template` API to return Result 
[#10597](https://github.com/apache/datafusion/pull/10597) (lewiszlw)
 - Minor: Move group accumulator for aggregate function to 
physical-expr-common, and add ahash physical-expr-common 
[#10574](https://github.com/apache/datafusion/pull/10574) (jayzhan211)
 - Minor: Consolidate some integration tests into `core_integration` 
[#10588](https://github.com/apache/datafusion/pull/10588) (alamb)
 - Stop copying LogicalPlan and Exprs in `SingleDistinctToGroupBy` 
[#10527](https://github.com/apache/datafusion/pull/10527) (appletreeisyellow)
-- feat: Add eliminate group by constant optimizer rule 
[#10591](https://github.com/apache/datafusion/pull/10591) (korowa)
-- Docs: Update PR workflow documentation 
[#10532](https://github.com/apache/datafusion/pull/10532) (alamb)
 - [MINOR]: Update get range implementation for lead lag window functions 
[#10614](https://github.com/apache/datafusion/pull/10614) (mustafasrepo)
 - Minor: Improve documentation in sql_to_plan example 
[#10582](https://github.com/apache/datafusion/pull/10582) (alamb)
 - Docs: add examples for `RuntimeEnv::register_object_store`, improve error 
messages [#10617](https://github.com/apache/datafusion/pull/10617) (aditanase)
@@ -184,7 +159,6 @@
 - Add to_unixtime function to scalar functions doc 
[#10620](https://github.com/apache/datafusion/pull/10620) (Omega359)
 - Test for reading read statistics from parquet files without statistics and 
boolean & struct data type 
[#10608](https://github.com/apache/datafusion/pull/10608) (NGA-TRAN)
 - adding benchmark for extracting arrow statistics from parquet 
[#10610](https://github.com/apache/datafusion/pull/10610) (Lordworms)
-- feat: extend `unnest` to support Struct datatype 
[#10429](https://github.com/apache/datafusion/pull/10429) (duongcongtoai)
 - Implement a dialect-specific rule for unparsing an identifier with or 
without quotes [#10573](https://github.com/apache/datafusion/pull/10573) 
(goldmedal)
 - add catalog as part of the table path in plan_to_sql 
[#10612](https://github.com/apache/datafusion/pull/10612) (y-f-u)
 - Refactor parquet row group pruning into a struct (use new statistics API, 
part 1) [#10607](https://github.com/apache/datafusion/pull/10607) (alamb)
@@ -205,19 +179,14 @@
 - Fix typo in Cargo.toml (unused manifest key: dependencies.regex.worksapce) 
[#10662](https://github.com/apache/datafusion/pull/10662) (alamb)
 - Add `FileScanConfig::new()` API 
[#10623](https://github.com/apache/datafusion/pull/10623) (alamb)
 - Minor: Remove `GetFieldAccessSchema` 
[#10665](https://github.com/apache/datafusion/pull/10665) (jayzhan211)
-- Minor: Use slice in `ConcreteTreeNode` 
[#10666](https://github.com/apache/datafusion/pull/10666) (peter-toth)
 - Move Median to `functions-aggregate` and Introduce Numeric signature 
[#10644](https://github.com/apache/datafusion/pull/10644) (jayzhan211)
 - Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. 
Introduce signature that do non-comparison coercion 
[#10268](https://github.com/apache/datafusion/pull/10268) (jayzhan211)
-- fix: pass `quote` parameter to CSV writer 
[#10671](https://github.com/apache/datafusion/pull/10671) (DDtKey)
 - Fix compilation "comparison_binary_numeric_coercion not found" 
[#10677](https://github.com/apache/datafusion/pull/10677) (alamb)
 - refactor: simplify converting List DataTypes to `ScalarValue` 
[#10675](https://github.com/apache/datafusion/pull/10675) (jonahgao)
-- feat: add substrait support for Interval types and literals 
[#10646](https://github.com/apache/datafusion/pull/10646) (waynexia)
 - Minor: Improve ObjectStoreUrl docs + examples 
[#10619](https://github.com/apache/datafusion/pull/10619) (alamb)
-- fix: CI compilation failed on substrait 
[#10683](https://github.com/apache/datafusion/pull/10683) (jonahgao)
 - Add tests for reading numeric limits in parquet statistics 
[#10642](https://github.com/apache/datafusion/pull/10642) (alamb)
 - Update nix requirement from 0.28.0 to 0.29.0 
[#10684](https://github.com/apache/datafusion/pull/10684) (dependabot[bot])
 - refactor: Move SchemaAdapter from parquet module to data source 
[#10680](https://github.com/apache/datafusion/pull/10680) (HawaiianSpork)
-- Add reference visitor `TreeNode` APIs, change `ExecutionPlan::children()` 
and `PhysicalExpr::children()` return references 
[#10543](https://github.com/apache/datafusion/pull/10543) (peter-toth)
 - Convert first, last aggregate function to UDAF 
[#10648](https://github.com/apache/datafusion/pull/10648) (mustafasrepo)
 - Minor: CastExpr Ordering Handle 
[#10650](https://github.com/apache/datafusion/pull/10650) (berkaysynnada)
 - Factor out common datafusion types into another proto file 
[#10649](https://github.com/apache/datafusion/pull/10649) (mustafasrepo)
@@ -238,7 +207,6 @@
 - Fix incorrect statistics read for unsigned integers columns in parquet 
[#10704](https://github.com/apache/datafusion/pull/10704) (xinlifoobar)
 - Separate `Partitioning` protobuf serialization code 
[#10708](https://github.com/apache/datafusion/pull/10708) (lewiszlw)
 - Support consuming Substrait with compound signature function names 
[#10653](https://github.com/apache/datafusion/pull/10653) (Blizzara)
-- Minor: Add examples of using TreeNode with `Expr` 
[#10686](https://github.com/apache/datafusion/pull/10686) (alamb)
 - Minor: Add examples of using TreeNode with `LogicalPlan` 
[#10687](https://github.com/apache/datafusion/pull/10687) (alamb)
 - Add `ParquetExec::builder()`, deprecate `ParquetExec::new` 
[#10636](https://github.com/apache/datafusion/pull/10636) (alamb)
 - feature: Add a WindowUDFImpl::simplify() API 
[#9906](https://github.com/apache/datafusion/pull/9906) (guojidan)
@@ -249,7 +217,6 @@
 - CI: Fix complaints from newer Clippy versions 
[#10725](https://github.com/apache/datafusion/pull/10725) (comphead)
 - Remove Eager Trait for Joins 
[#10721](https://github.com/apache/datafusion/pull/10721) (berkaysynnada)
 - Minor: fix signature `fn octect_length()` 
[#10726](https://github.com/apache/datafusion/pull/10726) (marvinlanhenke)
-- docs: add documents to substrait type variation consts 
[#10719](https://github.com/apache/datafusion/pull/10719) (waynexia)
 - Update rstest requirement from 0.19.0 to 0.20.0 
[#10734](https://github.com/apache/datafusion/pull/10734) (dependabot[bot])
 - Update rstest_reuse requirement from 0.6.0 to 0.7.0 
[#10733](https://github.com/apache/datafusion/pull/10733) (dependabot[bot])
 - Add example for building an external secondary index for parquet files 
[#10549](https://github.com/apache/datafusion/pull/10549) (alamb)
@@ -262,16 +229,12 @@
 - Minor: Split physical_plan/parquet/mod.rs into smaller modules 
[#10727](https://github.com/apache/datafusion/pull/10727) (alamb)
 - minor: consolidate unparser integration tests 
[#10736](https://github.com/apache/datafusion/pull/10736) (devinjdangelo)
 - Minor: Move aggregate variance to slt 
[#10750](https://github.com/apache/datafusion/pull/10750) (marvinlanhenke)
-- fix: fix string repeat for negative numbers 
[#10760](https://github.com/apache/datafusion/pull/10760) (tshauck)
-- Introduce Sum UDAF [#10651](https://github.com/apache/datafusion/pull/10651) 
(jayzhan211)
 - Extract parquet statistics from timestamps with timezones 
[#10766](https://github.com/apache/datafusion/pull/10766) (xinlifoobar)
 - Minor: Add tests for extracting dictionary parquet statistics 
[#10729](https://github.com/apache/datafusion/pull/10729) (alamb)
 - Update rstest requirement from 0.20.0 to 0.21.0 
[#10774](https://github.com/apache/datafusion/pull/10774) (dependabot[bot])
 - Minor: Refactor memory size estimation for HashTable 
[#10748](https://github.com/apache/datafusion/pull/10748) (marvinlanhenke)
 - Reduce code repetition in `datafusion/functions` mod files 
[#10700](https://github.com/apache/datafusion/pull/10700) (MohamedAbdeen21)
-- Minor: (Doc) Enable rt-multi-thread feature for sample code 
[#10770](https://github.com/apache/datafusion/pull/10770) (hsiang-c)
 - Support negatives in split part 
[#10780](https://github.com/apache/datafusion/pull/10780) (tshauck)
-- feat: support unparsing LogicalPlan::Window nodes 
[#10767](https://github.com/apache/datafusion/pull/10767) (devinjdangelo)
 - Extract parquet statistics from `LargeUtf8` columns and Add tests for `UTF8` 
And `LargeUTF8` [#10762](https://github.com/apache/datafusion/pull/10762) 
(Weijun-H)
 - Cleanup GetIndexedField 
[#10769](https://github.com/apache/datafusion/pull/10769) (lewiszlw)
 - Extract parquet statistics from f16 columns, add `ScalarValue::Float16` 
[#10763](https://github.com/apache/datafusion/pull/10763) (Lordworms)
@@ -284,10 +247,8 @@
 - minor: Refactor some unparser methods to improve readability 
[#10788](https://github.com/apache/datafusion/pull/10788) (devinjdangelo)
 - Convert variance sample to udaf 
[#10713](https://github.com/apache/datafusion/pull/10713) (yyin-dev)
 - Improve docs and fix a typo 
[#10798](https://github.com/apache/datafusion/pull/10798) (lewiszlw)
-- fix: `array_slice` and `array_element` panicked on empty args 
[#10804](https://github.com/apache/datafusion/pull/10804) (jonahgao)
 - Avoid the usage of intermediate ScalarValue to improve performance of 
extracting statistics from parquet files 
[#10711](https://github.com/apache/datafusion/pull/10711) (xinlifoobar)
 - SMJ: Add more tests and improve comments 
[#10784](https://github.com/apache/datafusion/pull/10784) (comphead)
-- feat: Update Parquet row filtering to handle type coercion 
[#10716](https://github.com/apache/datafusion/pull/10716) (jeffreyssmith2nd)
 - Handle EmptyRelation during SQL unparsing 
[#10803](https://github.com/apache/datafusion/pull/10803) (goldmedal)
 - Document Committer and PMC process 
[#10778](https://github.com/apache/datafusion/pull/10778) (alamb)
 - Int64 as default type for make_array function empty or null case 
[#10790](https://github.com/apache/datafusion/pull/10790) (jayzhan211)
@@ -301,3 +262,72 @@
 - Refactor and simplify the SQL unparser 
[#10811](https://github.com/apache/datafusion/pull/10811) (goldmedal)
 - Minor: Remove code duplication in `memory_limit` derivation for 
datafusion-cli [#10814](https://github.com/apache/datafusion/pull/10814) 
(comphead)
 - build(deps): update Arrow/Parquet to `52.0`, object-store to `0.10` 
[#10765](https://github.com/apache/datafusion/pull/10765) (waynexia)
+- chore: Prepare 39.0.0-rc1 
[#10828](https://github.com/apache/datafusion/pull/10828) (andygrove)
+
+## Credits
+
+Thank you to everyone who contributed to this release. Here is a breakdown of 
commits (PRs merged) per contributor.
+
+```
+    44 Andrew Lamb
+    18 Jay Zhan
+    14 张林伟
+    11 Andy Grove
+    11 Xin Li
+    10 Jonah Gao
+     8 Jax Liu
+     7 Mustafa Akur
+     7 Oleks V
+     7 dependabot[bot]
+     5 Arttu
+     5 Berkay Şahin
+     5 Marvin Lanhenke
+     4 Lordworms
+     4 Ruihang Xia
+     3 Bruce Ritchie
+     3 Devin D'Angelo
+     3 Duong Cong Toai
+     3 Eduard Karacharov
+     3 Junhao Liu
+     3 Liang-Chi Hsieh
+     3 Mohamed Abdeen
+     3 Nga Tran
+     3 Peter Toth
+     3 Phillip LeBlanc
+     2 Abrar Khan
+     2 Adam Curtis
+     2 Chunchun Ye
+     2 Jeffrey Vo
+     2 Michael Maletich
+     2 QP Hou
+     2 Trent Hauck
+     2 Weijie Guo
+     2 junxiangMu
+     2 yfu
+     1 Adrian Tanase
+     1 Alex Huang
+     1 Andrey Koshchiy
+     1 Artem Medvedev
+     1 ClSlaid
+     1 Dan Harris
+     1 Edmondo Porcu
+     1 Jeffrey Smith II
+     1 Kun Liu
+     1 Leonardo Yvens
+     1 Marko Milenković
+     1 Matthew Turner
+     1 Mehmet Ozan Kabak
+     1 Michael J Ward
+     1 NoeB
+     1 Samuel Colvin
+     1 Scott Anderson
+     1 VimT
+     1 Yue Yin
+     1 baishen
+     1 hsiang-c
+     1 nathaniel-daniel
+     1 shanretoo
+     1 tison
+```
+
+Thank you also to everyone who contributed in other ways such as filing 
issues, reviewing PRs, and providing feedback on this release.
diff --git a/dev/release/README.md b/dev/release/README.md
index 749af8696b..c0ba87ad39 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -57,7 +57,7 @@ See instructions at 
https://infra.apache.org/release-signing.html#generate for g
 
 Committers can add signing keys in Subversion client with their ASF account. 
e.g.:
 
-```bash
+```shell
 $ svn co https://dist.apache.org/repos/dist/dev/datafusion
 $ cd datafusion
 $ editor KEYS
@@ -66,7 +66,7 @@ $ svn ci KEYS
 
 Follow the instructions in the header of the KEYS file to append your key. 
Here is an example:
 
-```bash
+```shell
 (gpg --list-sigs "John Doe" && gpg --armor --export "John Doe") >> KEYS
 svn commit KEYS -m "Add key for John Doe"
 ```
@@ -89,35 +89,26 @@ to generate one if you do not already have one.
 
 The changelog is generated using a Python script. There is a dependency on 
`PyGitHub`, which can be installed using pip:
 
-```bash
+```shell
 pip3 install PyGitHub
 ```
 
-Run the following command to generate the changelog content.
+To generate the changelog, set the `GITHUB_TOKEN` environment variable to a 
valid token and then run the script,
+providing two commit ids or tags followed by the version number of the release 
being created. The following
+example generates a change log of all changes between the `24.0.0` tag and the 
current HEAD revision.
 
-```bash
-$ GITHUB_TOKEN=<TOKEN> ./dev/release/generate-changelog.py 24.0.0 HEAD > 
dev/changelog/25.0.0.md
+```shell
+export GITHUB_TOKEN=<your-token-here>
+./dev/release/generate-changelog.py 24.0.0 HEAD 25.0.0 > 
dev/changelog/25.0.0.md
 ```
 
 This script creates a changelog from GitHub PRs based on the labels associated 
with them as well as looking for
-titles starting with `feat:`, `fix:`, or `docs:` . The script will produce 
output similar to:
-
-```
-Fetching list of commits between 24.0.0 and HEAD
-Fetching pull requests
-Categorizing pull requests
-Generating changelog content
-```
-
-This process is not fully automated, so there are some additional manual steps:
+titles starting with `feat:`, `fix:`, or `docs:`.
 
-- Add the ASF header to the generated file
-- Add the following content (copy from the previous version's changelog and 
update as appropriate:
-
-```
-## [24.0.0](https://github.com/apache/datafusion/tree/24.0.0) (2023-05-06)
+Once the change log is generated, run `prettier` to format the document:
 
-[Full Changelog](https://github.com/apache/datafusion/compare/23.0.0...24.0.0)
+```shell
+prettier -w dev/changelog/25.0.0.md
 ```
 
 ## Prepare release commits and PR
@@ -265,7 +256,7 @@ published in the correct order as shown in this diagram.
 
 _To update this diagram, manually edit the dependencies in 
[crate-deps.dot](crate-deps.dot) and then run:_
 
-```bash
+```shell
 dot -Tsvg dev/release/crate-deps.dot > dev/release/crate-deps.svg
 ```
 
@@ -310,7 +301,7 @@ Please visit https://brew.sh/ to obtain Homebrew. In 
addition to that please che
 Before running the script make sure that you can run the following command in 
your bash to make sure
 that `brew` has been installed and configured properly:
 
-```bash
+```shell
 brew --version
 ```
 
@@ -325,7 +316,7 @@ To create a Github Personal Access Token, please visit 
https://docs.github.com/e
 
 After all of the above is complete execute the following command:
 
-```bash
+```shell
 dev/release/publish_homebrew.sh <version> <github-user> <github-token> 
<homebrew-default-branch-name>
 ```
 
@@ -368,13 +359,13 @@ Release candidates should be deleted once the release is 
published.
 
 Get a list of DataFusion release candidates:
 
-```bash
+```shell
 svn ls https://dist.apache.org/repos/dist/dev/datafusion
 ```
 
 Delete a release candidate:
 
-```bash
+```shell
 svn delete -m "delete old DataFusion RC" 
https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-38.0.0-rc1/
 ```
 
@@ -384,13 +375,13 @@ Only the latest release should be available. Delete old 
releases after publishin
 
 Get a list of DataFusion releases:
 
-```bash
+```shell
 svn ls https://dist.apache.org/repos/dist/release/datafusion
 ```
 
 Delete a release:
 
-```bash
+```shell
 svn delete -m "delete old DataFusion release" 
https://dist.apache.org/repos/dist/release/datafusion/datafusion-37.0.0
 ```
 
@@ -401,7 +392,7 @@ with a copy of the previous release announcement.
 
 Run the following commands to get the number of commits and number of unique 
contributors for inclusion in the blog post.
 
-```bash
+```shell
 git log --pretty=oneline 37.0.0..38.0.0 datafusion datafusion-cli 
datafusion-examples | wc -l
 git shortlog -sn 37.0.0..38.0.0 datafusion datafusion-cli datafusion-examples 
| wc -l
 ```
diff --git a/dev/release/generate-changelog.py 
b/dev/release/generate-changelog.py
index 424baece60..23b5942148 100755
--- a/dev/release/generate-changelog.py
+++ b/dev/release/generate-changelog.py
@@ -20,7 +20,7 @@ import sys
 from github import Github
 import os
 import re
-
+import subprocess
 
 def print_pulls(repo_name, title, pulls):
     if len(pulls)  > 0:
@@ -32,7 +32,7 @@ def print_pulls(repo_name, title, pulls):
         print()
 
 
-def generate_changelog(repo, repo_name, tag1, tag2):
+def generate_changelog(repo, repo_name, tag1, tag2, version):
 
     # get a list of commits between two tags
     print(f"Fetching list of commits between {tag1} and {tag2}", 
file=sys.stderr)
@@ -52,12 +52,12 @@ def generate_changelog(repo, repo_name, tag1, tag2):
                 all_pulls.append((pull, commit))
 
     # we split the pulls into categories
-    #TODO: make categories configurable
     breaking = []
     bugs = []
     docs = []
     enhancements = []
     performance = []
+    other = []
 
     # categorize the pull requests based on GitHub labels
     print("Categorizing pull requests", file=sys.stderr)
@@ -75,7 +75,6 @@ def generate_changelog(repo, repo_name, tag1, tag2):
             cc_breaking = parts_tuple[2] == '!'
 
         labels = [label.name for label in pull.labels]
-        #print(pull.number, labels, parts, file=sys.stderr)
         if 'api change' in labels or cc_breaking:
             breaking.append((pull, commit))
         elif 'bug' in labels or cc_type == 'fix':
@@ -84,18 +83,64 @@ def generate_changelog(repo, repo_name, tag1, tag2):
             performance.append((pull, commit))
         elif 'enhancement' in labels or cc_type == 'feat':
             enhancements.append((pull, commit))
-        elif 'documentation' in labels or cc_type == 'docs':
+        elif 'documentation' in labels or cc_type == 'docs' or cc_type == 'doc':
             docs.append((pull, commit))
+        else:
+            other.append((pull, commit))
 
     # produce the changelog content
     print("Generating changelog content", file=sys.stderr)
+
+    # ASF header
+    print("""<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->\n""")
+
+    print(f"# Apache DataFusion {version} Changelog\n")
+
+    # get the number of commits
+    commit_count = subprocess.check_output(f"git log --pretty=oneline {tag1}..{tag2} | wc -l", shell=True, text=True).strip()
+
+    # get number of contributors
+    contributor_count = subprocess.check_output(f"git shortlog -sn {tag1}..{tag2} | wc -l", shell=True, text=True).strip()
+
+    print(f"This release consists of {commit_count} commits from {contributor_count} contributors. "
+          f"See credits at the end of this changelog for more information.\n")
+
     print_pulls(repo_name, "Breaking changes", breaking)
     print_pulls(repo_name, "Performance related", performance)
     print_pulls(repo_name, "Implemented enhancements", enhancements)
     print_pulls(repo_name, "Fixed bugs", bugs)
     print_pulls(repo_name, "Documentation updates", docs)
-    print_pulls(repo_name, "Merged pull requests", all_pulls)
+    print_pulls(repo_name, "Other", other)
+
+    # show code contributions
+    credits = subprocess.check_output(f"git shortlog -sn {tag1}..{tag2}", shell=True, text=True).rstrip()
+
+    print("## Credits\n")
+    print("Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) "
+          "per contributor.\n")
+    print("```")
+    print(credits)
+    print("```\n")
 
+    print("Thank you also to everyone who contributed in other ways such as filing issues, reviewing "
+          "PRs, and providing feedback on this release.\n")
 
 def cli(args=None):
     """Process command line arguments."""
@@ -103,8 +148,9 @@ def cli(args=None):
         args = sys.argv[1:]
 
     parser = argparse.ArgumentParser()
-    parser.add_argument("tag1", help="The previous release tag (e.g. 38.0.0)")
-    parser.add_argument("tag2", help="The current release tag (e.g. HEAD)")
+    parser.add_argument("tag1", help="The previous commit or tag (e.g. 0.1.0)")
+    parser.add_argument("tag2", help="The current commit or tag (e.g. HEAD)")
+    parser.add_argument("version", help="The version number to include in the changelog")
     args = parser.parse_args()
 
     token = os.getenv("GITHUB_TOKEN")
@@ -112,7 +158,7 @@ def cli(args=None):
 
     g = Github(token)
     repo = g.get_repo(project)
-    generate_changelog(repo, project, args.tag1, args.tag2)
+    generate_changelog(repo, project, args.tag1, args.tag2, args.version)
 
 if __name__ == "__main__":
     cli()
\ No newline at end of file
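
For readers of this diff, the categorization order that generate_changelog applies after this change can be sketched as a standalone function. This is an illustration only, not code from the commit: the function name `categorize`, the `CC_TITLE` regex, and the performance condition are assumptions, since the title-parsing regex and the performance check fall outside the hunks shown above.

```python
import re

# Assumed pattern for Conventional Commit titles such as "feat(core)!: add X".
# The regex actually used by generate-changelog.py is not shown in this diff.
CC_TITLE = re.compile(r"^(\w+)(\([^)]*\))?(!)?:")


def categorize(title, labels):
    """Return the changelog section name for a pull request title and labels."""
    cc_type, cc_breaking = "", False
    match = CC_TITLE.match(title)
    if match:
        cc_type = match.group(1)
        cc_breaking = match.group(3) == "!"

    # Checks run in the same order as the if/elif chain in the diff.
    if "api change" in labels or cc_breaking:
        return "breaking"
    if "bug" in labels or cc_type == "fix":
        return "bugs"
    # Assumption: the real performance condition is outside the shown hunks.
    if "performance" in labels or cc_type == "perf":
        return "performance"
    if "enhancement" in labels or cc_type == "feat":
        return "enhancements"
    if "documentation" in labels or cc_type in ("docs", "doc"):
        return "docs"
    # New in this commit: anything uncategorized lands in "Other".
    return "other"
```

Because the checks are ordered, a pull request that is both breaking and a bug fix is listed only under breaking changes, and each pull request now appears in exactly one section.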



