[GitHub] [arrow-site] alamb commented on a change in pull request #193: ARROW-15683: [Website] [DataFusion] DataFusion 7.0.0 blog post

GitBox Tue, 15 Feb 2022 10:38:18 -0800


alamb commented on a change in pull request #193:
URL: https://github.com/apache/arrow-site/pull/193#discussion_r806271254




##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+---
+layout: post
+title: Apache Arrow DataFusion 7.0.0 Release
+date: "2022-02-14 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query 
execution framework, written in Rust, that uses Apache Arrow as its in-memory 
format.
+
+When you want to extend your Rust project with [SQL 
support](https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html), a 
DataFrame API, or the ability to read and process Parquet, JSON, Avro or CSV 
data, DataFusion is definitely worth checking out.
+
+DataFusion supports both a SQL and DataFrame API for building logical query 
plans as well as a sophisticated query optimizer and execution engine capable 
of parallel execution against memory, CSV, Parquet, Avro and JSON.

Review comment:
       ```suggestion
   DataFusion's SQL, `DataFrame`, and manual `PlanBuilder` APIs give users 
access to a sophisticated query optimizer and execution engine capable of fast, 
resource-efficient parallel execution that takes optimal advantage of today's 
multicore hardware. Being written in Rust means DataFusion can offer *both* the 
safety of dynamic languages and the resource efficiency of a compiled 
language. 
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+---
+layout: post
+title: Apache Arrow DataFusion 7.0.0 Release
+date: "2022-02-14 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query 
execution framework, written in Rust, that uses Apache Arrow as its in-memory 
format.
+
+When you want to extend your Rust project with [SQL 
support](https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html), a 
DataFrame API, or the ability to read and process Parquet, JSON, Avro or CSV 
data, DataFusion is definitely worth checking out.
+
+DataFusion supports both a SQL and DataFrame API for building logical query 
plans as well as a sophisticated query optimizer and execution engine capable 
of parallel execution against memory, CSV, Parquet, Avro and JSON.
+
+The Apache Arrow team is pleased to announce the DataFusion 7.0.0 release. 
This covers 4 months of development work
+and includes 195 commits from the following 37 distinct contributors.
+
+<!--
+git log --pretty=oneline 5.0.0..6.0.0 datafusion datafusion-cli 
datafusion-examples | wc -l
+     134
+
+git shortlog -sn 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | 
wc -l
+      29
+
+      Carlos and xudong963 are same individual
+-->
+
+```
+    44  Andrew Lamb
+    24  Kun Liu
+    23  Jiayu Liu
+    12  xudong.w
+    11  Yijie Shen
+     9  Matthew Turner
+     7  Liang-Chi Hsieh
+     5  Lin Ma
+     5  Carlos
+     4  Stephen Carman
+     4  James Katz
+     4  Dmitry Patsura
+     4  QP Hou
+     3  dependabot[bot]
+     3  Remzi Yang
+     3  Yang
+     3  ic4y
+     3  Daniël Heres
+     2  Andy Grove
+     2  Raphael Taylor-Davies
+     2  Jason Tianyi Wang
+     2  Dan Harris
+     2  Sergey Melnychuk
+     1  Nitish Tiwari
+     1  Dom
+     1  Eduard Karacharov
+     1  Javier Goday
+     1  Boaz
+     1  Marko Mikulicic
+     1  Max Burke
+     1  Carol (Nichols || Goulding)
+     1  Phillip Cloud
+     1  Rich
+     1  Toby Hede
+     1  Will Jones
+     1  r.4ntix
+     1  rdettai
+```
+
+The release notes below are not exhaustive and only expose selected highlights 
of the release. Many other bug fixes
+and improvements have been made: we refer you to the complete
+[changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md).
+
+# Summary
+
+There have been significant improvements across the board since the 6.0 
release which are summarized below.
+
+- DataFusion Crate
+  - The DataFusion crate is in the process of being split into multiple crates 
in order to decrease compilation times and improve the development experience. 
To start, datafusion-common (the core DataFusion components) and 
datafusion-expr (DataFusion expressions, functions, and operators) will be 
split out.  There will be additional splits after the 7.0 release.

Review comment:
       ```suggestion
     - The DataFusion crate is being split into multiple crates to decrease 
compilation times and improve the development experience. Initially, 
`datafusion-common` (the core DataFusion components) and `datafusion-expr` 
(DataFusion expressions, functions, and operators) have been split out. There 
will be additional splits after the 7.0 release.
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+---
+layout: post
+title: Apache Arrow DataFusion 7.0.0 Release
+date: "2022-02-14 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query 
execution framework, written in Rust, that uses Apache Arrow as its in-memory 
format.
+
+When you want to extend your Rust project with [SQL 
support](https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html), a 
DataFrame API, or the ability to read and process Parquet, JSON, Avro or CSV 
data, DataFusion is definitely worth checking out.
+
+DataFusion supports both a SQL and DataFrame API for building logical query 
plans as well as a sophisticated query optimizer and execution engine capable 
of parallel execution against memory, CSV, Parquet, Avro and JSON.
+
+The Apache Arrow team is pleased to announce the DataFusion 7.0.0 release. 
This covers 4 months of development work
+and includes 195 commits from the following 37 distinct contributors.
+
+<!--
+git log --pretty=oneline 5.0.0..6.0.0 datafusion datafusion-cli 
datafusion-examples | wc -l
+     134
+
+git shortlog -sn 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | 
wc -l
+      29
+
+      Carlos and xudong963 are same individual
+-->
+
+```
+    44  Andrew Lamb
+    24  Kun Liu
+    23  Jiayu Liu
+    12  xudong.w
+    11  Yijie Shen
+     9  Matthew Turner
+     7  Liang-Chi Hsieh
+     5  Lin Ma
+     5  Carlos
+     4  Stephen Carman
+     4  James Katz
+     4  Dmitry Patsura
+     4  QP Hou
+     3  dependabot[bot]
+     3  Remzi Yang
+     3  Yang
+     3  ic4y
+     3  Daniël Heres
+     2  Andy Grove
+     2  Raphael Taylor-Davies
+     2  Jason Tianyi Wang
+     2  Dan Harris
+     2  Sergey Melnychuk
+     1  Nitish Tiwari
+     1  Dom
+     1  Eduard Karacharov
+     1  Javier Goday
+     1  Boaz
+     1  Marko Mikulicic
+     1  Max Burke
+     1  Carol (Nichols || Goulding)
+     1  Phillip Cloud
+     1  Rich
+     1  Toby Hede
+     1  Will Jones
+     1  r.4ntix
+     1  rdettai
+```
+
+The release notes below are not exhaustive and only expose selected highlights 
of the release. Many other bug fixes
+and improvements have been made: we refer you to the complete
+[changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md).
+
+# Summary
+
+There have been significant improvements across the board since the 6.0 
release which are summarized below.
+
+- DataFusion Crate
+  - The DataFusion crate is in the process of being split into multiple crates 
in order to decrease compilation times and improve the development experience. 
To start, datafusion-common (the core DataFusion components) and 
datafusion-expr (DataFusion expressions, functions, and operators) will be 
split out.  There will be additional splits after the 7.0 release.
+- Performance Improvements and Optimizations
+  - Arrow’s dyn scalar kernels are now used which enable more efficient 
operations on DictionaryArrays 
[#1685](https://github.com/apache/arrow-datafusion/pull/1685)
+  - Switch from std::sync::Mutex to parking_lot::Mutex 
[#1720](https://github.com/apache/arrow-datafusion/pull/1720)
+- New Features
+  - Better support for limiting resource usage
+    - MemoryManager and DiskManager 
[#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - Out of core sort 
[#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - New metrics
+      - `Gauge` and `CurrentMemoryUsage` 
[#1682](https://github.com/apache/arrow-datafusion/pull/1682)
+      - `Spill_count` and `spilled_bytes` 
[#1641](https://github.com/apache/arrow-datafusion/pull/1641)
+  - New math functions
+    - `Approx_quantile` 
[#1529](https://github.com/apache/arrow-datafusion/pull/1539)
+    - `stddev` and `variance` (sample and population) 
[#1525](https://github.com/apache/arrow-datafusion/pull/1525)
+    - `corr` [#1561](https://github.com/apache/arrow-datafusion/pull/1561)
+  - Support decimal type 
[#1394](https://github.com/apache/arrow-datafusion/pull/1394)[#1407](https://github.com/apache/arrow-datafusion/pull/1407)[#1408](https://github.com/apache/arrow-datafusion/pull/1408)[#1431](https://github.com/apache/arrow-datafusion/pull/1431)[#1483](https://github.com/apache/arrow-datafusion/pull/1483)[#1554](https://github.com/apache/arrow-datafusion/pull/1554)[#1640](https://github.com/apache/arrow-datafusion/pull/1640)
+  - Support for evolved schemas 
[#1622](https://github.com/apache/arrow-datafusion/pull/1622)[#1709](https://github.com/apache/arrow-datafusion/pull/1709)

Review comment:
       ```suggestion
     - Support for reading Parquet files with evolved schemas 
[#1622](https://github.com/apache/arrow-datafusion/pull/1622)[#1709](https://github.com/apache/arrow-datafusion/pull/1709)
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+---
+layout: post
+title: Apache Arrow DataFusion 7.0.0 Release
+date: "2022-02-14 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query 
execution framework, written in Rust, that uses Apache Arrow as its in-memory 
format.
+
+When you want to extend your Rust project with [SQL 
support](https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html), a 
DataFrame API, or the ability to read and process Parquet, JSON, Avro or CSV 
data, DataFusion is definitely worth checking out.
+
+DataFusion supports both a SQL and DataFrame API for building logical query 
plans as well as a sophisticated query optimizer and execution engine capable 
of parallel execution against memory, CSV, Parquet, Avro and JSON.
+
+The Apache Arrow team is pleased to announce the DataFusion 7.0.0 release. 
This covers 4 months of development work
+and includes 195 commits from the following 37 distinct contributors.
+
+<!--
+git log --pretty=oneline 5.0.0..6.0.0 datafusion datafusion-cli 
datafusion-examples | wc -l
+     134
+
+git shortlog -sn 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | 
wc -l
+      29
+
+      Carlos and xudong963 are same individual
+-->
+
+```
+    44  Andrew Lamb
+    24  Kun Liu
+    23  Jiayu Liu
+    12  xudong.w
+    11  Yijie Shen
+     9  Matthew Turner
+     7  Liang-Chi Hsieh
+     5  Lin Ma
+     5  Carlos
+     4  Stephen Carman
+     4  James Katz
+     4  Dmitry Patsura
+     4  QP Hou
+     3  dependabot[bot]
+     3  Remzi Yang
+     3  Yang
+     3  ic4y
+     3  Daniël Heres
+     2  Andy Grove
+     2  Raphael Taylor-Davies
+     2  Jason Tianyi Wang
+     2  Dan Harris
+     2  Sergey Melnychuk
+     1  Nitish Tiwari
+     1  Dom
+     1  Eduard Karacharov
+     1  Javier Goday
+     1  Boaz
+     1  Marko Mikulicic
+     1  Max Burke
+     1  Carol (Nichols || Goulding)
+     1  Phillip Cloud
+     1  Rich
+     1  Toby Hede
+     1  Will Jones
+     1  r.4ntix
+     1  rdettai
+```
+
+The release notes below are not exhaustive and only expose selected highlights 
of the release. Many other bug fixes
+and improvements have been made: we refer you to the complete
+[changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md).
+
+# Summary
+
+There have been significant improvements across the board since the 6.0 
release which are summarized below.
+
+- DataFusion Crate
+  - The DataFusion crate is in the process of being split into multiple crates 
in order to decrease compilation times and improve the development experience. 
To start, datafusion-common (the core DataFusion components) and 
datafusion-expr (DataFusion expressions, functions, and operators) will be 
split out.  There will be additional splits after the 7.0 release.
+- Performance Improvements and Optimizations
+  - Arrow’s dyn scalar kernels are now used which enable more efficient 
operations on DictionaryArrays 
[#1685](https://github.com/apache/arrow-datafusion/pull/1685)
+  - Switch from std::sync::Mutex to parking_lot::Mutex 
[#1720](https://github.com/apache/arrow-datafusion/pull/1720)
+- New Features
+  - Better support for limiting resource usage
+    - MemoryManager and DiskManager 
[#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - Out of core sort 
[#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - New metrics
+      - `Gauge` and `CurrentMemoryUsage` 
[#1682](https://github.com/apache/arrow-datafusion/pull/1682)
+      - `Spill_count` and `spilled_bytes` 
[#1641](https://github.com/apache/arrow-datafusion/pull/1641)
+  - New math functions
+    - `Approx_quantile` 
[#1529](https://github.com/apache/arrow-datafusion/pull/1539)
+    - `stddev` and `variance` (sample and population) 
[#1525](https://github.com/apache/arrow-datafusion/pull/1525)
+    - `corr` [#1561](https://github.com/apache/arrow-datafusion/pull/1561)
+  - Support decimal type 
[#1394](https://github.com/apache/arrow-datafusion/pull/1394)[#1407](https://github.com/apache/arrow-datafusion/pull/1407)[#1408](https://github.com/apache/arrow-datafusion/pull/1408)[#1431](https://github.com/apache/arrow-datafusion/pull/1431)[#1483](https://github.com/apache/arrow-datafusion/pull/1483)[#1554](https://github.com/apache/arrow-datafusion/pull/1554)[#1640](https://github.com/apache/arrow-datafusion/pull/1640)
+  - Support for evolved schemas 
[#1622](https://github.com/apache/arrow-datafusion/pull/1622)[#1709](https://github.com/apache/arrow-datafusion/pull/1709)
+  - Support for registering `DataFrame` as table 
[#1699](https://github.com/apache/arrow-datafusion/pull/1699)
+  - Support `substring` function 
[#1621](https://github.com/apache/arrow-datafusion/pull/1621)
+  - Support `array_agg(distinct ...)` 
[#1579](https://github.com/apache/arrow-datafusion/pull/1579)
+  - Support `sort` on unprojected columns 
[#1415](https://github.com/apache/arrow-datafusion/pull/1415)
+- Additional Integration Points
+  - A new public Expression simplification API 
[#1717](https://github.com/apache/arrow-datafusion/pull/1717)
+- [DataFusion-Contrib](https://github.com/datafusion-contrib)
+  - A new GitHub organization created as a home for both `DataFusion` 
extensions and as a testing ground for new features.
+    - Extensions
+      - 
[DataFusion-Python](https://github.com/datafusion-contrib/datafusion-python)
+      - 
[DataFusion-Java](https://github.com/datafusion-contrib/datafusion-java)
+      - 
[DataFusion-hdfs-native](https://github.com/datafusion-contrib/datafusion-hdfs-native)
+      - 
[DataFusion-ObjectStore-s3](https://github.com/datafusion-contrib/datafusion-objectstore-s3)
+    - New Features
+      - 
[DataFusion-Streams](https://github.com/datafusion-contrib/datafusion-streams)
+- [Arrow2](https://github.com/jorgecarleitao/arrow2)
+  - An [Arrow2 Branch](https://github.com/apache/arrow-datafusion/tree/arrow2) 
has been created.  There are ongoing discussions in 
[DataFusion](https://github.com/apache/arrow-datafusion/issues/1532) and 
[arrow-rs](https://github.com/apache/arrow-rs/issues/1176) about migrating 
`DataFusion` to `Arrow2`
+
+For the full list of new features with their relevant PRs, see the
+[enhancements 
section](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md)
+in the changelog.
+
+# Documentation and Roadmap
+
+The project's documentation is being consolidated into the official site.  You 
can find more details there on topics such as the SQL status (TO DO LINK) and a 
user guide.
+
+To provide transparency on DataFusion’s priorities to users and developers a 
three month roadmap will be published at the beginning of each quarter.  This 
can be found here (TO DO LINK once site is updated).  
+
+See full details on DataFusion’s ambitions (TO DO LINK).
+
+# Upcoming Attractions
+
+- Ballista is gaining momentum, and several groups are now evaluating and 
contributing to the project.
+  - Some of the proposed improvements
+    - [Improvements 
Overview](https://github.com/apache/arrow-datafusion/issues/1701)
+    - [Extensibility](https://github.com/apache/arrow-datafusion/issues/1675)
+    - [File system 
access](https://github.com/apache/arrow-datafusion/issues/1702)
+    - [Cluster state](https://github.com/apache/arrow-datafusion/issues/1704)
+- Continued improvements for working with limited resources and large datasets
+  - Memory limited 
joins [#1599](https://github.com/apache/arrow-datafusion/issues/1599)
+  - Sort-merge 
join [#141](https://github.com/apache/arrow-datafusion/issues/141) [#1776](https://github.com/apache/arrow-datafusion/pull/1776)
+  - Introduce row based bytes representation 
[#1708](https://github.com/apache/arrow-datafusion/pull/1708)
+
+# How to Get Involved
+
+If you are interested in contributing to DataFusion, we would love to have 
you! You
+can help by trying out DataFusion on some of your own data and projects and 
filing bug reports and helping to
+improve the documentation, or contribute to the documentation, tests or code. 
A list of open issues suitable for
+beginners is 
[here](https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)

Review comment:
       ```suggestion
   can help by trying out DataFusion on some of your own data and projects and 
letting us know how it goes, or by contributing a PR with documentation, tests or code. A 
list of open issues suitable for
   beginners is 
[here](https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+---
+layout: post
+title: Apache Arrow DataFusion 7.0.0 Release
+date: "2022-02-14 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query 
execution framework, written in Rust, that uses Apache Arrow as its in-memory 
format.
+
+When you want to extend your Rust project with [SQL 
support](https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html), a 
DataFrame API, or the ability to read and process Parquet, JSON, Avro or CSV 
data, DataFusion is definitely worth checking out.
+
+DataFusion supports both a SQL and DataFrame API for building logical query 
plans as well as a sophisticated query optimizer and execution engine capable 
of parallel execution against memory, CSV, Parquet, Avro and JSON.
+
+The Apache Arrow team is pleased to announce the DataFusion 7.0.0 release. 
This covers 4 months of development work
+and includes 195 commits from the following 37 distinct contributors.
+
+<!--
+git log --pretty=oneline 5.0.0..6.0.0 datafusion datafusion-cli 
datafusion-examples | wc -l
+     134
+
+git shortlog -sn 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | 
wc -l
+      29
+
+      Carlos and xudong963 are same individual
+-->
+
+```
+    44  Andrew Lamb
+    24  Kun Liu
+    23  Jiayu Liu
+    12  xudong.w
+    11  Yijie Shen
+     9  Matthew Turner
+     7  Liang-Chi Hsieh
+     5  Lin Ma
+     5  Carlos
+     4  Stephen Carman
+     4  James Katz
+     4  Dmitry Patsura
+     4  QP Hou
+     3  dependabot[bot]
+     3  Remzi Yang
+     3  Yang
+     3  ic4y
+     3  Daniël Heres
+     2  Andy Grove
+     2  Raphael Taylor-Davies
+     2  Jason Tianyi Wang
+     2  Dan Harris
+     2  Sergey Melnychuk
+     1  Nitish Tiwari
+     1  Dom
+     1  Eduard Karacharov
+     1  Javier Goday
+     1  Boaz
+     1  Marko Mikulicic
+     1  Max Burke
+     1  Carol (Nichols || Goulding)
+     1  Phillip Cloud
+     1  Rich
+     1  Toby Hede
+     1  Will Jones
+     1  r.4ntix
+     1  rdettai
+```
+
+The release notes below are not exhaustive and only expose selected highlights 
of the release. Many other bug fixes
+and improvements have been made: we refer you to the complete
+[changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md).
+
+# Summary
+
+There have been significant improvements across the board since the 6.0 
release which are summarized below.
+
+- DataFusion Crate
+  - The DataFusion crate is in the process of being split into multiple crates 
in order to decrease compilation times and improve the development experience. 
To start, datafusion-common (the core DataFusion components) and 
datafusion-expr (DataFusion expressions, functions, and operators) will be 
split out.  There will be additional splits after the 7.0 release.
+- Performance Improvements and Optimizations
+  - Arrow’s dyn scalar kernels are now used which enable more efficient 
operations on DictionaryArrays 
[#1685](https://github.com/apache/arrow-datafusion/pull/1685)
+  - Switch from std::sync::Mutex to parking_lot::Mutex 
[#1720](https://github.com/apache/arrow-datafusion/pull/1720)
+- New Features
+  - Better support for limiting resource usage
+    - MemoryManager and DiskManager 
[#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - Out of core sort 
[#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - New metrics
+      - `Gauge` and `CurrentMemoryUsage` 
[#1682](https://github.com/apache/arrow-datafusion/pull/1682)
+      - `Spill_count` and `spilled_bytes` 
[#1641](https://github.com/apache/arrow-datafusion/pull/1641)
+  - New math functions
+    - `Approx_quantile` 
[#1529](https://github.com/apache/arrow-datafusion/pull/1539)
+    - `stddev` and `variance` (sample and population) 
[#1525](https://github.com/apache/arrow-datafusion/pull/1525)
+    - `corr` [#1561](https://github.com/apache/arrow-datafusion/pull/1561)
+  - Support decimal type 
[#1394](https://github.com/apache/arrow-datafusion/pull/1394)[#1407](https://github.com/apache/arrow-datafusion/pull/1407)[#1408](https://github.com/apache/arrow-datafusion/pull/1408)[#1431](https://github.com/apache/arrow-datafusion/pull/1431)[#1483](https://github.com/apache/arrow-datafusion/pull/1483)[#1554](https://github.com/apache/arrow-datafusion/pull/1554)[#1640](https://github.com/apache/arrow-datafusion/pull/1640)
+  - Support for evolved schemas 
[#1622](https://github.com/apache/arrow-datafusion/pull/1622)[#1709](https://github.com/apache/arrow-datafusion/pull/1709)
+  - Support for registering `DataFrame` as table 
[#1699](https://github.com/apache/arrow-datafusion/pull/1699)
+  - Support `substring` function 
[#1621](https://github.com/apache/arrow-datafusion/pull/1621)
+  - Support `array_agg(distinct ...)` 
[#1579](https://github.com/apache/arrow-datafusion/pull/1579)
+  - Support `sort` on unprojected columns 
[#1415](https://github.com/apache/arrow-datafusion/pull/1415)
+- Additional Integration Points
+  - A new public Expression simplification API 
[#1717](https://github.com/apache/arrow-datafusion/pull/1717)
+- [DataFusion-Contrib](https://github.com/datafusion-contrib)
+  - A new GitHub organization created as a home for both `DataFusion` 
extensions and as a testing ground for new features.
+    - Extensions
+      - 
[DataFusion-Python](https://github.com/datafusion-contrib/datafusion-python)
+      - 
[DataFusion-Java](https://github.com/datafusion-contrib/datafusion-java)
+      - 
[DataFusion-hdfs-native](https://github.com/datafusion-contrib/datafusion-hdfs-native)
+      - 
[DataFusion-ObjectStore-s3](https://github.com/datafusion-contrib/datafusion-objectstore-s3)
+    - New Features
+      - 
[DataFusion-Streams](https://github.com/datafusion-contrib/datafusion-streams)
+- [Arrow2](https://github.com/jorgecarleitao/arrow2)
+  - An [Arrow2 Branch](https://github.com/apache/arrow-datafusion/tree/arrow2) 
has been created.  There are ongoing discussions in 
[DataFusion](https://github.com/apache/arrow-datafusion/issues/1532) and 
[arrow-rs](https://github.com/apache/arrow-rs/issues/1176) about migrating 
`DataFusion` to `Arrow2`
+
+For the full list of new features with their relevant PRs, see the
+[enhancements 
section](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md)
+in the changelog.
+
+# Documentation and Roadmap
+
+The project's documentation is being consolidated into the official site.  You 
can find more details there on topics such as the SQL status (TO DO LINK) and a 
user guide.
+
+To provide transparency on DataFusion’s priorities to users and developers a 
three month roadmap will be published at the beginning of each quarter.  This 
can be found here (TO DO LINK once site is updated).  
+
+See full details on DataFusion’s ambitions (TO DO LINK).
+
+# Upcoming Attractions
+
+- Ballista is gaining momentum, and several groups are now evaluating and 
contributing to the project.
+  - Some of the proposed improvements
+    - [Improvements 
Overview](https://github.com/apache/arrow-datafusion/issues/1701)
+    - [Extensibility](https://github.com/apache/arrow-datafusion/issues/1675)
+    - [File system 
access](https://github.com/apache/arrow-datafusion/issues/1702)
+    - [Cluster state](https://github.com/apache/arrow-datafusion/issues/1704)
+- Continued improvements for working with limited resources and large datasets
+  - Memory limited 
joins [#1599](https://github.com/apache/arrow-datafusion/issues/1599)
+  - Sort-merge 
join [#141](https://github.com/apache/arrow-datafusion/issues/141) [#1776](https://github.com/apache/arrow-datafusion/pull/1776)
+  - Introduce row based bytes representation 
[#1708](https://github.com/apache/arrow-datafusion/pull/1708)
+
+# How to Get Involved
+
+If you are interested in contributing to DataFusion, we would love to have 
you! You

Review comment:
       ```suggestion
   If you are interested in contributing to DataFusion and learning about 
state-of-the-art 
   query processing, we would love to have you join us on the journey! 
You
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+---
+layout: post
+title: Apache Arrow DataFusion 7.0.0 Release
+date: "2022-02-14 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query 
execution framework, written in Rust, that uses Apache Arrow as its in-memory 
format.
+
+When you want to extend your Rust project with [SQL 
support](https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html), a 
DataFrame API, or the ability to read and process Parquet, JSON, Avro or CSV 
data, DataFusion is definitely worth checking out.
+
+DataFusion supports both a SQL and DataFrame API for building logical query 
plans as well as a sophisticated query optimizer and execution engine capable 
of parallel execution against memory, CSV, Parquet, Avro and JSON.
+
+The Apache Arrow team is pleased to announce the DataFusion 7.0.0 release. This covers 4 months of development work
+and includes 195 commits from the following 37 distinct contributors.
+
+<!--
+git log --pretty=oneline 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | wc -l
+     134
+
+git shortlog -sn 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | wc -l
+      29
+
+      Carlos and xudong963 are same individual
+-->
+
+```
+    44  Andrew Lamb
+    24  Kun Liu
+    23  Jiayu Liu
+    12  xudong.w
+    11  Yijie Shen
+     9  Matthew Turner
+     7  Liang-Chi Hsieh
+     5  Lin Ma
+     5  Carlos
+     4  Stephen Carman
+     4  James Katz
+     4  Dmitry Patsura
+     4  QP Hou
+     3  dependabot[bot]
+     3  Remzi Yang
+     3  Yang
+     3  ic4y
+     3  Daniël Heres
+     2  Andy Grove
+     2  Raphael Taylor-Davies
+     2  Jason Tianyi Wang
+     2  Dan Harris
+     2  Sergey Melnychuk
+     1  Nitish Tiwari
+     1  Dom
+     1  Eduard Karacharov
+     1  Javier Goday
+     1  Boaz
+     1  Marko Mikulicic
+     1  Max Burke
+     1  Carol (Nichols || Goulding)
+     1  Phillip Cloud
+     1  Rich
+     1  Toby Hede
+     1  Will Jones
+     1  r.4ntix
+     1  rdettai
+```
+
+The release notes below are not exhaustive and only expose selected highlights of the release. Many other bug fixes
+and improvements have been made: we refer you to the complete
+[changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md).
+
+# Summary
+
+There have been significant improvements across the board since the 6.0 release which are summarized below.

Review comment:
       ```suggestion
   ```
   
   I am not sure this sentence adds much, since a very similar thought appears immediately above it.

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+The release notes below are not exhaustive and only expose selected highlights of the release. Many other bug fixes
+and improvements have been made: we refer you to the complete
+[changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md).

Review comment:
       ```suggestion
   The following section highlights some of the improvements in this release. Of course, many other bug fixes
   and improvements have also been made, and we refer you to the complete
   [changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md) for full details.
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+# Summary
+
+There have been significant improvements across the board since the 6.0 release which are summarized below.
+
+- DataFusion Crate
+  - The DataFusion crate is in the process of being split into multiple crates in order to decrease compilation times and improve the development experience. To start, datafusion-common (the core DataFusion components) and datafusion-expr (DataFusion expressions, functions, and operators) will be split out.  There will be additional splits after the 7.0 release.
+- Performance Improvements and Optimizations
+  - Arrow’s dyn scalar kernels are now used which enable more efficient operations on DictionaryArrays [#1685](https://github.com/apache/arrow-datafusion/pull/1685)
+  - Switch from std::sync::Mutex to parking_lot::Mutex [#1720](https://github.com/apache/arrow-datafusion/pull/1720)
+- New Features
+  - Better support for limiting resource usage
+    - MemoryManager and DiskManager [#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - Out of core sort [#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - New metrics
+      - `Gauge` and `CurrentMemoryUsage` [#1682](https://github.com/apache/arrow-datafusion/pull/1682)
+      - `spill_count` and `spilled_bytes` [#1641](https://github.com/apache/arrow-datafusion/pull/1641)
+  - New math functions
+    - `approx_quantile` [#1529](https://github.com/apache/arrow-datafusion/pull/1539)
+    - `stddev` and `variance` (sample and population) [#1525](https://github.com/apache/arrow-datafusion/pull/1525)
+    - `corr` [#1561](https://github.com/apache/arrow-datafusion/pull/1561)
+  - Support decimal type [#1394](https://github.com/apache/arrow-datafusion/pull/1394) [#1407](https://github.com/apache/arrow-datafusion/pull/1407) [#1408](https://github.com/apache/arrow-datafusion/pull/1408) [#1431](https://github.com/apache/arrow-datafusion/pull/1431) [#1483](https://github.com/apache/arrow-datafusion/pull/1483) [#1554](https://github.com/apache/arrow-datafusion/pull/1554) [#1640](https://github.com/apache/arrow-datafusion/pull/1640)
+  - Support for evolved schemas [#1622](https://github.com/apache/arrow-datafusion/pull/1622) [#1709](https://github.com/apache/arrow-datafusion/pull/1709)
+  - Support for registering `DataFrame` as table [#1699](https://github.com/apache/arrow-datafusion/pull/1699)
+  - Support `substring` function [#1621](https://github.com/apache/arrow-datafusion/pull/1621)
+  - Support `array_agg(distinct ...)` [#1579](https://github.com/apache/arrow-datafusion/pull/1579)
+  - Support `sort` on unprojected columns [#1415](https://github.com/apache/arrow-datafusion/pull/1415)
+- Additional Integration Points
+  - A new public Expression simplification API [#1717](https://github.com/apache/arrow-datafusion/pull/1717)
+- [DataFusion-Contrib](https://github.com/datafusion-contrib)
+  - A new GitHub organization created as a home for `DataFusion` extensions and as a testing ground for new features.
+    - Extensions
+      - [DataFusion-Python](https://github.com/datafusion-contrib/datafusion-python)
+      - [DataFusion-Java](https://github.com/datafusion-contrib/datafusion-java)
+      - [DataFusion-hdfs-native](https://github.com/datafusion-contrib/datafusion-hdfs-native)
+      - [DataFusion-ObjectStore-s3](https://github.com/datafusion-contrib/datafusion-objectstore-s3)
+    - New Features
+      - [DataFusion-Streams](https://github.com/datafusion-contrib/datafusion-streams)
+- [Arrow2](https://github.com/jorgecarleitao/arrow2)
+  - An [Arrow2 Branch](https://github.com/apache/arrow-datafusion/tree/arrow2) has been created.  There are ongoing discussions in [DataFusion](https://github.com/apache/arrow-datafusion/issues/1532) and [arrow-rs](https://github.com/apache/arrow-rs/issues/1176) about migrating `DataFusion` to `Arrow2`
+
+For the full list of new features with their relevant PRs, see the
+[enhancements section](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md)
+in the changelog.
+
+# Documentation and Roadmap
+
+The project's documentation is being consolidated into the official site.  You can find more details there on topics such as the SQL status (TO DO LINK) and a user guide.
+
+To provide transparency on DataFusion’s priorities to users and developers a three month roadmap will be published at the beginning of each quarter.  This can be found here (TO DO LINK once site is updated).  

Review comment:
       ```suggestion
   To provide transparency on DataFusion’s priorities to users and developers, a three-month roadmap will be published at the beginning of each quarter.  This can be found [here](https://arrow.apache.org/datafusion/specification/roadmap.html).  
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+  - Arrow’s dyn scalar kernels are now used which enable more efficient operations on DictionaryArrays [#1685](https://github.com/apache/arrow-datafusion/pull/1685)

Review comment:
       ```suggestion
     - Arrow’s dyn scalar kernels are now used to enable efficient operations on `DictionaryArray`s [#1685](https://github.com/apache/arrow-datafusion/pull/1685)
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+For the full list of new features with their relevant PRs, see the
+[enhancements section](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md)
+in the changelog.

Review comment:
       ```suggestion
   ```
   I think this is redundant with the lead above

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+  - Suppot `substring` function [#1621](https://github.com/apache/arrow-datafusion/pull/1621)

Review comment:
       ```suggestion
     - Support for the `substring` function [#1621](https://github.com/apache/arrow-datafusion/pull/1621)
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+---
+layout: post
+title: Apache Arrow DataFusion 6.0.0 Release
+date: "2022-02-14 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+When you want to extend your Rust project with [SQL support](https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html), a DataFrame API, or the ability to read and process Parquet, JSON, Avro or CSV data, DataFusion is definitely worth checking out.
+
+DataFusion supports both a SQL and DataFrame API for building logical query plans as well as a sophisticated query optimizer and execution engine capable of parallel execution against memory, CSV, Parquet, Avro and JSON.
+
+The Apache Arrow team is pleased to announce the DataFusion 7.0.0 release. This covers 4 months of development work
+and includes 195 commits from the following 37 distinct contributors.
+
+<!--
+git log --pretty=oneline 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | wc -l
+     134
+
+git shortlog -sn 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | wc -l
+      29
+
+      Carlos and xudong963 are same individual
+-->
+
+```
+    44  Andrew Lamb
+    24  Kun Liu
+    23  Jiayu Liu
+    12  xudong.w
+    11  Yijie Shen
+     9  Matthew Turner
+     7  Liang-Chi Hsieh
+     5  Lin Ma
+     5  Carlos
+     4  Stephen Carman
+     4  James Katz
+     4  Dmitry Patsura
+     4  QP Hou
+     3  dependabot[bot]
+     3  Remzi Yang
+     3  Yang
+     3  ic4y
+     3  Daniël Heres
+     2  Andy Grove
+     2  Raphael Taylor-Davies
+     2  Jason Tianyi Wang
+     2  Dan Harris
+     2  Sergey Melnychuk
+     1  Nitish Tiwari
+     1  Dom
+     1  Eduard Karacharov
+     1  Javier Goday
+     1  Boaz
+     1  Marko Mikulicic
+     1  Max Burke
+     1  Carol (Nichols || Goulding)
+     1  Phillip Cloud
+     1  Rich
+     1  Toby Hede
+     1  Will Jones
+     1  r.4ntix
+     1  rdettai
+```
+
+The release notes below are not exhaustive and only expose selected highlights of the release. Many other bug fixes
+and improvements have been made: we refer you to the complete
+[changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md).
+
+# Summary
+
+There have been significant improvements across the board since the 6.0 release which are summarized below.
+
+- DataFusion Crate
+  - The DataFusion crate is in the process of being split into multiple crates in order to decrease compilation times and improve the development experience. To start, datafusion-common (the core DataFusion components) and datafusion-expr (DataFusion expressions, functions, and operators) will be split out.  There will be additional splits after the 7.0 release.
+- Performance Improvements and Optimizations
+  - Arrow’s dyn scalar kernels are now used which enable more efficient operations on DictionaryArrays [#1685](https://github.com/apache/arrow-datafusion/pull/1685)
+  - Switch from std::sync::Mutex to parking_lot::Mutex [#1720](https://github.com/apache/arrow-datafusion/pull/1720)

Review comment:
       ```suggestion
     - Switch from `std::sync::Mutex` to `parking_lot::Mutex` [#1720](https://github.com/apache/arrow-datafusion/pull/1720)
   ```
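
For readers unfamiliar with the difference, here is a minimal sketch (not taken from the PR) of the call-site pattern with `std::sync::Mutex`. Its `lock()` returns a `LockResult` because the mutex can be poisoned, so every call needs an `unwrap()` or explicit error handling. `parking_lot::Mutex::lock()` returns the guard directly, so the same code would simply drop the `unwrap()`. The function name `increment_shared` and the counter workload are hypothetical, chosen only for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: spawns `threads` workers that each add 1 to a
/// shared counter `per_thread` times, then returns the final total.
fn increment_shared(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // std::sync::Mutex::lock returns a LockResult because the
                    // lock can be poisoned; callers must unwrap or handle it.
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Bind the value first so the guard is released before returning.
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", increment_shared(4, 1000));
}
```

Only the standard library is used here so the sketch stays self-contained; whether the switch pays off in practice depends on contention, which this toy workload does not measure.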

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+- New Features
+  - Better support for limiting resource usage
+    - MemoryMananger and DiskManager [#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - Out of core sort [#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - New metrics
+      - `Gauge` and `CurrentMemoryUsage` [#1682](https://github.com/apache/arrow-datafusion/pull/1682)
+      - `Spill_count` and `spilled_bytes` [#1641](https://github.com/apache/arrow-datafusion/pull/1641)
+  - New math functions
+    - `Approx_quantile` [#1529](https://github.com/apache/arrow-datafusion/pull/1539)
+    - `stddev` and `variance` (sample and population) [#1525](https://github.com/apache/arrow-datafusion/pull/1525)
+    - `corr` [#1561](https://github.com/apache/arrow-datafusion/pull/1561)
+  - Support decimal type [#1394](https://github.com/apache/arrow-datafusion/pull/1394)[#1407](https://github.com/apache/arrow-datafusion/pull/1407)[#1408](https://github.com/apache/arrow-datafusion/pull/1408)[#1431](https://github.com/apache/arrow-datafusion/pull/1431)[#1483](https://github.com/apache/arrow-datafusion/pull/1483)[#1554](https://github.com/apache/arrow-datafusion/pull/1554)[#1640](https://github.com/apache/arrow-datafusion/pull/1640)
+  - Support for evolved schemas [#1622](https://github.com/apache/arrow-datafusion/pull/1622)[#1709](https://github.com/apache/arrow-datafusion/pull/1709)
+  - Support for registering `DataFrame` as table [#1699](https://github.com/apache/arrow-datafusion/pull/1699)
+  - Suppot `substring` function [#1621](https://github.com/apache/arrow-datafusion/pull/1621)
+  - Support `array_agg(distinct ...)` [#1579](https://github.com/apache/arrow-datafusion/pull/1579)
+  - Support `sort` on unprojected columns [#1415](https://github.com/apache/arrow-datafusion/pull/1415)
+- Additional Integration Points
+  - A new public Expression simplification API [#1717](https://github.com/apache/arrow-datafusion/pull/1717)
+- [DataFusion-Contrib](https://github.com/datafusion-contrib)
+  - A new GitHub organization created as a home for both `DataFusion` extensions and as a testing ground for new features.
+    - Extensions
+      - [DataFusion-Python](https://github.com/datafusion-contrib/datafusion-python)
+      - [DataFusion-Java](https://github.com/datafusion-contrib/datafusion-java)
+      - [DataFusion-hdsfs-native](https://github.com/datafusion-contrib/datafusion-hdfs-native)
+      - [DataFusion-ObjectStore-s3](https://github.com/datafusion-contrib/datafusion-objectstore-s3)
+    - New Features
+      - [DataFusion-Streams](https://github.com/datafusion-contrib/datafusion-streams)
+- [Arrow2](https://github.com/jorgecarleitao/arrow2)
+  - An [Arrow2 Branch](https://github.com/apache/arrow-datafusion/tree/arrow2) has been created.  There are ongoing discussions in [DataFusion](https://github.com/apache/arrow-datafusion/issues/1532) and [arrow-rs](https://github.com/apache/arrow-rs/issues/1176) about migrating `DataFusion` to `Arrow2`
+
+For the full list of new features with their relevant PRs, see the
+[enhancements section](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md)
+in the changelog.
+
+# Documentation and Roadmap
+
+The project's documentation is being consolidated into the official site.  You can find more details there on topics such as the SQL status (TO DO LINK) and a user guide.
+
+To provide transparency on DataFusion’s priorities to users and developers a three month roadmap will be published at the beginning of each quarter.  This can be found here (TO DO LINK once site is updated).
+
+See full details on DataFusion’s ambitions (TO DO LINK).
+

Review comment:
       ```suggestion
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+# Documentation and Roadmap
+
+The project's documentation is being consolidated into the official site.  You can find more details there on topics such as the SQL status (TO DO LINK) and a user guide.

Review comment:
       ```suggestion
   We are working to consolidate the documentation into the [official site](https://arrow.apache.org/datafusion).  You can find more details there on topics such as the [SQL status](https://arrow.apache.org/datafusion/user-guide/sql/index.html) and a [user guide](https://arrow.apache.org/datafusion/user-guide/introduction.html#introduction). This is also an area where we would love help from the broader community.
   ```

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+- [DataFusion-Contrib](https://github.com/datafusion-contrib)
+  - A new GitHub organization created as a home for both `DataFusion` extensions and as a testing ground for new features.

Review comment:
       ❤️ 

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+DataFusion supports both a SQL and DataFrame API for building logical query plans as well as a sophisticated query optimizer and execution engine capable of parallel execution against memory, CSV, Parquet, Avro and JSON.

Review comment:
       🤔  I was trying to accentuate the positives here / keep readers 
interested. Maybe I have gone too far 🤔 

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,154 @@
+DataFusion's  SQL, `DataFrame`, and manual `PlanBuilder` API let users access a sophisticated query optimizer and execution engine capable of fast, resource efficient, and parallel execution that takes optimal advantage of todays multicore hardware. Being written in Rust means DataFusion can offer *both* the safety of dynamic languages as well as the resource efficiency of a compiled language.
+
+The following section highlights some of the improvements in this release. Of course, many other bug fixes and improvements have also been made and we refer you to the complete [changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md) for the full detail.

Review comment:
       That is correct -- I will create a `7.0.0` tag in the datafusion repo once the release has been published.
   
   Right now, you can preview the changelog here:
   
   https://github.com/apache/arrow-datafusion/blob/7.0.0-rc2/datafusion/CHANGELOG.md

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,154 @@
+---
+layout: post
+title: Apache Arrow DataFusion 6.0.0 Release

Review comment:
       that is a *great* catch. 🦅  👁️  👍 

##########
File path: _posts/2022-02-14-datafusion-7.0.0.md
##########
@@ -0,0 +1,166 @@
+---
+layout: post
+title: Apache Arrow DataFusion 6.0.0 Release
+date: "2022-02-14 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+# Introduction
+
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+When you want to extend your Rust project with [SQL support](https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html), a DataFrame API, or the ability to read and process Parquet, JSON, Avro or CSV data, DataFusion is definitely worth checking out.
+
+DataFusion supports both a SQL and DataFrame API for building logical query plans as well as a sophisticated query optimizer and execution engine capable of parallel execution against memory, CSV, Parquet, Avro and JSON.
+
+The Apache Arrow team is pleased to announce the DataFusion 7.0.0 release. This covers 4 months of development work
+and includes 195 commits from the following 37 distinct contributors.
+
+<!--
+git log --pretty=oneline 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | wc -l
+     134
+
+git shortlog -sn 5.0.0..6.0.0 datafusion datafusion-cli datafusion-examples | wc -l
+      29
+
+      Carlos and xudong963 are same individual
+-->
+
+```
+    44  Andrew Lamb
+    24  Kun Liu
+    23  Jiayu Liu
+    12  xudong.w
+    11  Yijie Shen
+     9  Matthew Turner
+     7  Liang-Chi Hsieh
+     5  Lin Ma
+     5  Carlos
+     4  Stephen Carman
+     4  James Katz
+     4  Dmitry Patsura
+     4  QP Hou
+     3  dependabot[bot]
+     3  Remzi Yang
+     3  Yang
+     3  ic4y
+     3  Daniël Heres
+     2  Andy Grove
+     2  Raphael Taylor-Davies
+     2  Jason Tianyi Wang
+     2  Dan Harris
+     2  Sergey Melnychuk
+     1  Nitish Tiwari
+     1  Dom
+     1  Eduard Karacharov
+     1  Javier Goday
+     1  Boaz
+     1  Marko Mikulicic
+     1  Max Burke
+     1  Carol (Nichols || Goulding)
+     1  Phillip Cloud
+     1  Rich
+     1  Toby Hede
+     1  Will Jones
+     1  r.4ntix
+     1  rdettai
+```
+
+The release notes below are not exhaustive and only expose selected highlights of the release. Many other bug fixes
+and improvements have been made: we refer you to the complete
+[changelog](https://github.com/apache/arrow-datafusion/blob/7.0.0/datafusion/CHANGELOG.md).
+
+# Summary
+
+There have been significant improvements across the board since the 6.0 release which are summarized below.
+
+- DataFusion Crate
+  - The DataFusion crate is in the process of being split into multiple crates in order to decrease compilation times and improve the development experience. To start, datafusion-common (the core DataFusion components) and datafusion-expr (DataFusion expressions, functions, and operators) will be split out.  There will be additional splits after the 7.0 release.
+- Performance Improvements and Optimizations
+  - Arrow’s dyn scalar kernels are now used which enable more efficient operations on DictionaryArrays [#1685](https://github.com/apache/arrow-datafusion/pull/1685)
+  - Switch from std::sync::Mutex to parking_lot::Mutex [#1720](https://github.com/apache/arrow-datafusion/pull/1720)
+- New Features
+  - Better support for limiting resource usage
+    - MemoryMananger and DiskManager [#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - Out of core sort [#1526](https://github.com/apache/arrow-datafusion/pull/1526)
+    - New metrics
+      - `Gauge` and `CurrentMemoryUsage` [#1682](https://github.com/apache/arrow-datafusion/pull/1682)
+      - `Spill_count` and `spilled_bytes` [#1641](https://github.com/apache/arrow-datafusion/pull/1641)
+  - New math functions
+    - `Approx_quantile` [#1529](https://github.com/apache/arrow-datafusion/pull/1539)
+    - `stddev` and `variance` (sample and population) [#1525](https://github.com/apache/arrow-datafusion/pull/1525)
+    - `corr` [#1561](https://github.com/apache/arrow-datafusion/pull/1561)
+  - Support decimal type [#1394](https://github.com/apache/arrow-datafusion/pull/1394)[#1407](https://github.com/apache/arrow-datafusion/pull/1407)[#1408](https://github.com/apache/arrow-datafusion/pull/1408)[#1431](https://github.com/apache/arrow-datafusion/pull/1431)[#1483](https://github.com/apache/arrow-datafusion/pull/1483)[#1554](https://github.com/apache/arrow-datafusion/pull/1554)[#1640](https://github.com/apache/arrow-datafusion/pull/1640)
+  - Support for evolved schemas [#1622](https://github.com/apache/arrow-datafusion/pull/1622)[#1709](https://github.com/apache/arrow-datafusion/pull/1709)

Review comment:
       I think @thinkharderdev concluded that schema merging already worked for CSV and JSON files, but I may have misunderstood.



