[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-08-03 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r682066764



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,267 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-29 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+3 months of development work and includes **684 commits** from
+[**99 distinct contributors**][1] in 2 repositories. See the Install Page to
+learn how to get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the complete changelogs for the [`apache/arrow`][2] and
+[`apache/arrow-rs`][3] repositories.
+
+## Community
+
+Since the 4.0.0 release, Daniël Heres, Kazuaki Ishizaki, Dominik Moritz, and Weston Pace
+have been invited as committers to Arrow,
+and Benjamin Kietzman and David Li have joined the Project Management Committee
+(PMC). Thank you for all of your contributions!
+
+## Columnar Format Notes
+
+Official IANA Media types (MIME types) have been registered for Apache
+Arrow IPC protocol data, both [stream]({{ site.baseurl }}/docs/format/Columnar.html#ipc-streaming-format)
+and [file]({{ site.baseurl }}/docs/format/Columnar.html#ipc-file-format) variants:
+
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.stream
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.file
+
+We recommend `.arrow` as the IPC file format file extension and `.arrows` for
+the IPC streaming format file extension.
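
As a rough illustration of the extension recommendation above, the following Python sketch maps the two recommended extensions to the registered media types using only the standard library. The helper name `arrow_media_type` is hypothetical and is not part of any Arrow package; the media type strings are taken from the IANA URLs listed above.

```python
import mimetypes

# Media types registered with IANA for the two Arrow IPC variants.
ARROW_FILE = "application/vnd.apache.arrow.file"
ARROW_STREAM = "application/vnd.apache.arrow.stream"

# Register the recommended extensions in Python's own mimetypes table.
mimetypes.add_type(ARROW_FILE, ".arrow")     # IPC file format
mimetypes.add_type(ARROW_STREAM, ".arrows")  # IPC streaming format


def arrow_media_type(path):
    """Hypothetical helper: return the Arrow IPC media type for a path, or None."""
    media_type, _encoding = mimetypes.guess_type(path)
    if media_type in (ARROW_FILE, ARROW_STREAM):
        return media_type
    return None
```

A web server or object-store uploader could use a lookup like this to set the `Content-Type` header from the file name alone.
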
+
+## Arrow Flight RPC notes
+
+The Go implementation now supports custom metadata and middleware, and has
+been added to integration testing.
+
+In Python, some operations can now be interrupted via Control-C.
+
+## C++ notes
+
+`MakeArrayFromScalar` now works for fixed-size binary types (ARROW-13321).
+
+### Compute layer
+
+The following [compute functions]({{site.baseurl}}/docs/cpp/compute.html)
+were added:
+
+* aggregations: `index`
+
+* scalar arithmetic and math functions: `abs`, `abs_checked`, `acos`,
+  `acos_checked`, `asin`, `asin_checked`, `atan`, `atan2`, `ceil`, `cos`,
+  `cos_checked`, `floor`, `ln`, `ln_checked`, `log10`, `log10_checked`,
+  `log1p`, `log1p_checked`, `log2`, `log2_checked`, `negate`, `negate_checked`,
+  `sign`, `sin`, `sin_checked`, `tan`, `tan_checked`, `trunc`
+
+* scalar bitwise functions: `bit_wise_and`, `bit_wise_not`, `bit_wise_or`,
+  `bit_wise_xor`, `shift_left`, `shift_left_checked`, `shift_right`,
+  `shift_right_checked`
+
+* scalar string functions: `ascii_center`, `ascii_lpad`, `ascii_reverse`,
+  `ascii_rpad`, `binary_join`, `binary_join_element_wise`,
+  `binary_replace_slice`, `count_substring`, `count_substring_regex`,
+  `ends_with`, `find_substring`, `find_substring_regex`, `match_like`,
+  `split_pattern_regex`, `starts_with`, `utf8_center`, `utf8_lpad`,
+  `utf8_replace_slice`, `utf8_rpad`, `utf8_reverse`, `utf8_slice_codepoints`
+
+* scalar temporal functions: `day`, `day_of_week`, `day_of_year`,
+  `iso_calendar`, `iso_week`, `iso_year`, `hour`, `microsecond`, `millisecond`,
+  `minute`, `month`, `nanosecond`, `quarter`, `second`, `subsecond`, `year`
+
+* other scalar functions: `case_when`, `coalesce`, `if_else`, `is_finite`,
+  `is_inf`, `is_nan`, `max_element_wise`, `min_element_wise`, `make_struct`
+
+* vector functions: `replace_with_mask`
+
+Duplicates are now allowed in `SetLookupOptions::value_set` (ARROW-12554).
+
+Decimal types are now supported by some basic arithmetic functions (ARROW-12074).
+
+The `take` function now supports dense unions (ARROW-13005).
+
+It is now possible to cast between dictionary types with different index
+types (ARROW-11673).
+
+Sorting is now implemented for boolean input (ARROW-12016).
+
+### CSV
+
+The streaming CSV reader can now take some advantage of multiple threads (ARROW-11889).
+
+The CSV reader tries to make its errors more informative by adding the
+row number when it is known, i.e. when parallel reading is disabled (ARROW-12675).
+
+A new option `ReaderOptions::skip_rows_after_names` allows skipping a number
+of rows _after_ reading the column names (as opposed to
+`ReaderOptions::skip_rows`).
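
The difference between the two options can be sketched in plain Python. This is only an illustration of the semantics described above, assuming `skip_rows` drops rows before the column names are read and `skip_rows_after_names` drops rows immediately after them; it uses the standard `csv` module rather than the Arrow reader, and `read_rows` is a hypothetical helper.

```python
import csv


def read_rows(text, skip_rows=0, skip_rows_after_names=0):
    """Illustrative only: mimic the two skip options with the stdlib csv module."""
    # Rows dropped here never reach the header parser.
    lines = text.splitlines()[skip_rows:]
    reader = csv.reader(lines)
    # The first remaining row supplies the column names.
    names = next(reader)
    # Rows dropped here come after the names, e.g. a units row.
    rows = list(reader)[skip_rows_after_names:]
    return names, rows


sample = "# exported 2021-07-29\na,b\nunits,units\n1,2\n3,4\n"
names, rows = read_rows(sample, skip_rows=1, skip_rows_after_names=1)
```

Here the comment line is removed by `skip_rows` and the units line by `skip_rows_after_names`, leaving `a,b` as the column names and two data rows.
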
+
+Quoted strings can now be treated as always non-null (ARROW-10115).
+
+### Dataset layer
+
+The asynchronous scanner introduced in 4.0.0 has been improved with truly 
+asynchronous readers implemented for CSV, Parquet, and IPC file formats and 
+file-level parallelism added.  This mode is controlled by a flag `use_async` that
+can be passed into methods which scan a dataset.  Setting this flag to True
+will have significant improvements on filesystems with high latency or parallel
+reads (e.g. S3).
+
+A CountRows method has been added to count rows matching a predicate; where

[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-08-03 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r682064224




[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-08-03 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r682062655




[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-08-03 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r681888343



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,264 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+3 months of development work and includes **XX commits** from
+[**XX distinct contributors**][1] in 2 repositories. See the Install Page to

Review comment:
   Done in #136




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-08-03 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r681879683




[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-28 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r678783309



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,264 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+3 months of development work and includes **XX commits** from
+[**XX distinct contributors**][1] in 2 repositories. See the Install Page to

Review comment:
   ```suggestion
   3 months of development work and includes **684 commits** from
   [**99 distinct contributors**][1] in 2 repositories. See the Install Page to
   ```








[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-28 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r678592244



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,259 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+3 months of development work and includes **XX commits** from

Review comment:
   @kszucs let's please get https://github.com/apache/arrow/pull/10774 reviewed
   and merged and use that to fill in the XXs








[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-28 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r678540966



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,244 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+3 months of development work and includes **XX commits** from
+[**XX distinct contributors**][1] in 2 repositories. See the Install Page to
+learn how to get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the complete changelogs for the [`apache/arrow`][2] and
+[`apache/arrow-rs`][3] repositories.
+
+## Community
+
+Since the 4.0.0 release, Daniël Heres, Kazuaki Ishizaki, Dominik Moritz, and Weston Pace
+have been invited as committers to Arrow,
+and Benjamin Kietzman and David Li have joined the Project Management Committee
+(PMC). Thank you for all of your contributions!
+
+## Columnar Format Notes
+
+Official IANA Media types (MIME types) have been registered for Apache
+Arrow IPC protocol data, both [stream]({{ site.baseurl }}/docs/format/Columnar.html#ipc-streaming-format)
+and [file]({{ site.baseurl }}/docs/format/Columnar.html#ipc-file-format) variants:
+
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.stream
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.file
+
+We recommend ".arrow" as the IPC file format file extension and ".arrows" for
+the IPC streaming format file extension.
+
+## Arrow Flight RPC notes
+
+The Go implementation now supports custom metadata and middleware, and has
+been added to integration testing.
+
+In Python, some operations can now be interrupted via Control-C.
+
+## C++ notes
+
+`MakeArrayFromScalar` now works for fixed-size binary types (ARROW-13321).
+
+### Compute layer
+
+The following [compute functions]({{site.baseurl}}/docs/cpp/compute.html)
+were added:
+
+* aggregations: "index"
+
+* scalar arithmetic and math functions: "abs", "abs_checked", "acos",
+  "acos_checked", "asin", "asin_checked", "atan", "atan2", "ceil", "cos",
+  "cos_checked", "floor", "ln", "ln_checked", "log10", "log10_checked",
+  "log1p", "log1p_checked", "log2", "log2_checked", "negate", "negate_checked",
+  "sign", "sin", "sin_checked", "tan", "tan_checked", "trunc"
+
+* scalar bitwise functions: "bit_wise_and", "bit_wise_not", "bit_wise_or",
+  "bit_wise_xor", "shift_left", "shift_left_checked", "shift_right",
+  "shift_right_checked"
+
+* scalar string functions: "ascii_center", "ascii_lpad", "ascii_reverse",
+  "ascii_rpad", "binary_join", "binary_join_element_wise",
+  "binary_replace_slice", "count_substring", "count_substring_regex",
+  "ends_with", "find_substring", "find_substring_regex", "match_like",
+  "split_pattern_regex", "starts_with", "utf8_center", "utf8_lpad",
+  "utf8_replace_slice", "utf8_rpad", "utf8_reverse", "utf8_slice_codepoints"
+
+* scalar temporal functions: "day", "day_of_week", "day_of_year",
+  "iso_calendar", "iso_week", "iso_year", "hour", "microsecond", "millisecond",
+  "minute", "month", "nanosecond", "quarter", "second", "subsecond", "year"
+
+* other scalar functions: "case_when", "coalesce", "if_else", "is_finite",
+  "is_inf", "is_nan", "max_element_wise", "min_element_wise", "make_struct"
+
+* vector functions: "replace_with_mask"
+
+Duplicates are now allowed in `SetLookupOptions::value_set` (ARROW-12554).
+
+Decimal types are now supported by some basic arithmetic functions (ARROW-12074).
+
+The "take" function now supports dense unions (ARROW-13005).
+
+It is now possible to cast between dictionary types with different index
+types (ARROW-11673).
+
+Sorting is now implemented for boolean input (ARROW-12016).
+
+### CSV
+
+The streaming CSV reader can now read in parallel (ARROW-11889).
+
+The CSV reader tries to make its errors more informative by adding the
+row number when it is known, i.e. when parallel reading is disabled (ARROW-12675).
+
+A new option `ReaderOptions::skip_rows_after_names` allows skipping a number
+of rows _after_ reading the column names (as opposed to
+`ReaderOptions::skip_rows`).
+
+Quoted strings can now be treated as always non-null (ARROW-10115).
+
+### Dataset layer
+
+### IO and Filesystem layer
+
+The I/O thread pool size can now be adjusted at runtime (ARROW-12760).
+The default size remains 8 threads.
+
+Streams now can have auxiliary metadata, depending on the backend.  This
+has been implemented for the S3 filesystems, where a couple metadata
+keys are supported such as "Content-Type" and "ACL" (ARROW-11161, ARROW-12719).
+
+The HadoopFileSystem implementation now implements the FileSystem abstraction
+more faithfully (ARROW-12790).
+
+### Parquet
+
+The new LZ4_RAW 

[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-28 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r678536922



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,100 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+over XX months of development work and includes [**XX resolved issues**][1]
+from [**XX distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## Community
+
+Since the 4.0.0 release, Daniël Heres, Kazuaki Ishizaki, Dominik Moritz, and Weston Pace
+have been invited as committers to Arrow,
+and Benjamin Kietzman and David Li have joined the Project Management Committee
+(PMC). Thank you for all of your contributions!
+
+## Columnar Format Notes
+
+Official IANA Media types (MIME types) have been registered for Apache
+Arrow IPC protocol data, both [stream]({{ site.baseurl }}/docs/format/Columnar.html#ipc-streaming-format) and [file]({{ site.baseurl }}/docs/format/Columnar.html#ipc-file-format) variants:
+
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.stream
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.file
+
+We recommend ".arrow" as the IPC file format file extension and ".arrows" for the IPC streaming format file extension.
+
+## Arrow Flight RPC notes
+
+The Go implementation now supports custom metadata and middleware, and has been added to integration testing.
+
+In Python, some operations can now be interrupted via Control-C.
+
+## C++ notes
+
+## C# notes
+
+## Go notes
+
+## Java notes
+
+## JavaScript notes
+
+## Python notes
+
+## R notes
+
+In this release, we've more than doubled the number of functions you can call on Arrow Datasets inside `dplyr::filter()` and `mutate()`, including many more string, datetime, and math functions. You can also write Datasets to CSV files, in addition to Parquet and Feather. We've also deepened support for the Arrow C interface, which is used in the Python interface and allows integration with other projects, such as DuckDB.
+
+For more on what’s in the 5.0.0 R package, see the [R changelog][4].
+
+## Ruby and C GLib notes
+
+### Ruby
+
+### C GLib
+
+## Rust notes
+

Review comment:
   I don't think there is any way to know for sure what the URL of the Rust
   release blog post will be, and it might be posted after this one, so I'll
   just refer to it by name without a link for now.








[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-28 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r678388328



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,229 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+over XX months of development work and includes [**XX resolved issues**][1]
+from [**XX distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].

Review comment:
   ```suggestion
   The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
   3 months of development work and includes **XX commits** from
   [**XX distinct contributors**][1] in 2 repositories. See the Install Page to
   learn how to get the libraries for your platform.
   
   The release notes below are not exhaustive and only expose selected highlights
   of the release. Many other bugfixes and improvements have been made: we refer
   you to the complete changelogs for the [`apache/arrow`][2] and
   [`apache/arrow-rs`][3] repositories.
   ```








[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-28 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r678389316



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,229 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+over XX months of development work and includes [**XX resolved issues**][1]
+from [**XX distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## Community
+
+Since the 4.0.0 release, Daniël Heres, Kazuaki Ishizaki, Dominik Moritz, and Weston Pace
+have been invited as committers to Arrow,
+and Benjamin Kietzman and David Li have joined the Project Management Committee
+(PMC). Thank you for all of your contributions!
+
+## Columnar Format Notes
+
+Official IANA Media types (MIME types) have been registered for Apache
+Arrow IPC protocol data, both [stream]({{ site.baseurl }}/docs/format/Columnar.html#ipc-streaming-format)
+and [file]({{ site.baseurl }}/docs/format/Columnar.html#ipc-file-format) variants:
+
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.stream
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.file
+
+We recommend ".arrow" as the IPC file format file extension and ".arrows" for
+the IPC streaming format file extension.
+
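As an illustration, the extension-to-media-type pairing recommended above could be captured in a small lookup. This is a minimal sketch and not part of any Arrow library: the names `ARROW_MEDIA_TYPES` and `arrow_media_type` are hypothetical, and only the two extensions and the two IANA media type strings come from the text above.

```python
from pathlib import PurePath

# Recommended file extensions mapped to the media types registered with IANA.
# The dictionary and function names are hypothetical, chosen for illustration.
ARROW_MEDIA_TYPES = {
    ".arrow": "application/vnd.apache.arrow.file",     # IPC file format
    ".arrows": "application/vnd.apache.arrow.stream",  # IPC streaming format
}


def arrow_media_type(path):
    """Return the Arrow IPC media type for `path`, or None if the
    extension is not one of the two recommended ones."""
    # Compare case-insensitively so "DATA.ARROW" is treated like "data.arrow".
    return ARROW_MEDIA_TYPES.get(PurePath(path).suffix.lower())
```

A web server could, for example, use such a lookup to set the `Content-Type` header when serving Arrow IPC data by file name.
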
+## Arrow Flight RPC notes
+
+The Go implementation now supports custom metadata and middleware, and has
+been added to integration testing.
+
+In Python, some operations can now be interrupted via Control-C.
+
+## C++ notes
+
+`MakeArrayFromScalar` now works for fixed-size binary types (ARROW-13321).
+
+### Compute layer
+
+The following [compute functions]({{site.baseurl}}/docs/cpp/compute.html)
+were added:
+
+* aggregations: "index"
+
+* scalar arithmetic and math functions: "abs", "abs_checked", "acos",
+  "acos_checked", "asin", "asin_checked", "atan", "atan2", "ceil", "cos",
+  "cos_checked", "floor", "ln", "ln_checked", "log10", "log10_checked",
+  "log1p", "log1p_checked", "log2", "log2_checked", "negate", "negate_checked",
+  "sign", "sin", "sin_checked", "tan", "tan_checked", "trunc"
+
+* scalar bitwise functions: "bit_wise_and", "bit_wise_not", "bit_wise_or",
+  "bit_wise_xor", "shift_left", "shift_left_checked", "shift_right",
+  "shift_right_checked"
+
+* scalar string functions: "ascii_center", "ascii_lpad", "ascii_reverse",
+  "ascii_rpad", "binary_join", "binary_join_element_wise",
+  "binary_replace_slice", "count_substring", "count_substring_regex",
+  "ends_with", "find_substring", "find_substring_regex", "match_like",
+  "split_pattern_regex", "starts_with", "utf8_center", "utf8_lpad",
+  "utf8_replace_slice", "utf8_rpad", "utf8_reverse", "utf8_slice_codepoints"
+
+* scalar temporal functions: "day", "day_of_week", "day_of_year",
+  "iso_calendar", "iso_week", "iso_year", "hour", "microsecond", "millisecond",
+  "minute", "month", "nanosecond", "quarter", "second", "subsecond", "year"
+
+* other scalar functions: "case_when", "coalesce", "if_else", "is_finite",
+  "is_inf", "is_nan", "max_element_wise", "min_element_wise", "make_struct"
+
+* vector functions: "replace_with_mask"
+
+Duplicates are now allowed in `SetLookupOptions::value_set` (ARROW-12554).
+
+Decimal types are now supported by some basic arithmetic functions (ARROW-12074).
+
+The "take" function now supports dense unions (ARROW-13005).
+
+It is now possible to cast between dictionary types with different index
+types (ARROW-11673).
+
+Sorting is now implemented for boolean input (ARROW-12016).
+
+### CSV
+
+The streaming CSV reader can now read in parallel (ARROW-11889).
+
+The CSV reader tries to make its errors more informative by adding the
+row number when it is known, i.e. when parallel reading is disabled (ARROW-12675).
+
+A new option `ReaderOptions::skip_rows_after_names` allows skipping a number
+of rows _after_ reading the column names (as opposed to
+`ReaderOptions::skip_rows`).
+
+Quoted strings can now be treated as always non-null (ARROW-10115).
+
+### Dataset layer
+
+### IO and Filesystem layer
+
+The I/O thread pool size can now be adjusted at runtime (ARROW-12760).
+The default size remains 8 threads.
+
+Streams can now have auxiliary metadata, depending on the backend.  This
+has been implemented for the S3 filesystems, where a couple of metadata
+keys are supported such as "Content-Type" and "ACL" (ARROW-11161, ARROW-12719).
+
+The HadoopFileSystem implementation now implements the FileSystem abstraction
+more faithfully (ARROW-12790).
+
+### Parquet
+
+The new LZ4_RAW compression scheme was implemented (PARQUET-1998).
+Unlike the 

[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-28 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r678388328



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,229 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+over XX months of development work and includes [**XX resolved issues**][1]
+from [**XX distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].

Review comment:
   ```suggestion
   The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
   3 months of development work and includes [**XX commits**] from
   [**XX distinct contributors**][1] in 2 repositories. See the Install Page to
   learn how to get the libraries for your platform.
   
   The release notes below are not exhaustive and only expose selected highlights
   of the release. Many other bugfixes and improvements have been made: we refer
   you to the complete changelogs for the [`apache/arrow`][2] and
   [`apache/arrow-rs`][3] repositories.
   ```








[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-27 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r677466710



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,100 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+over XX months of development work and includes [**XX resolved issues**][1]
+from [**XX distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## Community
+
+Since the 4.0.0 release, Daniël Heres, Kazuaki Ishizaki, Dominik Moritz, and Weston Pace
+have been invited as committers to Arrow,
+and Benjamin Kietzman and David Li have joined the Project Management Committee
+(PMC). Thank you for all of your contributions!
+
+## Columnar Format Notes
+
+Official IANA Media types (MIME types) have been registered for Apache
+Arrow IPC protocol data, both [stream]({{ site.baseurl }}/docs/format/Columnar.html#ipc-streaming-format) and [file]({{ site.baseurl }}/docs/format/Columnar.html#ipc-file-format) variants:
+
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.stream
+* https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.file
+
+We recommend ".arrow" as the IPC file format file extension and ".arrows" for the IPC streaming format file extension.
+
+## Arrow Flight RPC notes
+
+The Go implementation now supports custom metadata and middleware, and has been added to integration testing.
+
+In Python, some operations can now be interrupted via Control-C.
+
+## C++ notes
+
+## C# notes
+
+## Go notes
+
+## Java notes
+
+## JavaScript notes
+
+## Python notes
+
+## R notes
+
+In this release, we've more than doubled the number of functions you can call on Arrow Datasets inside `dplyr::filter()` and `mutate()`, including many more string, datetime, and math functions. You can also write Datasets to CSV files, in addition to Parquet and Feather. We've also deepened support for the Arrow C interface, which is used in the Python interface and allows integration with other projects, such as DuckDB.
+
+For more on what’s in the 5.0.0 R package, see the [R changelog][4].
+
+## Ruby and C GLib notes
+
+### Ruby
+
+### C GLib
+
+## Rust notes
+

Review comment:
   Link here to the Rust release blog post (in the works at 
https://github.com/apache/arrow-site/pull/128) and changelog 
(https://github.com/apache/arrow-rs/blob/5.0.0/CHANGELOG.md)








[GitHub] [arrow-site] ianmcook commented on a change in pull request #127: Blog post for 5.0.0 release

2021-07-27 Thread GitBox


ianmcook commented on a change in pull request #127:
URL: https://github.com/apache/arrow-site/pull/127#discussion_r677464606



##
File path: _posts/2021-07-20-5.0.0-release.md
##
@@ -0,0 +1,100 @@
+---
+layout: post
+title: "Apache Arrow 5.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+
+
+
+
+
+The Apache Arrow team is pleased to announce the 5.0.0 release. This covers
+over XX months of development work and includes [**XX resolved issues**][1]
+from [**XX distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].

Review comment:
   @thisisnic we should tweak this to be consistent with the changes to the
   release notes in https://github.com/apache/arrow/pull/10774.
   
   Currently this counts resolved issues, with a link to Jira. The release
   notes do not count resolved Jiras; they count commits. Rust and DataFusion
   use GitHub Issues, not Jira.
   
   I think the easiest thing to do would be to simply change
   `[**XX resolved issues**][1]` to `[**XX commits**]` and remove the link to
   Jira. The number of commits can just be copied from the release notes. What
   do you think?



