This is an automated email from the ASF dual-hosted git repository.

kou pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/arrow-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 521d5b7458f GH-35106: [R] Docs for 11.0.0 changelog not updated (#344)
521d5b7458f is described below

commit 521d5b7458f0b21db2be88cfff4a0de7bba3841e
Author: Nic Crane <[email protected]>
AuthorDate: Fri Apr 14 09:54:21 2023 +0100

    GH-35106: [R] Docs for 11.0.0 changelog not updated (#344)
    
    The changelog for the R docs wasn't updated in the last release; this is
    likely due to the timing of updates, with changes being merged after the
    release scripts were run.
    
    Fix apache/arrow#35106
---
 docs/r/news/index.html | 271 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 182 insertions(+), 89 deletions(-)

diff --git a/docs/r/news/index.html b/docs/r/news/index.html
index 6a04ef5ddaa..035164761e3 100644
--- a/docs/r/news/index.html
+++ b/docs/r/news/index.html
@@ -25,7 +25,7 @@
     <a class="navbar-brand me-2" href="../index.html">Arrow R Package</a>
 
     <span class="version">
-      <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" 
data-bs-placement="bottom" title="">11.0.0.2</small>
+      <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" 
data-bs-placement="bottom" title="">11.0.0.3</small>
     </span>
 
     
@@ -69,7 +69,13 @@
       </ul><form class="form-inline my-2 my-lg-0" role="search">
         <input type="search" class="form-control me-sm-2" aria-label="Toggle 
navigation" name="search-input" data-search-index="../search.json" 
id="search-input" placeholder="Search for" autocomplete="off"></form>
 
-      <ul class="navbar-nav"></ul></div>
+      <ul class="navbar-nav"><li class="nav-item">
+  <a class="external-link nav-link" href="https://github.com/apache/arrow/"; 
aria-label="github">
+    <span class="fab fa fab fa-github fa-lg"></span>
+     
+  </a>
+</li>
+      </ul></div>
 
     
   </div>
@@ -77,16 +83,90 @@
 <div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Changelog</h1>
-      <small>Source: <a 
href="https://github.com/apache/arrow/blob/master/r/NEWS.md"; 
class="external-link"><code>NEWS.md</code></a></small>
+      <small>Source: <a 
href="https://github.com/apache/arrow/blob/main/r/NEWS.md"; 
class="external-link"><code>NEWS.md</code></a></small>
     </div>
 
-    <div class="section level2"><h2 class="pkg-version" data-toc-text="11.0.0" 
id="arrow-1100">arrow 11.0.0<a class="anchor" aria-label="anchor" 
href="#arrow-1100"></a></h2></div>
+    <div class="section level2">
+<h2 class="pkg-version" data-toc-text="11.0.0.3" id="arrow-11003">arrow 
11.0.0.3<a class="anchor" aria-label="anchor" href="#arrow-11003"></a></h2><p 
class="text-muted">CRAN release: 2023-03-08</p>
+<div class="section level3">
+<h3 id="minor-improvements-and-fixes-11-0-0-3">Minor improvements and fixes<a 
class="anchor" aria-label="anchor" 
href="#minor-improvements-and-fixes-11-0-0-3"></a></h3>
+<ul><li>
+<code><a 
href="../reference/open_delim_dataset.html">open_csv_dataset()</a></code> 
allows a schema to be specified. (<a 
href="https://github.com/apache/arrow/issues/34217"; 
class="external-link">#34217</a>)</li>
+<li>To ensure compatibility with an upcoming dplyr release, we no longer call 
<code>dplyr:::check_names()</code> (<a 
href="https://github.com/apache/arrow/issues/34369"; 
class="external-link">#34369</a>)</li>
+</ul></div>
+</div>
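The schema support added in 11.0.0.3 can be sketched as follows. This is a minimal, hedged example, not taken from the release itself: it assumes the arrow and dplyr R packages are installed, and the temp-file setup and column names are illustrative. When an explicit schema is supplied, the header row is not consumed automatically, so it is skipped here.

```r
library(arrow)

# Write a small CSV to illustrate; the file has a header row.
tf <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3, y = c("a", "b", "c")), tf, row.names = FALSE)

# Pass an explicit schema instead of letting column types be inferred.
# skip = 1 steps over the header row, since the schema replaces it.
ds <- open_csv_dataset(
  tf,
  schema = schema(x = int64(), y = utf8()),
  skip = 1
)
res <- dplyr::collect(ds)
```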
+    <div class="section level2">
+<h2 class="pkg-version" data-toc-text="11.0.0.2" id="arrow-11002">arrow 
11.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-11002"></a></h2><p 
class="text-muted">CRAN release: 2023-02-12</p>
+<div class="section level3">
+<h3 id="breaking-changes-11-0-0-2">Breaking changes<a class="anchor" 
aria-label="anchor" href="#breaking-changes-11-0-0-2"></a></h3>
+<ul><li>
+<code><a href="../reference/map_batches.html">map_batches()</a></code> is lazy 
by default; it now returns a <code>RecordBatchReader</code> instead of a list 
of <code>RecordBatch</code> objects unless <code>lazy = FALSE</code>. (<a 
href="https://github.com/apache/arrow/issues/14521"; 
class="external-link">#14521</a>)</li>
+</ul></div>
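The breaking change above can be sketched like this; a hedged example assuming the arrow package at 11.0.0.2 or later, with the `x_doubled` column name purely illustrative. Nothing is evaluated until the returned reader is consumed; per the note above, `lazy = FALSE` restores the old eager list-of-batches behaviour.

```r
library(arrow)

reader <- as_record_batch_reader(arrow_table(x = 1:10))

# map_batches() now returns a RecordBatchReader by default, so the
# mapping function runs only as batches are pulled from `out`.
out <- map_batches(reader, function(batch) {
  record_batch(x_doubled = as.vector(batch$x) * 2L)
})

# Consuming the reader materialises the results.
res <- as.data.frame(out$read_table())
</imports_placeholder>
```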
+<div class="section level3">
+<h3 id="new-features-11-0-0-2">New features<a class="anchor" 
aria-label="anchor" href="#new-features-11-0-0-2"></a></h3>
+<div class="section level4">
+<h4 id="docs-11-0-0-2">Docs<a class="anchor" aria-label="anchor" 
href="#docs-11-0-0-2"></a></h4>
+<ul><li>A substantial reorganisation of, rewrite of, and addition to many of the 
vignettes and README. (<a href="https://github.com/djnavarro"; 
class="external-link">@djnavarro</a>, <a 
href="https://github.com/apache/arrow/issues/14514"; 
class="external-link">#14514</a>)</li>
+</ul></div>
+<div class="section level4">
+<h4 id="readingwriting-data-11-0-0-2">Reading/writing data<a class="anchor" 
aria-label="anchor" href="#readingwriting-data-11-0-0-2"></a></h4>
+<ul><li>New functions <code><a 
href="../reference/open_delim_dataset.html">open_csv_dataset()</a></code>, 
<code><a 
href="../reference/open_delim_dataset.html">open_tsv_dataset()</a></code>, and 
<code><a 
href="../reference/open_delim_dataset.html">open_delim_dataset()</a></code> all 
wrap <code><a href="../reference/open_dataset.html">open_dataset()</a></code>; 
they don’t provide new functionality, but allow for readr-style options to be 
supplied, making it simpler to switch between indivi [...]
+<li>User-defined null values can be set when writing CSVs both as datasets and 
as individual files. (<a href="https://github.com/wjones127"; 
class="external-link">@wjones127</a>, <a 
href="https://github.com/apache/arrow/issues/14679"; 
class="external-link">#14679</a>)</li>
+<li>The new <code>col_names</code> parameter allows specification of column 
names when opening a CSV dataset. (<a href="https://github.com/wjones127"; 
class="external-link">@wjones127</a>, <a 
href="https://github.com/apache/arrow/issues/14705"; 
class="external-link">#14705</a>)</li>
+<li>The <code>parse_options</code>, <code>read_options</code>, and 
<code>convert_options</code> parameters for reading individual files 
(<code>read_*_arrow()</code> functions) and datasets (<code><a 
href="../reference/open_dataset.html">open_dataset()</a></code> and the new 
<code>open_*_dataset()</code> functions) can be passed in as lists. (<a 
href="https://github.com/apache/arrow/issues/15270"; 
class="external-link">#15270</a>)</li>
+<li>File paths containing accents can be read by <code><a 
href="../reference/read_delim_arrow.html">read_csv_arrow()</a></code>. (<a 
href="https://github.com/apache/arrow/issues/14930"; 
class="external-link">#14930</a>)</li>
+</ul></div>
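The readr-style options mentioned above (`col_names` from #14705, user-defined null values) can be combined as in this sketch; it assumes arrow 11.0.0.2+, and the file contents and column names are made up for illustration.

```r
library(arrow)

# A header-less file where the literal string "NA" marks missing values.
tf <- tempfile(fileext = ".csv")
writeLines(c("1,NA", "2,7"), tf)

# readr-style arguments: col_names names the columns, na defines nulls.
ds <- open_csv_dataset(tf, col_names = c("id", "score"), na = "NA")
res <- dplyr::collect(ds)
```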
+<div class="section level4">
+<h4 id="dplyr-compatibility-11-0-0-2">dplyr compatibility<a class="anchor" 
aria-label="anchor" href="#dplyr-compatibility-11-0-0-2"></a></h4>
+<ul><li>New dplyr (1.1.0) function <code>join_by()</code> has been implemented 
for dplyr joins on Arrow objects (equality conditions only). (<a 
href="https://github.com/apache/arrow/issues/33664"; 
class="external-link">#33664</a>)</li>
+<li>Output is accurate when multiple <code><a 
href="https://dplyr.tidyverse.org/reference/group_by.html"; 
class="external-link">dplyr::group_by()</a></code>/<code><a 
href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">dplyr::summarise()</a></code> calls are used. (<a 
href="https://github.com/apache/arrow/issues/14905"; 
class="external-link">#14905</a>)</li>
+<li>
+<code><a href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">dplyr::summarize()</a></code> works with division when 
divisor is a variable. (<a href="https://github.com/apache/arrow/issues/14933"; 
class="external-link">#14933</a>)</li>
+<li>
+<code><a href="https://dplyr.tidyverse.org/reference/mutate-joins.html"; 
class="external-link">dplyr::right_join()</a></code> correctly coalesces keys. 
(<a href="https://github.com/apache/arrow/issues/15077"; 
class="external-link">#15077</a>)</li>
+<li>Multiple changes to ensure compatibility with dplyr 1.1.0. (<a 
href="https://github.com/lionel-"; class="external-link">@lionel-</a>, <a 
href="https://github.com/apache/arrow/issues/14948"; 
class="external-link">#14948</a>)</li>
+</ul></div>
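The `join_by()` support described above (equality conditions only) might look like this in practice; a hedged sketch assuming arrow 11.0.0.2+ alongside dplyr 1.1.0+, with the tables invented for illustration.

```r
library(arrow)
library(dplyr)

orders    <- arrow_table(cust_id = c(1L, 3L), total = c(9.5, 12.0))
customers <- arrow_table(id = 1:3, name = c("ann", "bo", "cy"))

# join_by() with an equality condition, evaluated by Arrow's join engine.
res <- orders %>%
  left_join(customers, by = join_by(cust_id == id)) %>%
  collect()
```

Note that Arrow joins do not guarantee output row order, so downstream code should not rely on it.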
+<div class="section level4">
+<h4 id="function-bindings-11-0-0-2">Function bindings<a class="anchor" 
aria-label="anchor" href="#function-bindings-11-0-0-2"></a></h4>
+<ul><li>The following functions can be used in queries on Arrow objects:
+<ul><li>
+<code><a href="https://lubridate.tidyverse.org/reference/with_tz.html"; 
class="external-link">lubridate::with_tz()</a></code> and <code><a 
href="https://lubridate.tidyverse.org/reference/force_tz.html"; 
class="external-link">lubridate::force_tz()</a></code> (<a 
href="https://github.com/eitsupi"; class="external-link">@eitsupi</a>, <a 
href="https://github.com/apache/arrow/issues/14093"; 
class="external-link">#14093</a>)</li>
+<li>
+<code><a href="https://stringr.tidyverse.org/reference/str_remove.html"; 
class="external-link">stringr::str_remove()</a></code> and <code><a 
href="https://stringr.tidyverse.org/reference/str_remove.html"; 
class="external-link">stringr::str_remove_all()</a></code> (<a 
href="https://github.com/apache/arrow/issues/14644"; 
class="external-link">#14644</a>)</li>
+</ul></li>
+</ul></div>
+<div class="section level4">
+<h4 id="arrow-object-creation-11-0-0-2">Arrow object creation<a class="anchor" 
aria-label="anchor" href="#arrow-object-creation-11-0-0-2"></a></h4>
+<ul><li>Arrow Scalars can be created from <code>POSIXlt</code> objects. (<a 
href="https://github.com/apache/arrow/issues/15277"; 
class="external-link">#15277</a>)</li>
+<li>
+<code>Array$create()</code> can create Decimal arrays. (<a 
href="https://github.com/apache/arrow/issues/15211"; 
class="external-link">#15211</a>)</li>
+<li>
+<code>StructArray$create()</code> can be used to create StructArray objects. 
(<a href="https://github.com/apache/arrow/issues/14922"; 
class="external-link">#14922</a>)</li>
+<li>Creating an Array from an object with more than 2^31 elements now yields 
the correct length. (<a 
href="https://github.com/apache/arrow/issues/14929"; 
class="external-link">#14929</a>)</li>
+</ul></div>
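Two of the object-creation additions above, sketched under the assumption that arrow 11.0.0.2+ is installed; the particular precision/scale and timestamp values are arbitrary.

```r
library(arrow)

# A Decimal array via Array$create() with an explicit decimal type (#15211).
dec <- Array$create(c(1.23, 4.50), type = decimal128(precision = 5, scale = 2))

# An Arrow Scalar from a POSIXlt value (#15277).
ts <- Scalar$create(as.POSIXlt("2023-01-01 12:00:00", tz = "UTC"))
```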
+<div class="section level4">
+<h4 id="installation-11-0-0-2">Installation<a class="anchor" 
aria-label="anchor" href="#installation-11-0-0-2"></a></h4>
+<ul><li>Improved offline installation using pre-downloaded binaries. (<a 
href="https://github.com/pgramme"; class="external-link">@pgramme</a>, <a 
href="https://github.com/apache/arrow/issues/14086"; 
class="external-link">#14086</a>)</li>
+<li>The package can automatically link to system installations of the AWS SDK 
for C++. (<a href="https://github.com/kou"; class="external-link">@kou</a>, <a 
href="https://github.com/apache/arrow/issues/14235"; 
class="external-link">#14235</a>)</li>
+</ul></div>
+</div>
+<div class="section level3">
+<h3 id="minor-improvements-and-fixes-11-0-0-2">Minor improvements and fixes<a 
class="anchor" aria-label="anchor" 
href="#minor-improvements-and-fixes-11-0-0-2"></a></h3>
+<ul><li>Calling <code><a 
href="https://lubridate.tidyverse.org/reference/as_date.html"; 
class="external-link">lubridate::as_datetime()</a></code> on Arrow objects 
handles sub-second precision. (<a href="https://github.com/eitsupi"; 
class="external-link">@eitsupi</a>, <a 
href="https://github.com/apache/arrow/issues/13890"; 
class="external-link">#13890</a>)</li>
+<li>
+<code><a href="https://rdrr.io/r/utils/head.html"; 
class="external-link">head()</a></code> can be called after <code><a 
href="../reference/as_record_batch_reader.html">as_record_batch_reader()</a></code>.
 (<a href="https://github.com/apache/arrow/issues/14518"; 
class="external-link">#14518</a>)</li>
+<li>
+<code><a href="https://rdrr.io/r/base/as.Date.html"; 
class="external-link">as.Date()</a></code> can go from 
<code>timestamp[us]</code> to <code>timestamp[s]</code>. (<a 
href="https://github.com/apache/arrow/issues/14935"; 
class="external-link">#14935</a>)</li>
+<li>curl timeout policy can be configured for S3. (<a 
href="https://github.com/apache/arrow/issues/15166"; 
class="external-link">#15166</a>)</li>
+<li>rlang dependency must be at least version 1.0.0 because of 
<code>check_dots_empty()</code>. (<a href="https://github.com/daattali"; 
class="external-link">@daattali</a>, <a 
href="https://github.com/apache/arrow/issues/14744"; 
class="external-link">#14744</a>)</li>
+</ul></div>
+</div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="10.0.1" id="arrow-1001">arrow 10.0.1<a 
class="anchor" aria-label="anchor" href="#arrow-1001"></a></h2><p 
class="text-muted">CRAN release: 2022-12-06</p>
 <p>Minor improvements and fixes:</p>
-<ul><li>Fixes for failing test after lubridate 1.9 release (<a 
href="https://issues.apache.org/jira/browse/&lt;a%20href='https://issues.apache.org/jira/browse/ARROW-18285'&gt;ARROW-18285&lt;/a&gt;"
 class="external-link"></a><a 
href="https://issues.apache.org/jira/browse/ARROW-18285"; 
class="external-link">ARROW-18285</a>)</li>
-<li>Update to ensure compatibility with changes in dev purrr (<a 
href="https://issues.apache.org/jira/browse/&lt;a%20href='https://issues.apache.org/jira/browse/ARROW-18305'&gt;ARROW-18305&lt;/a&gt;"
 class="external-link"></a><a 
href="https://issues.apache.org/jira/browse/ARROW-18305"; 
class="external-link">ARROW-18305</a>)</li>
-<li>Fix to correctly handle <code>.data</code> pronoun in <code><a 
href="https://dplyr.tidyverse.org/reference/group_by.html"; 
class="external-link">dplyr::group_by()</a></code> (<a 
href="https://issues.apache.org/jira/browse/&lt;a%20href='https://issues.apache.org/jira/browse/ARROW-18131'&gt;ARROW-18131&lt;/a&gt;"
 class="external-link"></a><a 
href="https://issues.apache.org/jira/browse/ARROW-18131"; 
class="external-link">ARROW-18131</a>)</li>
+<ul><li>Fixes for failing test after lubridate 1.9 release (<a 
href="https://github.com/apache/arrow/issues/14615"; 
class="external-link">#14615</a>)</li>
+<li>Update to ensure compatibility with changes in dev purrr (<a 
href="https://github.com/apache/arrow/issues/14581"; 
class="external-link">#14581</a>)</li>
+<li>Fix to correctly handle <code>.data</code> pronoun in <code><a 
href="https://dplyr.tidyverse.org/reference/group_by.html"; 
class="external-link">dplyr::group_by()</a></code> (<a 
href="https://github.com/apache/arrow/issues/14484"; 
class="external-link">#14484</a>)</li>
 </ul></div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="10.0.0" id="arrow-1000">arrow 10.0.0<a 
class="anchor" aria-label="anchor" href="#arrow-1000"></a></h2><p 
class="text-muted">CRAN release: 2022-10-26</p>
@@ -94,7 +174,7 @@
 <h3 id="arrow-dplyr-queries-10-0-0">Arrow dplyr queries<a class="anchor" 
aria-label="anchor" href="#arrow-dplyr-queries-10-0-0"></a></h3>
 <p>Several new functions can be used in queries:</p>
 <ul><li>
-<code><a href="https://dplyr.tidyverse.org/reference/across.html"; 
class="external-link">dplyr::across()</a></code> can be used to apply the same 
computation across multiple columns, and the <code>where()</code> selection 
helper is supported in <code><a 
href="https://dplyr.tidyverse.org/reference/across.html"; 
class="external-link">across()</a></code>;</li>
+<code><a href="https://dplyr.tidyverse.org/reference/across.html"; 
class="external-link">dplyr::across()</a></code> can be used to apply the same 
computation across multiple columns, and the <code>where()</code> selection 
helper is supported in <code>across()</code>;</li>
 <li>
 <code><a href="../reference/add_filename.html">add_filename()</a></code> can 
be used to get the filename a row came from (only available when querying 
<code><a href="../reference/Dataset.html">?Dataset</a></code>);</li>
 <li>Added five functions in the <code>slice_*</code> family: <code><a 
href="https://dplyr.tidyverse.org/reference/slice.html"; 
class="external-link">dplyr::slice_min()</a></code>, <code><a 
href="https://dplyr.tidyverse.org/reference/slice.html"; 
class="external-link">dplyr::slice_max()</a></code>, <code><a 
href="https://dplyr.tidyverse.org/reference/slice.html"; 
class="external-link">dplyr::slice_head()</a></code>, <code><a 
href="https://dplyr.tidyverse.org/reference/slice.html"; class="exte [...]
@@ -128,63 +208,68 @@
 <h2 class="pkg-version" data-toc-text="9.0.0" id="arrow-900">arrow 9.0.0<a 
class="anchor" aria-label="anchor" href="#arrow-900"></a></h2><p 
class="text-muted">CRAN release: 2022-08-10</p>
 <div class="section level3">
 <h3 id="arrow-dplyr-queries-9-0-0">Arrow dplyr queries<a class="anchor" 
aria-label="anchor" href="#arrow-dplyr-queries-9-0-0"></a></h3>
-<ul><li>New dplyr verbs:<ul><li>
-<code><a href="https://generics.r-lib.org/reference/setops.html"; 
class="external-link">dplyr::union</a></code> and <code><a 
href="https://dplyr.tidyverse.org/reference/setops.html"; 
class="external-link">dplyr::union_all</a></code> (<a 
href="https://issues.apache.org/jira/browse/ARROW-15622"; 
class="external-link">ARROW-15622</a>)</li>
+<ul><li>New dplyr verbs:
+<ul><li>
+<code><a href="https://generics.r-lib.org/reference/setops.html"; 
class="external-link">dplyr::union</a></code> and <code><a 
href="https://dplyr.tidyverse.org/reference/setops.html"; 
class="external-link">dplyr::union_all</a></code> (<a 
href="https://github.com/apache/arrow/issues/13090"; 
class="external-link">#13090</a>)</li>
 <li>
-<code><a href="https://pillar.r-lib.org/reference/glimpse.html"; 
class="external-link">dplyr::glimpse</a></code> (<a 
href="https://issues.apache.org/jira/browse/ARROW-16776"; 
class="external-link">ARROW-16776</a>)</li>
+<code><a href="https://pillar.r-lib.org/reference/glimpse.html"; 
class="external-link">dplyr::glimpse</a></code> (<a 
href="https://github.com/apache/arrow/issues/13563"; 
class="external-link">#13563</a>)</li>
 <li>
-<code><a href="../reference/show_exec_plan.html">show_exec_plan()</a></code> 
can be added to the end of a dplyr pipeline to show the underlying plan, 
similar to <code><a href="https://dplyr.tidyverse.org/reference/explain.html"; 
class="external-link">dplyr::show_query()</a></code>. <code><a 
href="https://dplyr.tidyverse.org/reference/explain.html"; 
class="external-link">dplyr::show_query()</a></code> and <code><a 
href="https://dplyr.tidyverse.org/reference/explain.html"; class="external-lin 
[...]
+<code><a href="../reference/show_exec_plan.html">show_exec_plan()</a></code> 
can be added to the end of a dplyr pipeline to show the underlying plan, 
similar to <code><a href="https://dplyr.tidyverse.org/reference/explain.html"; 
class="external-link">dplyr::show_query()</a></code>. <code><a 
href="https://dplyr.tidyverse.org/reference/explain.html"; 
class="external-link">dplyr::show_query()</a></code> and <code><a 
href="https://dplyr.tidyverse.org/reference/explain.html"; class="external-lin 
[...]
 </ul></li>
-<li>User-defined functions are supported in queries. Use <code><a 
href="../reference/register_scalar_function.html">register_scalar_function()</a></code>
 to create them. (<a href="https://issues.apache.org/jira/browse/ARROW-16444"; 
class="external-link">ARROW-16444</a>)</li>
+<li>User-defined functions are supported in queries. Use <code><a 
href="../reference/register_scalar_function.html">register_scalar_function()</a></code>
 to create them. (<a href="https://github.com/apache/arrow/issues/13397"; 
class="external-link">#13397</a>)</li>
 <li>
-<code><a href="../reference/map_batches.html">map_batches()</a></code> returns 
a <code>RecordBatchReader</code> and requires that the function it maps returns 
something coercible to a <code>RecordBatch</code> through the <code><a 
href="../reference/as_record_batch.html">as_record_batch()</a></code> S3 
function. It can also run in streaming fashion if passed <code>.lazy = 
TRUE</code>. (<a href="https://issues.apache.org/jira/browse/ARROW-15271"; 
class="external-link">ARROW-15271</a>, <a hr [...]
-<li>Functions can be called with package namespace prefixes (e.g. 
<code>stringr::</code>, <code>lubridate::</code>) within queries. For example, 
<code><a href="https://stringr.tidyverse.org/reference/str_length.html"; 
class="external-link">stringr::str_length</a></code> will now dispatch to the 
same kernel as <code>str_length</code>. (<a 
href="https://issues.apache.org/jira/browse/ARROW-14575"; 
class="external-link">ARROW-14575</a>)</li>
-<li>Support for new functions:<ul><li>
-<code><a href="https://lubridate.tidyverse.org/reference/parse_date_time.html"; 
class="external-link">lubridate::parse_date_time()</a></code> datetime parser: 
(<a href="https://issues.apache.org/jira/browse/ARROW-14848"; 
class="external-link">ARROW-14848</a>, <a 
href="https://issues.apache.org/jira/browse/ARROW-16407"; 
class="external-link">ARROW-16407</a>)<ul><li>
+<code><a href="../reference/map_batches.html">map_batches()</a></code> returns 
a <code>RecordBatchReader</code> and requires that the function it maps returns 
something coercible to a <code>RecordBatch</code> through the <code><a 
href="../reference/as_record_batch.html">as_record_batch()</a></code> S3 
function. It can also run in streaming fashion if passed <code>.lazy = 
TRUE</code>. (<a href="https://github.com/apache/arrow/issues/13170"; 
class="external-link">#13170</a>, <a href="https: [...]
+<li>Functions can be called with package namespace prefixes (e.g. 
<code>stringr::</code>, <code>lubridate::</code>) within queries. For example, 
<code><a href="https://stringr.tidyverse.org/reference/str_length.html"; 
class="external-link">stringr::str_length</a></code> will now dispatch to the 
same kernel as <code>str_length</code>. (<a 
href="https://github.com/apache/arrow/issues/13160"; 
class="external-link">#13160</a>)</li>
+<li>Support for new functions:
+<ul><li>
+<code><a href="https://lubridate.tidyverse.org/reference/parse_date_time.html"; 
class="external-link">lubridate::parse_date_time()</a></code> datetime parser: 
(<a href="https://github.com/apache/arrow/issues/12589"; 
class="external-link">#12589</a>, <a 
href="https://github.com/apache/arrow/issues/13196"; 
class="external-link">#13196</a>, <a 
href="https://github.com/apache/arrow/issues/13506"; 
class="external-link">#13506</a>)
+<ul><li>
 <code>orders</code> with year, month, day, hours, minutes, and seconds 
components are supported.</li>
 <li>the <code>orders</code> argument in the Arrow binding works as follows: 
<code>orders</code> are transformed into <code>formats</code> which 
subsequently get applied in turn. There is no <code>select_formats</code> 
parameter and no inference takes place (unlike in <code><a 
href="https://lubridate.tidyverse.org/reference/parse_date_time.html"; 
class="external-link">lubridate::parse_date_time()</a></code>).</li>
 </ul></li>
 <li>
-<code>lubridate</code> date and datetime parsers such as <code><a 
href="https://lubridate.tidyverse.org/reference/ymd.html"; 
class="external-link">lubridate::ymd()</a></code>, <code><a 
href="https://lubridate.tidyverse.org/reference/ymd.html"; 
class="external-link">lubridate::yq()</a></code>, and <code><a 
href="https://lubridate.tidyverse.org/reference/ymd_hms.html"; 
class="external-link">lubridate::ymd_hms()</a></code> (<a 
href="https://issues.apache.org/jira/browse/ARROW-16394"; class="ext [...]
+<code>lubridate</code> date and datetime parsers such as <code><a 
href="https://lubridate.tidyverse.org/reference/ymd.html"; 
class="external-link">lubridate::ymd()</a></code>, <code><a 
href="https://lubridate.tidyverse.org/reference/ymd.html"; 
class="external-link">lubridate::yq()</a></code>, and <code><a 
href="https://lubridate.tidyverse.org/reference/ymd_hms.html"; 
class="external-link">lubridate::ymd_hms()</a></code> (<a 
href="https://github.com/apache/arrow/issues/13118"; class="external [...]
 <li>
-<code><a href="https://lubridate.tidyverse.org/reference/parse_date_time.html"; 
class="external-link">lubridate::fast_strptime()</a></code> (<a 
href="https://issues.apache.org/jira/browse/ARROW-16439"; 
class="external-link">ARROW-16439</a>)</li>
+<code><a href="https://lubridate.tidyverse.org/reference/parse_date_time.html"; 
class="external-link">lubridate::fast_strptime()</a></code> (<a 
href="https://github.com/apache/arrow/issues/13174"; 
class="external-link">#13174</a>)</li>
 <li>
-<code><a href="https://lubridate.tidyverse.org/reference/round_date.html"; 
class="external-link">lubridate::floor_date()</a></code>, <code><a 
href="https://lubridate.tidyverse.org/reference/round_date.html"; 
class="external-link">lubridate::ceiling_date()</a></code>, and <code><a 
href="https://lubridate.tidyverse.org/reference/round_date.html"; 
class="external-link">lubridate::round_date()</a></code> (<a 
href="https://issues.apache.org/jira/browse/ARROW-14821"; 
class="external-link">ARROW-14 [...]
+<code><a href="https://lubridate.tidyverse.org/reference/round_date.html"; 
class="external-link">lubridate::floor_date()</a></code>, <code><a 
href="https://lubridate.tidyverse.org/reference/round_date.html"; 
class="external-link">lubridate::ceiling_date()</a></code>, and <code><a 
href="https://lubridate.tidyverse.org/reference/round_date.html"; 
class="external-link">lubridate::round_date()</a></code> (<a 
href="https://github.com/apache/arrow/issues/12154"; 
class="external-link">#12154</a>)</li>
 <li>
-<code><a href="https://rdrr.io/r/base/strptime.html"; 
class="external-link">strptime()</a></code> supports the <code>tz</code> 
argument to pass timezones. (<a 
href="https://issues.apache.org/jira/browse/ARROW-16415"; 
class="external-link">ARROW-16415</a>)</li>
+<code><a href="https://rdrr.io/r/base/strptime.html"; 
class="external-link">strptime()</a></code> supports the <code>tz</code> 
argument to pass timezones. (<a 
href="https://github.com/apache/arrow/issues/13190"; 
class="external-link">#13190</a>)</li>
 <li>
 <code><a href="https://lubridate.tidyverse.org/reference/day.html"; 
class="external-link">lubridate::qday()</a></code> (day of quarter)</li>
 <li>
-<code><a href="https://rdrr.io/r/base/Log.html"; 
class="external-link">exp()</a></code> and <code><a 
href="https://rdrr.io/r/base/MathFun.html"; 
class="external-link">sqrt()</a></code>. (<a 
href="https://issues.apache.org/jira/browse/ARROW-16871"; 
class="external-link">ARROW-16871</a>)</li>
+<code><a href="https://rdrr.io/r/base/Log.html"; 
class="external-link">exp()</a></code> and <code><a 
href="https://rdrr.io/r/base/MathFun.html"; 
class="external-link">sqrt()</a></code>. (<a 
href="https://github.com/apache/arrow/issues/13517"; 
class="external-link">#13517</a>)</li>
 </ul></li>
-<li>Bugfixes:<ul><li>Count distinct now gives correct result across multiple 
row groups. (<a href="https://issues.apache.org/jira/browse/ARROW-16807"; 
class="external-link">ARROW-16807</a>)</li>
-<li>Aggregations over partition columns return correct results. (<a 
href="https://issues.apache.org/jira/browse/ARROW-16700"; 
class="external-link">ARROW-16700</a>)</li>
+<li>Bugfixes:
+<ul><li>Count distinct now gives correct result across multiple row groups. 
(<a href="https://github.com/apache/arrow/issues/13583"; 
class="external-link">#13583</a>)</li>
+<li>Aggregations over partition columns return correct results. (<a 
href="https://github.com/apache/arrow/issues/13518"; 
class="external-link">#13518</a>)</li>
 </ul></li>
 </ul></div>
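The user-defined-function support above can be sketched as follows; a hedged example assuming arrow 9.0.0+ and dplyr, where the function name `times_two` and the query are invented for illustration. `auto_convert = TRUE` converts Arrow inputs to R vectors before the function runs.

```r
library(arrow)
library(dplyr)

# Register a scalar UDF so it can be called inside Arrow dplyr queries.
register_scalar_function(
  "times_two",
  function(context, x) x * 2L,
  in_type = schema(x = int32()),
  out_type = int32(),
  auto_convert = TRUE
)

res <- arrow_table(x = 1:3) %>%
  mutate(y = times_two(x)) %>%
  collect()
```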
 <div class="section level3">
 <h3 id="reading-and-writing-9-0-0">Reading and writing<a class="anchor" 
aria-label="anchor" href="#reading-and-writing-9-0-0"></a></h3>
 <ul><li>New functions <code><a 
href="../reference/read_feather.html">read_ipc_file()</a></code> and <code><a 
href="../reference/write_feather.html">write_ipc_file()</a></code> are added. 
These functions are almost the same as <code><a 
href="../reference/read_feather.html">read_feather()</a></code> and <code><a 
href="../reference/write_feather.html">write_feather()</a></code>, but differ 
in that they only target IPC files (Feather V2 files), not Feather V1 
files.</li>
 <li>
-<code>read_arrow()</code> and <code>write_arrow()</code>, deprecated since 
1.0.0 (July 2020), have been removed. Instead of these, use the <code><a 
href="../reference/read_feather.html">read_ipc_file()</a></code> and <code><a 
href="../reference/write_feather.html">write_ipc_file()</a></code> for IPC 
files, or, <code><a 
href="../reference/read_ipc_stream.html">read_ipc_stream()</a></code> and 
<code><a 
href="../reference/write_ipc_stream.html">write_ipc_stream()</a></code> for IPC 
streams. [...]
+<code>read_arrow()</code> and <code>write_arrow()</code>, deprecated since 
1.0.0 (July 2020), have been removed. Instead of these, use the <code><a 
href="../reference/read_feather.html">read_ipc_file()</a></code> and <code><a 
href="../reference/write_feather.html">write_ipc_file()</a></code> for IPC 
files, or <code><a 
href="../reference/read_ipc_stream.html">read_ipc_stream()</a></code> and 
<code><a 
href="../reference/write_ipc_stream.html">write_ipc_stream()</a></code> for IPC 
streams. [...]
 <li>
-<code><a href="../reference/write_parquet.html">write_parquet()</a></code> now 
defaults to writing Parquet format version 2.4 (was 1.0). Previously deprecated 
arguments <code>properties</code> and <code>arrow_properties</code> have been 
removed; if you need to deal with these lower-level properties objects 
directly, use <code>ParquetFileWriter</code>, which <code><a 
href="../reference/write_parquet.html">write_parquet()</a></code> wraps. (<a 
href="https://issues.apache.org/jira/browse/AR [...]
-<li>UnionDatasets can unify schemas of multiple InMemoryDatasets with varying 
schemas. (<a href="https://issues.apache.org/jira/browse/ARROW-16085"; 
class="external-link">ARROW-16085</a>)</li>
+<code><a href="../reference/write_parquet.html">write_parquet()</a></code> now 
defaults to writing Parquet format version 2.4 (was 1.0). Previously deprecated 
arguments <code>properties</code> and <code>arrow_properties</code> have been 
removed; if you need to deal with these lower-level properties objects 
directly, use <code>ParquetFileWriter</code>, which <code><a 
href="../reference/write_parquet.html">write_parquet()</a></code> wraps. (<a 
href="https://github.com/apache/arrow/issues/1 [...]
+<li>UnionDatasets can unify schemas of multiple InMemoryDatasets with varying 
schemas. (<a href="https://github.com/apache/arrow/issues/13088"; 
class="external-link">#13088</a>)</li>
 <li>
-<code><a href="../reference/write_dataset.html">write_dataset()</a></code> 
preserves all schema metadata again. In 8.0.0, it would drop most metadata, 
breaking packages such as sfarrow. (<a 
href="https://issues.apache.org/jira/browse/ARROW-16511"; 
class="external-link">ARROW-16511</a>)</li>
-<li>Reading and writing functions (such as <code><a 
href="../reference/write_csv_arrow.html">write_csv_arrow()</a></code>) will 
automatically (de-)compress data if the file path contains a compression 
extension (e.g. <code>"data.csv.gz"</code>). This works locally as well as on 
remote filesystems like S3 and GCS. (<a 
href="https://issues.apache.org/jira/browse/ARROW-16144"; 
class="external-link">ARROW-16144</a>)</li>
+<code><a href="../reference/write_dataset.html">write_dataset()</a></code> 
preserves all schema metadata again. In 8.0.0, it would drop most metadata, 
breaking packages such as sfarrow. (<a 
href="https://github.com/apache/arrow/issues/13105"; 
class="external-link">#13105</a>)</li>
+<li>Reading and writing functions (such as <code><a 
href="../reference/write_csv_arrow.html">write_csv_arrow()</a></code>) will 
automatically (de-)compress data if the file path contains a compression 
extension (e.g. <code>"data.csv.gz"</code>). This works locally as well as on 
remote filesystems like S3 and GCS. (<a 
href="https://github.com/apache/arrow/issues/13183"; 
class="external-link">#13183</a>)</li>
 <li>
-<code>FileSystemFactoryOptions</code> can be provided to <code><a 
href="../reference/open_dataset.html">open_dataset()</a></code>, allowing you 
to pass options such as which file prefixes to ignore. (<a 
href="https://issues.apache.org/jira/browse/ARROW-15280"; 
class="external-link">ARROW-15280</a>)</li>
-<li>By default, <code>S3FileSystem</code> will not create or delete buckets. 
To enable that, pass the configuration option 
<code>allow_bucket_creation</code> or <code>allow_bucket_deletion</code>. (<a 
href="https://issues.apache.org/jira/browse/ARROW-15906"; 
class="external-link">ARROW-15906</a>)</li>
+<code>FileSystemFactoryOptions</code> can be provided to <code><a 
href="../reference/open_dataset.html">open_dataset()</a></code>, allowing you 
to pass options such as which file prefixes to ignore. (<a 
href="https://github.com/apache/arrow/issues/13171"; 
class="external-link">#13171</a>)</li>
+<li>By default, <code>S3FileSystem</code> will not create or delete buckets. 
To enable that, pass the configuration option 
<code>allow_bucket_creation</code> or <code>allow_bucket_deletion</code>. (<a 
href="https://github.com/apache/arrow/issues/13206"; 
class="external-link">#13206</a>)</li>
 <li>
-<code>GcsFileSystem</code> and <code><a 
href="../reference/gs_bucket.html">gs_bucket()</a></code> allow connecting to 
Google Cloud Storage. (<a 
href="https://issues.apache.org/jira/browse/ARROW-13404"; 
class="external-link">ARROW-13404</a>, <a 
href="https://issues.apache.org/jira/browse/ARROW-16887"; 
class="external-link">ARROW-16887</a>)</li>
+<code>GcsFileSystem</code> and <code><a 
href="../reference/gs_bucket.html">gs_bucket()</a></code> allow connecting to 
Google Cloud Storage. (<a href="https://github.com/apache/arrow/issues/10999"; 
class="external-link">#10999</a>, <a 
href="https://github.com/apache/arrow/issues/13601"; 
class="external-link">#13601</a>)</li>
 </ul></div>
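The automatic (de)compression described above needs no extra arguments; the file extension alone drives it. A minimal sketch, assuming the arrow package (>= 9.0.0) is installed and using a hypothetical local path:

```r
library(arrow)

df <- data.frame(x = 1:5, y = letters[1:5])

# The ".gz" extension triggers gzip compression on write...
write_csv_arrow(df, "data.csv.gz")

# ...and transparent decompression on read.
read_csv_arrow("data.csv.gz")
```

The same pattern applies on remote filesystems such as S3 or GCS, since the extension is inspected regardless of where the path points.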
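The filesystem options above can be sketched as follows. This is not a definitive recipe: the bucket name is hypothetical, credentials are assumed to be configured in the environment, and the argument names are taken from the release notes.

```r
library(arrow)

# S3FileSystem will not create or delete buckets unless explicitly allowed.
fs <- S3FileSystem$create(allow_bucket_creation = TRUE)

# gs_bucket() connects to Google Cloud Storage; anonymous access is a
# sketch suitable only for publicly readable buckets.
bucket <- gs_bucket("my-public-bucket", anonymous = TRUE)
ds <- open_dataset(bucket)
```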
 <div class="section level3">
 <h3 id="arrays-and-tables-9-0-0">Arrays and tables<a class="anchor" 
aria-label="anchor" href="#arrays-and-tables-9-0-0"></a></h3>
-<ul><li>Table and RecordBatch <code>$num_rows()</code> method returns a double 
(previously integer), avoiding integer overflow on larger tables. (<a 
href="https://issues.apache.org/jira/browse/ARROW-14989"; 
class="external-link">ARROW-14989</a>, <a 
href="https://issues.apache.org/jira/browse/ARROW-16977"; 
class="external-link">ARROW-16977</a>)</li></ul></div>
+<ul><li>Table and RecordBatch <code>$num_rows()</code> method returns a double 
(previously integer), avoiding integer overflow on larger tables. (<a 
href="https://github.com/apache/arrow/issues/13482"; 
class="external-link">#13482</a>, <a 
href="https://github.com/apache/arrow/issues/13514"; 
class="external-link">#13514</a>)</li>
+</ul></div>
 <div class="section level3">
 <h3 id="packaging-9-0-0">Packaging<a class="anchor" aria-label="anchor" 
href="#packaging-9-0-0"></a></h3>
 <ul><li>The <code>arrow.dev_repo</code> for nightly builds of the R package 
and prebuilt libarrow binaries is now <a 
href="https://nightlies.apache.org/arrow/r/"; class="external-link 
uri">https://nightlies.apache.org/arrow/r/</a>.</li>
-<li>Brotli and BZ2 are shipped with MacOS binaries. BZ2 is shipped with 
Windows binaries. (<a href="https://issues.apache.org/jira/browse/ARROW-16828"; 
class="external-link">ARROW-16828</a>)</li>
+<li>Brotli and BZ2 are shipped with macOS binaries. BZ2 is shipped with 
Windows binaries. (<a href="https://github.com/apache/arrow/issues/13484"; 
class="external-link">#13484</a>)</li>
 </ul></div>
 </div>
     <div class="section level2">
@@ -192,10 +277,12 @@
 <div class="section level3">
 <h3 id="enhancements-to-dplyr-and-datasets-8-0-0">Enhancements to dplyr and 
datasets<a class="anchor" aria-label="anchor" 
href="#enhancements-to-dplyr-and-datasets-8-0-0"></a></h3>
 <ul><li>
-<code><a 
href="../reference/open_dataset.html">open_dataset()</a></code>:<ul><li>correctly
 supports the <code>skip</code> argument for skipping header rows in CSV 
datasets.</li>
+<code><a href="../reference/open_dataset.html">open_dataset()</a></code>:
+<ul><li>correctly supports the <code>skip</code> argument for skipping header 
rows in CSV datasets.</li>
 <li>can take a list of datasets with differing schemas and attempt to unify 
the schemas to produce a <code>UnionDataset</code>.</li>
 </ul></li>
-<li>Arrow <a href="https://dplyr.tidyverse.org"; 
class="external-link">dplyr</a> queries:<ul><li>are supported on 
<code>RecordBatchReader</code>. This allows, for example, results from DuckDB 
to be streamed back into Arrow rather than materialized before continuing the 
pipeline.</li>
+<li>Arrow <a href="https://dplyr.tidyverse.org"; 
class="external-link">dplyr</a> queries:
+<ul><li>are supported on <code>RecordBatchReader</code>. This allows, for 
example, results from DuckDB to be streamed back into Arrow rather than 
materialized before continuing the pipeline.</li>
 <li>no longer need to materialize the entire result table before writing to a 
dataset if the query contains aggregations or joins.</li>
 <li>supports <code><a href="https://dplyr.tidyverse.org/reference/rename.html"; 
class="external-link">dplyr::rename_with()</a></code>.</li>
 <li>
@@ -216,7 +303,9 @@
 <h3 id="enhancements-to-date-and-time-support-8-0-0">Enhancements to date and 
time support<a class="anchor" aria-label="anchor" 
href="#enhancements-to-date-and-time-support-8-0-0"></a></h3>
 <ul><li>
 <code><a 
href="../reference/read_delim_arrow.html">read_csv_arrow()</a></code>’s 
readr-style type <code>T</code> is mapped to <code>timestamp(unit = 
"ns")</code> instead of <code>timestamp(unit = "s")</code>.</li>
-<li>For Arrow dplyr queries, added additional <a 
href="https://lubridate.tidyverse.org"; class="external-link">lubridate</a> 
features and fixes:<ul><li>New component extraction functions:<ul><li>
+<li>For Arrow dplyr queries, added additional <a 
href="https://lubridate.tidyverse.org"; class="external-link">lubridate</a> 
features and fixes:
+<ul><li>New component extraction functions:
+<ul><li>
 <code><a href="https://lubridate.tidyverse.org/reference/tz.html"; 
class="external-link">lubridate::tz()</a></code> (timezone),</li>
 <li>
 <code><a href="https://lubridate.tidyverse.org/reference/quarter.html"; 
class="external-link">lubridate::semester()</a></code>,</li>
@@ -243,7 +332,8 @@
 <code><a href="https://lubridate.tidyverse.org/reference/as_date.html"; 
class="external-link">lubridate::as_date()</a></code> and <code><a 
href="https://lubridate.tidyverse.org/reference/as_date.html"; 
class="external-link">lubridate::as_datetime()</a></code>
 </li>
 </ul></li>
-<li>Also for Arrow dplyr queries, added support and fixes for base date and 
time functions:<ul><li>
+<li>Also for Arrow dplyr queries, added support and fixes for base date and 
time functions:
+<ul><li>
 <code><a href="https://rdrr.io/r/base/difftime.html"; 
class="external-link">base::difftime</a></code> and <code><a 
href="https://rdrr.io/r/base/difftime.html"; 
class="external-link">base::as.difftime()</a></code>
 </li>
 <li>
@@ -275,7 +365,8 @@
 <li>Math group generics are implemented for ArrowDatum. This means you can use 
base functions like <code><a href="https://rdrr.io/r/base/MathFun.html"; 
class="external-link">sqrt()</a></code>, <code><a 
href="https://rdrr.io/r/base/Log.html"; class="external-link">log()</a></code>, 
and <code><a href="https://rdrr.io/r/base/Log.html"; 
class="external-link">exp()</a></code> with Arrow arrays and scalars.</li>
 <li>
 <code>read_*</code> and <code>write_*</code> functions support R Connection 
objects for reading and writing files.</li>
-<li>Parquet improvements:<ul><li>Parquet writer supports Duration type 
columns.</li>
+<li>Parquet improvements:
+<ul><li>Parquet writer supports Duration type columns.</li>
 <li>The dataset Parquet reader consumes less memory.</li>
 </ul></li>
 <li>
@@ -298,9 +389,9 @@
 <div class="section level3">
 <h3 id="enhancements-to-dplyr-and-datasets-7-0-0">Enhancements to dplyr and 
datasets<a class="anchor" aria-label="anchor" 
href="#enhancements-to-dplyr-and-datasets-7-0-0"></a></h3>
 <ul><li>Additional <a href="https://lubridate.tidyverse.org"; 
class="external-link">lubridate</a> features: <code>week()</code>, more of the 
<code>is.*()</code> functions, and the label argument to <code>month()</code> 
have been implemented.</li>
-<li>More complex expressions inside <code><a 
href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">summarize()</a></code>, such as <code>ifelse(n() &gt; 1, 
mean(y), mean(z))</code>, are supported.</li>
+<li>More complex expressions inside <code>summarize()</code>, such as 
<code>ifelse(n() &gt; 1, mean(y), mean(z))</code>, are supported.</li>
 <li>When adding columns in a dplyr pipeline, one can now use 
<code>tibble</code> and <code>data.frame</code> to create columns of tibbles or 
data.frames respectively (e.g. <code>... %&gt;% mutate(df_col = tibble(a, b)) 
%&gt;% ...</code>).</li>
-<li>Dictionary columns (R <code>factor</code> type) are supported inside of 
<code><a href="https://dplyr.tidyverse.org/reference/coalesce.html"; 
class="external-link">coalesce()</a></code>.</li>
+<li>Dictionary columns (R <code>factor</code> type) are supported inside of 
<code>coalesce()</code>.</li>
 <li>
 <code><a href="../reference/open_dataset.html">open_dataset()</a></code> 
accepts the <code>partitioning</code> argument when reading Hive-style 
partitioned files, even though it is not required.</li>
 <li>The experimental <code><a 
href="../reference/map_batches.html">map_batches()</a></code> function for 
custom operations on dataset has been restored.</li>
@@ -315,7 +406,7 @@
 <code><a href="https://rdrr.io/r/utils/head.html"; 
class="external-link">head()</a></code> no longer hangs on large CSV 
datasets.</li>
 <li>There is an improved error message when there is a conflict between a 
header in the file and schema/column names provided as arguments.</li>
 <li>
-<code><a href="../reference/write_csv_arrow.html">write_csv_arrow()</a></code> 
now follows the signature of <code>readr::write_csv()</code>.</li>
+<code><a href="../reference/write_csv_arrow.html">write_csv_arrow()</a></code> 
now follows the signature of <code><a 
href="https://readr.tidyverse.org/reference/write_delim.html"; 
class="external-link">readr::write_csv()</a></code>.</li>
 </ul></div>
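The tibble-column support mentioned above (creating struct columns inside a dplyr pipeline) can be sketched like this; a minimal example assuming arrow and dplyr are installed, with `arrow_table()` used as a convenience constructor:

```r
library(arrow)
library(dplyr)

arrow_table(a = 1:3, b = c("x", "y", "z")) %>%
  # tibble() inside mutate() creates a struct column whose fields
  # come from the existing columns a and b.
  mutate(pair = tibble(a, b)) %>%
  collect()
```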
 <div class="section level3">
 <h3 id="other-improvements-and-fixes-7-0-0">Other improvements and fixes<a 
class="anchor" aria-label="anchor" 
href="#other-improvements-and-fixes-7-0-0"></a></h3>
@@ -352,7 +443,7 @@
 <h2 class="pkg-version" data-toc-text="6.0.1" id="arrow-601">arrow 6.0.1<a 
class="anchor" aria-label="anchor" href="#arrow-601"></a></h2><p 
class="text-muted">CRAN release: 2021-11-20</p>
 <ul><li>Joins now support inclusion of dictionary columns, and multiple 
crashes have been fixed</li>
 <li>Grouped aggregation no longer crashes when working on data that has been 
filtered down to 0 rows</li>
-<li>Bindings added for <code><a 
href="https://stringr.tidyverse.org/reference/str_count.html"; 
class="external-link">str_count()</a></code> in dplyr queries</li>
+<li>Bindings added for <code>str_count()</code> in dplyr queries</li>
 <li>Work around a critical bug in the AWS SDK for C++ that could affect S3 
multipart upload</li>
 <li>A UBSAN warning in the round kernel has been resolved</li>
 <li>Fixes for build failures on Solaris and on old versions of macOS</li>
@@ -361,27 +452,27 @@
 <h2 class="pkg-version" data-toc-text="6.0.0" id="arrow-600">arrow 6.0.0<a 
class="anchor" aria-label="anchor" href="#arrow-600"></a></h2>
 <p>There are now two ways to query Arrow data:</p>
 <div class="section level3">
-<h3 id="1-expanded-arrow-native-queries-aggregation-and-joins-6-0-0">1. 
Expanded Arrow-native queries: aggregation and joins<a class="anchor" 
aria-label="anchor" 
href="#1-expanded-arrow-native-queries-aggregation-and-joins-6-0-0"></a></h3>
-<p><code><a href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">dplyr::summarize()</a></code>, both grouped and 
ungrouped, is now implemented for Arrow Datasets, Tables, and RecordBatches. 
Because data is scanned in chunks, you can aggregate over larger-than-memory 
datasets backed by many files. Supported aggregation functions include <code><a 
href="https://dplyr.tidyverse.org/reference/context.html"; 
class="external-link">n()</a></code>, <code><a href="https [...]
-<p>Along with <code><a 
href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">summarize()</a></code>, you can also call <code><a 
href="https://dplyr.tidyverse.org/reference/count.html"; 
class="external-link">count()</a></code>, <code><a 
href="https://dplyr.tidyverse.org/reference/count.html"; 
class="external-link">tally()</a></code>, and <code><a 
href="https://dplyr.tidyverse.org/reference/distinct.html"; 
class="external-link">distinct()</a></code>, which effectiv [...]
-<p>This enhancement does change the behavior of <code><a 
href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">summarize()</a></code> and <code><a 
href="https://dplyr.tidyverse.org/reference/compute.html"; 
class="external-link">collect()</a></code> in some cases: see “Breaking 
changes” below for details.</p>
-<p>In addition to <code><a 
href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">summarize()</a></code>, mutating and filtering equality 
joins (<code><a href="https://dplyr.tidyverse.org/reference/mutate-joins.html"; 
class="external-link">inner_join()</a></code>, <code><a 
href="https://dplyr.tidyverse.org/reference/mutate-joins.html"; 
class="external-link">left_join()</a></code>, <code><a 
href="https://dplyr.tidyverse.org/reference/mutate-joins.html"; class="exte [...]
+<h3 id="id_1-expanded-arrow-native-queries-aggregation-and-joins-6-0-0">1. 
Expanded Arrow-native queries: aggregation and joins<a class="anchor" 
aria-label="anchor" 
href="#id_1-expanded-arrow-native-queries-aggregation-and-joins-6-0-0"></a></h3>
+<p><code><a href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">dplyr::summarize()</a></code>, both grouped and 
ungrouped, is now implemented for Arrow Datasets, Tables, and RecordBatches. 
Because data is scanned in chunks, you can aggregate over larger-than-memory 
datasets backed by many files. Supported aggregation functions include 
<code>n()</code>, <code>n_distinct()</code>, <code>min()</code>, <code><a 
href="https://rdrr.io/r/base/Extremes.html"; class=" [...]
+<p>Along with <code>summarize()</code>, you can also call 
<code>count()</code>, <code>tally()</code>, and <code>distinct()</code>, which 
effectively wrap <code>summarize()</code>.</p>
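The chunked aggregation described above looks like ordinary dplyr code; a minimal sketch, where the dataset path and the `year`/`value` columns are placeholders:

```r
library(arrow)
library(dplyr)

# Aggregate a (potentially larger-than-memory) multi-file dataset.
# Data is scanned in chunks, so the full dataset never needs to fit in RAM.
open_dataset("path/to/dataset") %>%
  group_by(year) %>%
  summarize(n = n(), avg = mean(value, na.rm = TRUE)) %>%
  collect()
```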
+<p>This enhancement does change the behavior of <code>summarize()</code> and 
<code>collect()</code> in some cases: see “Breaking changes” below for 
details.</p>
+<p>In addition to <code>summarize()</code>, mutating and filtering equality 
joins (<code>inner_join()</code>, <code>left_join()</code>, 
<code>right_join()</code>, <code>full_join()</code>, <code>semi_join()</code>, 
and <code>anti_join()</code>) are also supported natively in Arrow.</p>
 <p>Grouped aggregation and (especially) joins should be considered somewhat 
experimental in this release. We expect them to work, but they may not be well 
optimized for all workloads. To help us focus our efforts on improving them in 
the next release, please let us know if you encounter unexpected behavior or 
poor performance.</p>
-<p>New non-aggregating compute functions include string functions like 
<code><a href="https://stringr.tidyverse.org/reference/case.html"; 
class="external-link">str_to_title()</a></code> and <code><a 
href="https://rdrr.io/r/base/strptime.html"; 
class="external-link">strftime()</a></code> as well as compute functions for 
extracting date parts (e.g. <code>year()</code>, <code>month()</code>) from 
dates. This is not a complete list of additional compute functions; for an 
exhaustive list of ava [...]
+<p>New non-aggregating compute functions include string functions like 
<code>str_to_title()</code> and <code><a 
href="https://rdrr.io/r/base/strptime.html"; 
class="external-link">strftime()</a></code> as well as compute functions for 
extracting date parts (e.g. <code>year()</code>, <code>month()</code>) from 
dates. This is not a complete list of additional compute functions; for an 
exhaustive list of available compute functions see <code><a 
href="../reference/list_compute_functions.html"> [...]
 <p>We’ve also worked to fill in support for all data types, such as 
<code>Decimal</code>, for functions added in previous releases. All type 
limitations mentioned in previous release notes should be no longer valid, and 
if you find a function that is not implemented for a certain data type, please 
<a href="https://issues.apache.org/jira/projects/ARROW/issues"; 
class="external-link">report an issue</a>.</p>
 </div>
 <div class="section level3">
-<h3 id="2-duckdb-integration-6-0-0">2. DuckDB integration<a class="anchor" 
aria-label="anchor" href="#2-duckdb-integration-6-0-0"></a></h3>
+<h3 id="id_2-duckdb-integration-6-0-0">2. DuckDB integration<a class="anchor" 
aria-label="anchor" href="#id_2-duckdb-integration-6-0-0"></a></h3>
 <p>If you have the <a href="https://CRAN.R-project.org/package=duckdb"; 
class="external-link">duckdb package</a> installed, you can hand off an Arrow 
Dataset or query object to <a href="https://duckdb.org/"; 
class="external-link">DuckDB</a> for further querying using the <code><a 
href="../reference/to_duckdb.html">to_duckdb()</a></code> function. This allows 
you to use duckdb’s <code>dbplyr</code> methods, as well as its SQL interface, 
to aggregate data. Filtering and column projection don [...]
 <p>You can also take a duckdb <code>tbl</code> and call <code><a 
href="../reference/to_arrow.html">to_arrow()</a></code> to stream data to 
Arrow’s query engine. This means that in a single dplyr pipeline, you could 
start with an Arrow Dataset, evaluate some steps in DuckDB, then evaluate the 
rest in Arrow.</p>
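The round trip described above can be sketched in a single pipeline. This is a hedged example: the dataset path and column names are hypothetical, and it assumes the duckdb and dbplyr packages are installed alongside arrow.

```r
library(arrow)
library(dplyr)
library(duckdb)

open_dataset("path/to/dataset") %>%   # placeholder path
  filter(value > 0) %>%    # evaluated by Arrow before the hand-off
  to_duckdb() %>%          # hand the stream to DuckDB (no copy)
  mutate(rank = dense_rank(desc(value))) %>%  # a window step via dbplyr
  to_arrow() %>%           # stream results back into Arrow
  collect()
```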
 </div>
 <div class="section level3">
 <h3 id="breaking-changes-6-0-0">Breaking changes<a class="anchor" 
aria-label="anchor" href="#breaking-changes-6-0-0"></a></h3>
-<ul><li>Row order of data from a Dataset query is no longer deterministic. If 
you need a stable sort order, you should explicitly <code><a 
href="https://dplyr.tidyverse.org/reference/arrange.html"; 
class="external-link">arrange()</a></code> the query result. For calls to 
<code><a href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">summarize()</a></code>, you can set 
<code>options(arrow.summarise.sort = TRUE)</code> to match the current 
<code>dplyr</code> beha [...]
+<ul><li>Row order of data from a Dataset query is no longer deterministic. If 
you need a stable sort order, you should explicitly <code>arrange()</code> the 
query result. For calls to <code>summarize()</code>, you can set 
<code>options(arrow.summarise.sort = TRUE)</code> to match the current 
<code>dplyr</code> behavior of sorting on the grouping columns.</li>
 <li>
-<code><a href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">dplyr::summarize()</a></code> on an in-memory Arrow Table 
or RecordBatch no longer eagerly evaluates. Call <code><a 
href="https://dplyr.tidyverse.org/reference/compute.html"; 
class="external-link">compute()</a></code> or <code><a 
href="https://dplyr.tidyverse.org/reference/compute.html"; 
class="external-link">collect()</a></code> to evaluate the query.</li>
+<code><a href="https://dplyr.tidyverse.org/reference/summarise.html"; 
class="external-link">dplyr::summarize()</a></code> on an in-memory Arrow Table 
or RecordBatch no longer eagerly evaluates. Call <code>compute()</code> or 
<code>collect()</code> to evaluate the query.</li>
 <li>
-<code><a href="https://rdrr.io/r/utils/head.html"; 
class="external-link">head()</a></code> and <code><a 
href="https://rdrr.io/r/utils/head.html"; 
class="external-link">tail()</a></code> also no longer eagerly evaluate, both 
for in-memory data and for Datasets. Also, because row order is no longer 
deterministic, they will effectively give you a random slice of data from 
somewhere in the dataset unless you <code><a 
href="https://dplyr.tidyverse.org/reference/arrange.html"; class="external-lin 
[...]
+<code><a href="https://rdrr.io/r/utils/head.html"; 
class="external-link">head()</a></code> and <code><a 
href="https://rdrr.io/r/utils/head.html"; 
class="external-link">tail()</a></code> also no longer eagerly evaluate, both 
for in-memory data and for Datasets. Also, because row order is no longer 
deterministic, they will effectively give you a random slice of data from 
somewhere in the dataset unless you <code>arrange()</code> to specify 
sorting.</li>
 <li>Simple Feature (SF) columns no longer save all of their metadata when 
converting to Arrow tables (and thus when saving to Parquet or Feather). This 
also includes any dataframe column that has attributes on each element (in 
other words: row-level metadata). Our previous approach to saving this metadata 
is both (computationally) inefficient and unreliable with Arrow queries + 
datasets. This will most impact saving SF columns. For saving these columns we 
recommend either converting the  [...]
 <li>Datasets are officially no longer supported on 32-bit Windows on R &lt; 
4.0 (Rtools 3.5). 32-bit Windows users should upgrade to a newer version of R 
in order to use datasets.</li>
 </ul></div>
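Since row order is no longer deterministic, queries that care about ordering should request it explicitly. A minimal sketch, with a placeholder path and grouping column:

```r
library(arrow)
library(dplyr)

# Explicitly arrange() the result to get a stable sort order.
open_dataset("path/to/dataset") %>%
  group_by(group) %>%
  summarize(n = n()) %>%
  arrange(group) %>%
  collect()

# Or opt into dplyr-style sorting on the grouping columns globally:
options(arrow.summarise.sort = TRUE)
```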
@@ -389,7 +480,7 @@
 <h3 id="installation-on-linux-6-0-0">Installation on Linux<a class="anchor" 
aria-label="anchor" href="#installation-on-linux-6-0-0"></a></h3>
 <ul><li>Package installation now fails if the Arrow C++ library does not 
compile. In previous versions, if the C++ library failed to compile, you would 
get a successful R package installation that wouldn't do anything useful.</li>
 <li>You can disable all optional C++ components when building from source by 
setting the environment variable <code>LIBARROW_MINIMAL=true</code>. This will 
have the core Arrow/Feather components but exclude Parquet, Datasets, 
compression libraries, and other optional features.</li>
-<li>Source packages now bundle the Arrow C++ source code, so it does not have 
to be downloaded in order to build the package. Because the source is included, 
it is now possible to build the package on an offline/airgapped system. By 
default, the offline build will be minimal because it cannot download 
third-party C++ dependencies required to support all features. To allow a fully 
featured offline build, the included <code><a 
href="../reference/create_package_with_all_dependencies.html">c [...]
+<li>Source packages now bundle the Arrow C++ source code, so it does not have 
to be downloaded in order to build the package. Because the source is included, 
it is now possible to build the package on an offline/airgapped system. By 
default, the offline build will be minimal because it cannot download 
third-party C++ dependencies required to support all features. To allow a fully 
featured offline build, the included <code><a 
href="../reference/create_package_with_all_dependencies.html">c [...]
 <li>Source builds can make use of system dependencies (such as 
<code>libz</code>) by setting <code>ARROW_DEPENDENCY_SOURCE=AUTO</code>. This 
is not the default in this release (<code>BUNDLED</code>, i.e. download and 
build all dependencies) but may become the default in the future.</li>
 <li>The JSON library components (<code><a 
href="../reference/read_json_arrow.html">read_json_arrow()</a></code>) are now 
optional and still on by default; set <code>ARROW_JSON=OFF</code> before 
building to disable them.</li>
 </ul></div>
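The build-time environment variables above are typically set before installing from source; a sketch (not run here, since it triggers a full source build):

```r
# Minimal source build: core Arrow/Feather only, without Parquet,
# Datasets, or compression libraries.
Sys.setenv(LIBARROW_MINIMAL = "true")

# Alternatively, allow the build to use system dependencies (e.g. libz)
# where available instead of downloading and building everything.
Sys.setenv(ARROW_DEPENDENCY_SOURCE = "AUTO")

install.packages("arrow")
```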
@@ -403,7 +494,7 @@
 <li>
 <code><a href="../reference/write_parquet.html">write_parquet()</a></code> no 
longer errors when used with a grouped data.frame</li>
 <li>
-<code><a href="https://dplyr.tidyverse.org/reference/case_when.html"; 
class="external-link">case_when()</a></code> now errors cleanly if an 
expression is not supported in Arrow</li>
+<code>case_when()</code> now errors cleanly if an expression is not supported 
in Arrow</li>
 <li>
 <code><a href="../reference/open_dataset.html">open_dataset()</a></code> now 
works on CSVs without header rows</li>
 <li>Fixed a minor issue where the short readr-style types <code>T</code> and 
<code>t</code> were reversed in <code><a 
href="../reference/read_delim_arrow.html">read_csv_arrow()</a></code>
@@ -431,19 +522,19 @@
 <div class="section level3">
 <h3 id="more-dplyr-5-0-0">More dplyr<a class="anchor" aria-label="anchor" 
href="#more-dplyr-5-0-0"></a></h3>
 <ul><li>
-<p>There are now more than 250 compute functions available for use in <code><a 
href="https://dplyr.tidyverse.org/reference/filter.html"; 
class="external-link">dplyr::filter()</a></code>, <code><a 
href="https://dplyr.tidyverse.org/reference/mutate.html"; 
class="external-link">mutate()</a></code>, etc. Additions in this release 
include:</p>
-<ul><li>String operations: <code><a 
href="https://rdrr.io/r/base/strsplit.html"; 
class="external-link">strsplit()</a></code> and <code><a 
href="https://stringr.tidyverse.org/reference/str_split.html"; 
class="external-link">str_split()</a></code>; <code><a 
href="https://rdrr.io/r/base/strptime.html"; 
class="external-link">strptime()</a></code>; <code><a 
href="https://rdrr.io/r/base/paste.html"; 
class="external-link">paste()</a></code>, <code><a 
href="https://rdrr.io/r/base/paste.html"; class=" [...]
+<p>There are now more than 250 compute functions available for use in <code><a 
href="https://dplyr.tidyverse.org/reference/filter.html"; 
class="external-link">dplyr::filter()</a></code>, <code>mutate()</code>, etc. 
Additions in this release include:</p>
+<ul><li>String operations: <code><a 
href="https://rdrr.io/r/base/strsplit.html"; 
class="external-link">strsplit()</a></code> and <code>str_split()</code>; 
<code><a href="https://rdrr.io/r/base/strptime.html"; 
class="external-link">strptime()</a></code>; <code><a 
href="https://rdrr.io/r/base/paste.html"; 
class="external-link">paste()</a></code>, <code><a 
href="https://rdrr.io/r/base/paste.html"; 
class="external-link">paste0()</a></code>, and <code>str_c()</code>; <code><a 
href="https://rdrr.i [...]
 </li>
 <li>Date/time operations: <code>lubridate</code> methods such as 
<code>year()</code>, <code>month()</code>, <code>wday()</code>, and so on</li>
 <li>Math: logarithms (<code><a href="https://rdrr.io/r/base/Log.html"; 
class="external-link">log()</a></code> et al.); trigonometry (<code><a 
href="https://rdrr.io/r/base/Trig.html"; class="external-link">sin()</a></code>, 
<code><a href="https://rdrr.io/r/base/Trig.html"; 
class="external-link">cos()</a></code>, et al.); <code><a 
href="https://rdrr.io/r/base/MathFun.html"; 
class="external-link">abs()</a></code>; <code><a 
href="https://rdrr.io/r/base/sign.html"; class="external-link">sign()</a> [...]
 </li>
-<li>Conditional functions, with some limitations on input type in this 
release: <code><a href="https://rdrr.io/r/base/ifelse.html"; 
class="external-link">ifelse()</a></code> and <code><a 
href="https://dplyr.tidyverse.org/reference/if_else.html"; 
class="external-link">if_else()</a></code> for all but <code>Decimal</code> 
types; <code><a href="https://dplyr.tidyverse.org/reference/case_when.html"; 
class="external-link">case_when()</a></code> for logical, numeric, and temporal 
types only; <cod [...]
+<li>Conditional functions, with some limitations on input type in this 
release: <code><a href="https://rdrr.io/r/base/ifelse.html"; 
class="external-link">ifelse()</a></code> and <code>if_else()</code> for all 
but <code>Decimal</code> types; <code>case_when()</code> for logical, numeric, 
and temporal types only; <code>coalesce()</code> for all but lists/structs. 
Note also that in this release, factors/dictionaries are converted to strings 
in these functions.</li>
 <li>
-<code>is.*</code> functions are supported and can be used inside <code><a 
href="https://dplyr.tidyverse.org/reference/relocate.html"; 
class="external-link">relocate()</a></code>
+<code>is.*</code> functions are supported and can be used inside 
<code>relocate()</code>
 </li>
 </ul></li>
-<li>The print method for <code>arrow_dplyr_query</code> now includes the 
expression and the resulting type of columns derived by <code><a 
href="https://dplyr.tidyverse.org/reference/mutate.html"; 
class="external-link">mutate()</a></code>.</li>
-<li><p><code><a href="https://dplyr.tidyverse.org/reference/mutate.html"; 
class="external-link">transmute()</a></code> now errors if passed arguments 
<code>.keep</code>, <code>.before</code>, or <code>.after</code>, for 
consistency with the behavior of <code>dplyr</code> on 
<code>data.frame</code>s.</p></li>
+<li><p>The print method for <code>arrow_dplyr_query</code> now includes the 
expression and the resulting type of columns derived by 
<code>mutate()</code>.</p></li>
+<li><p><code>transmute()</code> now errors if passed arguments 
<code>.keep</code>, <code>.before</code>, or <code>.after</code>, for 
consistency with the behavior of <code>dplyr</code> on 
<code>data.frame</code>s.</p></li>
 </ul></div>
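The conditional functions listed above can be combined in one pipeline; a minimal sketch on an in-memory Table, assuming arrow and dplyr are installed:

```r
library(arrow)
library(dplyr)

Table$create(x = c(1, NA, 3)) %>%
  mutate(
    filled = coalesce(x, 0),                  # fill missing values
    size   = if_else(x > 1, "big", "small"),  # two-way conditional
    bucket = case_when(x < 2 ~ "low", TRUE ~ "high")
  ) %>%
  collect()
```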
 <div class="section level3">
 <h3 id="csv-writing-5-0-0">CSV writing<a class="anchor" aria-label="anchor" 
href="#csv-writing-5-0-0"></a></h3>
@@ -454,8 +545,8 @@
 </ul></div>
 <div class="section level3">
 <h3 id="c-interface-5-0-0">C interface<a class="anchor" aria-label="anchor" 
href="#c-interface-5-0-0"></a></h3>
-<ul><li>Added bindings for the remainder of C data interface: Type, Field, and 
RecordBatchReader (from the experimental C stream interface). These also have 
<code><a 
href="https://rstudio.github.io/reticulate/reference/r-py-conversion.html"; 
class="external-link">reticulate::py_to_r()</a></code> and <code><a 
href="https://rstudio.github.io/reticulate/reference/r-py-conversion.html"; 
class="external-link">r_to_py()</a></code> methods. Along with the addition of 
the <code>Scanner$ToRecordBat [...]
-<li>C interface methods are exposed on Arrow objects (e.g. 
<code>Array$export_to_c()</code>, <code>RecordBatch$import_from_c()</code>), 
similar to how they are in <code>pyarrow</code>. This facilitates their use in 
other packages. See the <code><a 
href="https://rstudio.github.io/reticulate/reference/r-py-conversion.html"; 
class="external-link">py_to_r()</a></code> and <code><a 
href="https://rstudio.github.io/reticulate/reference/r-py-conversion.html"; 
class="external-link">r_to_py()</a></c [...]
+<ul><li>Added bindings for the remainder of C data interface: Type, Field, and 
RecordBatchReader (from the experimental C stream interface). These also have 
<code><a 
href="https://rstudio.github.io/reticulate/reference/r-py-conversion.html"; 
class="external-link">reticulate::py_to_r()</a></code> and 
<code>r_to_py()</code> methods. Along with the addition of the 
<code>Scanner$ToRecordBatchReader()</code> method, you can now build up a 
Dataset query in R and pass the resulting stream of bat [...]
+<li>C interface methods are exposed on Arrow objects (e.g. 
<code>Array$export_to_c()</code>, <code>RecordBatch$import_from_c()</code>), 
similar to how they are in <code>pyarrow</code>. This facilitates their use in 
other packages. See the <code>py_to_r()</code> and <code>r_to_py()</code> 
methods for usage examples.</li>
 </ul></div>
 <div class="section level3">
 <h3 id="other-enhancements-5-0-0">Other enhancements<a class="anchor" 
aria-label="anchor" href="#other-enhancements-5-0-0"></a></h3>
@@ -478,7 +569,8 @@
 </div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="4.0.1" id="arrow-401">arrow 4.0.1<a 
class="anchor" aria-label="anchor" href="#arrow-401"></a></h2><p 
class="text-muted">CRAN release: 2021-05-28</p>
-<ul><li>Resolved a few bugs in new string compute kernels (<a 
href="https://issues.apache.org/jira/browse/ARROW-12774"; 
class="external-link">ARROW-12774</a>, <a 
href="https://issues.apache.org/jira/browse/ARROW-12670"; 
class="external-link">ARROW-12670</a>)</li></ul></div>
+<ul><li>Resolved a few bugs in new string compute kernels (<a 
href="https://github.com/apache/arrow/issues/10320"; 
class="external-link">#10320</a>, <a 
href="https://github.com/apache/arrow/issues/10287"; 
class="external-link">#10287</a>)</li>
+</ul></div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="4.0.0.1" id="arrow-4001">arrow 
4.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-4001"></a></h2><p 
class="text-muted">CRAN release: 2021-05-10</p>
 <ul><li>The mimalloc memory allocator is the default memory allocator when 
using a static source build of the package on Linux. This is because it has 
better behavior under valgrind than jemalloc does. A full-featured build 
(installed with <code>LIBARROW_MINIMAL=false</code>) includes both jemalloc and 
mimalloc, and it still has jemalloc as default, though this is configurable 
at runtime with the <code>ARROW_DEFAULT_MEMORY_POOL</code> environment 
variable.</li>
@@ -491,9 +583,9 @@
 <h3 id="dplyr-methods-4-0-0">dplyr methods<a class="anchor" 
aria-label="anchor" href="#dplyr-methods-4-0-0"></a></h3>
 <p>Many more <code>dplyr</code> verbs are supported on Arrow objects:</p>
 <ul><li>
-<code><a href="https://dplyr.tidyverse.org/reference/mutate.html"; 
class="external-link">dplyr::mutate()</a></code> is now supported in Arrow for 
many applications. For queries on <code>Table</code> and 
<code>RecordBatch</code> that are not yet supported in Arrow, the 
implementation falls back to pulling data into an in-memory R 
<code>data.frame</code> first, as in the previous release. For queries on 
<code>Dataset</code> (which can be larger than memory), it raises an error if 
the functi [...]
+<code><a href="https://dplyr.tidyverse.org/reference/mutate.html"; 
class="external-link">dplyr::mutate()</a></code> is now supported in Arrow for 
many applications. For queries on <code>Table</code> and 
<code>RecordBatch</code> that are not yet supported in Arrow, the 
implementation falls back to pulling data into an in-memory R 
<code>data.frame</code> first, as in the previous release. For queries on 
<code>Dataset</code> (which can be larger than memory), it raises an error if 
the functi [...]
 <li>
-<code><a href="https://dplyr.tidyverse.org/reference/mutate.html"; 
class="external-link">dplyr::transmute()</a></code> (which calls <code><a 
href="https://dplyr.tidyverse.org/reference/mutate.html"; 
class="external-link">mutate()</a></code>)</li>
+<code><a href="https://dplyr.tidyverse.org/reference/transmute.html"; 
class="external-link">dplyr::transmute()</a></code> (which calls 
<code>mutate()</code>)</li>
 <li>
 <code><a href="https://dplyr.tidyverse.org/reference/group_by.html"; 
class="external-link">dplyr::group_by()</a></code> now preserves the 
<code>.drop</code> argument and supports on-the-fly definition of columns</li>
 <li>
@@ -503,8 +595,8 @@
 <li>
 <code><a href="https://dplyr.tidyverse.org/reference/compute.html"; 
class="external-link">dplyr::compute()</a></code> to evaluate the lazy 
expressions and return an Arrow Table. This is equivalent to 
<code>dplyr::collect(as_data_frame = FALSE)</code>, which was added in 
2.0.0.</li>
 </ul><p>Over 100 functions can now be called on Arrow objects inside a 
<code>dplyr</code> verb:</p>
-<ul><li>String functions <code><a href="https://rdrr.io/r/base/nchar.html"; 
class="external-link">nchar()</a></code>, <code><a 
href="https://rdrr.io/r/base/chartr.html"; 
class="external-link">tolower()</a></code>, and <code><a 
href="https://rdrr.io/r/base/chartr.html"; 
class="external-link">toupper()</a></code>, along with their 
<code>stringr</code> spellings <code><a 
href="https://stringr.tidyverse.org/reference/str_length.html"; 
class="external-link">str_length()</a></code>, <code><a href= [...]
-<li>Regular expression functions <code><a 
href="https://rdrr.io/r/base/grep.html"; class="external-link">sub()</a></code>, 
<code><a href="https://rdrr.io/r/base/grep.html"; 
class="external-link">gsub()</a></code>, and <code><a 
href="https://rdrr.io/r/base/grep.html"; 
class="external-link">grepl()</a></code>, along with <code><a 
href="https://stringr.tidyverse.org/reference/str_replace.html"; 
class="external-link">str_replace()</a></code>, <code><a 
href="https://stringr.tidyverse.org/referenc [...]
+<ul><li>String functions <code><a href="https://rdrr.io/r/base/nchar.html"; 
class="external-link">nchar()</a></code>, <code><a 
href="https://rdrr.io/r/base/chartr.html"; 
class="external-link">tolower()</a></code>, and <code><a 
href="https://rdrr.io/r/base/chartr.html"; 
class="external-link">toupper()</a></code>, along with their 
<code>stringr</code> spellings <code>str_length()</code>, 
<code>str_to_lower()</code>, and <code>str_to_upper()</code>, are supported in 
Arrow <code>dplyr</code> ca [...]
+<li>Regular expression functions <code><a 
href="https://rdrr.io/r/base/grep.html"; class="external-link">sub()</a></code>, 
<code><a href="https://rdrr.io/r/base/grep.html"; 
class="external-link">gsub()</a></code>, and <code><a 
href="https://rdrr.io/r/base/grep.html"; 
class="external-link">grepl()</a></code>, along with 
<code>str_replace()</code>, <code>str_replace_all()</code>, and 
<code>str_detect()</code>, are supported.</li>
 <li>
 <code>cast(x, type)</code> and <code>dictionary_encode()</code> allow changing 
the type of columns in Arrow objects; <code><a 
href="https://rdrr.io/r/base/numeric.html"; 
class="external-link">as.numeric()</a></code>, <code><a 
href="https://rdrr.io/r/base/character.html"; 
class="external-link">as.character()</a></code>, etc. are exposed as similar 
type-altering conveniences</li>
 <li>
@@ -534,7 +626,7 @@
 </li>
 <li>Similarly, <code>Schema</code> can now be edited by assigning in new 
types. This enables using the CSV reader to detect the schema of a file, modify 
the <code>Schema</code> object for any columns that you want to read in as a 
different type, and then use that <code>Schema</code> to read the data.</li>
 <li>Better validation when creating a <code>Table</code> with a schema, with 
columns of different lengths, and with scalar value recycling</li>
-<li>Reading Parquet files in Japanese or other multi-byte locales on Windows 
no longer hangs (workaround for a <a 
href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98723"; 
class="external-link">bug in libstdc++</a>; thanks @yutannihilation for the 
persistence in discovering this!)</li>
+<li>Reading Parquet files in Japanese or other multi-byte locales on Windows 
no longer hangs (workaround for a <a 
href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98723"; 
class="external-link">bug in libstdc++</a>; thanks <a 
href="https://github.com/yutannihilation"; 
class="external-link">@yutannihilation</a> for the persistence in discovering 
this!)</li>
 <li>If you attempt to read string data that has embedded nul (<code>\0</code>) 
characters, the error message now informs you that you can set 
<code>options(arrow.skip_nul = TRUE)</code> to strip them out. It is not 
recommended to set this option by default since this code path is significantly 
slower, and most string data does not contain nuls.</li>
 <li>
 <code><a href="../reference/read_json_arrow.html">read_json_arrow()</a></code> 
now accepts a schema: <code>read_json_arrow("file.json", schema = schema(col_a 
= float64(), col_b = string()))</code>
@@ -545,7 +637,7 @@
 <ul><li>The R package can now support working with an Arrow C++ library that 
has additional features (such as dataset, parquet, string libraries) disabled, 
and the bundled build script enables setting environment variables to disable 
them. See <code><a href="../articles/install.html">vignette("install", package 
= "arrow")</a></code> for details. This allows a faster, smaller package build 
in cases where that is useful, and it enables a minimal, functioning R package 
build on Solaris.</li>
 <li>On macOS, it is now possible to use the same bundled C++ build that is 
used by default on Linux, along with all of its customization parameters, by 
setting the environment variable <code>FORCE_BUNDLED_BUILD=true</code>.</li>
 <li>
-<code>arrow</code> now uses the <code>mimalloc</code> memory allocator by 
default on macOS, if available (as it is in CRAN binaries), instead of 
<code>jemalloc</code>. There are <a 
href="https://issues.apache.org/jira/browse/&lt;a%20href='https://issues.apache.org/jira/browse/ARROW-6994'&gt;ARROW-6994&lt;/a&gt;"
 class="external-link">configuration issues</a> with <code>jemalloc</code> on 
macOS, and <a href="https://ursalabs.org/blog/2021-r-benchmarks-part-1/"; 
class="external-link">benchm [...]
+<code>arrow</code> now uses the <code>mimalloc</code> memory allocator by 
default on macOS, if available (as it is in CRAN binaries), instead of 
<code>jemalloc</code>. There are <a 
href="https://github.com/apache/arrow/issues/23308"; 
class="external-link">configuration issues</a> with <code>jemalloc</code> on 
macOS, and <a href="https://ursalabs.org/blog/2021-r-benchmarks-part-1/"; 
class="external-link">benchmark analysis</a> shows that this has negative 
effects on performance, especially  [...]
 <li>Setting the <code>ARROW_DEFAULT_MEMORY_POOL</code> environment variable to 
switch memory allocators now works correctly when the Arrow C++ library has 
been statically linked (as is usually the case when installing from CRAN).</li>
 <li>The <code><a href="../reference/arrow_info.html">arrow_info()</a></code> 
function now reports on the additional optional features, as well as the 
detected SIMD level. If key features or compression libraries are not enabled 
in the build, <code><a 
href="../reference/arrow_info.html">arrow_info()</a></code> will refer to the 
installation vignette for guidance on how to install a more complete build, if 
desired.</li>
 <li>If you attempt to read a file that was compressed with a codec that your 
Arrow build does not contain support for, the error message now will tell you 
how to reinstall Arrow with that feature enabled.</li>
@@ -579,7 +671,7 @@
 <li>
 <code><a href="../reference/arrow_info.html">arrow_info()</a></code> for an 
overview of various run-time and build-time Arrow configurations, useful for 
debugging</li>
 <li>Set environment variable <code>ARROW_DEFAULT_MEMORY_POOL</code> before 
loading the Arrow package to change memory allocators. Windows packages are 
built with <code>mimalloc</code>; most others are built with both 
<code>jemalloc</code> (used by default) and <code>mimalloc</code>. These 
alternative memory allocators are generally much faster than the system memory 
allocator, so they are used by default when available, but sometimes it is 
useful to turn them off for debugging purposes.  [...]
-<li>List columns that have attributes on each element are now also included 
with the metadata that is saved when creating Arrow tables. This allows 
<code>sf</code> tibbles to faithfully preserved and roundtripped (<a 
href="https://issues.apache.org/jira/browse/ARROW-10386"; 
class="external-link">ARROW-10386</a>).</li>
+<li>List columns that have attributes on each element are now also included 
with the metadata that is saved when creating Arrow tables. This allows 
<code>sf</code> tibbles to be faithfully preserved and roundtripped (<a 
href="https://github.com/apache/arrow/issues/8549"; 
class="external-link">#8549</a>).</li>
 <li>R metadata that exceeds 100Kb is now compressed before being written to a 
table; see <code><a href="../reference/Schema.html">schema()</a></code> for 
more details.</li>
 </ul></div>
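The allocator switch described above can be illustrated with a short sketch, assuming a build that bundles the alternative allocators (as CRAN binaries do):

```r
# Must be set before the arrow package is loaded; common values are
# "mimalloc", "jemalloc", and "system" (availability depends on the build).
Sys.setenv(ARROW_DEFAULT_MEMORY_POOL = "mimalloc")
library(arrow)
arrow_info()  # the report includes the active memory allocator
```

If the requested allocator is not compiled into the build, Arrow falls back to an available one rather than failing.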
 <div class="section level3">
@@ -590,8 +682,8 @@
 <code><a href="../reference/write_parquet.html">write_parquet()</a></code> can 
now write RecordBatches</li>
 <li>Reading a Table from a RecordBatchStreamReader containing 0 batches no 
longer crashes</li>
 <li>
-<code>readr</code>’s <code>problems</code> attribute is removed when 
converting to Arrow RecordBatch and table to prevent large amounts of metadata 
from accumulating inadvertently (<a 
href="https://issues.apache.org/jira/browse/ARROW-10624"; 
class="external-link">ARROW-10624</a>)</li>
-<li>Fixed reading of compressed Feather files written with Arrow 0.17 (<a 
href="https://issues.apache.org/jira/browse/ARROW-10850"; 
class="external-link">ARROW-10850</a>)</li>
+<code>readr</code>’s <code>problems</code> attribute is removed when 
converting to Arrow RecordBatch and table to prevent large amounts of metadata 
from accumulating inadvertently (<a 
href="https://github.com/apache/arrow/issues/9092"; 
class="external-link">#9092</a>)</li>
+<li>Fixed reading of compressed Feather files written with Arrow 0.17 (<a 
href="https://github.com/apache/arrow/issues/9128"; 
class="external-link">#9128</a>)</li>
 <li>
 <code>SubTreeFileSystem</code> gains a useful print method and no longer 
errors when printing</li>
 </ul></div>
@@ -613,7 +705,7 @@
 <code><a href="../reference/write_dataset.html">write_dataset()</a></code> to 
Feather or Parquet files with partitioning. See the end of <code><a 
href="../articles/dataset.html">vignette("dataset", package = 
"arrow")</a></code> for discussion and examples.</li>
 <li>Datasets now have <code><a href="https://rdrr.io/r/utils/head.html"; 
class="external-link">head()</a></code>, <code><a 
href="https://rdrr.io/r/utils/head.html"; 
class="external-link">tail()</a></code>, and take (<code>[</code>) methods. 
<code><a href="https://rdrr.io/r/utils/head.html"; 
class="external-link">head()</a></code> is optimized but the others may not be 
performant.</li>
 <li>
-<code><a href="https://dplyr.tidyverse.org/reference/compute.html"; 
class="external-link">collect()</a></code> gains an <code>as_data_frame</code> 
argument, default <code>TRUE</code> but when <code>FALSE</code> allows you to 
evaluate the accumulated <code>select</code> and <code>filter</code> query but 
keep the result in Arrow, not an R <code>data.frame</code>
+<code>collect()</code> gains an <code>as_data_frame</code> argument, default 
<code>TRUE</code> but when <code>FALSE</code> allows you to evaluate the 
accumulated <code>select</code> and <code>filter</code> query but keep the 
result in Arrow, not an R <code>data.frame</code>
 </li>
 <li>
 <code><a href="../reference/read_delim_arrow.html">read_csv_arrow()</a></code> 
supports specifying column types, both with a <code>Schema</code> and with the 
compact string representation for types used in the <code>readr</code> package. 
It also has gained a <code>timestamp_parsers</code> argument that lets you 
express a set of <code>strptime</code> parse strings that will be tried to 
convert columns designated as <code>Timestamp</code> type.</li>
@@ -680,7 +772,7 @@
 <code>character</code> vectors that exceed 2GB are converted to Arrow 
<code>large_utf8</code> type</li>
 <li>
 <code>POSIXlt</code> objects can now be converted to Arrow 
(<code>struct</code>)</li>
-<li>R <code><a href="https://rdrr.io/r/base/attributes.html"; 
class="external-link">attributes()</a></code> are preserved in Arrow metadata 
when converting to Arrow RecordBatch and table and are restored when converting 
from Arrow. This means that custom subclasses, such as 
<code>haven::labelled</code>, are preserved in round trip through Arrow.</li>
+<li>R <code><a href="https://rdrr.io/r/base/attributes.html"; 
class="external-link">attributes()</a></code> are preserved in Arrow metadata 
when converting to Arrow RecordBatch and table and are restored when converting 
from Arrow. This means that custom subclasses, such as <code><a 
href="https://haven.tidyverse.org/reference/labelled.html"; 
class="external-link">haven::labelled</a></code>, are preserved in round trip 
through Arrow.</li>
 <li>Schema metadata is now exposed as a named list, and it can be modified by 
assignment like <code>batch$metadata$new_key &lt;- "new value"</code>
 </li>
 <li>Arrow types <code>int64</code>, <code>uint32</code>, and 
<code>uint64</code> now are converted to R <code>integer</code> if all values 
fit in bounds</li>
@@ -746,7 +838,7 @@
 <div class="section level3">
 <h3 id="datasets-0-17-0">Datasets<a class="anchor" aria-label="anchor" 
href="#datasets-0-17-0"></a></h3>
 <ul><li>Dataset reading benefits from many speedups and fixes in the C++ 
library</li>
-<li>Datasets have a <code><a href="https://rdrr.io/r/base/dim.html"; 
class="external-link">dim()</a></code> method, which sums rows across all files 
(<a href="https://issues.apache.org/jira/browse/ARROW-8118"; 
class="external-link">ARROW-8118</a>, @boshek)</li>
+<li>Datasets have a <code><a href="https://rdrr.io/r/base/dim.html"; 
class="external-link">dim()</a></code> method, which sums rows across all files 
(<a href="https://github.com/apache/arrow/issues/6635"; 
class="external-link">#6635</a>, <a href="https://github.com/boshek"; 
class="external-link">@boshek</a>)</li>
 <li>Combine multiple datasets into a single queryable 
<code>UnionDataset</code> with the <code><a 
href="https://rdrr.io/r/base/c.html"; class="external-link">c()</a></code> 
method</li>
 <li>Dataset filtering now treats <code>NA</code> as <code>FALSE</code>, 
consistent with <code><a 
href="https://dplyr.tidyverse.org/reference/filter.html"; 
class="external-link">dplyr::filter()</a></code>
 </li>
@@ -777,14 +869,14 @@
 <code><a href="../reference/install_arrow.html">install_arrow()</a></code> now 
installs the latest release of <code>arrow</code>, including Linux 
dependencies, either for CRAN releases or for development builds (if 
<code>nightly = TRUE</code>)</li>
 <li>Package installation on Linux no longer downloads C++ dependencies unless 
the <code>LIBARROW_DOWNLOAD</code> or <code>NOT_CRAN</code> environment 
variable is set</li>
 <li>
-<code><a href="../reference/write_feather.html">write_feather()</a></code>, 
<code>write_arrow()</code> and <code><a 
href="../reference/write_parquet.html">write_parquet()</a></code> now return 
their input, similar to the <code>write_*</code> functions in the 
<code>readr</code> package (<a 
href="https://issues.apache.org/jira/browse/ARROW-7796"; 
class="external-link">ARROW-7796</a>, @boshek)</li>
-<li>Can now infer the type of an R <code>list</code> and create a ListArray 
when all list elements are the same type (<a 
href="https://issues.apache.org/jira/browse/ARROW-7662"; 
class="external-link">ARROW-7662</a>, @michaelchirico)</li>
+<code><a href="../reference/write_feather.html">write_feather()</a></code>, 
<code>write_arrow()</code> and <code><a 
href="../reference/write_parquet.html">write_parquet()</a></code> now return 
their input, similar to the <code>write_*</code> functions in the 
<code>readr</code> package (<a 
href="https://github.com/apache/arrow/issues/6387"; 
class="external-link">#6387</a>, <a href="https://github.com/boshek"; 
class="external-link">@boshek</a>)</li>
+<li>Can now infer the type of an R <code>list</code> and create a ListArray 
when all list elements are the same type (<a 
href="https://github.com/apache/arrow/issues/6275"; 
class="external-link">#6275</a>, <a href="https://github.com/michaelchirico"; 
class="external-link">@michaelchirico</a>)</li>
 </ul></div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="0.16.0" id="arrow-0160">arrow 0.16.0<a 
class="anchor" aria-label="anchor" href="#arrow-0160"></a></h2><p 
class="text-muted">CRAN release: 2020-02-09</p>
 <div class="section level3">
 <h3 id="multi-file-datasets-0-16-0">Multi-file datasets<a class="anchor" 
aria-label="anchor" href="#multi-file-datasets-0-16-0"></a></h3>
-<p>This release includes a <code>dplyr</code> interface to Arrow Datasets, 
which let you work efficiently with large, multi-file datasets as a single 
entity. Explore a directory of data files with <code><a 
href="../reference/open_dataset.html">open_dataset()</a></code> and then use 
<code>dplyr</code> methods to <code><a 
href="https://dplyr.tidyverse.org/reference/select.html"; 
class="external-link">select()</a></code>, <code><a 
href="https://dplyr.tidyverse.org/reference/filter.html"; clas [...]
+<p>This release includes a <code>dplyr</code> interface to Arrow Datasets, 
which let you work efficiently with large, multi-file datasets as a single 
entity. Explore a directory of data files with <code><a 
href="../reference/open_dataset.html">open_dataset()</a></code> and then use 
<code>dplyr</code> methods to <code>select()</code>, <code><a 
href="https://rdrr.io/r/stats/filter.html"; 
class="external-link">filter()</a></code>, etc. Work will be done where 
possible in Arrow memory. When n [...]
 <p>See <code><a href="../articles/dataset.html">vignette("dataset", package = 
"arrow")</a></code> for details.</p>
 </div>
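The Dataset workflow summarized above can be sketched as follows; the directory path and column names are hypothetical, for illustration only (see the vignette for real examples):

```r
library(arrow)
library(dplyr)

# Treat a directory of data files as a single queryable entity.
ds <- open_dataset("nyc-taxi/")  # hypothetical dataset directory

ds %>%
  select(passenger_count, total_amount) %>%  # pushed down to Arrow
  filter(passenger_count > 1) %>%            # evaluated in Arrow memory
  collect()                                  # only now pulled into R
```

Work is done in Arrow where possible; `collect()` is the step that converts the (filtered, projected) result into an R `data.frame`.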
 <div class="section level3">
@@ -805,34 +897,35 @@
 <code><a href="../reference/write_parquet.html">write_parquet()</a></code> now 
supports compression</li>
 <li>
 <code><a 
href="../reference/codec_is_available.html">codec_is_available()</a></code> 
returns <code>TRUE</code> or <code>FALSE</code> whether the Arrow C++ library 
was built with support for a given compression library (e.g. gzip, lz4, 
snappy)</li>
-<li>Windows builds now include support for zstd and lz4 compression (<a 
href="https://issues.apache.org/jira/browse/ARROW-6960"; 
class="external-link">ARROW-6960</a>, @gnguy)</li>
+<li>Windows builds now include support for zstd and lz4 compression (<a 
href="https://github.com/apache/arrow/issues/5814"; 
class="external-link">#5814</a>, <a href="https://github.com/gnguy"; 
class="external-link">@gnguy</a>)</li>
 </ul></div>
 <div class="section level3">
 <h3 id="other-fixes-and-improvements-0-16-0">Other fixes and improvements<a 
class="anchor" aria-label="anchor" 
href="#other-fixes-and-improvements-0-16-0"></a></h3>
 <ul><li>Arrow null type is now supported</li>
-<li>Factor types are now preserved in round trip through Parquet format (<a 
href="https://issues.apache.org/jira/browse/ARROW-7045"; 
class="external-link">ARROW-7045</a>, @yutannihilation)</li>
+<li>Factor types are now preserved in round trip through Parquet format (<a 
href="https://github.com/apache/arrow/issues/6135"; 
class="external-link">#6135</a>, <a href="https://github.com/yutannihilation"; 
class="external-link">@yutannihilation</a>)</li>
 <li>Reading an Arrow dictionary type coerces dictionary values to 
<code>character</code> (as R <code>factor</code> levels are required to be) 
instead of raising an error</li>
-<li>Many improvements to Parquet function documentation (@karldw, 
@khughitt)</li>
+<li>Many improvements to Parquet function documentation (<a 
href="https://github.com/karldw"; class="external-link">@karldw</a>, <a 
href="https://github.com/khughitt"; class="external-link">@khughitt</a>)</li>
 </ul></div>
 </div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="0.15.1" id="arrow-0151">arrow 0.15.1<a 
class="anchor" aria-label="anchor" href="#arrow-0151"></a></h2><p 
class="text-muted">CRAN release: 2019-11-04</p>
-<ul><li>This patch release includes bugfixes in the C++ library around 
dictionary types and Parquet reading.</li></ul></div>
+<ul><li>This patch release includes bugfixes in the C++ library around 
dictionary types and Parquet reading.</li>
+</ul></div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="0.15.0" id="arrow-0150">arrow 0.15.0<a 
class="anchor" aria-label="anchor" href="#arrow-0150"></a></h2><p 
class="text-muted">CRAN release: 2019-10-07</p>
 <div class="section level3">
 <h3 id="breaking-changes-0-15-0">Breaking changes<a class="anchor" 
aria-label="anchor" href="#breaking-changes-0-15-0"></a></h3>
 <ul><li>The R6 classes that wrap the C++ classes are now documented and 
exported and have been renamed to be more R-friendly. Users of the high-level R 
interface in this package are not affected. Those who want to interact with the 
Arrow C++ API more directly should work with these objects and methods. As part 
of this change, many functions that instantiated these R6 objects have been 
removed in favor of <code>Class$create()</code> methods. Notably, <code><a 
href="../reference/array.html [...]
 <li>Due to a subtle change in the Arrow message format, data written by the 
0.15 version libraries may not be readable by older versions. If you need to 
send data to a process that uses an older version of Arrow (for example, an 
Apache Spark server that hasn’t yet updated to Arrow 0.15), you can set the 
environment variable <code>ARROW_PRE_0_15_IPC_FORMAT=1</code>.</li>
-<li>The <code>as_tibble</code> argument in the <code>read_*()</code> functions 
has been renamed to <code>as_data_frame</code> (<a 
href="https://issues.apache.org/jira/browse/ARROW-6337"; 
class="external-link">ARROW-6337</a>, @jameslamb)</li>
+<li>The <code>as_tibble</code> argument in the <code>read_*()</code> functions 
has been renamed to <code>as_data_frame</code> (<a 
href="https://github.com/apache/arrow/issues/5399"; 
class="external-link">#5399</a>, <a href="https://github.com/jameslamb"; 
class="external-link">@jameslamb</a>)</li>
 <li>The <code>arrow::Column</code> class has been removed, as it was removed 
from the C++ library</li>
 </ul></div>
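The compatibility escape hatch mentioned above is a single environment variable; a minimal sketch:

```r
# Ask Arrow >= 0.15 to write the pre-0.15 IPC message format so an older
# consumer (e.g. a Spark server still on Arrow 0.14) can read the data.
# Must be set before the data is written.
Sys.setenv(ARROW_PRE_0_15_IPC_FORMAT = 1)
```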
 <div class="section level3">
 <h3 id="new-features-0-15-0">New features<a class="anchor" aria-label="anchor" 
href="#new-features-0-15-0"></a></h3>
 <ul><li>
 <code>Table</code> and <code>RecordBatch</code> objects have S3 methods that 
enable you to work with them more like <code>data.frame</code>s. Extract 
columns, subset, and so on. See <code><a 
href="../reference/Table.html">?Table</a></code> and <code><a 
href="../reference/RecordBatch.html">?RecordBatch</a></code> for examples.</li>
-<li>Initial implementation of bindings for the C++ File System API. (<a 
href="https://issues.apache.org/jira/browse/ARROW-6348"; 
class="external-link">ARROW-6348</a>)</li>
-<li>Compressed streams are now supported on Windows (<a 
href="https://issues.apache.org/jira/browse/ARROW-6360"; 
class="external-link">ARROW-6360</a>), and you can also specify a compression 
level (<a href="https://issues.apache.org/jira/browse/ARROW-6533"; 
class="external-link">ARROW-6533</a>)</li>
+<li>Initial implementation of bindings for the C++ File System API. (<a 
href="https://github.com/apache/arrow/issues/5223"; 
class="external-link">#5223</a>)</li>
+<li>Compressed streams are now supported on Windows (<a 
href="https://github.com/apache/arrow/issues/5329"; 
class="external-link">#5329</a>), and you can also specify a compression level 
(<a href="https://github.com/apache/arrow/issues/5450"; 
class="external-link">#5450</a>)</li>
 </ul></div>
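The `data.frame`-like S3 methods on `Table` and `RecordBatch` mentioned above can be sketched as follows (column names are hypothetical; see `?Table` for the authoritative examples):

```r
library(arrow)

tab <- Table$create(x = 1:3, y = c("a", "b", "c"))

tab$x        # extract a column, as with a data.frame
tab[1:2, ]   # subset rows
names(tab)   # column names
dim(tab)     # rows and columns
```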
 <div class="section level3">
 <h3 id="other-upgrades-0-15-0">Other upgrades<a class="anchor" 
aria-label="anchor" href="#other-upgrades-0-15-0"></a></h3>
@@ -841,9 +934,9 @@
 <code><a href="../reference/read_delim_arrow.html">read_csv_arrow()</a></code> 
supports more parsing options, including <code>col_names</code>, 
<code>na</code>, <code>quoted_na</code>, and <code>skip</code>
 </li>
 <li>
-<code><a href="../reference/read_parquet.html">read_parquet()</a></code> and 
<code><a href="../reference/read_feather.html">read_feather()</a></code> can 
ingest data from a <code>raw</code> vector (<a 
href="https://issues.apache.org/jira/browse/ARROW-6278"; 
class="external-link">ARROW-6278</a>)</li>
-<li>File readers now properly handle paths that need expanding, such as 
<code>~/file.parquet</code> (<a 
href="https://issues.apache.org/jira/browse/ARROW-6323"; 
class="external-link">ARROW-6323</a>)</li>
-<li>Improved support for creating types in a schema: the types’ printed names 
(e.g. “double”) are guaranteed to be valid to use in instantiating a schema 
(e.g. <code><a href="https://rdrr.io/r/base/double.html"; 
class="external-link">double()</a></code>), and time types can be created with 
human-friendly resolution strings (“ms”, “s”, etc.). (<a 
href="https://issues.apache.org/jira/browse/ARROW-6338"; 
class="external-link">ARROW-6338</a>, <a 
href="https://issues.apache.org/jira/browse/ARRO [...]
+<code><a href="../reference/read_parquet.html">read_parquet()</a></code> and 
<code><a href="../reference/read_feather.html">read_feather()</a></code> can 
ingest data from a <code>raw</code> vector (<a 
href="https://github.com/apache/arrow/issues/5141"; 
class="external-link">#5141</a>)</li>
+<li>File readers now properly handle paths that need expanding, such as 
<code>~/file.parquet</code> (<a 
href="https://github.com/apache/arrow/issues/5169"; 
class="external-link">#5169</a>)</li>
+<li>Improved support for creating types in a schema: the types’ printed names 
(e.g. “double”) are guaranteed to be valid to use in instantiating a schema 
(e.g. <code><a href="https://rdrr.io/r/base/double.html"; 
class="external-link">double()</a></code>), and time types can be created with 
human-friendly resolution strings (“ms”, “s”, etc.). (<a 
href="https://github.com/apache/arrow/issues/5198"; 
class="external-link">#5198</a>, <a 
href="https://github.com/apache/arrow/issues/5201"; class=" [...]
 </ul></div>
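The schema-construction improvements above can be sketched briefly; this assumes the type constructors exported by the package (the field names here are illustrative):

```r
library(arrow)

# Printed type names double as constructors (e.g. "double" -> double()),
# and time types accept human-friendly resolution strings.
sch <- schema(
  a = double(),
  b = timestamp("ms"),
  c = time32("s")
)
sch
```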
 </div>
     <div class="section level2">

