nealrichardson commented on code in PR #33748:
URL: https://github.com/apache/arrow/pull/33748#discussion_r1081651053


##########
r/NEWS.md:
##########
@@ -19,6 +19,94 @@
 
 # arrow 10.0.1.9000
 
+## Breaking changes
+
+* `map_batches()` is lazy by default; it now returns a `RecordBatchReader`
+  instead of a list of `RecordBatch` objects unless `lazy = FALSE`.
+  ([#14521](https://github.com/apache/arrow/issues/14521))
+
+## New features
+
+### Docs
+
+* A substantial reorganisation, rewrite of and addition to, many of the 
+  vignettes and README. (@djnavarro, 
+  [#14514](https://github.com/apache/arrow/issues/14514))  
+
+### Reading/writing data
+
+* New functions `open_csv_dataset()`, `open_tsv_dataset()`, and 
+  `open_delim_dataset()` all wrap `open_dataset()`- they don't provide new 
+  functionality, but allow for readr-style options to be supplied, making it 
+  simpler to switch between individual file-reading and dataset 
+  functionality. ([#33614](https://github.com/apache/arrow/issues/33614))
+* User-defined null values can be set when writing CSVs both as datasets 
+  and as individual files. (@wjones127, 
+  [#14679](https://github.com/apache/arrow/issues/14679))
+* The new `col_names` parameter allows specification of column names when 
+  opening a CSV dataset. (@wjones127, 
+  [#14705](https://github.com/apache/arrow/issues/14705))
+* The `parse_options`, `read_options`, and `convert_options` parameters for 
+  reading individual files (`read_*_arrow()` functions) and datasets 
+  (`open_dataset()` and the new `open_*_dataset()` functions) can be passed 
+  in as lists. ([#15270](https://github.com/apache/arrow/issues/15270))
+* File paths containing accents can be read by `read_csv_arrow()`. 
+  ([#14930](https://github.com/apache/arrow/issues/14930))
+
+### dplyr compatibility
+* New dplyr (1.1.0) function `join_by()` has been implemented for dplyr joins 
+  on Arrow objects (equality conditions only).  
+  ([#33664](https://github.com/apache/arrow/issues/33664))
+
+### Function bindings
+
+* The following functions can be used in queries on Arrow objects:
+  * `lubridate::with_tz()` and `lubridate::force_tz()` (@eitsupi, 
+  [#14093](https://github.com/apache/arrow/issues/14093))
+  * `stringr::str_remove()` and `stringr::str_remove_all()` 
+  ([#14644](https://github.com/apache/arrow/issues/14644))
+
+### Arrow object creation
+
+* Arrow Scalars can be created from `POSIXlt` objects. 
+  ([#15277](https://github.com/apache/arrow/issues/15277))
+* `Array$create()` can create Decimal arrays. 
+  ([#15211](https://github.com/apache/arrow/issues/15211))
+* `StructArray$create()` can be used to create StructArray objects. 
+  ([#14922](https://github.com/apache/arrow/issues/14922))
+
+### Installation
+
+* Improved offline installation using pre-downloaded binaries. 
+  (@pgramme, [#14086](https://github.com/apache/arrow/issues/14086))
+* The package can automatically link to system installations of the AWS SDK
+  for C++. (@kou, [#14235](https://github.com/apache/arrow/issues/14235))
+
+## Minor improvements and fixes
+
+* Calling `lubridate::as_datetime()` on Arrow objects can handle time in 
+  sub-seconds. (@eitsupi, 
+  [#13890](https://github.com/apache/arrow/issues/13890))
+* `head()` can be called after `as_record_batch_reader()`. 
+  ([#14518](https://github.com/apache/arrow/issues/14518))
+* `dplyr::right_join()` correctly coalesces keys. 
+  ([#15077](https://github.com/apache/arrow/issues/15077))
+* Output is accurate when multiple `dplyr::group_by()`/`dplyr::summarise()` 
+  calls are used. ([#14905](https://github.com/apache/arrow/issues/14905))
+* `dplyr::summarize()` works with division when divisor is a variable. 

Review Comment:
   It's your call; I'm just reading the draft and reacting to it, so take it for
what it's worth. I tend to think of "minor improvements" as the "other" category
for things that don't fall under some other initiative. If I were scanning the
news for "how has arrow's dplyr support improved in this release?", I'd just look
at the "dplyr compatibility" section; I wouldn't expect dplyr changes to be split
between major and minor improvements under different headings. (Also, there's
currently only one item under the dplyr section, and it feels pretty
insignificant to me as a user 🤷 )


