[GitHub] [arrow] ianmcook commented on a change in pull request #10014: ARROW-11477: [R][Doc] Reorganize and improve README and vignette content

GitBox Wed, 14 Apr 2021 14:23:03 -0700


ianmcook commented on a change in pull request #10014:
URL: https://github.com/apache/arrow/pull/10014#discussion_r613596152




##########
File path: r/README.md
##########
@@ -4,250 +4,283 @@
 
[![CI](https://github.com/apache/arrow/workflows/R/badge.svg?event=push)](https://github.com/apache/arrow/actions?query=workflow%3AR+branch%3Amaster+event%3Apush)
 
[![conda-forge](https://img.shields.io/conda/vn/conda-forge/r-arrow.svg)](https://anaconda.org/conda-forge/r-arrow)
 
-[Apache Arrow](https://arrow.apache.org/) is a cross-language
-development platform for in-memory data. It specifies a standardized
+**[Apache Arrow](https://arrow.apache.org/) is a cross-language
+development platform for in-memory data.** It specifies a standardized
 language-independent columnar memory format for flat and hierarchical
 data, organized for efficient analytic operations on modern hardware. It
 also provides computational libraries and zero-copy streaming messaging
 and interprocess communication.
 
-The `arrow` package exposes an interface to the Arrow C++ library to
-access many of its features in R. This includes support for analyzing
-large, multi-file datasets (`open_dataset()`), working with individual
-Parquet (`read_parquet()`, `write_parquet()`) and Feather
-(`read_feather()`, `write_feather()`) files, as well as lower-level
-access to Arrow memory and messages.
+**The `arrow` package exposes an interface to the Arrow C++ library,
+enabling access to many of its features in R.** It provides low-level
+access to the Arrow C++ library API and higher-level access through a
+`dplyr` backend and familiar R functions.
+
+## What can the `arrow` package do?
+
+-   Read and write **Parquet files** (`read_parquet()`,
+    `write_parquet()`), an efficient and widely used columnar format
+-   Read and write **Feather files** (`read_feather()`,
+    `write_feather()`), a format optimized for speed and
+    interoperability
+-   Open or write **large, multi-file datasets** with a single function
+    call (`open_dataset()`, `write_dataset()`)
+-   Read **large CSV and JSON files** with excellent **speed and

Review comment:
       I don't really think this is the right place for links; this is meant to 
be a fairly breezy list of key features, with elaboration provided below. I'd 
prefer to table this idea for later.





-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
