nealrichardson commented on a change in pull request #10765:
URL: https://github.com/apache/arrow/pull/10765#discussion_r674293424
##########
File path: r/vignettes/dataset.Rmd
##########

@@ -20,34 +20,36 @@ and what is on the immediate development roadmap.
 
 The [New York City taxi trip record data](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page)
 is widely used in big data exercises and competitions.
 For demonstration purposes, we have hosted a Parquet-formatted version
-of about 10 years of the trip data in a public Amazon S3 bucket.
+of about ten years of the trip data in a public Amazon S3 bucket.
 
 The total file size is around 37 gigabytes, even in the efficient Parquet file
-format. That's bigger than memory on most people's computers, so we can't just
+format. That's bigger than memory on most people's computers, so you can't just
 read it all in and stack it into a single data frame.
 
-In Windows and macOS binary packages, S3 support is included.
-On Linux when installing from source, S3 support is not enabled by default,
+In Windows (for R > 3.6) and macOS binary packages, S3 support is included.
+On Linux, when installing from source, S3 support is not enabled by default,
 and it has additional system requirements.
 See `vignette("install", package = "arrow")` for details.
 
-To see if your `arrow` installation has S3 support, run
+To see if your __arrow__ installation has S3 support, run:
 
 ```{r}
 arrow::arrow_with_s3()
 ```
 
-Even with S3 support enabled network, speed will be a bottleneck unless your
+Even with an S3 support enabled network, speed will be a bottleneck unless your

Review comment:

```suggestion
Even with S3 support enabled, network speed will be a bottleneck unless your
```
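For context on the workflow the quoted vignette text describes: checking a build for S3 support and then opening a larger-than-memory Parquet dataset lazily rather than reading it into a data frame. A minimal sketch follows; the bucket URI is a hypothetical placeholder, not the path the vignette actually uses.

```r
library(arrow)
library(dplyr)

# arrow_with_s3() reports whether this build of arrow was compiled
# with the S3 filesystem enabled (binaries on Windows/macOS include it;
# Linux source builds may not).
if (arrow_with_s3()) {
  # open_dataset() only scans file metadata; no data is read into memory yet.
  # "s3://my-example-bucket/nyc-taxi/" is an illustrative URI.
  ds <- open_dataset("s3://my-example-bucket/nyc-taxi/")

  # Filters and column selection are pushed down to the scan, so only the
  # rows and columns needed for the result are pulled over the network.
  ds %>%
    select(passenger_count) %>%
    filter(passenger_count > 0) %>%
    head() %>%
    collect()
}
```

This lazy-scan design is why the vignette can work with ~37 GB of Parquet files on a laptop: `collect()` is the only step that materializes data in R.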
