thisisnic commented on code in PR #14514:
URL: https://github.com/apache/arrow/pull/14514#discussion_r1005366066

##########
r/vignettes/flight.Rmd:
##########
@@ -1,48 +1,38 @@
 ---
-title: "Connecting to Flight RPC Servers"
+title: "Connecting to a flight server"
+description: >
+  Learn how to efficiently stream Apache Arrow data objects across a
+  network using Arrow Flight
 output: rmarkdown::html_vignette
 vignette: >
-  %\VignetteIndexEntry{Connecting to Flight RPC Servers}
+  %\VignetteIndexEntry{Connecting to a flight server}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
 ---
-[**Flight**](https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/)
-is a general-purpose client-server framework for high performance
-transport of large datasets over network interfaces, built as part of the
-[Apache Arrow](https://arrow.apache.org) project.
+[Arrow Flight](https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/) is a general-purpose client-server framework for high performance transport of large datasets over network interfaces, built as part of the Apache Arrow project. It allows for highly efficient data transfer by several means:
-Flight allows for highly efficient data transfer as it:
+* Flight removes the need for deserialization during data transfer.
+* Flight allows for parallel data streaming.
+* Flight employs optimizations designed to take advantage of Arrow's columnar format.
-* removes the need for deserialization during data transfer
-* allows for parallel data streaming
-* is highly optimized to take advantage of Arrow's columnar format.
+The `arrow` package provides methods for connecting to Flight servers to send and receive data.
Review Comment:
   In previous iterations of doc refactoring, we decided to refer to packages on the first instance with a link, and on the subsequent instances with a link to that package, instead of backticks as it makes the sentence more skimmable (and tbh were just copying [how the dplyr docs do it](https://dplyr.tidyverse.org/articles/programming.html) ;) ) There's a little bit in here about that: https://github.com/apache/arrow/blob/master/r/STYLE.md.

##########
r/vignettes/flight.Rmd:
##########
@@ -1,48 +1,38 @@
 ---
-title: "Connecting to Flight RPC Servers"
+title: "Connecting to a flight server"
+description: >
+  Learn how to efficiently stream Apache Arrow data objects across a
+  network using Arrow Flight
 output: rmarkdown::html_vignette
 vignette: >
-  %\VignetteIndexEntry{Connecting to Flight RPC Servers}
+  %\VignetteIndexEntry{Connecting to a flight server}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
 ---
-[**Flight**](https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/)
-is a general-purpose client-server framework for high performance
-transport of large datasets over network interfaces, built as part of the
-[Apache Arrow](https://arrow.apache.org) project.
+[Arrow Flight](https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/) is a general-purpose client-server framework for high performance transport of large datasets over network interfaces, built as part of the Apache Arrow project. It allows for highly efficient data transfer by several means:
-Flight allows for highly efficient data transfer as it:
+* Flight removes the need for deserialization during data transfer.
+* Flight allows for parallel data streaming.
+* Flight employs optimizations designed to take advantage of Arrow's columnar format.
-* removes the need for deserialization during data transfer
-* allows for parallel data streaming
-* is highly optimized to take advantage of Arrow's columnar format.
+The `arrow` package provides methods for connecting to Flight servers to send and receive data.
-The arrow package provides methods for connecting to Flight RPC servers
-to send and receive data.
+## Prerequisites
-## Getting Started
-
-The `flight` functions in the package use [reticulate](https://rstudio.github.io/reticulate/) to call methods in the
-[pyarrow](https://arrow.apache.org/docs/python/api/flight.html) Python package.
-
-Before using them for the first time,
-you'll need to be sure you have reticulate and pyarrow installed:
+At present the `arrow` package in R does not supply an independent implementation of Arrow Flight: it works by calling [Flight methods supplied by PyArrow](https://arrow.apache.org/docs/python/api/flight.html) Python, and requires both the [`reticulate`](https://rstudio.github.io/reticulate/) package and the Python PyArrow library to be installed. If you are using them for the first time you can install them like this:

Review Comment:
   Love this phrasing, this is much clearer

##########
r/vignettes/flight.Rmd:
##########
@@ -84,6 +73,13 @@ client %>%
 Because `flight_get()` returns an Arrow data structure, you can directly pipe its result into a [dplyr](https://dplyr.tidyverse.org/) workflow.
-See `vignette("dataset", package = "arrow")` for more information on working with Arrow objects via a dplyr interface.
+See `vignette("data_wrangling", package = "arrow")` for more information on working with Arrow objects via a `dplyr` interface.
+
+## Further reading
+
+- The specification of the [Flight remote procedure call protocol](https://arrow.apache.org/docs/format/Flight.html) is listed on the Arrow project homepage
+- The Arrow C++ documentation contains a list of [best practices](https://arrow.apache.org/docs/cpp/flight.html#best-practices) for Arrow Flight.
+- A detailed worked example of an Arrow Flight server in Python is provided in the [Apache Arrow Python Cookbook](https://arrow.apache.org/cookbook/py/flight.html).
Review Comment:
   Good call, great addition

##########
r/vignettes/flight.Rmd:
##########
@@ -1,48 +1,38 @@
 ---
-title: "Connecting to Flight RPC Servers"
+title: "Connecting to a flight server"

Review Comment:
   Should "flight" here be capitalised?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org