thisisnic commented on code in PR #14514:
URL: https://github.com/apache/arrow/pull/14514#discussion_r1012447084
##########
r/vignettes/data_objects.Rmd:
##########
@@ -0,0 +1,206 @@
+---
+title: "Data objects"
+description: >
+ Learn about Scalar, Array, Table, and Dataset objects in `arrow`
+ (among others), how they relate to each other, as well as their
+ relationships to familiar R objects like data frames and vectors
+output: rmarkdown::html_vignette
+---
+
+This article describes the various data object types supplied by `arrow`, and documents how these objects are structured.
+
+```{r include=FALSE}
+library(arrow, warn.conflicts = FALSE)
+```
+
+The `arrow` package supplies several object classes that are used to represent data. `RecordBatch`, `Table`, and `Dataset` objects are two-dimensional rectangular data structures used to store tabular data. For columnar, one-dimensional data, the `Array` and `ChunkedArray` classes are provided. Finally, `Scalar` objects represent individual values. The table below summarizes these objects and shows how you can create new instances using the [`R6`](https://r6.r-lib.org/) class object, as well as convenience functions that provide the same functionality in a more traditional R-like fashion:
+
+| Dim | Class          | How to create an instance                     | Convenience function            |
+| --- | -------------- | --------------------------------------------- | ------------------------------- |
+| 0   | `Scalar`       | `Scalar$create(value, type)`                  |                                 |
+| 1   | `Array`        | `Array$create(vector, type)`                  |                                 |
+| 1   | `ChunkedArray` | `ChunkedArray$create(..., type)`              | `chunked_array(..., type)`      |
+| 2   | `RecordBatch`  | `RecordBatch$create(...)`                     | `record_batch(...)`             |
+| 2   | `Table`        | `Table$create(...)`                           | `arrow_table(...)`              |
+| 2   | `Dataset`      | `Dataset$create(sources, schema)`             | `open_dataset(sources, schema)` |
+
+Later in the article we'll look at each of these in more detail.
+
+For now we note that each of these object classes corresponds to a class of the same name in the underlying Arrow C++ library. It is also worth mentioning that the `arrow` package also defines classes that do not exist in the C++ library including:
+
+* `ArrowDatum`: inherited by `Scalar`, `Array`, and `ChunkedArray`
+* `ArrowTabular`: inherited by `RecordBatch` and `Table`
+* `ArrowObject`: inherited by all Arrow objects
Review Comment:
Sorry, I was reading it as new content as I'd not looked at the getting
started page in ages and ages! Honestly, I'd just err on the side of your own
judgment in cases like this; I agree this section is for one of the dev
vignettes.