thisisnic commented on a change in pull request #67:
URL: https://github.com/apache/arrow-cookbook/pull/67#discussion_r703844580



##########
File path: r/content/specify_data_types_and_schemas.Rmd
##########
@@ -0,0 +1,205 @@
+# Defining Data Types
+
+As discussed in previous chapters, Arrow automatically handles the conversion of objects from native R data types to Arrow data types.
+However, you might want to manually define data types, for example, to ensure interoperability with databases and data warehouse systems.
+
+## Specify data types when creating an Arrow table from an R object
+
+### Problem
+
+You want to manually specify Arrow data types when converting an object from a data frame to an Arrow object.
+
+### Solution
+
+```{r, use_schema}
+# create a data frame 
+share_data <- tibble::tibble(
+  company = c("AMZN", "GOOG", "BKNG", "TSLA"),
+  price = c(3463.12, 2884.38, 2300.46, 732.39),
+  date = rep(as.Date("2021-09-02"), 4)
+)
+
+# define field names and types
+share_schema <- schema(
+  company = utf8(),
+  price = float32(),
+  date = date64()
+)
+
+# create arrow Table containing data and schema
+share_data_arrow <- Table$create(share_data, schema = share_schema)
+
+share_data_arrow
+```
+```{r, test_use_schema, opts.label = "test"}
+test_that("use_schema works as expected", {
+  expect_s3_class(share_data_arrow, "Table")
+  expect_equal(share_data_arrow$schema,
+    schema(company = utf8(),  price = float32(), date = date64())
+  )
+})
+```
+
+## Specify data types when reading in files
+
+### Problem
+
+You want to manually specify Arrow data types when reading in files.
+
+### Solution
+
+```{r, use_schema_dataset}
+# create a data frame 
+share_data <- tibble::tibble(
+  company = c("AMZN", "GOOG", "BKNG", "TSLA"),
+  price = c(3463.12, 2884.38, 2300.46, 732.39),
+  date = rep(as.Date("2021-09-02"), 4)
+)
+
+# write dataset to disk
+write_dataset(share_data, path = "shares")
+
+# define field names and types
+share_schema <- schema(

Review comment:
       Can't object to suggestions that improve clarity; cheers, updated!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


