GitHub user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22839#discussion_r228307138
--- Diff: examples/src/main/r/RSparkSQLExample.R ---
@@ -114,10 +114,14 @@ write.df(namesAndAges, "namesAndAges.parquet", "parquet")
# $example on:manual_load_options_csv$
-df <- read.df("examples/src/main/resources/people.csv", "csv", sep=";", inferSchema=T, header=T)
+df <- read.df("examples/src/main/resources/people.csv", "csv", sep = ";", inferSchema = TRUE, header = TRUE)
namesAndAges <- select(df, "name", "age")
# $example off:manual_load_options_csv$
+# $example on:manual_save_options_orc$
+df <- read.df("examples/src/main/resources/users.orc", "orc")
+write.orc(df, "users_with_options.orc", orc.bloom.filter.columns = "favorite_color", orc.dictionary.key.threshold = 1.0)
+# $example off:manual_save_options_orc$
--- End diff --
Hi, @felixcheung. This is a backport of #22801 (on the master branch).
Could you review this?