GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/19551#discussion_r146236358
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -499,6 +499,12 @@ test_that("create DataFrame with different data types", {
expect_equal(collect(df), data.frame(l, stringsAsFactors = FALSE))
})
+test_that("SPARK-17902: collect() with stringsAsFactors enabled", {
--- End diff --
```r
> # Ordered vs unordered
> or <- factor(c("Hi", "Med", "Med", "Hi", "Lo"), levels=c("Lo", "Med", "Hi"), ordered=TRUE)
> or1 <- factor(c("Hi", "Med", "Med", "Hi", "Lo"), levels=c("Lo", "Med", "Hi"), ordered=FALSE)
> expect_equal(or, or1)
error: `or` not equal to `or1`.
Attributes: < Component "class": Lengths (2, 1) differ (string compare on first 1) >
Attributes: < Component "class": 1 string mismatch >
```
```r
> # level order mismatch
> or <- factor(c("Hi", "Med", "Med", "Hi", "Lo"), levels=c("Hi", "Lo", "Med"))
> or1 <- factor(c("Hi", "Med", "Med", "Hi", "Lo"), levels=c("Lo", "Med", "Hi"))
> expect_equal(or, or1)
error: `or` not equal to `or1`.
Attributes: < Component "levels": 3 string mismatches >
```
```r
> # Data order mismatch
> or <- factor(c("Lo", "Hi", "Med", "Med", "Hi"), levels=c("Hi", "Lo", "Med"))
> or1 <- factor(c("Hi", "Med", "Med", "Hi", "Lo"), levels=c("Hi", "Lo", "Med"))
> expect_equal(or, or1)
error: `or` not equal to `or1`.
4 string mismatches
```
```r
> # Identical data, levels, and ordering: passes silently
> or <- factor(c("Hi", "Med", "Med", "Hi", "Lo"), levels=c("Hi", "Lo", "Med"))
> or1 <- factor(c("Hi", "Med", "Med", "Hi", "Lo"), levels=c("Hi", "Lo", "Med"))
> expect_equal(or, or1)
```
Would this test address your concern?
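
For reference, here is a rough base R sketch of what each of the failing comparisons above is picking up on. It does not use testthat or SparkR, and the variable names (`vals`, `unordered`, `ord`, `reordered`) are made up for illustration only.

```r
# Base R only; variable names are illustrative, not from the PR.
vals <- c("Hi", "Med", "Med", "Hi", "Lo")

unordered <- factor(vals, levels = c("Lo", "Med", "Hi"))
ord       <- factor(vals, levels = c("Lo", "Med", "Hi"), ordered = TRUE)
reordered <- factor(vals, levels = c("Hi", "Lo", "Med"))

# Ordered vs unordered: an ordered factor carries an extra "ordered" class,
# even though its levels are the same.
class(unordered)                               # "factor"
class(ord)                                     # "ordered" "factor"
identical(levels(unordered), levels(ord))      # TRUE

# Level order mismatch: same labels, but different underlying integer codes.
as.integer(unordered)                          # 3 2 2 3 1
as.integer(reordered)                          # 1 3 3 1 2

# Comparing as character ignores level order; comparing the factors does not.
identical(as.character(unordered), as.character(reordered))  # TRUE
identical(unordered, reordered)                               # FALSE
```

So a factor comparison is only equal when the data, the level order, and the ordered flag all agree, which matches the four cases above.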