GitHub user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9769#issuecomment-157589422
@shivaram, the R raw type is intended to hold raw bytes, while an integer vector
holds 32-bit integer values. The R raw type maps to the Spark SQL binary type,
which is internally represented as Array[Byte].
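For reference, the distinction is visible directly in base R (plain R, nothing SparkR-specific):
```
# A raw vector stores single bytes; an integer vector stores 32-bit values.
r <- as.raw(c(1, 2, 3))
typeof(r)          # "raw"
r                  # 01 02 03  (byte values, printed in hex)
i <- c(1L, 2L, 3L)
typeof(i)          # "integer"
```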
This PR solves two problems:
1. Type inference for the raw type is incorrect:
```
> SparkR:::infer_type(as.raw(c(1, 2 ,3)))
[1] "array<binary>"
```
This is incorrect; the inferred type should be "binary" (see the first sketch after this list).
2. Collecting a DataFrame fails if it has any column of binary type. The
bug lies in the logic that determines whether a collected column can be coerced
into an atomic vector (see the second sketch below).
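Regarding problem 1, here is a minimal sketch of the intended inference rule. It is illustrative only, not the actual `infer_type` implementation, and `inferTypeSketch` is a hypothetical helper: a raw vector is a single binary value, so it should map to "binary" rather than to an array of binary elements.
```
# Sketch of the intended rule: raw vectors map to "binary" as a whole,
# other atomic vectors of length > 1 map to array<elementType>.
inferTypeSketch <- function(x) {
  if (is.raw(x)) {
    return("binary")                      # whole raw vector == one binary value
  }
  elemType <- switch(typeof(x),
                     integer   = "integer",
                     double    = "double",
                     character = "string",
                     logical   = "boolean",
                     stop("unsupported type"))
  if (length(x) > 1) paste0("array<", elemType, ">") else elemType
}

inferTypeSketch(as.raw(c(1, 2, 3)))   # "binary"
inferTypeSketch(c(1L, 2L, 3L))        # "array<integer>"
```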
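Regarding problem 2, a minimal sketch of the coercion decision (assumptions only; `canCoerceToAtomic` is a hypothetical helper, not the SparkR internal): binary values come back to R as raw vectors, and a column containing them must stay a list column instead of being flattened.
```
# Sketch: decide whether a collected column can be simplified to an
# atomic vector, or must remain a list column.
canCoerceToAtomic <- function(col) {
  # Raw elements (Spark SQL binary) are themselves vectors of bytes,
  # so they cannot be collapsed into a single atomic vector.
  if (any(vapply(col, is.raw, logical(1)))) {
    return(FALSE)
  }
  all(vapply(col, function(e) is.atomic(e) && length(e) == 1, logical(1)))
}

binCol <- list(as.raw(c(1, 2)), as.raw(3))
canCoerceToAtomic(binCol)   # FALSE -> keep as a list column
intCol <- list(1L, 2L, 3L)
canCoerceToAtomic(intCol)   # TRUE  -> unlist(intCol) gives an atomic vector
```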