GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/8794
[SPARK-10051][SPARKR] Support collecting data of StructType in DataFrame
Two points in this PR:
1. The original assumption was that a named R list is treated as a struct
in SerDe. This is problematic because some R functions implicitly generate
named lists that are not intended to be structs when transferred by SerDe.
SerDe clients therefore have to explicitly mark a named list as a struct by
changing its class from "list" to "struct".
2. SerDe is in the Spark Core module, while data of StructType is
represented as GenericRow, which is defined in the Spark SQL module. SerDe
can't import GenericRow because, in the Maven build, the Spark SQL module
depends on the Spark Core module, not the other way around. So this PR adds a
registration hook in SerDe that allows SQLUtils in the Spark SQL module to
register its functions for serialization and deserialization of StructType.
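
For reviewers unfamiliar with the pattern, the registration hook in point 2
can be sketched roughly as below. This is an illustration only, written in
Java for brevity (the actual change is in Scala). Every name in it
(SerDeSketch, ObjectWriter, registerSqlWriter, RowSketch, SqlUtilsSketch) and
the type tags 'i', 'c' and 's' are hypothetical and are not taken from the
patch. The lower-level class knows only a callback interface; the
higher-level class installs a callback that handles its own row type.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for the lower-level module (Spark Core in the PR).
// It never references the row type defined in the higher-level module.
final class SerDeSketch {
    /** Hypothetical hook type: returns true if it handled the object. */
    interface ObjectWriter {
        boolean write(DataOutputStream out, Object obj) throws IOException;
    }

    // The registered hook; null until the higher-level module installs one.
    private static volatile ObjectWriter sqlWriter = null;

    static void registerSqlWriter(ObjectWriter writer) {
        sqlWriter = writer;
    }

    static void writeObject(DataOutputStream out, Object obj) throws IOException {
        if (obj instanceof Integer) {
            out.writeByte('i');            // assumed type tag for integers
            out.writeInt((Integer) obj);
        } else if (obj instanceof String) {
            out.writeByte('c');            // assumed type tag for strings
            out.writeUTF((String) obj);
        } else {
            // Unknown type: delegate to the hook, if one has been registered.
            ObjectWriter w = sqlWriter;
            if (w == null || !w.write(out, obj)) {
                throw new IllegalArgumentException(
                    "Unsupported type: " + obj.getClass().getName());
            }
        }
    }

    // Convenience helper for the sketch: serialize one object to a byte array.
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            writeObject(out, obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}

// Stand-ins for the higher-level module (Spark SQL in the PR).
final class RowSketch {
    final Object[] values;
    RowSketch(Object... values) { this.values = values; }
}

final class SqlUtilsSketch {
    // Installs the struct writer; the dependency points from SQL to Core only.
    static void install() {
        SerDeSketch.registerSqlWriter((out, obj) -> {
            if (!(obj instanceof RowSketch)) return false;
            RowSketch row = (RowSketch) obj;
            out.writeByte('s');            // assumed type tag for structs
            out.writeInt(row.values.length);
            for (Object v : row.values) SerDeSketch.writeObject(out, v);
            return true;
        });
    }
}
```

In this sketch, SerDeSketch.toBytes(new RowSketch(1, "a")) throws
IllegalArgumentException until SqlUtilsSketch.install() has been called;
afterwards the row is written field by field through the registered hook.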
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sun-rui/spark SPARK-10051
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8794.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8794
----
commit d2b9f37f0e012c78d6c6567da84a506012324289
Author: Sun Rui <[email protected]>
Date: 2015-09-07T02:27:59Z
Support collecting StructType from DataFrame.
commit e60a1a5415346554679d9a50a58a5029c144bbef
Author: Sun Rui <[email protected]>
Date: 2015-09-17T05:11:16Z
Support struct type in createDataFrame().
----