GitHub user davies opened a pull request:
https://github.com/apache/spark/pull/2369
[SPARK-3500] [SQL] use JavaSchemaRDD as SchemaRDD._jschema_rdd
Currently, SchemaRDD._jschema_rdd is a Scala SchemaRDD, so Scala API methods
such as coalesce() and repartition() cannot easily be called from Python:
there is no way to supply the implicit parameter `ord`. Since _jrdd is a
JavaRDD, _jschema_rdd should likewise be a JavaSchemaRDD.
This patch changes _jschema_rdd to a JavaSchemaRDD and adds an assert for it.
If a method is missing from JavaSchemaRDD, it is called through
_jschema_rdd.baseSchemaRDD().xxx().
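The delegation described above can be sketched in plain Python. This is a
hedged, simplified illustration, not the actual PySpark source: the classes
BaseSchemaRDD and JavaSchemaRDD below are stand-ins for the real py4j proxies,
and the method bodies are hypothetical. It shows the two call paths: a method
available directly on the Java wrapper, and one reached via baseSchemaRDD().

```python
class BaseSchemaRDD:
    """Stand-in for the Scala SchemaRDD, which has the full API."""
    def printSchema(self):
        # Hypothetical behavior for illustration only.
        return "root"

class JavaSchemaRDD:
    """Stand-in for the Java wrapper, exposing only a subset of methods."""
    def __init__(self):
        self._base = BaseSchemaRDD()

    def coalesce(self, num_partitions):
        # Exposed directly on the Java API, so no implicit `ord` is needed.
        return "coalesced to %d partitions" % num_partitions

    def baseSchemaRDD(self):
        # Escape hatch back to the underlying Scala SchemaRDD.
        return self._base

class SchemaRDD:
    """Python-side wrapper holding a JavaSchemaRDD, as the patch proposes."""
    def __init__(self):
        self._jschema_rdd = JavaSchemaRDD()
        # The patch adds an assert along these lines:
        assert isinstance(self._jschema_rdd, JavaSchemaRDD), \
            "_jschema_rdd must be a JavaSchemaRDD"

    def coalesce(self, num_partitions):
        # Call path 1: the method exists on JavaSchemaRDD.
        return self._jschema_rdd.coalesce(num_partitions)

    def printSchema(self):
        # Call path 2: missing from JavaSchemaRDD, so go through
        # baseSchemaRDD() to reach the Scala SchemaRDD.
        return self._jschema_rdd.baseSchemaRDD().printSchema()
```

Under these assumptions, `SchemaRDD().coalesce(2)` uses the Java wrapper
directly, while `SchemaRDD().printSchema()` falls back to the base RDD.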
BTW, do we still need JavaSQLContext?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/davies/spark fix_schemardd
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2369.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2369
----
commit abee1595ff38fc28c9a84aefcd25339a85e48c0d
Author: Davies Liu <[email protected]>
Date: 2014-09-12T06:17:46Z
use JavaSchemaRDD as SchemaRDD._jschema_rdd
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---