Github user zhzhan commented on a diff in the pull request:
https://github.com/apache/spark/pull/2576#discussion_r18871011
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDDLike.scala ---
@@ -77,6 +77,18 @@ private[sql] trait SchemaRDDLike {
}
/**
+ * Saves the contents of this `SchemaRDD` as an ORC file, preserving the schema. Files that
--- End diff --
Just for your reference, I did it with the following code. When users import it via
"import org.apache.spark.sql.hive.orc._", they get ORC source support. (Not sure how to
format this better in the comment box :) )
package org.apache.spark.sql.hive

import scala.reflect.runtime.universe.TypeTag

import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql.SchemaRDD
import org.apache.spark.sql.catalyst.ScalaReflection

// OrcRelation and WriteToOrcFile are the ORC support classes added in this PR.
package object orc {

  implicit class OrcContext(sqlContext: HiveContext) {
    // Reads an existing ORC file as a SchemaRDD.
    def orcFile(filePath: String): SchemaRDD =
      new SchemaRDD(
        sqlContext,
        OrcRelation(filePath, Some(sqlContext.sparkContext.hadoopConfiguration), sqlContext))

    // Creates an empty ORC file whose schema is derived from the product type A.
    def createOrcFile[A <: Product : TypeTag](
        path: String,
        allowExisting: Boolean = true,
        conf: Configuration = new Configuration()): SchemaRDD = {
      new SchemaRDD(
        sqlContext,
        OrcRelation.createEmpty(
          path, ScalaReflection.attributesFor[A], allowExisting, conf, sqlContext))
    }
  }

  implicit class OrcSchemaRDD(rdd: SchemaRDD) {
    // Writes the SchemaRDD out as an ORC file, preserving the schema.
    def saveAsOrcFile(path: String): Unit = {
      rdd.sqlContext.executePlan(WriteToOrcFile(path, rdd.logicalPlan)).toRdd
    }
  }
}
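
For reference, a minimal usage sketch of how a user would pick up these implicits (paths, the
`sc` SparkContext, and table names are illustrative; this assumes the package object above is
compiled into org.apache.spark.sql.hive):

import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.orc._

val hiveContext = new HiveContext(sc)

// orcFile comes from the implicit OrcContext wrapper around HiveContext.
val people = hiveContext.orcFile("/tmp/people.orc")
people.registerTempTable("people")

// saveAsOrcFile comes from the implicit OrcSchemaRDD wrapper around SchemaRDD.
hiveContext.sql("SELECT name, age FROM people WHERE age > 21").saveAsOrcFile("/tmp/adults.orc")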