Github user zhzhan commented on a diff in the pull request:
https://github.com/apache/spark/pull/2576#discussion_r18871011
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDDLike.scala ---
@@ -77,6 +77,18 @@ private[sql] trait SchemaRDDLike {
}
/**
+ * Saves the contents of this `SchemaRDD` as an ORC file, preserving the schema. Files that
--- End diff --
Just for your reference, I did it with the following code. When users import it via
"import org.apache.spark.sql.hive.orc._", they get ORC source support. (Not sure how to
format this better in the comment box :) )
package org.apache.spark.sql.hive

import scala.reflect.runtime.universe.TypeTag

import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql.SchemaRDD
import org.apache.spark.sql.catalyst.ScalaReflection

// OrcRelation and WriteToOrcFile are the ORC support classes added in this PR.
package object orc {

  implicit class OrcContext(sqlContext: HiveContext) {
    // Reads an existing ORC file as a SchemaRDD.
    def orcFile(filePath: String): SchemaRDD =
      new SchemaRDD(
        sqlContext,
        OrcRelation(filePath, Some(sqlContext.sparkContext.hadoopConfiguration), sqlContext))

    // Creates an empty ORC file whose schema is derived from the product type A.
    def createOrcFile[A <: Product : TypeTag](
        path: String,
        allowExisting: Boolean = true,
        conf: Configuration = new Configuration()): SchemaRDD = {
      new SchemaRDD(
        sqlContext,
        OrcRelation.createEmpty(
          path, ScalaReflection.attributesFor[A], allowExisting, conf, sqlContext))
    }
  }

  implicit class OrcSchemaRDD(rdd: SchemaRDD) {
    // Writes the SchemaRDD out as an ORC file, preserving the schema.
    def saveAsOrcFile(path: String): Unit = {
      rdd.sqlContext.executePlan(WriteToOrcFile(path, rdd.logicalPlan)).toRdd
    }
  }
}
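
For reference, a minimal usage sketch of how a user would pick up these implicits (paths, the
`sc` SparkContext, and table names are illustrative; this assumes the package object above is
compiled into org.apache.spark.sql.hive):

import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.orc._

val hiveContext = new HiveContext(sc)

// orcFile comes from the implicit OrcContext wrapper around HiveContext.
val people = hiveContext.orcFile("/tmp/people.orc")
people.registerTempTable("people")

// saveAsOrcFile comes from the implicit OrcSchemaRDD wrapper around SchemaRDD.
hiveContext.sql("SELECT name, age FROM people WHERE age > 21").saveAsOrcFile("/tmp/adults.orc")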