Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3213#discussion_r20406598
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala ---
    @@ -131,6 +134,69 @@ class SchemaRDD(
        */
       lazy val schema: StructType = queryExecution.analyzed.schema
     
    +  /** Transforms a single Row to JSON using Jackson
    +    *
    +    * @param jsonFactory a JsonFactory object to construct a JsonGenerator
    +    * @param rowSchema the schema object used for conversion
    +    * @param row The row to convert
    +    */
    +  private def rowToJSON(rowSchema: StructType, jsonFactory: JsonFactory)(row: Row): String = {
    +    val writer = new StringWriter()
    +    val gen = jsonFactory.createGenerator(writer)
    +
    +    def valWriter: (DataType, Any) => Unit = {
    +      case(_, null)  => //do nothing
    +      case(StringType, v: String) => gen.writeString(v)
    +    case(TimestampType, v: java.sql.Timestamp) => gen.writeString(v.toString)
    --- End diff ---
    
    If we use a string for a timestamp value, the meaning of the time can
    change (e.g. the data is generated by a developer in one time zone and
    then read by another developer in a different time zone). I feel using
    `getTime` is better (it is not very reader friendly, though).
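
    For example, the timestamp case could write epoch milliseconds instead of
    a string. A rough sketch of that alternative (the `writeTimestamp` helper
    is hypothetical and not part of this PR; it only relies on Jackson's
    `JsonGenerator.writeNumber` and `java.sql.Timestamp.getTime`):

    ```scala
    import java.io.StringWriter
    import com.fasterxml.jackson.core.{JsonFactory, JsonGenerator}

    // Rough sketch: serialize a timestamp as epoch milliseconds (getTime),
    // so the serialized value does not depend on any particular time zone.
    def writeTimestamp(gen: JsonGenerator, v: java.sql.Timestamp): Unit =
      gen.writeNumber(v.getTime)

    val writer = new StringWriter()
    val gen = new JsonFactory().createGenerator(writer)
    writeTimestamp(gen, new java.sql.Timestamp(1415900000000L))
    gen.flush()
    println(writer.toString)  // prints 1415900000000 (no time zone information)
    ```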
    
    @marmbrus What do you think?

