[GitHub] spark pull request #17255: [SPARK-19918][SQL] Use TextFileFormat in implemen...

HyukjinKwon Sun, 12 Mar 2017 23:38:05 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17255#discussion_r105597144
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonDataSource.scala ---
    @@ -17,32 +17,30 @@
     
     package org.apache.spark.sql.execution.datasources.json
     
    -import scala.reflect.ClassTag
    -
     import com.fasterxml.jackson.core.{JsonFactory, JsonParser}
     import com.google.common.io.ByteStreams
     import org.apache.hadoop.conf.Configuration
     import org.apache.hadoop.fs.FileStatus
    -import org.apache.hadoop.io.{LongWritable, Text}
    +import org.apache.hadoop.io.Text
     import org.apache.hadoop.mapreduce.Job
    -import org.apache.hadoop.mapreduce.lib.input.{FileInputFormat, TextInputFormat}
    +import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
     
     import org.apache.spark.TaskContext
     import org.apache.spark.input.{PortableDataStream, StreamInputFormat}
     import org.apache.spark.rdd.{BinaryFileRDD, RDD}
    -import org.apache.spark.sql.{AnalysisException, SparkSession}
    +import org.apache.spark.sql.{AnalysisException, Dataset, Encoders, SparkSession}
     import org.apache.spark.sql.catalyst.InternalRow
     import org.apache.spark.sql.catalyst.json.{CreateJacksonParser, JacksonParser, JSONOptions}
    -import org.apache.spark.sql.execution.datasources.{CodecStreams, HadoopFileLinesReader, PartitionedFile}
    +import org.apache.spark.sql.execution.datasources.{CodecStreams, DataSource, HadoopFileLinesReader, PartitionedFile}
    +import org.apache.spark.sql.execution.datasources.text.TextFileFormat
     import org.apache.spark.sql.types.StructType
     import org.apache.spark.unsafe.types.UTF8String
     import org.apache.spark.util.Utils
     
     /**
      * Common functions for parsing JSON files
    - * @tparam T A datatype containing the unparsed JSON, such as [[Text]] or [[String]]
      */
    -abstract class JsonDataSource[T] extends Serializable {
    +abstract class JsonDataSource extends Serializable {
    --- End diff --
    
    The changes in this file basically resemble `CSVDataSource`. (Note that 
the two will be almost identical if https://github.com/apache/spark/pull/17256 is 
merged.)



