Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1352#discussion_r138558167
--- Diff:
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSource.scala ---
@@ -205,19 +220,188 @@ class CarbonSource extends CreatableRelationProvider
with RelationProvider
* by setting the output committer class in the conf of
spark.sql.sources.outputCommitterClass.
*/
def prepareWrite(
- sparkSession: SparkSession,
- job: Job,
- options: Map[String, String],
- dataSchema: StructType): OutputWriterFactory = new
CarbonStreamingOutputWriterFactory()
+ sparkSession: SparkSession,
+ job: Job,
+ options: Map[String, String],
+ dataSchema: StructType): OutputWriterFactory = {
-/**
- * When possible, this method should return the schema of the given
`files`. When the format
- * does not support inference, or no valid files are given, this method should return
None. In these cases
- * Spark will require that user specify the schema manually.
- */
+ // Check if table with given path exists
+ validateTable(options.get("path").get)
+
+ // Check if streaming data schema matches with carbon table schema
+ // Data from socket source does not have schema attached to it,
+ // Following check is to ignore schema validation for socket source.
+ if (!(dataSchema.size.equals(1) &&
--- End diff --
Why not simply compare with `== 1` (i.e. `dataSchema.size == 1`) instead of calling `.equals(1)`? In Scala, `==` is null-safe and delegates to `equals`, and is the idiomatic way to compare values.
---