Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20020#discussion_r158203446

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala ---
@@ -20,30 +20,32 @@ package org.apache.spark.sql.execution.command

 import org.apache.hadoop.conf.Configuration

 import org.apache.spark.SparkContext
-import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.catalyst.expressions.Attribute
+import org.apache.spark.sql.catalyst.plans.logical.{AnalysisBarrier, Command, LogicalPlan}
+import org.apache.spark.sql.execution.SparkPlan
 import org.apache.spark.sql.execution.datasources.BasicWriteJobStatsTracker
+import org.apache.spark.sql.execution.datasources.FileFormatWriter
 import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
 import org.apache.spark.util.SerializableConfiguration

-
 /**
  * A special `RunnableCommand` which writes data out and updates metrics.
  */
-trait DataWritingCommand extends RunnableCommand {
-
+trait DataWritingCommand extends Command {
   /**
    * The input query plan that produces the data to be written.
+   * IMPORTANT: the input query plan MUST be analyzed, so that we can carry its output columns
+   * to [[FileFormatWriter]].
    */
   def query: LogicalPlan

-  // We make the input `query` an inner child instead of a child in order to hide it from the
-  // optimizer. This is because optimizer may not preserve the output schema names' case, and we
-  // have to keep the original analyzed plan here so that we can pass the corrected schema to the
-  // writer. The schema of analyzed plan is what user expects(or specifies), so we should respect
-  // it when writing.
-  override protected def innerChildren: Seq[LogicalPlan] = query :: Nil
+  override def children: Seq[LogicalPlan] = query :: Nil
--- End diff --

ah, right. We should add the barrier when passing in the query.