[GitHub] spark pull request: [SPARK-2094][SQL] Exactly once command

marmbrus Thu, 12 Jun 2014 17:20:28 -0700

Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1071#discussion_r13734140
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala ---
    @@ -22,45 +22,69 @@ import org.apache.spark.rdd.RDD
     import org.apache.spark.sql.{SQLContext, Row}
     import org.apache.spark.sql.catalyst.expressions.{GenericRow, Attribute}
     
    +trait PhysicalCommand {
    +  /**
    +   * A concrete command should override this lazy field to wrap up any side effects caused by the
    +   * command or any other computation that should be evaluated exactly once. The value of this field
    +   * can be used as the contents of the corresponding RDD generated from the physical plan of this
    +   * command.
    +   *
    +   * The `execute()` method of all the physical command classes should reference `sideEffect` so
    +   * that the command can be executed eagerly right after the command query is created.
    +   */
    +  protected[sql] lazy val sideEffectResult: Seq[Any] = Seq.empty[Any]
    --- End diff --
    
    > After some thought I think it's not only about naming, the semantics are
    > wrong: an RDD[Row] with an empty row indicates that the schema of the
    > result has no fields, while an empty RDD[Row] can fit schemas with any
    > number of fields. And, for SELECT 1, shouldn't it be an RDD[Row] with a
    > single row containing a 1?
    
    We use a Project to build the actual result.
    
    However, looks like I already did the separation... so that plan sounds good to me :)
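
    For readers following the thread, the exactly-once behaviour described in the doc comment of the diff can be sketched roughly as below. This is only an illustration in plain Scala, not the code from the pull request: `PhysicalCommandSketch`, `CountingCommand`, `runs` and `ExactlyOnceDemo` are hypothetical names, and no Spark classes are used. It relies only on the fact that a Scala `lazy val` is initialized on first access and cached afterwards.

```scala
// Hypothetical stand-in for the trait in the diff; not Spark's actual code.
trait PhysicalCommandSketch {
  // Initialized on first access only; later accesses reuse the cached value.
  protected lazy val sideEffectResult: Seq[Any] = Seq.empty[Any]

  // Every call goes through the same cached lazy value.
  def execute(): Seq[Any] = sideEffectResult
}

// Hypothetical command that counts how often its side effect actually runs.
class CountingCommand extends PhysicalCommandSketch {
  var runs = 0

  override protected lazy val sideEffectResult: Seq[Any] = {
    runs += 1        // the side effect we want to happen exactly once
    Seq("done")      // value that would back the command's result
  }
}

object ExactlyOnceDemo {
  def main(args: Array[String]): Unit = {
    val cmd = new CountingCommand
    cmd.execute()
    cmd.execute()
    println(cmd.runs) // the side effect ran once despite two execute() calls
  }
}
```

    Calling `execute()` repeatedly returns the same cached sequence, so a command whose side effect lives inside the lazy field is not re-run when its result is read again.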



