[GitHub] spark pull request: SPARK-13827[SQL] Can't add subquery to an oper...

gatorsmile Sat, 12 Mar 2016 16:51:07 -0800

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11658#discussion_r55930362
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -54,8 +55,26 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
     
       def toSQL: String = {
         val canonicalizedPlan = Canonicalizer.execute(logicalPlan)
    +    val outputNames = logicalPlan.output.map(_.name)
    +    val qualifiers = logicalPlan.output.flatMap(_.qualifiers).distinct
    +
    +    // Keep the qualifier information by using it as sub-query name, if there is only one qualifier
    +    // present.
    +    val finalName = if (qualifiers.isEmpty || qualifiers.length > 1) {
    +      SQLBuilder.newSubqueryName
    +    } else {
    +      qualifiers.head
    +    }
    +
    +    // Canonicalizer will remove all naming information, we should add it back by adding an extra
    +    // Project and alias the outputs.
    +    val aliasedOutput = canonicalizedPlan.output.zip(outputNames).map {
    +      case (attr, name) => Alias(attr.withQualifiers(Nil), name)()
    +    }
    +    val finalPlan = Project(aliasedOutput, SubqueryAlias(finalName, canonicalizedPlan))
    --- End diff --
    
    Just realized that Hive raises an error when CREATE VIEW contains duplicate 
column names. So I guess we do not need to repeat that check in Spark?
    
    BTW, the top Project could have duplicate names when creating a view if 
users specify the schema. For example, 
    ```SQL
    CREATE VIEW testView(id3, id4) AS SELECT * FROM jt1 JOIN jt2 ON jt1.id1 == jt2.id1
    ```
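
    To make the duplicate-name case concrete, here is a minimal sketch of such a 
    check in plain Scala. `DuplicateNames` and `duplicates` are hypothetical names 
    for illustration, not part of Spark or Hive, and the case-insensitive 
    comparison is an assumption. The sample inputs assume that both `jt1` and 
    `jt2` expose a column named `id1`.

    ```scala
    // Hypothetical helper, not Spark's API: returns the names that occur more
    // than once in a list of output column names.
    object DuplicateNames {
      // Assumption: names are compared case-insensitively.
      def duplicates(names: Seq[String]): Seq[String] =
        names
          .groupBy(_.toLowerCase)
          .collect { case (lower, group) if group.size > 1 => lower }
          .toSeq
          .sorted

      def main(args: Array[String]): Unit = {
        // Names produced by the underlying SELECT * over the join
        // (assumed: both tables have a column called id1).
        println(duplicates(Seq("id1", "id1")))
        // Names after the user-specified view schema (id3, id4) is applied.
        println(duplicates(Seq("id3", "id4")))
      }
    }
    ```

    The first call reports `id1` as duplicated, while the second finds nothing, 
    which is why the names of the top Project alone are not enough to reject 
    such a view.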



