Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6107#discussion_r30381898
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -521,66 +525,89 @@ class Analyzer(
       }
     
       /**
    -   * When a SELECT clause has only a single expression and that expression is a
    -   * [[catalyst.expressions.Generator Generator]] we convert the
    -   * [[catalyst.plans.logical.Project Project]] to a [[catalyst.plans.logical.Generate Generate]].
    +   * Rewrites table generating expressions that either need one or more of the following in order
    +   * to be resolved:
    +   *  - concrete attribute references for their output.
    +   *  - to be relocated from a SELECT clause (i.e. from  a [[Project]]) into a [[Generate]]).
    +   *
    +   * Names for the output [[Attributes]] are extracted from [[Alias]] or [[MultiAlias]] expressions
    +   * that wrap the [[Generator]]. If more than one [[Generator]] is found in a Project, an
    +   * [[AnalysisException]] is throw.
        */
    -  object ImplicitGenerate extends Rule[LogicalPlan] {
    +  object ResolveGenerate extends Rule[LogicalPlan] {
         def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -      case Project(Seq(Alias(g: Generator, name)), child) =>
    -        Generate(g, join = false, outer = false,
    -          qualifier = None, UnresolvedAttribute(name) :: Nil, child)
    -      case Project(Seq(MultiAlias(g: Generator, names)), child) =>
    -        Generate(g, join = false, outer = false,
    -          qualifier = None, names.map(UnresolvedAttribute(_)), child)
    +      case p: Generate if !p.child.resolved || !p.generator.resolved => p
    +      case g: Generate if g.resolved == false =>
    +          g.copy(
    +            generatorOutput = makeGeneratorOutput(g.generator, g.generatorOutput.map(_.name)))
    +
    +      case p @ Project(projectList, child) =>
    +        // Holds the resolved generator, if one exists in the project list.
    +        var resolvedGenerator: Generate = null
    +
    +        val newProjectList = projectList.flatMap {
    +          case AliasedGenerator(generator, names) if generator.childrenResolved =>
    +            if (resolvedGenerator != null) {
    +              failAnalysis(
    +                s"Only one generator allowed per select but ${resolvedGenerator.nodeName} and " +
    +                s"and ${generator.nodeName} found.")
    +            }
    +
    +            resolvedGenerator =
    +              Generate(
    +                generator,
    +                join = projectList.size > 1, // Only join if there are other expressions in SELECT.
    --- End diff --
    
    @rxin and I are concerned that allowing more than one generator in a
    single select is too confusing, so we explicitly disallow it. The problem
    is that you get an implicit Cartesian product of the two things you are
    exploding. Users who want more than one generator can chain several
    selects, one generator each, which we think makes the result more obvious.
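
To make the implicit Cartesian product concrete, here is a minimal sketch that models the behavior with plain Scala collections rather than Spark itself. The `Row` case class, its field names, and the sample data are hypothetical and chosen only for illustration; nothing here is taken from the Analyzer code in the diff.

```scala
// Plain-Scala model of generator ("explode") semantics; no Spark dependency.
// Row, its fields, and the sample data are hypothetical illustration only.
object ExplodeDemo {
  final case class Row(id: Int, xs: Seq[Int], ys: Seq[String])

  val rows: Seq[Row] = Seq(Row(1, Seq(10, 20), Seq("a", "b", "c")))

  // Two generators in one select: every x is paired with every y, so one
  // input row silently yields xs.size * ys.size output rows.
  val bothInOneSelect: Seq[(Int, Int, String)] =
    for {
      r <- rows
      x <- r.xs
      y <- r.ys
    } yield (r.id, x, y)

  // One generator per select: the first step explodes xs and carries ys along.
  val afterFirstSelect: Seq[(Int, Int, Seq[String])] =
    rows.flatMap(r => r.xs.map(x => (r.id, x, r.ys)))

  // The second step explodes ys, so the row multiplication is spelled out
  // as two visible steps instead of happening implicitly.
  val afterSecondSelect: Seq[(Int, Int, String)] =
    afterFirstSelect.flatMap { case (id, x, ys) => ys.map(y => (id, x, y)) }

  def main(args: Array[String]): Unit = {
    println(bothInOneSelect.size)
    println(afterSecondSelect == bothInOneSelect)
  }
}
```

In this model both routes produce the same six rows; the difference is that the chained version makes each multiplication step explicit in the query.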

