GitHub user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4918#discussion_r25919984
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1049,4 +1049,14 @@ class SQLQuerySuite extends QueryTest with BeforeAndAfterAll {
         rdd.toDF().registerTempTable("distinctData")
         checkAnswer(sql("SELECT COUNT(DISTINCT key,value) FROM distinctData"), Row(2))
       }
    +
    +  test("SPARK-6145: ORDER BY test for nested fields") {
    +    jsonRDD(sparkContext.makeRDD(
    +      """{"a": {"b": 1, "a": {"a": 1}}, "c": [{"d": 1}]}""" :: Nil)).registerTempTable("nestedOrder")
    +    // These should be successfully analyzed
    +    sql("SELECT 1 FROM nestedOrder ORDER BY a.b").queryExecution.analyzed
    +    sql("SELECT a.b FROM nestedOrder ORDER BY a.b").queryExecution.analyzed
    +    sql("SELECT 1 FROM nestedOrder ORDER BY a.a.a").queryExecution.analyzed
    +    sql("SELECT 1 FROM nestedOrder ORDER BY c[0].d").queryExecution.analyzed
    --- End diff --
    
    Oh right, `analyzed` is not actually checking analysis.  Ugh...  My mistake.
    
    I think the bug here is that we are partially analyzing nested field 
accesses.  We should not resolve the `a` in `a.a` unless we can also resolve 
the field access too.
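
To make the all-or-nothing rule concrete, here is a minimal sketch. It is a toy model, not Catalyst's resolver: the `Ty`, `IntTy`, `StructTy`, and `NestedResolution` names are hypothetical and exist only for illustration, and array access such as `c[0].d` is left out. The idea is that a dotted path resolves only if every segment resolves, so the leading `a` in `a.a` is never left resolved on its own.

```scala
// Toy schema model (hypothetical types, not Spark's DataType hierarchy).
sealed trait Ty
case object IntTy extends Ty
final case class StructTy(fields: Map[String, Ty]) extends Ty

object NestedResolution {
  // Mirrors the "a" part of the JSON record used in the test above:
  // {"a": {"b": 1, "a": {"a": 1}}}
  val sampleSchema: StructTy = StructTy(Map(
    "a" -> StructTy(Map(
      "b" -> IntTy,
      "a" -> StructTy(Map("a" -> IntTy))
    ))
  ))

  /** Resolves a dotted path such as "a.a.a" against `schema`.
    * Returns Some(type) only when every segment resolves; any failure
    * along the way yields None, so no half-resolved prefix escapes. */
  def resolve(schema: StructTy, path: String): Option[Ty] =
    path.split('.').toList.foldLeft(Option[Ty](schema)) {
      case (Some(StructTy(fields)), name) => fields.get(name) // descend one level
      case _                              => None             // missing field or non-struct parent
    }
}
```

With this model, `NestedResolution.resolve(NestedResolution.sampleSchema, "a.b")` and `"a.a.a"` both resolve to an integer field, while `"a.x"` and `"a.b.c"` resolve to nothing at all rather than to a partially resolved `a`.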
    
    The fact that Hive only supports ordering on things from the `SELECT` 
clause sounds like a bug to me.  That is not how the SQL spec works, right?

