GitHub user chuxi opened a pull request:

    https://github.com/apache/spark/pull/2082

    SPARK-2096 [SQL]: Correctly parse dot notations for accessing an array of 
structs

    For example, "arrayOfStruct" is an array of structs, and every element of
    this array has a field called "field1". "arrayOfStruct[0].field1" should
    access the value of "field1" in the first element of "arrayOfStruct", but
    the SQL parser (in sql-core) treats "field1" as an alias. Likewise,
    "arrayOfStruct.field1" should access the values of "field1" in all elements
    of the array and return them as an array, but the SQL parser cannot
    resolve it.
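
The intended meaning of the two notations can be sketched with plain Scala collections, without Spark. This is only an illustration of the semantics described above, not the parser change in this pull request. The names DotNotationSketch, Struct, fieldAt and fieldOfAll are hypothetical, and the sample data is an assumption chosen to match the expected answer in the JsonSuite test.

```scala
object DotNotationSketch {
  // Hypothetical stand-in for one struct: field name -> value.
  type Struct = Map[String, Any]

  // Meaning of arrayOfStruct[index].field: the field of one element.
  // A missing field is reported as null, as in the expected test answer.
  def fieldAt(array: Seq[Struct], index: Int, field: String): Any =
    array(index).getOrElse(field, null)

  // Meaning of arrayOfStruct.field: the field of every element, as an array.
  def fieldOfAll(array: Seq[Struct], field: String): Seq[Any] =
    array.map(_.getOrElse(field, null))

  // Assumed sample data: three structs, not all of which define both fields.
  val arrayOfStruct: Seq[Struct] = Seq(
    Map("field1" -> true, "field2" -> "str1"),
    Map("field1" -> false),
    Map.empty[String, Any]
  )

  def main(args: Array[String]): Unit = {
    println(fieldAt(arrayOfStruct, 0, "field1"))  // true
    println(fieldOfAll(arrayOfStruct, "field1"))  // List(true, false, null)
    println(fieldOfAll(arrayOfStruct, "field2"))  // List(str1, null, null)
  }
}
```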
    
    With this change, the previously ignored test case in JsonSuite ("Complex
    field and type inferring (Ignored)") passes after a small modification.
    The modified part of the test is:
    checkAnswer(
    sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
    (Seq(true, false, null), Seq("str1", null, null)) :: Nil
    )
    One remaining problem is repeated nested structures, such as
    arrayOfStruct.field1.arrayOfStruct.field1 or
    arrayOfStruct[0].field1.arrayOfStruct[0].field1.
    I plan to leave that problem aside for now and try to support "select
    arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable where
    arrayOfStruct.field1==true".
    In addition, my friend anyweil (Wei Li) solved the problem of
    arrayOfStruct.field1 and its Filter part (that is, parsing the where clause).
    I am new here but will continue working on Spark :)
    
    I looked into the "where arrayOfStruct.field1==true" case.
    Supporting it would require modifying every kind of comparisonExpression,
    and I do not think adding this feature is worthwhile, so I dropped it.
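
One way to see the difficulty, as a rough sketch in plain Scala rather than Spark code: arrayOfStruct.field1 yields an array, while the literal true is a scalar, so a comparison between them has no single obvious meaning. The object ArrayComparisonSketch and the helpers anyEquals and allEqual are hypothetical, and the two readings shown are assumptions for illustration, not behavior defined by this pull request.

```scala
object ArrayComparisonSketch {
  // Values that arrayOfStruct.field1 would yield for the three sample structs.
  val field1Values: Seq[Any] = Seq(true, false, null)

  // Comparing the whole array with a scalar never matches.
  val naive: Boolean = field1Values.equals(true)

  // Hypothetical reading 1: the predicate holds if any element equals the literal.
  def anyEquals(values: Seq[Any], literal: Any): Boolean =
    values.exists(_ == literal)

  // Hypothetical reading 2: the predicate holds only if every element equals it.
  def allEqual(values: Seq[Any], literal: Any): Boolean =
    values.forall(_ == literal)

  def main(args: Array[String]): Unit = {
    println(naive)                         // false
    println(anyEquals(field1Values, true)) // true
    println(allEqual(field1Values, true))  // false
  }
}
```

Whichever reading were chosen, the same decision would have to be made for each comparison operator, which is consistent with the concern about touching every comparison expression.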
    Over.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chuxi/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2082.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2082
    
----
commit b1cb4fb4e3da7ed54ac875afc20a81f25310fa87
Author: chuxi <chuxik...@163.com>
Date:   2014-08-21T12:47:25Z

    Correctly parse dot notations for accessing an array of structs

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
