GitHub user chuxi opened a pull request:
https://github.com/apache/spark/pull/2082
SPARK-2096 [SQL]: Correctly parse dot notations for accessing an array of
structs
For example, suppose "arrayOfStruct" is an array of structs and every element
of this array has a field called "field1". "arrayOfStruct[0].field1" should
access the value of "field1" in the first element of "arrayOfStruct", but the
SQL parser (in sql-core) treats "field1" as an alias. Also,
"arrayOfStruct.field1" should access the values of "field1" in all elements of
this array of structs and then return those values as an array, but the SQL
parser cannot resolve it.
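The two access forms described above can be illustrated outside of Spark with a minimal sketch in plain Scala. It only models the intended semantics and is not the parser change in this pull request. The names DotNotationSketch, Struct, fieldOfElement and fieldOfAll are hypothetical, and the sample data is an assumption chosen to be consistent with the expected answer in the modified test.

```scala
object DotNotationSketch {
  // A struct is modeled as a field-name -> value map
  // (a hypothetical stand-in for a Spark SQL struct).
  type Struct = Map[String, Any]

  // Models "arrayOfStruct[index].field": the value of `field` in one element.
  // A missing field yields null.
  def fieldOfElement(arr: Seq[Struct], index: Int, field: String): Any =
    arr(index).getOrElse(field, null)

  // Models "arrayOfStruct.field": the value of `field` from every element,
  // returned as an array. A missing field yields null for that element.
  def fieldOfAll(arr: Seq[Struct], field: String): Seq[Any] =
    arr.map(_.getOrElse(field, null))

  // Assumed sample data, not taken from the pull request.
  val arrayOfStruct: Seq[Struct] = Seq(
    Map("field1" -> true, "field2" -> "str1"),
    Map("field1" -> false),
    Map("field3" -> null)
  )
}
```

With this data, fieldOfAll(arrayOfStruct, "field1") gives one value per element, using null where an element has no "field1", which is the shape the modified test expects.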
With a small modification, the currently ignored test case in JsonSuite
("Complex field and type inferring (Ignored)") now passes.
The modified part of the test is:
checkAnswer(
sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
(Seq(true, false, null), Seq("str1", null, null)) :: Nil
)
However, repeated nested structures are still a problem, for example
arrayOfStruct.field1.arrayOfStruct.field1 or
arrayOfStruct[0].field1.arrayOfStruct[0].field1
I plan to leave this problem aside for now and try to add support for
"select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable where
arrayOfStruct.field1==true".
Besides, my friend anyweil (Wei Li) solved the problem of
arrayOfStruct.field1 and its Filter part (that is, parsing the where clause).
I am new here but will continue working on Spark :)
I looked into the "where arrayOfStruct.field1==true" case.
Supporting it would require modifying every kind of comparisonExpression, and
I do not think adding this feature is worth that, so I dropped it.
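One possible reading of why the where clause is awkward: "arrayOfStruct.field1" yields an array, so comparing it with a scalar such as true needs an extra rule, and that rule would have to be applied to each comparison operator. The sketch below is plain Scala, not Spark code. Both interpretations and the names ArrayComparisonSketch, anyEquals and allEqual are assumptions made only for illustration.

```scala
object ArrayComparisonSketch {
  // Values that a projection like arrayOfStruct.field1 would produce
  // (assumed sample data).
  val field1Values: Seq[Any] = Seq(true, false, null)

  // Interpretation A (assumed): the row matches if any element equals the literal.
  def anyEquals(values: Seq[Any], literal: Any): Boolean =
    values.exists(_ == literal)

  // Interpretation B (assumed): the row matches only if every element equals the literal.
  def allEqual(values: Seq[Any], literal: Any): Boolean =
    values.forall(_ == literal)
}
```

The two interpretations disagree on the same data, so the parser or analyzer would have to pick one for ==, <, >, and every other comparison.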
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/chuxi/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2082.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2082
----
commit b1cb4fb4e3da7ed54ac875afc20a81f25310fa87
Author: chuxi <[email protected]>
Date: 2014-08-21T12:47:25Z
Correctly parse dot notations for accessing an array of structs
----