[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-09-15 Thread chuxi
Github user chuxi closed the pull request at:

https://github.com/apache/spark/pull/2082


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-10 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-55077709
  
Jenkins, test this please





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-55082124
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20093/consoleFull)
 for   PR 2230 at commit 
[`e1a8898`](https://github.com/apache/spark/commit/e1a88986ddb3cf8147cdb04c35addc974f2acba2).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-55093270
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20093/consoleFull)
 for   PR 2230 at commit 
[`e1a8898`](https://github.com/apache/spark/commit/e1a88986ddb3cf8147cdb04c35addc974f2acba2).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-10 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-55173407
  
Thanks for working on this!  Merged to master.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2230





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54931137
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20024/consoleFull)
 for   PR 2230 at commit 
[`ca43e6d`](https://github.com/apache/spark/commit/ca43e6d5fd38f859256edce1d8a8b108490516e7).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-09 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54937332
  
Actually, Hive doesn't support using dot notation to access fields of a nested 
array, even at one level. Anyway, I will put this support in another PR to keep 
this PR simple and clear :)





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54939694
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20024/consoleFull)
 for   PR 2230 at commit 
[`ca43e6d`](https://github.com/apache/spark/commit/ca43e6d5fd38f859256edce1d8a8b108490516e7).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-09 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54942044
  
The failed test case seems to be a regression test for a new fix. I have 
rebased to include that fix. Test again, please.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54947239
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20034/consoleFull)
 for   PR 2230 at commit 
[`5adb6bf`](https://github.com/apache/spark/commit/5adb6bf0966141917cd960fc4f93c616df371d06).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54955701
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20034/consoleFull)
 for   PR 2230 at commit 
[`5adb6bf`](https://github.com/apache/spark/commit/5adb6bf0966141917cd960fc4f93c616df371d06).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-09 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-55062642
  
Yeah I think the test failure was unrelated, though unfortunately this is 
out of date again.  Mind updating one more time?  Thanks!





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-09 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-55068488
  
rebase done, test again please.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-08 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54900671
  
ok to test





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-08 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54900963
  
Hmm, does Hive support using dot notation to access fields that are 
arbitrarily nested in arrays?  If not I think it would be better to just 
support one level.  Also, the code added for that feature uses a lot of mutable 
state and is a little hard to follow (in addition to removing the type check).

Since I'd really like to include your parser fixes and test cleanup, what 
do you think about breaking out the GetField on arrays change into another PR?





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-05 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54601247
  
@marmbrus It seems the Hive parser will pass something like a.b.c... to 
`LogicalPlan`, so I have to roll back (and I changed `dotExpressionHeader` to 
`ident . ident {. ident}`). I have also done some work on `GetField` so that 
it supports not just StructType, but also array of struct, array of array of 
struct, or array of array of ... struct. 
The idea is simple. If you want `a.b` to work, then `a` must be some level 
of nested array of struct (level 0 means just a StructType), and the result of 
`a.b` is the same level of nested array of b-type. In this way, we can handle 
nested arrays of struct and simple structs in the same process.
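
The level-preserving rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the toy `DataType` model, `ArrayOf`, `Struct`, and `NestedFieldSketch` are hypothetical names and are not Catalyst's classes, and structs are modelled as plain `Map`s rather than `Row`s.

```scala
// Toy type model; these names are illustrative and are NOT Catalyst's classes.
sealed trait DataType
case object IntType extends DataType
case object StringType extends DataType
final case class ArrayOf(element: DataType) extends DataType
final case class Struct(fields: List[(String, DataType)]) extends DataType

object NestedFieldSketch {
  /** Type of `a.b`: peel array levels until a struct is found, look up the field, re-wrap. */
  def fieldType(dt: DataType, name: String): Option[DataType] = dt match {
    case Struct(fields) => fields.collectFirst { case (n, t) if n == name => t }
    case ArrayOf(elem)  => fieldType(elem, name).map(t => ArrayOf(t))
    case _              => None // neither a struct nor an array of structs
  }

  /** Value of `a.b`: map through every array level, then pick the field from the struct. */
  def fieldValue(value: Any, name: String): Any = value match {
    case null         => null
    case xs: Seq[_]   => xs.map(fieldValue(_, name))
    case m: Map[_, _] => m.asInstanceOf[Map[String, Any]].getOrElse(name, null)
    case other        => sys.error(s"cannot access field $name on $other")
  }
}
```

With this model, a field of type Int inside a two-level array of structs comes back as a two-level array of Int, which is the "same level of nested array of b-type" behaviour described in the comment. A real implementation would presumably also keep a type check in `resolved` instead of failing at evaluation time.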





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-05 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54601682
  
I'm not sure how to modify `lazy val resolved` in `GetField` since it 
handles not only StructType now. Currently I just removed the type check. What 
do you think? @marmbrus 





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54694349
  
Can one of the admins verify this patch?





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-54694419
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54573524
  
Yeah, I'd like to simplify this, but unfortunately I think this version 
introduces a regression for hive queries.  I've made a PR (against your PR) 
that shows this regression. https://github.com/cloud-fan/spark/pull/1  Would be 
great if you could merge that and either roll back or propose an alternative.  
Thanks :)





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54268698
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19653/consoleFull)
 for   PR 2230 at commit 
[`5c70874`](https://github.com/apache/spark/commit/5c7087407275354be26b4c06d94a19a25dfec123).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-03 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54269552
  
@marmbrus Sorry for missing the `distinct`. Since we parse the dot in 
`SqlParser` now, the only possible formats of `name` passed into 
`LogicalPlan.resolve` are ident or ident.ident. So my change to 
`LogicalPlan` just simplifies the logic there. We can still roll back to the 
original `LogicalPlan.resolve` if you feel it is too radical :)
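
To make the "ident or ident.ident" point concrete, here is a minimal sketch of one way a dotted expression could be split once the dot is handled in the parser. It is an assumption-laden illustration, not the code in this PR: `UnresolvedAttribute`, `FieldAccess`, and `DotParseSketch` are hypothetical stand-ins, and the choice to keep the first two identifiers together (because `a.b` may mean either table.column or column.field) is inferred from the comment above.

```scala
// Hypothetical expression nodes for illustration; not Catalyst's real classes.
sealed trait Expr
final case class UnresolvedAttribute(name: String) extends Expr
final case class FieldAccess(child: Expr, fieldName: String) extends Expr

object DotParseSketch {
  private val Ident = "[A-Za-z_][A-Za-z0-9_]*"

  /** Parses `ident` or `ident.ident{.ident}`; returns None on malformed input. */
  def parse(input: String): Option[Expr] = {
    // limit -1 keeps empty segments, so "a..b" and "a." are rejected below
    val parts = input.split("\\.", -1).toList
    if (!parts.forall(_.matches(Ident))) None
    else parts match {
      case single :: Nil =>
        Some(UnresolvedAttribute(single))
      case first :: second :: rest =>
        // The leading `ident.ident` is the only dotted name the resolver has to handle.
        val head: Expr = UnresolvedAttribute(s"$first.$second")
        // Every further `.ident` becomes an explicit field access on the result.
        Some(rest.foldLeft(head)((child, field) => FieldAccess(child, field)))
      case Nil => None
    }
  }
}
```

Under these assumptions, `a.b.c` yields a field access of `c` on the name `a.b`, so name resolution never sees more than one dot.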





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54279166
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19653/consoleFull)
 for   PR 2230 at commit 
[`5c70874`](https://github.com/apache/spark/commit/5c7087407275354be26b4c06d94a19a25dfec123).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-02 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54118647
  
ok to test





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54118896
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19584/consoleFull)
 for   PR 2230 at commit 
[`de63082`](https://github.com/apache/spark/commit/de630829028d9d4a7ef55b8a0ff31e09f0b549d9).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54118942
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19584/consoleFull)
 for   PR 2230 at commit 
[`de63082`](https://github.com/apache/spark/commit/de630829028d9d4a7ef55b8a0ff31e09f0b549d9).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-02 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54122147
  
sorry for the code style, fixed! Test again please





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54122521
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19585/consoleFull)
 for   PR 2230 at commit 
[`8420c84`](https://github.com/apache/spark/commit/8420c849c5756ac1540da288f789800b1137edef).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54131873
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19585/consoleFull)
 for   PR 2230 at commit 
[`8420c84`](https://github.com/apache/spark/commit/8420c849c5756ac1540da288f789800b1137edef).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-09-02 Thread chuxi
Github user chuxi commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-54152206
  
@marmbrus, do I need to check anything else, or can the code be merged?

Besides, I see another PR: https://github.com/apache/spark/pull/2230

:) It is from my friend. I suggested that he take a look at the nested Parquet 
SQL parser in the SQL Parquet test suite, which parses the dot as a delimiter, 
not as an identChar.

The problem is that we don't know whether the dot should be an identChar or a 
delimiter, and the two choices lead to different SQL parsing results. If a JSON 
or Parquet schema contains only structs, putting the dot in identChar is 
apparently better. Are there other reasons behind the way the dot is parsed?
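
The difference between the two lexing choices can be seen in a small standalone sketch. This is a toy tokenizer written for illustration, not Spark's lexer; `DotLexSketch`, `tokenize`, and the `dotIsIdentChar` flag are hypothetical names.

```scala
object DotLexSketch {
  /** Splits input into identifier-like tokens and single-character delimiter tokens. */
  def tokenize(input: String, dotIsIdentChar: Boolean): List[String] = {
    // The only difference between the two modes is whether '.' may be part of an identifier.
    def isIdentChar(c: Char): Boolean =
      c.isLetterOrDigit || c == '_' || (dotIsIdentChar && c == '.')

    @scala.annotation.tailrec
    def loop(rest: List[Char], acc: List[String]): List[String] = rest match {
      case Nil                         => acc.reverse
      case c :: tail if c.isWhitespace => loop(tail, acc)
      case c :: _ if isIdentChar(c) =>
        val (ident, remaining) = rest.span(isIdentChar)
        loop(remaining, ident.mkString :: acc)
      case c :: tail                   => loop(tail, c.toString :: acc)
    }

    loop(input.toList, Nil)
  }
}
```

For the input `a.b[0].c`, treating the dot as an identifier character produces the tokens `a.b`, `[`, `0`, `]`, `.c`, so the trailing field access is glued to its dot. Treating the dot as a delimiter produces `a`, `.`, `b`, `[`, `0`, `]`, `.`, `c`, which leaves the decision about what each dot means to the parser.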





[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-02 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2230#issuecomment-54252034
  
Thanks for working on this! The changes made to the parser seem reasonable 
to me.  Thanks for the thorough explanation.

Can you explain your changes to LogicalPlan a little more and add some 
inline comments?  That's a very crucial piece of code and I'm a little nervous 
about changing it.  Also, it seems like we might be missing the distinct logic, 
based on the failing test case.






[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-09-02 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-54252376
  
Hi @chuxi, it would be great if you could discuss with @cloud-fan and 
perhaps adapt your `GetArrayField` stuff to work with the changes in #2230.  
Also I think there are a few unaddressed comments.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-09-01 Thread chuxi
Github user chuxi commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16969910
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala ---
@@ -101,3 +101,41 @@ case class GetField(child: Expression, fieldName: String) extends UnaryExpressio
 
   override def toString = s"$child.$fieldName"
 }
+
+/**
+ * Returns an array containing the value of fieldName
+ * for each element in the input array of type struct
+ */
+case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression {
+  type EvaluatedType = Any
+
+  def dataType = field.dataType
+  override def nullable = child.nullable || field.nullable
+  override def foldable = child.foldable
+
+  protected def arrayType = child.dataType match {
+    case ArrayType(s: StructType, _) => s
+    case otherType => sys.error(s"GetArrayField is not valid on fields of type $otherType")
+  }
+
+  lazy val field = if (arrayType.isInstanceOf[StructType]) {
+    arrayType.fields
+      .find(_.name == fieldName)
+      .getOrElse(sys.error(s"No such field $fieldName in ${child.dataType}"))
+  } else null
+
+
+  lazy val ordinal = arrayType.fields.indexOf(field)
+
+  override lazy val resolved = childrenResolved && child.dataType.isInstanceOf[ArrayType]
+
+  override def eval(input: Row): Any = {
+    val value : Seq[Row] = child.eval(input).asInstanceOf[Seq[Row]]
+    val v = value.map{ t =>
+      if (t == null) null else t(ordinal)
+    }
+    v
--- End diff --

= =





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-28 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-53828175
  
ok to test





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-53828495
  
  [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19444/consoleFull) for PR 2082 at commit [`e5a3db1`](https://github.com/apache/spark/commit/e5a3db19bf5fc7365e4280f17d9bba27b08d29dd).
 * This patch merges cleanly.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-53832574
  
  [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19444/consoleFull) for PR 2082 at commit [`e5a3db1`](https://github.com/apache/spark/commit/e5a3db19bf5fc7365e4280f17d9bba27b08d29dd).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression`






[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-28 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16881896
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -101,3 +101,41 @@ case class GetField(child: Expression, fieldName: String) extends UnaryExpressio
 
   override def toString = s"$child.$fieldName"
 }
+
+/**
+ * Returns an array containing the value of fieldName
+ * for each element in the input array of type struct
+ */
+case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression {
+  type EvaluatedType = Any
+
+  def dataType = field.dataType
+  override def nullable = child.nullable || field.nullable
+  override def foldable = child.foldable
+
+  protected def arrayType = child.dataType match {
+    case ArrayType(s: StructType, _) => s
+    case otherType => sys.error(s"GetArrayField is not valid on fields of type $otherType")
+  }
+
+  lazy val field = if (arrayType.isInstanceOf[StructType]) {
+    arrayType.fields
+      .find(_.name == fieldName)
+      .getOrElse(sys.error(s"No such field $fieldName in ${child.dataType}"))
+  } else null
+
+
+  lazy val ordinal = arrayType.fields.indexOf(field)
+
+  override lazy val resolved = childrenResolved && child.dataType.isInstanceOf[ArrayType]
+
+  override def eval(input: Row): Any = {
+    val value : Seq[Row] = child.eval(input).asInstanceOf[Seq[Row]]
+    val v = value.map{ t =>
+      if (t == null) null else t(ordinal)
+    }
+    v
--- End diff --

you can just use 

value.map{ t =>
  if (t == null) null else t(ordinal)
}

as the last line of this `eval` function.
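
A minimal, self-contained sketch of that suggestion (not the PR's actual code): in Scala the value of the last expression in a method body is the method's result, so the intermediate `v` is unnecessary. `Row` below is a hypothetical stand-in (an `IndexedSeq[Any]`) for Catalyst's `Row`, and `projectOrdinal` is an illustrative name only.

```scala
object EvalSketch {
  // Hypothetical stand-in for Catalyst's Row: an indexed sequence of values.
  type Row = IndexedSeq[Any]

  // The map expression is the last expression in the body, so it is the result;
  // no temporary val is needed.
  def projectOrdinal(value: Seq[Row], ordinal: Int): Seq[Any] =
    value.map { t =>
      if (t == null) null else t(ordinal)
    }
}
```

Null elements of the input are passed through as null, matching the null check in the diff above.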





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-28 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16881901
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -101,3 +101,41 @@ case class GetField(child: Expression, fieldName: String) extends UnaryExpressio
 
   override def toString = s"$child.$fieldName"
 }
+
+/**
+ * Returns an array containing the value of fieldName
+ * for each element in the input array of type struct
+ */
+case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression {
+  type EvaluatedType = Any
+
+  def dataType = field.dataType
+  override def nullable = child.nullable || field.nullable
+  override def foldable = child.foldable
+
+  protected def arrayType = child.dataType match {
+    case ArrayType(s: StructType, _) => s
+    case otherType => sys.error(s"GetArrayField is not valid on fields of type $otherType")
+  }
+
+  lazy val field = if (arrayType.isInstanceOf[StructType]) {
+    arrayType.fields
+      .find(_.name == fieldName)
+      .getOrElse(sys.error(s"No such field $fieldName in ${child.dataType}"))
+  } else null
+
+
+  lazy val ordinal = arrayType.fields.indexOf(field)
+
+  override lazy val resolved = childrenResolved && child.dataType.isInstanceOf[ArrayType]
+
+  override def eval(input: Row): Any = {
+    val value : Seq[Row] = child.eval(input).asInstanceOf[Seq[Row]]
--- End diff --

remove the space after value.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-27 Thread chuxi
Github user chuxi commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-53673090
  
Jenkins, test this please.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-26 Thread chuxi
Github user chuxi commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-53386890
  
Thank you, marmbrus, that is very kind. I am new here and have never submitted 
a PR to an open-source project before. I will take your suggestions and update 
my code to follow the Scala style.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16691377
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -101,3 +101,44 @@ case class GetField(child: Expression, fieldName: String) extends UnaryExpressio
 
   override def toString = s"$child.$fieldName"
 }
+
+/**
+ * Returns the value of fields[] in the Struct `child`.
+ * for array of structs
+ */
+
+case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression {
+  type EvaluatedType = Any
+
+  def dataType = field.dataType
+  override def nullable = child.nullable || field.nullable
+  override def foldable = child.foldable
+
+  protected def arrayType = child.dataType match {
+    case s: ArrayType => s.elementType match {
+      case t :StructType => t
+      case otherType => sys.error(s"GetArrayField is not valid on fields of type $otherType")
+    }
+    case otherType => sys.error(s"GetArrayField is not valid on fields of type $otherType")
+  }
+
+  lazy val field =
+    arrayType.fields
+      .find(_.name == fieldName)
+      .getOrElse(sys.error(s"No such field $fieldName in ${child.dataType}"))
+
+
+  lazy val ordinal = arrayType.fields.indexOf(field)
+
+  override lazy val resolved = childrenResolved && child.dataType.isInstanceOf[ArrayType]
+
+  override def eval(input: Row): Any = {
+    val value : Seq[Row] = child.eval(input).asInstanceOf[Seq[Row]]
+    val v = value.map{ t =>
+      if (t == null) null else t(ordinal)
+    }
+    v
+  }
+
+  override def toString = s"$child.$fieldName"
+}
--- End diff --

End files with a newline.  Run `sbt/sbt scalastyle` to check style locally.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16691386
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -101,3 +101,44 @@ case class GetField(child: Expression, fieldName: String) extends UnaryExpressio
 
   override def toString = s"$child.$fieldName"
 }
+
+/**
+ * Returns the value of fields[] in the Struct `child`.
+ * for array of structs
+ */
+
+case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression {
+  type EvaluatedType = Any
+
+  def dataType = field.dataType
+  override def nullable = child.nullable || field.nullable
+  override def foldable = child.foldable
+
+  protected def arrayType = child.dataType match {
+    case s: ArrayType => s.elementType match {
+      case t :StructType => t
--- End diff --

Always put the `:` next to the variable, not the type.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16691419
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -101,3 +101,44 @@ case class GetField(child: Expression, fieldName: String) extends UnaryExpressio
 
   override def toString = s"$child.$fieldName"
 }
+
+/**
+ * Returns the value of fields[] in the Struct `child`.
+ * for array of structs
+ */
+
+case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression {
+  type EvaluatedType = Any
+
+  def dataType = field.dataType
+  override def nullable = child.nullable || field.nullable
+  override def foldable = child.foldable
+
+  protected def arrayType = child.dataType match {
+    case s: ArrayType => s.elementType match {
--- End diff --

this could be written as:

```scala
child.dataType match {
  case ArrayType(s: StructType, _) => s
  case otherType => sys.error(s"GetArrayField is not valid on fields of type $otherType")
}
```





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16691505
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -101,3 +101,44 @@ case class GetField(child: Expression, fieldName: String) extends UnaryExpressio
 
   override def toString = s"$child.$fieldName"
 }
+
+/**
+ * Returns the value of fields[] in the Struct `child`.
--- End diff --

Perhaps: Returns an array containing the value of `fieldName` for each 
element in the input array of type struct.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16691550
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -101,3 +101,44 @@ case class GetField(child: Expression, fieldName: String) extends UnaryExpressio
 
   override def toString = s"$child.$fieldName"
 }
+
+/**
+ * Returns the value of fields[] in the Struct `child`.
+ * for array of structs
+ */
+
+case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression {
+  type EvaluatedType = Any
+
+  def dataType = field.dataType
+  override def nullable = child.nullable || field.nullable
+  override def foldable = child.foldable
+
+  protected def arrayType = child.dataType match {
+    case s: ArrayType => s.elementType match {
+      case t :StructType => t
+      case otherType => sys.error(s"GetArrayField is not valid on fields of type $otherType")
+    }
+    case otherType => sys.error(s"GetArrayField is not valid on fields of type $otherType")
+  }
+
+  lazy val field =
+    arrayType.fields
+      .find(_.name == fieldName)
+      .getOrElse(sys.error(s"No such field $fieldName in ${child.dataType}"))
+
+
+  lazy val ordinal = arrayType.fields.indexOf(field)
+
+  override lazy val resolved = childrenResolved && child.dataType.isInstanceOf[ArrayType]
--- End diff --

This should also check that the element type of the ArrayType is StructType 
and that the requested field name can be found in that struct.
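
One way to picture the stricter check, as a sketch under simplifying assumptions: the types below are hypothetical, pared-down stand-ins for Catalyst's `DataType` hierarchy, and `canResolve` is an illustrative helper, not a Spark API. It returns true only when the child is an array whose element type is a struct that contains the requested field.

```scala
object ResolvedSketch {
  // Hypothetical, simplified stand-ins for Catalyst's data types.
  sealed trait DataType
  case object BooleanType extends DataType
  case object StringType extends DataType
  case class StructField(name: String, dataType: DataType)
  case class StructType(fields: Seq[StructField]) extends DataType
  case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType

  // True only for an array of structs in which fieldName exists.
  def canResolve(childType: DataType, fieldName: String): Boolean = childType match {
    case ArrayType(StructType(fields), _) => fields.exists(_.name == fieldName)
    case _ => false
  }
}
```

With a check like this, an array of non-struct elements or a misspelled field name leaves the expression unresolved instead of failing later inside `sys.error`.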





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16691566
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
 ---
@@ -94,6 +94,7 @@ abstract class LogicalPlan extends QueryPlan[LogicalPlan] {
     // matches the name or where the first part matches the scope and the second part matches the
     // name.  Return these matches along with any remaining parts, which represent dotted access to
     // struct fields.
+
--- End diff --

Remove this newline.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-53363057
  
Thanks for working on this.  I made a few small suggestions.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-53363068
  
Jenkins, test this please.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-53363532
  
  [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19178/consoleFull) for PR 2082 at commit [`ebf033b`](https://github.com/apache/spark/commit/ebf033bfb2a658d4ca25cfdbe4d7def105793486).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression`
  * `case class ExplainCommand(plan: LogicalPlan, extended: Boolean = false) extends Command`






[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-25 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16692668
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -373,8 +376,13 @@ class SqlLexical(val keywords: Seq[String]) extends StdLexical {
   )
 
   override lazy val token: Parser[Token] = (
-    identChar ~ rep( identChar | digit ) ^^
-      { case first ~ rest => processIdent(first :: rest mkString "") }
+       identChar ~ rep( identChar | digit ) ^^
--- End diff --

This line is indented too much.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-21 Thread chuxi
GitHub user chuxi opened a pull request:

https://github.com/apache/spark/pull/2082

SPARK-2096 [SQL]: Correctly parse dot notations for accessing an array of 
structs

For example, arrayOfStruct is an array of structs, and every element of 
this array has a field called field1. arrayOfStruct[0].field1 accesses the 
value of field1 in the first element of arrayOfStruct, but the SQL parser 
(in sql-core) treats field1 as an alias. Likewise, arrayOfStruct.field1 
accesses the value of field1 in every element of the array and returns 
those values as an array, but the SQL parser cannot resolve it.
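
To make the two access forms concrete, here is a plain-Scala sketch that does not use Spark at all. Each struct is modelled as a `Map[String, Any]` (a hypothetical stand-in), a missing field is read as null, and the sample data is illustrative: it was chosen to mirror the expected result in the JsonSuite test rather than copied from the test data.

```scala
object ArrayOfStructAccess {
  // Hypothetical stand-in: a struct is a map from field name to value.
  type Struct = Map[String, Any]

  // arrayOfStruct.field1: read the field from every element, yielding an array.
  def getArrayField(array: Seq[Struct], fieldName: String): Seq[Any] =
    array.map(s => if (s == null) null else s.getOrElse(fieldName, null))

  // arrayOfStruct[0].field1: index into the array first, then read the field.
  def getItemField(array: Seq[Struct], index: Int, fieldName: String): Any =
    array(index).getOrElse(fieldName, null)

  // Illustrative data: three structs, only some of which define each field.
  val arrayOfStruct: Seq[Struct] = Seq(
    Map("field1" -> true, "field2" -> "str1"),
    Map("field1" -> false),
    Map("field3" -> null)
  )
}
```

In this model `getArrayField(arrayOfStruct, "field1")` corresponds to `select arrayOfStruct.field1`, and `getItemField(arrayOfStruct, 0, "field1")` to `select arrayOfStruct[0].field1`.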

With a small modification, the previously ignored test case in JsonSuite 
(Complex field and type inferring (Ignored)) now passes.
The modified part of the test:
checkAnswer(
  sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
  (Seq(true, false, null), Seq("str1", null, null)) :: Nil
)
Repeated nested structures are still a problem, for example 
arrayOfStruct.field1.arrayOfStruct.field1 or 
arrayOfStruct[0].field1.arrayOfStruct[0].field1.
I plan to leave that case aside and try to support select arrayOfStruct.field1, 
arrayOfStruct.field2 from jsonTable where arrayOfStruct.field1==true 
In addition, my friend anyweil (Wei Li) solved the problem of 
arrayOfStruct.field1 and its Filter part (that is, parsing the where clause).
I am new here but will continue working on Spark :)

I looked into the where arrayOfStruct.field1==true case. 
Supporting it would require modifying every kind of comparison expression, 
and I do not think the feature is worth adding, so I have dropped it.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chuxi/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2082.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2082


commit b1cb4fb4e3da7ed54ac875afc20a81f25310fa87
Author: chuxi chuxik...@163.com
Date:   2014-08-21T12:47:25Z

Correctly parse dot notations for accessing an array of structs







[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2082#issuecomment-52916269
  
Can one of the admins verify this patch?





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-21 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16539291
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/json/JsonSuite.scala 
---
@@ -292,24 +292,29 @@ class JsonSuite extends QueryTest {
       sql("select structWithArrayFields.field1[1], structWithArrayFields.field2[3] from jsonTable"),
       (5, null) :: Nil
     )
-  }
 
-  ignore("Complex field and type inferring (Ignored)") {
-    val jsonSchemaRDD = jsonRDD(complexFieldAndType)
-    jsonSchemaRDD.registerTempTable("jsonTable")
+    checkAnswer(
+      sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
+      (Seq(true, false, null), Seq("str1", null, null)) :: Nil
+    )
 
-    // Right now, field1 and field2 are treated as aliases. We should fix it.
     checkAnswer(
       sql("select arrayOfStruct[0].field1, arrayOfStruct[0].field2 from jsonTable"),
       (true, "str1") :: Nil
     )
 
-    // Right now, the analyzer cannot resolve arrayOfStruct.field1 and arrayOfStruct.field2.
-    // Getting all values of a specific field from an array of structs.
+  }
+
+  ignore("Complex field and type inferring (Ignored)") {
+    val jsonSchemaRDD = jsonRDD(complexFieldAndType)
+    jsonSchemaRDD.registerTempTable("jsonTable")
+
+    // still need add filter??? I am not sure whether this function is necessary. quite complex
     checkAnswer(
-      sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
-      (Seq(true, false), Seq("str1", null)) :: Nil
+      sql("select arrayOfStruct.field1 from jsonTable where arrayOfStruct.field1 = true"),
--- End diff --

Why are you changing this test case if it still does not work?





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-21 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16539375
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
 ---
@@ -108,6 +109,8 @@ abstract class LogicalPlan extends QueryPlan[LogicalPlan] {
         a.dataType match {
           case StructType(fields) =>
             Some(Alias(nestedFields.foldLeft(a: Expression)(GetField), nestedFields.last)())
+          case fields :ArrayType =>
--- End diff --

Maybe it is better to use `case ArrayType(fields)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-21 Thread chuxi
Github user chuxi commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16540807
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/json/JsonSuite.scala 
---
@@ -292,24 +292,29 @@ class JsonSuite extends QueryTest {
       sql("select structWithArrayFields.field1[1], structWithArrayFields.field2[3] from jsonTable"),
       (5, null) :: Nil
     )
-  }
 
-  ignore("Complex field and type inferring (Ignored)") {
-    val jsonSchemaRDD = jsonRDD(complexFieldAndType)
-    jsonSchemaRDD.registerTempTable("jsonTable")
+    checkAnswer(
+      sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
+      (Seq(true, false, null), Seq("str1", null, null)) :: Nil
+    )
 
-    // Right now, field1 and field2 are treated as aliases. We should fix it.
     checkAnswer(
       sql("select arrayOfStruct[0].field1, arrayOfStruct[0].field2 from jsonTable"),
       (true, "str1") :: Nil
     )
 
-    // Right now, the analyzer cannot resolve arrayOfStruct.field1 and arrayOfStruct.field2.
-    // Getting all values of a specific field from an array of structs.
+  }
+
+  ignore("Complex field and type inferring (Ignored)") {
+    val jsonSchemaRDD = jsonRDD(complexFieldAndType)
+    jsonSchemaRDD.registerTempTable("jsonTable")
+
+    // still need add filter??? I am not sure whether this function is necessary. quite complex
     checkAnswer(
-      sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
-      (Seq(true, false), Seq("str1", null)) :: Nil
+      sql("select arrayOfStruct.field1 from jsonTable where arrayOfStruct.field1 = true"),
--- End diff --

wang pangzi, someone added field3 to arrayOfStruct in the test data, so the 
expected result needs another null.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-21 Thread chuxi
Github user chuxi commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16540895
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
 ---
@@ -108,6 +109,8 @@ abstract class LogicalPlan extends QueryPlan[LogicalPlan] {
         a.dataType match {
           case StructType(fields) =>
             Some(Alias(nestedFields.foldLeft(a: Expression)(GetField), nestedFields.last)())
+          case fields :ArrayType =>
--- End diff --

In fact, ArrayType(fields, _) would be better 





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-21 Thread chuxi
Github user chuxi commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16541208
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/json/JsonSuite.scala 
---
@@ -292,24 +292,29 @@ class JsonSuite extends QueryTest {
   sql("select structWithArrayFields.field1[1], structWithArrayFields.field2[3] from jsonTable"),
   (5, null) :: Nil
 )
-  }
 
-  ignore("Complex field and type inferring (Ignored)") {
-val jsonSchemaRDD = jsonRDD(complexFieldAndType)
-jsonSchemaRDD.registerTempTable("jsonTable")
+checkAnswer(
+  sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
+  (Seq(true, false, null), Seq("str1", null, null)) :: Nil
+)
 
-// Right now, field1 and field2 are treated as aliases. We should fix it.
 checkAnswer(
   sql("select arrayOfStruct[0].field1, arrayOfStruct[0].field2 from jsonTable"),
   (true, "str1") :: Nil
 )
 
-// Right now, the analyzer cannot resolve arrayOfStruct.field1 and arrayOfStruct.field2.
-// Getting all values of a specific field from an array of structs.
+  }
+
+  ignore("Complex field and type inferring (Ignored)") {
+val jsonSchemaRDD = jsonRDD(complexFieldAndType)
+jsonSchemaRDD.registerTempTable("jsonTable")
+
+// still need add filter??? I am not sure whether this function is necessary. quite complex
 checkAnswer(
-  sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
-  (Seq(true, false), Seq("str1", null)) :: Nil
+  sql("select arrayOfStruct.field1 from jsonTable where arrayOfStruct.field1 = true"),
--- End diff --

I added the test case sql("select arrayOfStruct.field1 from jsonTable where 
arrayOfStruct.field1 = true") to the ignored part. It does not work yet: I came 
up with the case but have not solved it. Or perhaps it makes no sense to solve 
it.





[GitHub] spark pull request: SPARK-2096 [SQL]: Correctly parse dot notation...

2014-08-21 Thread chuxi
Github user chuxi commented on a diff in the pull request:

https://github.com/apache/spark/pull/2082#discussion_r16540937
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
 ---
@@ -108,6 +109,8 @@ abstract class LogicalPlan extends QueryPlan[LogicalPlan] {
 a.dataType match {
   case StructType(fields) =>
 Some(Alias(nestedFields.foldLeft(a: Expression)(GetField), nestedFields.last)())
+  case fields :ArrayType =>
--- End diff --

case ArrayType(fields) leads to a compile error.

