[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r24130578
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ---
@@ -17,6 +17,8 @@
 
 package org.apache.spark.sql
 
+import org.apache.spark.sql.catalyst.analysis.UnresolvedGetField
--- End diff --

Nit: Import order


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r24130589
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvaluationSuite.scala
 ---
@@ -19,6 +19,8 @@ package org.apache.spark.sql.catalyst.expressions
 
 import java.sql.{Date, Timestamp}
 
+import org.apache.spark.sql.catalyst.analysis.UnresolvedGetField
--- End diff --

nit: import order.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-04 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72963557
  
Okay, you have convinced me that this is cleaner.  Two nits about 
formatting and then this LGTM.  Can you also fix the description of the PR, as 
that becomes the commit message.

Would be awesome to include this in 1.2 if you can update soon.

Thanks for working on this!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72778669
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26697/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72778662
  
  [Test build #26697 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26697/consoleFull)
 for   PR 4068 at commit 
[`340223d`](https://github.com/apache/spark/commit/340223d44fce76096e9952cc8aab9eb46ff9d1f8).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class Dsl(object):`
  * `class ExamplePointUDT(UserDefinedType):`
  * `class SQLTests(ReusedPySparkTestCase):`
  * `case class UnresolvedGetField(child: Expression, fieldName: String) 
extends UnaryExpression `
  * `case class GetField(child: Expression, field: StructField, ordinal: 
Int) extends UnaryExpression `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72614165
  
  [Test build #26641 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26641/consoleFull)
 for   PR 4068 at commit 
[`3295858`](https://github.com/apache/spark/commit/3295858b6d7e1738c11837207f564b4f2c4c503c).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72613023
  
Hi @yhuai , I have updated this PR introducing `UnresolvedGetField` to fix 
this issue. Do you have time to review it? Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72622614
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26638/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72622599
  
  [Test build #26638 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26638/consoleFull)
 for   PR 4068 at commit 
[`1b8f924`](https://github.com/apache/spark/commit/1b8f9242b323d14010b2cfa743b23fab82177bee).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
  * `case class UnresolvedGetField(child: Expression, fieldName: String) 
extends UnaryExpression `
  * `case class GetField(child: Expression, field: StructField, ordinal: 
Int) extends UnaryExpression `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72622954
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26641/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72622946
  
  [Test build #26641 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26641/consoleFull)
 for   PR 4068 at commit 
[`3295858`](https://github.com/apache/spark/commit/3295858b6d7e1738c11837207f564b4f2c4c503c).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-72771666
  
  [Test build #26697 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26697/consoleFull)
 for   PR 4068 at commit 
[`340223d`](https://github.com/apache/spark/commit/340223d44fce76096e9952cc8aab9eb46ff9d1f8).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-02-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r23905429
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -285,11 +285,22 @@ class Analyzer(catalog: Catalog,
 result
 
   // Resolve field names using the resolver.
-  case f @ GetField(child, fieldName) if !f.resolved  
child.resolved =
+  case f @ GetField(child, fieldName) if child.resolved =
 child.dataType match {
   case StructType(fields) =
-val resolvedFieldName = 
fields.map(_.name).find(resolver(_, fieldName))
-resolvedFieldName.map(n = f.copy(fieldName = 
n)).getOrElse(f)
+val actualField = fields.filter(f = resolver(f.name, 
fieldName))
+if (actualField.length == 0) {
+  sys.error(
+sNo such struct field $fieldName in 
${fields.map(_.name).mkString(, )})
--- End diff --

ping @marmbrus 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-29 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r23786418
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -285,11 +285,22 @@ class Analyzer(catalog: Catalog,
 result
 
   // Resolve field names using the resolver.
-  case f @ GetField(child, fieldName) if !f.resolved  
child.resolved =
+  case f @ GetField(child, fieldName) if child.resolved =
 child.dataType match {
   case StructType(fields) =
-val resolvedFieldName = 
fields.map(_.name).find(resolver(_, fieldName))
-resolvedFieldName.map(n = f.copy(fieldName = 
n)).getOrElse(f)
+val actualField = fields.filter(f = resolver(f.name, 
fieldName))
+if (actualField.length == 0) {
+  sys.error(
+sNo such struct field $fieldName in 
${fields.map(_.name).mkString(, )})
--- End diff --

I think we should have a well defined `resolved` field. Actually, I feel 
`UnresolvedGetField` is a better idea. I think it is reasonable to have a 
`UnresolvedXXX` whenever we need to do lookup or name matching to resolve it. 
@marmbrus What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-71967503
  
  [Test build #26290 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26290/consoleFull)
 for   PR 4068 at commit 
[`d9cf8fd`](https://github.com/apache/spark/commit/d9cf8fd0e6bf680b0c80bc7d50a225ffaae3e72d).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-28 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r23746914
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -285,11 +285,22 @@ class Analyzer(catalog: Catalog,
 result
 
   // Resolve field names using the resolver.
-  case f @ GetField(child, fieldName) if !f.resolved  
child.resolved =
+  case f @ GetField(child, fieldName) if child.resolved =
 child.dataType match {
   case StructType(fields) =
-val resolvedFieldName = 
fields.map(_.name).find(resolver(_, fieldName))
-resolvedFieldName.map(n = f.copy(fieldName = 
n)).getOrElse(f)
+val actualField = fields.filter(f = resolver(f.name, 
fieldName))
+if (actualField.length == 0) {
+  sys.error(
+sNo such struct field $fieldName in 
${fields.map(_.name).mkString(, )})
--- End diff --

ok I see. With your change, the `resolved` field will always be true as 
long as its child's `resolved` is true (the existence of the desired field is 
not considered). Actually, we are breaking the semantic of `resolved` at here. 
I do think checking ambiguity is necessary, but I think it is also necessary to 
follow the definition of `resolved`.

I am sorry if I missed it. Can you explain why you prefer not to put this 
check in `GetField`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r23747469
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -285,11 +285,22 @@ class Analyzer(catalog: Catalog,
 result
 
   // Resolve field names using the resolver.
-  case f @ GetField(child, fieldName) if !f.resolved  
child.resolved =
+  case f @ GetField(child, fieldName) if child.resolved =
 child.dataType match {
   case StructType(fields) =
-val resolvedFieldName = 
fields.map(_.name).find(resolver(_, fieldName))
-resolvedFieldName.map(n = f.copy(fieldName = 
n)).getOrElse(f)
+val actualField = fields.filter(f = resolver(f.name, 
fieldName))
+if (actualField.length == 0) {
+  sys.error(
+sNo such struct field $fieldName in 
${fields.map(_.name).mkString(, )})
--- End diff --

As I said before, we need `Resolver` to check the existence of desired 
field. However, we can't get it inside `GetField`, that's why we also did this 
check in `LogicalPlan.resolveNesting`.
And we build `GetField` in `SqlParser`, so it's hard to put `Resolver` into 
`GetField` during building.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-71971690
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26290/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-71971685
  
  [Test build #26290 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26290/consoleFull)
 for   PR 4068 at commit 
[`d9cf8fd`](https://github.com/apache/spark/commit/d9cf8fd0e6bf680b0c80bc7d50a225ffaae3e72d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-27 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r23665528
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala
 ---
@@ -38,6 +38,15 @@ class HiveResolutionSuite extends HiveComparisonTest {
 sql(SELECT a[0].A.A from nested).queryExecution.analyzed
   }
 
+  test(SPARK-5278: check ambiguous reference to fields) {
+jsonRDD(sparkContext.makeRDD(
+  {a: [{b: 1, B: 2}]} :: Nil)).registerTempTable(nested)
+val exception = intercept[RuntimeException] {
+  println(sql(SELECT a[0].b from nested).queryExecution.analyzed)
--- End diff --

Can you add a comment at here to explain what we are expecting? Also, you 
can remove `println`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-27 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r23665510
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -285,11 +285,22 @@ class Analyzer(catalog: Catalog,
 result
 
   // Resolve field names using the resolver.
-  case f @ GetField(child, fieldName) if !f.resolved  
child.resolved =
+  case f @ GetField(child, fieldName) if child.resolved =
 child.dataType match {
   case StructType(fields) =
-val resolvedFieldName = 
fields.map(_.name).find(resolver(_, fieldName))
-resolvedFieldName.map(n = f.copy(fieldName = 
n)).getOrElse(f)
+val actualField = fields.filter(f = resolver(f.name, 
fieldName))
+if (actualField.length == 0) {
+  sys.error(
+sNo such struct field $fieldName in 
${fields.map(_.name).mkString(, )})
--- End diff --

I think `CheckResolution` should catch it. If we cannot resolve it, just 
leave it unchanged. Can you see if there is a unit test for this? If not, can 
you add one? Maybe we can also log it like what `LogicalPlan.resolve` does.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/4068#discussion_r23667563
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -285,11 +285,22 @@ class Analyzer(catalog: Catalog,
 result
 
   // Resolve field names using the resolver.
-  case f @ GetField(child, fieldName) if !f.resolved  
child.resolved =
+  case f @ GetField(child, fieldName) if child.resolved =
 child.dataType match {
   case StructType(fields) =
-val resolvedFieldName = 
fields.map(_.name).find(resolver(_, fieldName))
-resolvedFieldName.map(n = f.copy(fieldName = 
n)).getOrElse(f)
+val actualField = fields.filter(f = resolver(f.name, 
fieldName))
+if (actualField.length == 0) {
+  sys.error(
+sNo such struct field $fieldName in 
${fields.map(_.name).mkString(, )})
--- End diff --

If we leave it unchanged, `CheckResolution` can't catch it. The reason is 
that, we need `Resolver` to check if a `GetField` is resolved, but we can't get 
`Resolver` inside `GetField`.
Fortunately, we can catch it at runtime, as `GetField` will report error if 
it can't find the required field.
Which way should we prefer? Leaving it unchanged or reporting error right 
away?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-24 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-71304548
  
@yhuai can you review this one first?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-22 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-71144612
  
ping @marmbrus @liancheng @rxin 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-19 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70460444
  
I had a new idea: Don't try to figure out whether a `GetField` need analyse 
or not, just analyse it anyway, until we reach the fixed point.
Thus we can do the check in `Analyzer` happily :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70460441
  
  [Test build #25754 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25754/consoleFull)
 for   PR 4068 at commit 
[`53048b6`](https://github.com/apache/spark/commit/53048b62758de4e1d76f561e4f0aec00aec6c612).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-19 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70467955
  
Hi @marmbrus , how about this fix? If you feel OK, I'll update 
https://github.com/apache/spark/pull/2405 in the same way.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70467813
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25754/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70467806
  
  [Test build #25754 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25754/consoleFull)
 for   PR 4068 at commit 
[`53048b6`](https://github.com/apache/spark/commit/53048b62758de4e1d76f561e4f0aec00aec6c612).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-18 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70447398
  
The problem is that: Currently the `GetField` class is an operation which 
picks the first field whose name equal to the required `fieldName` with case 
sensitive. As I said before, we will parse `a.b[0].c.d` to 
`GetField(GetField(GetItem(Unresolved(a.b), 0), c), d)`. For the `a.b`, 
we can check anything we want before build `GetField`, but for the 2 outer 
`GetFiled`, we can only do the check in `Analyzer`(or we can expose `resolver` 
to `GetField`, but it's not recommended).

So we need a way to indicate whether a `GetField` need analyse or not.

For SPARK-3698, we can do this by searching required field with case 
sensitive, if success, we are done. if not, we still have chance if the 
resolver is case insensitive, so we can do the check in `Analyzer` as @marmbrus 
did in https://github.com/apache/spark/pull/3724.

For SPARK-5278 here, it's more complicated. it seems to me that the only 
way is adding a flag to `GetField`, or introduce `UnresolvedGetField`.

What do you think? @marmbrus @liancheng


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70454202
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25747/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70454200
  
  [Test build #25747 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25747/consoleFull)
 for   PR 4068 at commit 
[`bfe069b`](https://github.com/apache/spark/commit/bfe069bfb3ac6e80fa82849b4e1dee90a606e731).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70455616
  
  [Test build #25750 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25750/consoleFull)
 for   PR 4068 at commit 
[`d8c1dc9`](https://github.com/apache/spark/commit/d8c1dc958148a4b052b387f5573b147cfd9385da).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70456466
  
  [Test build #25750 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25750/consoleFull)
 for   PR 4068 at commit 
[`d8c1dc9`](https://github.com/apache/spark/commit/d8c1dc958148a4b052b387f5573b147cfd9385da).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70456470
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25750/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70453627
  
  [Test build #25747 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25747/consoleFull)
 for   PR 4068 at commit 
[`bfe069b`](https://github.com/apache/spark/commit/bfe069bfb3ac6e80fa82849b4e1dee90a606e731).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70325692
  
  [Test build #25681 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25681/consoleFull)
 for   PR 4068 at commit 
[`f2ab566`](https://github.com/apache/spark/commit/f2ab566642eedb40e8b4ec6258cb47a59c8cd952).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70332106
  
  [Test build #25681 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25681/consoleFull)
 for   PR 4068 at commit 
[`f2ab566`](https://github.com/apache/spark/commit/f2ab566642eedb40e8b4ec6258cb47a59c8cd952).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70332109
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25681/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-15 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/4068

[SPARK-5278][SQL] complete the check of ambiguous reference to fields



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark simple

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4068.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4068


commit 855da25e29afa4b63b2b3f036dc473c201d4b90c
Author: Wenchen Fan cloud0...@outlook.com
Date:   2015-01-16T06:49:09Z

fix bug for check ambiguous reference to fields




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70215432
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...

2015-01-15 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/4068#issuecomment-70216307
  
ping @marmbrus  @liancheng


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org