[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread nongli
Github user nongli commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43671708
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -1921,4 +1921,89 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   assert(sampled.count() == sampledOdd.count() + sampledEven.count())
 }
   }
+
+  test("Struct Star Expansion") {
+val structDf = testData2.select("a", "b").as("record")
+
+checkAnswer(
+  structDf.select($"record.a", $"record.b"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*", $"record.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 1, 2) :: Row(2, 1, 2, 1) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 3, 1) :: Row(3, 2, 3, 2) :: Nil)
+
+checkAnswer(
+  sql("select struct(a, b) as r1, struct(b, a) as r2 from 
testData2").select($"r1.*", $"r2.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 2, 1) :: Row(2, 1, 1, 2) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 1, 3) :: Row(3, 2, 2, 3) :: Nil)
+
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin),
+  Row(Row(1, 1)) :: Nil)
+
+// Try with an alias on the select list
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin).select($"r.*"),
+  Row(3, 2) :: Nil)
+
+// With GROUP BY
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin),
+  Row(Row(1, 1)) :: Row(Row(2, 1)) :: Row(Row(3, 1)) :: Nil)
+
+// With GROUP BY and alias
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 2) :: Row(2, 2) :: Row(3, 2) :: Nil)
+
+// With GROUP BY and alias and additional fields in the struct
+checkAnswer(sql(
+  """
+| SELECT max(struct(a, record.*, b)) as r FROM
+|   (select a as a, b as b, struct(a,b) as record from testData2) 
tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 1, 2, 2) :: Row(2, 2, 2, 2) :: Row(3, 3, 2, 2) :: Nil)
+
+// Create a data set that contains nested structs.
+val nestedStructData = sql(
+  """
+| SELECT struct(r1, r2) as record FROM
+|   (SELECT struct(a, b) as r1, struct(b, a) as r2 FROM testData2) 
tmp
+  """.stripMargin)
+
+checkAnswer(nestedStructData.select($"record.*"),
+  Row(Row(1, 1), Row(1, 1)) :: Row(Row(1, 2), Row(2, 1)) :: Row(Row(2, 
1), Row(1, 2)) ::
+Row(Row(2, 2), Row(2, 2)) :: Row(Row(3, 1), Row(1, 3)) :: 
Row(Row(3, 2), Row(2, 3)) :: Nil)
+checkAnswer(nestedStructData.select($"record.r1"),
+  Row(Row(1, 1)) :: Row(Row(1, 2)) :: Row(Row(2, 1)) :: Row(Row(2, 2)) 
::
+Row(Row(3, 1)) :: Row(Row(3, 2)) :: Nil)
+checkAnswer(
+  nestedStructData.select($"record.r1.*"),
--- End diff --

Nice catch! That fixes it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread nongli
Github user nongli commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43669656
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +168,55 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELCCT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[String]) extends Star with 
Unevaluable {
--- End diff --

I don't think Seq[String] is right. UnresolvedAttribute takes a single string 
as input and then parses it to split the path and handle unescaping.
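
A minimal sketch of the kind of parsing described above, assuming a simplified rule: dots separate name parts unless they appear inside backticks. `NameParts.split` is a hypothetical helper written for illustration; it is not Spark's actual parsing code, and it ignores escaped backticks and malformed input.

```scala
object NameParts {
  // Hypothetical, simplified splitter (not Spark's implementation):
  // a '.' ends the current part unless we are inside a backtick-quoted section.
  def split(name: String): Seq[String] = {
    val parts = scala.collection.mutable.ArrayBuffer.empty[String]
    val current = new StringBuilder
    var inBackticks = false
    for (c <- name) {
      if (c == '`') {
        inBackticks = !inBackticks // the backtick itself is dropped (unescaping)
      } else if (c == '.' && !inBackticks) {
        parts += current.toString
        current.clear()
      } else {
        current.append(c)
      }
    }
    parts += current.toString
    parts.toSeq
  }
}
```

With this rule, `record.r1` yields two parts, while a quoted segment such as `` `b.c` `` stays a single part.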





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43716153
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,68 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
--- End diff --

We can use `exists` instead of `filter(...).nonEmpty`
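
A minimal sketch of the equivalence, assuming a hypothetical `Attr` class as a stand-in for Catalyst's `Attribute` and a case-insensitive comparison as a stand-in for the `resolver` argument. Only standard Scala collection methods are used.

```scala
object ExistsSketch {
  // Hypothetical stand-in for a Catalyst attribute; only what the example needs.
  final case class Attr(name: String, qualifiers: Seq[String])

  // Case-insensitive name comparison, standing in for the resolver argument.
  val resolver: (String, String) => Boolean = (a, b) => a.equalsIgnoreCase(b)

  // Shape used in the diff: builds an intermediate collection just to test emptiness.
  def viaFilter(attrs: Seq[Attr], table: String): Seq[Attr] =
    attrs.filter(_.qualifiers.filter(resolver(_, table)).nonEmpty)

  // Suggested shape: stops at the first matching qualifier.
  def viaExists(attrs: Seq[Attr], table: String): Seq[Attr] =
    attrs.filter(_.qualifiers.exists(resolver(_, table)))
}
```

Both functions select the same attributes; `exists` just avoids allocating the filtered sequence.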





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43716232
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,68 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
+  } else {
+List()
+  }
 }
-expandedAttributes.zip(input).map {
-  case (n: NamedExpression, _) => n
-  case (e, originalAttribute) =>
-Alias(e, originalAttribute.name)(qualifiers = 
originalAttribute.qualifiers)
+if (!expandedAttributes.isEmpty) {
+  if (expandedAttributes.forall(_.isInstanceOf[NamedExpression])) {
+return expandedAttributes
+  } else {
+require(expandedAttributes.size == input.output.size)
+expandedAttributes.zip(input.output).map {
+  case (e, originalAttribute) =>
+Alias(e, originalAttribute.name)(qualifiers = 
originalAttribute.qualifiers)
+}
--- End diff --

`Attribute` is always a `NamedExpression`, so I don't think we will ever hit 
this branch.
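
A small sketch of why the `else` branch is dead code, assuming a hypothetical two-type hierarchy (`NamedExpr`, `Attr`) that mirrors only the subtype relationship mentioned above:

```scala
object DeadBranchSketch {
  // Hypothetical miniature of the type hierarchy: every Attr is a NamedExpr.
  trait NamedExpr { def name: String }
  final case class Attr(name: String) extends NamedExpr

  // Because the element type is Attr, the check holds for any non-null elements,
  // so the fallback branch cannot be reached.
  def branchTaken(attrs: Seq[Attr]): String =
    if (attrs.forall(_.isInstanceOf[NamedExpr])) "returned as-is" else "re-aliased"
}
```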





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153173249
  
**[Test build #44840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44840/consoleFull)**
for PR 9343 at commit [`c55a7d2`](https://github.com/apache/spark/commit/c55a7d2369520bd3f6a8b8eb59dd69aed7e68dea).





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153172555
  
Merged build started.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread nongli
Github user nongli commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43689290
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +168,55 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELCCT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[String]) extends Star with 
Unevaluable {
--- End diff --

I tried the Seq[String] approach and I think it's better.

The Hive parser doesn't want to parse db.table in the SELECT list.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153172474
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153197858
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44840/





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153197856
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43702827
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,63 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
+  } else {
+List()
+  }
 }
-expandedAttributes.zip(input).map {
+expandedAttributes.zip(input.output).map {
   case (n: NamedExpression, _) => n
   case (e, originalAttribute) =>
 Alias(e, originalAttribute.name)(qualifiers = 
originalAttribute.qualifiers)
 }
+
+if (!expandedAttributes.isEmpty) return expandedAttributes
--- End diff --

Seems you want the following?
```
if (!expandedAttributes.isEmpty) {
  expandedAttributes.zip(input.output).map {
    case (n: NamedExpression, _) => n
    case (e, originalAttribute) =>
      Alias(e, originalAttribute.name)(qualifiers = originalAttribute.qualifiers)
  }
}
```





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43703147
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,63 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
+  } else {
+List()
+  }
 }
-expandedAttributes.zip(input).map {
+expandedAttributes.zip(input.output).map {
   case (n: NamedExpression, _) => n
   case (e, originalAttribute) =>
 Alias(e, originalAttribute.name)(qualifiers = 
originalAttribute.qualifiers)
 }
+
+if (!expandedAttributes.isEmpty) return expandedAttributes
+
+// Try to resolve it as a struct expansion. If there is a conflict and 
both are possible,
+// (i.e. [name].* is both a table and a struct), the struct path can 
always be qualified.
+if (target.isDefined) {
+  val attribute = input.resolve(target.get, resolver)
+  if (attribute.isDefined) {
+// This target resolved to an attribute in child. It must be a 
struct. Expand it.
+attribute.get.dataType match {
+  case s: StructType => {
+s.fields.map( f => {
+  val extract = GetStructField(attribute.get, f, 
s.getFieldIndex(f.name).get)
+  Alias(extract, target.get + "." + f.name)()
+})
+  }
+  case _ => {
+throw new AnalysisException("Can only star expand struct data 
types. Attribute: `" +
+  target.get + "`")
+  }
+}
+  } else {
+List()
+  }
+} else {
+  List()
+}
   }
 
-  override def toString: String = table.map(_ + ".").getOrElse("") + "*"
+  override def toString: String = target.map(_ + ".").getOrElse("") + "*"
--- End diff --

`map(_.mkString("."))` instead of `map(_ + ".")`?

```
scala> Some(Seq("str1", "str2"))
res0: Some[Seq[String]] = Some(List(str1, str2))

scala> res0.map(_ + ".")
res1: Option[String] = Some(List(str1, str2).)

scala> res0.map(_.mkString("."))
res2: Option[String] = Some(str1.str2)
```
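
Note that `mkString(".")` on its own would leave no separator before the star (`str1.str2*`). One way to keep the trailing dot is to join the parts first and then append it. A sketch, where `StarToString.render` is a hypothetical helper name and not part of the patch:

```scala
object StarToString {
  // Joins the path parts with dots, then appends the separator and the star.
  // With no target, only the star is rendered.
  def render(target: Option[Seq[String]]): String =
    target.map(_.mkString(".") + ".").getOrElse("") + "*"
}
```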





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153197650
  
**[Test build #44840 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44840/consoleFull)**
for PR 9343 at commit [`c55a7d2`](https://github.com/apache/spark/commit/c55a7d2369520bd3f6a8b8eb59dd69aed7e68dea).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class UnresolvedStar(target: Option[Seq[String]]) extends Star with Unevaluable`





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43703049
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,63 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
+  } else {
+List()
+  }
 }
-expandedAttributes.zip(input).map {
+expandedAttributes.zip(input.output).map {
--- End diff --

How about we add an assert to make sure that `expandedAttributes` and 
`input.output` have the same length? Otherwise, `zip` will silently drop extra 
elements.
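
A short sketch of the behaviour in question, using plain Scala collections. `ZipLength.zipStrict` is a hypothetical helper that adds the length check before zipping; it uses `require`, though an `assert` would follow the same pattern.

```scala
object ZipLength {
  // Plain zip truncates to the shorter input without any warning.
  def zipLoose[A, B](left: Seq[A], right: Seq[B]): Seq[(A, B)] =
    left.zip(right)

  // Guarded variant: fails fast with IllegalArgumentException on a length mismatch.
  def zipStrict[A, B](left: Seq[A], right: Seq[B]): Seq[(A, B)] = {
    require(left.size == right.size, s"length mismatch: ${left.size} vs ${right.size}")
    left.zip(right)
  }
}
```

For example, zipping a three-element sequence with a two-element one through `zipLoose` yields two pairs, while `zipStrict` rejects the same inputs.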





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43702945
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,63 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
+  } else {
+List()
--- End diff --

Should we throw an analysis error instead of returning an empty list? Also, 
suppose a query triggers this path: what will happen?
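
To make the trade-off concrete, here is a sketch contrasting the two behaviours. Everything in it is hypothetical (the `ExpansionException` type, the `columns` map, both function names), and it does not model the rest of `expand` or what the patch does after this point.

```scala
object EmptyVsError {
  // Hypothetical exception type standing in for an analysis-time error.
  final class ExpansionException(msg: String) extends Exception(msg)

  // Returns Nil for multi-part targets, so the caller cannot tell
  // "nothing matched" apart from "this form is not handled here".
  def expandQuietly(target: Seq[String], columns: Map[String, Seq[String]]): Seq[String] =
    if (target.size == 1) columns.getOrElse(target.head, Nil) else Nil

  // Fails loudly instead, naming the offending target.
  def expandStrictly(target: Seq[String], columns: Map[String, Seq[String]]): Seq[String] =
    if (target.size == 1) columns.getOrElse(target.head, Nil)
    else throw new ExpansionException(s"Cannot expand star for '${target.mkString(".")}'")
}
```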





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread nongli
Github user nongli commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43705776
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,63 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
+  } else {
+List()
+  }
 }
-expandedAttributes.zip(input).map {
+expandedAttributes.zip(input.output).map {
   case (n: NamedExpression, _) => n
   case (e, originalAttribute) =>
 Alias(e, originalAttribute.name)(qualifiers = 
originalAttribute.qualifiers)
 }
+
+if (!expandedAttributes.isEmpty) return expandedAttributes
+
+// Try to resolve it as a struct expansion. If there is a conflict and 
both are possible,
+// (i.e. [name].* is both a table and a struct), the struct path can 
always be qualified.
+if (target.isDefined) {
+  val attribute = input.resolve(target.get, resolver)
+  if (attribute.isDefined) {
+// This target resolved to an attribute in child. It must be a 
struct. Expand it.
+attribute.get.dataType match {
+  case s: StructType => {
+s.fields.map( f => {
+  val extract = GetStructField(attribute.get, f, 
s.getFieldIndex(f.name).get)
+  Alias(extract, target.get + "." + f.name)()
+})
+  }
+  case _ => {
+throw new AnalysisException("Can only star expand struct data 
types. Attribute: `" +
+  target.get + "`")
+  }
+}
+  } else {
+List()
+  }
+} else {
+  List()
+}
   }
 
-  override def toString: String = table.map(_ + ".").getOrElse("") + "*"
+  override def toString: String = target.map(_ + ".").getOrElse("") + "*"
--- End diff --

I think that was intentional here to append the star


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread nongli
Github user nongli commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43705834
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,63 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
+  } else {
+List()
--- End diff --

Yeah, I didn't change the existing behavior, but that's not so good. You 
would often just get an empty projection.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153212674
  
Merged build started.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153212605
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153213191
  
**[Test build #44862 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44862/consoleFull)**
 for PR 9343 at commit 
[`886acc7`](https://github.com/apache/spark/commit/886acc7ba1dc95348117b5890ce8115efa149b0b).





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43715293
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -1932,4 +1932,137 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   assert(sampled.count() == sampledOdd.count() + sampledEven.count())
 }
   }
+
+  test("Struct Star Expansion") {
+val structDf = testData2.select("a", "b").as("record")
+
+checkAnswer(
+  structDf.select($"record.a", $"record.b"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*", $"record.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 1, 2) :: Row(2, 1, 2, 1) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 3, 1) :: Row(3, 2, 3, 2) :: Nil)
+
+checkAnswer(
+  sql("select struct(a, b) as r1, struct(b, a) as r2 from 
testData2").select($"r1.*", $"r2.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 2, 1) :: Row(2, 1, 1, 2) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 1, 3) :: Row(3, 2, 2, 3) :: Nil)
+
+// Try with a registered table.
+sql("select struct(a, b) as record from 
testData2").registerTempTable("structTable")
+checkAnswer(sql("SELECT record.* FROM structTable"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin),
+  Row(Row(1, 1)) :: Nil)
+
+// Try with an alias on the select list
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin).select($"r.*"),
+  Row(3, 2) :: Nil)
+
+// With GROUP BY
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin),
+  Row(Row(1, 1)) :: Row(Row(2, 1)) :: Row(Row(3, 1)) :: Nil)
+
+// With GROUP BY and alias
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 2) :: Row(2, 2) :: Row(3, 2) :: Nil)
+
+// With GROUP BY and alias and additional fields in the struct
+checkAnswer(sql(
+  """
+| SELECT max(struct(a, record.*, b)) as r FROM
+|   (select a as a, b as b, struct(a,b) as record from testData2) 
tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 1, 2, 2) :: Row(2, 2, 2, 2) :: Row(3, 3, 2, 2) :: Nil)
+
+// Create a data set that contains nested structs.
+val nestedStructData = sql(
+  """
+| SELECT struct(r1, r2) as record FROM
+|   (SELECT struct(a, b) as r1, struct(b, a) as r2 FROM testData2) 
tmp
+  """.stripMargin)
+
+checkAnswer(nestedStructData.select($"record.*"),
+  Row(Row(1, 1), Row(1, 1)) :: Row(Row(1, 2), Row(2, 1)) :: Row(Row(2, 
1), Row(1, 2)) ::
+Row(Row(2, 2), Row(2, 2)) :: Row(Row(3, 1), Row(1, 3)) :: 
Row(Row(3, 2), Row(2, 3)) :: Nil)
+checkAnswer(nestedStructData.select($"record.r1"),
+  Row(Row(1, 1)) :: Row(Row(1, 2)) :: Row(Row(2, 1)) :: Row(Row(2, 2)) 
::
+Row(Row(3, 1)) :: Row(Row(3, 2)) :: Nil)
+checkAnswer(
+  nestedStructData.select($"record.r1.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+// Try with a registered table
+nestedStructData.registerTempTable("nestedStructTable")
+checkAnswer(sql("SELECT record.* FROM nestedStructTable"),
+  nestedStructData.select($"record.*"))
+checkAnswer(sql("SELECT record.r1 FROM nestedStructTable"),
+  nestedStructData.select($"record.r1"))
+checkAnswer(sql("SELECT record.r1.* FROM nestedStructTable"),
+  nestedStructData.select($"record.r1.*"))
--- End diff --

Regarding the format, maybe the following is better?

```
checkAnswer(
  sql("SELECT record.r1.* FROM nestedStructTable"),
  nestedStructData.select($"record.r1.*"))
```



[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43715351
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -1932,4 +1932,137 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   assert(sampled.count() == sampledOdd.count() + sampledEven.count())
 }
   }
+
+  test("Struct Star Expansion") {
+val structDf = testData2.select("a", "b").as("record")
+
+checkAnswer(
+  structDf.select($"record.a", $"record.b"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*", $"record.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 1, 2) :: Row(2, 1, 2, 1) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 3, 1) :: Row(3, 2, 3, 2) :: Nil)
+
+checkAnswer(
+  sql("select struct(a, b) as r1, struct(b, a) as r2 from 
testData2").select($"r1.*", $"r2.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 2, 1) :: Row(2, 1, 1, 2) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 1, 3) :: Row(3, 2, 2, 3) :: Nil)
+
+// Try with a registered table.
+sql("select struct(a, b) as record from 
testData2").registerTempTable("structTable")
+checkAnswer(sql("SELECT record.* FROM structTable"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin),
+  Row(Row(1, 1)) :: Nil)
+
+// Try with an alias on the select list
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin).select($"r.*"),
+  Row(3, 2) :: Nil)
+
+// With GROUP BY
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin),
+  Row(Row(1, 1)) :: Row(Row(2, 1)) :: Row(Row(3, 1)) :: Nil)
+
+// With GROUP BY and alias
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 2) :: Row(2, 2) :: Row(3, 2) :: Nil)
+
+// With GROUP BY and alias and additional fields in the struct
+checkAnswer(sql(
+  """
+| SELECT max(struct(a, record.*, b)) as r FROM
+|   (select a as a, b as b, struct(a,b) as record from testData2) 
tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 1, 2, 2) :: Row(2, 2, 2, 2) :: Row(3, 3, 2, 2) :: Nil)
+
+// Create a data set that contains nested structs.
+val nestedStructData = sql(
+  """
+| SELECT struct(r1, r2) as record FROM
+|   (SELECT struct(a, b) as r1, struct(b, a) as r2 FROM testData2) 
tmp
+  """.stripMargin)
+
+checkAnswer(nestedStructData.select($"record.*"),
+  Row(Row(1, 1), Row(1, 1)) :: Row(Row(1, 2), Row(2, 1)) :: Row(Row(2, 
1), Row(1, 2)) ::
+Row(Row(2, 2), Row(2, 2)) :: Row(Row(3, 1), Row(1, 3)) :: 
Row(Row(3, 2), Row(2, 3)) :: Nil)
+checkAnswer(nestedStructData.select($"record.r1"),
+  Row(Row(1, 1)) :: Row(Row(1, 2)) :: Row(Row(2, 1)) :: Row(Row(2, 2)) 
::
+Row(Row(3, 1)) :: Row(Row(3, 2)) :: Nil)
+checkAnswer(
+  nestedStructData.select($"record.r1.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+// Try with a registered table
+nestedStructData.registerTempTable("nestedStructTable")
+checkAnswer(sql("SELECT record.* FROM nestedStructTable"),
+  nestedStructData.select($"record.*"))
+checkAnswer(sql("SELECT record.r1 FROM nestedStructTable"),
+  nestedStructData.select($"record.r1"))
+checkAnswer(sql("SELECT record.r1.* FROM nestedStructTable"),
+  nestedStructData.select($"record.r1.*"))
+
+// Create paths with unusual characters.
+val specialCharacterPath = sql(
+  """
+| SELECT struct(`col$.a_`, `a.b.c.`) as `r&` FROM
+|   (SELECT struct(a, b) as `col$.a_`, struct(b, a) as `a.b.c.` 
FROM testData2) tmp
+  """.stripMargin)
+

[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153236010
  
Thanks @nongli ! Overall LGTM. I am going to merge it to master. Can you 
create a follow-up PR to address my comments? Thanks!





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9343





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43715869
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -466,9 +466,9 @@ object SqlParser extends AbstractSparkSQLParser with 
DataTypeParser {
 
   protected lazy val baseExpression: Parser[Expression] =
 ( "*" ^^^ UnresolvedStar(None)
-| ident <~ "." ~ "*" ^^ { case tableName => 
UnresolvedStar(Option(tableName)) }
-| primary
-)
+| (ident <~ "."). + <~ "*" ^^ { case target => { 
UnresolvedStar(Option(target)) }
+} | primary
+   )
--- End diff --

Please fix the style here, e.g. use `).+` instead of `). +`, remove the extra 
`{ ... }`, and follow the original indentation.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43715968
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -166,26 +166,68 @@ abstract class Star extends LeafExpression with 
NamedExpression {
  * Represents all of the input attributes to a given relational operator, 
for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the 
expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELECT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the 
expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table 
name or struct name. This
+ *  is a list of identifiers that is the path of the expansion.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with 
Unevaluable {
+case class UnresolvedStar(target: Option[Seq[String]]) extends Star with 
Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): 
Seq[NamedExpression] = {
 
-  override def expand(input: Seq[Attribute], resolver: Resolver): 
Seq[NamedExpression] = {
-val expandedAttributes: Seq[Attribute] = table match {
+// First try to expand assuming it is table.*.
+val expandedAttributes: Seq[Attribute] = target match {
   // If there is no table specified, use all input attributes.
-  case None => input
+  case None => input.output
   // If there is a table, pick out attributes that are part of this 
table.
-  case Some(t) => input.filter(_.qualifiers.filter(resolver(_, 
t)).nonEmpty)
+  case Some(t) => if (t.size == 1) {
+input.output.filter(_.qualifiers.filter(resolver(_, 
t.head)).nonEmpty)
--- End diff --

we can use `exists` instead of `filter(...).nonEmpty` here.
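
To illustrate the suggestion, here is a minimal sketch outside of Catalyst. `Attr`, `QualifierMatch`, and the case-insensitive `resolver` are hypothetical stand-ins (not the real `Attribute` or `Resolver` types); they only show that the two forms select the same attributes, with `exists` stopping at the first matching qualifier instead of building an intermediate collection.

```scala
// Hypothetical stand-in for Catalyst's Attribute; only the fields needed here.
final case class Attr(name: String, qualifiers: Seq[String])

object QualifierMatch {
  // Case-insensitive name comparison, standing in for a Resolver (assumption).
  val resolver: (String, String) => Boolean = (a, b) => a.equalsIgnoreCase(b)

  // Form used in the diff: filters the qualifiers, then checks for non-emptiness.
  def viaFilter(attrs: Seq[Attr], table: String): Seq[Attr] =
    attrs.filter(_.qualifiers.filter(resolver(_, table)).nonEmpty)

  // Suggested form: short-circuits on the first matching qualifier.
  def viaExists(attrs: Seq[Attr], table: String): Seq[Attr] =
    attrs.filter(_.qualifiers.exists(resolver(_, table)))
}
```

Both functions should return the same result for any input; only the inner check differs.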





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43622494
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -1921,4 +1921,89 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   assert(sampled.count() == sampledOdd.count() + sampledEven.count())
 }
   }
+
+  test("Struct Star Expansion") {
+val structDf = testData2.select("a", "b").as("record")
+
+checkAnswer(
+  structDf.select($"record.a", $"record.b"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*", $"record.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 1, 2) :: Row(2, 1, 2, 1) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 3, 1) :: Row(3, 2, 3, 2) :: Nil)
+
+checkAnswer(
+  sql("select struct(a, b) as r1, struct(b, a) as r2 from 
testData2").select($"r1.*", $"r2.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 2, 1) :: Row(2, 1, 1, 2) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 1, 3) :: Row(3, 2, 2, 3) :: Nil)
+
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin),
+  Row(Row(1, 1)) :: Nil)
+
+// Try with an alias on the select list
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin).select($"r.*"),
+  Row(3, 2) :: Nil)
+
+// With GROUP BY
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin),
+  Row(Row(1, 1)) :: Row(Row(2, 1)) :: Row(Row(3, 1)) :: Nil)
+
+// With GROUP BY and alias
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 2) :: Row(2, 2) :: Row(3, 2) :: Nil)
+
+// With GROUP BY and alias and additional fields in the struct
+checkAnswer(sql(
+  """
+| SELECT max(struct(a, record.*, b)) as r FROM
+|   (select a as a, b as b, struct(a,b) as record from testData2) 
tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 1, 2, 2) :: Row(2, 2, 2, 2) :: Row(3, 3, 2, 2) :: Nil)
+
+// Create a data set that contains nested structs.
+val nestedStructData = sql(
+  """
+| SELECT struct(r1, r2) as record FROM
+|   (SELECT struct(a, b) as r1, struct(b, a) as r2 FROM testData2) 
tmp
+  """.stripMargin)
+
+checkAnswer(nestedStructData.select($"record.*"),
+  Row(Row(1, 1), Row(1, 1)) :: Row(Row(1, 2), Row(2, 1)) :: Row(Row(2, 
1), Row(1, 2)) ::
+Row(Row(2, 2), Row(2, 2)) :: Row(Row(3, 1), Row(1, 3)) :: 
Row(Row(3, 2), Row(2, 3)) :: Nil)
+checkAnswer(nestedStructData.select($"record.r1"),
+  Row(Row(1, 1)) :: Row(Row(1, 2)) :: Row(Row(2, 1)) :: Row(Row(2, 2)) 
::
+Row(Row(3, 1)) :: Row(Row(3, 2)) :: Nil)
+checkAnswer(
+  nestedStructData.select($"record.r1.*"),
--- End diff --

Can you also try a pure SQL test case like `sql("select record.r1.* from 
xxx")`? AFAIK, in DataFrame we have this 
[rule](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Column.scala#L62-L63):
```
case "*" => UnresolvedStar(None)
case _ if name.endsWith(".*") => UnresolvedStar(Some(name.substring(0, name.length - 2)))
```
So `UnresolvedStar.target` will be `record.r1`, and this works well.

However, for pure SQL, we only have this 
[rule](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala#L468-L469):
```
( "*" ^^^ UnresolvedStar(None)
| ident <~ "." ~ "*" ^^ { case tableName => UnresolvedStar(Option(tableName)) }
```
which cannot parse `record.r1.*`.

A possible solution: `(ident <~ ".").+ <~ "*" ^^ { case target => 
UnresolvedStar(Option(target.mkString("."))) }`
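
As a rough illustration of what a multi-part star target looks like once parsed, here is a self-contained sketch that does not use the parser combinators. `StarTarget.parse` is a hypothetical helper, not part of Spark; it splits naively on `.` and so would not handle backtick-quoted identifiers that contain dots.

```scala
object StarTarget {
  // Hypothetical helper (not Spark API). Returns:
  //   None       when the name is not a star expression,
  //   Some(Nil)  for a bare "*",
  //   Some(path) for "a.b.*", where path is Seq("a", "b").
  // Assumption: identifiers contain no dots or backticks.
  def parse(name: String): Option[Seq[String]] =
    if (name == "*") Some(Nil)
    else if (name.endsWith(".*")) Some(name.dropRight(2).split('.').toSeq)
    else None
}
```

With this shape, `record.r1.*` yields the path `record`, `r1`, which is the same information `target.mkString(".")` would flatten into `record.r1`.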



[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43622767
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -1921,4 +1921,89 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   assert(sampled.count() == sampledOdd.count() + sampledEven.count())
 }
   }
+
+  test("Struct Star Expansion") {
+val structDf = testData2.select("a", "b").as("record")
+
+checkAnswer(
+  structDf.select($"record.a", $"record.b"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*", $"record.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 1, 2) :: Row(2, 1, 2, 1) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 3, 1) :: Row(3, 2, 3, 2) :: Nil)
+
+checkAnswer(
+  sql("select struct(a, b) as r1, struct(b, a) as r2 from 
testData2").select($"r1.*", $"r2.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 2, 1) :: Row(2, 1, 1, 2) :: Row(2, 2, 
2, 2) ::
+Row(3, 1, 1, 3) :: Row(3, 2, 2, 3) :: Nil)
+
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin),
+  Row(Row(1, 1)) :: Nil)
+
+// Try with an alias on the select list
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin).select($"r.*"),
+  Row(3, 2) :: Nil)
+
+// With GROUP BY
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin),
+  Row(Row(1, 1)) :: Row(Row(2, 1)) :: Row(Row(3, 1)) :: Nil)
+
+// With GROUP BY and alias
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 2) :: Row(2, 2) :: Row(3, 2) :: Nil)
+
+// With GROUP BY and alias and additional fields in the struct
+checkAnswer(sql(
+  """
+| SELECT max(struct(a, record.*, b)) as r FROM
+|   (select a as a, b as b, struct(a,b) as record from testData2) 
tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 1, 2, 2) :: Row(2, 2, 2, 2) :: Row(3, 3, 2, 2) :: Nil)
+
+// Create a data set that contains nested structs.
+val nestedStructData = sql(
+  """
+| SELECT struct(r1, r2) as record FROM
+|   (SELECT struct(a, b) as r1, struct(b, a) as r2 FROM testData2) 
tmp
+  """.stripMargin)
+
+checkAnswer(nestedStructData.select($"record.*"),
+  Row(Row(1, 1), Row(1, 1)) :: Row(Row(1, 2), Row(2, 1)) :: Row(Row(2, 
1), Row(1, 2)) ::
+Row(Row(2, 2), Row(2, 2)) :: Row(Row(3, 1), Row(1, 3)) :: 
Row(Row(3, 2), Row(2, 3)) :: Nil)
+checkAnswer(nestedStructData.select($"record.r1"),
+  Row(Row(1, 1)) :: Row(Row(1, 2)) :: Row(Row(2, 1)) :: Row(Row(2, 2)) 
::
+Row(Row(3, 1)) :: Row(Row(3, 2)) :: Nil)
+checkAnswer(
+  nestedStructData.select($"record.r1.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: 
Row(3, 2) :: Nil)
+
+// Try star expanding a scalar. This should fail.
+assert(intercept[AnalysisException](sql("select a.* from 
testData2")).getMessage.contains(
+  "Can only star expand struct data types."))
+  }
--- End diff --

Also add cases with special characters in column names or struct field names, 
like ```a`.`w.e_`.c.*```.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153231716
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153231717
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44862/
Test PASSed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-153231640
  
**[Test build #44862 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44862/consoleFull)**
 for PR 9343 at commit 
[`886acc7`](https://github.com/apache/spark/commit/886acc7ba1dc95348117b5890ce8115efa149b0b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class UnresolvedStar(target: Option[Seq[String]]) extends Star with Unevaluable`





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43586963
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
@@ -166,26 +166,58 @@ abstract class Star extends LeafExpression with NamedExpression {
  * Represents all of the input attributes to a given relational operator, for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELCCT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table name or struct name.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with Unevaluable {
+case class UnresolvedStar(target: Option[String]) extends Star with Unevaluable {
+
+  override def expand(input: LogicalPlan, resolver: Resolver): Seq[NamedExpression] = {
+// First try to resolve the target as a struct expansion. That is try to see if it is
+// .*. If that fails, we'll try as a table expansion.
+// TODO: is this the order we want to resolve this?
--- End diff --

How about we try table expansion first, since that is the current behavior?
If there is a struct that has the same name as the table, we can use
`name.name.*` to expand it.
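
The ordering suggested here can be illustrated with a small, self-contained sketch. It is a toy model only: `Column`, `ColType`, `StructCol`, and `StarOrder.expand` are hypothetical names invented for this note and are not Spark's classes or the code in this PR. It assumes a single-part target is tried as a table qualifier first, and that a struct sharing the table's name is reached with the two-part form.

```scala
// Toy model only: none of these types are Spark's. It illustrates
// "try table expansion first, then fall back to struct expansion".
sealed trait ColType
case object IntCol extends ColType
final case class StructCol(fields: Seq[String]) extends ColType

// A visible column: the table it belongs to, its name, and its type.
final case class Column(table: String, name: String, tpe: ColType)

object StarOrder {
  // Expand `target.*` over the visible columns and return output column names.
  def expand(target: Seq[String], input: Seq[Column]): Seq[String] = {
    require(target.nonEmpty, "target must have at least one name part")
    // 1. Table expansion: a single-part target that names a table qualifier.
    val asTable =
      if (target.length == 1) input.filter(_.table == target.head).map(_.name)
      else Seq.empty[String]
    if (asTable.nonEmpty) asTable
    else {
      // 2. Struct expansion: the last part names a struct column, optionally
      //    qualified by a table name (e.g. Seq("t", "t") for `t.t.*`).
      val qualifier = target.init
      val colName = target.last
      val matches = input.filter { c =>
        c.name == colName && (qualifier.isEmpty || qualifier == Seq(c.table))
      }
      matches.flatMap { c =>
        c.tpe match {
          case StructCol(fields) => fields.map(f => s"${c.name}.$f")
          case _ => sys.error("Can only star expand struct data types.")
        }
      }
    }
  }
}
```

With a table `record` that holds a scalar column `a` and a struct column also named `record`, `Seq("record")` expands to the table's columns, while `Seq("record", "record")` expands the struct's fields.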





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43586920
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -286,44 +304,42 @@ class Analyzer(
   case p @ Project(projectList, child) if containsStar(projectList) =>
 Project(
   projectList.flatMap {
-case s: Star => s.expand(child.output, resolver)
+case s: Star => s.expand(child, resolver)
 case UnresolvedAlias(f @ UnresolvedFunction(_, args, _)) if containsStar(args) =>
-  val expandedArgs = args.flatMap {
-case s: Star => s.expand(child.output, resolver)
-case o => o :: Nil
-  }
-  UnresolvedAlias(child = f.copy(children = expandedArgs)) :: Nil
+  val newChildren = expandStarExpressions(args, child)
+  UnresolvedAlias(child = f.copy(children = newChildren)) :: Nil
+case Alias(f @ UnresolvedFunction(_, args, _), name) if containsStar(args) =>
+  val newChildren = expandStarExpressions(args, child)
--- End diff --

`newChildren` => `newArgs`?





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43586919
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -286,44 +304,42 @@ class Analyzer(
   case p @ Project(projectList, child) if containsStar(projectList) =>
 Project(
   projectList.flatMap {
-case s: Star => s.expand(child.output, resolver)
+case s: Star => s.expand(child, resolver)
 case UnresolvedAlias(f @ UnresolvedFunction(_, args, _)) if containsStar(args) =>
-  val expandedArgs = args.flatMap {
-case s: Star => s.expand(child.output, resolver)
-case o => o :: Nil
-  }
-  UnresolvedAlias(child = f.copy(children = expandedArgs)) :: Nil
+  val newChildren = expandStarExpressions(args, child)
--- End diff --

`newChildren` => `newArgs`?





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43586981
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
@@ -166,26 +168,55 @@ abstract class Star extends LeafExpression with NamedExpression {
  * Represents all of the input attributes to a given relational operator, for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELCCT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table name or struct name.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with Unevaluable {
+case class UnresolvedStar(target: Option[String]) extends Star with Unevaluable {
--- End diff --

Maybe `Option[Seq[String]]`?





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43586989
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
@@ -166,26 +168,55 @@ abstract class Star extends LeafExpression with NamedExpression {
  * Represents all of the input attributes to a given relational operator, for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELCCT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table name or struct name.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with Unevaluable {
+case class UnresolvedStar(target: Option[String]) extends Star with Unevaluable {
--- End diff --

btw, do we support `db.table.*` or `db.table.struct.*`?





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43587089
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -1921,4 +1921,89 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
   assert(sampled.count() == sampledOdd.count() + sampledEven.count())
 }
   }
+
+  test("Struct Star Expansion") {
+val structDf = testData2.select("a", "b").as("record")
+
+checkAnswer(
+  structDf.select($"record.a", $"record.b"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: Row(3, 2) :: Nil)
+
+checkAnswer(
+  structDf.select($"record.*", $"record.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 1, 2) :: Row(2, 1, 2, 1) :: Row(2, 2, 2, 2) ::
+Row(3, 1, 3, 1) :: Row(3, 2, 3, 2) :: Nil)
+
+checkAnswer(
+  sql("select struct(a, b) as r1, struct(b, a) as r2 from testData2").select($"r1.*", $"r2.*"),
+  Row(1, 1, 1, 1) :: Row(1, 2, 2, 1) :: Row(2, 1, 1, 2) :: Row(2, 2, 2, 2) ::
+Row(3, 1, 1, 3) :: Row(3, 2, 2, 3) :: Nil)
+
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin),
+  Row(Row(1, 1)) :: Nil)
+
+// Try with an alias on the select list
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select struct(a,b) as record from testData2) tmp
+  """.stripMargin).select($"r.*"),
+  Row(3, 2) :: Nil)
+
+// With GROUP BY
+checkAnswer(sql(
+  """
+| SELECT min(struct(record.*)) FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin),
+  Row(Row(1, 1)) :: Row(Row(2, 1)) :: Row(Row(3, 1)) :: Nil)
+
+// With GROUP BY and alias
+checkAnswer(sql(
+  """
+| SELECT max(struct(record.*)) as r FROM
+|   (select a as a, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 2) :: Row(2, 2) :: Row(3, 2) :: Nil)
+
+// With GROUP BY and alias and additional fields in the struct
+checkAnswer(sql(
+  """
+| SELECT max(struct(a, record.*, b)) as r FROM
+|   (select a as a, b as b, struct(a,b) as record from testData2) tmp
+| GROUP BY a
+  """.stripMargin).select($"r.*"),
+  Row(1, 1, 2, 2) :: Row(2, 2, 2, 2) :: Row(3, 3, 2, 2) :: Nil)
+
+// Create a data set that contains nested structs.
+val nestedStructData = sql(
+  """
+| SELECT struct(r1, r2) as record FROM
+|   (SELECT struct(a, b) as r1, struct(b, a) as r2 FROM testData2) tmp
+  """.stripMargin)
+
+checkAnswer(nestedStructData.select($"record.*"),
+  Row(Row(1, 1), Row(1, 1)) :: Row(Row(1, 2), Row(2, 1)) :: Row(Row(2, 1), Row(1, 2)) ::
+Row(Row(2, 2), Row(2, 2)) :: Row(Row(3, 1), Row(1, 3)) :: Row(Row(3, 2), Row(2, 3)) :: Nil)
+checkAnswer(nestedStructData.select($"record.r1"),
+  Row(Row(1, 1)) :: Row(Row(1, 2)) :: Row(Row(2, 1)) :: Row(Row(2, 2)) ::
+Row(Row(3, 1)) :: Row(Row(3, 2)) :: Nil)
+checkAnswer(
+  nestedStructData.select($"record.r1.*"),
+  Row(1, 1) :: Row(1, 2) :: Row(2, 1) :: Row(2, 2) :: Row(3, 1) :: Row(3, 2) :: Nil)
+
+// Try star expanding a scalar. This should fail.
+assert(intercept[AnalysisException](sql("select a.* from testData2")).getMessage.contains(
+  "Can only star expand struct data types."))
+  }
--- End diff --

This suite is super nice! Can we add a case where a table and its struct
column have the same name?





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-30 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43477553
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
@@ -166,26 +168,55 @@ abstract class Star extends LeafExpression with NamedExpression {
  * Represents all of the input attributes to a given relational operator, for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELCCT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table name or struct name.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with Unevaluable {
+case class UnresolvedStar(target: Option[String]) extends Star with Unevaluable {
--- End diff --

How about `table.structColumn.*`? Or `complexColumn.structField.*`? I think
an `Option[String]` is not enough to express this.
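
To make the point concrete, here is a minimal sketch of a multi-part target. `StarTarget.parse` is a hypothetical helper written for this note, not part of Spark or of this PR; it assumes name parts are separated by plain dots and does not handle backtick-quoted identifiers.

```scala
// Hypothetical helper, not part of Spark: split a star reference such as
// "table.structColumn.*" into the name parts that precede the star.
object StarTarget {
  // Returns None for a bare "*", Some(parts) for "a.b.*", and fails via
  // `require` when the reference does not end in a star.
  def parse(ref: String): Option[Seq[String]] = {
    val parts = ref.split('.').toSeq
    require(parts.nonEmpty && parts.last == "*", s"not a star reference: $ref")
    val qualifier = parts.init
    if (qualifier.isEmpty) None else Some(qualifier)
  }
}
```

A single `String` would have to flatten `table.structColumn` back into one name, so the resolver could no longer tell a table qualifier from a struct field; keeping the parts separate preserves that distinction.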





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152650397
  
Merged build started.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-30 Thread nongli
Github user nongli commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43549569
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
@@ -166,26 +168,55 @@ abstract class Star extends LeafExpression with NamedExpression {
  * Represents all of the input attributes to a given relational operator, for example in
  * "SELECT * FROM ...".
  *
- * @param table an optional table that should be the target of the expansion.  If omitted all
- *  tables' columns are produced.
+ * This is also used to expand structs. For example:
+ * "SELECT record.* from (SELCCT struct(a,b,c) as record ...)
+ *
+ * @param target an optional name that should be the target of the expansion.  If omitted all
+ *  targets' columns are produced. This can either be a table name or struct name.
  */
-case class UnresolvedStar(table: Option[String]) extends Star with Unevaluable {
+case class UnresolvedStar(target: Option[String]) extends Star with Unevaluable {
--- End diff --

You're right. That doesn't work. I think there is a much better way to do 
this.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152671830
  
**[Test build #44704 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44704/consoleFull)** for PR 9343 at commit [`be20b33`](https://github.com/apache/spark/commit/be20b33b83b131a90fca7ded014edc564349eae2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class UnresolvedStar(target: Option[String]) extends Star with Unevaluable`





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152650374
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152652356
  
**[Test build #44704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44704/consoleFull)** for PR 9343 at commit [`be20b33`](https://github.com/apache/spark/commit/be20b33b83b131a90fca7ded014edc564349eae2).





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152671936
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152671938
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44704/
Test PASSed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43470847
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -1870,4 +1870,66 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
   assert(sampled.count() == sampledOdd.count() + sampledEven.count())
 }
   }
+
+  test("Struct Star Expansion") {
+checkAnswer(
+  sql("select struct(a, b) as record from testData2").select($"record.a", $"record.b"),
--- End diff --

You can make it more DataFrame-style, like
`testData2.select(struct("a", "b").as("record"))`.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152053435
  
**[Test build #44555 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44555/consoleFull)** for PR 9343 at commit [`df65944`](https://github.com/apache/spark/commit/df6594406f84bf25139fb13d70937a16ab391057).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class HasSolver(Params):`
   * `case class UnresolvedStar(target: Option[String]) extends Star with Unevaluable`





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152053526
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152053527
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44555/
Test PASSed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/9343#discussion_r43342070
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala ---
@@ -146,7 +146,11 @@ case class Alias(child: Expression, name: String)(
 
   override def toAttribute: Attribute = {
 if (resolved) {
-  AttributeReference(name, child.dataType, child.nullable, metadata)(exprId, qualifiers)
+  // Append the name of the alias as a qualifier. This lets us resolve things like:
+  // (SELECT struct(a,b) AS x FROM ...).SELECT x.*
+  // TODO: is this the best way to do this? Should Alias just have nameas the qualifier?
--- End diff --

'nameas' -> 'name as'?





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152030521
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152031892
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152034501
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44549/
Test FAILed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152034495
  
**[Test build #44549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44549/consoleFull)** for PR 9343 at commit [`54e155e`](https://github.com/apache/spark/commit/54e155e537ef7ab7d03d8c2a7877b91839089458).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class UnresolvedStar(target: Option[String]) extends Star with Unevaluable`





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152034499
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152038598
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152038614
  
Merged build started.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread nongli
GitHub user nongli opened a pull request:

https://github.com/apache/spark/pull/9343

[SPARK-11329] [SQL] Support star expansion for structs.

1. Support expanding structs in projections, i.e.
   "SELECT s.*" where s is a struct type.
   This is done by allowing the expand function to handle structs in
   addition to tables.

2. Support expanding * inside aggregate functions of structs, e.g.
   "SELECT max(struct(col1, structCol.*))"
   This requires recursively expanding the expressions. In this case, it is
   the aggregate expression "max(...)" and we need to recursively expand its
   child inputs.
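
The recursive expansion in item 2 can be sketched on a toy expression tree. The types below (`Expr`, `Col`, `Star`, `Func`) and `ExpandStars.expand` are illustrative stand-ins invented for this note, not Spark's Catalyst classes, and the `structs` map is an assumed lookup from a struct column name to its field names.

```scala
// Toy expression tree; the names below are illustrative, not Spark's classes.
sealed trait Expr
final case class Col(name: String) extends Expr
final case class Star(target: String) extends Expr
final case class Func(name: String, args: Seq[Expr]) extends Expr

object ExpandStars {
  // `structs` maps a struct column name to its field names.
  // A star becomes one column per field; a function keeps its name and has
  // each argument expanded recursively; anything else is returned unchanged.
  def expand(e: Expr, structs: Map[String, Seq[String]]): Seq[Expr] = e match {
    case Star(t) =>
      structs.getOrElse(t, sys.error(s"cannot expand $t.*")).map(f => Col(s"$t.$f"))
    case Func(n, args) =>
      Seq(Func(n, args.flatMap(expand(_, structs))))
    case other =>
      Seq(other)
  }
}
```

Applied to `max(struct(col1, structCol.*))` with `structCol` holding fields `a` and `b`, the star nested two levels down is replaced in place, giving `max(struct(col1, structCol.a, structCol.b))`.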

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nongli/spark spark-11329

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9343.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9343


commit 54e155e537ef7ab7d03d8c2a7877b91839089458
Author: Nong Li 
Date:   2015-10-28T19:19:57Z

[SPARK-11329] [SQL] Support star expansion for structs.

1. Support expanding structs in projections, e.g.
   "SELECT s.*" where s is a struct type.
   This is supported by allowing the expand function to handle structs in
   addition to tables.

2. Support expanding * inside aggregate functions of structs, e.g.
   "SELECT max(struct(col1, structCol.*))"
   This requires recursively expanding the expressions. In this case, it is
   the aggregate expression "max(...)" whose child inputs need to be
   recursively expanded.







[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152031447
  
ok to test





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152031915
  
Merged build started.





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152034170
  
**[Test build #44549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44549/consoleFull)** for PR 9343 at commit [`54e155e`](https://github.com/apache/spark/commit/54e155e537ef7ab7d03d8c2a7877b91839089458).





[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9343#issuecomment-152038777
  
**[Test build #44555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44555/consoleFull)** for PR 9343 at commit [`df65944`](https://github.com/apache/spark/commit/df6594406f84bf25139fb13d70937a16ab391057).

