[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/4348#issuecomment-72794234
  
select() and filter() in Python do not support this yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/4348#discussion_r24063817
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }
 
+  override def selectExpr(exprs: String*): DataFrame = {
--- End diff --

I think this one could be merged into select(), since a column is also a valid
expression.





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4348#issuecomment-72788843
  
[Test build #26723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26723/consoleFull) for PR 4348 at commit [`2baeef2`](https://github.com/apache/spark/commit/2baeef2f4035bad7aa829cf52fc338245f52fafd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4348#issuecomment-72796410
  
[Test build #26723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26723/consoleFull) for PR 4348 at commit [`2baeef2`](https://github.com/apache/spark/commit/2baeef2f4035bad7aa829cf52fc338245f52fafd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4348#issuecomment-72796417
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26723/
Test PASSed.





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/4348#discussion_r24063992
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }
 
+  override def selectExpr(exprs: String*): DataFrame = {
--- End diff --

It should work in these cases with this implementation.
```
select('a', '`the name`', 'a + 1', 'min(b) * 3')
```





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/4348#discussion_r24063848
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }
 
+  override def selectExpr(exprs: String*): DataFrame = {
--- End diff --

Not if it has a space ... it will just fail.





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/4348#discussion_r24064580
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }
 
+  override def selectExpr(exprs: String*): DataFrame = {
--- End diff --

yea - but asking users to wrap a column name in backticks in strings is 
fairly annoying.





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/4348#issuecomment-72796555
  
We can discuss more offline. For now let's keep this separate; otherwise it
can be fairly annoying to use column names that contain spaces or column names
that contain any SQL keywords.
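
The trade-off discussed in this thread can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions, not Spark's parser or API: `is_bare_identifier`, `quote_for_expr`, and `select_by_name` are hypothetical helpers, and `SAMPLE_KEYWORDS` is a small illustrative sample rather than a real reserved-word list. It shows that if every string argument were parsed as an expression, names containing spaces or keywords would need backtick quoting, whereas a name-only projection can take them verbatim.

```python
import re

# Illustrative sample only; a real SQL dialect reserves many more words.
SAMPLE_KEYWORDS = {"select", "from", "where", "order", "group", "by"}

_BARE_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_bare_identifier(text):
    """Return True if text could stand unquoted as a column name in an expression."""
    if text.lower() in SAMPLE_KEYWORDS:
        return False
    return _BARE_IDENTIFIER.fullmatch(text) is not None


def quote_for_expr(column_name):
    """Return column_name in a form an expression parser could read as one column.

    Names that themselves contain backticks are not handled in this sketch.
    """
    if is_bare_identifier(column_name):
        return column_name
    return "`" + column_name + "`"


def select_by_name(row, *names):
    """Name-only projection: every argument is looked up verbatim, no parsing."""
    return [row[name] for name in names]


row = {"a": 1, "the name": 2, "order": 3}
print(select_by_name(row, "a", "the name", "order"))
print([quote_for_expr(name) for name in row])
```

With a separate name-only method the caller never has to think about quoting; with a single expression-based method, every caller with an awkward column name has to apply something like `quote_for_expr` first.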






[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/4348#discussion_r24062195
  
--- Diff: python/pyspark/sql.py ---
@@ -2126,10 +2126,9 @@ def sort(self, *cols):
 
         if not cols:
             raise ValueError("should sort by at least one column")
-        jcols = ListConverter().convert([_to_java_column(c) for c in cols[1:]],
+        jcols = ListConverter().convert([_to_java_column(c) for c in cols],
--- End diff --

@davies take a look at the Python changes.
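
The effect of the one-line change in this hunk can be seen in isolation. The following is a minimal sketch in plain Python, not the PySpark code: `to_java_column` is a hypothetical stand-in for `_to_java_column`, and `columns_before` and `columns_after` only mimic how the column list is built on the removed and added lines. The hunk does not show how the first column was handled elsewhere in the old code, so this illustrates the difference in the list contents only.

```python
def to_java_column(col):
    """Hypothetical stand-in for _to_java_column; it just tags the value."""
    return ("jcol", col)


def columns_before(*cols):
    """Mimics the removed line: converts every column except the first."""
    if not cols:
        raise ValueError("should sort by at least one column")
    return [to_java_column(c) for c in cols[1:]]


def columns_after(*cols):
    """Mimics the added line: converts every column, including the first."""
    if not cols:
        raise ValueError("should sort by at least one column")
    return [to_java_column(c) for c in cols]


print(columns_before("age", "name"))
print(columns_after("age", "name"))
```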





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4348





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/4348

[SPARK-5579][SQL][DataFrame] Support for project/filter using SQL 
expressions

```scala
df.selectExpr("abs(colA)", "colB")
df.filter("age > 21")
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark SPARK-5579

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4348.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4348


commit ac65f4b86bfce0ba2e170c31f6a50c58255f960e
Author: Reynold Xin r...@databricks.com
Date:   2015-02-04T00:25:50Z

[SPARK-5579][SQL][DataFrame] Support for project/filter using SQL 
expressions.

e.g.

df.selectExpr("abs(colA)", "colB")

df.filter("age > 21")







[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4348#issuecomment-7272
  
[Test build #26693 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26693/consoleFull) for PR 4348 at commit [`ac65f4b`](https://github.com/apache/spark/commit/ac65f4b86bfce0ba2e170c31f6a50c58255f960e).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4348#issuecomment-72772602
  
[Test build #26693 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26693/consoleFull) for PR 4348 at commit [`ac65f4b`](https://github.com/apache/spark/commit/ac65f4b86bfce0ba2e170c31f6a50c58255f960e).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class Dsl(object):`
  * `class ExamplePointUDT(UserDefinedType):`
  * `class SQLTests(ReusedPySparkTestCase):`






[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...

2015-02-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4348#issuecomment-72772607
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26693/
Test FAILed.

