[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user davies commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72794234
select() and filter() in Python do not support this yet.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/4348#discussion_r24063817
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }
+  override def selectExpr(exprs: String*): DataFrame = {
--- End diff --
I think this one could be merged into select(); a column is also a valid expression.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72788843
[Test build #26723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26723/consoleFull) for PR 4348 at commit [`2baeef2`](https://github.com/apache/spark/commit/2baeef2f4035bad7aa829cf52fc338245f52fafd).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72796410
[Test build #26723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26723/consoleFull) for PR 4348 at commit [`2baeef2`](https://github.com/apache/spark/commit/2baeef2f4035bad7aa829cf52fc338245f52fafd).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72796417
Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26723/
Test PASSed.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/4348#discussion_r24063992
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }
+  override def selectExpr(exprs: String*): DataFrame = {
--- End diff --
It should work in these cases with this implementation.
```
select('a', '`the name`', 'a + 1', 'min(b) * 3')
```
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/4348#discussion_r24063848
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }
+  override def selectExpr(exprs: String*): DataFrame = {
--- End diff --
Not if the column name has a space ... it will just fail.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/4348#discussion_r24064580
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }
+  override def selectExpr(exprs: String*): DataFrame = {
--- End diff --
yea - but asking users to wrap a column name in backticks in strings is fairly annoying.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user rxin commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72796555
We can discuss more offline. For now let's keep this separate; otherwise it can be fairly annoying to use column names that contain spaces or SQL keywords.
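The trade-off in the review comments above can be illustrated with a small standalone sketch. It does not use Spark: `parse_as_expression`, `select_by_name`, and the `KEYWORDS` set are hypothetical stand-ins invented here, and the toy parser is far simpler than Spark's SQL parser. It only shows why a string taken verbatim as a column name and a string parsed as an expression behave differently for names with spaces or keywords.

```python
import re

# Hypothetical keyword list for illustration only; NOT Spark's reserved words.
KEYWORDS = {"select", "from", "where", "order", "group", "by"}
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def parse_as_expression(text):
    """Toy expression parser: accepts a bare identifier or a backtick-quoted name."""
    text = text.strip()
    quoted = len(text) >= 3 and text.startswith("`") and text.endswith("`")
    if quoted and "`" not in text[1:-1]:
        return text[1:-1]  # backticks make any name legal
    if IDENT.fullmatch(text) and text.lower() not in KEYWORDS:
        return text
    raise ValueError("cannot parse %r as an expression" % text)


def select_by_name(columns, name):
    """Name-based lookup: the string is used verbatim, with no parsing step."""
    if name not in columns:
        raise KeyError(name)
    return name


columns = ["a", "the name", "order"]
for name in columns:
    by_name = select_by_name(columns, name)  # always succeeds for real columns
    try:
        by_expr = parse_as_expression(name)
    except ValueError as err:
        by_expr = "error: %s" % err  # "the name" and "order" end up here
    print(name, "|", by_name, "|", by_expr)
```

Under these assumptions, a merged entry point would force callers to write `` `the name` `` for every awkward column name, whereas a separate name-based method accepts the name unchanged. That is the inconvenience the thread cites as the reason for keeping the two methods apart.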
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/4348#discussion_r24062195
--- Diff: python/pyspark/sql.py ---
@@ -2126,10 +2126,9 @@ def sort(self, *cols):
         if not cols:
             raise ValueError("should sort by at least one column")
-        jcols = ListConverter().convert([_to_java_column(c) for c in cols[1:]],
+        jcols = ListConverter().convert([_to_java_column(c) for c in cols],
--- End diff --
@davies take a look at the Python changes.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user asfgit closed the pull request at: https://github.com/apache/spark/pull/4348
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/4348
[SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions
```scala
df.selectExpr("abs(colA)", "colB")
df.filter("age > 21")
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rxin/spark SPARK-5579
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4348.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4348
commit ac65f4b86bfce0ba2e170c31f6a50c58255f960e
Author: Reynold Xin r...@databricks.com
Date: 2015-02-04T00:25:50Z
[SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions. e.g. df.selectExpr("abs(colA)", "colB") df.filter("age > 21")
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-7272
[Test build #26693 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26693/consoleFull) for PR 4348 at commit [`ac65f4b`](https://github.com/apache/spark/commit/ac65f4b86bfce0ba2e170c31f6a50c58255f960e).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72772602
[Test build #26693 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26693/consoleFull) for PR 4348 at commit [`ac65f4b`](https://github.com/apache/spark/commit/ac65f4b86bfce0ba2e170c31f6a50c58255f960e).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class Dsl(object):`
  * `class ExamplePointUDT(UserDefinedType):`
  * `class SQLTests(ReusedPySparkTestCase):`
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
GitHub user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72772607
Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26693/
Test FAILed.