[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user rerngvit closed the pull request at: https://github.com/apache/spark/pull/14309 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user rerngvit commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14309#discussion_r73325142

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -896,6 +896,19 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
         checkError(df("`a.b`.c.`d"))
       }
    +  test("SPARK-11977: Support accessing a column contains '.' without backticks") {
    +    val df = sparkContext.parallelize(
    +      (1, 2) :: (3, 4) :: Nil).toDF("test.column1", "test.column.2")
    +    checkAnswer(df.select("test.column1"), Seq(Row(1), Row(3)))
    --- End diff --

    Any follow up on this?
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14309#discussion_r72180296

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -896,6 +896,19 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
         checkError(df("`a.b`.c.`d"))
       }
    +  test("SPARK-11977: Support accessing a column contains '.' without backticks") {
    +    val df = sparkContext.parallelize(
    +      (1, 2) :: (3, 4) :: Nil).toDF("test.column1", "test.column.2")
    +    checkAnswer(df.select("test.column1"), Seq(Row(1), Row(3)))
    --- End diff --

    This is a behaviour change; users may expect to see an analysis failure here. cc @rxin Do we want this feature?

    Personally I'd like to have a new function (similar to `col`, but one that resolves the column by the given name directly) instead of changing the semantics of `select`.
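To make the distinction in this comment concrete, here is a minimal, Spark-free sketch of the two lookup styles being contrasted. It is a toy model under stated assumptions, not Spark's analyzer: `NameParts`, `parse`, and `resolveExact` are hypothetical names invented for illustration. `parse` treats dots as separators unless they sit inside backticks, which is why a name such as "test.column1" needs quoting today; `resolveExact` stands in for the proposed "resolve by the given name directly" function and does no parsing at all.

```scala
object NameParts {
  /** Splits a column reference into name parts. Dots separate parts,
    * except inside a backtick-quoted segment. Toy model only (assumption:
    * no escaping of backticks inside a quoted segment). */
  def parse(name: String): Either[String, Seq[String]] = {
    val parts = scala.collection.mutable.ArrayBuffer.empty[String]
    val current = new StringBuilder
    var inBackticks = false
    for (ch <- name) ch match {
      case '`' =>
        inBackticks = !inBackticks          // enter or leave a quoted segment
      case '.' if !inBackticks =>
        parts += current.toString           // an unquoted dot ends the current part
        current.clear()
      case other =>
        current.append(other)
    }
    parts += current.toString
    if (inBackticks) Left(s"unbalanced backtick in: $name")
    else if (parts.exists(_.isEmpty)) Left(s"empty name part in: $name")
    else Right(parts.toSeq)
  }

  /** Hypothetical "resolve by exact name" lookup: the whole string is
    * compared against the top-level column names, with no parsing. */
  def resolveExact(columns: Seq[String], name: String): Option[Int] = {
    val i = columns.indexOf(name)
    if (i >= 0) Some(i) else None
  }
}
```

In this model, `NameParts.parse("test.column1")` yields two parts while the backtick-quoted form yields one, and `NameParts.resolveExact(Seq("test.column1", "test.column.2"), "test.column.2")` finds the column without any quoting. Keeping the exact lookup in a separate function leaves the meaning of existing `select` calls untouched, which is the design point the comment argues for.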
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14309#discussion_r72127680

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -896,6 +896,19 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
         checkError(df("`a.b`.c.`d"))
       }
    +  test("SPARK-11977: Support accessing a column contains '.' without backticks") {
    +    val df = sparkContext.parallelize(
    --- End diff --

    Could you add some test cases with a DataFrame created from `iris`?
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user rerngvit commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14309#discussion_r71986608

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -641,6 +641,10 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
             Row(key, value, key + 1)
           }.toSeq)
         assert(df.schema.map(_.name) === Seq("key", "valueRenamed", "newCol"))
    --- End diff --

    @jaceklaskowski: I agree that the code can be made shorter and more elegant in the ways you pointed out. Nonetheless, both improvements, (1) removing `()` in `toDF()` and (2) replacing the schema map with `fieldNames`, also apply to other places in "DataFrameSuite.scala". In other words, they are not directly related to this PR. I think the proper way to handle this is in a separate JIRA issue. Feel free to file it.
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14309#discussion_r71982269

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -641,6 +641,10 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
             Row(key, value, key + 1)
           }.toSeq)
         assert(df.schema.map(_.name) === Seq("key", "valueRenamed", "newCol"))
    +
    +    // Renaming to a column that contains "." character
    +    val df2 = testData.toDF().withColumnRenamed("value", "value.Renamed")
    +    assert(df2.schema.map(_.name) === Seq("key", "value.Renamed"))
    --- End diff --

    Please add more test cases showing that columns whose names contain '.' can be accessed without backticks.
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14309#discussion_r7196

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -641,6 +641,10 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
             Row(key, value, key + 1)
           }.toSeq)
         assert(df.schema.map(_.name) === Seq("key", "valueRenamed", "newCol"))
    +
    +    // Renaming to a column that contains "." character
    +    val df2 = testData.toDF().withColumnRenamed("value", "value.Renamed")
    --- End diff --

    No need for `()` in `toDF()`.
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14309#discussion_r7137

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -641,6 +641,10 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
             Row(key, value, key + 1)
           }.toSeq)
         assert(df.schema.map(_.name) === Seq("key", "valueRenamed", "newCol"))
    +
    +    // Renaming to a column that contains "." character
    +    val df2 = testData.toDF().withColumnRenamed("value", "value.Renamed")
    +    assert(df2.schema.map(_.name) === Seq("key", "value.Renamed"))
    --- End diff --

    `df2.schema.fieldNames`?
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14309#discussion_r71888979

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -641,6 +641,10 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
             Row(key, value, key + 1)
           }.toSeq)
         assert(df.schema.map(_.name) === Seq("key", "valueRenamed", "newCol"))
    --- End diff --

    I'd fix that line, too, with `fieldNames`.
[GitHub] spark pull request #14309: [SPARK-11977][SQL] Support accessing a column con...
GitHub user rerngvit opened a pull request:

    https://github.com/apache/spark/pull/14309

    [SPARK-11977][SQL] Support accessing a column contains "." without backticks

    ## What changes were proposed in this pull request?

    - Add support for accessing a dataframe column that contains "." in its name without backticks
    - Add a testcase for this in DataFrameSuite

    ## How was this patch tested?

    Spark SQL unit test and manual testing

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rerngvit/spark SPARK-11977

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14309.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14309

commit b5046851e46ceccf745677d2610e3c57f3e60f16
Author: Rerngvit Yanggratoke
Date:   2016-07-21T22:26:28Z

    [SPARK-11977][SQL] Support accessing a DataFrame column using its name without backticks if the name contains '.'

    ## What changes were proposed in this pull request?

    - Add support for accessing dataframe column that contains "." in its name without backticks
    - Add a testcase for this in DataFrameSuite

    ## How was this patch tested?

    Spark unit tests and manual testing
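The thread does not include the patch itself, so the following is only one possible reading of "accessing a column that contains '.' without backticks": try the whole string as a top-level column name first, and fall back to the dotted path otherwise. Everything here is an assumption made for illustration, including the toy schema types `Col`, `Leaf`, and `Struct` and the function `resolve`; none of it is Spark code. For brevity the fallback splits on every dot and ignores backticks.

```scala
object ExactFirstLookup {
  // Toy schema: a column is either a leaf or a struct of nested columns.
  sealed trait Col
  case object Leaf extends Col
  final case class Struct(fields: Map[String, Col]) extends Col

  /** Follows a path of name parts through nested structs and reports
    * whether the final part names an existing field. */
  private def walk(cols: Map[String, Col], path: List[String]): Boolean =
    path match {
      case Nil         => false
      case last :: Nil => cols.contains(last)
      case head :: rest =>
        cols.get(head) match {
          case Some(Struct(fields)) => walk(fields, rest)
          case _                    => false
        }
    }

  /** Reports how `name` would resolve under an "exact name first" rule:
    * "exact" for a top-level column with that literal name, "nested" for
    * a dotted path into a struct, None if neither matches. */
  def resolve(schema: Map[String, Col], name: String): Option[String] =
    if (schema.contains(name)) Some("exact")
    else if (walk(schema, name.split('.').toList)) Some("nested")
    else None
}
```

With a schema that holds both a column literally named "a.b" and a struct column "a" with a field "b", `resolve(schema, "a.b")` returns `Some("exact")`, so the dotted name shadows the nested field; without the literal column the same call returns `Some("nested")`. That shift in meaning for an unchanged query string is the kind of behaviour change raised in the review.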