[jira] [Comment Edited] (SPARK-12988) Can't drop columns that contain dots
[ https://issues.apache.org/jira/browse/SPARK-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128902#comment-15128902 ]

Dilip Biswal edited comment on SPARK-12988 at 2/2/16 7:56 PM:
--------------------------------------------------------------

The subtle difference between a column path and a column name may not be obvious to a typical user of this API.

    val df = Seq((1, 1)).toDF("a_b", "a.b")
    df.select("`a.b`")
    df.drop("`a.b`")   => would it be obvious to the user that backticks cannot be used here?

I believe that was the motivation for allowing it, but I am not sure of its implications.

was (Author: dkbiswal):
The shuttle difference between column path and column name may not be very obvious to a common user of this API.

    val df = Seq((1, 1)).toDF("a_b", "a.b")
    df.select("`a.b`")
    df.drop("`a.b`") => the fact that one can not use back tick here , would it be that obvious to the user ?

I believe that was the motivation to allow it but then i am not sure of its implications.

> Can't drop columns that contain dots
> ------------------------------------
>
>                 Key: SPARK-12988
>                 URL: https://issues.apache.org/jira/browse/SPARK-12988
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Michael Armbrust
>
> Neither of these works:
> {code}
> val df = Seq((1, 1)).toDF("a_b", "a.c")
> df.drop("a.c").collect()
> df: org.apache.spark.sql.DataFrame = [a_b: int, a.c: int]
> {code}
> {code}
> val df = Seq((1, 1)).toDF("a_b", "a.c")
> df.drop("`a.c`").collect()
> df: org.apache.spark.sql.DataFrame = [a_b: int, a.c: int]
> {code}
> Given that you can't use drop to drop subfields, it seems to me that we
> should treat the column name literally (i.e. as though it is wrapped in
> backticks).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
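The path-versus-name distinction raised in the comment can be illustrated with a small plain-Scala model. This is only a sketch and not Spark's actual parser: the object `ColumnPathSketch` and its `parse` method are hypothetical names introduced here. It assumes that an unquoted dot separates path segments and that backticks mark a segment to be taken literally.

```scala
// Hypothetical model, NOT Spark's parser: split a column reference into
// path segments, treating backtick-quoted text as a literal name.
object ColumnPathSketch {
  def parse(ref: String): List[String] = {
    val parts = scala.collection.mutable.ListBuffer.empty[String]
    val current = new StringBuilder
    var quoted = false
    for (ch <- ref) ch match {
      case '`' =>
        quoted = !quoted            // toggle literal mode; the tick itself is dropped
      case '.' if !quoted =>
        parts += current.toString   // an unquoted dot ends the current segment
        current.clear()
      case other =>
        current += other            // any other character (or a quoted dot) is kept
    }
    parts += current.toString
    parts.toList
  }

  def main(args: Array[String]): Unit = {
    println(parse("a.b"))     // List(a, b): a path, field b inside a
    println(parse("`a.b`"))   // List(a.b): one column whose name contains a dot
  }
}
```

Under this reading, `"a.b"` and `` "`a.b`" `` refer to different things, which is the distinction a caller of `select` has to be aware of and which `drop` did not honor at the time of the report.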
[jira] [Comment Edited] (SPARK-12988) Can't drop columns that contain dots
[ https://issues.apache.org/jira/browse/SPARK-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118319#comment-15118319 ]

Dilip Biswal edited comment on SPARK-12988 at 1/27/16 12:02 AM:
----------------------------------------------------------------

[~marmbrus]

Hi Michael, I need your input on the semantics. Say we have a DataFrame defined as follows:

    val df = Seq((1, 1, 1)).toDF("a_b", "a.c", "`a.c`")
    df.drop("a.c")     => Should we remove the 2nd column here?
    df.drop("`a.c`")   => Should we remove the 3rd column here?

Regards,
-- Dilip

was (Author: dkbiswal):
[~marmbrus]

Hi Michael, need your input on the semantics. Say we have a dataframe defined like following :

    val df = Seq((1, 1,1,1,1,1)).toDF("a_b", "a.c", "`a.c`")
    df.drop("a.c") => Should we remove the 2nd column here ?
    df.drop("`a.c`") => Should we remove the 3rd column here ?

Regards,
-- Dilip
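The two possible answers to the question above can be compared on a plain list of column names. This is a minimal sketch, not Spark code: `DropSemanticsSketch`, `dropLiteral` and `dropUnquoted` are hypothetical names, and the schema is modelled as a `Seq[String]` rather than a DataFrame.

```scala
// Hypothetical model of two readings of drop(colName); not Spark's implementation.
object DropSemanticsSketch {
  // Reading 1: the argument is the literal column name, as proposed in the issue.
  def dropLiteral(columns: Seq[String], name: String): Seq[String] =
    columns.filterNot(_ == name)

  // Reading 2: one pair of enclosing backticks is stripped before matching,
  // the way select-style quoting would treat the argument.
  def dropUnquoted(columns: Seq[String], ref: String): Seq[String] = {
    val name =
      if (ref.length >= 2 && ref.startsWith("`") && ref.endsWith("`"))
        ref.substring(1, ref.length - 1)
      else ref
    columns.filterNot(_ == name)
  }

  def main(args: Array[String]): Unit = {
    val cols = Seq("a_b", "a.c", "`a.c`")
    println(dropLiteral(cols, "a.c"))      // List(a_b, `a.c`): 2nd column removed
    println(dropLiteral(cols, "`a.c`"))    // List(a_b, a.c): 3rd column removed
    println(dropUnquoted(cols, "`a.c`"))   // List(a_b, a.c): 2nd column removed
  }
}
```

With the literal reading, each argument names exactly one column. With the unquoting reading, `` "`a.c`" `` removes the second column, and the third column cannot be reached with that argument at all.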