[
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15442870#comment-15442870
]
Felix Cheung edited comment on SPARK-17214 at 8/28/16 6:15 AM:
---------------------------------------------------------------
I think the underlying issue is that we should either handle column names containing
`.` correctly (preferred) or translate them uniformly, as in other cases (e.g.
`as.DataFrame`).
As of now, a DataFrame from a csv source can have `.` in its column names, and it is
unusable until the columns are renamed (which is a known issue):
{code}
> iris_sdf<-read.df("iris.csv","csv",header="true",inferSchema="true")
> iris_sdf
SparkDataFrame[Sepal.Length:double, Sepal.Width:double, Petal.Length:double,
Petal.Width:double, Species:string]
> head(select(iris_sdf,iris_sdf$Sepal.Length))
16/08/28 06:11:16 ERROR RBackendHandler: col on 46 failed
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
org.apache.spark.sql.AnalysisException: Cannot resolve column name
"Sepal.Length" among (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width,
Species);
{code}
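One way to sidestep the problem until it is handled in SparkR is to translate the dots before the file is ever written, so the csv header never contains them. The following is a minimal sketch in base R only; `sanitize_names` is a hypothetical helper introduced here for illustration and is not part of SparkR, and the temporary file path is likewise an assumption:

```r
# Hypothetical helper (not a SparkR function): replace every literal "."
# in a vector of column names with "_".
sanitize_names <- function(x) gsub(".", "_", x, fixed = TRUE)

# Work on a copy so the built-in iris dataset is left untouched.
local_iris <- iris
names(local_iris) <- sanitize_names(names(local_iris))

# Write the renamed data; the header row now contains no dots.
csv_path <- tempfile(fileext = ".csv")
write.csv(local_iris, csv_path, row.names = FALSE)

# Read only the header back with base R to check what a csv reader will see.
header <- names(read.csv(csv_path, nrows = 1, check.names = FALSE))
print(header)
```

The resulting file could then be loaded with `read.df(csv_path, "csv", header="true", inferSchema="true")` as in the example above, with the expectation that the columns resolve because their names no longer contain `.`.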
> How to deal with dots (.) present in column names in SparkR
> -----------------------------------------------------------
>
> Key: SPARK-17214
> URL: https://issues.apache.org/jira/browse/SPARK-17214
> Project: Spark
> Issue Type: Bug
> Reporter: Mohit Bansal
>
> I am trying to load a local csv file into SparkR, which contains dots in
> column names. After reading the file I tried to change the names and replaced
> "." with "_". Still I am not able to do any operation on the created SDF.
> Here is the reproducible code:
> -------------------------------------------------------------------------------
> #writing iris dataset to local
> write.csv(iris,"iris.csv",row.names=F)
> #reading it back using read.df
> iris_sdf<-read.df("iris.csv","csv",header="true",inferSchema="true")
> #changing column names
> names(iris_sdf)<-c("Sepal_Length","Sepal_Width","Petal_Length","Petal_Width","Species")
> #selecting required columns
> head(select(iris_sdf,iris_sdf$Sepal_Length,iris_sdf$Sepal_Width))
> ---------------------------------------------------------------------------------
> 16/08/24 13:51:24 ERROR RBackendHandler: dfToCols on
> org.apache.spark.sql.api.r.SQLUtils failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
> org.apache.spark.sql.AnalysisException: Unable to resolve Sepal.Length
> given [Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species];
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1$$anonfun$apply$5.apply(LogicalPlan.scala:134)
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1$$anonfun$apply$5.apply(LogicalPlan.scala:134)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1.apply(LogicalPlan.scala:133)
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1.apply(LogicalPlan.scala:129)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> at scala.collection.IterableLike$cl
> What should I do to get it to work?