Maher Hattabi created SPARK-20987:
-------------------------------------

             Summary: columns with name having dots caused issues with VectorAssembler
                 Key: SPARK-20987
                 URL: https://issues.apache.org/jira/browse/SPARK-20987
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.0.2
            Reporter: Maher Hattabi

Hello,

I ran the code below on a dataset whose column names contain dots. Here is the dataset:

"col0.1","col1.2","col2.3","col3.4"
1,2,3,4
10,12,15,3
1,12,10,5

Here is the code I used:

import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.mllib.linalg.{Matrix, SingularValueDecomposition, Vector}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local").appName("my-spark-app").getOrCreate()

// Read the CSV with a header row and inferred column types.
val df = spark.read.format("csv")
  .options(Map("header" -> "true", "inferSchema" -> "true"))
  .load("C:/Users/mhattabi/Desktop/donnee/test.txt")

// Assemble all columns into a single vector column named "vs".
val rows = new VectorAssembler()
  .setInputCols(df.columns)
  .setOutputCol("vs")
  .transform(df)
  .select("vs")
  .rdd

// Convert the ml vectors to mllib vectors, as RowMatrix requires.
val data = rows
  .map(_.getAs[org.apache.spark.ml.linalg.Vector](0))
  .map(org.apache.spark.mllib.linalg.Vectors.fromML)

val mat: RowMatrix = new RowMatrix(data)

// Compute all singular values (k = number of columns) and the corresponding singular vectors.
val svd: SingularValueDecomposition[RowMatrix, Matrix] = mat.computeSVD(mat.numCols().toInt, computeU = true)
val U: RowMatrix = svd.U // The U factor is a RowMatrix.
val s: Vector = svd.s    // The singular values are stored in a local dense vector.
val V: Matrix = svd.V    // The V factor is a local dense matrix.

Here is the error I get:

org.apache.spark.sql.AnalysisException: Cannot resolve column name "col0.1" among (col0.1, col1.2, col2.3, col3.4);

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
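
A possible workaround, sketched here under assumptions and not taken from the report: Spark SQL generally treats a dot in an unquoted column name as a qualifier or nested-field accessor, which would explain why "col0.1" cannot be resolved even though it appears in the list of columns. Renaming the columns before they reach VectorAssembler sidesteps that lookup. In the sketch below, `sanitize` and `withSanitizedColumns` are hypothetical helper names, the underscore replacement is an arbitrary choice, and a small in-memory DataFrame stands in for the CSV file.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.{DataFrame, SparkSession}

object DottedColumnsWorkaround {
  // Hypothetical helper: replace every dot in a column name with an underscore.
  def sanitize(name: String): String = name.replace(".", "_")

  // Rename all columns by position; toDF assigns the new names in order,
  // so the old dotted names never have to be resolved by name.
  def withSanitizedColumns(df: DataFrame): DataFrame =
    df.toDF(df.columns.map(sanitize): _*)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local").appName("dotted-columns").getOrCreate()
    import spark.implicits._

    // In-memory stand-in for the CSV in the report (assumption: same values and headers).
    val df = Seq((1, 2, 3, 4), (10, 12, 15, 3), (1, 12, 10, 5))
      .toDF("col0.1", "col1.2", "col2.3", "col3.4")

    val clean = withSanitizedColumns(df)

    // With dot-free names, the assembler can look up each input column.
    val assembled = new VectorAssembler()
      .setInputCols(clean.columns)
      .setOutputCol("vs")
      .transform(clean)

    assembled.select("vs").show(truncate = false)
    spark.stop()
  }
}
```

The rest of the pipeline (conversion with Vectors.fromML, RowMatrix, computeSVD) would then run on `assembled` unchanged. One caveat of this approach: if the data already contains both "a.b" and "a_b", the renamed columns would collide, so a different replacement character may be needed.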