Maher Hattabi created SPARK-20987:
-------------------------------------

             Summary: columns with name having dots caused issues with VectorAssemblor
                 Key: SPARK-20987
                 URL: https://issues.apache.org/jira/browse/SPARK-20987
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.0.2
            Reporter: Maher Hattabi


Hello,
I ran the code below on a dataset whose column names contain dots. Here is the dataset:
"col0.1","col1.2","col2.3","col3.4"
1,2,3,4
10,12,15,3
1,12,10,5
Here is the code I used:
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.mllib.linalg.{Matrix, SingularValueDecomposition, Vector}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local").appName("my-spark-app").getOrCreate()
val df = spark.read.format("csv")
  .options(Map("header" -> "true", "inferSchema" -> "true"))
  .load("C:/Users/mhattabi/Desktop/donnee/test.txt")

// Assemble all columns into one vector column (the step this report is about).
val rows = new VectorAssembler()
  .setInputCols(df.columns)
  .setOutputCol("vs")
  .transform(df)
  .select("vs")
  .rdd

// Convert the ml vectors to mllib vectors, which RowMatrix expects.
val data = rows
  .map(_.getAs[org.apache.spark.ml.linalg.Vector](0))
  .map(org.apache.spark.mllib.linalg.Vectors.fromML)

val mat: RowMatrix = new RowMatrix(data)
// Compute all singular values (k = number of columns) and the corresponding singular vectors.
val svd: SingularValueDecomposition[RowMatrix, Matrix] = mat.computeSVD(mat.numCols().toInt, computeU = true)
val U: RowMatrix = svd.U  // The U factor is a RowMatrix.
val s: Vector = svd.s     // The singular values are stored in a local dense vector.
val V: Matrix = svd.V     // The V factor is a local dense matrix.
Here is the error:

org.apache.spark.sql.AnalysisException: Cannot resolve column name "col0.1" among (col0.1, col1.2, col2.3, col3.4);
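
A possible workaround, sketched under assumptions and not verified against 2.0.2: when a column is looked up by name, a dot is most likely being read as a nested-field accessor, so renaming the columns before they reach VectorAssembler should sidestep the lookup. The helper name sanitizeColumnNames is hypothetical and not part of Spark; Dataset.toDF(colNames: String*) is the existing Spark call it would feed, shown only in comments so the snippet runs without a Spark session.

```scala
// Hypothetical helper (not part of Spark): replaces every dot in a column
// name with an underscore so the name is no longer read as a nested field.
def sanitizeColumnNames(names: Seq[String]): Seq[String] =
  names.map(_.replace(".", "_"))

// Column names taken from the dataset in this report.
val original = Seq("col0.1", "col1.2", "col2.3", "col3.4")
val renamed  = sanitizeColumnNames(original)

// With the DataFrame from the report, the rename would be applied as:
//   val cleanDf = df.toDF(sanitizeColumnNames(df.columns): _*)
//   new VectorAssembler().setInputCols(cleanDf.columns).setOutputCol("vs").transform(cleanDf)
println(renamed.mkString(", "))
```

Note that this simple replacement can produce duplicate names if the data already has a column such as "col0_1", so a real dataset may need a different separator.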



