As you've pointed out, Rating requires user and item ids in Int form. So you will need to map String user ids to integers.
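To make the mapping idea concrete, here is a minimal sketch in plain Scala collections, so it runs without a Spark cluster. RawRating, IntRating and buildIndex are hypothetical names used only for illustration; they are not part of Spark. On a real RDD the same idea would probably be built from distinct() and zipWithIndex() (which yields Long indices that would need converting to Int) followed by a join, but that part is an assumption about your setup rather than something tested here.

```scala
// Hypothetical stand-ins for illustration; not Spark classes.
final case class RawRating(user: String, item: String, rating: Double)
final case class IntRating(user: Int, item: Int, rating: Double)

object StringIdMapping {
  // Assign each distinct string id a dense integer index, in first-seen order.
  def buildIndex(ids: Seq[String]): Map[String, Int] =
    ids.distinct.zipWithIndex.toMap

  // Replace the string ids in every rating with their integer indices.
  def toIntRatings(
      raw: Seq[RawRating],
      userIndex: Map[String, Int],
      itemIndex: Map[String, Int]): Seq[IntRating] =
    raw.map(r => IntRating(userIndex(r.user), itemIndex(r.item), r.rating))

  def main(args: Array[String]): Unit = {
    val raw = Seq(
      RawRating("foo", "1", 4.0),
      RawRating("foo", "2", 2.0),
      RawRating("bar", "1", 5.0),
      RawRating("bar", "3", 1.0)
    )

    val userIndex = buildIndex(raw.map(_.user))
    val itemIndex = buildIndex(raw.map(_.item))
    val mapped = toIntRatings(raw, userIndex, itemIndex)

    // Keep the inverse map so results can be translated back to names.
    val userNames: Map[Int, String] = userIndex.map(_.swap)

    mapped.foreach(println)
    println(userNames)
  }
}
```

The integer triples can then be fed to the Int-based Rating class, and the inverse map is what lets you turn recommended ids back into the original names.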
For an example of that mapping, see this thread:
https://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CCAJgQjQ9GhGqpg1=hvxpfrs+59elfj9f7knhp8nyqnh1ut_6...@mail.gmail.com%3E

Alternatively, there is a DeveloperApi method in org.apache.spark.ml.recommendation.ALS that takes Rating with a generic type (which can be String) for the user id and item id. However, that is a little more involved, and for larger-scale data it will be a lot less efficient.

Something like this, for example:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.ml.recommendation.ALS
    import org.apache.spark.ml.recommendation.ALS.Rating

    val conf = new SparkConf().setAppName("ALSWithStringID").setMaster("local[4]")
    val sc = new SparkContext(conf)

    // Sample data in the same shape as the CSV: Name,Value1,Value2.
    val rdd = sc.parallelize(Seq(
      Rating[String]("foo", "1", 4.0f),
      Rating[String]("foo", "2", 2.0f),
      Rating[String]("bar", "1", 5.0f),
      Rating[String]("bar", "3", 1.0f)
    ))

    // train returns a pair of factor RDDs (user factors, item factors).
    val (userFactors, itemFactors) = ALS.train(rdd)

As you can see, you just get the factor RDDs back, and if you want an ALSModel you will have to construct it yourself.

On Sun, 6 Mar 2016 at 18:23 Shishir Anshuman <shishiranshu...@gmail.com> wrote:

> I am new to apache Spark, and I want to implement the Alternating Least
> Squares algorithm. The data set is stored in a csv file in the format:
> *Name,Value1,Value2*.
>
> When I read the csv file, I get
> *java.lang.NumberFormatException.forInputString* error because the Rating
> class needs the parameters in the format: *(user: Int, product: Int,
> rating: Double)* and the first column of my file contains *Name*.
>
> Please suggest me a way to overcome this issue.