Re: [ML] Converting ml.DenseVector to mllib.Vector
This may also help: http://spark.apache.org/docs/latest/ml-migration-guides.html

On Sat, Dec 31, 2016 at 6:51 AM, Marco Mistroni wrote:

> Hi.
> You have a DataFrame, so there should be a way to either
> - convert a DF to a Vector without doing a cast, or
> - use an ML library that relies on DataFrames only.
>
> I can see that your code is still importing from two different
> machine learning packages:
>
>     import org.apache.spark.ml.feature.{MinMaxScaler, Normalizer, StandardScaler, VectorAssembler}
>     import org.apache.spark.mllib.linalg.{DenseVector, Vector, Vectors}
>
> You should be able to find exactly the same data structures that you had in
> mllib under the ml package. I'd advise sticking to the ml libraries only;
> that will avoid confusion.
>
> I concur with you, this line looks dodgy to me:
>
>     val rddVec = dfScaled
>       .select("scaled_features")
>       .rdd
>       .map(_(0).asInstanceOf[org.apache.spark.mllib.linalg.Vector])
>
> Converting a DF to a Vector is not as simple as doing a cast (like you
> would do in Java).
>
> I did a random search and found this, maybe it'll help:
>
> https://community.hortonworks.com/questions/33375/how-to-convert-a-dataframe-to-a-vectordense-in-sca.html
>
> hth
> marco
>
> On Sat, Dec 31, 2016 at 4:24 AM, Jason Wolosonovich wrote:
>
>> Hello All,
>>
>> I'm working through the Data Science with Scala course on Big Data
>> University and it is not updated to work with Spark 2.0, so I'm adapting
>> the code as I work through it, however I've finally run into something that
>> is over my head. I'm new to Scala as well.
>>
>> When I run this code
>> (https://gist.github.com/jmwoloso/a715cc4d7f1e7cc7951fab4edf6218b1)
>> I get the following error:
>>
>> `java.lang.ClassCastException: org.apache.spark.ml.linalg.DenseVector
>> cannot be cast to org.apache.spark.mllib.linalg.Vector`
>>
>> I believe this is occurring at line 107 of the gist above. The code
>> starting at this line (and continuing to the end of the gist) is the
>> current code in the course.
>>
>> If I try to map to any other class type, then I have problems with the
>> `Statistics.corr(rddVec)`.
>>
>> How can I convert `rddVec` from an `ml.linalg.DenseVector` into an
>> `mllib.linalg.Vector` for use with `Statistics`?
>>
>> Thanks!
>>
>> -Jason
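For what it's worth, the migration guide linked in this message covers converting between the two vector types. Below is a minimal sketch of the DataFrame-level route, assuming Spark 2.0 or later. The object name, the helper `toOldRdd`, and the toy data standing in for the gist's `dfScaled` are all made up for illustration; only `MLUtils.convertVectorColumnsFromML` and `Statistics.corr` are Spark APIs.

```scala
import org.apache.spark.ml.linalg.{Vectors => NewVectors}
import org.apache.spark.mllib.linalg.{Vector => OldVector}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object ConvertColumnSketch {
  // Hypothetical helper: rewrite one vector column from ml.linalg to
  // mllib.linalg, then expose it as an RDD of mllib vectors.
  def toOldRdd(df: DataFrame, col: String): RDD[OldVector] = {
    val converted = MLUtils.convertVectorColumnsFromML(df, col)
    // The rows now hold mllib vectors, so reading them as such is safe.
    converted.select(col).rdd.map(_.getAs[OldVector](0))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("convert-column-sketch")
      .getOrCreate()

    // Toy stand-in for dfScaled: a single column of ml.linalg vectors.
    val dfScaled = spark.createDataFrame(Seq(
      Tuple1(NewVectors.dense(0.0, 1.0, 0.5)),
      Tuple1(NewVectors.dense(0.5, 0.0, 1.0)),
      Tuple1(NewVectors.dense(1.0, 0.5, 0.0))
    )).toDF("scaled_features")

    // Pearson correlation matrix, computed by the RDD-based API.
    println(Statistics.corr(toOldRdd(dfScaled, "scaled_features")))

    spark.stop()
  }
}
```

The conversion happens on the column before the rows are read, so no `asInstanceOf` cast is involved.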
Re: [ML] Converting ml.DenseVector to mllib.Vector
Hi.
You have a DataFrame, so there should be a way to either
- convert a DF to a Vector without doing a cast, or
- use an ML library that relies on DataFrames only.

I can see that your code is still importing from two different
machine learning packages:

    import org.apache.spark.ml.feature.{MinMaxScaler, Normalizer, StandardScaler, VectorAssembler}
    import org.apache.spark.mllib.linalg.{DenseVector, Vector, Vectors}

You should be able to find exactly the same data structures that you had in
mllib under the ml package. I'd advise sticking to the ml libraries only;
that will avoid confusion.

I concur with you, this line looks dodgy to me:

    val rddVec = dfScaled
      .select("scaled_features")
      .rdd
      .map(_(0).asInstanceOf[org.apache.spark.mllib.linalg.Vector])

Converting a DF to a Vector is not as simple as doing a cast (like you
would do in Java).

I did a random search and found this, maybe it'll help:

https://community.hortonworks.com/questions/33375/how-to-convert-a-dataframe-to-a-vectordense-in-sca.html

hth
marco

On Sat, Dec 31, 2016 at 4:24 AM, Jason Wolosonovich wrote:

> Hello All,
>
> I'm working through the Data Science with Scala course on Big Data
> University and it is not updated to work with Spark 2.0, so I'm adapting
> the code as I work through it, however I've finally run into something that
> is over my head. I'm new to Scala as well.
>
> When I run this code
> (https://gist.github.com/jmwoloso/a715cc4d7f1e7cc7951fab4edf6218b1)
> I get the following error:
>
> `java.lang.ClassCastException: org.apache.spark.ml.linalg.DenseVector
> cannot be cast to org.apache.spark.mllib.linalg.Vector`
>
> I believe this is occurring at line 107 of the gist above. The code
> starting at this line (and continuing to the end of the gist) is the
> current code in the course.
>
> If I try to map to any other class type, then I have problems with the
> `Statistics.corr(rddVec)`.
>
> How can I convert `rddVec` from an `ml.linalg.DenseVector` into an
> `mllib.linalg.Vector` for use with `Statistics`?
>
> Thanks!
>
> -Jason
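On the "stick to ml only" suggestion: the ml package later gained its own correlation helper, `org.apache.spark.ml.stat.Correlation`, which works directly on a DataFrame column of ml vectors. It was added in Spark 2.2, so it would not have been available on the Spark 2.0 setup discussed in this thread. A rough sketch, assuming Spark 2.2 or later; the object name, the helper `pearson`, and the toy data are hypothetical:

```scala
import org.apache.spark.ml.linalg.{Matrix, Vectors}
import org.apache.spark.ml.stat.Correlation
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object MlOnlyCorrSketch {
  // Hypothetical helper: Pearson correlation of one ml vector column.
  // Correlation.corr returns a one-row DataFrame whose single cell is the matrix.
  def pearson(df: DataFrame, col: String): Matrix = {
    val Row(m: Matrix) = Correlation.corr(df, col).head
    m
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ml-only-corr-sketch")
      .getOrCreate()

    // Toy stand-in for dfScaled: a single column of ml.linalg vectors.
    val dfScaled = spark.createDataFrame(Seq(
      Tuple1(Vectors.dense(0.0, 1.0, 0.5)),
      Tuple1(Vectors.dense(0.5, 0.0, 1.0)),
      Tuple1(Vectors.dense(1.0, 0.5, 0.0))
    )).toDF("scaled_features")

    println(pearson(dfScaled, "scaled_features"))

    spark.stop()
  }
}
```

With this route nothing from `org.apache.spark.mllib` is imported, so the two vector types never meet.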
[ML] Converting ml.DenseVector to mllib.Vector
Hello All,

I'm working through the Data Science with Scala course on Big Data
University and it is not updated to work with Spark 2.0, so I'm adapting
the code as I work through it, however I've finally run into something that
is over my head. I'm new to Scala as well.

When I run this code
(https://gist.github.com/jmwoloso/a715cc4d7f1e7cc7951fab4edf6218b1)
I get the following error:

`java.lang.ClassCastException: org.apache.spark.ml.linalg.DenseVector
cannot be cast to org.apache.spark.mllib.linalg.Vector`

I believe this is occurring at line 107 of the gist above. The code
starting at this line (and continuing to the end of the gist) is the
current code in the course.

If I try to map to any other class type, then I have problems with the
`Statistics.corr(rddVec)`.

How can I convert `rddVec` from an `ml.linalg.DenseVector` into an
`mllib.linalg.Vector` for use with `Statistics`?

Thanks!

-Jason

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
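A note on the question itself: `ml.linalg.DenseVector` and `mllib.linalg.Vector` are unrelated classes in different packages, which is why `asInstanceOf` fails at runtime. One way to convert each row explicitly is `org.apache.spark.mllib.linalg.Vectors.fromML`, available since Spark 2.0. The sketch below assumes Spark 2.0 or later; the object name, the helper `toOld`, and the toy DataFrame standing in for `dfScaled` are invented for illustration and are not taken from the gist.

```scala
import org.apache.spark.ml.linalg.{Vector => MLVector, Vectors => MLVectors}
import org.apache.spark.mllib.linalg.{Vector => OldVector, Vectors => OldVectors}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object FromMLSketch {
  // Hypothetical helper: convert a single ml vector into an mllib vector.
  def toOld(v: MLVector): OldVector = OldVectors.fromML(v)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("from-ml-sketch")
      .getOrCreate()

    // Toy stand-in for dfScaled: a single column of ml.linalg vectors.
    val dfScaled = spark.createDataFrame(Seq(
      Tuple1(MLVectors.dense(0.0, 1.0, 0.5)),
      Tuple1(MLVectors.dense(0.5, 0.0, 1.0)),
      Tuple1(MLVectors.dense(1.0, 0.5, 0.0))
    )).toDF("scaled_features")

    // Read each cell as the type it really is (ml.linalg.Vector),
    // then convert it instead of casting.
    val rddVec: RDD[OldVector] = dfScaled
      .select("scaled_features")
      .rdd
      .map(row => toOld(row.getAs[MLVector](0)))

    // Statistics.corr expects an RDD of mllib vectors.
    println(Statistics.corr(rddVec))

    spark.stop()
  }
}
```

Compared with the failing line, the only change is inside the `map`: the cell is read as an ml vector and passed through `fromML` rather than being cast to the mllib type.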