Re: [ML] Converting ml.DenseVector to mllib.Vector

2016-12-31 Thread Peyman Mohajerian
This may also help:
http://spark.apache.org/docs/latest/ml-migration-guides.html
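
One route, assuming Spark 2.0 or later, is `MLUtils.convertVectorColumnsFromML`, which converts `ml` vector columns of a DataFrame into `mllib` vectors. A minimal sketch, not run against the gist: `scaled_features` is the column name used in this thread's code, and the helper name `toMllibColumn` is hypothetical.

```scala
import org.apache.spark.mllib.linalg.{Vector => MLlibVector}
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

object ConvertColumns {
  // Hypothetical helper: converts one ml.linalg vector column to
  // mllib.linalg and returns the vectors as an RDD.
  def toMllibColumn(df: DataFrame, col: String): RDD[MLlibVector] = {
    // Rewrites the named column in place; other columns are untouched.
    val converted: DataFrame = MLUtils.convertVectorColumnsFromML(df, col)
    converted.select(col).rdd.map(_.getAs[MLlibVector](0))
  }
}
```

With the thread's DataFrame this would be called as `ConvertColumns.toMllibColumn(dfScaled, "scaled_features")`.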



Re: [ML] Converting ml.DenseVector to mllib.Vector

2016-12-31 Thread Marco Mistroni
Hi.
You have a DataFrame, so there should be a way to either
- convert a DF to a Vector without doing a cast, or
- use an ML library which relies on DataFrames only.

I can see that your code is still importing libraries from two different
'machine learning' packages:

import org.apache.spark.ml.feature.{MinMaxScaler, Normalizer,
StandardScaler, VectorAssembler}
import org.apache.spark.mllib.linalg.{DenseVector, Vector, Vectors}

You should be able to find exactly the same data structures that you had in
mllib under the ml package. I'd advise sticking to the ml libraries only;
that will avoid confusion.
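
For reference, the `ml` counterparts of the `mllib.linalg` types live in `org.apache.spark.ml.linalg` (Spark 2.0 and later). A minimal sketch; the object name and values are made up for illustration:

```scala
import org.apache.spark.ml.linalg.{DenseVector, Vector, Vectors}

object MlVectorsExample {
  // Illustrative values only.
  val dense: Vector = Vectors.dense(1.0, 0.0, 3.0)
  val sparse: Vector = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))

  // A sparse vector can be expanded to the dense representation.
  val expanded: DenseVector = sparse.toDense

  // Both hold the same data, so their dense arrays match.
  val same: Boolean = dense.toArray.sameElements(expanded.toArray)
}
```

Note that `Statistics.corr` in `org.apache.spark.mllib.stat` still takes an RDD of `mllib` vectors, so a conversion step is needed before calling it.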

I concur with you, this line looks dodgy to me:

val rddVec = dfScaled
  .select("scaled_features")
  .rdd
  .map(_(0).asInstanceOf[org.apache.spark.mllib.linalg.Vector])

Converting a DF to a Vector is not as simple as doing a cast (as you
would in Java).
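
One way to replace the cast, assuming Spark 2.0 or later, is to read each row's value as an `ml` vector and convert it explicitly with `Vectors.fromML` from the `mllib` package. A minimal sketch, not run against the gist; the helper names `toMllibVectors` and `correlation` are hypothetical:

```scala
import org.apache.spark.ml.linalg.{Vector => MLVector}
import org.apache.spark.mllib.linalg.{Matrix, Vector => MLlibVector, Vectors => MLlibVectors}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

object FromMLExample {
  // Reads the column as ml.linalg vectors, then converts each one to
  // mllib.linalg with an explicit conversion rather than a cast.
  def toMllibVectors(df: DataFrame, col: String): RDD[MLlibVector] =
    df.select(col)
      .rdd
      .map(row => MLlibVectors.fromML(row.getAs[MLVector](0)))

  // Statistics.corr expects an RDD of mllib vectors.
  def correlation(df: DataFrame, col: String): Matrix =
    Statistics.corr(toMllibVectors(df, col))
}
```

With the code above, `rddVec` would become `FromMLExample.toMllibVectors(dfScaled, "scaled_features")`.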

I did a random search and found this; maybe it'll help:

https://community.hortonworks.com/questions/33375/how-to-convert-a-dataframe-to-a-vectordense-in-sca.html




hth
 marco





[ML] Converting ml.DenseVector to mllib.Vector

2016-12-30 Thread Jason Wolosonovich

Hello All,

I'm working through the Data Science with Scala course on Big Data
University. It has not been updated to work with Spark 2.0, so I'm adapting
the code as I work through it, but I've finally run into something
that is over my head. I'm new to Scala as well.


When I run this code 
(https://gist.github.com/jmwoloso/a715cc4d7f1e7cc7951fab4edf6218b1) I 
get the following error:


`java.lang.ClassCastException: org.apache.spark.ml.linalg.DenseVector 
cannot be cast to org.apache.spark.mllib.linalg.Vector`


I believe this is occurring at line 107 of the gist above. The code 
starting at this line (and continuing to the end of the gist) is the 
current code in the course.


If I try to map to any other class type, then I have problems with
`Statistics.corr(rddVec)`.


How can I convert `rddVec` from an `ml.linalg.DenseVector` into an 
`mllib.linalg.Vector` for use with `Statistics`?


Thanks!

-Jason
