Re: Does Spark.ml LogisticRegression assumes only Double valued features?

sethah Wed, 09 Sep 2015 16:00:06 -0700

When you pass a DataFrame to the `fit` method of LogisticRegression (which
in turn calls `train`) or of the other Spark ML algorithms, the data is read
from the columns named by the parameters `labelCol` and `featuresCol`. These
should be set before calling the method (they default to "label" and
"features", respectively). The `featuresCol` column should be of Vector
type, with Double elements. Before training starts, the method verifies that
`featuresCol` is of type Vector and that `labelCol` is of type Double. It
will throw an exception if other data types are found.


Spark ML has special ways of handling features that are not inherently
continuous or numerical. I urge you to review this question on Stack
Overflow, which covers the topic quite well:

http://stackoverflow.com/questions/32277576/spark-ml-categorical-features


