Hi, I see ALS is still using Array[Int], but for the other MLlib algorithms we moved to Vector[Double] so that they can support both dense and sparse formats.
I know ALS can stay with Array[Int] because the Netflix format for input datasets is well defined, but it would help to move ALS to Vector[Double] as well. That way all algorithms will be consistent.

Does that make sense?

Thanks.
Deb

On Mon, May 5, 2014 at 4:05 PM, David Hall <d...@cs.berkeley.edu> wrote:

> On Mon, May 5, 2014 at 3:40 PM, DB Tsai <dbt...@stanford.edu> wrote:
>
> > David,
> >
> > Could we use Int, Long, Float as the data feature spaces, and Double for
> > optimizer?
>
> Yes. Breeze doesn't allow operations on mixed types, so you'd need to
> convert the double vectors to Floats if you wanted, e.g. dot product with
> the weights vector.
>
> You might also be interested in FeatureVector, which is just a wrapper
> around Array[Int] that emulates an indicator vector. It supports dot
> products, axpy, etc.
>
> -- David
>
> > Sincerely,
> >
> > DB Tsai
> > -------------------------------------------------------
> > My Blog: https://www.dbtsai.com
> > LinkedIn: https://www.linkedin.com/in/dbtsai
> >
> > On Mon, May 5, 2014 at 3:06 PM, David Hall <d...@cs.berkeley.edu> wrote:
> >
> > > Lbfgs and other optimizers would not work immediately, as they require
> > > vector spaces over double. Otherwise it should work.
> > >
> > > On May 5, 2014 3:03 PM, "DB Tsai" <dbt...@stanford.edu> wrote:
> > >
> > > > Breeze could take any type (Int, Long, Double, and Float) in the
> > > > matrix template.
> > > >
> > > > Sincerely,
> > > >
> > > > DB Tsai
> > > >
> > > > On Mon, May 5, 2014 at 2:56 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> > > >
> > > > > Is this a breeze issue or breeze can take templates on float /
> > > > > double ?
> > > > >
> > > > > If breeze can take templates then it is a minor fix for
> > > > > Vectors.scala right ?
> > > > >
> > > > > Thanks.
> > > > > Deb
> > > > >
> > > > > On Mon, May 5, 2014 at 2:45 PM, DB Tsai <dbt...@stanford.edu> wrote:
> > > > >
> > > > > > +1 Would be nice that we can use different type in Vector.
> > > > > >
> > > > > > Sincerely,
> > > > > >
> > > > > > DB Tsai
> > > > > >
> > > > > > On Mon, May 5, 2014 at 2:41 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Why mllib vector is using double as default ?
> > > > > > >
> > > > > > > /**
> > > > > > >  * Represents a numeric vector, whose index type is Int and value
> > > > > > >  * type is Double.
> > > > > > >  */
> > > > > > > trait Vector extends Serializable {
> > > > > > >
> > > > > > >   /**
> > > > > > >    * Size of the vector.
> > > > > > >    */
> > > > > > >   def size: Int
> > > > > > >
> > > > > > >   /**
> > > > > > >    * Converts the instance to a double array.
> > > > > > >    */
> > > > > > >   def toArray: Array[Double]
> > > > > > >
> > > > > > > Don't we need a template on float/double ? This will give us
> > > > > > > memory savings...
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > Deb
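
For concreteness, here is a rough sketch of the two ideas discussed in the thread: a vector trait templated on its value type (so features can be stored as Float while the weights stay Double), and an indicator vector backed by Array[Int]. It is plain Scala with no Breeze or MLlib dependency. The names NumericVector, DenseNumericVector, IndicatorVector and VectorSketch are hypothetical and chosen only for illustration; they are not the MLlib or Breeze APIs, and IndicatorVector is only loosely modelled on the description of FeatureVector above.

```scala
// Hypothetical sketch: none of these types exist in MLlib or Breeze.

// A vector trait parameterised on the value type V instead of fixing Double.
trait NumericVector[V] extends Serializable {
  /** Size of the vector. */
  def size: Int

  /** Returns the underlying values as an array of V. */
  def toArray: Array[V]
}

// Dense implementation. When built from an Array[Float], the values are
// stored as 4-byte floats rather than 8-byte doubles.
final class DenseNumericVector[V](val values: Array[V]) extends NumericVector[V] {
  def size: Int = values.length
  def toArray: Array[V] = values
}

// Indicator vector: only the indices of the active (value 1) entries are kept.
final class IndicatorVector(val activeIndices: Array[Int], val size: Int)
    extends Serializable

object VectorSketch {
  // Dot product of Float features with Double weights. Each feature is
  // widened to Double explicitly, since mixed-type operations are assumed
  // not to be available out of the box.
  def dot(features: NumericVector[Float], weights: Array[Double]): Double = {
    require(features.size == weights.length, "dimension mismatch")
    val xs = features.toArray
    var sum = 0.0
    var i = 0
    while (i < xs.length) {
      sum += xs(i).toDouble * weights(i)
      i += 1
    }
    sum
  }

  // Dot product of an indicator vector with Double weights: the sum of the
  // weights at the active indices.
  def dot(features: IndicatorVector, weights: Array[Double]): Double = {
    require(features.size == weights.length, "dimension mismatch")
    var sum = 0.0
    var i = 0
    while (i < features.activeIndices.length) {
      sum += weights(features.activeIndices(i))
      i += 1
    }
    sum
  }
}
```

In this sketch the optimizer side still sees only Double results, which matches the point above that L-BFGS and similar optimizers need a vector space over Double. A real implementation would probably also want @specialized or separate Float and Double classes to avoid boxing in generic code; that is left out here.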