Hi,

I see ALS is still using Array[Int], but for the other MLlib algorithms we
moved to Vector[Double] so that they can support both dense and sparse formats.

I know ALS can stay with Array[Int] because the Netflix format for input
datasets is well defined, but it would help to move ALS to Vector[Double]
as well. That way all the algorithms would be consistent.
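
To make the idea concrete, here is a minimal sketch of one interface covering
both dense and sparse storage. All names in it (VecSketch, Vec, DenseVec,
SparseVec, dot) are made up for illustration; this is not the actual MLlib or
Breeze code, only the shape I have in mind:

```scala
object VecSketch {
  // Hypothetical interface, loosely modeled on the size/toArray members of
  // the MLlib Vector trait quoted further down in this thread.
  sealed trait Vec {
    def size: Int
    def toArray: Array[Double]
  }

  // Dense storage: one Double per position.
  final case class DenseVec(values: Array[Double]) extends Vec {
    def size: Int = values.length
    def toArray: Array[Double] = values.clone()
  }

  // Sparse storage: only the explicitly stored positions are kept.
  final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double]) extends Vec {
    require(indices.length == values.length, "indices and values must have the same length")

    def toArray: Array[Double] = {
      val out = new Array[Double](size) // zero-initialized
      var i = 0
      while (i < indices.length) {
        out(indices(i)) = values(i)
        i += 1
      }
      out
    }
  }

  // Naive dot product written once against the shared interface. It
  // materializes both operands, so it shows the API shape, not an
  // efficient sparse kernel.
  def dot(a: Vec, b: Vec): Double = {
    require(a.size == b.size, "vectors must have the same size")
    val (x, y) = (a.toArray, b.toArray)
    var sum = 0.0
    var i = 0
    while (i < x.length) {
      sum += x(i) * y(i)
      i += 1
    }
    sum
  }
}
```

An algorithm written against Vec would then accept either representation,
for example VecSketch.dot(DenseVec(Array(1.0, 0.0, 3.0)),
SparseVec(3, Array(0, 2), Array(1.0, 3.0))).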

Does that make sense?

Thanks.
Deb



On Mon, May 5, 2014 at 4:05 PM, David Hall <d...@cs.berkeley.edu> wrote:

> On Mon, May 5, 2014 at 3:40 PM, DB Tsai <dbt...@stanford.edu> wrote:
>
> > David,
> >
> > Could we use Int, Long, or Float for the feature data, and Double for
> > the optimizer?
> >
>
> Yes. Breeze doesn't allow operations on mixed types, so you'd need to
> convert the Double vectors to Float if you wanted to, e.g., take a dot
> product with the weights vector.
>
> You might also be interested in FeatureVector, which is just a wrapper
> around Array[Int] that emulates an indicator vector. It supports dot
> products, axpy, etc.
>
> -- David
>
>
> >
> >
> > Sincerely,
> >
> > DB Tsai
> > -------------------------------------------------------
> > My Blog: https://www.dbtsai.com
> > LinkedIn: https://www.linkedin.com/in/dbtsai
> >
> >
> > On Mon, May 5, 2014 at 3:06 PM, David Hall <d...@cs.berkeley.edu> wrote:
> >
> > > L-BFGS and other optimizers would not work immediately, as they require
> > > vector spaces over Double. Otherwise it should work.
> > > On May 5, 2014 3:03 PM, "DB Tsai" <dbt...@stanford.edu> wrote:
> > >
> > > > Breeze could take any type (Int, Long, Double, and Float) in the matrix
> > > > template.
> > > >
> > > >
> > > > Sincerely,
> > > >
> > > > DB Tsai
> > > > -------------------------------------------------------
> > > > My Blog: https://www.dbtsai.com
> > > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > > >
> > > >
> > > > On Mon, May 5, 2014 at 2:56 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> > > >
> > > > > Is this a Breeze issue, or can Breeze take templates on Float / Double?
> > > > >
> > > > > If Breeze can take templates, then it is a minor fix for Vectors.scala,
> > > > > right?
> > > > >
> > > > > Thanks.
> > > > > Deb
> > > > >
> > > > >
> > > > > On Mon, May 5, 2014 at 2:45 PM, DB Tsai <dbt...@stanford.edu> wrote:
> > > > >
> > > > > > +1. It would be nice if we could use different types in Vector.
> > > > > >
> > > > > >
> > > > > > Sincerely,
> > > > > >
> > > > > > DB Tsai
> > > > > > -------------------------------------------------------
> > > > > > My Blog: https://www.dbtsai.com
> > > > > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > > > > >
> > > > > >
> > > > > > On Mon, May 5, 2014 at 2:41 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Why is the MLlib Vector using Double by default?
> > > > > > >
> > > > > > > /**
> > > > > > >  * Represents a numeric vector, whose index type is Int and value type is
> > > > > > >  * Double.
> > > > > > >  */
> > > > > > > trait Vector extends Serializable {
> > > > > > >
> > > > > > >   /**
> > > > > > >    * Size of the vector.
> > > > > > >    */
> > > > > > >   def size: Int
> > > > > > >
> > > > > > >   /**
> > > > > > >    * Converts the instance to a double array.
> > > > > > >    */
> > > > > > >   def toArray: Array[Double]
> > > > > > >
> > > > > > >   // ...
> > > > > > > }
> > > > > > >
> > > > > > > Don't we need a template on Float/Double? That would give us memory
> > > > > > > savings.
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > Deb
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
