Re: Eigenvalue solver

2016-01-12 Thread David Hall
(I don't know anything spark specific, so I'm going to treat it like a
Breeze question...)

As I understand it, Spark uses ARPACK via Breeze for SVD, and presumably
the same approach can be used for EVD. Basically, you make a function that
multiplies your "matrix" (which might be represented
implicitly/distributed, whatever) by a breeze.linalg.DenseVector.

This is the Breeze implementation for sparse SVD (which is fully generic
and might be hard to follow if you're not used to Breeze/typeclass-heavy
Scala...)

https://github.com/dlwh/breeze/blob/aa958688c428db581d853fd92eb35e82f80d8b5c/math/src/main/scala/breeze/linalg/functions/svd.scala#L205-L205

The difference between SVD and EVD in ARPACK (to a first approximation) is
that you need to multiply by A.t * A (i.e., compute A.t * (A * x)) for SVD,
and just compute A * x for EVD.
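
For illustration, a minimal sketch of those two operator closures in plain
Breeze (a dense matrix stands in for the "matrix" here; in practice it could
be any implicit/distributed representation that supports the multiply):

import breeze.linalg.{DenseMatrix, DenseVector}

// SVD: ARPACK iterates on the symmetric operator A^T A.
def mulForSvd(a: DenseMatrix[Double]): DenseVector[Double] => DenseVector[Double] =
  x => a.t * (a * x)

// EVD (for symmetric, square A): ARPACK iterates on A itself.
def mulForEvd(a: DenseMatrix[Double]): DenseVector[Double] => DenseVector[Double] =
  x => a * x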

The basic idea is to implement a Breeze UFunc eig.Impl3 implicit following
the svd code (or you could just copy out the body of the function and
specialize it). The signature you're looking to implement is:

implicit def Eig_Sparse_Impl[Mat](
    implicit mul: OpMulMatrix.Impl2[Mat, DenseVector[Double], DenseVector[Double]],
    dimImpl: dim.Impl[Mat, (Int, Int)])
  : eig.Impl3[Mat, Int, Double, EigenvalueResult] = {

The type parameters of Impl3 are the matrix type, the number of eigenvalues
you want, a tolerance, and the result type. If you implement
this signature, then you can call eig on anything that can be multiplied by
a dense vector and that implements dim (to get the number of outputs).
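
To make that concrete, here is a compilable skeleton fleshing out the
signature above. The ARPACK reverse-communication loop is replaced by a naive
power iteration purely to keep the sketch self-contained; the real body would
drive ARPACK the way svd.scala does. EigenvalueResult is a hypothetical
placeholder type.

import breeze.linalg.{dim, eig, DenseVector, norm}
import breeze.linalg.operators.OpMulMatrix

// Hypothetical result type; shape it however you need.
case class EigenvalueResult(eigenvalues: DenseVector[Double])

implicit def Eig_Sparse_Impl[Mat](
    implicit mul: OpMulMatrix.Impl2[Mat, DenseVector[Double], DenseVector[Double]],
    dimImpl: dim.Impl[Mat, (Int, Int)])
  : eig.Impl3[Mat, Int, Double, EigenvalueResult] =
  new eig.Impl3[Mat, Int, Double, EigenvalueResult] {
    def apply(mat: Mat, k: Int, tol: Double): EigenvalueResult = {
      val (rows, cols) = dim(mat)
      require(rows == cols, "EVD needs a square matrix")
      // Naive power iteration for the single dominant eigenvalue (k is
      // ignored here); ARPACK would return the k requested eigenpairs.
      var v = DenseVector.fill(cols)(1.0 / math.sqrt(cols.toDouble))
      var lambda = 0.0
      var converged = false
      var iter = 0
      while (!converged && iter < 1000) {
        val av = mul(mat, v)             // the only access to the matrix
        val newLambda = v dot av
        converged = math.abs(newLambda - lambda) < tol
        lambda = newLambda
        v = av / norm(av)
        iter += 1
      }
      EigenvalueResult(DenseVector(lambda))
    }
  }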

(You'll need to define the EigenvalueResult class to be whatever you want. I
don't immediately know how to unpack ARPACK's answers, but you might look
at this scipy thing:
https://github.com/thomasnat1/cdcNewsRanker/blob/71b0ff3989d5191dc6a78c40c4a7a9967cbb0e49/venv/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py#L1049
)

I'm happy to help more if you decide to go this route, here, or on the
scala-breeze google group, or on github.

-- David


On Tue, Jan 12, 2016 at 10:28 AM, Lydia Ickler 
wrote:

> Hi,
>
> I wanted to know if there are any implementations yet within the Machine
> Learning Library or generally that can efficiently solve eigenvalue
> problems?
> Or if not do you have suggestions on how to approach a parallel execution
> maybe with BLAS or Breeze?
>
> Thanks in advance!
> Lydia
>
>
> Von meinem iPhone gesendet
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


Re: [mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-18 Thread David Hall
sure.

On Wed, Mar 18, 2015 at 12:19 AM, Debasish Das 
wrote:

> Hi David,
>
> We are stress testing breeze.optimize.proximal and nnls...if you are
> cutting a release now, we will need another release soon once we get the
> runtime optimizations in place and merged to breeze.
>
> Thanks.
> Deb
>  On Mar 15, 2015 9:39 PM, "David Hall"  wrote:
>
>> snapshot is pushed. If you verify I'll publish the new artifacts.
>>
>> On Sun, Mar 15, 2015 at 1:14 AM, Yu Ishikawa <
>> yuu.ishikawa+sp...@gmail.com>
>> wrote:
>>
>> > David Hall, the creator of Breeze, told me that it's a bug, so I made a
>> > JIRA ticket about this issue. We need to upgrade breeze from 0.11.1 to
>> > 0.11.2 or later in order to fix the bug, once the new version of breeze
>> > is released.
>> >
>> > [SPARK-6341] Upgrade breeze from 0.11.1 to 0.11.2 or later - ASF JIRA
>> > https://issues.apache.org/jira/browse/SPARK-6341
>> >
>> > Thanks,
>> > Yu Ishikawa
>> >
>> >
>> >
>> > -
>> > -- Yu Ishikawa
>> > --
>> > View this message in context:
>> >
>> http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Is-there-any-bugs-to-divide-a-Breeze-sparse-vectors-at-Spark-v1-3-0-rc3-tp11056p11058.html
>> > Sent from the Apache Spark Developers List mailing list archive at
>> > Nabble.com.
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> > For additional commands, e-mail: dev-h...@spark.apache.org
>> >
>> >
>>
>


Re: [mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-17 Thread David Hall
ping?

On Sun, Mar 15, 2015 at 9:38 PM, David Hall  wrote:

> snapshot is pushed. If you verify I'll publish the new artifacts.
>
> On Sun, Mar 15, 2015 at 1:14 AM, Yu Ishikawa  > wrote:
>
>> David Hall, the creator of Breeze, told me that it's a bug, so I made a
>> JIRA ticket about this issue. We need to upgrade breeze from 0.11.1 to
>> 0.11.2 or later in order to fix the bug, once the new version of breeze
>> is released.
>>
>> [SPARK-6341] Upgrade breeze from 0.11.1 to 0.11.2 or later - ASF JIRA
>> https://issues.apache.org/jira/browse/SPARK-6341
>>
>> Thanks,
>> Yu Ishikawa
>>
>>
>>
>> -
>> -- Yu Ishikawa
>> --
>> View this message in context:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Is-there-any-bugs-to-divide-a-Breeze-sparse-vectors-at-Spark-v1-3-0-rc3-tp11056p11058.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>>
>>
>


Re: [mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-15 Thread David Hall
snapshot is pushed. If you verify I'll publish the new artifacts.

On Sun, Mar 15, 2015 at 1:14 AM, Yu Ishikawa 
wrote:

> David Hall, the creator of Breeze, told me that it's a bug, so I made a
> JIRA ticket about this issue. We need to upgrade breeze from 0.11.1 to
> 0.11.2 or later in order to fix the bug, once the new version of breeze
> is released.
>
> [SPARK-6341] Upgrade breeze from 0.11.1 to 0.11.2 or later - ASF JIRA
> https://issues.apache.org/jira/browse/SPARK-6341
>
> Thanks,
> Yu Ishikawa
>
>
>
> -
> -- Yu Ishikawa
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Is-there-any-bugs-to-divide-a-Breeze-sparse-vectors-at-Spark-v1-3-0-rc3-tp11056p11058.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


Re: Breeze Library usage in Spark

2014-10-03 Thread David Hall
yeah, breeze.storage.Zero was introduced in either 0.8 or 0.9.
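
(If you manage the Breeze dependency yourself, one way to avoid this class of
error is to pin it to the version your Spark release bundles. A hedged sbt
sketch, with the version numbers taken from Xiangrui's note below:)

// build.sbt -- match the Breeze version bundled with your Spark release
// (breeze 0.7 for Spark 1.0, 0.9 for Spark 1.1).
libraryDependencies += "org.scalanlp" %% "breeze" % "0.9"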

On Fri, Oct 3, 2014 at 9:45 AM, Xiangrui Meng  wrote:

> Did you add a different version of breeze to the classpath? In Spark
> 1.0, we use breeze 0.7, and in Spark 1.1 we use 0.9. If the breeze
> version you used is different from the one that comes with Spark, you might
> see class not found. -Xiangrui
>
> On Fri, Oct 3, 2014 at 4:22 AM, Priya Ch 
> wrote:
> > Hi Team,
> >
> > When I am trying to use DenseMatrix of breeze library in spark, its
> throwing
> > me the following error:
> >
> > java.lang.NoClassDefFoundError: breeze/storage/Zero
> >
> >
> > Can someone help me on this ?
> >
> > Thanks,
> > Padma Ch
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


Re: Is breeze thread safe in Spark?

2014-09-03 Thread David Hall
mutating operations are not thread safe. Operations that don't mutate
should be thread safe. I can't speak to what Evan said, but I would guess
that the way they're using += should be safe.
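
A tiny sketch of the distinction with Breeze dense vectors:

import breeze.linalg.DenseVector

val a = DenseVector(1.0, 2.0)
val b = DenseVector(3.0, 4.0)

val c = a + b  // allocates a fresh result vector; a and b are untouched
a += b         // mutates a in place; racy if a is shared across threads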


On Wed, Sep 3, 2014 at 11:58 AM, RJ Nowling  wrote:

> David,
>
> Can you confirm that += is not thread safe but + is?  I'm assuming +
> allocates a new object for the write, while += doesn't.
>
> Thanks!
> RJ
>
>
> On Wed, Sep 3, 2014 at 2:50 PM, David Hall  wrote:
>
>> In general, in Breeze we allocate separate work arrays for each call to
>> lapack, so it should be fine. In general concurrent modification isn't
>> thread safe of course, but things that "ought" to be thread safe really
>> should be.
>>
>>
>> On Wed, Sep 3, 2014 at 10:41 AM, RJ Nowling  wrote:
>>
>>> No, it's not in all cases.   Since Breeze uses lapack under the hood,
>>> changes to memory between different threads are bad.
>>>
>>> There's actually a potential bug in the KMeans code where it uses +=
>>> instead of +.
>>>
>>>
>>> On Wed, Sep 3, 2014 at 1:26 PM, Ulanov, Alexander <
>>> alexander.ula...@hp.com>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > Is breeze library called thread safe from Spark mllib code in case when
>>> > native libs for blas and lapack are used? Might it be an issue when
>>> running
>>> > Spark locally?
>>> >
>>> > Best regards, Alexander
>>> > -
>>> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: dev-h...@spark.apache.org
>>> >
>>> >
>>>
>>>
>>> --
>>> em rnowl...@gmail.com
>>> c 954.496.2314
>>>
>>
>>
>
>
> --
> em rnowl...@gmail.com
> c 954.496.2314
>


Re: Is breeze thread safe in Spark?

2014-09-03 Thread David Hall
In general, in Breeze we allocate separate work arrays for each call to
lapack, so it should be fine. In general concurrent modification isn't
thread safe of course, but things that "ought" to be thread safe really
should be.


On Wed, Sep 3, 2014 at 10:41 AM, RJ Nowling  wrote:

> No, it's not in all cases.   Since Breeze uses lapack under the hood,
> changes to memory between different threads are bad.
>
> There's actually a potential bug in the KMeans code where it uses +=
> instead of +.
>
>
> On Wed, Sep 3, 2014 at 1:26 PM, Ulanov, Alexander  >
> wrote:
>
> > Hi,
> >
> > Is breeze library called thread safe from Spark mllib code in case when
> > native libs for blas and lapack are used? Might it be an issue when
> running
> > Spark locally?
> >
> > Best regards, Alexander
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> > For additional commands, e-mail: dev-h...@spark.apache.org
> >
> >
>
>
> --
> em rnowl...@gmail.com
> c 954.496.2314
>


Re: Linear CG solver

2014-06-27 Thread David Hall
I have no ideas on benchmarks, but breeze has a CG solver:
https://github.com/scalanlp/breeze/tree/master/math/src/main/scala/breeze/optimize/linear/ConjugateGradient.scala

https://github.com/scalanlp/breeze/blob/e2adad3b885736baf890b306806a56abc77a3ed3/math/src/test/scala/breeze/optimize/linear/ConjugateGradientTest.scala

It's based on the code from TRON, so I think it's targeted more at
norm-constrained solutions of the CG problem.
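
For comparison purposes, here is a hand-rolled textbook linear CG (a sketch,
not Breeze's ConjugateGradient API): it solves A x = b for a symmetric
positive-definite A.

import breeze.linalg.{DenseMatrix, DenseVector}

def cgSolve(a: DenseMatrix[Double], b: DenseVector[Double],
            tol: Double = 1e-8, maxIter: Int = 1000): DenseVector[Double] = {
  val x = DenseVector.zeros[Double](b.length)
  var r = b - a * x        // residual
  var p = r.copy           // search direction
  var rsOld = r dot r
  var iter = 0
  while (math.sqrt(rsOld) > tol && iter < maxIter) {
    val ap = a * p
    val alpha = rsOld / (p dot ap)
    x += p * alpha         // step along the search direction
    r -= ap * alpha        // update the residual
    val rsNew = r dot r
    p = r + p * (rsNew / rsOld)
    rsOld = rsNew
    iter += 1
  }
  x
}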

On Fri, Jun 27, 2014 at 5:54 PM, Debasish Das 
wrote:

> Hi,
>
> I am looking for an efficient linear CG to be put inside the Quadratic
> Minimization algorithms we added for Spark mllib.
>
> With a good linear CG, we should be able to solve kernel SVMs with this
> solver in mllib...
>
> I use direct solves right now using cholesky decomposition which has higher
> complexity as matrix sizes become large...
>
> I found out some jblas example code:
>
> https://github.com/mikiobraun/jblas-examples/blob/master/src/CG.java
>
> I was wondering if mllib developers have any experience using this solver
> and if this is better than apache commons linear CG ?
>
> Thanks.
> Deb
>


Re: mllib vector templates

2014-05-05 Thread David Hall
On Mon, May 5, 2014 at 3:40 PM, DB Tsai  wrote:

> David,
>
> Could we use Int, Long, Float as the data feature spaces, and Double for
> optimizer?
>

Yes. Breeze doesn't allow operations on mixed types, so you'd need to
convert the double vectors to Floats if you wanted, e.g. dot product with
the weights vector.

You might also be interested in FeatureVector, which is just a wrapper
around Array[Int] that emulates an indicator vector. It supports dot
products, axpy, etc.
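
A quick usage sketch (treat the exact import path and operand order as
assumptions; the point is the indicator-vector semantics):

import breeze.features.FeatureVector
import breeze.linalg.{axpy, DenseVector}

val w = DenseVector(0.1, 0.2, 0.3, 0.4)   // weight vector
val fv = new FeatureVector(Array(0, 2))   // indicator for features 0 and 2

val score = w dot fv   // 0.1 + 0.3: sums the weights at the active indices
axpy(1.0, fv, w)       // w += 1.0 at the active indices only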

-- David


>
>
> Sincerely,
>
> DB Tsai
> ---
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Mon, May 5, 2014 at 3:06 PM, David Hall  wrote:
>
> > Lbfgs and other optimizers would not work immediately, as they require
> > vector spaces over double. Otherwise it should work.
> > On May 5, 2014 3:03 PM, "DB Tsai"  wrote:
> >
> > > Breeze could take any type (Int, Long, Double, and Float) in the matrix
> > > template.
> > >
> > >
> > > Sincerely,
> > >
> > > DB Tsai
> > > ---
> > > My Blog: https://www.dbtsai.com
> > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > >
> > >
> > > On Mon, May 5, 2014 at 2:56 PM, Debasish Das  > > >wrote:
> > >
> > > > Is this a breeze issue or breeze can take templates on float /
> double ?
> > > >
> > > > If breeze can take templates then it is a minor fix for Vectors.scala
> > > right
> > > > ?
> > > >
> > > > Thanks.
> > > > Deb
> > > >
> > > >
> > > > On Mon, May 5, 2014 at 2:45 PM, DB Tsai  wrote:
> > > >
> > > > > +1  Would be nice that we can use different type in Vector.
> > > > >
> > > > >
> > > > > Sincerely,
> > > > >
> > > > > DB Tsai
> > > > > ---
> > > > > My Blog: https://www.dbtsai.com
> > > > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > > > >
> > > > >
> > > > > On Mon, May 5, 2014 at 2:41 PM, Debasish Das <
> > debasish.da...@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Why mllib vector is using double as default ?
> > > > > >
> > > > > > /**
> > > > > >
> > > > > >  * Represents a numeric vector, whose index type is Int and value
> > > type
> > > > is
> > > > > > Double.
> > > > > >
> > > > > >  */
> > > > > >
> > > > > > trait Vector extends Serializable {
> > > > > >
> > > > > >
> > > > > >   /**
> > > > > >
> > > > > >* Size of the vector.
> > > > > >
> > > > > >*/
> > > > > >
> > > > > >   def size: Int
> > > > > >
> > > > > >
> > > > > >   /**
> > > > > >
> > > > > >* Converts the instance to a double array.
> > > > > >
> > > > > >*/
> > > > > >
> > > > > >   def toArray: Array[Double]
> > > > > >
> > > > > > Don't we need a template on float/double ? This will give us
> memory
> > > > > > savings...
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > Deb
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: mllib vector templates

2014-05-05 Thread David Hall
I should mention it shouldn't be too hard to change, but it is a current
limitation.
On May 5, 2014 3:12 PM, "Debasish Das"  wrote:

> Is any one facing issues due to this ? If not then I guess doubles are
> fine...
>
> For me it's not a big deal as there is enough memory available...
>
>
> On Mon, May 5, 2014 at 3:06 PM, David Hall  wrote:
>
> > Lbfgs and other optimizers would not work immediately, as they require
> > vector spaces over double. Otherwise it should work.
> > On May 5, 2014 3:03 PM, "DB Tsai"  wrote:
> >
> > > Breeze could take any type (Int, Long, Double, and Float) in the matrix
> > > template.
> > >
> > >
> > > Sincerely,
> > >
> > > DB Tsai
> > > ---
> > > My Blog: https://www.dbtsai.com
> > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > >
> > >
> > > On Mon, May 5, 2014 at 2:56 PM, Debasish Das  > > >wrote:
> > >
> > > > Is this a breeze issue or breeze can take templates on float /
> double ?
> > > >
> > > > If breeze can take templates then it is a minor fix for Vectors.scala
> > > right
> > > > ?
> > > >
> > > > Thanks.
> > > > Deb
> > > >
> > > >
> > > > On Mon, May 5, 2014 at 2:45 PM, DB Tsai  wrote:
> > > >
> > > > > +1  Would be nice that we can use different type in Vector.
> > > > >
> > > > >
> > > > > Sincerely,
> > > > >
> > > > > DB Tsai
> > > > > ---
> > > > > My Blog: https://www.dbtsai.com
> > > > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > > > >
> > > > >
> > > > > On Mon, May 5, 2014 at 2:41 PM, Debasish Das <
> > debasish.da...@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Why mllib vector is using double as default ?
> > > > > >
> > > > > > /**
> > > > > >
> > > > > >  * Represents a numeric vector, whose index type is Int and value
> > > type
> > > > is
> > > > > > Double.
> > > > > >
> > > > > >  */
> > > > > >
> > > > > > trait Vector extends Serializable {
> > > > > >
> > > > > >
> > > > > >   /**
> > > > > >
> > > > > >* Size of the vector.
> > > > > >
> > > > > >*/
> > > > > >
> > > > > >   def size: Int
> > > > > >
> > > > > >
> > > > > >   /**
> > > > > >
> > > > > >* Converts the instance to a double array.
> > > > > >
> > > > > >*/
> > > > > >
> > > > > >   def toArray: Array[Double]
> > > > > >
> > > > > > Don't we need a template on float/double ? This will give us
> memory
> > > > > > savings...
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > Deb
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: mllib vector templates

2014-05-05 Thread David Hall
Lbfgs and other optimizers would not work immediately, as they require
vector spaces over double. Otherwise it should work.
On May 5, 2014 3:03 PM, "DB Tsai"  wrote:

> Breeze could take any type (Int, Long, Double, and Float) in the matrix
> template.
>
>
> Sincerely,
>
> DB Tsai
> ---
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Mon, May 5, 2014 at 2:56 PM, Debasish Das  >wrote:
>
> > Is this a breeze issue or breeze can take templates on float / double ?
> >
> > If breeze can take templates then it is a minor fix for Vectors.scala
> right
> > ?
> >
> > Thanks.
> > Deb
> >
> >
> > On Mon, May 5, 2014 at 2:45 PM, DB Tsai  wrote:
> >
> > > +1  Would be nice that we can use different type in Vector.
> > >
> > >
> > > Sincerely,
> > >
> > > DB Tsai
> > > ---
> > > My Blog: https://www.dbtsai.com
> > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > >
> > >
> > > On Mon, May 5, 2014 at 2:41 PM, Debasish Das  > > >wrote:
> > >
> > > > Hi,
> > > >
> > > > Why mllib vector is using double as default ?
> > > >
> > > > /**
> > > >
> > > >  * Represents a numeric vector, whose index type is Int and value
> type
> > is
> > > > Double.
> > > >
> > > >  */
> > > >
> > > > trait Vector extends Serializable {
> > > >
> > > >
> > > >   /**
> > > >
> > > >* Size of the vector.
> > > >
> > > >*/
> > > >
> > > >   def size: Int
> > > >
> > > >
> > > >   /**
> > > >
> > > >* Converts the instance to a double array.
> > > >
> > > >*/
> > > >
> > > >   def toArray: Array[Double]
> > > >
> > > > Don't we need a template on float/double ? This will give us memory
> > > > savings...
> > > >
> > > > Thanks.
> > > >
> > > > Deb
> > > >
> > >
> >
>


Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-29 Thread David Hall
Yeah, that's probably the easiest though obviously pretty hacky.

I'm surprised that the Hessian approximation isn't worse than it is. (As
in, I'd expect error messages.) It's obviously line searching much more, so
the approximation must be worse. You might be interested in this online
bfgs:
http://jmlr.org/proceedings/papers/v2/schraudolph07a/schraudolph07a.pdf

-- David


On Tue, Apr 29, 2014 at 3:30 PM, DB Tsai  wrote:

> Have a quick hack to understand the behavior of SLBFGS
> (Stochastic-LBFGS) by overwriting the breeze iterations method to get the
> current LBFGS step to ensure that the objective function is the same during
> the line search step. David, the following is my code, have a better way to
> inject into it?
>
> https://github.com/dbtsai/spark/tree/dbtsai-lbfgshack
>
> Couple findings,
>
> 1) miniBatch (using rdd sample api) for each iteration is slower than full
> data training when the full data is cached. Probably because sample is not
> efficient in Spark.
>
> 2) Since in the line search steps, we use the same sample of data (the
> same objective function), the SLBFGS actually converges well.
>
> 3) For news20 dataset, with 0.05 miniBatch size, it takes 14 SLBFGS steps
> (29 data iterations, 74.5seconds) to converge to loss < 0.10. For LBFGS
> with full data training, it takes 9 LBFGS steps (12 data iterations, 37.6
> seconds) to converge to loss < 0.10.
>
> It seems that as long as the noisy gradient happens in different SLBFGS
> steps, it still works.
>
> (ps, I also tried in line search step, I use different sample of data, and
> it just doesn't work as we expect.)
>
>
>
> Sincerely,
>
> DB Tsai
> -------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Mon, Apr 28, 2014 at 8:55 AM, David Hall  wrote:
>
>> That's right.
>>
>> FWIW, caching should be automatic now, but it might be the version of
>> Breeze you're using doesn't do that yet.
>>
>> Also, In breeze.util._ there's an implicit that adds a tee method to
>> iterator, and also a last method. Both are useful for things like this.
>>
>> -- David
>>
>>
>> On Sun, Apr 27, 2014 at 11:53 PM, DB Tsai  wrote:
>>
>>> I think I figured it out. Instead of calling minimize and recording the
>>> loss in the DiffFunction, I should do the following.
>>>
>>> val states = lbfgs.iterations(new CachedDiffFunction(costFun),
>>> initialWeights.toBreeze.toDenseVector)
>>> states.foreach(state => lossHistory.append(state.value))
>>>
>>> All the losses in states should be decreasing now. Am I right?
>>>
>>>
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> ---
>>> My Blog: https://www.dbtsai.com
>>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>>
>>>
>>> On Sun, Apr 27, 2014 at 11:31 PM, DB Tsai  wrote:
>>>
>>>> Also, how many failures of rejection will terminate the optimization
>>>> process? How is it related to "numberOfImprovementFailures"?
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> Sincerely,
>>>>
>>>> DB Tsai
>>>> ---
>>>> My Blog: https://www.dbtsai.com
>>>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>>>
>>>>
>>>> On Sun, Apr 27, 2014 at 11:28 PM, DB Tsai  wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> I'm recording the loss history in the DiffFunction implementation, and
>>>>> that's why the rejected step is also recorded in my loss history.
>>>>>
>>>>> Is there any api in Breeze LBFGS to get the history which already
>>>>> excludes the reject step? Or should I just call "iterations" method and
>>>>> check "iteratingShouldStop" instead?
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> Sincerely,
>>>>>
>>>>> DB Tsai
>>>>> ---
>>>>> My Blog: https://www.dbtsai.com
>>>>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>>>>
>>>>>
>>>>> On Fri, Apr 25, 2014 at 3:10 PM, David Hall wrote:
>>>>>
>>>>>> LBFGS will not take a step that sends the objective value up.

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-28 Thread David Hall
That's right.

FWIW, caching should be automatic now, but it might be the version of
Breeze you're using doesn't do that yet.

Also, In breeze.util._ there's an implicit that adds a tee method to
iterator, and also a last method. Both are useful for things like this.
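
A self-contained sketch of that pattern on a toy objective (using the same
LBFGS/CachedDiffFunction API as the quoted code below):

import breeze.linalg.DenseVector
import breeze.optimize.{CachedDiffFunction, DiffFunction, LBFGS}

// Toy objective: f(x) = ||x - 3||^2, with gradient 2(x - 3).
val costFun = new DiffFunction[DenseVector[Double]] {
  def calculate(x: DenseVector[Double]) = {
    val r = x - 3.0
    (r dot r, r * 2.0)
  }
}

val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 7)
val states = lbfgs.iterations(new CachedDiffFunction(costFun),
                              DenseVector.zeros[Double](5)).toIndexedSeq

val lossHistory = states.map(_.value)  // non-increasing across accepted steps
val finalWeights = states.last.x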

-- David

On Sun, Apr 27, 2014 at 11:53 PM, DB Tsai  wrote:

> I think I figured it out. Instead of calling minimize and recording the loss
> in the DiffFunction, I should do the following.
>
> val states = lbfgs.iterations(new CachedDiffFunction(costFun),
> initialWeights.toBreeze.toDenseVector)
> states.foreach(state => lossHistory.append(state.value))
>
> All the losses in states should be decreasing now. Am I right?
>
>
>
> Sincerely,
>
> DB Tsai
> ---
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Sun, Apr 27, 2014 at 11:31 PM, DB Tsai  wrote:
>
>> Also, how many failures of rejection will terminate the optimization
>> process? How is it related to "numberOfImprovementFailures"?
>>
>> Thanks.
>>
>>
>> Sincerely,
>>
>> DB Tsai
>> ---
>> My Blog: https://www.dbtsai.com
>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>
>>
>> On Sun, Apr 27, 2014 at 11:28 PM, DB Tsai  wrote:
>>
>>> Hi David,
>>>
>>> I'm recording the loss history in the DiffFunction implementation, and
>>> that's why the rejected step is also recorded in my loss history.
>>>
>>> Is there any api in Breeze LBFGS to get the history which already
>>> excludes the reject step? Or should I just call "iterations" method and
>>> check "iteratingShouldStop" instead?
>>>
>>> Thanks.
>>>
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> ---
>>> My Blog: https://www.dbtsai.com
>>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>>
>>>
>>> On Fri, Apr 25, 2014 at 3:10 PM, David Hall wrote:
>>>
>>>> LBFGS will not take a step that sends the objective value up. It might
>>>> try a step that is "too big" and reject it, so if you're just logging
>>>> everything that gets tried by LBFGS, you could see that. The "iterations"
>>>> method of the minimizer should never return an increasing objective value.
>>>> If you're regularizing, are you including the regularizer in the objective
>>>> value computation?
>>>>
>>>> GD is almost never worth your time.
>>>>
>>>> -- David
>>>>
>>>> On Fri, Apr 25, 2014 at 2:57 PM, DB Tsai  wrote:
>>>>
>>>>> Another interesting benchmark.
>>>>>
>>>>> *News20 dataset - 0.14M row, 1,355,191 features, 0.034% non-zero
>>>>> elements.*
>>>>>
>>>>> LBFGS converges in 70 seconds, while GD seems to be not progressing.
>>>>>
>>>>> Dense feature vector will be too big to fit in the memory, so only
>>>>> conduct the sparse benchmark.
>>>>>
>>>>> I saw that sometimes the loss bumps up, and it's weird for me. Since
>>>>> the cost function of logistic regression is convex, it should be
>>>>> monotonically decreasing.  David, any suggestion?
>>>>>
>>>>> The detail figure:
>>>>>
>>>>> https://github.com/dbtsai/spark-lbfgs-benchmark/raw/0b774682e398b4f7e0ce01a69c44000eb0e73454/result/news20.pdf
>>>>>
>>>>>
>>>>> *Rcv1 dataset - 6.8M row, 677,399 features, 0.15% non-zero elements.*
>>>>>
>>>>> LBFGS converges in 25 seconds, while GD also seems to be not
>>>>> progressing.
>>>>>
>>>>> Only conduct sparse benchmark for the same reason. I also saw the loss
>>>>> bumps up for unknown reason.
>>>>>
>>>>> The detail figure:
>>>>>
>>>>> https://github.com/dbtsai/spark-lbfgs-benchmark/raw/0b774682e398b4f7e0ce01a69c44000eb0e73454/result/rcv1.pdf
>>>>>
>>>>>
>>>>> Sincerely,
>>>>>
>>>>> DB Tsai
>>>>> ---
>>>>> My Blog: https://www.dbtsai.com
>>>>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-25 Thread David Hall
LBFGS will not take a step that sends the objective value up. It might try
a step that is "too big" and reject it, so if you're just logging
everything that gets tried by LBFGS, you could see that. The "iterations"
method of the minimizer should never return an increasing objective value.
If you're regularizing, are you including the regularizer in the objective
value computation?
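
A sketch of that last point (assuming Breeze's
DiffFunction.withL2Regularization helper; the exact scaling of the penalty is
an assumption here): log the value of the regularized function that LBFGS
actually minimizes, not the bare data loss.

import breeze.linalg.DenseVector
import breeze.optimize.DiffFunction

// Toy unregularized objective.
val f = new DiffFunction[DenseVector[Double]] {
  def calculate(w: DenseVector[Double]) = {
    val r = w - DenseVector(1.0, -2.0)
    (0.5 * (r dot r), r)
  }
}

// LBFGS minimizes the penalized objective, so its value is the one that is
// guaranteed not to increase across accepted steps.
val lambda = 0.1
val fReg = DiffFunction.withL2Regularization(f, lambda)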

GD is almost never worth your time.

-- David

On Fri, Apr 25, 2014 at 2:57 PM, DB Tsai  wrote:

> Another interesting benchmark.
>
> *News20 dataset - 0.14M row, 1,355,191 features, 0.034% non-zero elements.*
>
> LBFGS converges in 70 seconds, while GD seems to be not progressing.
>
> Dense feature vector will be too big to fit in the memory, so only conduct
> the sparse benchmark.
>
> I saw that sometimes the loss bumps up, and it's weird for me. Since the
> cost function of logistic regression is convex, it should be monotonically
> decreasing.  David, any suggestion?
>
> The detail figure:
>
> https://github.com/dbtsai/spark-lbfgs-benchmark/raw/0b774682e398b4f7e0ce01a69c44000eb0e73454/result/news20.pdf
>
>
> *Rcv1 dataset - 6.8M row, 677,399 features, 0.15% non-zero elements.*
>
> LBFGS converges in 25 seconds, while GD also seems to be not progressing.
>
> Only conduct sparse benchmark for the same reason. I also saw the loss
> bumps up for unknown reason.
>
> The detail figure:
>
> https://github.com/dbtsai/spark-lbfgs-benchmark/raw/0b774682e398b4f7e0ce01a69c44000eb0e73454/result/rcv1.pdf
>
>
> Sincerely,
>
> DB Tsai
> ---
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Thu, Apr 24, 2014 at 2:36 PM, DB Tsai  wrote:
>
>> rcv1.binary is too sparse (0.15% non-zero elements), so dense format
>> will not run due to out of memory. But sparse format runs really well.
>>
>>
>> Sincerely,
>>
>> DB Tsai
>> ---
>> My Blog: https://www.dbtsai.com
>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>
>>
>> On Thu, Apr 24, 2014 at 1:54 PM, DB Tsai  wrote:
>>
>>> I'm doing the timer in runMiniBatchSGD after  val numExamples =
>>> data.count()
>>>
>>> See the following. Running rcv1 dataset now, and will update soon.
>>>
>>> val startTime = System.nanoTime()
>>> for (i <- 1 to numIterations) {
>>>   // Sample a subset (fraction miniBatchFraction) of the total data
>>>   // compute and sum up the subgradients on this subset (this is one
>>> map-reduce)
>>>   val (gradientSum, lossSum) = data.sample(false, miniBatchFraction,
>>> 42 + i)
>>> .aggregate((BDV.zeros[Double](weights.size), 0.0))(
>>>   seqOp = (c, v) => (c, v) match { case ((grad, loss), (label,
>>> features)) =>
>>> val l = gradient.compute(features, label, weights,
>>> Vectors.fromBreeze(grad))
>>> (grad, loss + l)
>>>   },
>>>   combOp = (c1, c2) => (c1, c2) match { case ((grad1, loss1),
>>> (grad2, loss2)) =>
>>> (grad1 += grad2, loss1 + loss2)
>>>   })
>>>
>>>   /**
>>>* NOTE(Xinghao): lossSum is computed using the weights from the
>>> previous iteration
>>>* and regVal is the regularization value computed in the previous
>>> iteration as well.
>>>*/
>>>   stochasticLossHistory.append(lossSum / miniBatchSize + regVal)
>>>   val update = updater.compute(
>>> weights, Vectors.fromBreeze(gradientSum / miniBatchSize),
>>> stepSize, i, regParam)
>>>   weights = update._1
>>>   regVal = update._2
>>>   timeStamp.append(System.nanoTime() - startTime)
>>> }
>>>
>>>
>>>
>>>
>>>
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> ---
>>> My Blog: https://www.dbtsai.com
>>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>>
>>>
>>> On Thu, Apr 24, 2014 at 1:44 PM, Xiangrui Meng  wrote:
>>>
 I don't understand why sparse falls behind dense so much at the very
 first iteration. I didn't see count() is called in

 https://github.com/dbtsai/spark-lbfgs-benchmark/blob/master/src/main/scala/org/apache/spark/mllib/benchmark/BinaryLogisticRegression.scala
 . Maybe you have local uncommitted changes.

 Best,
 Xiangrui

 On Thu, Apr 24, 2014 at 11:26 AM, DB Tsai  wrote:
 > Hi Xiangrui,
 >
 > Yes, I'm using yarn-cluster mode, and I did check # of executors I
 specified
 > are the same as the actual running executors.
 >
 > For caching and materialization, I've the timer in optimizer after
 calling
 > count(); as a result, the time for materialization in cache isn't in
 the
 > benchmark.
 >
 > The difference you saw is actually from dense feature or sparse
 feature
 > vector. For LBFGS and GD dense feature, you can see the first
 iteration
 > takes the same time. It's true for GD.
 >
 > I'm going to run rcv1.binary which only has 0.15% non-zero elements 

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread David Hall
On Wed, Apr 23, 2014 at 10:18 PM, DB Tsai  wrote:

> ps, it doesn't make sense to have weight and gradient sparse unless
> with strong L1 penalty.
>

Sure, I was just checking the obvious things. Have you run it through a
profiler to see where the problem is?



>
> Sincerely,
>
> DB Tsai
> ---
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Wed, Apr 23, 2014 at 10:17 PM, DB Tsai  wrote:
> > In mllib, the weight, and gradient are dense. Only feature is sparse.
> >
> > Sincerely,
> >
> > DB Tsai
> > ---
> > My Blog: https://www.dbtsai.com
> > LinkedIn: https://www.linkedin.com/in/dbtsai
> >
> >
> > On Wed, Apr 23, 2014 at 10:16 PM, David Hall 
> wrote:
> >> Was the weight vector sparse? The gradients? Or just the feature
> vectors?
> >>
> >>
> >> On Wed, Apr 23, 2014 at 10:08 PM, DB Tsai  wrote:
> >>>
> >>> The figure showing the Log-Likelihood vs Time can be found here.
> >>>
> >>>
> >>>
> https://github.com/dbtsai/spark-lbfgs-benchmark/raw/fd703303fb1c16ef5714901739154728550becf4/result/a9a11M.pdf
> >>>
> >>> Let me know if you can not open it.
> >>>
> >>> Sincerely,
> >>>
> >>> DB Tsai
> >>> ---
> >>> My Blog: https://www.dbtsai.com
> >>> LinkedIn: https://www.linkedin.com/in/dbtsai
> >>>
> >>>
> >>> On Wed, Apr 23, 2014 at 9:34 PM, Shivaram Venkataraman <
> >>> shiva...@eecs.berkeley.edu> wrote:
> >>>
> >>> > I don't think the attachment came through in the list. Could you
> upload
> >>> > the results somewhere and link to them ?
> >>> >
> >>> >
> >>> > On Wed, Apr 23, 2014 at 9:32 PM, DB Tsai  wrote:
> >>> >
> >>> >> 123 features per row, and on average, 89% are zeros.
> >>> >> On Apr 23, 2014 9:31 PM, "Evan Sparks" 
> wrote:
> >>> >>
> >>> >> > What is the number of non zeroes per row (and number of features)
> in
> >>> >> > the
> >>> >> > sparse case? We've hit some issues with breeze sparse support in
> the
> >>> >> past
> >>> >> > but for sufficiently sparse data it's still pretty good.
> >>> >> >
> >>> >> > > On Apr 23, 2014, at 9:21 PM, DB Tsai 
> wrote:
> >>> >> > >
> >>> >> > > Hi all,
> >>> >> > >
> >>> >> > > I'm benchmarking Logistic Regression in MLlib using the newly
> added
> >>> >> > optimizer LBFGS and GD. I'm using the same dataset and the same
> >>> >> methodology
> >>> >> > in this paper, http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf
> >>> >> > >
> >>> >> > > I want to know how Spark scale while adding workers, and how
> >>> >> optimizers
> >>> >> > and input format (sparse or dense) impact performance.
> >>> >> > >
> >>> >> > > The benchmark code can be found here,
> >>> >> > https://github.com/dbtsai/spark-lbfgs-benchmark
> >>> >> > >
> >>> >> > > The first dataset I benchmarked is a9a which only has 2.2MB. I
> >>> >> > duplicated the dataset, and made it 762MB to have 11M rows. This
> >>> >> > dataset
> >>> >> > has 123 features and 11% of the data are non-zero elements.
> >>> >> > >
> >>> >> > > In this benchmark, all the dataset is cached in memory.
> >>> >> > >
> >>> >> > > As we expect, LBFGS converges faster than GD, and at some
> point, no
> >>> >> > matter how we push GD, it will converge slower and slower.
> >>> >> > >
> >>> >> > > However, it's surprising that sparse format runs slower than
> dense
> >>> >> > format. I did see that sparse format takes significantly smaller
> >>> >> > amount
> >>> >> of
> >>> >> > memory in caching RDD, but sparse is 40% slower than dense. I
> think
> >>> >> sparse
> >>> >> > should be fast since when we compute x wT, since x is sparse, we
> can
> >>> >> > do
> >>> >> it
> >>> >> > faster. I wonder if there is anything I'm doing wrong.
> >>> >> > >
> >>> >> > > The attachment is the benchmark result.
> >>> >> > >
> >>> >> > > Thanks.
> >>> >> > >
> >>> >> > > Sincerely,
> >>> >> > >
> >>> >> > > DB Tsai
> >>> >> > > ---
> >>> >> > > My Blog: https://www.dbtsai.com
> >>> >> > > LinkedIn: https://www.linkedin.com/in/dbtsai
> >>> >> >
> >>> >>
> >>> >
> >>> >
> >>
> >>
>


Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread David Hall
Was the weight vector sparse? The gradients? Or just the feature vectors?


On Wed, Apr 23, 2014 at 10:08 PM, DB Tsai  wrote:

> The figure showing the Log-Likelihood vs Time can be found here.
>
>
> https://github.com/dbtsai/spark-lbfgs-benchmark/raw/fd703303fb1c16ef5714901739154728550becf4/result/a9a11M.pdf
>
> Let me know if you can not open it.
>
> Sincerely,
>
> DB Tsai
> ---
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Wed, Apr 23, 2014 at 9:34 PM, Shivaram Venkataraman <
> shiva...@eecs.berkeley.edu> wrote:
>
> > I don't think the attachment came through in the list. Could you upload
> > the results somewhere and link to them ?
> >
> >
> > On Wed, Apr 23, 2014 at 9:32 PM, DB Tsai  wrote:
> >
> >> 123 features per row, and on average, 89% are zeros.
> >> On Apr 23, 2014 9:31 PM, "Evan Sparks"  wrote:
> >>
> >> > What is the number of non zeroes per row (and number of features) in
> the
> >> > sparse case? We've hit some issues with breeze sparse support in the
> >> past
> >> > but for sufficiently sparse data it's still pretty good.
> >> >
> >> > > On Apr 23, 2014, at 9:21 PM, DB Tsai  wrote:
> >> > >
> >> > > Hi all,
> >> > >
> >> > > I'm benchmarking Logistic Regression in MLlib using the newly added
> >> > optimizer LBFGS and GD. I'm using the same dataset and the same
> >> methodology
> >> > in this paper, http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf
> >> > >
> >> > > I want to know how Spark scale while adding workers, and how
> >> optimizers
> >> > and input format (sparse or dense) impact performance.
> >> > >
> >> > > The benchmark code can be found here,
> >> > https://github.com/dbtsai/spark-lbfgs-benchmark
> >> > >
> >> > > The first dataset I benchmarked is a9a which only has 2.2MB. I
> >> > duplicated the dataset, and made it 762MB to have 11M rows. This
> dataset
> >> > has 123 features and 11% of the data are non-zero elements.
> >> > >
> >> > > In this benchmark, all the dataset is cached in memory.
> >> > >
> >> > > As we expect, LBFGS converges faster than GD, and at some point, no
> >> > matter how we push GD, it will converge slower and slower.
> >> > >
> >> > > However, it's surprising that sparse format runs slower than dense
> >> > format. I did see that sparse format takes significantly smaller
> amount
> >> of
> >> > memory in caching RDD, but sparse is 40% slower than dense. I think
> >> sparse
> >> > should be fast since when we compute x wT, since x is sparse, we can
> do
> >> it
> >> > faster. I wonder if there is anything I'm doing wrong.
> >> > >
> >> > > The attachment is the benchmark result.
> >> > >
> >> > > Thanks.
> >> > >
> >> > > Sincerely,
> >> > >
> >> > > DB Tsai
> >> > > ---
> >> > > My Blog: https://www.dbtsai.com
> >> > > LinkedIn: https://www.linkedin.com/in/dbtsai
> >> >
> >>
> >
> >
>


Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread David Hall
On Wed, Apr 23, 2014 at 9:30 PM, Evan Sparks  wrote:

> What is the number of non zeroes per row (and number of features) in the
> sparse case? We've hit some issues with breeze sparse support in the past
> but for sufficiently sparse data it's still pretty good.
>

Any chance you remember what the problems were? I'm sure it could be
better, but it's good to know where improvements need to happen.

-- David


>
> > On Apr 23, 2014, at 9:21 PM, DB Tsai  wrote:
> >
> > Hi all,
> >
> > I'm benchmarking Logistic Regression in MLlib using the newly added
> optimizer LBFGS and GD. I'm using the same dataset and the same methodology
> in this paper, http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf
> >
> > I want to know how Spark scale while adding workers, and how optimizers
> and input format (sparse or dense) impact performance.
> >
> > The benchmark code can be found here,
> https://github.com/dbtsai/spark-lbfgs-benchmark
> >
> > The first dataset I benchmarked is a9a which only has 2.2MB. I
> duplicated the dataset, and made it 762MB to have 11M rows. This dataset
> has 123 features and 11% of the data are non-zero elements.
> >
> > In this benchmark, all the dataset is cached in memory.
> >
> > As we expect, LBFGS converges faster than GD, and at some point, no
> matter how we push GD, it will converge slower and slower.
> >
> > However, it's surprising that sparse format runs slower than dense
> format. I did see that sparse format takes significantly smaller amount of
> memory in caching RDD, but sparse is 40% slower than dense. I think sparse
> should be fast since when we compute x wT, since x is sparse, we can do it
> faster. I wonder if there is anything I'm doing wrong.
> >
> > The attachment is the benchmark result.
> >
> > Thanks.
> >
> > Sincerely,
> >
> > DB Tsai
> > ---
> > My Blog: https://www.dbtsai.com
> > LinkedIn: https://www.linkedin.com/in/dbtsai
>


Re: RFC: varargs in Logging.scala?

2014-04-11 Thread David Hall
Another usage that's nice is:

logDebug {
   val timeS = timeMillis/1000.0
   s"Time: $timeS"
}

which can be useful for more complicated expressions.
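
For reference, a minimal sketch of the by-name signature that makes both
styles lazy (a stand-in for Spark's actual Logging trait, not its code):

trait Logging {
  // Stand-in gate; Spark consults the underlying logger here.
  protected def isDebugEnabled: Boolean = false

  // `msg` is passed by name: the string (or block) is only evaluated
  // when debug logging is actually enabled.
  protected def logDebug(msg: => String): Unit =
    if (isDebugEnabled) println(msg)
}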


On Thu, Apr 10, 2014 at 5:55 PM, Michael Armbrust wrote:

> BTW...
>
> You can do calculations in string interpolation:
> s"Time: ${timeMillis / 1000}s"
>
> Or use format strings.
> f"Float with two decimal places: $floatValue%.2f"
>
> More info:
> http://docs.scala-lang.org/overviews/core/string-interpolation.html
>
>
> On Thu, Apr 10, 2014 at 5:46 PM, Michael Armbrust  >wrote:
>
> > Hi Marcelo,
> >
> > Thanks for bringing this up here, as this has been a topic of debate
> > recently.  Some thoughts below.
> >
> >> ... all of them suffer from the fact that the log message needs to be
> >> built even though it might not be used.
> >>
> >
> > This is not true of the current implementation (and this is actually why
> > Spark has a logging trait instead of just using a logger directly.)
> >
> > If you look at the original function signatures:
> >
> > protected def logDebug(msg: => String) ...
> >
> >
> > The => implies that we are passing the msg by name instead of by value.
> > Under the covers, scala is creating a closure that can be used to
> calculate
> > the log message, only if its actually required.  This does result is a
> > significant performance improvement, but still requires allocating an
> > object for the closure.  The bytecode is really something like this:
> >
> > val logMessage = new Function0() { def call() =  "Log message" +
> someExpensiveComputation() }
> > log.debug(logMessage)
> >
> >
> > In Catalyst and Spark SQL we are using the scala-logging package, which
> > uses macros to automatically rewrite all of your log statements.
> >
> > You write: logger.debug(s"Log message $someExpensiveComputation")
> >
> > You get:
> >
> > if(logger.debugEnabled) {
> >   val logMsg = "Log message" + someExpensiveComputation()
> >   logger.debug(logMsg)
> > }
> >
> > IMHO, this is the cleanest option (and is supported by Typesafe).  Based
> > on a micro-benchmark, it is also the fastest:
> >
> > std logging: 19885.48ms
> > spark logging 914.408ms
> > scala logging 729.779ms
> >
> > Once the dust settles from the 1.0 release, I'd be in favor of
> > standardizing on scala-logging.
> >
> > Michael
> >
>


Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-31 Thread David Hall
On Mon, Mar 31, 2014 at 10:15 AM, Debasish Das wrote:

> I added eclipse support in my qp branch:
>
> https://github.com/debasish83/breeze/tree/qp


Ok, great. Totally fine for this to come in with the QP.


>
>
> For the QP solver I will look into this solver http://www.joptimizer.com/
>
> Right now my plan is to use Professor Boyd's ECOS solver which is also
> designed in the very similar lines but has been tested to solve even cone
> programs...
>
> https://github.com/ifa-ethz/ecos
>
> Any idea whether I should add C native code using jniloader as the first
> version or rewrite using breeze.optimize style and call netlib-java calls
> for native support (ldl, cholesky etc)...
>

I personally don't care much. A year ago, I would have much preferred pure
JVM, and a few months ago I would have just said get it working with
natives, since presumably that's easier. *However*, ECOS is GPL, and
jniloader is LGPL. So, you can't require these dependencies in an Apache
product.

IANAL, but if you want this in Spark, I would suggest you stop looking at
ECOS immediately, at least until you have something that isn't encumbered
that works well enough.



>
> I still have to think how much cone support we will need...In ALS for
> example X^TX = I and Y^Y=I are interesting constraints for
> orthogonality...and they are quadratic constraints...With BFGS and CG, it
> is difficult to handle quadratic constraints...
>


I wonder how our general projected gradient solver would do? Clearly having
dedicated QP support is better, but in terms of just getting it working, it
might be enough.

-- David


>
>
>
> On Sun, Mar 30, 2014 at 4:40 PM, David Hall  wrote:
>
> > On Sun, Mar 30, 2014 at 2:01 PM, Debasish Das  > >wrote:
> >
> > > Hi David,
> > >
> > > I have started to experiment with BFGS solvers for Spark GLM over large
> > > scale data...
> > >
> > > I am also looking to add a good QP solver in breeze that can be used in
> > > Spark ALS for constraint solves...More details on that soon...
> > >
> > > I could not load up breeze 0.7 code onto eclipse... There is a folder
> > > called natives in the master but there is no code in that... all the
> > > code is in src/main/scala...
> > >
> > > I added the eclipse plugin:
> > >
> > > addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.6.0")
> > >
> > > addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.2.0")
> > >
> > > But it seems the project is set to use idea...
> > >
> > > Could you please explain the dev methodology for breeze ? My idea is to
> > do
> > > solver work in breeze as that's the right place and get it into Spark
> > > through Xiangrui's WIP on Sparse data and breeze support...
> > >
> >
> > It would be great to have a QP Solver: I don't know if you know about
> this
> > library: http://www.joptimizer.com/
> >
> > I'm not quite sure what you mean by dev methodology. If you just mean how
> > to get code into Breeze, just send a PR to scalanlp/breeze. Unit tests
> are
> > good for something nontrivial like this. Maybe some basic documentation.
> >
> >
> > >
> > > Thanks.
> > > Deb
> > >
> > >
> > >
> > > On Fri, Mar 7, 2014 at 12:46 AM, DB Tsai  wrote:
> > >
> > > > Hi Xiangrui,
> > > >
> > > > I think it doesn't matter whether we use Fortran/Breeze/RISO for
> > > > optimizers since optimization only takes << 1% of time. Most of the
> > > > time is in gradientSum and lossSum parallel computation.
> > > >
> > > > Sincerely,
> > > >
> > > > DB Tsai
> > > > Machine Learning Engineer
> > > > Alpine Data Labs
> > > > --
> > > > Web: http://alpinenow.com/
> > > >
> > > >
> > > > On Thu, Mar 6, 2014 at 7:10 PM, Xiangrui Meng 
> > wrote:
> > > > > Hi DB,
> > > > >
> > > > > Thanks for doing the comparison! What were the running times for
> > > > > fortran/breeze/riso?
> > > > >
> > > > > Best,
> > > > > Xiangrui
> > > > >
> > > > > On Thu, Mar 6, 2014 at 4:21 PM, DB Tsai 
> > wrote:
> > > > >> Hi David,
> > > > >>
> > > > >> I can converge to the same result with your breeze LBFGS and
> > > > >> Fortran implementations now.

Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-30 Thread David Hall
 
> > >> WARNING: Failed to load implementation from:
> > >> com.github.fommil.netlib.NativeSystemBLAS
> > >> Mar 6, 2014 3:59:11 PM com.github.fommil.netlib.BLAS 
> > >> WARNING: Failed to load implementation from:
> > >> com.github.fommil.netlib.NativeRefBLAS
> > >> Iteration 0: loss 1.3862943611198926, diff 0.0
> > >> Iteration 1: loss 1.5846343143210866, diff 0.14307193024217352
> > >> Iteration 2: loss 1.1242501524477688, diff 0.29053004039012126
> > >> Iteration 3: loss 1.1242501524477688, diff 0.0
> > >> Iteration 4: loss 1.1242501524477688, diff 0.0
> > >> Iteration 5: loss 1.0930151243303563, diff 0.027782962952189336
> > >> Iteration 6: loss 1.0930151243303563, diff 0.0
> > >> Iteration 7: loss 1.0930151243303563, diff 0.0
> > >> Iteration 8: loss 1.054036932835569, diff 0.03566113127440601
> > >> Iteration 9: loss 1.054036932835569, diff 0.0
> > >> Iteration 10: loss 1.054036932835569, diff 0.0
> > >> Iteration 11: loss 0.9907956302751622, diff 0.0507649459571
> > >> Iteration 12: loss 0.9907956302751622, diff 0.0
> > >> Iteration 13: loss 0.9907956302751622, diff 0.0
> > >> Iteration 14: loss 0.9184205380342829, diff 0.07304737423337761
> > >> Iteration 15: loss 0.9184205380342829, diff 0.0
> > >> Iteration 16: loss 0.9184205380342829, diff 0.0
> > >> Iteration 17: loss 0.8259870936519939, diff 0.1006438117513297
> > >> Iteration 18: loss 0.8259870936519939, diff 0.0
> > >> Iteration 19: loss 0.8259870936519939, diff 0.0
> > >> Iteration 20: loss 0.6327447552109576, diff 0.233952934583647
> > >> Iteration 21: loss 0.6327447552109576, diff 0.0
> > >> Iteration 22: loss 0.6327447552109576, diff 0.0
> > >> Iteration 23: loss 0.5534101162436362, diff 0.12538154276652747
> > >> Iteration 24: loss 0.5534101162436362, diff 0.0
> > >> Iteration 25: loss 0.5534101162436362, diff 0.0
> > >> Iteration 26: loss 0.40450200866125635, diff 0.2690732137675816
> > >> Iteration 27: loss 0.40450200866125635, diff 0.0
> > >> Iteration 28: loss 0.40450200866125635, diff 0.0
> > >> Iteration 29: loss 0.30788249908237314, diff 0.23885980452569502
> > >>
> > >> Sincerely,
> > >>
> > >> DB Tsai
> > >> Machine Learning Engineer
> > >> Alpine Data Labs
> > >> --
> > >> Web: http://alpinenow.com/
> > >>
> > >>
> > >> On Wed, Mar 5, 2014 at 2:00 PM, David Hall 
> > wrote:
> > >>> On Wed, Mar 5, 2014 at 1:57 PM, DB Tsai 
> wrote:
> > >>>
> > >>>> Hi David,
> > >>>>
> > >>>> On Tue, Mar 4, 2014 at 8:13 PM, dlwh 
> wrote:
> > >>>> > I'm happy to help fix any problems. I've verified at points that
> the
> > >>>> > implementation gives the exact same sequence of iterates for a few
> > >>>> different
> > >>>> > functions (with a particular line search) as the c port of lbfgs.
> > So I'm
> > >>>> a
> > >>>> > little surprised it fails where Fortran succeeds... but only a
> > little.
> > >>>> This
> > >>>> > was fixed late last year.
> > >>>> I'm working on a reproducible test case using breeze vs fortran
> > >>>> implementation to show the problem I've run into. The test will be
> in
> > >>>> one of the test cases in my Spark fork, is it okay for you to
> > >>>> investigate the issue? Or do I need to make it as a standalone test?
> > >>>>
> > >>>
> > >>>
> > >>> Um, as long as it wouldn't be too hard to pull out.
> > >>>
> > >>>
> > >>>>
> > >>>> Will send you the test later today.
> > >>>>
> > >>>> Thanks.
> > >>>>
> > >>>> Sincerely,
> > >>>>
> > >>>> DB Tsai
> > >>>> Machine Learning Engineer
> > >>>> Alpine Data Labs
> > >>>> --
> > >>>> Web: http://alpinenow.com/
> > >>>>
> >
>


Re: Making RDDs Covariant

2014-03-22 Thread David Hall
On Sat, Mar 22, 2014 at 8:59 AM, Pascal Voitot Dev <
pascal.voitot@gmail.com> wrote:

> The problem I was talking about is when you try to use typeclass converters
> and make them contravariant/covariant for input/output. Something like:
>
> Reader[-I, +O] { def read(i:I): O }
>
> Doing this, you soon have implicit collisions and philosophical concerns
> about what it means to serialize/deserialize a Parent class and a Child
> class...
>


You should (almost) never make a typeclass param contravariant. It's almost
certainly not what you want:

https://issues.scala-lang.org/browse/SI-2509
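
A small sketch of the failure mode, reusing the Reader shape from above (the
exact resolution behavior varies by Scala version, per SI-2509):

trait Reader[-I, +O] { def read(i: I): O }

class Animal
class Dog extends Animal

implicit val animalReader: Reader[Animal, String] =
  new Reader[Animal, String] { def read(a: Animal) = "animal" }
implicit val dogReader: Reader[Dog, String] =
  new Reader[Dog, String] { def read(d: Dog) = "dog" }

def describe[I](i: I)(implicit r: Reader[I, String]): String = r.read(i)

// Contravariance makes Reader[Animal, String] a subtype of
// Reader[Dog, String], so implicit search treats the Animal instance as
// "more specific": describe(new Dog) picks "animal" (or is ambiguous),
// not the "dog" you'd expect.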

-- David


Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-06 Thread David Hall
On Thu, Mar 6, 2014 at 4:21 PM, DB Tsai  wrote:

> Hi David,
>
> I can converge to the same result with your breeze LBFGS and Fortran
> implementations now. Probably, I made some mistakes when I tried
> breeze before. I apologize that I claimed it's not stable.
>
> See the test case in BreezeLBFGSSuite.scala
> https://github.com/AlpineNow/spark/tree/dbtsai-breezeLBFGS
>
> This is training multinomial logistic regression against iris dataset,
> and both optimizers can train the models with 98% training accuracy.
>

great to hear! There were some bugs in LBFGS about 6 months ago, so
depending on the last time you tried it, it might indeed have been bugged.


>
> There are two issues to use Breeze in Spark,
>
> 1) When the gradientSum and lossSum are computed distributively in
> custom defined DiffFunction which will be passed into your optimizer,
> Spark will complain LBFGS class is not serializable. In
> BreezeLBFGS.scala, I've to convert RDD to array to make it work
> locally. It should be easy to fix by just having LBFGS to implement
> Serializable.
>

I'm not sure why Spark should be serializing LBFGS? Shouldn't it live on
the controller node? Or is this a per-node thing?

But no problem to make it serializable.


>
> 2) Breeze computes redundant gradient and loss. See the following log
> from both Fortran and Breeze implementations.
>

Err, yeah. I should probably have LBFGS do this automatically, but there's
a CachedDiffFunction that gets rid of the redundant calculations.

-- David


>
> Thanks.
>
> Fortran:
> Iteration -1: loss 1.3862943611198926, diff 1.0
> Iteration 0: loss 1.5846343143210866, diff 0.14307193024217352
> Iteration 1: loss 1.1242501524477688, diff 0.29053004039012126
> Iteration 2: loss 1.0930151243303563, diff 0.027782962952189336
> Iteration 3: loss 1.054036932835569, diff 0.03566113127440601
> Iteration 4: loss 0.9907956302751622, diff 0.0507649459571
> Iteration 5: loss 0.9184205380342829, diff 0.07304737423337761
> Iteration 6: loss 0.8259870936519937, diff 0.10064381175132982
> Iteration 7: loss 0.6327447552109574, diff 0.23395293458364716
> Iteration 8: loss 0.5534101162436359, diff 0.1253815427665277
> Iteration 9: loss 0.4045020086612566, diff 0.26907321376758075
> Iteration 10: loss 0.3078824990823728, diff 0.23885980452569627
>
> Breeze:
> Iteration -1: loss 1.3862943611198926, diff 1.0
> Mar 6, 2014 3:59:11 PM com.github.fommil.netlib.BLAS 
> WARNING: Failed to load implementation from:
> com.github.fommil.netlib.NativeSystemBLAS
> Mar 6, 2014 3:59:11 PM com.github.fommil.netlib.BLAS 
> WARNING: Failed to load implementation from:
> com.github.fommil.netlib.NativeRefBLAS
> Iteration 0: loss 1.3862943611198926, diff 0.0
> Iteration 1: loss 1.5846343143210866, diff 0.14307193024217352
> Iteration 2: loss 1.1242501524477688, diff 0.29053004039012126
> Iteration 3: loss 1.1242501524477688, diff 0.0
> Iteration 4: loss 1.1242501524477688, diff 0.0
> Iteration 5: loss 1.0930151243303563, diff 0.027782962952189336
> Iteration 6: loss 1.0930151243303563, diff 0.0
> Iteration 7: loss 1.0930151243303563, diff 0.0
> Iteration 8: loss 1.054036932835569, diff 0.03566113127440601
> Iteration 9: loss 1.054036932835569, diff 0.0
> Iteration 10: loss 1.054036932835569, diff 0.0
> Iteration 11: loss 0.9907956302751622, diff 0.0507649459571
> Iteration 12: loss 0.9907956302751622, diff 0.0
> Iteration 13: loss 0.9907956302751622, diff 0.0
> Iteration 14: loss 0.9184205380342829, diff 0.07304737423337761
> Iteration 15: loss 0.9184205380342829, diff 0.0
> Iteration 16: loss 0.9184205380342829, diff 0.0
> Iteration 17: loss 0.8259870936519939, diff 0.1006438117513297
> Iteration 18: loss 0.8259870936519939, diff 0.0
> Iteration 19: loss 0.8259870936519939, diff 0.0
> Iteration 20: loss 0.6327447552109576, diff 0.233952934583647
> Iteration 21: loss 0.6327447552109576, diff 0.0
> Iteration 22: loss 0.6327447552109576, diff 0.0
> Iteration 23: loss 0.5534101162436362, diff 0.12538154276652747
> Iteration 24: loss 0.5534101162436362, diff 0.0
> Iteration 25: loss 0.5534101162436362, diff 0.0
> Iteration 26: loss 0.40450200866125635, diff 0.2690732137675816
> Iteration 27: loss 0.40450200866125635, diff 0.0
> Iteration 28: loss 0.40450200866125635, diff 0.0
> Iteration 29: loss 0.30788249908237314, diff 0.23885980452569502
>
> Sincerely,
>
> DB Tsai
> Machine Learning Engineer
> Alpine Data Labs
> --
> Web: http://alpinenow.com/
>
>
> On Wed, Mar 5, 2014 at 2:00 PM, David Hall  wrote:
> > On Wed, Mar 5, 2014 at 1:57 PM, DB Tsai  wrote:
> >
> >> Hi David,
> >>
> >> On Tue, Mar 4, 2014 at 8:13 PM, dlwh  wrote:

Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-05 Thread David Hall
I did not. They would be nice to have.


On Wed, Mar 5, 2014 at 5:21 PM, Debasish Das wrote:

> David,
>
> There used to be standard BFGS testcases in Professor Nocedal's
> package...did you stress test the solver with them ?
>
> If not I will shoot him an email for them.
>
> Thanks.
> Deb
>
>
>
> On Wed, Mar 5, 2014 at 2:00 PM, David Hall  wrote:
>
> > On Wed, Mar 5, 2014 at 1:57 PM, DB Tsai  wrote:
> >
> > > Hi David,
> > >
> > > On Tue, Mar 4, 2014 at 8:13 PM, dlwh  wrote:
> > > > I'm happy to help fix any problems. I've verified at points that the
> > > > implementation gives the exact same sequence of iterates for a few
> > > different
> > > > functions (with a particular line search) as the c port of lbfgs. So
> > I'm
> > > a
> > > > little surprised it fails where Fortran succeeds... but only a
> little.
> > > This
> > > > was fixed late last year.
> > > I'm working on a reproducible test case using breeze vs fortran
> > > implementation to show the problem I've run into. The test will be in
> > > one of the test cases in my Spark fork, is it okay for you to
> > > investigate the issue? Or do I need to make it as a standalone test?
> > >
> >
> >
> > Um, as long as it wouldn't be too hard to pull out.
> >
> >
> > >
> > > Will send you the test later today.
> > >
> > > Thanks.
> > >
> > > Sincerely,
> > >
> > > DB Tsai
> > > Machine Learning Engineer
> > > Alpine Data Labs
> > > --
> > > Web: http://alpinenow.com/
> > >
> >
>


Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-05 Thread David Hall
On Wed, Mar 5, 2014 at 1:57 PM, DB Tsai  wrote:

> Hi David,
>
> On Tue, Mar 4, 2014 at 8:13 PM, dlwh  wrote:
> > I'm happy to help fix any problems. I've verified at points that the
> > implementation gives the exact same sequence of iterates for a few
> different
> > functions (with a particular line search) as the c port of lbfgs. So I'm
> a
> > little surprised it fails where Fortran succeeds... but only a little.
> This
> > was fixed late last year.
> I'm working on a reproducible test case using breeze vs fortran
> implementation to show the problem I've run into. The test will be in
> one of the test cases in my Spark fork, is it okay for you to
> investigate the issue? Or do I need to make it as a standalone test?
>


Um, as long as it wouldn't be too hard to pull out.


>
> Will send you the test later today.
>
> Thanks.
>
> Sincerely,
>
> DB Tsai
> Machine Learning Engineer
> Alpine Data Labs
> --
> Web: http://alpinenow.com/
>


Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-05 Thread David Hall
On Wed, Mar 5, 2014 at 8:50 AM, Debasish Das wrote:

> Hi David,
>
> Few questions on breeze solvers:
>
> 1. I feel the right place of adding useful things from RISO LBFGS (based on
> Professor Nocedal's fortran code) will be breeze. It will involve stress
> testing breeze LBFGS on large sparse datasets and contributing fixes to
> existing breeze LBFGS with the learning from RISO LBFGS.
>
> You agree on that right ?
>

That would be great.

>
> 2. Normally for doing experiments like 1, I fix the line search. What's
> your preferred line search in breeze BFGS ? I will also use that.
>
> Moré-Thuente and strong Wolfe with backtracking helped me in the past.
>

The default is a Strong Wolfe search:
https://github.com/scalanlp/breeze/blob/master/src/main/scala/breeze/optimize/StrongWolfe.scala


>
> 3. The paper that you sent also says L-BFGS-B is better on box
> constraints. I feel it's worthwhile to have both solvers because many
> practical problems need box constraints or complex constraints can be
> reformulated in the realm of unconstrained and box constraints.


> Example use-cases for me are automatic feature extraction from photo/video
> frames using factorization.
>
> I will compare L-BFGS-B vs constrained QN method that you have in Breeze
> within an analysis similar to 1.
>

Great. Yeah I fully expect l-bfgs-b to be better on box constraints. It
turns out that this other method looked easier to implement and gave
reasonably good results even with box constraints.


>
> 4. Do you have a road-map on adding CG solvers in breeze ? Linear CG solver
> to compare with BLAS posv seems like a good usecase for me in mllib ALS.
> DB sent a paper by Professor Ng which shows the effectiveness of CG and
> BFGS over SGD in the email chain.
>

We have a linear CG solver (
https://github.com/scalanlp/breeze/blob/master/src/main/scala/breeze/optimize/linear/ConjugateGradient.scala)
which is used in the Truncated Newton solver. (
https://github.com/scalanlp/breeze/blob/master/src/main/scala/breeze/optimize/TruncatedNewtonMinimizer.scala)
But I've not implemented nonlinear CG, and honestly hadn't planned on it,
per below.


>
> I believe on non-convex problems like Matrix Factorization, CG family might
> converge to a better solution than BFGS. No way to know till we experiment
> on the datasets :-)
>

I've never had good experience with CG, and a vision colleague of mine
found that LBFGS was better on his (non-convex) stuff, but I could be
easily persuaded.

Thanks!

-- David


>
> Thanks.
> Deb
>
>
>
> On Tue, Mar 4, 2014 at 8:13 PM, dlwh  wrote:
>
> > Just subscribing to this list, so apologies for quoting weirdly and any
> > other
> > etiquette offenses.
> >
> >
> > DB Tsai wrote
> > > Hi Deb,
> > >
> > > I had tried breeze L-BFGS algorithm, and when I tried it couple weeks
> > > ago, it's not as stable as the fortran implementation. I guessed the
> > > problem is in the line search related thing. Since we may bring breeze
> > > dependency for the sparse format support as you pointed out, we can
> > > just try to fix the L-BFGS in breeze, and we can get OWL-QN and
> > > L-BFGS-B.
> > >
> > > What do you think?
> >
> > I'm happy to help fix any problems. I've verified at points that the
> > implementation gives the exact same sequence of iterates for a few
> > different
> > functions (with a particular line search) as the c port of lbfgs. So I'm
> a
> > little surprised it fails where Fortran succeeds... but only a little.
> This
> > was fixed late last year.
> >
> > OWL-QN seems to mostly be stable, but probably deserves more testing.
> > Presumably it has whatever defects my LBFGS does. (It's really pretty
> > straightforward to implement given an L-BFGS)
> >
> > We don't provide an L-BFGS-B implementation. We do have a more general
> > constrained qn method based on
> > http://jmlr.org/proceedings/papers/v5/schmidt09a/schmidt09a.pdf (which
> > uses
> > a L-BFGS type update as part of the algorithm). From the experiments in
> > their paper, it's likely to not work as well for bound constraints, but
> can
> > do things that lbfgsb can't.
> >
> > Again, let me know what I can help with.
> >
> > -- David Hall
> >
> >
> > On Mon, Mar 3, 2014 at 3:52 PM, DB Tsai <dbtsai@> wrote:
> > > Hi Deb,
> > >
> > >> a.  OWL-QN for solving L1 natively in BFGS
> > > Based on what I saw from
> > >
> >
> https://github.com/tjhunter/scalanlp-core/blob/master/learn/s