Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-29 Thread David Hall
Yeah, that's probably the easiest, though obviously pretty hacky. I'm surprised that the Hessian approximation isn't worse than it is. (As in, I'd expect error messages.) It's obviously line searching much more, so the approximation must be worse. You might be interested in this online BFGS:

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-29 Thread DB Tsai
Yeah, the Hessian approximation in LBFGS isn't stateless; it depends on the previous LBFGS step, as Xiangrui also pointed out. It's surprising that it works without error messages. I also saw the loss fluctuating like SGD during training. We will remove the miniBatch mode in LBFGS in
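(For context: L-BFGS builds its inverse-Hessian approximation from the recent update pairs s_k = x_{k+1} - x_k and y_k = ∇f(x_{k+1}) - ∇f(x_k). If a different miniBatch changes f between evaluations, y_k subtracts gradients of two different objectives, so the curvature pairs become inconsistent; hence the SGD-like fluctuation in the loss.)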

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-28 Thread DB Tsai
Also, how many rejection failures will terminate the optimization process? How is that related to numberOfImprovementFailures? Thanks. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai On

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-28 Thread David Hall
That's right. FWIW, caching should be automatic now, but it might be that the version of Breeze you're using doesn't do that yet. Also, in breeze.util._ there's an implicit that adds a tee method to Iterator, and also a last method. Both are useful for things like this. -- David On Sun, Apr 27,
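A minimal sketch of what that enables; the exact import (breeze.util.Implicits._) and the tee/last enrichment are assumed from David's description, so treat them as such:

    import breeze.linalg.DenseVector
    import breeze.optimize.{DiffFunction, LBFGS}
    import breeze.util.Implicits._   // assumed home of the tee/last enrichment

    val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 7)

    // toy convex objective f(x) = ||x - 3||^2 with gradient 2(x - 3)
    val f = new DiffFunction[DenseVector[Double]] {
      def calculate(x: DenseVector[Double]) = {
        val d = x - 3.0
        (d dot d, d * 2.0)
      }
    }

    // tee logs every accepted state; last drains the iterator to the final state
    val finalState = lbfgs.iterations(f, DenseVector.zeros[Double](5))
      .tee(s => println(s"iter ${s.iter}: loss ${s.value}"))
      .last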

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-28 Thread DB Tsai
Hi David, I got most of the stuff working, and the loss is monotonically decreasing now that I get the history from the iterator of states. However, in the costFun, I need to know which iteration it is for the miniBatch, which means that for one iteration, if the optimizer calls costFun several times for line
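A sketch of the mismatch (illustrative, not the benchmark's actual costFun): the optimizer may evaluate the cost function several times per iteration during line search, so a call counter inside calculate over-counts iterations:

    import breeze.linalg.DenseVector
    import breeze.optimize.DiffFunction

    var evals = 0   // counts every call, including line-search probes
    val costFun = new DiffFunction[DenseVector[Double]] {
      def calculate(w: DenseVector[Double]) = {
        evals += 1          // NOT the accepted-iteration number
        val d = w - 1.0     // stand-in loss ||w - 1||^2
        (d dot d, d * 2.0)
      }
    }

The accepted-iteration count is only visible from the outside, via the iter field of the states iterator, not from inside calculate.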

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-25 Thread David Hall
LBFGS will not take a step that sends the objective value up. It might try a step that is too big and reject it, so if you're just logging everything that gets tried by LBFGS, you could see that. The iterations method of the minimizer should never return an increasing objective value. If you're

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-25 Thread Tom Vacek
I don't know about Spark's implementation, but with LBFGS, there is a line search step. Since computing the line search takes roughly the same work as one iteration, an efficient implementation will take a full step and simultaneously compute the gradient for the next step and check if the update
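For concreteness, a bare-bones backtracking line search with the Armijo sufficient-decrease test (a generic sketch, not Spark's or Breeze's actual implementation); note each probe costs a full objective evaluation, which is Tom's point:

    import breeze.linalg.DenseVector

    def backtrack(f: DenseVector[Double] => Double,
                  x: DenseVector[Double],
                  grad: DenseVector[Double],   // gradient at x
                  dir: DenseVector[Double],    // descent direction, e.g. -grad
                  c1: Double = 1e-4): Double = {
      val fx = f(x)
      val slope = grad dot dir                 // negative for a descent direction
      var alpha = 1.0
      // shrink the step until f decreases "enough" (Armijo condition)
      while (f(x + dir * alpha) > fx + c1 * alpha * slope)
        alpha /= 2
      alpha
    }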

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-24 Thread Xiangrui Meng
I don't think it is easy to make sparse faster than dense at this sparsity and feature dimension. You can try rcv1.binary, which should show the difference easily. David, the Breeze operators used here are (1) DenseVector dot SparseVector and (2) axpy on a DenseVector and a SparseVector. However, the
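Those two operations in isolation, using Breeze's public API (handy if someone wants to microbenchmark them directly):

    import breeze.linalg.{axpy, DenseVector, SparseVector}

    val w = DenseVector.rand(123)            // dense weights
    val x = SparseVector.zeros[Double](123)  // sparse feature vector
    x(3) = 1.0; x(42) = 0.5

    val margin = w dot x                     // 1. DenseVector dot SparseVector
    axpy(0.01, x, w)                         // 2. w += 0.01 * x, SparseVector into DenseVector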

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-24 Thread Xiangrui Meng
Hi DB, I saw you are using yarn-cluster mode for the benchmark. I tested yarn-cluster mode and found that YARN does not always give you the exact number of executors requested. Just want to confirm that you've checked the number of executors. The second thing to check is that in the
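One way to sanity-check this from the driver (e.g. in spark-shell, where sc is the SparkContext; the storage-status call is the 1.0-era developer API quoted from memory, so treat it as an assumption):

    // each registered executor (plus the driver itself) reports storage status
    val numExecutors = sc.getExecutorStorageStatus.length - 1  // minus the driver
    println(s"executors actually running: $numExecutors")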

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-24 Thread DB Tsai
Hi Xiangrui, Yes, I'm using yarn-cluster mode, and I did check that the number of executors I specified matches the number actually running. For caching and materialization, I start the timer in the optimizer after calling count(); as a result, the time to materialize the cache isn't included in the benchmark.
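That is, roughly this shape (a sketch; runOptimizer is a hypothetical stand-in for whatever call is being benchmarked):

    val cached = data.cache()
    cached.count()                        // force materialization before timing
    val start = System.nanoTime()
    val weights = runOptimizer(cached)    // hypothetical benchmarked call
    val seconds = (System.nanoTime() - start) / 1e9
    println(s"optimization took $seconds s (cache fill excluded)")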

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-24 Thread Xiangrui Meng
I don't understand why sparse falls behind dense so much at the very first iteration. I didn't see count() called in https://github.com/dbtsai/spark-lbfgs-benchmark/blob/master/src/main/scala/org/apache/spark/mllib/benchmark/BinaryLogisticRegression.scala . Maybe you have local uncommitted

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-24 Thread DB Tsai
I'm starting the timer in runMiniBatchSGD after val numExamples = data.count(). See the following. Running the rcv1 dataset now, and will update soon.

    val startTime = System.nanoTime()
    for (i <- 1 to numIterations) {
      // Sample a subset (fraction miniBatchFraction) of the total data
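      // (sketch of the rest of the loop body; illustrative, not the exact
      //  MLlib source. gradient(w, label, x) is a hypothetical helper
      //  returning the per-example (gradient, loss) as Breeze values)
      val batch = data.sample(false, miniBatchFraction, 42 + i)
      val w = weights   // stable snapshot captured by the closure
      val (gradSum, lossSum) = batch
        .map { case (label, x) => gradient(w, label, x) }
        .reduce { case ((g1, l1), (g2, l2)) => (g1 + g2, l1 + l2) }
      val n = batch.count().toDouble
      weights = weights - gradSum * (stepSize / math.sqrt(i) / n)  // decaying step
    }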

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-24 Thread DB Tsai
rcv1.binary is too sparse (0.15% non-zero elements), so the dense format won't run; it goes out of memory. But the sparse format runs really well. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai
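Back-of-envelope on why (the rcv1.binary dimensionality is quoted from memory, so treat the numbers as assumptions):

    val features  = 47236                            // rcv1.binary dimensionality (from memory)
    val denseRow  = features * 8L                    // ~378 KB per example as doubles
    val sparseRow = (features * 0.0015).toInt * 12L  // ~0.84 KB at 0.15% density (8B value + 4B index)
    val blowup    = denseRow.toDouble / sparseRow    // roughly 450x more memory for dense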

MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread DB Tsai
Hi all, I'm benchmarking Logistic Regression in MLlib using the newly added LBFGS optimizer and GD. I'm using the same dataset and the same methodology as in this paper, http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf I want to know how Spark scales while adding workers, and how the optimizers and input
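The LBFGS side of such a benchmark looks roughly like this (a sketch against the 1.0-era MLlib optimizer API; the exact runLBFGS argument list is from memory, so double-check it against the source):

    import org.apache.spark.mllib.optimization.{LBFGS, LogisticGradient, SquaredL2Updater}

    // data: RDD[(Double, Vector)] of (label, features); initialWeights: Vector
    val (weights, lossHistory) = LBFGS.runLBFGS(
      data,
      new LogisticGradient(),
      new SquaredL2Updater(),
      10,     // numCorrections (history size m)
      1e-4,   // convergenceTol
      100,    // maxNumIterations
      0.0,    // regParam
      initialWeights)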

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread Evan Sparks
What is the number of non-zeros per row (and the number of features) in the sparse case? We've hit some issues with Breeze sparse support in the past, but for sufficiently sparse data it's still pretty good. On Apr 23, 2014, at 9:21 PM, DB Tsai dbt...@stanford.edu wrote: Hi all, I'm

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread Evan Sparks
Sorry - just saw the 11% number. That is around the spot where dense data is usually faster (blocking, cache coherence, etc.). Is there any chance you have a 1% (or so) sparse dataset to experiment with? On Apr 23, 2014, at 9:21 PM, DB Tsai dbt...@stanford.edu wrote: Hi all, I'm

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread Shivaram Venkataraman
I don't think the attachment came through on the list. Could you upload the results somewhere and link to them? On Wed, Apr 23, 2014 at 9:32 PM, DB Tsai dbt...@dbtsai.com wrote: 123 features per row, and on average, 89% are zeros. On Apr 23, 2014 9:31 PM, Evan Sparks evan.spa...@gmail.com

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread DB Tsai
Any suggestions for a sparser dataset? Will test more tomorrow in the office. On Apr 23, 2014 9:33 PM, Evan Sparks evan.spa...@gmail.com wrote: Sorry - just saw the 11% number. That is around the spot where dense data is usually faster (blocking, cache coherence, etc.). Is there any chance you have

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread David Hall
On Wed, Apr 23, 2014 at 9:30 PM, Evan Sparks evan.spa...@gmail.com wrote: What is the number of non-zeros per row (and the number of features) in the sparse case? We've hit some issues with Breeze sparse support in the past, but for sufficiently sparse data it's still pretty good. Any chance you

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread DB Tsai
The figure showing the Log-Likelihood vs Time can be found here. https://github.com/dbtsai/spark-lbfgs-benchmark/raw/fd703303fb1c16ef5714901739154728550becf4/result/a9a11M.pdf Let me know if you cannot open it. Sincerely, DB Tsai --- My

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread David Hall
Was the weight vector sparse? The gradients? Or just the feature vectors? On Wed, Apr 23, 2014 at 10:08 PM, DB Tsai dbt...@dbtsai.com wrote: The figure showing the Log-Likelihood vs Time can be found here.

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread DB Tsai
PS: it doesn't make sense to have sparse weights and gradients unless there's a strong L1 penalty. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai On Wed, Apr 23, 2014 at 10:17 PM, DB Tsai
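The reason in one line of the logistic gradient: each example contributes a scaled copy of its (sparse) feature vector, but the sum over examples, and hence the weights, fill in. A sketch:

    import breeze.linalg.{axpy, DenseVector, SparseVector}

    // per-example logistic-loss gradient: (sigmoid(w.x) - y) * x
    def addGradient(w: DenseVector[Double], x: SparseVector[Double], y: Double,
                    cumGradient: DenseVector[Double]): Unit = {
      val margin = w dot x
      val multiplier = 1.0 / (1.0 + math.exp(-margin)) - y
      // sparse contribution, but the accumulator (and eventually w) goes dense
      axpy(multiplier, x, cumGradient)
    }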

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread David Hall
On Wed, Apr 23, 2014 at 10:18 PM, DB Tsai dbt...@dbtsai.com wrote: PS: it doesn't make sense to have sparse weights and gradients unless there's a strong L1 penalty. Sure, I was just checking the obvious things. Have you run it through a profiler to see where the problem is? Sincerely, DB

Re: MLlib - logistic regression with GD vs LBFGS, sparse vs dense benchmark result

2014-04-23 Thread DB Tsai
Not yet, since it's running on the cluster. Will run locally with a profiler. Thanks for the help. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai On Wed, Apr 23, 2014 at 10:22 PM, David Hall
