[mllib] Share the simple benchmark result about the cast cost from Spark vector to Breeze vector

2014-10-15 Thread Yu Ishikawa
Hi all, I wondered whether the cost of casting Spark Vectors to Breeze vectors is high or low. So I benchmarked simple operations (addition, multiplication and division) on RDD[Vector] and RDD[BV[Double]]. I am sharing the simple benchmark result with you. In conclusion, the cast cost was lower than I

Re: Unit testing Master-Worker Message Passing

2014-10-15 Thread Matthew Cheah
What's happening when I do this is that the Worker tries to get the Master actor by calling context.actorSelection(), and the RegisterWorker message gets sent to the dead letters mailbox instead of being picked up by expectMsg. I'm new to Akka and I've tried various ways of registering a mock

short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread shane knapp
i'm going to be downgrading our git plugin (from 2.2.7 to 2.2.2) to see if that helps w/the git fetch timeouts. this will require a short downtime (~20 mins for builds to finish, ~20 mins to downgrade), and will hopefully give us some insight into wtf is going on. thanks for your patience...

Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread Nicholas Chammas
I support this effort. :thumbsup: On Wed, Oct 15, 2014 at 4:52 PM, shane knapp skn...@berkeley.edu wrote: i'm going to be downgrading our git plugin (from 2.2.7 to 2.2.2) to see if that helps w/the git fetch timeouts. this will require a short downtime (~20 mins for builds to finish, ~20

Re: Unit testing Master-Worker Message Passing

2014-10-15 Thread Chester Chen
You can call the resolveOne() method on ActorSelection to see if the actor is still there or the path is correct. The method returns a future and you can wait for it with a timeout. This way, you know whether the actor is live, already dead, or the path is incorrect. Another way is to send an Identify message to

Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread shane knapp
ok, we're up and building... :crossesfingersfortheumpteenthtime: On Wed, Oct 15, 2014 at 1:59 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: I support this effort. :thumbsup: On Wed, Oct 15, 2014 at 4:52 PM, shane knapp skn...@berkeley.edu wrote: i'm going to be downgrading our

Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread shane knapp
four builds triggered and no timeouts. :crossestoes: :) On Wed, Oct 15, 2014 at 2:19 PM, shane knapp skn...@berkeley.edu wrote: ok, we're up and building... :crossesfingersfortheumpteenthtime: On Wed, Oct 15, 2014 at 1:59 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: I

Re: Unit testing Master-Worker Message Passing

2014-10-15 Thread Matthew Cheah
I think on a higher level I also want to ask why such unit testing has not actually been done in this codebase. If it's not a common practice to test message passing then I'm fine with leaving out the unit test; however, I'm more curious as to why such testing was not done before. On Wed, Oct 15,

Re: Unit testing Master-Worker Message Passing

2014-10-15 Thread Josh Rosen
There are some end-to-end integration tests of Master - Worker fault-tolerance in  https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala I’ve actually been working to develop a more generalized Docker-based integration-testing framework

Re: Unit testing Master-Worker Message Passing

2014-10-15 Thread Matthew Cheah
Thanks Josh! These tests seem to cover the cases I'm looking for already =). What's interesting though is that we still ran into SPARK-3736 despite such integration tests being in place to catch it - specifically, the case when the master disconnects and reconnects, the workers should reconnect

Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread shane knapp
ok, we've had about 10 spark pull request builds go through w/o any git timeouts. it seems that the git timeout issue might be licked. i will definitely be keeping an eye on this for the next few days. thanks for being patient! shane On Wed, Oct 15, 2014 at 2:27 PM, shane knapp

Issues with ALS positive definite

2014-10-15 Thread Debasish Das
Hi, If I take the Movielens data and run the default ALS with regularization as 0.0, I am hitting an exception from LAPACK that the gram matrix is not positive definite. This is on the master branch. This is how I run it: ./bin/spark-submit --total-executor-cores 1 --master spark://

Re: Issues with ALS positive definite

2014-10-15 Thread Liquan Pei
Hi Debasish, I think ||r - wi'hj||^{2} is positive semi-definite. Thanks, Liquan On Wed, Oct 15, 2014 at 4:57 PM, Debasish Das debasish.da...@gmail.com wrote: Hi, If I take the Movielens data and run the default ALS with regularization as 0.0, I am hitting an exception from LAPACK that the

Re: Issues with ALS positive definite

2014-10-15 Thread Debasish Das
But do you expect the mllib code to fail if I run with 0.0 regularization? I think ||r - wi'hj||^{2} is positive definite... It can become positive semi-definite only if there are dependent rows in the matrix... @sean is that right? We had this discussion before as well... On Wed, Oct 15,
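
To illustrate the question under discussion, here is a minimal sketch in plain Python. It is not MLlib's code: the helper names gram, cholesky and is_positive_definite and the factor values are made up for illustration. It assumes each per-user solve uses a Gram matrix of the form H'H + lambda * I, where the rows of H are the factors of the items that user rated. H'H is always positive semi-definite, and it is positive definite only when the columns of H are linearly independent. With lambda = 0.0, a Cholesky factorization can therefore fail when a user has fewer ratings than the rank, or when the factor rows are dependent:

```python
import math


def gram(rows, lam=0.0):
    """Return H'H + lam * I, where `rows` are the rows of H (hypothetical helper)."""
    k = len(rows[0])
    g = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    for i in range(k):
        g[i][i] += lam
    return g


def cholesky(m):
    """Textbook Cholesky factorization; raises ValueError if m is not positive definite."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][p] * low[j][p] for p in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def is_positive_definite(m):
    """True if the Cholesky factorization of m succeeds."""
    try:
        cholesky(m)
        return True
    except ValueError:
        return False


# Made-up rank-2 item factors, chosen so the arithmetic is exact in floating point.
dependent = [[3.0, 6.0], [4.0, 8.0]]    # second column = 2 * first column
independent = [[1.0, 2.0], [2.0, 1.0]]  # linearly independent columns
one_rating = [[1.0, 2.0]]               # fewer ratings (1) than the rank (2)

for name, h in [("dependent", dependent),
                ("independent", independent),
                ("one_rating", one_rating)]:
    print(name,
          "lambda=0.0:", is_positive_definite(gram(h)),
          "lambda=0.1:", is_positive_definite(gram(h, lam=0.1)))
```

Any lambda > 0 adds lambda to every eigenvalue of H'H, so the regularized matrix is always positive definite; this is why the factorization goes through in the sketch once regularization is non-zero.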

Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread Nicholas Chammas
A quick scan through the Spark PR board https://spark-prs.appspot.com/ shows no recent failures related to this git checkout problem. Looks promising! Nick On Wed, Oct 15, 2014 at 6:10 PM, shane knapp skn...@berkeley.edu wrote: ok, we've had about 10 spark pull request builds go through w/o