Re: Eliminate copy while sending data : any Akka experts here ?

2014-07-04 Thread Mridul Muralidharan
In our clusters, the number of containers we can get is high but memory per container is low, which is why avg_nodes_not_hosting data is rarely zero for ML tasks :-) To update: to unblock our current implementation efforts, we went with broadcast, since it is intuitively easier and a minimal change;

Re: PLSA

2014-07-04 Thread Denis Turdakov
Hi, Deb. I don't quite understand the question. PLSA is an instance of a matrix factorization problem. If you are asking about the inference algorithm, we use the EM algorithm. A description of this approach is available, for example, here: http://www.machinelearning.ru/wiki/images/1/1f/Voron14aist.pdf Best,

Re: PLSA

2014-07-04 Thread Debasish Das
Thanks for the pointer... Looks like you are using the EM algorithm for factorization, which looks similar to multiplicative update rules. Do you think that, using mllib ALS implicit feedback, you can scale the problem further? We can handle L1, L2, equality and positivity constraints in ALS now...As long

Re: Constraint Solver for Spark

2014-07-04 Thread Debasish Das
I looked further and realized that ECOS uses a MEX file while PDCO uses pure MATLAB code, so the out-of-the-box runtime comparison is not fair. I am trying to generate a C port of PDCO. Like ECOS, PDCO also makes use of sparse support from Tim Davis. Thanks. Deb

Invalid link for Spark 1.0.0 in Official Web Site

2014-07-04 Thread Kousuke Saruta
Hi, I found there is an invalid link in http://spark.apache.org/downloads.html . The link for the release notes of Spark 1.0.0 points to http://spark.apache.org/releases/spark-release-1.0.0.html but this link is invalid. I think that is a mistake for

[RESULT] [VOTE] Release Apache Spark 1.0.1 (RC1)

2014-07-04 Thread Patrick Wendell
This vote is cancelled in favor of RC2. Thanks to everyone who voted. On Sun, Jun 29, 2014 at 11:23 PM, Andrew Ash and...@andrewash.com wrote: Ok that's reasonable -- it's certainly more of an enhancement than a critical bug-fix. I would like to get this in for 1.1.0 though, so let's talk

[VOTE] Release Apache Spark 1.0.1 (RC2)

2014-07-04 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.0.1! The tag to be voted on is v1.0.1-rc1 (commit 7d1043c): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=7d1043c99303b87aef8ee19873629c2bfba4cc78 The release files, including signatures, digests, etc.

2nd Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2)

2014-07-04 Thread Mattmann, Chris A (3980)
(apologies for cross-posting) 2nd Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2) http://wssspe.researchcomputing.org.uk/wssspe2/ (to be held in conjunction with SC14, Sunday, 16 November 2014, New Orleans, LA, USA) Progress in scientific research is dependent on