Re: [VOTE] Release Apache Spark 1.0.1 (RC1)

2014-06-30 Thread Andrew Ash
OK, that's reasonable -- it's certainly more of an enhancement than a critical bug-fix. I would like to get this in for 1.1.0 though, so let's talk through the right way to do that on the PR. In the meantime, the best alternative is running with lax firewall settings, which can be somewhat

Contributing to MLlib

2014-06-30 Thread salexln
Hi guys, I'm new to Spark MLlib and this may be a dumb question, but still: as part of my M.Sc. project, I'm working on an implementation of the Fuzzy C-means (FCM) algorithm in MLlib. FCM has many things in common with the K-Means algorithm, which is already implemented, and I wanted to know whether

Re: Application level progress monitoring and communication

2014-06-30 Thread Chester Chen
Reynold, thanks for the reply. It's true, this is more about YARN communication than Spark. But this is a general enough problem for all YARN_CLUSTER mode applications. I thought I would just reach out to the community. If we choose to use the Akka solution, then this is related to Spark, as there

Re: Eliminate copy while sending data : any Akka experts here ?

2014-06-30 Thread Aaron Davidson
I don't know of any way to avoid Akka doing a copy, but I would like to mention that it's on the priority list to piggy-back only the map statuses relevant to a particular map task on the task itself, thus reducing the total amount of data sent over the wire by a factor of N for N physical

RE: Artificial Neural Network in Spark?

2014-06-30 Thread Bert Greevenbosch
Hi Debasish, Alexander, all, Indeed I found the OpenDL project through the Powered by Spark page. I'll need some time to look into the code, but at first sight it looks quite well-developed. I'll contact the author about this too. My own implementation (in Scala) works for multiple inputs

Re: Contributing to MLlib on GLM

2014-06-30 Thread Gang Bai
Thanks Xiaokai, I’ve created a pull request to merge the features in my PR into your repo. Please review it here: https://github.com/xwei-datageek/spark/pull/2 . As for GLMs, here at Sina, we are solving the problem of predicting the number of visitors who read a particular news article or watch an

Re: Artificial Neural Network in Spark?

2014-06-30 Thread Debasish Das
I will let Xiangrui comment on the PR process for adding the code to MLlib, but I would love to look into your initial version if you push it to GitHub... As far as I remember, Quoc got his best ANN results using the back-propagation algorithm, solved using CG... do you have those features or you are