Ok that's reasonable -- it's certainly more of an enhancement than a
critical bug-fix. I would like to get this in for 1.1.0 though, so let's
talk through the right way to do that on the PR.
In the meantime the best alternative is running with lax firewall settings,
which can be somewhat
Hi guys,
I'm new to Spark MLlib and this may be a dumb question, but still
As part of my M.Sc. project, I'm working on an implementation of the Fuzzy C-means
(FCM) algorithm in MLlib.
FCM has much in common with the K-Means algorithm, which is already
implemented, and I wanted to know whether
Reynold,
thanks for the reply. It's true, this is more about YARN communication
than Spark.
But it is a general enough problem for all YARN_CLUSTER mode
applications, so I thought I would
just reach out to the community.
If we choose to use the Akka solution, then this is related to Spark, as
there
I don't know of any way to avoid Akka doing a copy, but I would like to
mention that it's on the priority list to piggy-back only the map statuses
relevant to a particular map task on the task itself, thus reducing the
total amount of data sent over the wire by a factor of N for N physical
Hi Debasish, Alexander, all,
Indeed I found the OpenDL project through the Powered by Spark page. I'll need
some time to look into the code, but at first sight it looks quite
well-developed. I'll contact the author about this too.
My own implementation (in Scala) works for multiple inputs
Thanks Xiaokai,
I’ve created a pull request to merge the features in my PR into your repo. Please
review it here: https://github.com/xwei-datageek/spark/pull/2 .
As for GLMs, here at Sina, we are solving the problem of predicting the number of
visitors who read a particular news article or watch an
I will let Xiangrui comment on the PR process for adding the code to MLlib,
but I would love to look into your initial version if you push it to
GitHub...
As far as I remember, Quoc got his best ANN results using the back-propagation
algorithm, solved using CG... do you have those features or are you