Hi Peng,

Short answer: Yes.  It has been run on billions of rows and tens of
millions of columns.

Long answer: There are many ways to implement LR in a distributed fashion,
and their dependence on the dataset dimensions and compute cluster size
varies.

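MLlib's LogisticRegressionWithSGD distributes the gradient computation,
which is instance-parallel: each worker computes the gradient over its own
partition of the rows, and the partial results are summed before the
weights are updated.  You can find more info here: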
http://spark.apache.org/docs/latest/mllib-linear-methods.html
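
To make "instance-parallel" concrete, here is a minimal sketch in plain
Python.  It is not MLlib's code: the function names, the list-of-lists
"partitions", and the fixed step size are all assumptions for illustration,
and the "map" step runs in a simple loop rather than on a cluster.  The
point is only that the gradient is a sum over rows, so it can be computed
per partition and then added up.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def partition_gradient(rows, w):
    # Sum of per-instance logistic-loss gradients over one partition.
    # Each row is (features, label) with label in {0, 1}.
    grad = [0.0] * len(w)
    for x, y in rows:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    return grad


def distributed_gradient(partitions, w):
    # "Map": every partition computes its partial sum independently.
    partials = [partition_gradient(p, w) for p in partitions]
    # "Reduce": add the partial sums and average over all rows.
    n = sum(len(p) for p in partitions)
    return [sum(col) / n for col in zip(*partials)]


def train(partitions, dim, step=0.5, iters=200):
    # Plain full-batch gradient descent (hypothetical driver loop).
    w = [0.0] * dim
    for _ in range(iters):
        g = distributed_gradient(partitions, w)
        w = [wi - step * gi for wi, gi in zip(w, g)]
    return w


# Toy data: first feature is a constant bias term.
data = [([1.0, 0.0], 0), ([1.0, 1.0], 0), ([1.0, 2.0], 1), ([1.0, 3.0], 1)]
weights = train([data[:2], data[2:]], dim=2)
```

Splitting the rows into more partitions changes only where the partial sums
are computed, not the resulting gradient, which is why the row count can
grow with the cluster.  The weight vector still has to be shared with every
worker on each iteration, so the number of columns is constrained
differently from the number of rows.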

Joseph

On Tue, Feb 3, 2015 at 7:21 AM, Peng Zhang <pzhang.x...@icloud.com> wrote:

> Hi Everyone,
>
> Is LogisticRegressionWithSGD in MLlib scalable?
>
> If so, what is the idea behind the scalable implementation?
>
> Thanks in advance,
>
> Peng
