Hi,

Yes, I believe people do that. I also believe Spark ML can figure out on its own when to cache some internal RDDs; that's definitely true for the random forest algorithm. It also does no harm to cache the same RDD twice.
But it's not clear what you'd want to know...

--
Be well!
Jean Morozov

On Sun, Apr 3, 2016 at 11:34 AM, Sergey <ser...@gmail.com> wrote:
> Hi Spark ML experts!
>
> Do you use RDD caching somewhere together with MLlib to speed up
> calculation?
> I mean typical machine learning use cases:
> train-test split, train, evaluate, apply model.
>
> Sergey.