--
Masood Krohy, Ph.D.
Data Scientist, Intact Lab-R
Intact Financial Corporation
http://ca.linkedin.com/in/masoodkh
From: Xiaomeng Wan <shawn...@gmail.com>
To: User <user@spark.apache.org>
Date: 2016-11-29 11:54
Subject: build models in parallel
In this video they use such functionality via pyspark:
https://www.youtube.com/watch?v=R-6nAwLyWCI
On Tue, 29 Nov 2016 at 17:54, Xiaomeng Wan wrote:
> I want to divide big data into groups (eg groupby some id), and build one
> model for each group. I am wondering whether I can
I want to divide big data into groups (e.g. group by some id) and build one
model for each group. I am wondering whether I can parallelize the model
building process by implementing a UDAF (e.g. running linear regression in its
evaluate method). Is this good practice? Does anybody have experience with it? Thanks!
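
For illustration only, here is a rough sketch of the "one model per group" pattern in plain Python. It is not Spark code and not the UDAF approach itself: a local thread pool stands in for the cluster, and the names fit_line, fit_per_group and the sample rows are made up for this example. The per-group function is the part that would move into whatever grouped-apply mechanism is used (newer PySpark versions offer groupBy(...).applyInPandas(...) for this shape of problem).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept for one group."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # assumes each group has at least two distinct x values
    return slope, mean_y - slope * mean_x


def fit_per_group(rows, max_workers=4):
    """rows: iterable of (group_id, x, y). Returns {group_id: (slope, intercept)}."""
    # Step 1: split the data by group id.
    groups = defaultdict(list)
    for group_id, x, y in rows:
        groups[group_id].append((x, y))
    # Step 2: fit each group independently; the pool is only a local stand-in
    # for distributing the groups across executors.
    keys = list(groups)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        fits = list(pool.map(fit_line, (groups[k] for k in keys)))
    return dict(zip(keys, fits))


# Hypothetical sample data: two groups, each lying exactly on a line.
rows = [
    ("a", 0.0, 1.0), ("a", 1.0, 3.0), ("a", 2.0, 5.0),  # y = 2x + 1
    ("b", 0.0, 4.0), ("b", 1.0, 3.0), ("b", 2.0, 2.0),  # y = -x + 4
]
models = fit_per_group(rows)
```

Each group is fitted without touching any other group's rows, which is what makes the work easy to spread out. Note that Python threads will not speed up CPU-bound fitting on one machine; the sketch only shows the structure.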