RE: build models in parallel

2016-12-01 Thread Masood Krohy
-- Masood Krohy, Ph.D. Data Scientist, Intact Lab-R Intact Financial Corporation http://ca.linkedin.com/in/masoodkh From: Xiaomeng Wan <shawn...@gmail.com> To: User <user@spark.apache.org> Date: 2016-11-29 11:54 Subject: build models in parallel I want to div

Re: build models in parallel

2016-11-29 Thread Georg Heiler
They (https://www.youtube.com/watch?v=R-6nAwLyWCI) use such functionality via PySpark. On Tue, 29 Nov 2016 at 17:54, Xiaomeng Wan wrote: > I want to divide big data into groups (eg groupby some id), and build one > model for each group. I am wondering whether I can

build models in parallel

2016-11-29 Thread Xiaomeng Wan
I want to divide big data into groups (e.g. group by some id) and build one model for each group. I am wondering whether I can parallelize the model-building process by implementing a UDAF (e.g. running LinearRegression in its evaluate method). Is this good practice? Does anybody have experience with it? Thanks!
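
To make the question concrete, here is a minimal sketch of the "one model per group" pattern in plain Python. It is not Spark code and not the UDAF approach asked about: a thread pool stands in for the cluster, and the names fit_line and fit_per_group, the (group_id, x, y) row layout, and the single-feature least-squares fit are all assumptions made for illustration. In PySpark, a per-group function like fit_line could presumably be applied with rdd.groupByKey().mapValues(...), provided each group fits in one executor's memory.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept on one group.

    `points` is a list of (x, y) pairs with at least two distinct x values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_per_group(rows, max_workers=4):
    """Group (group_id, x, y) rows by id and fit one model per group."""
    groups = defaultdict(list)
    for group_id, x, y in rows:
        groups[group_id].append((x, y))
    # Each group is independent, so the fits can run concurrently.
    # Threads are only a stand-in here for distributed workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        fitted = pool.map(fit_line, groups.values())
        return dict(zip(groups.keys(), fitted))


# Hypothetical toy data: group "a" follows y = 2x + 1, group "b" follows y = -x.
rows = [
    ("a", 0.0, 1.0), ("a", 1.0, 3.0), ("a", 2.0, 5.0),
    ("b", 0.0, 0.0), ("b", 1.0, -1.0), ("b", 2.0, -2.0),
]
models = fit_per_group(rows)
```

The point of the sketch is that the grouping step and the per-group fit are separate: once the rows are keyed by id, each fit touches only its own group's data, which is what makes the work parallelizable.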