there are levels would in some cases separate subgroups of variables
which are not "that different". Successive binary splits could
potentially provide you with the required "homogeneous subsets".
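
A minimal sketch of that idea, assuming the setting is a categorical predictor with a numeric response and that "homogeneous" means low within-group variance. The function names, the squared-error criterion and the `min_gain` threshold are illustrative assumptions, not taken from any particular tree package. The levels are ordered by mean response, the best two-way cut is chosen, and each side is split again for as long as the split is worth it:

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def pooled(groups, levels):
    """All responses observed for the given levels, in one flat list."""
    return [v for lvl in levels for v in groups[lvl]]


def best_binary_split(groups):
    """Return (cost, left_levels, right_levels) for the cheapest two-way cut.

    Levels are ordered by mean response and only cuts along that ordering
    are considered.
    """
    ordered = sorted(groups, key=lambda lvl: mean(groups[lvl]))
    best = None
    for i in range(1, len(ordered)):
        left, right = ordered[:i], ordered[i:]
        cost = sse(pooled(groups, left)) + sse(pooled(groups, right))
        if best is None or cost < best[0]:
            best = (cost, left, right)
    return best


def homogeneous_subsets(groups, min_gain=1.0):
    """Split recursively while a binary split lowers the SSE by >= min_gain."""
    levels = list(groups)
    if len(levels) < 2:
        return [sorted(levels)]
    total = sse(pooled(groups, levels))
    cost, left, right = best_binary_split(groups)
    if total - cost < min_gain:
        return [sorted(levels)]  # levels are "not that different": keep together
    return (homogeneous_subsets({l: groups[l] for l in left}, min_gain)
            + homogeneous_subsets({l: groups[l] for l in right}, min_gain))


# Hypothetical data: five levels, with a/b and c/d behaving alike.
data = {
    "a": [1.0, 1.2], "b": [1.1, 0.9],
    "c": [5.0, 5.2], "d": [5.1, 4.9],
    "e": [9.0, 9.2],
}
subsets = homogeneous_subsets(data)
print(subsets)
```

A single five-way split would put every level in its own branch, whereas the repeated two-way splits stop as soon as the remaining levels are similar enough.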
Best,
Carlos J. Gil Bellosta
http://www.datanalytics.com
2014-11-06 10:46 GMT
parallelizing and the
cluster ran without memory issues.
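
The effect of using more slices, as suggested in the quoted message below, can be illustrated without Spark. This is only a sketch in plain Python, not SparkR's API; the helper names and the toy records are hypothetical. The same per-group means are computed whether the data travel as one big slice or as several small ones, but the largest chunk any single worker has to hold shrinks as the number of slices grows:

```python
from collections import defaultdict


def split_into_slices(records, num_slices):
    """Cut the records into num_slices contiguous, roughly equal slices."""
    n = len(records)
    bounds = [n * i // num_slices for i in range(num_slices + 1)]
    return [records[bounds[i]:bounds[i + 1]] for i in range(num_slices)]


def apply_slice(slice_):
    """Per-slice partial aggregate: key -> [sum, count]."""
    partial = defaultdict(lambda: [0.0, 0])
    for key, value in slice_:
        partial[key][0] += value
        partial[key][1] += 1
    return partial


def combine(partials):
    """Merge the partial aggregates and return the mean per key."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for key, (s, c) in partial.items():
            totals[key][0] += s
            totals[key][1] += c
    return {key: s / c for key, (s, c) in totals.items()}


# Hypothetical (key, value) records.
records = [("a", 1.0), ("b", 2.0), ("a", 3.0),
           ("b", 4.0), ("a", 5.0), ("c", 6.0)]

results = {}
for num_slices in (1, 3):
    slices = split_into_slices(records, num_slices)
    largest = max(len(s) for s in slices)
    results[num_slices] = (largest, combine(apply_slice(s) for s in slices))
    print(num_slices, "slice(s): largest slice holds", largest, "records")
```

Because each slice only returns a small (sum, count) summary, the combine step stays cheap no matter how many slices are used.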
Best,
Carlos J. Gil Bellosta
http://www.datanalytics.com
2014-08-15 3:53 GMT+02:00 Shivaram Venkataraman:
> Could you try increasing the number of slices with the large data set?
> SparkR assumes that each slice (or partition in
nd smaller batches to my cluster? Is there any recommended
general approach to this kind of split-apply-combine problem?
Best,
Carlos J. Gil Bellosta
http://www.datanalytics.com