How do I load a dataset in Apache Spark? Also, where can I find sources of massive datasets?

On Wed, Jul 22, 2015 at 4:50 AM, Ron Gonzalez <zlgonza...@yahoo.com.invalid> wrote:

> I'd use Random Forest. It will give you better generalizability. There
> are also a number of things you can do with RF that allow you to train on
> samples of the massive data set and then just average over the resulting
> models...
>
> Thanks,
> Ron
>
>
> On 07/21/2015 02:17 PM, Olivier Girardot wrote:
>
> It depends on your data and, I guess, the time/performance goals you have
> for both training and prediction, but for a quick answer: yes :)
>
> 2015-07-21 11:22 GMT+02:00 Chintan Bhatt <chintanbhatt...@charusat.ac.in>:
>
>> Which classifier can be useful for mining massive datasets in Spark?
>> Would a Decision Tree be a good choice in terms of scalability?
>>
>> --
>> CHINTAN BHATT <http://in.linkedin.com/pub/chintan-bhatt/22/b31/336/>
>> Assistant Professor,
>> U & P U Patel Department of Computer Engineering,
>> Chandubhai S. Patel Institute of Technology,
>> Charotar University of Science And Technology (CHARUSAT),
>> Changa-388421, Gujarat, INDIA.
>> http://www.charusat.ac.in
>> *Personal Website*: https://sites.google.com/a/ecchanga.ac.in/chintan/
>>
>
>

--
CHINTAN BHATT
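
For readers of the archive, Ron's suggestion above (train on samples of the data set, then average over the resulting models) can be sketched roughly as follows. This is plain Python with only the standard library, not Spark or MLlib code, and it swaps a full Random Forest for a much simpler one-split "decision stump" to keep the idea visible. The function names (`train_stump`, `train_on_samples`, `predict_average`) and the synthetic data are made up for illustration and do not come from the thread.

```python
import random


def train_stump(sample):
    """Fit a one-split decision stump on (x, label) pairs with 0/1 labels."""
    best = None
    for threshold, _ in sample:
        left = [y for x, y in sample if x <= threshold]
        right = [y for x, y in sample if x > threshold]
        # Each side of the split predicts the mean label seen on that side.
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right) if right else left_mean
        error = sum(
            (y - (left_mean if x <= threshold else right_mean)) ** 2
            for x, y in sample
        )
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(model, x):
    threshold, left_mean, right_mean = model
    return left_mean if x <= threshold else right_mean


def train_on_samples(data, n_models, sample_size, seed=0):
    """Train one stump per random sample instead of one model on all the data."""
    rng = random.Random(seed)
    return [train_stump(rng.sample(data, sample_size)) for _ in range(n_models)]


def predict_average(models, x):
    """Average the predictions of all models (the 'average over models' step)."""
    return sum(predict_stump(m, x) for m in models) / len(models)


# Hypothetical synthetic data: label is 1 when x is in the upper half.
data = [(i / 1000, 1 if i >= 500 else 0) for i in range(1000)]
models = train_on_samples(data, n_models=10, sample_size=50, seed=0)
print(predict_average(models, 0.1), predict_average(models, 0.9))
```

Each model only ever sees 50 of the 1000 points, which is the property that makes this approach attractive when the full data set is too large to train on at once. A real Random Forest additionally randomizes the features considered at each split and grows deeper trees.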