Hi CD,

Can we take a random sample from the test data and use that for this process?
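Roughly what I have in mind is below. This is only a sketch under my own assumptions, not how ML does it today: the class and method names (TestSampleSketch, ScoredRow, sample) are made up for illustration, and it assumes we can iterate over the test rows with their feature values, actual value and predicted value together. It uses plain reservoir sampling so that we never keep more than a fixed number of rows (for example the same 10,000 we use for the other visualizations), however large the test set is. If the test data is available as a Spark RDD, something like takeSample might do the same job, but I have not checked that against our code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class TestSampleSketch {

    /** Hypothetical holder for one test row: features plus actual and predicted values. */
    public static final class ScoredRow {
        public final double[] features;
        public final double actual;
        public final double predicted;

        public ScoredRow(double[] features, double actual, double predicted) {
            this.features = features;
            this.actual = actual;
            this.predicted = predicted;
        }
    }

    /**
     * Reservoir sampling: returns at most maxRows rows, chosen uniformly at random
     * from the input, in a single pass and with memory bounded by maxRows.
     */
    public static <T> List<T> sample(Iterable<T> rows, int maxRows, long seed) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must not be negative");
        }
        Random random = new Random(seed);
        List<T> reservoir = new ArrayList<>(maxRows);
        long seen = 0;
        for (T row : rows) {
            seen++;
            if (reservoir.size() < maxRows) {
                // Fill the reservoir with the first maxRows rows.
                reservoir.add(row);
            } else {
                // Keep the new row with probability of about maxRows / seen.
                long slot = (long) (random.nextDouble() * seen);
                if (slot < maxRows) {
                    reservoir.set((int) slot, row);
                }
            }
        }
        return reservoir;
    }
}
```

With this, only the sampled ScoredRow objects would need to be persisted for the visualization, instead of the whole test set.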
--Srinath

On Fri, May 22, 2015 at 12:00 PM, CD Athuraliya <[email protected]> wrote:

> Hi all,
>
> To implement $subject in ML we need all feature values of the dataset
> against predicted and actual values for test data. But Spark only returns
> predicted and actual values as test results. Right now we use random 10,000
> data rows for other visualizations and we cannot use the same data for this
> visualization since those random 10,000 rows do not correspond to the test
> data (test data is subtracted from the dataset according to the train data
> fraction at model building stage).
>
> One option is to persist test data at testing stage, but it can be too
> large for some datasets, depending on the train data fraction. Appreciate
> if you can give your comments on this.
>
> Thanks,
> CD
>
> --
> *CD Athuraliya*
> Software Engineer
> WSO2, Inc.
> lean . enterprise . middleware
> Mobile: +94 716288847 <94716288847>
> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
> <https://twitter.com/cdathuraliya> | Blog
> <http://cdathuraliya.tumblr.com/>
>

--
============================
Blog: http://srinathsview.blogspot.com
twitter: @srinath_perera
Site: http://people.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
