Yes, rerunning the test on a random sample from the test data is OK.

--Srinath
On Fri, May 22, 2015 at 2:28 PM, CD Athuraliya wrote:
> Hi Srinath,
>
> That random sample still will not correspond to the predicted vs. actual
> values in the test results, since there is no mapping between the random
> sample data points and the test result points. One thing we can do is run
> the test separately (using the same model) on the sampled data, for the
> sole purpose of visualization. Any other options?
>
> On Fri, May 22, 2015 at 2:06 PM, Srinath Perera wrote:
>
>> Hi CD,
>>
>> Can we take a random sample from the test data and use that for this
>> process?
>>
>> --Srinath
>>
>> On Fri, May 22, 2015 at 12:00 PM, CD Athuraliya wrote:
>>
>>> Hi all,
>>>
>>> To implement $subject in ML we need all feature values of the dataset
>>> against the predicted and actual values for the test data. But Spark
>>> only returns predicted and actual values as test results. Right now we
>>> use 10,000 random data rows for other visualizations, and we cannot use
>>> the same data for this visualization since those 10,000 random rows do
>>> not correspond to the test data (the test data is split off from the
>>> dataset according to the train data fraction at the model building
>>> stage).
>>>
>>> One option is to persist the test data at the testing stage, but it can
>>> be too large for some datasets, depending on the train data fraction.
>>> I would appreciate your comments on this.
>>>
>>> Thanks,
>>> CD
>>>
>>> --
>>> *CD Athuraliya*
>>> Software Engineer
>>> WSO2, Inc.
>>> lean . enterprise . middleware
>>> Mobile: +94 716288847
>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>> <https://twitter.com/cdathuraliya> | Blog
>>> <http://cdathuraliya.tumblr.com/>
>>
>> --
>> ============================
>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>> Site: http://people.apache.org/~hemapani/
>> Photos: http://www.flickr.com/photos/hemapani/
>> Phone: 0772360902
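
The approach agreed on in this thread (sample the test data, then rerun the same model on that sample only for visualization) can be sketched roughly as follows. This is a minimal plain-Python illustration, not the WSO2 ML or Spark implementation: the function name `sample_for_visualization`, the `predict` callable, and the toy data are all hypothetical stand-ins.

```python
import random


def sample_for_visualization(test_rows, predict, sample_size, seed=42):
    """Return (features, actual, predicted) triples for a random sample.

    test_rows:   list of (features, actual_label) pairs held out for testing.
    predict:     callable mapping a feature list to a predicted label;
                 a stand-in for the already trained model (assumption).
    sample_size: maximum number of rows to keep for the visualization.
    """
    rng = random.Random(seed)
    # Never ask for more rows than the test set contains.
    k = min(sample_size, len(test_rows))
    sampled = rng.sample(test_rows, k)
    # Re-run the model on the sampled rows only, so every point keeps its
    # feature values alongside the actual and predicted labels.
    return [(features, actual, predict(features)) for features, actual in sampled]


# Toy data: 100 test rows with two features each and a binary label.
test_rows = [([x, x * 2.0], float(x % 2)) for x in range(100)]


def toy_model(features):
    # Hypothetical model that happens to predict the label exactly.
    return 1.0 if features[0] % 2 else 0.0


points = sample_for_visualization(test_rows, toy_model, sample_size=10)
```

Because prediction runs only on the sampled rows, the full test set never has to be persisted with its feature values; only the sample does.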
