Hi all,

To implement $subject in ML, we need all feature values of the dataset alongside the predicted and actual values for the test data. However, Spark only returns the predicted and actual values as test results. Right now we use 10,000 randomly sampled data rows for the other visualizations, but we cannot use the same sample for this visualization because those 10,000 rows do not correspond to the test data (the test data is the part of the dataset that remains after the training fraction is taken out at the model building stage).
One option is to persist the test data at the testing stage, but depending on the train data fraction it can be too large for some datasets. I would appreciate your comments on this.

Thanks,
CD

--
CD Athuraliya
Software Engineer
WSO2, Inc.
lean . enterprise . middleware
Mobile: +94 716288847
LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter <https://twitter.com/cdathuraliya> | Blog <http://cdathuraliya.tumblr.com/>
