What about running several tests on small samples of the data? Couldn't that give an indication of how the full data set will perform? Thanks
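For instance, here is a sketch of what I mean (assuming a user-based recommender, a ratings file named ratings.csv, and arbitrary 5%/10%/20% sample sizes -- all of these are placeholders, not anything from this thread): sweep the evaluationPercentage argument and check whether the scores, or at least the relative ordering of two candidate setups, stay stable as the sample grows.

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class SampleSizeSweep {
  public static void main(String[] args) throws Exception {
    // "ratings.csv" is a placeholder: one userID,itemID,rating per line.
    DataModel model = new FileDataModel(new File("ratings.csv"));

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel m) throws TasteException {
        UserSimilarity similarity = new PearsonCorrelationSimilarity(m);
        return new GenericUserBasedRecommender(
            m, new NearestNUserNeighborhood(10, similarity, m), similarity);
      }
    };

    RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

    // The evaluator picks test users at random, so scores at small
    // percentages are noisy; run each size a few times before trusting
    // any comparison.
    for (double pct : new double[] {0.05, 0.1, 0.2}) {
      IRStatistics stats = evaluator.evaluate(builder, null, model, null, 10,
          GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, pct);
      System.out.printf("sample=%.2f precision@10=%.4f recall@10=%.4f%n",
          pct, stats.getPrecision(), stats.getRecall());
    }
  }
}

If data1 beats data2 consistently at 5%, 10%, and 20%, that is at least suggestive it will on the full set, though as noted below there is no guarantee.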
On Mon, Jan 28, 2013 at 11:19 AM, Sean Owen <[email protected]> wrote:
> Impossible to say. More data means a more reliable estimate, all else equal.
> That's about it.
>
> On Jan 28, 2013 5:17 PM, "Zia mel" <[email protected]> wrote:
>
>> Any thoughts on this?
>>
>> On Sat, Jan 26, 2013 at 10:55 AM, Zia mel <[email protected]> wrote:
>> > OK. For precision, when we reduce the sample size to 0.1 or 0.05,
>> > would the results be related to what we would get with all the data?
>> > For example, if we have data1 and data2, test them using 0.1, and find
>> > that data1 produces better results, would the same hold when we test
>> > with all the data?
>> >
>> > IRStatistics stats = evaluator.evaluate(recommenderBuilder,
>> >     null, model, null, 10,
>> >     GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
>> >     0.05);
>> >
>> > Many thanks
>> >
>> > On Fri, Jan 25, 2013 at 12:26 PM, Sean Owen <[email protected]> wrote:
>> >> No, it takes a fixed "at" value. You can modify it to do whatever you want.
>> >> You will see it doesn't bother with users with little data, like
>> >> < 2*at data points.
>> >>
>> >> On Fri, Jan 25, 2013 at 6:23 PM, Zia mel <[email protected]> wrote:
>> >>> Interesting. Using
>> >>>
>> >>> IRStatistics stats = evaluator.evaluate(recommenderBuilder,
>> >>>     null, model, null, 5,
>> >>>     GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
>> >>>     1.0);
>> >>>
>> >>> can it be adjusted for each user? In other words, is there a way to
>> >>> select a threshold instead of using 5? Something like selecting y
>> >>> sets, where each set has a minimum of z users?
>> >>>
>> >>> On Fri, Jan 25, 2013 at 12:09 PM, Sean Owen <[email protected]> wrote:
>> >>>> The way I do it is to set x differently for each user, to the number
>> >>>> of items in the user's test set -- you ask for x recommendations.
>> >>>> Note that this makes precision == recall. It dodges this problem, though.
>> >>>>
>> >>>> Otherwise, if you fix x, the condition you need is stronger, really:
>> >>>> each user needs >= x *test set* items in addition to training-set
>> >>>> items to make this test fair.
>> >>>>
>> >>>> On Fri, Jan 25, 2013 at 4:10 PM, Zia mel <[email protected]> wrote:
>> >>>>> When selecting precision at x, let's say 5, should I check that all
>> >>>>> users have 5 items or more? For example, if a user has 3 items and
>> >>>>> they were removed as top items, how can the recommender suggest
>> >>>>> items, since there are no items to learn from?
>> >>>>> Thanks!
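For what it's worth, Sean's per-user "at" idea above can be sketched outside the built-in evaluator. This is only an illustrative sketch, not anything in Mahout's API: the class name PerUserAtEvaluator, the testFraction parameter, and the choice to hold out each user's top-rated items as the "relevant" set are all my assumptions.

import java.util.ArrayList;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.GenericDataModel;
import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.cf.taste.model.PreferenceArray;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;

public final class PerUserAtEvaluator {

  // Precision where "at" = each user's own test-set size, so precision == recall.
  public static double evaluate(RecommenderBuilder builder, DataModel model,
                                double testFraction) throws TasteException {
    FastByIDMap<PreferenceArray> train = new FastByIDMap<>();
    FastByIDMap<List<Preference>> test = new FastByIDMap<>();

    // Split each user's prefs: hold out the top-rated testFraction as relevant.
    LongPrimitiveIterator userIDs = model.getUserIDs();
    while (userIDs.hasNext()) {
      long userID = userIDs.nextLong();
      PreferenceArray prefs = model.getPreferencesFromUser(userID);
      int testSize = (int) (prefs.length() * testFraction);
      if (testSize == 0 || prefs.length() - testSize < 2) {
        // Too little data to both train and test fairly; train only.
        train.put(userID, prefs);
        continue;
      }
      List<Preference> sorted = new ArrayList<>();
      for (Preference p : prefs) {
        sorted.add(p);
      }
      sorted.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));
      test.put(userID, new ArrayList<>(sorted.subList(0, testSize)));
      train.put(userID, new GenericUserPreferenceArray(
          sorted.subList(testSize, sorted.size())));
    }

    // Train on the reduced data, then ask each user for exactly as many
    // recommendations as items were held out, and count the overlap.
    Recommender recommender = builder.buildRecommender(new GenericDataModel(train));
    long hits = 0;
    long asked = 0;
    LongPrimitiveIterator testUsers = test.keySetIterator();
    while (testUsers.hasNext()) {
      long userID = testUsers.nextLong();
      List<Preference> held = test.get(userID);
      for (RecommendedItem rec : recommender.recommend(userID, held.size())) {
        for (Preference p : held) {
          if (p.getItemID() == rec.getItemID()) {
            hits++;
            break;
          }
        }
      }
      asked += held.size();
    }
    return asked == 0 ? Double.NaN : (double) hits / asked;
  }
}

Because each user is asked for exactly as many recommendations as items were held out, precision equals recall here, and users with too little data to both train and test are simply skipped, which sidesteps the 3-items case in the original question.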
