What about running several tests on small data -- can't that give an
indication of how the full data set will perform?
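
Something like this rough sketch is what I have in mind: run the same
evaluation at a few sampling percentages and compare the results
(assumes the usual Taste classes and the recommenderBuilder from the
code below; "ratings.csv" is just a placeholder file name):

    DataModel model = new FileDataModel(new File("ratings.csv"));
    RecommenderIRStatsEvaluator evaluator =
        new GenericRecommenderIRStatsEvaluator();
    for (double pct : new double[] {0.05, 0.1, 1.0}) {
      // same "at" and threshold each time; only the sample size varies
      IRStatistics stats = evaluator.evaluate(recommenderBuilder,
          null, model, null, 10,
          GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
          pct);
      System.out.println(pct + " -> precision=" + stats.getPrecision()
          + " recall=" + stats.getRecall());
    }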
Thanks

On Mon, Jan 28, 2013 at 11:19 AM, Sean Owen <[email protected]> wrote:
> Impossible to say. More data means a more reliable estimate, all else
> equal. That's about it.
> On Jan 28, 2013 5:17 PM, "Zia mel" <[email protected]> wrote:
>
>> Any thoughts on this?
>>
>> On Sat, Jan 26, 2013 at 10:55 AM, Zia mel <[email protected]> wrote:
>> > OK, for precision, when we reduce the sample size to 0.1 or 0.05,
>> > would the results carry over when we check with all the data? For
>> > example, if we have data1 and data2, test them using 0.1, and find
>> > that data1 produces better results, would the same hold when we
>> > check with all the data?
>> >
>> >  IRStatistics stats = evaluator.evaluate(recommenderBuilder,
>> >      null, model, null, 10,
>> >      GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
>> >      0.05);
>> >
>> > Many thanks
>> >
>> > On Fri, Jan 25, 2013 at 12:26 PM, Sean Owen <[email protected]> wrote:
>> >> No, it takes a fixed "at" value. You can modify it to do whatever
>> >> you want. You will see it doesn't bother with users with little
>> >> data, like < 2*at data points.
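>> >>
>> >> Roughly, the per-user check inside the evaluator looks something
>> >> like this (paraphrasing from memory, not the exact source):
>> >>
>> >>   PreferenceArray prefs = dataModel.getPreferencesFromUser(userID);
>> >>   if (prefs.length() < 2 * at) {
>> >>     continue; // too few prefs to split into test + training fairly
>> >>   }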
>> >>
>> >>> On Fri, Jan 25, 2013 at 6:23 PM, Zia mel <[email protected]> wrote:
>> >>> Interesting. Using
>> >>>  IRStatistics stats = evaluator.evaluate(recommenderBuilder,
>> >>>      null, model, null, 5,
>> >>>      GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
>> >>>      1.0);
>> >>>
>> >>> Can it be adjusted for each user? In other words, is there a way
>> >>> to select a threshold instead of using a fixed 5? Something like
>> >>> selecting y sets, each set having a minimum of z users?
>> >>>
>> >>>
>> >>>
>> >>> On Fri, Jan 25, 2013 at 12:09 PM, Sean Owen <[email protected]> wrote:
>> >>>> The way I do it is to set x differently for each user, to the
>> >>>> number of items in the user's test set -- you ask for x
>> >>>> recommendations. Note that this makes precision == recall, but it
>> >>>> dodges this problem.
>> >>>>
>> >>>> Otherwise, if you fix x, the condition you need is stronger, really:
>> >>>> each user needs >= x *test set* items in addition to training set
>> >>>> items to make this test fair.
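>> >>>>
>> >>>> A rough sketch of that per-user loop (untested; assumes the usual
>> >>>> Taste Recommender interface, and splitTestSet() is a hypothetical
>> >>>> helper you'd write to hold out the user's test items):
>> >>>>
>> >>>>   FastIDSet testItems = splitTestSet(userID);   // held-out items
>> >>>>   int x = testItems.size();                     // per-user "at"
>> >>>>   List<RecommendedItem> recs = recommender.recommend(userID, x);
>> >>>>   int hits = 0;
>> >>>>   for (RecommendedItem rec : recs) {
>> >>>>     if (testItems.contains(rec.getItemID())) {
>> >>>>       hits++;
>> >>>>     }
>> >>>>   }
>> >>>>   // asking for exactly |testItems| recs makes precision == recall
>> >>>>   double precision = (double) hits / x;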
>> >>>>
>> >>>>
>> >>>>> On Fri, Jan 25, 2013 at 4:10 PM, Zia mel <[email protected]> wrote:
>> >>>>> When selecting precision at x, let's say 5, should I check that
>> >>>>> all users have 5 items or more? For example, if a user has 3
>> >>>>> items and they were removed as the top items, then how can the
>> >>>>> recommender suggest anything, since there are no items left to
>> >>>>> learn from?
>> >>>>> Thanks!
>>
