Re: Trending and prediction Analysis

Ted Dunning Thu, 24 Sep 2009 11:36:44 -0700

Pallavi,

On Wed, Sep 23, 2009 at 11:29 PM, Palleti, Pallavi <
[email protected]> wrote:

> ...
> Regarding deviation, Yes. Like, we can compute the deviation by computing
> the difference of new score with past score (which is mean of past few days
> of data).
>
> Regarding variations on week boundaries having less impact, I am sorry. I
> couldn't get how it will be less impact?. I am assuming that week boundaries
> means the mean of past 7 days of data?
>
>
I am still not quite clear about what kind of score you are referring to.  I
was referring to simple popularity counts, such as how many people watched a
video.  That doesn't seem to match what you have.

I also assumed that you have a large number of things with scores.  This
does match what you have.

In my case, the popularity of items (as measured by the rate at which people
interact with them) varies strongly during the day and also changes from
weekday to weekend.  This is common for almost any traffic-based measure
that you can think of.  If you compute aggregate scores over a full week,
then the mix of night/day and weekday/weekend is the same in every window,
so the scores are more stable.
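
As a rough illustration of why a full-week window helps, here is a minimal
sketch.  The daily counts are made up (hypothetical weekday and weekend
levels, repeated for three weeks), and rolling_sum is just a helper name I
picked for this example.

```python
def rolling_sum(values, window):
    """Sum of each run of `window` consecutive values."""
    total = sum(values[:window])
    sums = [total]
    for i in range(window, len(values)):
        # Slide the window forward by one day.
        total += values[i] - values[i - window]
        sums.append(total)
    return sums

# Hypothetical daily view counts: five weekdays at 100, two weekend days
# at 160, repeated for three weeks.
daily = [100, 100, 100, 100, 100, 160, 160] * 3

# Every 7-day window contains exactly five weekdays and two weekend days,
# so the weekday/weekend swing cancels out of the weekly totals.
weekly = rolling_sum(daily, 7)
```

With real data the weekly totals will not be exactly flat, of course, but
the periodic part of the variation drops out in the same way.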

Regardless of this variation, ranking objects by some criterion (score for
you, number of interactions or views for me) gets rid of much of this
variation without the need for long term averages.  Averaging can still be
good to cut down on random fluctuations.

> In order to make sure that I understood correctly, I am reiterating what
> you said. So, we compute the occurrences for all items and then rank top 10k
> or 100k (Some N items). And, the rest would be given rank as N+1. Now, we
> compute the score for each item as log (r_new/r_old) (or is it
> log(r_old/r_new) as you specified below?). r_old being the rank for previous
> time frame (week, day or hour etc). Now, we sort based on scores and we can
> get the items which are varying heavily. Kindly correct me if I am wrong.
>

You are correct.

I put r_old on top, that is log(r_old/r_new), so that items that improve
(rank number goes down) have positive trend scores and things that have
decreasing score (rank number goes up) get negative trend scores.
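
A minimal sketch of the rank-based trend score as described in this thread,
assuming plain dictionaries of item counts for the old and new time frames.
The function names, the tie-breaking rule, and the sample counts are
assumptions for illustration only.

```python
import math

def ranks(counts, n):
    """Give the top n items ranks 1..n; items outside the top n are omitted."""
    # Sort by descending count; break ties by item name (an assumption).
    ordered = sorted(counts, key=lambda item: (-counts[item], item))
    return {item: i + 1 for i, item in enumerate(ordered[:n])}

def trend_scores(old_counts, new_counts, n):
    """Score items ranked in either time frame by log(r_old / r_new)."""
    r_old = ranks(old_counts, n)
    r_new = ranks(new_counts, n)
    # Anything not in the top n of a time frame gets rank n + 1.
    return {
        item: math.log(r_old.get(item, n + 1) / r_new.get(item, n + 1))
        for item in set(r_old) | set(r_new)
    }

# Hypothetical counts for two time frames.
old_counts = {"a": 100, "b": 50, "c": 10}
new_counts = {"a": 40, "b": 60, "c": 90, "d": 5}

scores = trend_scores(old_counts, new_counts, n=3)
# Largest positive scores first: these are the items rising fastest.
rising = sorted(scores, key=scores.get, reverse=True)
```

Here "c" moves from rank 3 to rank 1 and gets a positive score, "a" moves
from rank 1 to rank 3 and gets a negative one, and "b" stays at rank 2 with
a score of zero.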

> Also, when you mean k, is it number of occurrences?
>

See above.  I was talking in terms of popularity.  It seems you have
something else in mind.  The ranking trick sweeps that under the carpet,
regardless.
