Thanks Ted,

Yes, for the time problem we tend to use aggregations of session data. So instead of asking for user recommendations, we do things like user+session recommendations.
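For illustration only, here is a minimal sketch of user+session aggregation, assuming a session ends after 30 minutes of inactivity. The gap threshold, the `sessionize` name, and the (user, timestamp, item) event layout are all assumptions made for this example; nothing in this thread specifies them.

```python
from collections import defaultdict

# Assumed inactivity threshold in seconds; purely illustrative.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Group (user, timestamp, item) events into per-user sessions.

    Returns a dict mapping (user, session_index) to the list of items seen
    in that session, so each user+session pair can be fed to a recommender
    as if it were its own "user".
    """
    by_user = defaultdict(list)
    for user, ts, item in events:
        by_user[user].append((ts, item))

    sessions = {}
    for user, history in by_user.items():
        history.sort(key=lambda pair: pair[0])  # order by timestamp only
        index = 0
        last_ts = None
        for ts, item in history:
            # Open a new session when the pause since the last event exceeds the gap.
            if last_ts is not None and ts - last_ts > gap:
                index += 1
            sessions.setdefault((user, index), []).append(item)
            last_ts = ts
    return sessions


# Hypothetical events: timestamps are seconds since an arbitrary origin.
events = [
    ("alice", 0, "a"),
    ("alice", 600, "b"),
    ("alice", 10_000, "c"),
    ("bob", 50, "a"),
]
user_sessions = sessionize(events)
```

A fixed inactivity gap is only one possible rule for cutting sessions; it is used here because it is the simplest to state.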
Of course, deciding when sessions start and stop isn't trivial. Ideally, what I would want to do is time-weight views using a kernel or convolution. That's a bit heavy, so we typically have a global model that is basically all preferences over time, and then these user+session type models. We can then combine these at another level to give recommendations based on what you like throughout time versus what you have been doing recently.

-b

On Thu, Mar 27, 2014 at 1:59 PM, Ted Dunning <[email protected]> wrote:

> For the poly-syllable challenged,
>
> heteroscedasticity - degree of variation changes. This is common with
> counts because you expect the standard deviation of count data to be
> proportional to sqrt(n).
>
> time inhomogeneity - changes in behavior over time. One way to handle this
> (roughly) is to first remove variation in personal and item means over time
> (if using ratings) and then to segment user histories into episodes. By
> including both short and long episodes you get some repair for changes in
> personal preference. A great example of how this works/breaks is Christmas
> music. On December 26th, you want to *stop* recommending this music so it
> really pays to limit histories at this point. By having an episodic user
> session that starts around November and runs to Christmas, you can get good
> recommendations for seasonal songs and not pollute the rest of the
> universe.
>
> On Thu, Mar 27, 2014 at 8:30 AM, j.barrett Strausser <[email protected]> wrote:
>
> > For my team it has usually been heteroscedasticity and time
> > inhomogeneity.
> >
> > On Thu, Mar 27, 2014 at 10:18 AM, Tevfik Aytekin <[email protected]> wrote:
> >
> > > Interesting topic,
> > > Ted, can you give examples of those mathematical assumptions
> > > under-pinning ALS which are violated by the real world?
> > >
> > > On Thu, Mar 27, 2014 at 3:43 PM, Ted Dunning <[email protected]> wrote:
> > > > How can there be any other practical method?
> > > > Essentially all of the mathematical assumptions under-pinning ALS are
> > > > violated by the real world. Why would any mathematical consideration
> > > > of the number of features be much more than heuristic?
> > > >
> > > > That said, you can make an information content argument. You can also
> > > > make the argument that if you take too many features, it doesn't much
> > > > hurt so you should always take as many as you can compute.
> > > >
> > > > On Thu, Mar 27, 2014 at 6:33 AM, Sebastian Schelter <[email protected]> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> does anyone know of a principled approach of choosing the number of
> > > >> features for ALS (other than cross-validation?)
> > > >>
> > > >> --sebastian
> >
> > --
> > https://github.com/bearrito
> > @deepbearrito

--
https://github.com/bearrito
@deepbearrito
