On Jan 3, 2011, at 12:02 PM, Ted Dunning wrote:

> I think that you have identified an interesting cross-cutting category.
>
> PageRank, HITS and the related algorithms tend to be classified as "link
> analysis"
> Priority inboxes tend to be classified as "classifiers"
> Click-through predictors are often termed "recommendation"
>
> In all these cases, the nomenclature is all about the implementation
> approach, not about the goal.
>
> Importance modeling as you describe it is all about the goal, not about the
> method.

Yes!  This is what I'm after!  More background on the theory of importance as well as the implementation side of it.  I figured there has to be some academic work on it, but the terms are pretty ambiguous, so it's hard to get good results...

>
> On Mon, Jan 3, 2011 at 8:54 AM, Grant Ingersoll <[email protected]> wrote:
>
>> Hi,
>>
>> I wanted to pick people's brains a little bit on the subject of determining
>> importance.  This isn't necessarily Mahout related, although I think we have
>> some tools that help in the area.
>>
>> One of the emerging trends it seems these days with all our connectivity
>> and content is a notion of importance/priority.  Some examples:
>> 1. Google now has "Priority Inbox" for instance and I think most would
>> agree that for things like Twitter and Facebook it would be really nice if
>> you could separate out the Important updates/people from the less important.
>> 2. Identifying important phrases, etc. in text across a corpus.
>> 3. One of the things I think most researchers do when exploring a new topic
>> is to identify the one or two seminal papers in the field, read them, and
>> then read the ones that cite those papers and so on.
>> 4. Take in all the day's news and figure out what the key articles are to
>> read (in some sense it's picking the most representative document in a
>> cluster) or that the article talking about raising Federal income taxes is
>> likely more important than the one talking about raising local sales tax
>> (or vice versa!)
>> 5. PageRank, TextRank, etc. and other approaches to calculating authority
>>
>> What I'm looking for is help in researching this area.  Is there a name for
>> this (sub-)field (importance theory?  prioritization theory?), particularly
>> in mach. learning and NLP that is geared towards this?  I realize some
>> (most) of these problems can be solved with classifiers amongst other things
>> like graph algorithms (particularly ones that use the social graph), but it
>> also seems like the area is bigger than a particular implementation, so I
>> wanted to hear what others thought.  How would you go about solving these
>> problems?  Do you have any pointers to useful references on the subject
>> (theoretical or practical)?  What other examples have you run up against?
>>
>> Thanks,
>> Grant

--------------------------
Grant Ingersoll
http://www.lucidimagination.com
