On Tue, Nov 8, 2011 at 1:07 PM, Jake Mannix wrote:
> On Tue, Nov 8, 2011 at 4:35 AM, Ted Dunning wrote:
>
> > The practical techniques for such problems are pretty diverse.
> >
> > One method is to simply define multiple binary classifiers.
>
> Only? You mean the only method we have currently implemented, right?

Well... I know some guys who have built a very cool classifier using Mahout. They categorize web-pages into >50,000 categories that have intricate logical structure but which allow multiple categories. Accuracy reportedly exceeds human judge repeatability.

Since they built that beast out of Mahout parts, I think that counts.

> Labeled LDA <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.155.3678&rep=rep1&type=pdf> (PDF)
> is a good example of a way to do this, and I will most likely be adding
> this to our bag of tricks once I get my new-and-improved LDA into the
> codebase.

OK.

And Bob's your uncle.
