I've been very busy, so I haven't been able to respond to this in detail yet. But briefly, based on a quick read, what you describe here shouldn't work at all. You could instead train different models and combine them as an ensemble (majority vote, average, or product of their outputs). You'll need to make sure the label vectors are comparable across models, since with so many labels they will vary from dataset to dataset.
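To make the combination concrete, here's a rough sketch in plain Java (not OpenNLP API, just illustrative code with made-up names) of averaging or multiplying per-label probabilities, assuming each model gives you a probability for every label and you've already mapped each model's outcomes onto one shared label index:

import java.util.Arrays;

/** Sketch: combine probability distributions from several independently
 *  trained models over a shared label index. Position i must mean the
 *  same label in every model's output array. */
public class EnsembleCombiner {

    /** Average the per-label probabilities across models. */
    static double[] average(double[][] modelProbs) {
        int numLabels = modelProbs[0].length;
        double[] combined = new double[numLabels];
        for (double[] probs : modelProbs) {
            for (int i = 0; i < numLabels; i++) {
                combined[i] += probs[i] / modelProbs.length;
            }
        }
        return combined;
    }

    /** Multiply per-label probabilities (in log space to avoid underflow), then renormalize. */
    static double[] product(double[][] modelProbs) {
        int numLabels = modelProbs[0].length;
        double[] logSum = new double[numLabels];
        for (double[] probs : modelProbs) {
            for (int i = 0; i < numLabels; i++) {
                logSum[i] += Math.log(probs[i] + 1e-12);
            }
        }
        double max = Arrays.stream(logSum).max().getAsDouble();
        double z = 0.0;
        double[] combined = new double[numLabels];
        for (int i = 0; i < numLabels; i++) {
            combined[i] = Math.exp(logSum[i] - max);
            z += combined[i];
        }
        for (int i = 0; i < numLabels; i++) combined[i] /= z;
        return combined;
    }

    /** Index of the best label under the combined distribution. */
    static int argmax(double[] probs) {
        int best = 0;
        for (int i = 1; i < probs.length; i++) if (probs[i] > probs[best]) best = i;
        return best;
    }

    public static void main(String[] args) {
        // Two toy models scoring three labels for the same document.
        double[][] probs = { {0.6, 0.3, 0.1}, {0.2, 0.5, 0.3} };
        System.out.println("averaged best label index: " + argmax(average(probs)));
        System.out.println("product  best label index: " + argmax(product(probs)));
    }
}

The product rule is sharper but gets dragged down by any single model that assigns a near-zero probability to the true label; the average is more forgiving, so it's usually the safer default when the individual models are noisy.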
I'd also recommend trying out a simple naive Bayes classifier here, at least as a first pass (rough sketch at the end of this message).

On Wed, Apr 20, 2011 at 7:35 AM, Rao, Vaijanath <[email protected]> wrote:

> Hi All,
>
> I am trying to use maxent for the Large Scale Hierarchical challenge
> (http://lshtc.iit.demokritos.gr:10000/) contest.
>
> However, I could not get maxent to work on a large number of classes/categories
> (the dmoz test data has something like 28K classes and 580K+ features). So I decided
> to split the training data and merge the models after every few iterations. The split
> is decided by the category/class, so that all the instances belonging to one class
> reside in one split.
>
> Every few iterations, the model generated by each of these splits is merged (I merge
> all of the model data structures) and the estimated parameters are averaged.
>
> But even after something like 1000 iterations I don't see accuracy going beyond 70%,
> as after every merge there is a dip in overall accuracy. So I was wondering if there is
> a better way to merge.
>
> Can someone guide me on the split / incremental training, or should I try the
> perceptron model?
>
> --Thanks and Regards
> Vaijanath N. Rao
>

--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
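P.S. Here is roughly the kind of naive Bayes baseline I had in mind: a plain multinomial model with add-one smoothing. The class and method names are made up for illustration and it isn't tied to any toolkit, but it should give you a quick sanity-check number to compare the maxent runs against.

import java.util.*;

/** Sketch: multinomial naive Bayes with add-one (Laplace) smoothing.
 *  Features are just string ids; repeated features count multiple times. */
public class NaiveBayesBaseline {
    private final Map<String, Integer> labelCounts = new HashMap<>();
    private final Map<String, Map<String, Integer>> featureCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int numDocs = 0;

    /** Count one training document: its label and its features. */
    public void observe(String label, List<String> features) {
        numDocs++;
        labelCounts.merge(label, 1, Integer::sum);
        Map<String, Integer> counts =
            featureCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String f : features) {
            counts.merge(f, 1, Integer::sum);
            vocabulary.add(f);
        }
    }

    /** Pick the label maximizing log P(label) + sum over features of log P(feature | label). */
    public String classify(List<String> features) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : labelCounts.keySet()) {
            Map<String, Integer> counts = featureCounts.get(label);
            int total = counts.values().stream().mapToInt(Integer::intValue).sum();
            double score = Math.log(labelCounts.get(label) / (double) numDocs);
            for (String f : features) {
                int c = counts.getOrDefault(f, 0);
                score += Math.log((c + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) { bestScore = score; best = label; }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesBaseline nb = new NaiveBayesBaseline();
        nb.observe("sports", Arrays.asList("ball", "score", "team"));
        nb.observe("politics", Arrays.asList("vote", "election", "party"));
        System.out.println(nb.classify(Arrays.asList("score", "team")));  // prints "sports"
    }
}

For 28K classes and 580K+ features you'd want sparser data structures than nested hash maps, but the counting logic stays the same, and it trains in a single pass over the data.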
