Which classifier? How are you running it?
Can you publish your data?

On Wed, Nov 16, 2011 at 4:51 AM, Lyall Morrison <[email protected]> wrote:

> Hi everyone,
>
> I'm trying to classify some unsorted text files into different categories
> using a Bayesian classifier, and it's going well until I try to run a
> classifier with more than about 30 categories in it (the limit is between
> 27 and 32, I haven't nailed it down yet).
>
> The training process claims to work fine up to the ~150 categories I have
> identified, but actually running the classifier with a model with too many
> categories in it causes it to hang without reporting any errors.
>
> Can anyone tell me if there is a known limit here or suggest an easy way to
> diagnose this? My next resort is source diving, which I would prefer to
> avoid if I can.
>
> If I'm reading it correctly, the version I'm using is Mahout 0.5-SNAPSHOT
> which I haven't been keeping up to date as I feel better using a static
> codebase while I'm mucking around - at least that way if something stops
> working I know it's my fault ;)
>
> Thanks for your time,
>
> Lyall Morrison
