Hi everyone,

I'm trying to classify some unsorted text files into different categories
using a Bayesian classifier. It goes well until I try to run a classifier
with more than about 30 categories in it (the limit is somewhere between
27 and 32; I haven't nailed it down yet).

The training process claims to work fine with all of the ~150 categories I
have identified, but actually running the classifier against a model with
too many categories in it causes it to hang without reporting any errors.

Can anyone tell me if there is a known limit here or suggest an easy way to
diagnose this? My next resort is source diving, which I would prefer to
avoid if I can.

If I'm reading it correctly, the version I'm using is Mahout 0.5-SNAPSHOT.
I haven't been keeping it up to date, as I feel better using a static
codebase while I'm mucking around - at least that way, if something stops
working, I know it's my fault ;)

Thanks for your time,

Lyall Morrison
