Hi everyone, I am experimenting with the CVB algorithm and I have a few questions:
a) Is there any updated documentation? I have been collecting info from mailing lists, blogs, etc., and I have been writing a small beginner's tutorial; if you like, I'll send it.
b) Should I remove stop words before building the feature vectors? I am having some trouble "reading" the results.
c) Vectordump is not sorting well. Is this a reported bug? (I am building Mahout from trunk now.)
d) Any considerations on performance? I set 20 iterations on fewer than 10,000 docs, and it took 10 hours on a 5-node cluster.

Thanks!
Charly
