Scott,
Based on the dictionary output, it looks like the process of generating
vectors from your tokenized text is not working properly. The only term
that's making it into your dictionary is 'java' - everything else is being
filtered out. Furthermore, your tf vectors have a single dimension
Drew,
I'm sorry - I'm derelict (as opposed to dirichlet) in responding that I
got past my problem.
It was the min freq that was killing me. Forgot about that parameter.
Thank you for your assist.
Hope to be able to return the favor.
Am on the hook to update documentation for Mahout already
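
For anyone who hits the same symptom, here is a minimal sketch of how a minimum-frequency cutoff can shrink a dictionary to a single term. It is plain Python, not Mahout code: the function names, the `min_freq` parameter, and the sample documents are all made up for illustration, and the actual Mahout option and its default should be checked against the Mahout docs.

```python
from collections import Counter


def build_dictionary(tokenized_docs, min_freq=2):
    """Map each surviving term to an integer index.

    Hypothetical helper: a term is kept only if it occurs at least
    `min_freq` times across the whole corpus.
    """
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    kept = sorted(term for term, count in counts.items() if count >= min_freq)
    return {term: index for index, term in enumerate(kept)}


def tf_vector(doc, dictionary):
    """Sparse term-frequency vector as {term index: count}.

    Tokens that were filtered out of the dictionary are silently dropped.
    """
    return dict(Counter(dictionary[tok] for tok in doc if tok in dictionary))


# Made-up corpus: only 'java' appears in more than one document.
docs = [
    ["java", "clustering", "example"],
    ["java", "vector", "dictionary"],
    ["java", "tokenizer"],
]

strict = build_dictionary(docs, min_freq=2)  # rare terms are dropped
loose = build_dictionary(docs, min_freq=1)   # every term is kept

print(strict)                      # dictionary with the cutoff at 2
print(len(loose))                  # dictionary size with the cutoff at 1
print(tf_vector(docs[0], strict))  # tf vector built from the strict dictionary
```

On a small or very varied corpus most terms occur only once, so a cutoff of 2 leaves just the one term shared by every document, and every tf vector ends up with a single dimension. Lowering the cutoff brings the other terms back.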
On Sun, Jan 26, 2014 at 9:36 AM, Pat Ferrel p...@occamsmachete.com wrote:
I think I’ll leave dithering out until it goes live because it would seem
to make the eyeball test easier. I doubt all these experiments will survive.
With anti-flood if you turn the epsilon parameter to 1 (makes
Scott,
FYI... 0.9 Release is not official yet. The project trunk's still at
0.9-SNAPSHOT.
Please feel free to update the documentation.
On Sunday, January 26, 2014 1:34 PM, Scott C. Cote scottcc...@gmail.com wrote:
Drew,
I'm sorry - I'm derelict (as opposed to dirichlet) in responding
I understand that it is not official.
Am just trying to provide another test opportunity for the 0.9 release.
Scott
On 1/26/14 1:05 PM, Suneel Marthi suneel_mar...@yahoo.com wrote:
Scott,
FYI... 0.9 Release is not official yet. The project trunk's still at
0.9-SNAPSHOT.
Please feel free to
Thanks for the answers, actually I worked on a similar issue,
increasing the diversity of top-N lists
(http://link.springer.com/article/10.1007%2Fs10844-013-0252-9).
Clustering-based approaches produce good results and they are very
fast compared to some optimization based techniques. Also it