date:20140126

Re: Problem converting tokenized documents into TFIDF vectors

2014-01-26 Thread Drew Farris

Scott, Based on the dictionary output, it looks like the process of generating vectors from your tokenized text is not working properly. The only term that's making it into your dictionary is 'java' - everything else is being filtered out. Furthermore, your tf vectors have a single dimension

Re: Problem converting tokenized documents into TFIDF vectors

2014-01-26 Thread Scott C. Cote

Drew, I'm sorry - I'm derelict (as opposed to dirichlet) in responding that I got past my problem. It was the min freq that was killing me. Forgot about that parameter. Thank you for your assist. Hope to be able to return the favor. Am on the hook to update documentation for Mahout already
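
For readers following the thread, here is a minimal sketch of how a minimum-frequency threshold can shrink a dictionary to a single term, which matches the symptom Drew describes. It is plain Python, not Mahout code: the function names, the `min_support` parameter, and the sample documents are all hypothetical, and the thread does not say which Mahout option was actually involved.

```python
from collections import Counter


def build_dictionary(docs, min_support=2):
    """Map each term whose corpus-wide count is >= min_support to an integer id.

    `min_support` is a hypothetical name for the "min freq" idea in the thread.
    """
    counts = Counter(term for doc in docs for term in doc)
    kept = sorted(term for term, count in counts.items() if count >= min_support)
    return {term: index for index, term in enumerate(kept)}


def tf_vector(doc, dictionary):
    """Sparse term-frequency vector {term_id: count}; unknown terms are dropped."""
    counts = Counter(term for term in doc if term in dictionary)
    return {dictionary[term]: count for term, count in counts.items()}


# Made-up tokenized documents: only 'java' occurs more than once.
docs = [
    ["java", "heap", "tuning"],
    ["java", "thread", "pool"],
    ["java", "garbage", "collector"],
]

strict = build_dictionary(docs, min_support=2)  # only 'java' survives
loose = build_dictionary(docs, min_support=1)   # every term survives

print(strict)
print(tf_vector(docs[0], strict))
print(len(loose), tf_vector(docs[0], loose))
```

On a small test corpus most terms occur only once, so a threshold of 2 or more leaves every tf vector with one dimension; lowering the threshold restores the full vocabulary.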

Re: generic latent variable recommender question

2014-01-26 Thread Ted Dunning

On Sun, Jan 26, 2014 at 9:36 AM, Pat Ferrel p...@occamsmachete.com wrote: I think I’ll leave dithering out until it goes live because it would seem to make the eyeball test easier. I doubt all these experiments will survive. With anti-flood if you turn the epsilon parameter to 1 (makes

Re: Problem converting tokenized documents into TFIDF vectors

2014-01-26 Thread Suneel Marthi

Scott, FYI... 0.9 Release is not official yet. The project trunk's still at 0.9-SNAPSHOT. Please feel free to update the documentation. On Sunday, January 26, 2014 1:34 PM, Scott C. Cote scottcc...@gmail.com wrote: Drew, I'm sorry - I'm derelict (as opposed to dirichlet) in responding

Re: Problem converting tokenized documents into TFIDF vectors

2014-01-26 Thread Scott C. Cote

I understand that it is not official. Am just trying to provide another test opportunity for the 0.9 release. Scott On 1/26/14 1:05 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: Scott, FYI... 0.9 Release is not official yet. The project trunk's still at 0.9-SNAPSHOT. Please feel free to

Re: generic latent variable recommender question

2014-01-26 Thread Tevfik Aytekin

Thanks for the answers. Actually, I worked on a similar issue: increasing the diversity of top-N lists (http://link.springer.com/article/10.1007%2Fs10844-013-0252-9). Clustering-based approaches produce good results, and they are very fast compared to some optimization-based techniques. Also it

6 matches
