Re: TFIDFConverter generates empty tfidf-vectors

2013-09-08 Thread Gokhan Capan
Taner, it seems that to get tf-idf vectors later, you need to create the tf vectors (DictionaryVectorizer.createTermFrequencyVectors) with the logNormalize option set to false and the normPower option set to -1.0f. This applies to HighDFWordsPruner.pruneVectors, too. I believe that solves your problem. Best
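
To illustrate the order of operations described in this reply (unnormalized term-frequency vectors first, idf weighting afterwards), here is a minimal sketch in plain Python. It does not call Mahout; the function names and the textbook weighting count * log(N / df) are assumptions chosen for illustration, and Mahout's own weighting formula may differ.

```python
import math
from collections import Counter


def term_frequency_vectors(docs):
    """Return raw term counts per document.

    No log scaling and no length normalization is applied, which mirrors
    the idea of building tf vectors with normalization switched off.
    """
    return [Counter(doc.split()) for doc in docs]


def tfidf_vectors(tf_vectors):
    """Weight raw counts by a textbook idf, log(N / df).

    This is an illustrative formula, not necessarily the one Mahout uses.
    """
    n_docs = len(tf_vectors)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tf in tf_vectors:
        df.update(tf.keys())
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        for tf in tf_vectors
    ]


# Hypothetical toy corpus, whitespace-tokenized.
docs = [
    "mahout builds vectors",
    "mahout clusters vectors",
    "solr indexes text",
]
tf = term_frequency_vectors(docs)
tfidf = tfidf_vectors(tf)
```

The point of the sketch is only that the idf step consumes raw counts; any normalization would be applied after the weighting, not before it.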

Re: Solr recommender

2013-09-08 Thread Pat Ferrel
I've been looking at examples of recommenders with an eye to reverse engineering what's good and bad. Hard to say with any certainty, of course. Netflix: has a bunch of different recommendation lists, some personalized, some based on different forms of popularity or item similarity. The one

Re: CDBw Usage

2013-09-08 Thread Jeff Eastman
Hi Pablo, Look in the CDBw unit tests for examples of invoking it from Java code. Jeff On Sep 6, 2013, at 5:56 PM, Pablo Andretta Jaskowiak pajaskow...@gmail.com wrote: Hello, I'm trying to use the CDBw implementation from Mahout. Given that I have a dataset in CSV format and a

Re: Hadoop implementation of ParallelSGDFactorizer

2013-09-08 Thread Tevfik Aytekin
Thanks Sebastian. On Sat, Sep 7, 2013 at 8:24 PM, Sebastian Schelter ssc.o...@googlemail.com wrote: IIRC the algorithm behind ParallelSGDFactorizer needs shared memory, which is not given in a shared-nothing environment. On 07.09.2013 19:08, Tevfik Aytekin wrote: Hi, There seems to be no