date:20140527

Indicator Matrix and Mahout + Solr recommender

2014-05-27 Thread Pat Ferrel

I was talking with Ken Krugler off list about the Mahout + Solr recommender and he had an interesting request. When calculating the indicator/item similarity matrix using ItemSimilarityJob there is a --threshold option. Wouldn’t it be better to have an option that specified the fraction of

Re: Indicator Matrix and Mahout + Solr recommender

2014-05-27 Thread Ted Dunning

The threshold should not normally be used in the Mahout+Solr deployment style. This need is better supported by specifying the maximum number of indicators. This is mathematically equivalent to specifying a fraction of values, but is more meaningful to users since good values for this number are
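
To illustrate the point about capping the number of indicators rather than thresholding on score, here is a minimal Python sketch. It is not Mahout code: the function names, the dict-based row layout, and the scores are hypothetical and only show why keeping the top k of n candidates amounts to keeping the fraction k/n.

```python
def top_k_indicators(row, k):
    """Keep the k highest-scoring (item, score) pairs from one similarity row.

    `row` is a hypothetical dict mapping item id -> similarity score.
    """
    ranked = sorted(row.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]


def kept_fraction(row, k):
    """Fraction of the row's candidates that survive a top-k cut."""
    if not row:
        return 0.0
    return min(k, len(row)) / len(row)


# Made-up scores for one item's row of the indicator matrix.
example_row = {"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.7}
top_two = top_k_indicators(example_row, 2)
fraction = kept_fraction(example_row, 2)
```

With four candidates and k = 2, the cut keeps half of the row whatever the score scale is, whereas a fixed score threshold keeps a different number of entries for each row.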

Re: Indicator Matrix and Mahout + Solr recommender

2014-05-27 Thread Pat Ferrel

On May 27, 2014, at 8:15 AM, Ted Dunning ted.dunn...@gmail.com wrote: The threshold should not normally be used in the Mahout+Solr deployment style. Understood and that’s why an alternative way of specifying a cutoff may be a good idea. This need is better supported by specifying the

Re: Indicator Matrix and Mahout + Solr recommender

2014-05-27 Thread Sebastian Schelter

I have added the threshold merely as a way to increase the performance of RowSimilarityJob. If a threshold is given, some item pairs don't need to be looked at. A simple example is if you use cooccurrence count as the similarity measure, and set a threshold of n cooccurrences, then any pair
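
The snippet above is cut off, so the following Python sketch is only an assumption about how a cooccurrence-count threshold can let some pairs be skipped. It does not describe what RowSimilarityJob actually does. The function name and the basket data are hypothetical. The idea: an item that appears in fewer than n baskets cannot cooccur n times with anything, so pairs containing it never need to be counted.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(baskets, threshold):
    """Return item pairs that cooccur in at least `threshold` baskets.

    Assumed pruning rule (not taken from Mahout): drop items that occur in
    fewer than `threshold` baskets before any pairs are generated.
    """
    # Count how many baskets each item appears in.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent = {item for item, count in item_counts.items() if count >= threshold}

    pair_counts = Counter()
    for basket in baskets:
        # Only pairs of surviving items are generated and counted.
        kept = sorted(set(basket) & frequent)
        pair_counts.update(combinations(kept, 2))

    return {pair: count for pair, count in pair_counts.items() if count >= threshold}


# Made-up interaction data: each inner list is one user's items.
baskets = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
result = cooccurrence_pairs(baskets, threshold=2)
```

Here item "d" occurs only once, so the pair ("b", "d") is never generated, and ("b", "c") is counted but dropped because it cooccurs only once.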

Re: Confusion on runtime of mahout.

2014-05-27 Thread Jay Vyas

Have you verified that all the slaves are running tasks? Sometimes only a few slaves on a cluster will pick up a task because of other limitations. Also, some algorithms in Mahout aren't distributed. Also, obviously you will want to make sure that you're running the distributed implementations of

Re: Confusion on runtime of mahout.

2014-05-27 Thread dongdan39

Yes, those nodes are running tasks. For Logistic Regression, it's reasonable, as this algorithm has only a sequential implementation. But for Naive Bayes and Random Forest, it's hard to understand. By the way, how do I know/check if I am running the distributed implementation of these algorithms?

Changing output columns of logistic regression

2014-05-27 Thread Chhaya Vishwakarma

Hi, Logistic regression gives output with three columns: Target, Model-output, and likelihood. Is it possible to add more columns to the output? I would like to add an ID column so that I can join the logistic regression result with the input data. Regards, Chhaya Vishwakarma The
