Re: Own recommender

2015-01-15 Thread Ted Dunning
The old Taste code is not the state of the art. User-based recommenders built on that will be slow. On Thu, Jan 15, 2015 at 7:10 AM, Juanjo Ramos jjar...@gmail.com wrote: Hi David, You implement your custom algorithm and create your own class that implements the UserSimilarity interface.

mahout 1.0 on EMR with spark

2015-01-15 Thread Pasmanik, Paul
Has anyone tried running Mahout 1.0 on EMR with Spark? I've used the instructions at https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark to get an EMR cluster running Spark. I am now able to deploy an EMR cluster with Spark using the AWS Java APIs. EMR allows running a custom script as

Re: Own recommender

2015-01-15 Thread Juanjo Ramos
Hi David, You implement your custom algorithm and create your own class that implements the UserSimilarity interface. When you then instantiate your User-Based recommender, just pass your custom class for the UserSimilarity parameter. Best. On Thu, Jan 15, 2015 at 1:11 PM, ARROYO MANCEBO David

Re: boost selected dimensions in kmeans clustering

2015-01-15 Thread Ted Dunning
On Thu, Jan 15, 2015 at 5:23 AM, Miguel Angel Martin junquera mianmarjun.mailingl...@gmail.com wrote: My question is: is it better to scale up these dimensions directly in the final mixed tf-idf sequence file using these correction factors, or to first scale up each of the tf-vectors and

Re: DTW distance measure and K-medioids, Hierarchical clustering

2015-01-15 Thread marko . dinic
Dear Ted, Thank you very much for your answer. It is very inspiring for a beginner like me to see the effort that you put into answering questions like mine, which I'm sure look trivial. And the whole community involved is great. So, to summarize, my idea of K-medoids with DTW as a

Re: DTW distance measure and K-medioids, Hierarchical clustering

2015-01-15 Thread Ted Dunning
Anand, That is a fine idea. It is called a medoid instead of a mean (https://en.wikipedia.org/wiki/Medoid). The basic idea is that for any metric m, you can define the medoid as the element from the set that minimizes the sum of the distances to the other elements for that metric. In
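
The medoid definition above can be sketched in a few lines. This is only an illustration in plain Python, not Mahout code: the function name `medoid`, the sample points, and the absolute-difference metric are all assumptions chosen for the example. Any metric, DTW included, could be passed in the same way.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def medoid(items: Sequence[T], metric: Callable[[T, T], float]) -> T:
    """Return the element of `items` with the smallest summed distance to the others.

    `metric` is any distance function m(a, b); it is a parameter here so that
    a different measure can be swapped in without changing the search.
    """
    if not items:
        raise ValueError("the medoid of an empty set is undefined")
    # Brute-force search: O(n^2) metric evaluations. The distance of an
    # element to itself is assumed to be 0, so including it changes nothing.
    return min(items, key=lambda c: sum(metric(c, other) for other in items))


# Hypothetical 1-D data with one outlier; the metric is absolute difference.
points = [1.0, 2.0, 3.0, 4.0, 100.0]
center = medoid(points, lambda a, b: abs(a - b))
print(center)  # 3.0, whereas the mean of these points is 22.0
```

Unlike the mean, the result is always one of the input elements, which is why it works for any metric where averaging is not meaningful.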

Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce

2015-01-15 Thread unmesha sreeveni
Is there any way? Waiting for a reply. I have posted the question everywhere, but no one has responded. I feel like this is the right place to ask, as some of you may have come across the same issue and gotten stuck. On Thu, Jan 15, 2015 at 12:34 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

Re: DTW distance measure and K-medioids, Hierarchical clustering

2015-01-15 Thread Marko Dinic
Ted, Thank you for your answer. Maybe I gave the wrong picture of my data by using a sinusoid as an example; my time series are not periodic. Let's say that I have a signal that represents the value of power when some device is turned on. That power signal depends on the time when a person turns

Own recommender

2015-01-15 Thread ARROYO MANCEBO David
Hi folks, How can I start building my own recommender system in Apache Mahout with my own algorithm? I need a custom UserSimilarity. Maybe a subclass of UserSimilarity, like PearsonCorrelationSimilarity? Thanks. Regards :)

Re: boost selected dimensions in kmeans clustering

2015-01-15 Thread Miguel Angel Martin junquera
Hi Ted, Yes, I was considering various possibilities. One of them was this (scaling up these dimensions, for example by multiplying by a configurable correction factor). I really want to mix two different vectors from the same documents, with different lengths and dictionaries (perhaps some
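
The idea of multiplying selected dimensions by a configurable correction factor might look like the sketch below. It is plain Python, not Mahout's vector API: the sparse-dict representation, the name `boost_dimensions`, and the sample values are assumptions made for illustration. Note that with a Euclidean distance, scaling a dimension by a factor f multiplies its contribution to the squared distance by f squared.

```python
from typing import Dict


def boost_dimensions(vector: Dict[int, float], boosts: Dict[int, float]) -> Dict[int, float]:
    """Return a copy of a sparse vector with selected dimensions scaled.

    `vector` maps dimension index -> weight (for example a tf-idf weight);
    `boosts` maps dimension index -> correction factor. Dimensions that are
    not listed in `boosts` keep their original weight (factor 1.0).
    """
    return {dim: weight * boosts.get(dim, 1.0) for dim, weight in vector.items()}


# Hypothetical document vector and correction factors.
doc = {0: 0.5, 3: 1.5, 7: 0.25}
boosts = {3: 2.0}  # make dimension 3 count more during clustering
boosted = boost_dimensions(doc, boosts)
print(boosted)  # {0: 0.5, 3: 3.0, 7: 0.25}
```

Applying the factors to each vector before clustering keeps the configuration in one place, so the same boosts can be reused across every document.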