Re: User based recommender

2014-12-04 Thread Ted Dunning
On Wed, Dec 3, 2014 at 6:22 AM, Yash Patel yashpatel1...@gmail.com wrote: I have multiple different columns such as category, shipping location, item price, online user, etc. How can I use all these different columns and improve recommendation quality (i.e., calculate more precise similarity

Re: User based recommender

2014-12-04 Thread Yash Patel
Calculating similarity using multiple column values is what I had in mind. I looked through the example, but it only mentions that content filtering is implemented, without showing it explicitly. Can you guide me to a working example, or do I need to use algorithms for classifiers or clustering? Also

Few Questions related Mahout used for Text Clustering

2014-12-04 Thread Viral Parikh
Hi Mahout Users! Firstly, this community is great and I appreciate all the Q&A back and forth! I am currently working on text clustering and I am using Mahout and its clustering algorithms (kmeans, krunner, canopy, etc.) for that. If anyone has worked on a similar project please let me know. I

Process UnStructured Data in Mahout for Clustering

2014-12-04 Thread Shahid Shaikh
Hi All, I have been trying Mahout clustering on unstructured data, i.e. human-written data. I have tried Mahout clustering algorithms like Kmeans, Canopy+Kmeans and LDA, but the results produced are not helpful. I see the problem is with the way the data is written. Can someone please provide

Re: Process UnStructured Data in Mahout for Clustering

2014-12-04 Thread Donni Khan
Hi, it depends on the nature of the data you are clustering. If you have knowledge about your data, you can make sense of the results and you can also set the correct parameters for the clustering algorithm, like the number of topics or the number of clusters. Cheers, Donni On Thu, Dec 4, 2014 at 2:38 PM, Shahid

Re: User based recommender

2014-12-04 Thread Yash Patel
Cross recommenders don't seem applicable because this dataset doesn't represent different actions by a user; it just contains transaction history (i.e., customer id, item id, shipping location, sales amount of that item, item category, etc.). Maybe location, sales per item (similarity might lead to knowledge

Re: Process UnStructured Data in Mahout for Clustering

2014-12-04 Thread Shahid Shaikh
Hey Donni, thanks, but I have used the configurations and obtained the clusters. The results are not promising enough. I was looking to see if there are any known techniques I can follow, specifically while generating vectors. Thanks On Thursday, December 4, 2014, Donni Khan prince.don...@googlemail.com

Re: Process UnStructured Data in Mahout for Clustering

2014-12-04 Thread Brian Dolan
My experience has been that it's best to leave the data processing for Python. I strongly suggest you re-write your ETL and let Mahout only do the clustering. The built-in vectorization routines are fairly primitive. Then I would wash the features, basically set up your own list of stop words
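A minimal sketch of the kind of preprocessing described above, done in Python before any clustering step. The stop-word list, the tokenization rule, and the sample documents are assumptions made for illustration; they are not from the original message, and the sketch does not show how the vectors would be written out in a format Mahout reads.

```python
import re
from collections import Counter

# Hypothetical stop-word list; in practice it would be built for the data at hand.
STOP_WORDS = {"the", "a", "an", "and", "is", "to", "of"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def vectorize(docs):
    """Return a sorted vocabulary and one term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    index = {term: i for i, term in enumerate(vocab)}
    vectors = []
    for tokens in tokenized:
        vec = [0] * len(vocab)
        for term, count in Counter(tokens).items():
            vec[index[term]] = count
        vectors.append(vec)
    return vocab, vectors


# Made-up sample documents, for illustration only.
docs = ["The stroller is easy to fold", "Fold the crib and the stroller"]
vocab, vectors = vectorize(docs)
print(vocab)
print(vectors)
```

Keeping the stop-word list and tokenizer in your own code makes it easy to inspect which features actually reach the clustering step.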

Re: User based recommender

2014-12-04 Thread Pat Ferrel
User1 purchases = infant car seat, infant stroller
User2 purchases = infant car seat, infant stroller, infant crib mobile
The obvious recommendation for User1 is an infant crib mobile. From the purchase history the users look similar. Here similarity is in “taste”. User or item information that
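The purchase-history example above can be sketched in a few lines of Python. This is an illustrative sketch only, not Mahout's implementation: it assumes Jaccard overlap of purchase sets as the user-similarity measure, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of purchased items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, purchases):
    """Rank items the target user has not bought, weighted by user similarity."""
    owned = purchases[target]
    scores = {}
    for user, items in purchases.items():
        if user == target:
            continue
        sim = jaccard(owned, items)
        if sim == 0:
            continue  # users with no overlap contribute nothing
        for item in items - owned:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


purchases = {
    "User1": {"infant car seat", "infant stroller"},
    "User2": {"infant car seat", "infant stroller", "infant crib mobile"},
}
recommendations = recommend("User1", purchases)
print(recommendations)  # ['infant crib mobile']
```

The two users share two of three distinct items, so their similarity is 2/3, and the only item User1 lacks is the crib mobile.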

Mahout used for Text Clustering

2014-12-04 Thread Viral Parikh
Hi Mahout Users! I am currently working on text clustering and I am using Mahout and its clustering algorithms (kmeans, LDA, canopy, etc.) for that. I have the questions below:
1. Why is Mahout giving out clusters with only 1 observation?
2. Is cluster 1 always a catch-all cluster?
3. When I change the k

Topological data analysis

2014-12-04 Thread Andrew Musselman
Any interest in a topological data analysis package in Mahout? https://www.google.com/search?q=topological+data+analysis http://danifold.net/mapper/introduction.html http://danifold.net/mapper Would be nice to be able to run jobs and export to JSON for consumption in D3 or other

Re: Topological data analysis

2014-12-04 Thread Brian Dolan
Though I don't have an immediate use case, I'd +1 the idea! On Dec 4, 2014, at 3:11 PM, Andrew Musselman andrew.mussel...@gmail.com wrote: Any interest in a topological data analysis package in Mahout? https://www.google.com/search?q=topological+data+analysis