On Wed, Dec 3, 2014 at 6:22 AM, Yash Patel yashpatel1...@gmail.com wrote:
I have multiple different columns such as category, shipping location, item
price, online user, etc.
How can I use all these different columns to improve recommendation
quality (i.e. calculate a more precise similarity)?
Calculating similarity using multiple column values is what I had in mind. I
looked through the examples, but there is only some mention of content
filtering, and it is not implemented explicitly.
Can you guide me to a working example, or do I need to use
classification or clustering algorithms?
Also
Hi Mahout Users!
Firstly, this community is great, and I appreciate all the Q&A back and
forth!
I am currently working on text clustering, and I am using Mahout and its
clustering algorithms (kmeans, krunner, canopy, etc.) for that.
If anyone has worked on a similar project, please let me know. I
Hi All,
I have been trying Mahout clustering on unstructured data, i.e. human-written
data. I have tried Mahout clustering algorithms such as k-means,
Canopy+k-means, and LDA, but the results produced are not helpful.
I think the problem is with the way the data is written. Can someone please
provide
Hi
It depends on the nature of the data you are clustering. If you know your
data, you can make sense of the results, and you can also set the right
parameters for the clustering algorithm, such as the number of topics or the
number of clusters.
Cheers,
Donni
On Thu, Dec 4, 2014 at 2:38 PM, Shahid
Cross-recommenders don't seem applicable because this dataset doesn't
represent different actions by a user; it just contains transaction
history (i.e. customer id, item id, shipping location, sales amount of that
item, item category, etc.).
Maybe location, sales per item (similarity might lead to knowledge
Hey Donni, thanks, but I have used those configurations and obtained the
clusters. The results are not promising enough. I was wondering whether there
are any known techniques I can follow specifically when generating vectors.
Thanks
On Thursday, December 4, 2014, Donni Khan prince.don...@googlemail.com
My experience has been that it's best to leave the data processing to Python.
I strongly suggest you rewrite your ETL and let Mahout only do the clustering.
The built-in vectorization routines are fairly primitive.
Then I would wash the features, basically set up your own list of stop words
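
As a rough illustration of the advice above (do the text cleanup and vectorization in Python, with your own stop-word list), here is a minimal sketch that uses only the standard library. The stop-word list, the function names, and the particular TF-IDF weighting are assumptions made for the example; they are not from the thread and not part of Mahout.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; extend it with domain-specific noise terms.
STOP_WORDS = {"the", "a", "an", "and", "is", "to", "of", "in", "for", "it"}


def tokenize(text):
    """Lowercase, keep alphabetic runs, drop stop words and 1-letter tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per input document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents each term appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = sum(counts.values())
        vec = {}
        for term, count in counts.items():
            tf = count / total
            # The +1 keeps terms that occur in every document at a nonzero weight.
            idf = math.log(n_docs / doc_freq[term]) + 1.0
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors
```

The resulting sparse vectors would still have to be written out in whatever input format the clustering job expects; that step is not shown here.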
User1 purchases = infant car seat, infant stroller
User2 purchases = infant car seat, infant stroller, infant crib mobile
The obvious recommendation for User1 is an infant crib mobile. From the
purchase history the users look similar. Here similarity is in “taste”. User or
item information that
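
The purchase-history example above can be sketched in a few lines of Python. This is only an illustration of similarity in "taste": it uses plain Jaccard similarity over purchase sets, which is a stand-in chosen for brevity and not necessarily the measure Mahout's recommenders use. The function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two item collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, purchases, top_n=3):
    """Rank items the target user lacks by the similarity of users who own them."""
    owned = set(purchases[target])
    scores = {}
    for user, items in purchases.items():
        if user == target:
            continue
        sim = jaccard(owned, items)
        if sim == 0:
            continue  # users with no overlap contribute nothing
        for item in set(items) - owned:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


purchases = {
    "User1": ["infant car seat", "infant stroller"],
    "User2": ["infant car seat", "infant stroller", "infant crib mobile"],
}

print(recommend("User1", purchases))  # ['infant crib mobile']
```

Here User1 and User2 share two of three distinct items, so their similarity is 2/3, and the one item User1 lacks becomes the recommendation.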
Hi Mahout Users!
I am currently working on Text Clustering and I am using Mahout and Clustering
algorithms (kmeans, LDA, canopy etc) for that.
I have the following questions:
1. Why is Mahout giving out clusters with only 1 observation?
2. Is cluster 1 always a catch-all cluster?
3. When I change the k
Any interest in a topological data analysis package in Mahout?
https://www.google.com/search?q=topological+data+analysis
http://danifold.net/mapper/introduction.html
http://danifold.net/mapper
Would be nice to be able to run jobs and export to JSON for consumption
in D3 or other
Though I don't have an immediate use case, I'd +1 the idea!
On Dec 4, 2014, at 3:11 PM, Andrew Musselman andrew.mussel...@gmail.com wrote:
Any interest in a topological data analysis package in Mahout?
https://www.google.com/search?q=topological+data+analysis