Hi everyone.
I have a data set that looks like this:
Number of users: 198651
Number of items: 9972
Statistics of purchases from users:
mean number of purchases: 3.3
stdDev number of purchases: 3.5
min number of purchases: 1
max number of purchases: 176
median number
Yes, I don't know if removing that data would improve results. It might
mean you can compute things faster, at little or no observable loss in
quality of the results.
I'm not sure, but you probably have repeat purchases of the same item, and
items of different value. Working that data in may help
I can think of only 2 possibilities:
- in the script, I think it goes through the if statements to line 251
where the HADOOP_CLASSPATH is being set; that line differs from line
243 where the CLASSPATH you set also gets added. So, it seems that the
CLASSPATH you set isn't being passed to hadoop.
I modified line 251 like this:
export HADOOP_CLASSPATH=$MAHOUT_CONF_DIR:${HADOOP_CLASSPATH}:$CLASSPATH
Now I no longer get the original class-not-found exception, but I get: Error:
java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
I found a big discussion regarding this error at
I'm clustering (non-textual) data. Some of the features in my vectors represent
discrete values or types such that, for example, one feature may have the
range of values 0=red, 1=blue, 2=green, 3=yellow.
I could also have characterized the same data as 4 features where the value of
the feature
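The two encodings described in this message can be compared with a small sketch. This is only an illustration under assumptions: the helper names `one_hot` and `sq_dist` are hypothetical, and squared Euclidean distance is assumed as the clustering distance, which the message does not state.

```python
# Codes 0..3 as given in the message: 0=red, 1=blue, 2=green, 3=yellow.
COLORS = ["red", "blue", "green", "yellow"]

def one_hot(code, n_values=len(COLORS)):
    """Encode an integer category code as n_values binary features."""
    if not 0 <= code < n_values:
        raise ValueError(f"code {code} is outside 0..{n_values - 1}")
    return [1.0 if i == code else 0.0 for i in range(n_values)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# As a single integer feature, red (0) looks further from yellow (3)
# than from blue (1), although the codes carry no ordering.
single_red_blue = sq_dist([0.0], [1.0])
single_red_yellow = sq_dist([0.0], [3.0])

# As four binary features, every pair of distinct colors is equally far apart.
hot_red_blue = sq_dist(one_hot(0), one_hot(1))
hot_red_yellow = sq_dist(one_hot(0), one_hot(3))
```

With the single integer feature, the distance depends on how the codes happen to be numbered; with four binary features it does not.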
There are several methods for recommending things given a shopping cart's
contents. At the risk of using the same tool for every problem I was thinking
about a recommender's use here.
I'd do something like train on shopping cart purchases so row = cartID, column
= itemID.
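A minimal sketch of that layout, assuming the input is a list of (cartID, itemID) purchase pairs. The function name and the sample IDs are hypothetical, and the sparse rows are kept as plain Python sets rather than any Mahout data structure.

```python
from collections import defaultdict

def build_cart_item_matrix(purchases):
    """Group (cart_id, item_id) pairs into sparse binary rows.

    Each row is the set of item IDs present in one cart, which is
    equivalent to a 0/1 matrix with row = cartID and column = itemID.
    """
    rows = defaultdict(set)
    for cart_id, item_id in purchases:
        rows[cart_id].add(item_id)
    return dict(rows)

# Hypothetical sample data, for illustration only.
purchases = [
    ("cart1", "itemA"), ("cart1", "itemB"),
    ("cart2", "itemA"), ("cart2", "itemC"),
    ("cart1", "itemA"),  # a repeated pair collapses to a single 1
]
matrix = build_cart_item_matrix(purchases)
```

Using sets discards repeat purchases within a cart; counts could be kept instead if that signal matters.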
Given cart contents I
Hey Vignesh,
Are there specific things you need? I've built a classification
implementation in the past with naive Bayes, plus a real-time service to serve up
the results of this data. Let me know if you have specific questions.
Regards
Date: Thu, 14 Feb 2013 10:18:31 +0530
Subject: Reg
I think that this is an excellent use case for cross recommendation from
cart contents (items) to cart purchases (items). The cross aspect is that
the recommendation is from two different kinds of actions, not two kinds of
things. The first action is insertion into a cart and the second is
I thought you might say that but we don't have the add-to-cart action. We have
to calculate cart purchases by matching cart IDs or session IDs. So we only
have cart purchases with items.
If we had the add-to-cart and the purchase we could use your cross-action
method for getting recs by
This sounds like a job for frequent item set mining, which is kind of a
special case of the ideas you've mentioned here. Given N items in a cart,
which next item most frequently occurs in a purchased cart?
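The question posed here can be sketched directly as a count over purchased carts. This is a simplified illustration, not a full frequent item set miner: the function name and sample carts are hypothetical, and no minimum-support threshold is applied.

```python
from collections import Counter

def next_item_counts(purchased_carts, current_items):
    """Count which other items appear in purchased carts that contain
    every item currently in the cart."""
    current = set(current_items)
    counts = Counter()
    for cart in purchased_carts:
        cart = set(cart)
        if current <= cart:          # cart contains all current items
            counts.update(cart - current)
    return counts

# Hypothetical purchased carts, for illustration only.
carts = [
    {"milk", "bread", "eggs"},
    {"milk", "bread", "butter"},
    {"milk", "bread", "eggs", "jam"},
    {"milk", "tea"},
]
best = next_item_counts(carts, {"milk", "bread"}).most_common(1)
# best == [("eggs", 2)]
```

A real miner would avoid rescanning every cart per query, but the counting logic is the same idea.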
On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel pat.fer...@gmail.com wrote:
I thought you
Appreciate the replies!
Yes, this problem has been pretty much beaten to shreds, so much so
that I wrote it into the troubleshooting in section
5 of the manual
(https://cwiki.apache.org/confluence/download/attachments/27832158/SSVD-CLI.pdf?version=17modificationDate=134085000).
Aha, it
Yes, one time-tested way to do this is the Apriori algorithm, which looks at
frequent item sets and creates rules.
I was looking for a shortcut using a recommender, which would be super easy to
try. The rule builder is a little harder to implement but we can also test
precision on that and compare
I don't think it's necessarily slow; this is how item-based recommenders
work. The only thing stopping you from using Mahout directly is that I
don't think there's an easy way to say "recommend to this collection of
items". But that's what is happening inside when you recommend for a user.
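To illustrate what "recommend to this collection of items" could look like, here is a rough sketch that scores candidates by raw co-occurrence counts. It is not Mahout's API or its similarity measure; the function names and sample carts are hypothetical, and raw counts stand in for whatever item similarity a real item-based recommender would use.

```python
from collections import Counter
from itertools import permutations

def cooccurrence_counts(carts):
    """Return {item: Counter(other_item -> number of carts shared)}."""
    co = {}
    for items in carts:
        for a, b in permutations(items, 2):
            co.setdefault(a, Counter())[b] += 1
    return co

def recommend_for_items(co, current_items, top_n=3):
    """Rank items not yet in the collection by summed co-occurrence
    with the items that are in it."""
    current = set(current_items)
    scores = Counter()
    for item in current:
        for other, count in co.get(item, {}).items():
            if other not in current:
                scores[other] += count
    return [item for item, _ in scores.most_common(top_n)]

# Hypothetical purchased carts, for illustration only.
carts = [{"a", "b", "c"}, {"a", "b"}, {"a", "c", "d"}, {"b", "d"}]
co = cooccurrence_counts(carts)
recs = recommend_for_items(co, {"a", "b"})
# recs == ["c", "d"]  (c scores 2 + 1 = 3, d scores 1 + 1 = 2)
```

Because the query is just a set of items, the current cart does not have to be added to the training data first.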
You can
I'm creating a matrix of cart IDs and item IDs, so cart x items in cart. The
'preference' then is cartID, itemID. I think this will create the correct
matrix.
For any cart id I would get a ranked list of recommended items that was
calculated from other carts. This seems like what is needed in
Yes, your only issue there, which I think you had touched on, is that you
have to put your current cart (which hasn't been purchased) into the model
in order to get an answer out of a recommender. I think we've talked about
the recommend-to-anonymous function in the context of another system,
So to answer my own question, the order of training matters. I had been
training on all category 1 examples and then all category 0 examples.
Apparently this breaks things badly
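One common remedy is to shuffle the examples before feeding them to the learner. The sketch below assumes an online learner that updates once per example (the message does not say which classifier was used); the function name, seed, and document IDs are hypothetical.

```python
import random

def shuffled_examples(examples, seed=42):
    """Return a shuffled copy so the two categories are interleaved.

    A fixed seed keeps the training order reproducible between runs.
    """
    rng = random.Random(seed)
    mixed = list(examples)   # copy, so the caller's list is left untouched
    rng.shuffle(mixed)
    return mixed

# Hypothetical data in the problematic order: all category 1, then all category 0.
ordered = [(f"doc{i}", 1) for i in range(50)] + \
          [(f"doc{i}", 0) for i in range(50, 100)]
training_order = shuffled_examples(ordered)
```

The learner would then iterate over `training_order` instead of `ordered`.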
On Wed, Feb 13, 2013 at 4:29 PM, Brian McCallister bri...@skife.org wrote:
I'm trying to do a basic two category classifier on textual data, I
Do you see the contents of the cart?
Is the cart ID opaque? Does it persist as a surrogate for a user?
On Thu, Feb 14, 2013 at 10:30 AM, Pat Ferrel pat.fer...@gmail.com wrote:
I thought you might say that but we don't have the add-to-cart action. We
have to calculate cart purchases by
Sure, we have cart/session IDs, item IDs, and user IDs when purchases are made
or when asked for a recommendation from the cart page.
We currently don't get the add-to or remove-from cart actions. We could get
them.
Are you thinking that we can use the add-to-cart user x item matrix and