The current state of the art in ad targeting is contextual bandits backed by logistic or probit regression. The Mahout logistic regression is a decent first step toward this, but it probably doesn't provide the necessary accuracy.
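
As a rough illustration of the idea, a contextual bandit with one online logistic regression per ad might look like the sketch below. It is not Mahout code. The class names, the epsilon-greedy exploration rule (chosen only for brevity), the learning rate, and the simulated click rates are all assumptions made for illustration.

```python
import math
import random


class LogisticArm:
    """Online logistic regression estimating click probability for one ad."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict(self, context):
        z = sum(w * x for w, x in zip(self.weights, context))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, context, clicked):
        # One stochastic gradient step on the log loss.
        error = clicked - self.predict(context)
        for i, x in enumerate(context):
            self.weights[i] += self.learning_rate * error * x


class EpsilonGreedyBandit:
    """Shows the ad with the highest predicted click rate, exploring at random."""

    def __init__(self, n_ads, n_features, epsilon=0.1, seed=0):
        self.arms = [LogisticArm(n_features) for _ in range(n_ads)]
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.arms))  # explore
        scores = [arm.predict(context) for arm in self.arms]
        return max(range(len(scores)), key=scores.__getitem__)  # exploit

    def learn(self, ad, context, clicked):
        # Only the ad that was actually shown gets feedback.
        self.arms[ad].update(context, clicked)


def simulate(rounds=20000, seed=0):
    """Train on a made-up world with two user segments and two ads."""
    # Hypothetical click rates, indexed as click_rate[ad][segment].
    click_rate = [[0.1, 0.8], [0.8, 0.1]]
    env = random.Random(seed + 1)
    bandit = EpsilonGreedyBandit(n_ads=2, n_features=2, seed=seed)
    for _ in range(rounds):
        segment = env.randrange(2)
        context = [1.0, float(segment)]  # bias term plus a segment indicator
        ad = bandit.choose(context)
        clicked = 1.0 if env.random() < click_rate[ad][segment] else 0.0
        bandit.learn(ad, context, clicked)
    return bandit
```

After calling simulate(), setting the returned bandit's epsilon to 0 and calling choose() with a context vector gives the purely greedy pick for that context. Because each ad carries its own small model that learns from the first impression onward, a newly added ad starts competing immediately, which matters when ads churn quickly.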

I have some early work on the bandit algorithms on GitHub, but it is still at an early stage.

I think that using a recommender with ad features only would give you a very weak ad targeting algorithm because of the high level of ad churn and the generally poor quality of ad metadata.

Sent from my iPhone

On Apr 4, 2012, at 4:44 AM, Sean Owen <[email protected]> wrote:

> I would recommend you use (only) the ad data. These are "boolean" data
> points in the recommender engine speak. You can 'recommend' ads this
> way.
>
> I understand your question is a bit more than that. First you want to
> use the *not*-clicked data. My first question is, is this meaningful?
> I am served 1000 ads per day that I don't even look at; that I do not
> click them does not say much. Is your situation some kind of
> interstitial ad that the user is forced to skip? That's more
> meaningful, but the same comment applies.
>
> If you really do have such meaningful data, consider making a separate
> "anti-recommender" out of this data. This will tell you which ads are
> probably worst to show. You could merge the two results then to make
> your decision.
>
> What to do with purchase data? You could ignore it on the grounds that
> when recommending ads, the only thing that matters is its ability to
> induce a click -- whether it results in a purchase is a different
> matter.
>
> Or you could view it as reaffirming that the ad click was a "strong
> click", that it is more likely the user was not merely curious or
> mis-clicked, but was significantly more interested in the advertised
> product.
>
> You could go back and add "ratings" to your model -- a "1" for a click
> and a "5" for a click that results in purchase? It's quite arbitrary
> and I don't know if the results are much better.
>
> If you're serious about using this data too, I would again recommend
> looking at the ALS algorithm as presented in
> www2.research.att.com/~yifanhu/PUB/cf.pdf -- their model is nice in
> that it ingests a "confidence" in the association between a user and
> item, which is much more like what you have than a "rating".
>
>
> On Wed, Apr 4, 2012 at 10:35 AM, vinutha <[email protected]> wrote:
>>
>> Hello!
>>
>> I have a data set containing user behavior, such as which products s/he
>> clicked on and which products s/he bought from a retail site. I have
>> another data set containing which ads the same user has clicked on, and
>> the ads which were shown to him/her but weren't clicked on. The idea is
>> to use the user behavior data set to make recommendations for ads.
>> As I've understood from Mahout in Action, there isn't a way to introduce
>> user behavior as a feature set. One can only use user ID, product/ad ID,
>> and preferences.
>>
>> Is my understanding correct?
>> Any suggestions would be most welcome!
>>
>> Thanks,
>> Vinutha
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/recommend-ads-using-mahout-tp3883496p3883496.html
>> Sent from the Mahout User List mailing list archive at Nabble.com.
