Hi,

I got the demo up and running and am now trying to work out how to run a few 
experiments of my own to determine whether we can actually use Mahout. We 
collect a lot of data on many users and would like to use it to serve more 
relevant ads: relevant not only to the content of the site, but also to the 
interests of the user and what he liked in the past. For example, I know the 
type of site he is on (twenty categories), the types of sites he has visited in 
the past, the ads he saw, and the ads he clicked, including the category to 
which each ad belongs.
Furthermore, I'd like to build a profile of interests and, if I can, gather 
some demographic data for a number of sites. This should enable me to use 
naïve Bayes to deduce gender and age with some probability, based on the 
recorded history of sites someone visited within the ad network.
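
To make the naïve Bayes idea concrete, here is a minimal sketch in plain Python (not Mahout code). The labels, the site categories and the tiny training set are invented for illustration, and it uses simple add-one smoothing; a real setup would train on the demographic data gathered from the sites.

```python
import math
from collections import Counter, defaultdict


def train(histories):
    """histories: list of (label, [site categories visited]) pairs."""
    label_counts = Counter()
    cat_counts = defaultdict(Counter)
    vocab = set()
    for label, cats in histories:
        label_counts[label] += 1
        for cat in cats:
            cat_counts[label][cat] += 1
            vocab.add(cat)
    return label_counts, cat_counts, vocab


def predict_proba(model, cats):
    """Return P(label | visited categories) under the naive Bayes assumption."""
    label_counts, cat_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        # Add-one (Laplace) smoothing so unseen pairs never get probability 0.
        denom = sum(cat_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for cat in cats:
            if cat in vocab:  # ignore categories never seen in training
                score += math.log((cat_counts[label][cat] + 1) / denom)
        log_scores[label] = score
    # Normalise the log scores into probabilities.
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(exp.values())
    return {label: v / z for label, v in exp.items()}


# Hypothetical labelled histories (made-up data).
labelled = [
    ("female", ["fashion", "news", "fashion"]),
    ("female", ["fashion", "travel"]),
    ("male", ["sports", "news", "cars"]),
    ("male", ["sports", "cars"]),
]
model = train(labelled)
probs = predict_proba(model, ["sports", "cars"])
print(probs)
```

The result is a probability per label rather than a hard decision, which matches the "with some probability" requirement above.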

I'd like to use all this information to recommend which ad to deliver, 
either because similar users clicked it (user-based) or because the user 
clicked on an ad that has often been clicked together with another ad 
(item-based), depending on which approach gives better results. Other 
interesting data points would be time (are there specific times at which ads 
perform well or badly?), location, and the actual combinations of site and ad.
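
The item-based variant ("clicked together with another ad") can be sketched as a simple co-occurrence count. This is plain Python with invented user and ad IDs, not Mahout's recommender API, and it ignores time, location and site; it only illustrates the kind of computation meant here.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(clicks_by_user):
    """Count how often each pair of ads was clicked by the same user."""
    counts = defaultdict(lambda: defaultdict(int))
    for ads in clicks_by_user.values():
        for a, b in combinations(sorted(set(ads)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, clicked, top_n=3):
    """Rank ads the user has not clicked by co-occurrence with clicked ads."""
    scores = defaultdict(int)
    for ad in clicked:
        for other, n in counts.get(ad, {}).items():
            if other not in clicked:
                scores[other] += n
    # Highest score first; ties broken by ad ID for a stable order.
    return sorted(scores, key=lambda ad: (-scores[ad], ad))[:top_n]


# Hypothetical click history (made-up IDs).
clicks = {
    "u1": ["ad_a", "ad_b", "ad_c"],
    "u2": ["ad_a", "ad_b"],
    "u3": ["ad_b", "ad_d"],
    "u4": ["ad_a", "ad_c"],
}
counts = cooccurrence(clicks)
suggestions = recommend(counts, {"ad_a"})
print(suggestions)
```

A user-based version would instead compare users by their click sets and recommend what the most similar users clicked; which of the two works better would have to be measured on real data.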

I am not a very good programmer; I work more on the conceptual side and 
look for technologies we could use. So I am not sure how to store the data I 
collect (I have created a database schema) so that it is available to Mahout, as 
it seems to run on Hadoop and not with a normal database. I am still looking 
for more documentation, so if you could point me to something or have some idea 
how to proceed, I'd appreciate it.
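
One possible route, sketched below, is to export the click data from the database into a flat text file with one `userID,itemID,preference` line per user/ad pair. That line format is an assumption about what a file-based recommender input could look like and should be checked against the Mahout documentation. The table `ad_clicks` and its columns are hypothetical, and SQLite stands in for the real database.

```python
import csv
import io
import sqlite3

# Hypothetical schema and sample rows; the real database will differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ad_clicks (user_id INTEGER, ad_id INTEGER, clicked_at TEXT)"
)
conn.executemany(
    "INSERT INTO ad_clicks VALUES (?, ?, ?)",
    [
        (1, 10, "2009-05-01T10:00:00"),
        (1, 10, "2009-05-02T11:30:00"),
        (1, 11, "2009-05-02T11:31:00"),
        (2, 10, "2009-05-03T09:15:00"),
    ],
)


def export_preferences(conn, out):
    """Write one 'user,ad,click_count' line per (user, ad) pair; return row count."""
    writer = csv.writer(out, lineterminator="\n")
    rows = conn.execute(
        "SELECT user_id, ad_id, COUNT(*) FROM ad_clicks "
        "GROUP BY user_id, ad_id ORDER BY user_id, ad_id"
    )
    written = 0
    for user_id, ad_id, clicks in rows:
        # The click count serves as a crude preference value.
        writer.writerow([user_id, ad_id, float(clicks)])
        written += 1
    return written


buffer = io.StringIO()
row_count = export_preferences(conn, buffer)
print(buffer.getvalue())
```

In practice `out` would be a file that is then copied to wherever the Hadoop jobs read their input; using the click count as the preference value is only one of several choices.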

We definitely need something that scales, as the ad network generates billions 
of ad impressions per month with millions of users. Mahout seems to be the 
only suitable option, although it is still pretty early in its development.


Thanks for any pointers and opinions,

Benjamin


_______________________________________
Benjamin Dageroth, Key Account Manager / Softwareentwickler
Webtrekk GmbH
Boxhagener Str. 76-78, 10245 Berlin
fon 030 - 755 415 - 360
fax 030 - 755 415 - 100
[email protected]
http://www.webtrekk.com
Amtsgericht Berlin, HRB 93435 B
Geschäftsführer Christian Sauer


_______________________________________
