@Steven Bourke I am using collaborative filtering and am thinking of using the item-based approach.
On Wed, Aug 29, 2012 at 10:54 AM, Steven Bourke <[email protected]> wrote:
> Can you tell us which type of algorithm you are using? Depending on what
> you are using that will affect the answer.
>
> On Wed, Aug 29, 2012 at 7:16 AM, Ted Dunning <[email protected]> wrote:
>
>> First off, it looks like Amazon is not filtering for engagement here.
>>
>> Second, you have to have Amazon's prominence before attacks by large
>> groups of people are worth it.
>>
>> Third, to quote Amazon "these happen once in a blue moon". That means
>> you can correct for them manually.
>>
>> So pragmatically speaking, this isn't a big deal if you do the basics
>> right.
>>
>> On Tue, Aug 28, 2012 at 11:23 PM, Zia mel <[email protected]> wrote:
>>
>> > Thanks Ted. If you can please elaborate on this , Let's say for
>> > example I am recommending online books and 1000 users joined and added
>> > most of the popular books to their list and rate them high to be
>> > similar to other users , then they start adding books they want to
>> > advertise , how can I detect this attitude ? and how can I know if
>> > these are malicious users or true users that just have common
>> > interests ? Is there a way that I can solve this case that happened to
>> > Amazon
>> > http://news.cnet.com/2100-1023-976435.html
>> >
>> > Thanks
>> >
>> > On Tue, Aug 28, 2012 at 8:23 PM, Ted Dunning <[email protected]>
>> > wrote:
>> > > The single most effective thing you can do with malicious users
>> > > like this is to let them think that they have won. In the ideal
>> > > case, you can detect simple click frauds and maintain a per user
>> > > play adjustment so that they see the fraudulent stats and everybody
>> > > else sees the corrected stats. If you can, this should even extend
>> > > to your leader board pages. Once you have this, the fraudsters will
>> > > generally not increase the sophistication of their attacks and you
>> > > have a fairly simple situation.
>> > >
>> > > You also will have a bit of an advantage if you pick a metric that
>> > > indicates fairly serious engagement. With videos, for instance, I
>> > > have used plays > 30 seconds as the metric and this was handled by
>> > > a beacon on the page while the 30 second delay measurement was on
>> > > the server side. This requires a browser to be live and in focus
>> > > for 30 seconds in order to get a play event which substantially
>> > > increases the cost of committing the click fraud on the fraudsters
>> > > side.
>> > >
>> > > With the recommendation analysis itself, the key is to flatten all
>> > > frequency metrics per user. With unsophisticated click fraud, the
>> > > abuse will center on creating high play frequencies for a few users
>> > > which will then be counted as a very small input signal since so
>> > > few users are doing it and their high play rates won't matter.
>> > > Also, the major effect if any will be to simply give the fraudsters
>> > > recommendations for their own items which will make them happy and
>> > > won't matter to anyone else.
>> > >
>> > > On Tue, Aug 28, 2012 at 6:29 PM, Zia mel <[email protected]>
>> > > wrote:
>> > >
>> > >> Hi ,
>> > >>
>> > >> Is there any way to check for malicious users in mahout so I can
>> > >> remove them from the recommendations or reduce their effect ?
>> > >> Malicious users are the ones that want to play with the ratings and
>> > >> increase or downgrade it.
>> > >>
>> > >> Thanks,
>> > >>
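To make the "flatten all frequency metrics per user" suggestion from the quoted thread concrete, here is a minimal sketch in plain Python. It is not Mahout code and not necessarily what Ted has in mind: the function names, the user IDs, and the book IDs are all made up for illustration. The assumption is that flattening means each user counts at most once per item before item-item co-occurrence is computed.

```python
from collections import defaultdict
from itertools import combinations


def flatten_per_user(events):
    """Collapse raw (user, item) events so each user counts at most once per item."""
    items_by_user = defaultdict(set)
    for user, item in events:
        items_by_user[user].add(item)
    return items_by_user


def cooccurrence_counts(items_by_user):
    """Count, for each unordered item pair, how many distinct users touched both."""
    counts = defaultdict(int)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical data: three ordinary users, plus one account that replays
# the same two items 10,000 times to push "spam_book".
events = [
    ("u1", "book_a"), ("u1", "book_b"),
    ("u2", "book_a"), ("u2", "book_b"),
    ("u3", "book_a"), ("u3", "book_c"),
]
events += [("fraud", "book_a"), ("fraud", "spam_book")] * 10_000

counts = cooccurrence_counts(flatten_per_user(events))
for pair, n in sorted(counts.items()):
    print(pair, n)
```

After flattening, the 10,000 repeated events from the single noisy account add only one vote to the (book_a, spam_book) pair, so two ordinary users agreeing on (book_a, book_b) outweigh it. This only addresses high-frequency abuse from a few accounts; it does nothing by itself against the 1000-coordinated-users case I asked about.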
