Re: [twitter-dev] TwitteRBL - Filtering SPAMs from Twitter

2010-02-28 Thread Fabien Penso
On Sun, Feb 28, 2010 at 6:18 AM, Atul Kulkarni  wrote:

> This sounds like a collaborative filtering problem. But a rule-based system
> alone might not be your best choice for a dynamic environment like
> Twitter.

Probably, but there could be different RBLs for different kinds of filtering.

Otherwise, Bayesian filtering would probably be better, but I have found
nothing available yet.
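
To make the "different RBLs for different kinds of filtering" idea a bit
more concrete, here is a rough sketch. It assumes each list is just an
in-memory set (a real RBL would more likely be queried over DNS or HTTP),
and the category names, the example entries, and the tweet dict layout
("user", "urls") are all made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical blocklists, one per filtering category (entries are invented).
BLOCKLISTS = {
    "spam_domains": {"cheap-pills.example", "win-prizes.example"},
    "spam_accounts": {"freefollowers4u"},
}


def listed_in(tweet, categories):
    """Return the requested categories whose blocklist matches the tweet.

    `tweet` is assumed to be a dict with a "user" screen name and an
    optional "urls" list; this is not the Streaming API's actual schema.
    """
    hits = []
    # Collect the host part of every link in the tweet.
    domains = {urlparse(url).hostname for url in tweet.get("urls", [])}
    for category in categories:
        entries = BLOCKLISTS.get(category, set())
        if category == "spam_domains" and domains & entries:
            hits.append(category)
        elif category == "spam_accounts" and tweet["user"].lower() in entries:
            hits.append(category)
    return hits
```

A consumer would then pick only the lists it cares about, for example
listed_in(tweet, ["spam_domains"]) to filter on link targets alone.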


Re: [twitter-dev] TwitteRBL - Filtering SPAMs from Twitter

2010-02-27 Thread Atul Kulkarni
Hey,

This sounds like a collaborative filtering problem. But a rule-based system
alone might not be your best choice for a dynamic environment like
Twitter. I would say that if you can write a bag-of-words classifier and
add it to your rule-based system, then you stand a good chance. I would
assume that a few hours' worth of tweets from the Stream, labeled as spam
or not spam, would serve as a good training set for the algorithm. I do not
have any empirical evidence as of now, but that is my hunch about this.
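
A minimal sketch of what I mean, assuming a naive Bayes model over word
counts with add-one smoothing, plus one hand-written rule. The class and
function names, the "spam"/"ham" labels, and the link-count threshold are
all placeholders I picked for illustration, not anything tested on real
tweets.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"[a-z0-9#@']+")


def tokenize(text):
    """Lowercase the text and split it into a bag of words."""
    return TOKEN.findall(text.lower())


class NaiveBayes:
    """Two-class naive Bayes over word counts (labels: "spam", "ham")."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_score(self, text):
        """Log-odds of spam versus ham; positive means spam is more likely."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Smoothed class prior, then smoothed per-word likelihoods.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocab) + 1)
                )
            scores[label] = score
        return scores["spam"] - scores["ham"]


def too_many_links(text, limit=2):
    """Example rule (hypothetical threshold): flag tweets stuffed with links."""
    return text.count("http://") + text.count("https://") > limit


def is_spam(classifier, text):
    """Combine the rule with the learned score; either one can flag a tweet."""
    return too_many_links(text) or classifier.spam_score(text) > 0
```

Training would just be a loop calling train(text, label) over the labeled
tweets; is_spam() then applies the rule first and falls back to the
classifier.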

Regards,
Atul.

On Sat, Feb 27, 2010 at 6:29 PM, Fabien Penso  wrote:

> Hi,
>
> I'm currently using the streaming API for a new service I work on, but
> I see lots of tweets I would consider as SPAM and I'd like to find a
> way to prevent it.
>
> I have not found anything to filter them, therefore I wrote a little
> blog post about how it could be done. Something to combine RBL and
> Tweets, but I wonder if that makes sense.
>
> Any feedback welcome.
>
> http://blog.penso.info/2010/02/28/filtering-spams-on-twitter-twitterbl/
>
> --
> Fabien Penso
> @fabienpenso
>



-- 
Regards,
Atul Kulkarni