Unfortunately, we do not have time to implement a spam filter or
ranking algorithm.

Besides, I think this issue should be resolved on Twitter's side.

Some people are sending tweets in reply to *all* Twitter users.
I think the spammers' Twitter accounts and their tweets should be analyzed.

The behaviour I see:

Open a new Twitter account.
No need to follow anyone.
Then tweet spam messages as replies to other people, as many as
hundreds of them.
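
The pattern above could be checked with something like the sketch below.
This is only an illustration: the AccountStats fields and all of the
thresholds (account age, reply count, reply ratio) are assumptions of
mine, not values measured on our data or taken from the Twitter API.

```python
from dataclasses import dataclass


@dataclass
class AccountStats:
    """Hypothetical per-account counters collected by the tracking system."""
    following: int           # accounts this user follows
    account_age_days: float  # days since the account was created
    tweets: int              # tweets seen from this account
    replies: int             # how many of those tweets were @replies


def looks_like_reply_spammer(a, max_age_days=7.0, min_replies=100,
                             min_reply_ratio=0.9):
    """True for a new account that follows nobody and mostly sends replies.

    All thresholds are illustrative guesses and would need tuning.
    """
    if a.tweets == 0:
        return False
    return (a.account_age_days <= max_age_days
            and a.following == 0
            and a.replies >= min_replies
            and a.replies / a.tweets >= min_reply_ratio)
```

A check like this would still miss spammers who follow a few accounts
first, so it is a screen, not a complete filter.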

As I said earlier, the tweets have the word "lol" in common.

Example accounts:

https://twitter.com/madiav_isBOMB
https://twitter.com/ddubplneandonly

More spam tweets caught by our system (sent as replies to Turkish
Twitter users):
http://twitturk.com/tweet/search?q=lol



On Sun, Nov 28, 2010 at 12:10 AM, Adam Green <140...@gmail.com> wrote:

> My final suggestion is to rank users by something (age of account,
> number of mentions/mentioners/followers/following) and cut out the
> bottom N%.
>
> On Sat, Nov 27, 2010 at 4:18 PM, Furkan Kuru <furkank...@gmail.com> wrote:
> >
> > Another host would be a problem to maintain.
> > I have looked at a few more short URLs. They redirect to a very wide
> > range of sites, not just Amazon.
> >
> > I think Twitter may change the priority level of "Report for spam" for
> > newly opened accounts, and the number of tweets per hour they are
> > allowed.
> >
> > Here again is the link that shows the tweets written as replies to
> > Turkish people; the word "lol" is common to them:
> > http://twitturk.com/tweet/search?q=lol
> >
> > And an example account:
> > http://twitter.com/Bomuchellxee
> > All of its tweets are spam and "lol" is common to them.
> > It also has 0 following and 3 followers (real accounts, I guess).
> > Unbelievable!
> >
> >
> >
> > On Sat, Nov 27, 2010 at 4:29 PM, Adam Green <140...@gmail.com> wrote:
> >>
> >> Now you know that it does resolve differently in different countries.
> >> You could set up an account with a webhost in the US, and have a
> >> script there that you can call with URLs in tweets from new users. If
> >> the URL resolves to a blank page, blacklist that user. There are
> >> plenty of good hosts that only charge $7 a month. Sounds extreme, but
> >> these are very clever spammers.
> >>
> >> Or you could just resolve URLs from new users, and blacklist them if
> >> the URL points to Amazon. That will work as long as they still point
> >> to Amazon.
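
A minimal sketch of the two URL checks suggested above (empty page, or a
redirect that ends at Amazon). The function names and the blocked_hosts
list are assumptions for illustration. check_url uses Python's standard
urllib, which follows redirects, so it only sees what the server decides
to send to the IP it runs on.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def classify_response(status, body, final_url, blocked_hosts=("amazon.com",)):
    """Return a reason string if the response looks like spam, else None."""
    # An HTTP 200 with no content at all is the "blank page" case.
    if status == 200 and not body.strip():
        return "empty page"
    host = (urlsplit(final_url).hostname or "").lower()
    for blocked in blocked_hosts:
        # Match the host itself or any subdomain (www.amazon.com).
        if host == blocked or host.endswith("." + blocked):
            return "redirects to " + blocked
    return None


def check_url(url, timeout=10):
    """Fetch url (following redirects) and classify what came back."""
    with urlopen(url, timeout=timeout) as resp:
        return classify_response(resp.status, resp.read(), resp.geturl())
```

As noted in the thread, the Amazon check only works as long as the links
still point to Amazon; the short URLs already redirect to other sites too.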
> >>
> >> On Sat, Nov 27, 2010 at 9:12 AM, Furkan Kuru <furkank...@gmail.com> wrote:
> >> > It returns a redirect to an amazon.com product page.
> >> >
> >> > Example:
> >> >
> >> > http://www.amazon.com/gp/product/B0041E16RC?ie=UTF8&tag=iphone403d-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=B0041E16RC
> >> >
> >> >
> >> > On Sat, Nov 27, 2010 at 4:04 PM, Adam Green <140...@gmail.com> wrote:
> >> >>
> >> >> The URLs again return a code of 200 and nothing in the content. What
> >> >> happens when you try getting one of the URLs with cURL? I'm curious
> >> >> if it behaves differently for an IP in Turkey.
> >> >>
> >> >> On Sat, Nov 27, 2010 at 8:56 AM, Furkan Kuru <furkank...@gmail.com> wrote:
> >> >> > Most of the tweets here are spam:
> >> >> >
> >> >> > http://twitturk.com/tweet/search?q=lol
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Sat, Nov 27, 2010 at 3:33 PM, Adam Green <140...@gmail.com> wrote:
> >> >> >>
> >> >> >> All of your sample spam tweets are from suspended accounts, yet
> >> >> >> the tweets were only sent yesterday. That means that the spammers'
> >> >> >> behavior was so aggressive that they were suspended quickly by a
> >> >> >> Twitter algorithm. I doubt that a human at Twitter read your email
> >> >> >> and went through each tweet suspending the accounts. Have you
> >> >> >> checked to see how quickly these spam accounts get canceled for
> >> >> >> other spam tweets? You could hold back tweets from unknown users
> >> >> >> for 24 hours, and then check all new users through the API to see
> >> >> >> if they are suspended. If they aren't suspended, you can whitelist
> >> >> >> them in your system.
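
The 24-hour hold described above could be sketched roughly as follows.
HoldQueue and the is_suspended callback are hypothetical names; the
callback stands in for whatever API lookup reports that an account is
suspended, which is not shown here.

```python
import time
from collections import deque


class HoldQueue:
    """Hold tweets from unknown users, then release or drop them."""

    def __init__(self, is_suspended, hold_seconds=24 * 3600):
        self.is_suspended = is_suspended  # callable: user -> bool (assumed)
        self.hold_seconds = hold_seconds
        self.pending = deque()            # (arrival_time, user, tweet)
        self.whitelist = set()

    def add(self, user, tweet, now=None):
        """Return the tweet if the user is whitelisted, else hold it."""
        now = time.time() if now is None else now
        if user in self.whitelist:
            return tweet
        self.pending.append((now, user, tweet))
        return None

    def release(self, now=None):
        """Return held tweets whose hold expired and whose author survived."""
        now = time.time() if now is None else now
        released = []
        while self.pending and now - self.pending[0][0] >= self.hold_seconds:
            _, user, tweet = self.pending.popleft()
            if user in self.whitelist or not self.is_suspended(user):
                self.whitelist.add(user)
                released.append(tweet)
            # Tweets from suspended users are silently dropped.
        return released
```

The cost of this design is a one-day delay for every new user's first
tweets, which may matter if those tweets feed trending data.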
> >> >> >>
> >> >> >> What is really weird is that I also checked the URLs in these
> >> >> >> tweets and they resolve to an empty page. They return a header
> >> >> >> with an HTTP code of 200, and no content at all. That can't be an
> >> >> >> accident. Either they are sending empty responses to everyone, or
> >> >> >> they could tell from my IP that they didn't want to send anything
> >> >> >> to me. Why would a spammer do that? They only benefit if someone
> >> >> >> clicks on their links and buys something, or gets infected
> >> >> >> somehow. Could you be the subject of some kind of attack? You use
> >> >> >> the word "community." Would anyone want to disrupt your community?
> >> >> >> Is this a community that is in one geographic area that can be
> >> >> >> detected by IP? Very interesting...
> >> >> >>
> >> >> >> Anyway, you can use URL resolution to test new users. When you get
> >> >> >> a tweet from a new user with a URL, check the URL, and blacklist
> >> >> >> them if it resolves to an empty page. If you only have to do this
> >> >> >> for new users, it won't be too processor intensive.
> >> >> >>
> >> >> >>
> >> >> >> On Sat, Nov 27, 2010 at 5:20 AM, Furkan Kuru <furkank...@gmail.com> wrote:
> >> >> >> > The text in these spam tweets is not easy to recognize.
> >> >> >> > It does not repeat. It is a mix of different words and contains
> >> >> >> > a link.
> >> >> >> > The tweets seem to be sent via the web.
> >> >> >> >
> >> >> >> > Ranking and discarding some mentions will not completely resolve
> >> >> >> > the problem, because both our mention data and our trending
> >> >> >> > words data were affected. We do not want to eliminate tweets
> >> >> >> > from innocent people who have few followers.
> >> >> >> >
> >> >> >> > The simplest way seems to be just ignoring the tweets coming
> >> >> >> > from outside of the community.
> >> >> >> > But those tweets were helping us to extend our network.
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > On Fri, Nov 26, 2010 at 6:42 PM, Adam Green <140...@gmail.com> wrote:
> >> >> >> >>
> >> >> >> >> As long as you aren't trying to capture and deliver *all*
> >> >> >> >> tweets, there are a couple of good ways to cut out spammers.
> >> >> >> >> One thing I do is save all mentions for all users in a database
> >> >> >> >> of tweets. When a tweet comes in from the streaming API, I
> >> >> >> >> collect @mentions, and store them with the screen name of the
> >> >> >> >> tweet's author and the screen name mentioned. Then I can rank
> >> >> >> >> users based on the number of different accounts that mention
> >> >> >> >> them. If you only use the tweets from the top N% of users, the
> >> >> >> >> quality improves a lot. I find that the top 80% is usually
> >> >> >> >> enough of a screen to get good quality.
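
The mention ranking described above could look something like this in
outline. It is a sketch with in-memory data instead of a database; the
function names are made up, and ignoring self-mentions is my own
assumption rather than part of the description.

```python
from collections import defaultdict


def rank_by_distinct_mentioners(mentions):
    """Rank mentioned users by how many different accounts mention them.

    mentions: iterable of (author, mentioned) screen-name pairs.
    Returns screen names, most-mentioned first (ties broken by name).
    """
    mentioners = defaultdict(set)
    for author, mentioned in mentions:
        if author != mentioned:  # assumption: self-mentions do not count
            mentioners[mentioned].add(author)
    return sorted(mentioners, key=lambda u: (-len(mentioners[u]), u))


def top_fraction(ranked, fraction=0.8):
    """Keep the top `fraction` of a ranked list as a set of screen names."""
    keep = int(len(ranked) * fraction)
    return set(ranked[:keep])
```

Using sets means that one account mentioning the same user many times
still counts once, which is what makes the ranking hard to inflate from
a single spam account.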
> >> >> >> >>
> >> >> >> >> Another trick is blocking duplicates from each user. The API
> >> >> >> >> only blocks duplicates that repeat immediately, but if a
> >> >> >> >> spammer has a list of tweets, and cycles through them, all the
> >> >> >> >> tweets get through. I compare all new tweets with the other
> >> >> >> >> tweets from that user. This is very expensive if you have a big
> >> >> >> >> database. This can be made less intensive by limiting the
> >> >> >> >> comparison to just the tweets from that user in the last few
> >> >> >> >> days. You can also run this with a separate process that
> >> >> >> >> doesn't slow down your main tweet parsing loop. Most spammers
> >> >> >> >> are so simplistic that they just repeat the same tweet over and
> >> >> >> >> over. In a real spammy set of keywords, if I find more than a
> >> >> >> >> few duplicates from a user, I just stop saving their tweets.
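
A rough sketch of the per-user duplicate check described above, limited
to a window of recent tweets. The class name, the three-day window, the
duplicate limit and the whitespace/case normalization are all
assumptions for illustration; it keeps state in memory rather than in a
database.

```python
import time
from collections import defaultdict


class DuplicateFilter:
    """Reject repeated tweets per user and block persistent repeaters."""

    def __init__(self, window_seconds=3 * 24 * 3600, max_duplicates=3):
        self.window_seconds = window_seconds
        self.max_duplicates = max_duplicates
        self.recent = defaultdict(dict)     # user -> {text: last seen time}
        self.dup_counts = defaultdict(int)  # user -> duplicates seen so far
        self.blocked = set()

    @staticmethod
    def normalize(text):
        # Lowercase and collapse whitespace so trivial variations match.
        return " ".join(text.lower().split())

    def accept(self, user, text, now=None):
        """Return True if the tweet should be saved."""
        now = time.time() if now is None else now
        if user in self.blocked:
            return False
        seen = self.recent[user]
        # Forget this user's tweets that fell out of the comparison window.
        for key in [k for k, t in seen.items() if now - t > self.window_seconds]:
            del seen[key]
        key = self.normalize(text)
        is_duplicate = key in seen
        seen[key] = now
        if is_duplicate:
            self.dup_counts[user] += 1
            if self.dup_counts[user] > self.max_duplicates:
                self.blocked.add(user)  # stop saving this user's tweets
            return False
        return True
```

This would not catch the spam in this thread, since those tweets mix
different words each time, but it covers the simpler repeat-cycling case.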
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On Fri, Nov 26, 2010 at 11:26 AM, Furkan Kuru <furkank...@gmail.com> wrote:
> >> >> >> >> >
> >> >> >> >> > The word "lol" is the most common one in these spam tweets.
> >> >> >> >> > We now receive 400 spam tweets per hour while tracking 100K
> >> >> >> >> > people.
> >> >> >> >> >
> >> >> >> >> > We plan to delete all of the tweets containing the word
> >> >> >> >> > "lol". It is also used by our users (Turkish people) writing
> >> >> >> >> > in English, though.
> >> >> >> >> >
> >> >> >> >> > Any better suggestions?
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >> --
> >> >> >> >> Adam Green
> >> >> >> >> Twitter API Consultant and Trainer
> >> >> >> >> http://140dev.com
> >> >> >> >> @140dev
> >> >> >> >>
> >> >> >> >> --
> >> >> >> >> Twitter developer documentation and resources:
> >> >> >> >> http://dev.twitter.com/doc
> >> >> >> >> API updates via Twitter: http://twitter.com/twitterapi
> >> >> >> >> Issues/Enhancements Tracker:
> >> >> >> >> http://code.google.com/p/twitter-api/issues/list
> >> >> >> >> Change your membership to this group:
> >> >> >> >> http://groups.google.com/group/twitter-development-talk
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > --
> >> >> >> > Furkan Kuru
> >> >> >> >
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Adam Green
> >> >> >> Twitter API Consultant and Trainer
> >> >> >> http://140dev.com
> >> >> >> @140dev
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Furkan Kuru
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Adam Green
> >> >> Twitter API Consultant and Trainer
> >> >> http://140dev.com
> >> >> @140dev
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Furkan Kuru
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Adam Green
> >> Twitter API Consultant and Trainer
> >> http://140dev.com
> >> @140dev
> >>
> >
> >
> >
> > --
> > Furkan Kuru
> >
> >
>
>
>
> --
> Adam Green
> Twitter API Consultant and Trainer
> http://140dev.com
> @140dev
>
>



-- 
Furkan Kuru

