Re: [twitter-dev] Trying to get rid of twitter spammers
The text in these spam tweets is not easy to recognize. The tweets do not repeat. They are a mix of different words and they contain a link. They seem to be sent via the web.

Ranking users and discarding some mentions will not completely resolve the problem, because both our mention data and our trending-words data were affected. We do not want to eliminate tweets from innocent people who have few followers.

The simplest way seems to be to ignore the tweets coming from outside the community. But those tweets were helping us extend our network.

On Fri, Nov 26, 2010 at 6:42 PM, Adam Green 140...@gmail.com wrote:

As long as you aren't trying to capture and deliver *all* tweets, there are a couple of good ways to cut out spammers.

One thing I do is save all mentions for all users in a database of tweets. When a tweet comes in from the streaming API, I collect @mentions, and store them with the screen name of the tweet's author and the screen name mentioned. Then I can rank users based on the number of different accounts that mention them. If you only use the tweets from the top N% of users, the quality improves a lot. I find that the top 80% is usually enough of a screen to get good quality.

Another trick is blocking duplicates from each user. The API only blocks duplicates that repeat immediately, but if a spammer has a list of tweets and cycles through them, all the tweets get through. I compare all new tweets with the other tweets from that user. This is very expensive if you have a big database. It can be made less intensive by limiting the comparison to just the tweets from that user in the last few days. You can also run this in a separate process that doesn't slow down your main tweet parsing loop. Most spammers are so simplistic that they just repeat the same tweet over and over. In a really spammy set of keywords, if I find more than a few duplicates from a user, I just stop saving their tweets.
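The per-user duplicate check Adam Green describes could be sketched roughly as follows. This is only an illustration, not his implementation: the class name `DuplicateFilter`, the three-day window, the `max_duplicates` threshold, and the whitespace/case normalization are all assumptions, and a real system would keep this state in the tweet database rather than in memory.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class DuplicateFilter:
    """Flags tweets a user already posted within a recent time window."""

    def __init__(self, window=timedelta(days=3), max_duplicates=3):
        self.window = window                  # assumed "last few days"
        self.max_duplicates = max_duplicates  # assumed "more than a few"
        self._recent = defaultdict(deque)     # user -> deque of (time, text)
        self._duplicates = defaultdict(int)   # user -> duplicates seen so far
        self.blocked = set()                  # users we stopped saving

    @staticmethod
    def _normalize(text):
        # Collapse case and whitespace so trivial variations still match.
        return " ".join(text.lower().split())

    def accept(self, user, text, when):
        """Return True if the tweet should be saved, False if it is dropped."""
        if user in self.blocked:
            return False
        recent = self._recent[user]
        # Forget tweets older than the comparison window.
        while recent and when - recent[0][0] > self.window:
            recent.popleft()
        key = self._normalize(text)
        if any(old == key for _, old in recent):
            self._duplicates[user] += 1
            if self._duplicates[user] > self.max_duplicates:
                self.blocked.add(user)
            return False
        recent.append((when, key))
        return True
```

Limiting the comparison to one user's recent tweets keeps each check small, which is the cost reduction the message suggests. Running it in a separate process would only require feeding `accept` from a queue instead of the main parsing loop.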
On Fri, Nov 26, 2010 at 11:26 AM, Furkan Kuru furkank...@gmail.com wrote:

The word "lol" is the most common one in these spam tweets. We now receive 400 spam tweets per hour while tracking 100K people. We plan to delete all of the tweets containing the word "lol". It is also used by our users (Turkish people) writing in English, though. Any better suggestions?

--
Adam Green
Twitter API Consultant and Trainer
http://140dev.com
@140dev

--
Furkan Kuru

--
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: http://groups.google.com/group/twitter-development-talk
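As an alternative to deleting every tweet that contains "lol", the mention-ranking idea from Adam Green's reply could look something like the sketch below. It is an in-memory illustration under assumptions: the class `MentionRanker`, its method names, and the mention regex are hypothetical, and the message itself stores mentions in a database.

```python
import re
from collections import defaultdict

# Assumed pattern: "@" not preceded by a word character (so e-mail
# addresses are skipped), followed by up to 15 name characters.
MENTION = re.compile(r"(?<!\w)@(\w{1,15})", re.ASCII)


class MentionRanker:
    """Ranks users by how many different accounts mention them."""

    def __init__(self):
        self._mentioners = defaultdict(set)  # mentioned user -> set of authors

    def record_tweet(self, author, text):
        # Store each (author, mentioned) pair once; self-mentions do not count.
        author = author.lower()
        for name in MENTION.findall(text):
            mentioned = name.lower()
            if mentioned != author:
                self._mentioners[mentioned].add(author)

    def score(self, user):
        return len(self._mentioners.get(user.lower(), ()))

    def top_users(self, fraction):
        """Return the top `fraction` (0.0 to 1.0) of mentioned users."""
        ranked = sorted(self._mentioners,
                        key=lambda u: (-len(self._mentioners[u]), u))
        return set(ranked[:int(len(ranked) * fraction)])
```

With this shape, `top_users(0.8)` corresponds to the "top 80%" screen mentioned in the reply. Counting distinct authors rather than raw mentions means one account repeating a mention many times does not raise the target's rank.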
[twitter-dev] Re: Geocoded searches broken
I can also confirm the same problem across multiple Twitter clients. It also looks like any searches including filter:links return zero results.

On Nov 27, 4:28 am, Hrishikesh Bakshi bakshi.hrishik...@gmail.com wrote:

I am facing the same issue. I tried from different IP addresses just to make sure. I met more people on IRC with the same problem. I get the exact same warning each time:

warning: adjusted since_id to 8376324073257984 due to temporary error

On Fri, Nov 26, 2010 at 1:53 PM, Mack D. Male master...@gmail.com wrote:

Since yesterday, geocoded searches have been broken intermittently. Sometimes results are returned normally, then for stretches of time (30 minutes or more) no results are returned. During that time, there's a warning like the following:

adjusted since_id to 8230615843933184 due to temporary error

Here's a query that at this very moment (~11:55 AM) returns zero results:

http://search.twitter.com/search?q=near:%22Edmonton,Alberta%22

It stopped working about an hour ago (~10:55 AM MST). Any information on this?
Re: [twitter-dev] Error creating tweets by API
I found the cause. When the tweet contained the words "And you?" (in their Portuguese equivalent), the API returned error 400. Now that I have removed them, it is working normally. Strange, isn't it?

Regards,
Luís Victor Quintas

2010/11/27 Andy Matsubara andymatsub...@gmail.com

Twitter returns an error when you submit duplicate tweets. I guess that is your case.

Andy Matsubara

On Sat, Nov 27, 2010 at 6:11 AM, Luis Victor Quintas luisvictorquin...@gmail.com wrote:

The API is not timed with, and still returns error 400! 404 returns in a few moments ... If I try to publish a tweet with other text, it works perfectly. Could it have been blocked?

Regards,
Luís Victor Quintas

2010/11/26 Igor Kharin igorkha...@gmail.com

http://apiwiki.twitter.com/w/page/22554652/HTTP-Response-Codes-and-Errors

400 Bad Request: The request was invalid. An accompanying error message will explain why. This is the status code that will be returned during rate limiting.

401 Unauthorized: Authentication credentials were missing or incorrect.

The response body may be helpful as well.

On Fri, Nov 26, 2010 at 3:08 AM, Luis Victor Quintas luisvictorquin...@gmail.com wrote:

Hello everybody,

I created an application using the Twitter API, TwitVou.com. It is an application for creating invitations and seeing who will participate. Whenever a user creates an invitation or participates in one, the application publishes a tweet. For a few days now it has no longer been publishing the tweets, and the API returns error 400 or 401. When I try to publish other texts through the API, it works normally. Does anyone know what it might be?
Regards,
Luís Victor Quintas
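Two of the replies in this thread suggest simple client-side habits: read the status code against the documented meanings, and avoid resubmitting a tweet that was just posted. A minimal sketch of both follows. The names `explain_status` and `RecentStatusGuard` are hypothetical, the hint texts paraphrase the wiki excerpt quoted by Igor Kharin, and nothing here calls the Twitter API itself.

```python
from collections import deque

# Paraphrased from the HTTP response code descriptions quoted in the thread.
STATUS_HINTS = {
    400: "Bad Request: the request was invalid; also returned during rate limiting.",
    401: "Unauthorized: authentication credentials were missing or incorrect.",
}


def explain_status(code):
    """Return a short hint for a status code, to be logged with the response body."""
    return STATUS_HINTS.get(
        code, "Status %d: see the response body for details." % code)


class RecentStatusGuard:
    """Remembers the last few posted texts so duplicates can be skipped."""

    def __init__(self, remember=20):
        # Assumed size; old entries fall off automatically.
        self._recent = deque(maxlen=remember)

    def is_duplicate(self, text):
        return text.strip() in self._recent

    def mark_posted(self, text):
        self._recent.append(text.strip())
```

An application like the one described, which publishes a templated tweet for every invitation, could check `is_duplicate` before posting and log `explain_status(code)` together with the response body whenever a post fails.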
Re: [twitter-dev] Trying to get rid of twitter spammers
All of your sample spam tweets are from suspended accounts, yet the tweets were only sent yesterday. That means that the spammers' behavior was so aggressive that they were suspended quickly by a Twitter algorithm. I doubt that a human at Twitter read your email and went through each tweet suspending the accounts. Have you checked to see how quickly these spam accounts get canceled for other spam tweets? You could hold back tweets from unknown users for 24 hours, and then check all new users through the API to see if they are suspended. If they aren't suspended, you can whitelist them in your system.

What is really weird is that I also checked the URLs in these tweets and they resolve to an empty page. They return a header with an HTTP code of 200, and no content at all. That can't be an accident. Either they are sending empty responses to everyone, or they could tell from my IP that they didn't want to send anything to me. Why would a spammer do that? They only benefit if someone clicks on their links and buys something, or gets infected somehow.

Could you be the subject of some kind of attack? You use the word community. Would anyone want to disrupt your community? Is this a community in one geographic area that can be detected by IP? Very interesting...

Anyway, you can use URL resolution to test new users. When you get a tweet from a new user with a URL, check the URL, and blacklist the user if it resolves to an empty page. If you only have to do this for new users, it won't be too processor intensive.
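The hold-back idea in this message (keep tweets from unknown users for 24 hours, then check whether the account was suspended) might be sketched as below. The class `Quarantine` and the `is_suspended` callable are hypothetical; in practice that callable would wrap whatever API lookup you use, which is deliberately left out here.

```python
from datetime import datetime, timedelta


class Quarantine:
    """Holds tweets from unknown users until their account has been checked."""

    def __init__(self, is_suspended, hold=timedelta(hours=24)):
        self.is_suspended = is_suspended  # hypothetical callable: user -> bool
        self.hold = hold
        self.whitelist = set()
        self.blacklist = set()
        self._held = {}                   # user -> (first_seen, [tweets])

    def submit(self, user, tweet, now):
        """Return the tweets that may be released right away (possibly none)."""
        if user in self.blacklist:
            return []
        if user in self.whitelist:
            return [tweet]
        _first_seen, tweets = self._held.setdefault(user, (now, []))
        tweets.append(tweet)
        return []

    def review(self, now):
        """Check users held long enough; release or discard their tweets."""
        released = []
        for user in list(self._held):
            first_seen, tweets = self._held[user]
            if now - first_seen < self.hold:
                continue
            del self._held[user]
            if self.is_suspended(user):
                self.blacklist.add(user)   # held tweets are dropped
            else:
                self.whitelist.add(user)
                released.extend(tweets)
        return released
```

Calling `review` from a periodic job keeps the suspension lookups out of the main tweet-processing path, and each new user is checked only once.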
Re: [twitter-dev] Trying to get rid of twitter spammers
An empty URL? It probably resolves if the user clicks. I'm sure there is backend code running; that is the only purpose of even returning a 200.
Re: [twitter-dev] Trying to get rid of twitter spammers
Most of the tweets here are spam: http://twitturk.com/tweet/search?q=lol
Re: [twitter-dev] Trying to get rid of twitter spammers
The URLs again return a code of 200 and nothing in the content. What happens when you try getting one of the URLs with cURL? I'm curious whether it behaves differently for an IP in Turkey.
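For anyone who wants to run the same kind of check from a script rather than from cURL, a rough standard-library equivalent is sketched below. The function names `probe` and `describe_difference` and the User-Agent string are made up for illustration. `probe` makes one request and does not follow redirects, so a redirect shows up as a 3xx status with a `Location` header.

```python
from http.client import HTTPConnection, HTTPSConnection
from urllib.parse import urlsplit


def probe(url, timeout=10.0):
    """Fetch `url` once, without following redirects, and summarize the reply."""
    parts = urlsplit(url)
    conn_cls = HTTPSConnection if parts.scheme == "https" else HTTPConnection
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("GET", target,
                     headers={"User-Agent": "link-probe-sketch"})
        response = conn.getresponse()
        body = response.read()
        return {
            "status": response.status,
            "location": response.getheader("Location"),
            "body_length": len(body),
        }
    finally:
        conn.close()


def describe_difference(here, there):
    """List the fields on which two probe results disagree."""
    return [key for key in ("status", "location", "body_length")
            if here[key] != there[key]]
```

Running `probe` on the same link from two hosts in different countries and passing both results to `describe_difference` would show whether the server answers differently depending on where the request comes from.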
Re: [twitter-dev] Trying to get rid of twitter spammers
It returns a redirect to an amazon.com product page. Example:

http://www.amazon.com/gp/product/B0041E16RC?ie=UTF8tag=iphone403d-20linkCode=as2camp=1789creative=9325creativeASIN=B0041E16RC
Re: [twitter-dev] Trying to get rid of twitter spammers
Now you know that it does resolve differently in different countries. You could set up an account with a web host in the US, and have a script there that you can call with URLs in tweets from new users. If the URL resolves to a blank page, blacklist that user. There are plenty of good hosts that only charge $7 a month. Sounds extreme, but these are very clever spammers.

Or you could just resolve URLs from new users, and blacklist them if the URL points to Amazon. That will work as long as they still point to Amazon.

On Sat, Nov 27, 2010 at 9:12 AM, Furkan Kuru furkank...@gmail.com wrote:
It returns a redirection to an amazon.com product page. Example: http://www.amazon.com/gp/product/B0041E16RC?ie=UTF8&tag=iphone403d-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=B0041E16RC

On Sat, Nov 27, 2010 at 4:04 PM, Adam Green 140...@gmail.com wrote:
The URLs again return a code of 200 and nothing in the content. What happens when you try getting one of the URLs with cURL? I'm curious if it behaves differently for an IP in Turkey.

On Sat, Nov 27, 2010 at 8:56 AM, Furkan Kuru furkank...@gmail.com wrote:
Most of the tweets here are spam: http://twitturk.com/tweet/search?q=lol

On Sat, Nov 27, 2010 at 3:33 PM, Adam Green 140...@gmail.com wrote:
All of your sample spam tweets are from suspended accounts, yet the tweets were only sent yesterday. That means that the spammers' behavior was so aggressive that they were suspended quickly by a Twitter algorithm. I doubt that a human at Twitter read your email and went through each tweet suspending the accounts. Have you checked to see how quickly these spam accounts get canceled for other spam tweets? You could hold back tweets from unknown users for 24 hours, and then check all new users through the API to see if they are suspended. If they aren't suspended, you can whitelist them in your system.

What is really weird is that I also checked the URLs in these tweets and they resolve to an empty page. They return a header with an HTTP code of 200, and no content at all. That can't be an accident. Either they are sending empty responses to everyone, or they could tell from my IP that they didn't want to send anything to me. Why would a spammer do that? They only benefit if someone clicks on their links and buys something, or gets infected somehow. Could you be the subject of some kind of attack? You use the word community. Would anyone want to disrupt your community? Is this a community that is in one geographic area that can be detected by IP? Very interesting...

Anyway, you can use URL resolution to test new users. When you get a tweet from a new user with a URL, check the URL, and blacklist them if it resolves to an empty page. If you only have to do this for new users, it won't be too processor intensive.
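The URL test suggested in this thread (fetch the link from a new user's tweet and blacklist the user if it comes back as HTTP 200 with no content) could be sketched roughly as follows. This is only a sketch under assumptions: the function names and the blacklist policy are hypothetical, the User-Agent value is arbitrary, and error responses (which `urlopen` raises as exceptions) are not handled.

```python
import urllib.request
from typing import Callable, Tuple

# (HTTP status, final URL after redirects, response body)
Fetched = Tuple[int, str, bytes]


def fetch(url: str, timeout: float = 10.0) -> Fetched:
    """Fetch a URL, following redirects, and return status, final URL and body."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.geturl(), response.read()


def is_empty_page(status: int, body: bytes) -> bool:
    """True for the pattern reported in the thread: HTTP 200 and no content."""
    return status == 200 and not body.strip()


def should_blacklist(url: str, fetcher: Callable[[str], Fetched] = fetch) -> bool:
    """Hypothetical policy: blacklist a new user whose link is an empty page."""
    status, _final_url, body = fetcher(url)
    return is_empty_page(status, body)
```

Since the thread found that the same link answers differently depending on where the request comes from, the result of such a check only describes what the checking machine sees. The `fetcher` argument is there so the check can be pointed at a script on another host, or replaced by a stub when testing.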
[twitter-dev] Re: Where can I find the updated rate limit after OAuth?
Yes. I was expecting 350 OAuth-authenticated calls per hour, but I was not able to find that limit after OAuth. It is still giving me the 150 rate limit. Thanks.

On Nov 27, 5:06 am, Edward Hotchkiss edw...@edwardhotchkiss.com wrote:
it's 150 requests for flat file data per hour and 350 oauthenticated calls per hour ... unless you use a proxy. :P

On Nov 26, 2010, at 2:55 AM, m36tb6ll wrote:
Hi! I am a newbie in the field and am working on my first Twitter web app. I have created a variable loop timer using rate_limit_status, which works well in maximizing the usage of the Twitter API without going over the hourly limits. Now that I have incorporated OAuth, I was expecting to see the limit increase from 150 (unauthenticated requests) to 350 (authenticated requests). But I am still seeing the 150 limit, both in the response headers and in Firebug, when calling the rate_limit_status API after OAuth. Is there something I'm missing here? Your help would be greatly appreciated. Thanks in advance...
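For readers wondering what a "variable loop timer" based on rate_limit_status might look like, here is a minimal sketch. It is not the poster's code. It assumes the rate_limit_status response carries `remaining_hits` and `reset_time_in_seconds` fields (names as I recall them from the v1 API, so treat them as assumptions), and it simply spreads the remaining calls evenly over the time left in the window.

```python
import time
from typing import Optional


def seconds_between_calls(remaining_hits: int, reset_time_in_seconds: float, now: float) -> float:
    """Spread the remaining calls evenly over the time left until the limit resets."""
    seconds_left = max(reset_time_in_seconds - now, 0.0)
    if remaining_hits <= 0:
        return seconds_left  # out of calls: wait for the window to reset
    return seconds_left / remaining_hits


def delay_from_status(status: dict, now: Optional[float] = None) -> float:
    """Compute the delay from a parsed rate_limit_status response (assumed field names)."""
    if now is None:
        now = time.time()
    return seconds_between_calls(
        int(status["remaining_hits"]),
        float(status["reset_time_in_seconds"]),
        now,
    )
```

With a full hour left, 150 remaining calls work out to one call every 24 seconds, and 350 remaining calls to roughly one every 10.3 seconds, so the timer shortens its delay by itself once the higher limit actually shows up in the response.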
[twitter-dev] Re: How to fetch all followers of a random user of twitter?
Correct me if I'm wrong, but using ids/followers will only give me the IDs, and you would then need to fetch the other information for each of the IDs one by one. That means you would need to make 5001 API requests to fetch information for 5000 followers/friends (such as number of following, number of followers, number of tweets, and more). By contrast, with the method I noted earlier you would only need 50 API requests to fetch information on 5000 followers/friends. If there is a way of using the IDs that I'm missing, I'm all ears. Thanks!

On Nov 27, 5:08 am, Edward Hotchkiss edw...@edwardhotchkiss.com wrote:
dude just use oauth, 5000 per call ids/followers is the method. look it up.

On Nov 26, 2010, at 3:04 AM, m36tb6ll wrote:
Hi. If it would help: I just created a web app using http://api.twitter.com/1/statuses/followers.json and made a loop with a variable delay time using http://api.twitter.com/1/account/rate_limit_status.json so as to avoid going over the limits. It allowed me to fetch approximately 15000 (100 per call) in 1 hour using unauthenticated requests. ;)

On Nov 24, 2:29 am, Edward Hotchkiss edw...@edwardhotchkiss.com wrote:
just make sure to check for next_cursor_str to grab the next page if the user has more than 5000. note that next_cursor_str does not ever return null

On Nov 23, 2010, at 1:17 PM, Matt Harris wrote:
You can get the list of all followers using the API request: https://api.twitter.com/1/followers/ids.json?cursor=-1 That request will return up to 5000 follower IDs in one request. You can then look up details of those users using the /1/users/lookup method. More information on these methods is available here: http://dev.twitter.com/doc/get/followers/ids and http://dev.twitter.com/doc/get/users/lookup
Best
@themattharris
Developer Advocate, Twitter
http://twitter.com/themattharris

On Tue, Nov 23, 2010 at 5:11 AM, jaojao wuwei.yuan...@gmail.com wrote:
Hi, I have written a PHP script to fetch the followers of a given Twitter username by using the Twitter API. But the number of followers in the result is limited. For example, only 100 followers of BBCWorld are listed in the result, instead of 367,480. What is the solution to overcome this limitation? My code:

    <?php
    $username = "BBCWorld"; // input user name of twitter
    $follower_url = "http://api.twitter.com/1/statuses/followers/" . $username . ".xml";
    $twFriends = curl_init();
    curl_setopt($twFriends, CURLOPT_URL, $follower_url);
    curl_setopt($twFriends, CURLOPT_RETURNTRANSFER, TRUE);
    $twiFriends = curl_exec($twFriends);
    $response = new SimpleXMLElement($twiFriends);
    foreach ($response->user as $friends) {
        $thumb = $friends->profile_image_url;
        $url = $friends->screen_name;
        $name = $friends->name;
    ?>
    <a title="<?php echo $name; ?>" href="http://www.twitter.com/<?php echo $url; ?>"><img class="photo-img" src="<?php echo $thumb ?>" border="0" alt="" width="40" /></a>
    <?php } ?>
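To make the request counts in this thread concrete, here is a small sketch of the arithmetic behind the followers/ids plus users/lookup approach Matt Harris describes. The 5000 IDs per page figure comes from his message. The batch size of 100 users per users/lookup call is my recollection of the documentation linked above, so treat it as an assumption, and the function names are hypothetical.

```python
from typing import Iterator, List, Sequence

FOLLOWER_IDS_PAGE_SIZE = 5000  # IDs returned per followers/ids request (from the thread)
LOOKUP_BATCH_SIZE = 100        # users per users/lookup request (assumed limit)


def batches(ids: Sequence[int], size: int = LOOKUP_BATCH_SIZE) -> Iterator[List[int]]:
    """Split a list of user IDs into chunks small enough for one lookup call."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def requests_needed(follower_count: int) -> int:
    """Total requests: pages of IDs plus batched lookups (ceiling division for both)."""
    id_pages = -(-follower_count // FOLLOWER_IDS_PAGE_SIZE)
    lookups = -(-follower_count // LOOKUP_BATCH_SIZE)
    return id_pages + lookups
```

Under these assumptions, 5000 followers cost 1 request for the IDs plus 50 lookups, 51 in total rather than 5001, which is about the same as paging statuses/followers at 100 users per call. For the 367,480 followers of BBCWorld mentioned in the original question it comes to 74 + 3675 = 3749 requests.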
[twitter-dev] Re: lang=en queries to search API not working
I am having the same problem with cURL and $this->searchURL = 'http://search.twitter.com/search.atom?lang=en&q='; If I remove lang=en, then I get results.
Mark

On Nov 26, 11:42 pm, steve ick...@gmail.com wrote:
This reproduces even on http://search.twitter.com. If you try to filter to en-only results, you get back 0 items for most queries. Select "try all languages" from the search portal, or remove lang=en from your API query, and you get results for your queries (most of which are in English).

What's weird is that this seemed to be working fine until about 2 days ago, and it's been very intermittent since. Yesterday queries would work for a while and then stop working (same query to the API). But today they seem to be broken for me all day. Other members of my team reported the same issue yesterday, so it definitely seems to be something on your end.

BTW... When calling the API and this happens, we're getting back an error similar to this:

jsonp1290717568994({"results":[],"max_id":7896158276488192,"since_id":7896158276488192,"refresh_url":"?since_id=7896158276488192&q=Thanksgiving","results_per_page":50,"page":1,"completed_in":0.019352,"warning":"adjusted since_id to 7896158276488192 due to temporary error","since_id_str":"7896158276488192","max_id_str":"7896158276488192","query":"Thanksgiving"});

I did a search, and this error was reported back in June, but nobody ever responded... Crossing my fingers that this message doesn't go into the void as well...
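A client could at least detect this failure mode instead of treating it as a genuine "no results". The sketch below unwraps a JSONP response like the one quoted in this message and flags the case where `results` is empty and a `warning` field is present. The function names are hypothetical, and the sample string is a shortened copy of the quoted response with its JSON quoting filled in, which is an assumption about what the original looked like.

```python
import json
import re

# callbackName( ...json... ); with optional surrounding whitespace
JSONP_PATTERN = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)

# Shortened version of the response quoted in the thread (quoting reconstructed).
SAMPLE = (
    'jsonp1290717568994({"results":[],"max_id":7896158276488192,'
    '"warning":"adjusted since_id to 7896158276488192 due to temporary error",'
    '"query":"Thanksgiving"});'
)


def parse_jsonp(text: str) -> dict:
    """Strip the callback wrapper and decode the JSON payload."""
    match = JSONP_PATTERN.match(text)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


def is_suspect_empty_result(payload: dict) -> bool:
    """Empty results together with a warning suggest a server-side problem."""
    return not payload.get("results") and "warning" in payload
```

When the check returns true, a reasonable reaction might be to retry later, or to retry without lang=en as both posters did, rather than recording zero matches.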
Re: [twitter-dev] Trying to get rid of twitter spammers
Maintaining another host will be problematic. I have looked at a few more short URLs. They redirect to a very wide range of sites, not just Amazon.

I think Twitter could change the priority level of "Report for spam" for newly opened accounts, and the number of tweets per hour allowed for them.

Here again is the link that shows the tweets written as replies to Turkish people, in which the word lol is common: http://twitturk.com/tweet/search?q=lol

And an example account: http://twitter.com/Bomuchellxee All of its tweets are spam, and lol is common. It also has 0 following and 3 followers (real accounts, I guess). Unbelievable!
Re: [twitter-dev] Re: How to fetch all followers of a random user of twitter?
Use proxy servers and get them all in approximately 1 minute, depending on speed.

Regards, Edward Hotchkiss edw...@edwardhotchkiss.com http://www.edwardhotchkiss.com/
Re: [twitter-dev] Re: lang=en queries to search API not working
I think we're on api.twitter.com for atom/json searches.

Regards, Edward Hotchkiss edw...@edwardhotchkiss.com http://www.edwardhotchkiss.com/
Re: [twitter-dev] Re: Where can I find the updated rate limit after OAuth?
?

Regards, Edward Hotchkiss edw...@edwardhotchkiss.com http://www.edwardhotchkiss.com/
Re: [twitter-dev] Trying to get rid of twitter spammers
The followers are probably bots. Create an account, and within about 5 minutes or less you will generally have 2-3 followers that appear [real]. They iterate over IDs. Someone is running a dating/hookup botnet with those user accounts.
Re: [twitter-dev] lang=en queries to search API not working
I ran into the same problem in June/July. I added the since parameter to solve the problem. Example: http://search.twitter.com/search.atom?lang=en&q=thanksgiving&since=2010-11-24 Make sure the since parameter is not more than 5 or 6 days back in the past. Cheers

P.S.: In the new API documentation, http://dev.twitter.com/doc/get/search, this parameter is now named until, but that didn't work. Maybe that's an error in the documentation. In the old API doc, http://apiwiki.twitter.com/w/page/22554756/Twitter-Search-API-Method:-search, this parameter is named since.
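A small sketch of building that query string safely, so the `&` separators and the since date do not have to be assembled by hand. The endpoint and parameter names are the ones from the example URL above. The function names are hypothetical, and the default of 5 days is just the lower end of the "5 or 6 days" mentioned in this message.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

SEARCH_URL = "http://search.twitter.com/search.atom"


def since_date(today: date, days_back: int = 5) -> str:
    """Return a YYYY-MM-DD date a few days in the past for the since parameter."""
    return (today - timedelta(days=days_back)).isoformat()


def build_search_url(query: str, lang: str, since: str) -> str:
    """Build the search URL with properly separated and escaped parameters."""
    return SEARCH_URL + "?" + urlencode({"lang": lang, "q": query, "since": since})
```

For example, `build_search_url("thanksgiving", "en", since_date(date(2010, 11, 27), 3))` reproduces the URL given above.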
Re: [twitter-dev] Trying to get rid of twitter spammers
My final suggestion is to rank users by something (age of account, number of mentions/mentioners/followers/following) and cut out the bottom N%.

On Sat, Nov 27, 2010 at 4:18 PM, Furkan Kuru furkank...@gmail.com wrote:
Another hosting account would be problematic to maintain. I have looked at a few more short URLs. They redirect to a very wide range of sites, not just Amazon. I think Twitter could change the priority level of "Report for spam" for newly opened accounts, and the number of tweets per hour. Here again is the link that shows the tweets written as replies to Turkish people; the word lol is common to them: http://twitturk.com/tweet/search?q=lol And an example account: http://twitter.com/Bomuchellxee All of its tweets are spam and lol is common. It also has 0 following and 3 followers (real accounts, I guess). Unbelievable!

On Sat, Nov 27, 2010 at 4:29 PM, Adam Green 140...@gmail.com wrote:
Now you know that it does resolve differently in different countries. You could set up an account with a webhost in the US, and have a script there that you can call with URLs in tweets from new users. If the URL resolves to a blank page, blacklist that user. There are plenty of good hosts that only charge $7 a month. Sounds extreme, but these are very clever spammers. Or you could just resolve URLs from new users, and blacklist them if the URL points to Amazon. That will work as long as they still point to Amazon.

On Sat, Nov 27, 2010 at 9:12 AM, Furkan Kuru furkank...@gmail.com wrote:
It returns a redirection to an amazon.com product page. Example: http://www.amazon.com/gp/product/B0041E16RC?ie=UTF8&tag=iphone403d-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=B0041E16RC

On Sat, Nov 27, 2010 at 4:04 PM, Adam Green 140...@gmail.com wrote:
The URLs again return a code of 200 and nothing in the content. What happens when you try getting one of the URLs with cURL? I'm curious if it behaves differently for an IP in Turkey.
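The ranking idea in this thread (count how many different accounts mention each user, then cut out the bottom N%) could be sketched roughly as below in Python. This is only an illustration: the function names, the (author, mentioned) pair format and the tie-breaking by screen name are assumptions, not anything the posters described.

```python
from collections import defaultdict


def rank_by_mentioners(mentions):
    """Rank users by how many *different* accounts mention them.

    mentions is an iterable of (author, mentioned) screen-name pairs.
    Users who are never mentioned do not appear in the result.
    """
    mentioners = defaultdict(set)
    for author, mentioned in mentions:
        if author != mentioned:  # ignore self-mentions
            mentioners[mentioned].add(author)
    # Most distinct mentioners first; ties broken by name for stable output.
    return sorted(mentioners, key=lambda user: (-len(mentioners[user]), user))


def drop_bottom(ranked, percent):
    """Return the ranked list with the bottom `percent` of users removed."""
    drop = len(ranked) * percent // 100
    return ranked[:len(ranked) - drop]
```

Using a set per user means a spammer who mentions the same account many times still only counts once, which is the point of ranking by distinct mentioners rather than raw mention counts.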
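The URL-resolution test suggested in this thread (blacklist a new user if a link in their tweet returns HTTP 200 with no content) might look something like the following Python sketch. The function names and the in-memory blacklist set are hypothetical. As the thread notes, the same link can resolve differently depending on the requester's country, so a check like this only helps when run from a host that actually sees the empty response.

```python
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Fetch a URL (following redirects) and return (status, final_url, body)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.geturl(), resp.read()


def is_empty_page(status, body):
    """True for the pattern seen in the thread: HTTP 200 and no content."""
    return status == 200 and len(body.strip()) == 0


def screen_new_user(user, urls, blacklist, fetcher=fetch):
    """Return True if the user passes; add them to blacklist otherwise."""
    for url in urls:
        try:
            status, _final_url, body = fetcher(url)
        except (OSError, ValueError):
            # Network failure or malformed URL: no evidence either way.
            continue
        if is_empty_page(status, body):
            blacklist.add(user)
            return False
    return True
```

The fetcher argument exists so the check can be exercised without network access, for example with `screen_new_user("someone", ["http://example.invalid/"], set(), fetcher=lambda u: (200, u, b""))`.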
[twitter-dev] help me plz quiry speed and geocode
I am a uni student and very new to the Twitter API. I have been making an application in Processing to show tweets coming up on a world map in real time. I'm not trying to get the posts themselves; it's the geocodes I want. With lots of help from people here and on the Processing forums, I have made it so that it plots each geocode in my application as a dot, and these will represent Twitter posts from all around the world.

Link to a picture of my app to give you an idea: http://yfrog.com/b7worldstarmapj

Here is the query I am using: search.twitter.com/1/statuses/filter.json?location=-168.75,9.79,158.90,83.02

The problem is that I am getting a Twitter post about every 30 seconds with this, and after about 5/10 posts it stops feeding me posts and won't let me connect for about another hour.

OVERVIEW OF WHAT I WANT TO ACHIEVE
- geocodes of posts from all around the world, not just one place
- I don't need what they have posted, but if I get it then it's a bonus
- I need to be able to receive this information at a reasonable speed (I don't know if I'm allowed because of the query limit)
- any suggestions of different queries or code are VERY WELCOME

Here is my Processing code if anyone is interested:

import com.twitter.processing.*;

// test tweet counting (not sure if it will work)
int tweetCount;
// this stores how many tweets we've gotten
int tweets = 0;
// and this stores the text of the last tweet
String tweetText = "";
Geo tweetGeo;
double lati, longi;
float latiFl, longiFl, latiMap, longiMap;
int textsize;

void setup() {
  size(1000, 600);
  background(0);
  // set up twitter stream object
  TweetStream s = new TweetStream(this, "search.twitter.com", 80, "1/statuses/filter.json?location=-168.75,9.79,158.90,83.02", "USER", "PASSWORD");
  s.go();
  smooth();
}

void draw() {
  fill(0);
  rect(0, 450, 1000, 150);
  textsize = 12;
  // set up fonts
  PFont font;
  font = createFont("ArialMT-48.vlw", textsize);
  textFont(font);
  textSize(textsize);
  fill(255);
  // converts double to float
  latiFl = (float)lati;
  longiFl = (float)longi;
  // map value to screen: longitude runs along x, latitude along y (north at the top)
  longiMap = map(longiFl, -180, 180, 0, width);
  latiMap = map(latiFl, 90, -90, 0, height);
  // and draw the text of the last tweet
  text(tweetText, 20, 520);
  // and its lat and long
  text("lat = " + lati + " long = " + longi, 20, 560);
  text("number of tweets: " + tweetCount, 880, 580);
  for (int i = 0; i < 7000; i++) {
    fill(255);
    ellipse(longiMap, latiMap, 5, 5);
    /* fill(0); rect(0,500,1000,100); */
  }
}

// called by twitter stream whenever a new tweet comes in
void tweet(Status tweet) {
  // print a message to the console just for giggles if you like
  // println("got tweet " + tweet.id());
  // store the latest tweet text
  tweetText = tweet.text();
  tweetGeo = tweet.geo();
  lati = tweetGeo.latitude();
  longi = tweetGeo.longitude();
  println("lat = " + lati + " long = " + longi);
  // bump our tweet count by one
  tweets += 1;
  tweetCount++;
  println("number of tweets: " + tweetCount);
}

Thanks for looking
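As a side note on plotting geocodes, the arithmetic for placing a latitude/longitude pair on a flat world map can be written out on its own. The Python sketch below assumes a plain equirectangular map that fills the whole drawing area, with longitude -180 at the left edge and latitude +90 at the top; the function name to_screen is made up for illustration.

```python
def to_screen(lat, lon, width, height):
    """Map latitude/longitude in degrees to x/y pixels on an equirectangular map."""
    # Longitude runs left to right: -180 -> 0, +180 -> width.
    x = (lon + 180.0) / 360.0 * width
    # Latitude runs top to bottom: +90 -> 0, -90 -> height.
    y = (90.0 - lat) / 180.0 * height
    return x, y
```

For a 1000 by 600 window, the point at latitude 0, longitude 0 lands at the center (500, 300), and the north-west corner of the map (latitude 90, longitude -180) lands at (0, 0).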
[twitter-dev] Search not working properly for filters
curl http://search.twitter.com/search.atom?q=test%20filter%3Alinks%20(yfrog) returns no results, and there are only 3 results on the search page for the same query: http://search.twitter.com/search?q=http+filter%3Alinks+%28yfrog%29
[twitter-dev] Re: lang=en queries to search API not working
The thing that gets me is the fact that their own site (http://search.twitter.com) doesn't work when you filter things to show only en results.

On Nov 27, 2:04 pm, CWorster cwors...@schlimmer.com wrote:
I ran into the same problem in June/July. I added the since parameter to solve the problem. Example: http://search.twitter.com/search.atom?lang=en&q=thanksgiving&since=20... Make sure the since parameter is not more than 5/6 days back in the past. Cheers

P.S.: In the new API documentation (http://dev.twitter.com/doc/get/search) this parameter is now named until, but that didn't work. Maybe that's an error in the documentation. In the old API doc (http://apiwiki.twitter.com/w/page/22554756/Twitter-Search-API-Method:...) this parameter is named since.
[twitter-dev] Re: Help Getting Started with Site Streams
Is no one available to help me with this?

On Nov 25, 9:55 am, Jayrox jay...@gmail.com wrote:
I'd really like to make my own Site Streams library. I've been whitelisted for Site Streams and have been using the original stream for a long time now. My issue is that I cannot get any response from the connection besides 401 Unauthorized. Based on that, I assume the following to be true: I have access, because I am no longer getting the "role not defined for user" message, or however it was worded. I also assume I am either using the wrong connection params for providing my OAuth credentials or giving the wrong OAuth credentials. Or maybe I am using the right credentials but with the wrong param names. I've tried: oauth_token_key/oauth_token_secret, oauth_key/oauth_secret, access_key/access_secret, access_token_key/access_token_secret, consumer_key/consumer_secret, consumer_token_key/consumer_token_secret. I've tried all of the combinations as GET and as POST, all of which have given a 401 Unauthorized. Which leaves me here: stuck and looking for help. :)

On Nov 25, 12:05 am, Nancy Neira n143dra...@hotmail.com wrote:
Jay, are you looking for a code patch or the actual code to do streaming? Nancy

Date: Wed, 24 Nov 2010 23:53:21 -0500
Subject: [twitter-dev] Help Getting Started with Site Streams
From: jay...@gmail.com
To: twitter-development-talk@googlegroups.com

I am trying to get started with Site Streams. However, I am having a hard time finding the documentation for getting the stream started. Does anyone know where I can find this info, or can anyone provide it? I think I just need to know the names of the params; I can probably figure out the rest.
Thanks
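One thing that may be worth checking for the 401: in standard OAuth 1.0a (RFC 5849) the credentials are not sent as free-form key/secret parameters. The request carries oauth_consumer_key, oauth_token, oauth_nonce, oauth_timestamp, oauth_signature_method and oauth_version, plus an oauth_signature computed from the two secrets. The Python sketch below shows that generic signing procedure using only the standard library. Whether the Site Streams endpoint expects exactly this is an assumption, and the function names and any URL passed in are illustrative placeholders, not documented values.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode per RFC 3986 (only letters, digits and -._~ stay as-is)."""
    return quote(str(value), safe="")


def signature_base_string(method, url, params):
    """Build METHOD&encoded-url&encoded-sorted-params as OAuth 1.0a requires."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    param_string = "&".join("%s=%s" % pair for pair in pairs)
    return "&".join([method.upper(), pct(url), pct(param_string)])


def sign(base_string, consumer_secret, token_secret):
    """HMAC-SHA1 over the base string, keyed with both secrets, base64-encoded."""
    key = ("%s&%s" % (pct(consumer_secret), pct(token_secret))).encode("ascii")
    digest = hmac.new(key, base_string.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(method, url, oauth_params, consumer_secret,
                         token_secret, request_params=None):
    """Return the value for an HTTP Authorization header."""
    # The signature covers the OAuth parameters and any query/body parameters.
    all_params = dict(request_params or {}, **oauth_params)
    base = signature_base_string(method, url, all_params)
    signed = dict(oauth_params,
                  oauth_signature=sign(base, consumer_secret, token_secret))
    fields = ", ".join('%s="%s"' % (pct(k), pct(v))
                       for k, v in sorted(signed.items()))
    return "OAuth " + fields
```

Note that the secrets never appear in the request itself; they are only used as the HMAC key, which would be consistent with every key/secret parameter pair being rejected.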