Re: [twitter-dev] Re: Apps that Site Hack

2011-02-25 Thread Pascal Jürgens
How about a competition to develop spam-detection algorithms :)

Pascal

On Feb 24, 2011, at 10:38 PM, Dewald Pretorius wrote:

 Apart from implementing reCAPTCHA on tweet submission, follow, and
 unfollow, I can't see what Twitter can do to prevent that kind of
 abuse (can you imagine the revolt by bona fide users?). How else do
 you determine that it is an actual human and not a piece of automated
 software behind the browser on the user's desktop or laptop? The only
 other option is legally, and that depends on the country of residence
 of the owners of the software. At this point in time, it appears that
 anyone who is able to and has the inclination to write desktop
 software that bypasses the API might have carte blanche to do so.

-- 
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: 
http://groups.google.com/group/twitter-development-talk


Re: [twitter-dev] I've got the error: OAuth Authentication Failed. And I tried all the WordPress to Twitter plugins.

2011-02-12 Thread Pascal Jürgens
I'm no OAuth expert, but did you make sure your system time is properly 
synchronized with a regional NTP server?

Pascal
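Clock skew is a common cause of OAuth signature failures. A minimal sketch of a skew check, assuming you compare the local clock against a trusted server's HTTP Date header (the helper name and threshold are illustrative, not part of any plugin):

```python
import datetime
from email.utils import parsedate_to_datetime

def clock_skew_seconds(date_header, local_now):
    """Offset of the local clock (seconds) relative to a server's HTTP Date header."""
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()

# Example: the server says 12:00:00 GMT, but our clock reads 12:05:00.
local = datetime.datetime(2011, 2, 12, 12, 5, 0, tzinfo=datetime.timezone.utc)
skew = clock_skew_seconds("Sat, 12 Feb 2011 12:00:00 GMT", local)
if abs(skew) > 300:  # OAuth timestamps usually tolerate only a few minutes
    print("clock is off by %.0f seconds - resync via NTP" % skew)
```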

On Feb 12, 2011, at 3:13 AM, Winson wrote:

 Hi there.
 
 Using WP 3.0.5 and WP to Twitter 2.2.6 on a CentOS server, which I
 don't manage at all, I've got the error "OAuth Authentication Failed.
 Check your credentials and verify that Twitter is running."
 
 I've also checked all the data from the application, including deleting
 the old one and creating a new one, but the issue remains. Twitter is
 up and working at this very moment as I'm writing this message.
 
 Can anybody help?
 
 PS: I thought the problem was the WP to Twitter 2.2.6 plugin, so I
 tried the other WordPress to Twitter plugins. The result: "Authentication
 Failed."
 
 -- 
 Twitter developer documentation and resources: http://dev.twitter.com/doc
 API updates via Twitter: http://twitter.com/twitterapi
 Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
 Change your membership to this group: 
 http://groups.google.com/group/twitter-development-talk



Re: [twitter-dev] help me plz quiry speed and geocode

2010-11-28 Thread Pascal Jürgens
Hello noname,

the Search API is rate limited and only allows an undisclosed number of 
queries per hour. You will need to look into the Streaming API: consume the 
sample stream and extract the geodata. That also gives you tweets from all over 
the world.

Have a look at

http://dev.twitter.com/pages/streaming_api

Cheers,
Pascal
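A minimal sketch of the extraction step: the sample stream delivers one JSON status per line, and geotagged tweets carry a point-type geo field. The exact field layout below is an assumption based on the API of the time:

```python
import json

def extract_geo(line):
    """Pull (lat, lon) from one JSON-encoded status on the sample stream, or None.

    Geotagged tweets carry a 'geo' field of the form
    {"type": "Point", "coordinates": [lat, lon]}.
    """
    status = json.loads(line)
    geo = status.get("geo")
    if geo and geo.get("type") == "Point":
        lat, lon = geo["coordinates"]
        return (lat, lon)
    return None

# A geotagged status (fields trimmed for brevity):
sample = '{"text": "hello", "geo": {"type": "Point", "coordinates": [50.0, 8.27]}}'
print(extract_geo(sample))  # (50.0, 8.27)
```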

On Nov 28, 2010, at 12:34 AM, disc31 wrote:

 search.twitter.com/1/statuses/filter.json?
 location=-168.75,9.79,158.90,83.02
 
 The problem I am getting is that I am getting a twitter post about
 every 30 seconds with this, and after about 5-10 posts it stops feeding
 me the posts and won't let me connect for about another hour.



Re: [twitter-dev] Basic Auth deprecation August 16th?

2010-07-28 Thread Pascal Jürgens
http://countdowntooauth.com/


On Jul 29, 2010, at 1:22 AM, chinaski007 wrote:

 
 Any word on if this is still planned?
 
 Any further extensions?
 
 Or is the drop-dead deadline still August 16th?



Re: [twitter-dev] Re: Sending 1600 DMs?

2010-07-28 Thread Pascal Jürgens
Just curious:

the limit is on sending, not receiving. Why exactly would one want to send more 
than 250 DMs for one incident? Wouldn't that many messages overwhelm any 
helpful agency and actually have a detrimental effect?

Pascal


On Jul 29, 2010, at 2:06 AM, Bess wrote:

 There is no way to lift this DM daily limit?
 
 If I build an emergency system to report accidents, then the official
 twitter account for the police or Red Cross won't be able to receive
 more than 250 DMs per day.
 
 If there is a major accident that involves more than 250 injuries,
 assuming one DM per injury report, will Twitter send out a whale
 error after exceeding that limit?



Re: [twitter-dev] Twitter feed showing incorrect dates

2010-07-26 Thread Pascal Jürgens
Ben,

did you account for UTC time?

http://apiwiki.twitter.com/Return-Values

Pascal

On 26.Jul2010, at 18:21, Ben Juneau wrote:

 The dates are incorrect on my website... http://www.bjuneau.com
 
 I'm not sure what I'm doing wrong here? Any help is much appreciated.



Re: [twitter-dev] Home_timeline after tweet destroy

2010-07-23 Thread Pascal Jürgens
Hi Luis,

I might be wrong here, but I think this is the way it works because of 
twitter's caching and distribution architecture. You can never assume you'll 
get the full number of tweets or users - some might be filtered, deleted, or 
whatnot. If you need more, just get the next page/set using cursors.

Pascal

On 23.Jul2010, at 10:18, luisg wrote:

 Hi there...
 
 I'm experiencing something strange, I think...
 If I do a tweet destroy through my application and after that I get
 the home_timeline with the property count=20, I'm not getting the 20
 tweets that I should get. I'm getting just 19.
 If I do another tweet destroy and execute home_timeline again, I will
 get only 18... and so on...
 
 Am I doing something wrong? Is there a way to force to get the 20
 tweets?
 
 Thanks,
 
 Luis



Re: [twitter-dev] Re: Home_timeline after tweet destroy

2010-07-23 Thread Pascal Jürgens
Hi Luis,

yes, that's what I mean. You can either get the second page, or just request 
some more, as in:

 http://api.twitter.com/1/statuses/home_timeline.xml?count=25


Pascal
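The same idea can be written as a small defensive helper: over-request and trim, so silently dropped (e.g. freshly deleted) tweets don't leave you short. `fetch_page` here is a stand-in for the actual home_timeline call:

```python
def fetch_at_least(fetch_page, n, padding=5):
    """Request n + padding items in one call and trim, compensating for
    entries the server silently drops (e.g. freshly deleted tweets).

    fetch_page(count) -> list of statuses.
    """
    items = fetch_page(n + padding)
    return items[:n]

# Simulated timeline where 2 of the newest 25 tweets were deleted:
def fake_timeline(count):
    return [i for i in range(count) if i not in (3, 7)]

tweets = fetch_at_least(fake_timeline, 20)
print(len(tweets))  # 20
```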

On 23.Jul2010, at 11:40, luisg wrote:

 Hi Pascal,
 
 Thanks for your reply.
 What do you mean by cursors?
 
 I have a way to solve this problem:
 
 1- get the home_timeline
 2- count the number of tweets returned by (1) and, if the length < 20, do
 another home_timeline call with page=2
 
 I think this might work. The problem is, I need to do two calls to
 Twitter, and I don't think that's a good way...
 
 Is that the type of solution you mean? Do you have a better solution?
 
 Thanks a lot,
 
 Luis



Re: [twitter-dev] Re: Home_timeline after tweet destroy

2010-07-23 Thread Pascal Jürgens
Yes. You can't trust anything on twitter. Hope for good, valid results, prepare 
for anything else.

Pascal

On 23.Jul2010, at 15:03, luisg wrote:

 This means that the count property is not something that you can
 trust, right?
 
 Luis



Re: [twitter-dev] Re: Bug: friends/ids returning invalid user IDs

2010-07-22 Thread Pascal Jürgens
Stale caches.

Pascal

On 22.Jul2010, at 23:38, soung3 wrote:

 It's not only suspended users, but also users that are no longer
 found.  Why does Twitter return ids of users that no longer exist?



Re: [twitter-dev] Method statuses/filter from Streaming API return only incoming tweets and no updates

2010-07-20 Thread Pascal Jürgens
Hello Rostand,

I did my master's thesis using twitter data. My recommendation:

- You cannot, will not, and should not get DMs. They are *private*. Even if you 
do a closed study with 300 consenting people, it's unethical. If you're in the 
US, the ethics committee of your university will have you for lunch.

- Using the stream is easier because the REST API requires OAuth, but you will 
need to apply and wait for streaming access. However, the REST API will give 
you historical data.

- Coding your software will take more time than you think!

---
SUMMARY

Use an external service instead! There are several people who can collect 
tweets for you. Have a look at:

http://140kit.com/
http://www.contextminer.org/

Good luck!

Pascal

On 20.Jul2010, at 21:49, Rostand wrote:

 Hi All,
 
 I need, for a master's research project, the public timelines of approx.
 300 users.
 
 That is, all the outgoing messages (updates, retweets, ..., DMs) + all
 incoming messages (replies, @user, ...).
 
 I first thought that the 'follow' option from the Streaming API
 would do.
 
 http://stream.twitter.com/1/statuses/filter.json?follow=3004231
 
 But I just discovered that I get only the incoming tweets and not the
 outgoing ones.
 
 Is this correct? Any workaround for this? Suggestions?
 
 It is kind of urgent. Any quick reaction will be appreciated.
 
 Greetings,
 
 Rostand



Re: [twitter-dev] Re: Twitter backup script

2010-07-19 Thread Pascal Jürgens
Thomas,

last time I heard from the project, they were busy sorting the technical 
details out and still not sure who would even get access. It'll probably be 
open to a selected group of researchers first.

Pascal

On Jul 18, 2010, at 8:16 PM, Tomas Roggero wrote:

 Hi Pascal
 
 What I'm doing is requesting 150 per hour. I have 43 pages, so in about a week 
 I'll have my almost-full backup. :D
 
 (I've written a PHP script to do that automatically, of course)
 
 Does the Library of Congress have an API or something?



Re: [twitter-dev] Re: Twitter backup script

2010-07-18 Thread Pascal Jürgens
Tom,

at least you know that the Library of Congress has a backup :)

Pascal

On Jul 18, 2010, at 7:07 , Tom Roggero wrote:

 I've tried your script on Mac; it only works for the first 3 pages, which is
 weird (I'm running DarwinPorts for the XML functions)...
 Anyway, I tried to do it manually through Firefox; the latest page is 16.
 That's the limit. But if you have the IDs of the previous tweets you
 could use statuses/show for each ID... the problem is:
 
 IF you are backing up your account and you have all your tweet
 IDs, you will need 1 request per tweet, and the maximum is 350 (with HUGE
 luck) per hour. In my case, I've got 10k tweets: 3k via REST and 7k left to
 do... 7 THOUSAND REQUESTS???
 
 Come on Twitter, help us developers, start thinking about backups; we
 know you are gonna explode!



Re: [twitter-dev] Re: Gardenhose feed down to a trickle

2010-07-16 Thread Pascal Jürgens
Tomo,

John replied on another thread just minutes after you:

 I hoped we'd have an email out on Thursday about this, but I'd imagine it'll 
 go out on Friday. There isn't a problem with your client.

Pascal

On Jul 16, 2010, at 6:14 , Tomo Osumi wrote:

 Dear John,
 
 Could I have any update on the streaming API issues? I'm in the same
 situation as Sanjay: both the 'sample' and 'gardenhose' streaming APIs
 still have 1/3 - 1/5 as much traffic as usual.
 
 Tomo
 http://twitter.com/elrana/



Re: [twitter-dev] A feed greater equivalent to the old gardenhose?

2010-07-16 Thread Pascal Jürgens
In addition to the note from Taylor, I think it's a good idea to remind people 
that the stream contents are identical - it's of absolutely no use, and a waste 
of resources, to consume more than one sample stream. Just pick the largest 
one - it will contain all the messages you can get.


Pascal

On Jul 16, 2010, at 16:07 , Sanjay wrote:

 Just saw the posting about the reduction in the gardenhose (and
 spritzer) feeds ( http://t.co/d6o1npx ).  So for those of us who need
 the additional data and are designed around it (and can consume it),
 is there a way to get that level of feed back?  For me in particular
 this is going to significantly hamper the app that I'm working on and
 was looking to launch in a few weeks.
 
 Help...?
 
 Sanjay



Re: [twitter-dev] Twitter API HTTP Heading

2010-07-15 Thread Pascal Jürgens
Nicholas,

Did you just publish your account credentials?


Pascal

On Jul 15, 2010, at 14:02 , Nicholas Kingsley wrote:

   INC send$, Authorization: Basic FishyMcFlipFlop:burpmachine\r\n



Re: [twitter-dev] Gardenhose feed down to a trickle

2010-07-15 Thread Pascal Jürgens
# Idle musing

Inflation adjustment?

# end

Pascal

On Jul 15, 2010, at 17:14 , John Kalucki wrote:

 This is a known issue. We'll have an email about the Gardenhose and Spritzer 
 later today.
 
 -John Kalucki
 http://twitter.com/jkalucki
 Infrastructure, Twitter Inc.



Re: [twitter-dev] Juitter - Some accounts aren't indexed by search?

2010-07-13 Thread Pascal Jürgens
Hi,

those are probably accounts which twitter filtered out. The docs are pretty 
clear on this and give practical advice:

http://dev.twitter.com/pages/streaming_api_concepts

 Both the Streaming API and the Search API filter statuses created by a small 
 proportion of accounts based upon status quality metrics. For example, 
 frequent and repetitious status updates may, in some instances, and in 
 combination with other metrics, result in a different status quality score 
 for a given account. Results that are not selected by user id, for example: 
 samples and keyword track, are filtered by this status quality metric. 
 Results that are selected by user id, currently only results from the follow 
 predicate, are unfiltered and allow all matching statuses to pass. If an 
 expected user's statuses are not present in a non-follow-predicate stream 
 type, manually cross-check the user against Search results. If the user's 
 statuses are also not returned in Search, you can assume that the user's 
 statuses will not be returned by non-follow-predicated streams.
 For more details see: http://help.twitter.com/forums/10713/entries/42646 
 which states, in part:
 In order to keep your search results relevant, Twitter filters search results 
 for quality. Our search results will not include suspended accounts, or 
 accounts that may jeopardize search quality. Material that degrades search 
 relevancy or creates a bad search experience for people using Twitter may be 
 permanently removed.


On Jul 13, 2010, at 23:30 , codeless wrote:

 Hey there, I'm working with Juitter (http://juitter.com) on a project
 and have noticed that some users' tweets won't appear on the live
 stream. And not just accounts that are set to private, but regular
 twitter accounts. I have done some research and found some people saying
 that not all Twitter accounts are indexed by the Twitter search
 engine. Is there anything I can change in Juitter to fix it so it'll
 work with all non-private accounts? Or a way to force Twitter into
 indexing a certain account? Any and all information is welcomed. Thank
 you for your help.



Re: [twitter-dev] Re: Can not tweet to e.g. #Studentenjob anymore

2010-07-12 Thread Pascal Jürgens
Michael,

you can find out how to check here:

http://help.twitter.com/entries/15790-how-to-contest-account-suspension

Pascal

On Jul 12, 2010, at 10:32 , microcosmic wrote:

 Hello Pascal.
 
 It's not the case that our account is disabled. Or is there a hidden
 message saying "account disabled"?
 
 Regards,
 
 Michael



Re: [twitter-dev] Can not tweet to e.g. #Studentenjob anymore

2010-07-11 Thread Pascal Jürgens
Hello Michael,

just an idea:
try to log into the twitter website with your account and see whether it was 
disabled for spam.

Pascal
On Jul 11, 2010, at 17:34 , microcosmic wrote:

 Hello there.
 
 Since Friday I am not able to send tweets to e.g. #Studentenjob or
 #Nebenjob anymore. I found out that it is not possible to send tweets
 to any #... for our twitter account.
 
 What can I do? Is there an error in our program we use?
 
 Thanks in advance.
 
 Regards,
 
 Michael



Re: [twitter-dev] Streaming API time drifting problem and possible solutions

2010-07-08 Thread Pascal Jürgens
Larry,

have you decoupled the processing code from tweepy's StreamListener, for 
example using a Queue.Queue or some message queue server?

Pascal
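For reference, a stdlib-only sketch of that decoupling; `on_status` stands in for tweepy's StreamListener callback, and the worker thread absorbs the slow processing so the stream reader never blocks:

```python
import queue
import threading

# Producer/consumer split: the stream callback only enqueues; a worker
# thread does the (potentially slow) processing.
inbox = queue.Queue(maxsize=10000)

def on_status(status):
    """Called by the streaming client; must return immediately."""
    inbox.put(status)

processed = []

def worker():
    while True:
        status = inbox.get()
        if status is None:        # sentinel: shut down
            break
        processed.append(status.upper())  # stand-in for real processing
        inbox.task_done()

t = threading.Thread(target=worker)
t.start()
for msg in ("tweet one", "tweet two"):
    on_status(msg)
on_status(None)
t.join()
print(processed)  # ['TWEET ONE', 'TWEET TWO']
```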

On Jul 8, 2010, at 17:31 , Larry Zhang wrote:

 Hi everyone,
 
 I have a program calling the statuses/sample method of a garden hose
 of the Streaming API, and I am experiencing the following problem: the
 timestamps of the tweets that I downloaded constantly drift behind
 real-time, the time drift keeps increasing until it reaches around 25
 minutes, and then I get a timeout from the request, sleep for 5
 seconds and reset the connection. The time drift is also reset to 0
 when the connection is reset.
 
 One solution I have for this now is to proactively reset the
 connection more frequently, e.g., if I reconnect every minute, the
 time drift I get will be at most 1 minute. But I am not sure whether
 this is allowed by the API.
 
 So could anyone tell me if you have the same problem as mine, or whether I am
 using the API in the wrong way? And is it OK to reset the connection every
 minute?
 
 I am using Tweepy (http://github.com/joshthecoder/tweepy) as the
 library for accessing the Streaming API.
 
 Thanks a lot!
 -Larry



Re: [twitter-dev] Streaming API time drifting problem and possible solutions

2010-07-08 Thread Pascal Jürgens
Larry,

Moreover, I assume you checked I/O and CPU load. But even if that's not the 
issue, you should absolutely check whether you have simplejson with the C 
extension installed. The version included with Python is 1.9, which is 
decidedly slower than the new 2.x branch. You might see JSON decoding load 
drop by 50% or more.


Pascal
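A common pattern is to prefer simplejson when it is installed (picking up its C speedups) and fall back to the stdlib module otherwise; a sketch:

```python
# Prefer simplejson (whose optional C extension makes decoding much faster)
# and fall back to the stdlib json module when it isn't installed.
try:
    import simplejson as json
except ImportError:
    import json

decoded = json.loads('{"text": "hello", "id": 123}')
print(decoded["id"])  # 123
```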


On Jul 8, 2010, at 17:31 , Larry Zhang wrote:

 Hi everyone,
 
 I have a program calling the statuses/sample method of a garden hose
 of the Streaming API, and I am experiencing the following problem: the
 timestamps of the tweets that I downloaded constantly drift behind
 real-time, the time drift keeps increasing until it reaches around 25
 minutes, and then I get a timeout from the request, sleep for 5
 seconds and reset the connection. The time drift is also reset to 0
 when the connection is reset.
 
 One solution I have for this now is to proactively reset the
 connection more frequently, e.g., if I reconnect every minute, the
 time drift I get will be at most 1 minute. But I am not sure whether
 this is allowed by the API.
 
 So could anyone tell me if you have the same problem as mine, or whether I am
 using the API in the wrong way? And is it OK to reset the connection every
 minute?
 
 I am using Tweepy (http://github.com/joshthecoder/tweepy) as the
 library for accessing the Streaming API.
 
 Thanks a lot!
 -Larry



Re: [twitter-dev] Re: Friend and Follower count - since timestamp

2010-07-07 Thread Pascal Jürgens
Just wanted to add:

it's a sad thing that ETags see hardly any use today. Back when the graph 
methods weren't paginated, you could just send a request with the ETag header 
set and it would come back 304 Not Modified, a very efficient thing to do. It 
won't give you the difference between arbitrary points in time, but for most 
applications it's quite enough.

I don't think anybody ever confirmed that this even works with paginated calls, 
but I don't see why it couldn't (especially since pages are apparently newest 
first, as Raffi said).

Pascal
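The conditional-GET flow looks roughly like this; `fake_fetch` stands in for the HTTP call to a graph method, and the ETag value is made up:

```python
def conditional_request(fetch, etag=None):
    """Issue a conditional GET: send If-None-Match with the cached ETag
    and treat a 304 as 'use the cached copy'.

    fetch(headers) -> (status_code, new_etag, body).
    Returns (etag, body); body is None when the resource is unchanged.
    """
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    status, new_etag, body = fetch(headers)
    if status == 304:
        return etag, None          # unchanged: no body transferred
    return new_etag, body

# Fake server that reports 'not modified' for a known ETag:
def fake_fetch(headers):
    if headers.get("If-None-Match") == 'W/"abc"':
        return 304, None, None
    return 200, 'W/"abc"', "[1,2,3]"

etag, body = conditional_request(fake_fetch)           # first call: full body
etag2, body2 = conditional_request(fake_fetch, etag)   # second call: 304
print(body, body2)  # [1,2,3] None
```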

On Jul 7, 2010, at 10:50 , nischalshetty wrote:

 Raised an issue: http://code.google.com/p/twitter-api/issues/detail?id=1732
 
 Hope one of you finds time to work on this; it would be a big help for me
 as well as a whole lot of other apps that deal with a user's friends and
 followers.
 
 -Nischal



Re: [twitter-dev] Re: Search API rate limit

2010-07-07 Thread Pascal Jürgens
Shan,

as far as I know twitter has been reluctant to state definite numbers, so 
you'll have to experiment and implement a backoff mechanism in your app. Here 
is the relevant part of the docs:

 Search API Rate Limiting
 The Search API is rate limited by IP address. The number of search requests 
 that originate from a given IP address are counted against the search rate 
 limiter. The specific number of requests a client is able to make to the 
 Search API for a given hour is not released. Note that the Search API is not 
 limited by the same 150 requests per hour limit as the REST API. The number 
 is quite a bit higher and we feel it is both liberal and sufficient for most 
 applications. We do not give the exact number because we want to discourage 
 unnecessary search usage.
  
 Search API usage requires that applications include a unique and identifying 
 User Agent string. A HTTP Referrer is expected but is not required. Consumers 
 using the Search API but failing to include a User Agent string will receive 
 a lower rate limit.
  
 An application that exceeds the rate limitations of the Search API will 
 receive HTTP 420 response codes to requests. It is a best practice to watch 
 for this error condition and honor the Retry-After header that instructs the 
 application when it is safe to continue. The Retry-After header's value is 
 the number of seconds your application should wait before submitting another 
 query (for example: Retry-After: 67).

Cheers,

Pascal
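Honoring the Retry-After header from the quoted docs could be sketched like this (header handling simplified; the default wait is an assumption):

```python
def backoff_seconds(status_code, headers, default=60):
    """How long to sleep before the next Search API call.

    Honors the Retry-After header on HTTP 420 responses, as the docs
    recommend; returns 0 when the request succeeded.
    """
    if status_code == 420:
        return int(headers.get("Retry-After", default))
    return 0

# A rate-limited response telling us to wait 67 seconds:
wait = backoff_seconds(420, {"Retry-After": "67"})
print(wait)  # 67
# In a real loop you would then call time.sleep(wait) before retrying.
```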


On Jul 7, 2010, at 1:55 , Ramanean wrote:

 Matt,
 
 
 What is the exact limit? Can I write to Twitter to have my IP
 whitelisted?
 
 Would whitelisting the IP do any good?
 
 
 Shan



Re: [twitter-dev] lockouts are the new black

2010-07-06 Thread Pascal Jürgens
With "multi-level loosely-coordinated best-effort distributed cache" you 
certainly got the naming; all that's left is the cache invalidation. :)

Pascal

On Jul 6, 2010, at 18:10 , John Kalucki wrote:

 These lockouts are almost certainly due to a performance optimization 
 intended to reduce network utilization by increasing physical reference 
 locality in a multi-level loosely-coordinated best-effort distributed cache. 
 Not easy to get right, and the engineers involved are working to resolve the 
 issue. There's absolutely no intention to lock people out.
 
 -John Kalucki
 http://twitter.com/jkalucki
 Infrastructure, Twitter Inc.
 



Re: [twitter-dev] Re: Rate Limiting

2010-07-06 Thread Pascal Jürgens
Just a sidenote: This can be coincidental. Unless you try several dozen times 
with each client, no valid inference can be drawn from the tests.

Pascal
On Jul 6, 2010, at 18:46 , Johnson wrote:

 I notice that the rate limit is application specific. I've tried a few
 clients; some of them go through, some don't.



Re: [twitter-dev] Streaming API and Oauth

2010-07-05 Thread Pascal Jürgens
Quoting John Kalucki:

 We haven't announced our plans for streaming and oAuth, beyond stating that 
 User Streams will only be on oAuth.


Right now, basic auth and oAuth both work on streaming, and that won't change 
when basic for REST turns off. Since there's no set shutdown date yet for 
basic/streaming, I wouldn't expect it to happen soon.

Pascal

On Jul 5, 2010, at 20:25 , Zhami wrote:

 The OAuth Overview page (http://dev.twitter.com/pages/auth_overview)
 has sections for three APIs: REST, Search, and Streaming. The bottom
 of the page displays a ribbon stating that "The @twitterapi team will
 be shutting off basic authentication for the Twitter API."  Does this
 mean all of the Twitter APIs (REST, Search, and Streaming), or just
 the REST API?
 
 Most specifically, while I know that the Streaming API end-point now
 supports OAuth, I do not know if Streaming will require OAuth come
 August 16th...  can someone please clarify. TIA.



Re: [twitter-dev] Re: Farsi Twitter App

2010-07-04 Thread Pascal Jürgens
Google Translate is easy, but *very* inaccurate. I tested it on a set of 30,000 
tweets, and more than 60% were unreliably classified (Google will tell you the 
confidence of the classification inline).

Don't rely on it for language detection unless you pretty much don't care!

On Jul 4, 2010, at 4:43 , Sami wrote:

 A simple solution I experimented with is using the Google Ajax translation
 APIs; it is pretty reliable
 (http://code.google.com/apis/ajaxlanguage/documentation/),
 but it works only for web apps, and you have to push all the tweets
 from the sample stream to the client to filter.



Re: [twitter-dev] Farsi Twitter App

2010-07-04 Thread Pascal Jürgens
Interesting. Your method is similar to the breadth-first crawl that many people 
do (for example, see the academic paper by Kwak et al. 2010).

You have to keep in mind, however, that you are only crawling the giant 
component of the network, the connected part. If there are any Turkish users 
who have their own *separate* subpopulation, one that is not connected to the 
rest, you won't find them.

You could easily find them with a sample stream. Although I suspect the number 
of non-connected users is not that big, no one has really tested it so far.

Pascal

On Jul 3, 2010, at 20:00 , Furkan Kuru wrote:

 We have implemented the Turkish version, Twitturk:
 http://twitturk.com/home/lang/en
 
 We skipped the first three steps, but started with a few Turkish users and 
 crawled the whole network, and for each new user we tested whether the 
 description or latest tweets were in Turkish.
 
 We have identified almost 100,000 Turkish users so far.
 
 Using the stream API we collect their tweets and find out the popular people, 
 keywords, and top tweets (most retweeted ones) among Turkish people.



Re: [twitter-dev] Farsi Twitter App

2010-07-03 Thread Pascal Jürgens
Hi Lucas,

as someone who approached a similar problem, my recommendation would be to 
track users. In order to get results quickly (rather than every few hours via 
user timeline calls), you need streaming access, which is a bit more 
complicated. I implemented such a system in order to track the German-speaking 
population of twitter users, and it works extremely well.

1) get access to the sample stream (5% or 15% type) (warning: the 15% stream is 
~10GB+ a day)

2) construct an efficient cascading language filter, i.e.:
- first test the computationally cheap AND precise attributes, such as a list 
of known farsi-only keywords or the location box
- if those attribute tests are negative, perform more computationally expensive 
tests
- if in doubt, count it as non-farsi! False positives will kill you if you 
sample a very small population!

3) With said filter, identify the accounts using farsi

4) Perform a first-degree network sweep and scan all their friends+followers, 
since those have a higher likelihood to speak farsi as well

5) compile a list of those known users

6) track those users with the shadow role stream (80,000 users) or higher.

If your language detection code is not efficient enough, you might want to 
include a cheap, fast and precise negative filter of known non-farsi 
attributes. Test that one before all the others and you should be able to 
filter out a large part of the volume.
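A minimal sketch of such a cascading filter. The keyword and location sets are placeholders, not real linguistic data, and the "expensive" test is a crude script-range check standing in for a real classifier:

```python
FARSI_KEYWORDS = {"سلام"}             # hypothetical known Farsi-only tokens
FARSI_LOCATIONS = {"tehran", "iran"}  # hypothetical location hints

def expensive_language_test(text):
    """Stand-in for a costly classifier (n-gram model, script analysis, ...).

    Farsi uses Arabic script; as a crude proxy, flag any character in the
    Arabic Unicode block.
    """
    return any("\u0600" <= ch <= "\u06ff" for ch in text)

def is_farsi(text, location):
    # 1. cheap AND precise tests first
    if any(word in text for word in FARSI_KEYWORDS):
        return True
    # 2. cheap hint, confirmed by the expensive test
    if location.lower() in FARSI_LOCATIONS:
        return expensive_language_test(text)
    # 3. when in doubt, count it as non-Farsi (false positives kill you)
    return False

print(is_farsi("hello world", "London"))  # False
```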


Don't hesitate to ask for any clarification!

Pascal Juergens
Graduate Student / Mass Communication
U of Mainz, Germany

On Jul 3, 2010, at 0:36 , Lucas Vickers wrote:

 Hello,
 
 I am trying to create an app that will show tweets and trends in
 Farsi, for native speakers.  I would like to somehow get a sample
 'garden hose' of Farsi based tweets, but I am unable to come up with
 an elegant solution.
 
 I see the following options:
 
 - Sample all tweets, and run a language detection algorithm on the
 tweet to determine which are/could be Farsi.
  * Problem: only a very very small % of the tweets will be in Farsi
 
 - Use the location filter to try and sample tweets from countries that
 are known to speak Farsi, and then run a language detection algorithm
 on the tweets.
  * Problem: I seem to be limited on the size of the coordinate box I
 can provide.  I can not even cover all of Iran for example.
 
 - Filter a standard farsi term.
  * Problem: will limit my results to only tweets with this term
 
 - Search for language = farsi
   * Problem: Not a stream, I will need to keep searching.
 
 I think of the given options I mentioned what makes the most sense is
 to search for tweets where language=farsi, and use the since_id to
 keep my results new.  Given this method, I have three questions
 1 - since_id I imagine is the highest tweet_id from the previous
 result set?
 2 - How often can I search (given API limits of course) in order to
 ensure I get new data?
 3 - Will the language filter provide me with users whose default
 language is farsi, or will it actually find tweets in farsi?
 
 I am aware that the user can select their native language in the user
 profile, but I also know this is not 100% reliable.
 
 Can anyone think of a more elegant solution?
 Are there any hidden/experimental language type filters available to
 us?
 
 Thanks!
 Lucas



Re: [twitter-dev] Farsi Twitter App

2010-07-03 Thread Pascal Jürgens
John,

yes, thanks a lot for the design proposal - that is what inspired my own 
system. I am not primarily filtering by language, however, but by country, so 
I'm using time zone and location data together with a list of cities from 
http://www.geonames.org/

The manual cross-check in my thesis shows that this gets you close to 1 in 
specificity and above .7 in sensitivity.

From my experience, the key is to develop efficient language-specific tests 
with as low an error rate as possible (this, sadly, largely excludes 
conventional SVM, HMM models etc, because tweets are so short and full of weird 
punctuation).

Pascal

On Jul 3, 2010, at 15:26 , John Kalucki wrote:

 It's great to hear that someone implemented all this. There's a similar 
 technique documented here: 
 http://dev.twitter.com/pages/streaming_api_concepts, under By Language and 
 Country. My suggestion was to start with a list of stop words to build your 
 user corpus -- but I don't know how well Farsi works with track, so random 
 sample method might indeed be better.
 
 -John Kalucki
 http://twitter.com/jkalucki
 Infrastructure, Twitter Inc.



Re: [twitter-dev] My Client API was Decreased to 175

2010-07-01 Thread Pascal Jürgens
http://status.twitter.com/post/750140886/site-tweaks


On Jul 1, 2010, at 9:49 , PiPS wrote:

 Hi.
 
 I am developing a twitter client.
 
 My client uses xAuth.
 
 But my client's API rate limit is 175.
 
 
 That was 350 before.
 
 Why was it suddenly reduced by half?
 



Re: [twitter-dev] Twitter 1500 search results

2010-06-07 Thread Pascal Jürgens
As stated in the API wiki, the number of search results you can get at any 
given point in time for one search term is indeed ~1500 
(http://apiwiki.twitter.com/Twitter-Search-API-Method:-search).

There are several ways to go beyond that.

a) Do perpetual searches (say, one every day), and merge the results
b) Get streaming access and track keywords in real time
c) Vary search terms and combine the results

Good luck.
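For options (a) and (c), the merge step is just de-duplication by tweet id; a sketch:

```python
def merge_results(*result_sets):
    """Merge several batches of search results, de-duplicating by tweet id.

    Each batch is a list of {'id': ..., 'text': ...} dicts, as returned by
    repeated or varied searches.
    """
    seen = {}
    for batch in result_sets:
        for status in batch:
            seen[status["id"]] = status
    return list(seen.values())

# Two overlapping daily searches:
day1 = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}]
day2 = [{"id": 2, "text": "b"}, {"id": 3, "text": "c"}]
print(len(merge_results(day1, day2)))  # 3
```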


On Jun 7, 2010, at 22:53 , sahmed10 wrote:

 I am developing an application where I am trying to get more than 1500
 results for a search query. Is it possible?
 For example, when I specify "return all results from 2nd June to 6th June"
 with the search string "iphone", I only get the 1500 latest tweets; but
 on the other hand I am interested in all the tweets which mention
 "iphone" from 2nd June to 6th June. Is there a workaround for this?



Re: [twitter-dev] Re: Twitter 1500 search results

2010-06-07 Thread Pascal Jürgens
Good to know. Did you mean to say "consume … streaming results"? I don't really 
see where you use the stream here.

Also, please note that it's not a good idea to work with since_id and 
max_id any more, because those will soon be (already are?) NON-SEQUENTIAL. 
This means you will lose tweets if you rely on the IDs incrementing over time. 
To quote the relevant email from Taylor Singletary:

 Please don't depend on the exact format of the ID. As our infrastructure 
 needs evolve, we might need to tweak the generation algorithm again.
 
 If you've been trying to divine meaning from status IDs aside from their role 
 as a primary key, you won't be able to anymore. Likewise for usage of IDs in 
 mathematical operations -- for instance, subtracting two status IDs to 
 determine the number of tweets in between will no longer be possible

Cheers.

On Jun 8, 2010, at 0:06 , sahmed10 wrote:

 Yes, it works! The algorithm is something like this:
 Set the query to a string with appropriate To and From dates. Then
 consume the 1500 results and save the status id of the
 very last tweet you got. As they are in sequential order (with gaps),
 it won't be a problem. The very last tweet's status id should be assigned
 as the max_id for the next set of results, and so on.



[twitter-dev] Re: 2 week advance notice: changes to /friends/ids and /followers/ids

2009-08-01 Thread Pascal Jürgens

Thanks for the notification.

What will this mean for etag checks?
I currently fetch a large number of graphs at regular intervals. Any
check that returns a 304 should incur little cost.

Will I need to crawl all the pages and check for their 304?
If I get a 304 on the first one, can I assume that the rest remains
equally unchanged?

Pascal

--
Pascal Juergens
twitter.com/pascal


On Jul 31, 7:35 pm, Alex Payne a...@twitter.com wrote:
 The Twitter API currently has two methods for returning a user's
 denormalized social graph: /friends/ids [1] and /followers/ids [2].
 These methods presently allow pagination by use of a ?page=n
 parameter; without that parameter, they attempt to return all user IDs
  in the specified set. If you've used these methods, particularly for
 exploring the social graphs of users that are following or followed by
 a large number of other users, you've probably run into lag and server
 errors.

 In two weeks, we'll be addressing this with a change in back-end
 infrastructure. The page parameter will be replaced with a cursor
 parameter, which in turn will result in a change in the response
 bodies for these two methods. Whereas currently you'd receive an array
 response like this (in JSON):

   [1,2,3,...]

 You will now receive:

    {"ids": [1,2,3], "next_id": 1231232}

 You can then use the next_id value to paginate through the set:

   /followers/ids.json?cursor=1231232

 To start paginating:

   /followers/ids.json?cursor=-1

 The negative one (-1) indicates that you want to begin paginating.
 When the next_id value is zero (0), you're at the last page.

 Documentation of the new functionality will, of course, be provided on
 the API Wiki in advance of the change going live. If you have any
 questions or concerns, please contact us as soon as possible.

 [1] http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-friends%C2%A0ids
 [2] http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-followers%C2%A0ids

  --
  Alex Payne - Platform Lead, Twitter, Inc.
  http://twitter.com/al3x
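The cursor flow Alex describes can be sketched as a simple loop; `fetch_page` stands in for the HTTP call, and the fake two-page API below mirrors the example values from the email:

```python
def all_follower_ids(fetch_page):
    """Walk the cursored /followers/ids responses described above.

    fetch_page(cursor) -> {'ids': [...], 'next_id': ...}; start at -1,
    stop when next_id is 0.
    """
    ids, cursor = [], -1
    while cursor != 0:
        page = fetch_page(cursor)
        ids.extend(page["ids"])
        cursor = page["next_id"]
    return ids

# Fake two-page API using the values from the announcement:
pages = {-1: {"ids": [1, 2, 3], "next_id": 1231232},
         1231232: {"ids": [4, 5], "next_id": 0}}
print(all_follower_ids(pages.get))  # [1, 2, 3, 4, 5]
```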