[twitter-dev] Multiple issues with favorites info, REST API

2010-10-22 Thread AJ

I posted the following a few days ago.  Since then, in just the past
day or so, I'm seeing additional *new* issues with the favorites
info.  The favorites count shown for a user, e.g. via
 twurl /1/show/.xml
is now very inaccurate-- so much so that I'm wondering if a different
user's information is getting pulled up instead.  For example, with my
account the number of favs is ~1042 lower than the actual number,
which was correctly reported as recently as a couple days ago.  When I
check random status posts that I know should be favorited for me, they
still are.  (So, as with the issue I reported below, both issues might
be related to how information is being served up, rather than the
actual database information being changed)


On Oct 18, 9:51 am, AJ  wrote:
> I have noticed an apparent bug with the REST favorites API which seems
> to have just started a few days ago, I think around the 12th or 13th
> Oct.
>
> I am working on an app which amongst other things archives a user's
> favorites.
> Periodically, the app pages through older pages of favorites to see if
> any tweets further back in the stream are now favorited, and to see if
> any older favs have been unfavorited.
>
> e.g. /1/favorites.xml?page=20 (as an authenticated user)
>
> It appears that only about every other actually favorited tweet is
> being returned from these requests.  This is new behaviour-- it worked
> correctly last week. Because the app has the older tweets stored, it
> detects those that are missing from the API response sequence.  It is
> typically about 10 out of 20 missing, typically ~every other one.
>
> However, if I check the actual 'missing' favs individually, they do
> indicate that the authenticated user has favorited them.  So it
> appears that Twitter still considers the missing tweets favorited by
> the user, but the API is not serving them all up any more, only about
> half of them.
>
> This bug doesn't appear to manifest with 'newer' favs (e.g. the first
> retrieved page seems okay).
>
>  -Amy

-- 
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: 
http://groups.google.com/group/twitter-development-talk


[twitter-dev] apparent new bug with REST favorites API

2010-10-18 Thread AJ
I have noticed an apparent bug with the REST favorites API which seems
to have just started a few days ago, I think around the 12th or 13th
Oct.

I am working on an app which amongst other things archives a user's
favorites.
Periodically, the app pages through older pages of favorites to see if
any tweets further back in the stream are now favorited, and to see if
any older favs have been unfavorited.

e.g. /1/favorites.xml?page=20 (as an authenticated user)

It appears that only about every other actually favorited tweet is
being returned from these requests.  This is new behaviour-- it worked
correctly last week. Because the app has the older tweets stored, it
detects those that are missing from the API response sequence.  It is
typically about 10 out of 20 missing, typically ~every other one.

However, if I check the actual 'missing' favs individually, they do
indicate that the authenticated user has favorited them.  So it
appears that Twitter still considers the missing tweets favorited by
the user, but the API is not serving them all up any more, only about
half of them.

This bug doesn't appear to manifest with 'newer' favs (e.g. the first
retrieved page seems okay).
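
The detection described above (stored favorites compared against what one page of the API returns) can be sketched roughly as follows. This is only an illustration in Python, not the app's actual code: `diff_favorites_page`, `stored_ids` and `page_ids` are hypothetical names, it assumes favorites are ordered by tweet ID, and it does not call the API itself.

```python
def diff_favorites_page(stored_ids, page_ids):
    """Compare archived favorite IDs with the IDs returned for one page.

    Returns (missing, new): archived IDs that fall inside the page's ID
    range but were not returned, and returned IDs not yet archived.
    """
    if not page_ids:
        return [], []
    lo, hi = min(page_ids), max(page_ids)
    page = set(page_ids)
    stored = set(stored_ids)
    # Archived favorites that should have appeared between the page's
    # oldest and newest tweet but did not.
    missing = sorted((i for i in stored if lo <= i <= hi and i not in page),
                     reverse=True)
    # Tweets on the page that the archive has not seen before.
    new = sorted((i for i in page if i not in stored), reverse=True)
    return missing, new
```

An ID in `missing` is only a candidate: it may have been unfavorited, or the API may simply not have served it, so each one still needs the individual check described above.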


 -Amy



Re: [twitter-dev] Re: Search API questions

2009-12-02 Thread AJ Chen
unless I'm missing something, it's usually the user's responsibility to dedup
returned tweets on the client side. if you see duplicates between two feeds,
just remove the duplicates. a client application should do this in
any case.
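
A minimal sketch of that client-side dedup, in Python. It assumes each result has already been parsed into a dict with an "id" key (a hypothetical shape; the Atom feed would need to be parsed into something like it first), and `dedupe_tweets` is an illustrative name.

```python
def dedupe_tweets(batches):
    """Yield each tweet once, keyed by its ID, across several result batches."""
    seen = set()
    for batch in batches:
        for tweet in batch:
            tweet_id = tweet["id"]
            if tweet_id in seen:
                continue  # already delivered in an earlier batch
            seen.add(tweet_id)
            yield tweet
```

For a long-running poller the `seen` set grows without bound, so in practice it would need to be trimmed, for example by keeping only IDs from the last few polls.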

if you see no fresh tweets but only old tweets, it's possible
that twitter is returning only cached results because your api calls exceed
the rate limit. I'm not sure, though.  does anyone know the rate limit for
using the search feed
http://search.twitter.com/search.atom<http://search.twitter.com/search.atom?geocode=19.017656%2C72.856178%2.>
?

-aj

On Tue, Dec 1, 2009 at 8:49 PM, enygmatic  wrote:

> Hi, Raffi
> Were you able to raise the cache issue with the search team?
> Seems the problem is worse than I thought. I have run my script
> (getting 25 results from search every 15 minutes, for Mumbai) for two
> days. The first day had 71% duplicate results due to the caching
> issue, while the second day fetched an amazing 90% duplicates. With
> these kind of results, I think it’s probably quite useless for me to
> even use the search API .
> So would appreciate if you could let me know if there is a chance that
> this issue may be resolved in the near future or if location specific
> streams would be available via the streaming API anytime soon. I
> understand that the twitter dev team has a lot on its hands, so it
> would be understandable if this isn’t anywhere in the list of features
> they intend to ship out in the near future. However, would definitely
> appreciate it if you could let me know if anything could be done or
> not.
> Thanks and Regards,
> Elroy Serrao
>
>
> On Nov 28, 7:45 pm, Raffi Krikorian  wrote:
> > unfortunately, there is no (current) way to subscribe to the streaming
> > API for a particular location.  as for the caching issue on the
> > search, that's unfortunate, and i'll try to raise the issue with the
> > search team next week.
> >
> >
> >
> >
> >
> > > @Abraham
> > > I actually use the geocode with the search api for my script, so using
> > > the search api isn't my problem. My problem is that I get "stale"
> > > results from the search cache, even when querying after a sufficient
> > > interval. Also the "stale" results seem hours old (at times, in fact
> > > yesterday at 23:00 hours I got a few results that were from
> > > 22:00-22:30 hours. Didn't have the problem when using twitter search
> > > from the browser). To overcome this Raffi Krikorian suggested using
> > > the streaming api instead of the search api. My question was - how do
> > > i get a location specific stream using the streaming api. From the
> > > streaming api docs, there doesn't seem a way to do this at the moment,
> > > which kind of defeats my purpose as I need to the deploy the script in
> > > the next one week or so. Guess I'll have to live with the stale
> > > results...
> >
> > > Anyway thanks for the help.
> >
> > > On Nov 28, 12:40 am, Abraham Williams <4bra...@gmail.com> wrote:
> > >> On Fri, Nov 27, 2009 at 12:38, enygmatic  wrote:
> > >>> From what I have
> > >>> gone through so far, there doesn't seem to be a way to query for
> > >>> status updates from a certain geographical location, say limited
> > >>> to a
> > >>> city. I may be mistaken here, so do correct me if I am wrong.
> >
> > >> Check out the search operators:http://search.twitter.com/operators
> >
> > >> For example:http://search.twitter.com/search?q=near:NYC+within:15mi
> >
> > >> Abraham
> > >> --
> > >> Abraham Williams | Community Evangelist |http://web608.org
> > >> Hacker |http://abrah.am|http://twitter.com/abraham
> > >> Project | Awesome Lists |http://twitterli.st
> > >> This email is: [ ] blogable [x] ask first [ ] private.
> > >> Sent from Madison, WI, United States
> >
> > --
> > Raffi Krikorian
> > Twitter Platform Team
> > ra...@twitter.com | @raffi
>



-- 
AJ Chen, PhD
Chair, Semantic Web SIG, sdforum.org
http://web2express.org
@web2express on twitter
Palo Alto, CA, USA
650-283-4091
*Monitor realtime web and follow trending topics with semantic intelligence*


[twitter-dev] Re: API 140 character truncation change?

2009-10-23 Thread AJ Chen
then, comparing the front part (without url at the end) of the status is
probably sufficient. -aj
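
A rough sketch of that comparison in Python: drop a URL at the end of each text, then compare what is left. The function names are illustrative, and it only handles a single URL at the very end of the status.

```python
import re

# One URL at the very end of the text, with any surrounding whitespace.
_TRAILING_URL = re.compile(r"\s*https?://\S+\s*$")


def strip_trailing_url(text):
    """Return text without a URL at its end, if there is one."""
    return _TRAILING_URL.sub("", text).strip()


def looks_like_same_status(sent, returned):
    """Compare two statuses while ignoring a trailing (possibly rewritten) URL."""
    return strip_trailing_url(sent) == strip_trailing_url(returned)
```

This will report a mismatch if the service alters the text in any other way, so a mismatch is a hint to look closer rather than proof of a rejected update.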

On Fri, Oct 23, 2009 at 3:07 PM, Dewald Pretorius  wrote:

>
> You cannot compare the status sent with the status returned when the
> status contains an URL. The returned status contains Twitter's own
> bit.ly shortened URL instead of the URL your status sent had.
>
> Dewald
>
> On Oct 23, 6:24 pm, AJ Chen  wrote:
> > I noticed this behavior a long time ago (maybe a month) and reported the
> > problem on this list, but it did not get any response from the api team. I
> > thought it was a bug, but just realized yesterday that the api probably
> > ignores 140+ chars status update intentionally. but I'm not sure this is
> > the policy or temporary tactic to reduce workload on api. it would be good
> > if the api team could clarify this issue. to check if this happens or not,
> > you can compare the status sent to api and the status returned from api in
> > your application code.
> >  -aj
> >
> >
> >
> > On Fri, Oct 23, 2009 at 1:51 PM, Naveen  wrote:
> >
> > > Here are two threads related to this issue.
> >
> > >http://groups.google.com/group/twitter-development-talk/browse_thread.
> ..
> >
> > >http://groups.google.com/group/twitter-development-talk/browse_thread.
> ..
> >
> > > It is an inconvenient change, not because they changed it, but because
> > > they did not announce that the change was happening.
> >
> > > On Oct 21, 5:37 am, Dave Sherohman  wrote:
> > > > On Tue, Oct 20, 2009 at 07:37:03AM -0700, James Tymann wrote:
> > > > > Has anyone else noticed a change in the way that the 140 character
> > > > > limit is enforced via the API? I noticed a change sometime between the
> > > > > 13th and the 16th that is now causing all my 140+ character posts to
> > > > > be rejected by the API.
> > > > > Also a side note is that the api is not returning errors, they return
> > > > > proper responses however they are the proper response for the current
> > > > > status of the account, not the new status that was just attempted to
> > > > > be posted.
> >
> > > > My users first reported issues arising from this on the 15th, although I
> > > > didn't identify the cause until the 17th, at which point I asked about
> > > > it in #Net::Twitter and Marc Mims brought the question here under the
> > > > subject line "Bug? Updates > 140 characters return success with prior
> > > > update  payload".  See the discussion under that thread for more on it,
> > > > but the overall upshot is:
> >
> > > > - This is an intentional (if poorly-announced) change, not a bug.
> > > > - Status updates are known to be getting silently rejected in this
> > > >   manner both due to exceeding 140 characters and due to violation of
> > > >   the expanded "no duplicates" policy.
> > > > - Twitter has not stated whether there are any additional circumstances
> > > >   beyond those two cases in which updates will be silently rejected.
> > > > - Twitter has not stated any plans regarding adding an indicator for
> > > >   when a "200 OK" status update has, in fact, been rejected.
> >
> > > > I am attempting to compensate for this change by checking the returned
> > > > status ID against the previous highest-seen ID to determine whether the
> > > > status returned with the "200 OK" response is a new one or the user's
> > > > pre-existing status.  This seems to work, but does not indicate the
> > > > reason for the silent failure, so I can't report the cause to my users.
> > > > Andy Freeman has mentioned that, in the case of rejection due to
> > > > duplication, this is also unsatisfactory in that it does not allow him
> > > > to identify the original status which was duplicated.
> >
> > > > --
> > > > Dave Sherohman
> >
> > --
> > AJ Chen, PhD
> > Chair, Semantic Web SIG, sdforum.org
> > http://web2express.org
> > Palo Alto, CA




-- 
AJ Chen, PhD
Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: API 140 character truncation change?

2009-10-23 Thread AJ Chen
I noticed this behavior a long time ago (maybe a month) and reported the
problem on this list, but it did not get any response from the api team. I
thought it was a bug, but just realized yesterday that the api probably
ignores 140+ chars status update intentionally. but I'm not sure this is
the policy or temporary tactic to reduce workload on api. it would be good
if the api team could clarify this issue. to check if this happens or not,
you can compare the status sent to api and the status returned from api in
your application code.
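
Besides comparing the text, the message from Dave Sherohman quoted below describes another check: compare the status ID in the "200 OK" response with the highest ID seen so far. A minimal Python sketch of that idea follows; `UpdateChecker` is a hypothetical helper, and it assumes status IDs only increase over time, as that workaround implies.

```python
class UpdateChecker:
    """Track the highest status ID seen to spot silently rejected updates."""

    def __init__(self, highest_seen_id=0):
        self.highest_seen_id = highest_seen_id

    def accepted(self, returned_id):
        """Return True if returned_id is newer than anything seen so far."""
        if returned_id > self.highest_seen_id:
            self.highest_seen_id = returned_id
            return True
        # Same or older ID: the response carried the pre-existing status.
        return False
```

As noted in the quoted message, this tells you that an update was dropped but not why.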
 -aj

On Fri, Oct 23, 2009 at 1:51 PM, Naveen  wrote:

>
> Here are two threads related to this issue.
>
> http://groups.google.com/group/twitter-development-talk/browse_thread/thread/cd95ce07be341223/66c66de585383868#66c66de585383868
>
> http://groups.google.com/group/twitter-development-talk/browse_thread/thread/3d6a727892710d5e#
>
> It is an inconvenient change, not because they changed it, but because
> they did not announce that the change was happening.
>
> On Oct 21, 5:37 am, Dave Sherohman  wrote:
> > On Tue, Oct 20, 2009 at 07:37:03AM -0700, James Tymann wrote:
> > > Has anyone else noticed a change in the way that the 140 character
> > > limit is enforced via the API? I noticed a change sometime between the
> > > 13th and the 16th that is now causing all my 140+ character posts to
> > > be rejected by the API.
> > > Also a side note is that the api is not returning errors, they return
> > > proper responses however they are the proper response for the current
> > > status of the account, not the new status that was just attempted to
> > > be posted.
> >
> > My users first reported issues arising from this on the 15th, although I
> > didn't identify the cause until the 17th, at which point I asked about
> > it in #Net::Twitter and Marc Mims brought the question here under the
> > subject line "Bug? Updates > 140 characters return success with prior
> > update  payload".  See the discussion under that thread for more on it,
> > but the overall upshot is:
> >
> > - This is an intentional (if poorly-announced) change, not a bug.
> > - Status updates are known to be getting silently rejected in this
> >   manner both due to exceeding 140 characters and due to violation of
> >   the expanded "no duplicates" policy.
> > - Twitter has not stated whether there are any additional circumstances
> >   beyond those two cases in which updates will be silently rejected.
> > - Twitter has not stated any plans regarding adding an indicator for
> >   when a "200 OK" status update has, in fact, been rejected.
> >
> > I am attempting to compensate for this change by checking the returned
> > status ID against the previous highest-seen ID to determine whether the
> > status returned with the "200 OK" response is a new one or the user's
> > pre-existing status.  This seems to work, but does not indicate the
> > reason for the silent failure, so I can't report the cause to my users.
> > Andy Freeman has mentioned that, in the case of rejection due to
> > duplication, this is also unsatisfactory in that it does not allow him
> > to identify the original status which was duplicated.
> >
> > --
> > Dave Sherohman
>



-- 
AJ Chen, PhD
Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] api error when updating statuses?

2009-09-03 Thread AJ Chen
weird response from status update: I update my status with different tweets,
but for a period of time the status returned in the response is always the same,
wrong status. for example,
2009-09-03 21:30:51,149 DEBUG report.TweetMgr (TweetMgr.java:tweet(63)) -
newsweb2x tweeted: Shuttle X500V All-In-One Desktop PC. Shuttle is back with
its new all-in-one desktop PC by releasing the X500V. Mea
http://feedproxy.google.com/%7Er/techfresh/%7E3/BpmQzdgy7EM/
responded status: Re: Are You Serious About Twitter?. Hi David. I have
described earlier today in my Blog a Twitter based application
http://bit.ly/370jK
2009-09-03 21:30:51,368 DEBUG report.TweetMgr (TweetMgr.java:tweet(63)) -
newsweb2x tweeted: Poker Chip Set - 100 Chips. A custom set of 100 Poker
Chips. Includes: 20 x Red 20 x Blue 20 x Green 20 x Black 20
http://www.ponoko.com/showroom/PlaySmart/3426
responded status: Re: Are You Serious About Twitter?. Hi David. I have
described earlier today in my Blog a Twitter based application
http://bit.ly/370jK
2009-09-03 21:30:51,368 DEBUG report.FeedReporter
(FeedReporter.java:report(113)) - report feed items: 2/60

I verified on my twitter home page that the two tweets I tried to post are
not there, but the wrong one is there.
is there a bug in the api? or does the api do this intentionally for some unknown
policy reason?
thanks,

-aj
-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: following twitter conversations by topics

2009-08-05 Thread AJ Chen
please check the "about" page on http://web2express.org. basically, it uses
OpenCalais semantic analysis service as well as OpenNlp tools. I like these
free tools. Another one is the stanford parser, but it's slower.  The key is
to make the whole process of auto-discovery superfast so that the twitter
streaming data can be analyzed in real time as they come in.
-aj

On Wed, Aug 5, 2009 at 12:21 PM, Scott Haneda  wrote:

>
> Can you tell me more about this auto topic discovery feature?  I am not
> seeing anything of that nature on the twitter Web site at all.
>
>
> On Aug 5, 2009, at 2:29 AM, AJ Chen wrote:
>
>  After playing around with auto-discovery of topics in twitter
>> conversations for a while, it seems to me that following topics is another
>> effective way to communicate on twitter. So, I've added a new set of
>> features on http://web2express.org website to make it easy for people to
>> follow and tweet about topics. The auto-discovered topics give you a fairly
>> good starting point to read about current new hot topics. In addition, one
>> can create any topic to follow. By adding of a set of keywords or phrases to
>> the topic, you will find that topic following pulls out more complete list
>> of conversations from twitter with much less work. When you tweet or retweet
>> about a topic, the topic's hashtag is automatically appended to your message
>> so that you don't need to remember what hashtag to use.
>>
>> I notice several websites including twitter.com homepage have started to
>> provide similar functions recently. Anyone has any early experience or
>> comment to share?
>>
>
> --
> Scott * If you contact me off list replace talklists@ with scott@ *
>
>


-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] following twitter conversations by topics

2009-08-05 Thread AJ Chen
After playing around with auto-discovery of topics in twitter conversations
for a while, it seems to me that following topics is another effective way
to communicate on twitter. So, I've added a new set of features on
http://web2express.org website to make it easy for people to follow and
tweet about topics. The auto-discovered topics give you a fairly good
starting point to read about current new hot topics. In addition, one can
create any topic to follow. By adding a set of keywords or phrases to the
topic, you will find that topic following pulls out a more complete list of
conversations from twitter with much less work. When you tweet or retweet
about a topic, the topic's hashtag is automatically appended to your message
so that you don't need to remember what hashtag to use.

I notice several websites including twitter.com homepage have started to
provide similar functions recently. Does anyone have any early experience or
comments to share?

-aj
-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: "id" field is missing in status from streaming API frequently

2009-07-29 Thread AJ Chen
thank you for the fix. you rock.
-aj

2009/7/29 H12山本 裕介 

>
> Fixed.
> http://yusuke.homeip.net/hudson/job/Twitter4J/296/
> Please try the latest build.
> http://yusuke.homeip.net/maven2/net/homeip/yusuke/twitter4j/2.0.9-SNAPSHOT/
> Now T4J ignores deleted tweets.
>
> Cheers,
> --
> Yusuke Yamamoto
> yus...@mac.com
>
> this email is: [x] bloggable/twittable [ ] ask first [ ] private
> follow me on : http://twitter.com/yusukeyamamoto
> subscribe me at : http://yusuke.homeip.net/blog/
>
> On Jul 24, 9:15 pm, AJ Chen  wrote:
> > John, thanks.
> >
> > Yusuke, it may be a good idea for twitter4j library to exclude the deleted
> > statuses as they are received. currently, twitter4j throws an exception for
> > them, which is less informative. thanks.
> >
> > -aj
> >
> >
> >
> >
> >
> > On Fri, Jul 24, 2009 at 3:20 PM, John Kalucki  wrote:
> >
> > > It appears that you are treating status deletions as statuses.
> >
> > > -John Kalucki
> > >http://twitter.com/jkalucki
> > > Services, Twitter Inc.
> >
> > > On Jul 24, 3:18 pm, AJ Chen  wrote:
> > > > twitter streaming api has lots of statuses missing id?
> > > > the following exception appears almost continuously in my log. it indicates
> > > > the "id" field is missing in status from streaming API.
> >
> > > > twitter4j.TwitterException: JSONObject["id"] not
> > > > found.:{"delete":{"status":{"id":2813410502,"user_id":47157439}}}
> > > > twitter4j.TwitterException: JSONObject["id"] not
> > > > found.:{"delete":{"status":{"id":2812385903,"user_id":54420955}}}
> >
> > > > thanks,
> > > > -aj
> > > > --
> > > > AJ Chen, PhD
> > > > Co-Chair, Semantic Web SIG, sdforum.org
> > > > http://web2express.org
> > > > Palo Alto, CA
> >
> > --
> > AJ Chen, PhD
> > Co-Chair, Semantic Web SIG, sdforum.org
> > http://web2express.org
> > Palo Alto, CA
>



-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: "id" field is missing in status from streaming API frequently

2009-07-24 Thread AJ Chen
John, thanks.

Yusuke, it may be a good idea for twitter4j library to exclude the deleted
statuses as they are received. currently, twitter4j throws an exception for
them, which is less informative. thanks.

-aj

On Fri, Jul 24, 2009 at 3:20 PM, John Kalucki  wrote:

>
> It appears that you are treating status deletions as statuses.
>
> -John Kalucki
> http://twitter.com/jkalucki
> Services, Twitter Inc.
>
>
> On Jul 24, 3:18 pm, AJ Chen  wrote:
> > twitter streaming api has lots of statuses missing id?
> > the following exception appears almost continuously in my log. it indicates
> > the "id" field is missing in status from streaming API.
> >
> > twitter4j.TwitterException: JSONObject["id"] not
> > found.:{"delete":{"status":{"id":2813410502,"user_id":47157439}}}
> > twitter4j.TwitterException: JSONObject["id"] not
> > found.:{"delete":{"status":{"id":2812385903,"user_id":54420955}}}
> >
> > thanks,
> > -aj
> > --
> > AJ Chen, PhD
> > Co-Chair, Semantic Web SIG, sdforum.org
> > http://web2express.org
> > Palo Alto, CA




-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] "id" field is missing in status from streaming API frequently

2009-07-24 Thread AJ Chen
twitter streaming api has lots of statuses missing id?
the following exception appears almost continuously in my log. it indicates
the "id" field is missing in status from streaming API.

twitter4j.TwitterException: JSONObject["id"] not
found.:{"delete":{"status":{"id":2813410502,"user_id":47157439}}}
twitter4j.TwitterException: JSONObject["id"] not
found.:{"delete":{"status":{"id":2812385903,"user_id":54420955}}}
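
The payloads in the two log lines above have a top-level "delete" key rather than a top-level "id", so a stream consumer has to tell the two message shapes apart before reading "id". A minimal Python sketch of that check follows. `classify_stream_line` is an illustrative name, treating blank lines as keep-alives is an assumption, and only the two shapes shown here are handled.

```python
import json


def classify_stream_line(line):
    """Return (kind, status_id) for one line read from the stream."""
    line = line.strip()
    if not line:
        # Assumption: blank lines are keep-alives and carry no data.
        return ("keepalive", None)
    message = json.loads(line)
    if "delete" in message:
        # Deletion notice: the ID sits under delete.status, not at top level.
        return ("delete", message["delete"]["status"]["id"])
    if "id" in message:
        return ("status", message["id"])
    return ("unknown", None)
```

A client can then drop the deleted ID from its own store instead of raising an exception.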

thanks,
-aj
-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] twitter app event in silicon valley on July 1

2009-06-26 Thread AJ Chen
The program for SDForum's July 1 event on twitter apps is finalized. Please
see below or the semantic web SIG website
<http://www.sdforum.org/index.cfm?fuseaction=Page.viewPage&pageId=656&parentID=483&nodeID=1>.
Thanks to all the developers who will share their cool projects with the
community.

Topic: Hacking the Semantics of Twitter

6:30pm - 9pm July 1, 2009
Cubberley Community Center
4000 Middlefield Rd., RM H-1
Palo Alto, CA

Agenda:

6:30pm-7:00pm
Registration / Networking / Refreshments / Pizza

7:00pm-7:20pm
Doug Williams: Twitter API introduction.  http://twitter.com

7:20pm-7:40pm
Kevin Boer: Where are the breaking news in Twitter conversation?
http://breakingsfnews.com/

7:40pm-8:00pm
AJ Chen: Using Open Calais and openNLP tools to study tweets in real time.
http://web2express.org/

8:00pm-8:20pm
Christopher Peri: Managing incoming tweets. http://www.twittfilter.com/

8:20pm-8:40pm
Patrick Nicolas: Real-time semantic dashboard for Twitter

8:40pm-9:00pm
Sid Gabriel Hubbard: "tweegeo" joining LOD with twitter users.
http://cloudstem.com/

-aj

-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: best practice for dealing with streaming api connection close

2009-06-23 Thread AJ Chen
I use two levels of control, which seems to work smoothly.
1. when an exception is thrown, check if it's the type that results from
a dropped connection, i.e. IOException or HTTP code=4xx or error message;
reconnect only if this is true. there may be many other types of exceptions,
but don't reconnect in those cases. Normally, I only notice a couple of
disconnections a day.
2. set a max number of reconnections; when reaching the max, don't
auto-reconnect, but require a manual reconnect instead.  This way, if the
api server goes wrong, you won't bombard the server.
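
The two levels above can be sketched roughly as follows, in Python rather than the Java used with twitter4j. Everything here is illustrative: `connect` and `is_connection_error` are hypothetical callables supplied by the caller, and the default delays (start at 250 ms, double, cap at 120 s) are borrowed from the HTTP-error advice in John Kalucki's message quoted below.

```python
import time


def backoff_delays(start, cap, max_attempts):
    """Yield max_attempts delays in seconds, doubling from start up to cap."""
    delay = start
    for _ in range(max_attempts):
        yield delay
        delay = min(delay * 2, cap)


def reconnect(connect, is_connection_error, max_attempts=5,
              start=0.25, cap=120.0, sleep=time.sleep):
    """Call connect() until it succeeds or max_attempts is used up."""
    for delay in backoff_delays(start, cap, max_attempts):
        try:
            return connect()
        except Exception as exc:
            # Level 1: only retry errors that look like a dropped connection.
            if not is_connection_error(exc):
                raise
            sleep(delay)
    # Level 2: stop after the limit and leave the next attempt to a human.
    raise RuntimeError("reconnect limit reached; reconnect manually")
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting. For TCP or IP level errors the same function could be called with the shorter values from the quoted advice (start=0.02, cap=15.0).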

-aj

On Tue, Jun 23, 2009 at 3:49 PM, danielo  wrote:

>
> I had a similar question. I think you've mostly answered it, but I
> want to be clear so as to avoid harassing the API.
>
> I'm developing a client to connect to the streaming API (nothing fancy
> at the moment; just spritzer), and of course, I'm bungling it up
> regularly. I'll hack a bunch, try it, watch it break, shut it down,
> and hack some more. Is there a practical limit at which point I should
> apply the human throttle-back? Or is there no realistic human limit at
> which I risk a ban from the streaming service? I imagine that if a 15-
> second wait period is sufficient to avoid bad things, the more likely
> 1-to-2-minute wait between my attempts will be fine. I ask,
> nonetheless, as my repeated requests will persist for the duration of
> my work, whereas a running client would (hopefully) snag a valid
> connection after some time and stop "spamming" at that point.
>
> Thanks!
>
> On Jun 14, 8:14 pm, John Kalucki  wrote:
> > AJ,
> >
> > If you had a valid connection and the connection drops, reconnect
> > immediately. This is encouraged!
> >
> > If you attempt a connection and get a TCP or IP level error, back off
> > linearly, but cap the backoff to something fairly short. Perhaps start
> > at 20 milliseconds, double, and cap at 15 seconds. There's probably a
> > transitory network problem and it will probably clear up quickly.
> >
> > If you get a HTTP error (4XX), backoff linearly, but cap the backoff
> > at something longer, perhaps start at 250 milliseconds, double, and
> > cap at 120 seconds. Whatever has caused the issue isn't going away
> > anytime soon. There's not much point in polling any faster and you are
> > just more likely to run afoul of some rate limit.
> >
> > The service is fairly lenient. You aren't going to get banned for a
> > few dozen bungled connections here and there. But, if you do anything
> > in a while loop that also doesn't have a sleep, you'll eventually get
> > the hatchet for some small number of minutes. If you get the hatchet
> > repeatedly, you'll be cut off for an indeterminate period of time.
> >
> > There are four main reasons to have your connection closed:
> > * Duplicate clients logins (earlier connections terminated)
> > * Hosebird server restarts (code deploys)
> > * Lagging connection getting thrown off (client too slow, or
> > insufficient bandwidth)
> > * General Twitter network maintenance (Load balancer restarts, network
> > reconfigurations, other very very rare events)
> >
> > We plan to have enough spare capacity on the surviving servers to
> > absorb the load from server restarts. You must ensure that your client
> > is fast enough and that you have sufficient bandwidth and a stable
> > enough connection to consume your stream. I usually see connections
> > that survive for a few days before mysteriously being dropped. Just
> > reconnect in these cases.
> >
> > -John Kalucki
> > Services, Twitter Inc.
> >
> > On Jun 14, 3:31 pm, AJ  wrote:
> >
> > > The streaming api is great, but it sometimes closes the connection for
> > > whatever reason. my realtime system must figure out when to reconnect
> > > automatically.  the auto-reconnection can't blindly request a
> > > connection whenever it is not connected, otherwise it will flood the
> > > api and may cause the api to ban or refuse the user's request. it's
> > > bad to bombard the api server with repeated connection requests.
> > > Could the api team recommend some best practice for dealing with
> > > auto-reconnection?
> >
> > > maybe certain error code or error message can indicate the cause of
> > > dropping connection and wait time for next connection request. I just
> > > got a long list of exceptions from streaming api as a result of repeated
> > > connection, and the different messages are:
> >
> > > twitter4j.TwitterException: Address already in use: connect
> > > twi

[twitter-dev] update: RSS feeds of twitter topics now available from web2express.org

2009-06-23 Thread AJ Chen
Update: I just added RSS feed links on web2express.org digest website. The
RSS feeds provide up to 100 new topics from today's twitter conversations or
tweets in the last 3 days or 7 days. Twitter.com search gives you 10 top
trending topics.  If you want more top twitter topics, you may get the rss
feeds from web2express.org.
-aj

-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: airline accident case study

2009-06-20 Thread AJ Chen
I'm talking about any trending topics. because it's not trivial to
distinguish news topics from conversation topics, I don't make such a
distinction when making the above comparison. I doubt twitter or google
makes such a distinction either. Just to clarify: do you imply that the
trending topics on twitter.com or google trends are a type of "conversation
topic"?
thanks,
-aj

On Fri, Jun 19, 2009 at 3:18 PM, Andrew Badera  wrote:

> Just because something's a trending news topic, doesn't guarantee, or
> necessarily even imply, that it's a trending topic of conversation ...
>
>
>
> On Fri, Jun 19, 2009 at 5:04 PM, AJ Chen  wrote:
>
>> From user perspective, it's useful if a trending app can pick up new hot
>> topics as they are emerging, particularly for the rather distinct events
>> like airline accident. this is one of the main design principles I have for
>> my twitter digest app. now, whether a new topic should be considered as
>> trending topic may vary a lot among the various trending applications, which
>> depends on detection sensitivity and policy. I'm sure Twitter guys spotted
>> the airline accident, but it did not make it to the top 10 list. At google
>> trends, the signal may be too low to be detected because they are dealing
>> with much larger volumes.
>>
>> I'm just trying to understand the difference between different services by
>> looking at some real cases. another good case study is today's recall of
>> sour dough.  as shown on the daily new topics on http://web2express.org,
>> it emerged out at 8:40am shortly after AP reported the news.  I consider it
>> a new trending topic interesting to consumers. but, it does not make it to
>> Twitter.com top 10 topics. It did show up on google trends today.
>>
>> -aj
>>
>>
>>
>> On Fri, Jun 19, 2009 at 11:51 AM, David Fisher  wrote:
>>
>>>
>>> Topics don't just trend because it's something 'important'. Now if it
>>> was of significantly larger volume than other topics (like the
>>> iphone's launch today), then that is rather interesting, but from what
>>> I can tell it's mostly the most popular things floating to the top
>>> generally, plus some spam-filtering. I haven't figured out the
>>> exact mechanism for when something hits trending, but it's not
>>> rocket science either.
>>>
>>> Maybe I'm missing your point
>>>
>>> -...@tibbon
>>>
>>> On Jun 19, 2:35 am, Bjoern  wrote:
>>> > On Jun 19, 7:00 am, AJ  wrote:
>>> >
>>> > > This case study shows the difference between various trending
>>> > > applications. A good real time semantic analysis is the key that
>>> > > makes the difference, I think.
>>> >
>>> > Maybe I misunderstood, but isn't the more likely explanation that the
>>> > topic simply wasn't trending?
>>> >
>>> > Björn
>>>
>>
>>
>>
>> --
>> AJ Chen, PhD
>> Co-Chair, Semantic Web SIG, sdforum.org
>> http://web2express.org
>> Palo Alto, CA
>>
>
>


-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: airline accident case study

2009-06-19 Thread AJ Chen
From a user perspective, it's useful if a trending app can pick up new hot
topics as they are emerging, particularly for rather distinct events
like an airline accident. This is one of the main design principles I have for
my twitter digest app. Now, whether a new topic should be considered a
trending topic may vary a lot among the various trending applications,
depending on detection sensitivity and policy. I'm sure the Twitter guys spotted
the airline accident, but it did not make it to the top 10 list. At google
trends, the signal may be too low to be detected because they are dealing
with much larger volumes.

I'm just trying to understand the difference between different services by
looking at some real cases. Another good case study is today's recall of
sour dough.  As shown on the daily new topics on http://web2express.org, it
emerged at 8:40am shortly after AP reported the news.  I consider it a
new trending topic interesting to consumers. But it did not make it to the
Twitter.com top 10 topics. It did show up on google trends today.

-aj


On Fri, Jun 19, 2009 at 11:51 AM, David Fisher  wrote:

>
> Topics don't just trend because it's something 'important'. Now if it
> was of significantly larger volume than other topics (like the
> iphone's launch today), then that is rather interesting, but from what
> I can tell it's mostly the most popular things floating to the top
> generally, plus some spam-filtering. I haven't figured out the
> exact mechanism for when something hits trending, but it's not
> rocket science either.
>
> Maybe I'm missing your point
>
> -...@tibbon
>
> On Jun 19, 2:35 am, Bjoern  wrote:
> > On Jun 19, 7:00 am, AJ  wrote:
> >
> > > This case study shows the difference between various trending
> > > applications. A good real time semantic analysis is the key that makes
> > > the difference, I think.
> >
> > Maybe I misunderstood, but isn't the more likely explanation that the
> > topic simply wasn't trending?
> >
> > Björn
>



-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] airline accident case study

2009-06-18 Thread AJ

It happened again. I just can't help sharing this case study:
"Continental airlines incident" was emerging at about 8:29am this
morning on the twitter daily new topic list on my http://web2express.org
website. It was probably at the same time as other major news outlets
broke the news, maybe even a little bit earlier.  Surprisingly, this
top news did not show up on Twitter.com's trending topics list, nor on
google trends at all for the whole day. (I just checked and compared
the lists)

This case study shows the difference between various trending
applications. A good real time semantic analysis is the key that makes
the difference, I think.

-aj
--
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] comparing trending twitter topics

2009-06-15 Thread AJ Chen
Hi, there are several resources providing trending topics: twitter search,
google trends, yahoo buzz, and web2express digest. Because they use
different algorithms on different contents, it may not be fair to make
direct comparison. But, just looking at the outcomes, it's helpful to know
how they differ.

Twitter provides only top 10 trending topics from tweets. It feels like a
teaser to me. Maybe more comprehensive list will come out in the future.

Web2express digest gives out all identified topics from twitter stream in
real time (thanks to twitter API). To answer different questions from
different user groups, it presents fresh topics in several ways. You can see
new topics just emerging today, or daily new topics for the last few days,
or even all of today's topics (including not-so-exciting ones). The topics
are sorted by scores by default, but you can also sort the topics by time to
spot latest topics.

Google trends is based on user search queries and/or the web page index.  It
provides the top 100 topics. It seems to me that it requires lots of content
to be generated before topics can be calculated. For example, there are no
index-based trend lines for many twitter topics because there is not enough
relevant data in the index.

Yahoo buzz is based on user search queries and maybe buzzup content. There
are several categories of buzz as well as overall leaders.

I'm just starting to look at these differences, but my immediate feeling is
that they are quite complementary to each other. Combining all of them on
one web page would give users a better view than any individual source could.
This is the new page I added earlier today to the web2express digest web
site <http://web2express.org>.
For each topic, I included links to search the various sources. From the
page, I find it very easy to figure out what a weird twitter topic is about
by clicking the links to twitter search, google search and google trends.

cheers,
-aj
-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: best practice for dealing with streaming api connection close

2009-06-14 Thread AJ Chen
John, great information. thanks a lot. I'll put in a proper wait time before
next re-connection.
-aj

On Sun, Jun 14, 2009 at 8:14 PM, John Kalucki  wrote:

>
> AJ,
>
> If you had a valid connection and the connection drops, reconnect
> immediately. This is encouraged!
>
> If you attempt a connection and get a TCP or IP level error, back off
> exponentially, but cap the backoff to something fairly short. Perhaps start
> at 20 milliseconds, double, and cap at 15 seconds. There's probably a
> transitory network problem and it will probably clear up quickly.
>
> If you get an HTTP error (4XX), back off exponentially, but cap the backoff
> at something longer, perhaps start at 250 milliseconds, double, and
> cap at 120 seconds. Whatever has caused the issue isn't going away
> anytime soon. There's not much point in polling any faster and you are
> just more likely to run afoul of some rate limit.
>
> The service is fairly lenient. You aren't going to get banned for a
> few dozen bungled connections here and there. But, if you do anything
> in a while loop that also doesn't have a sleep, you'll eventually get
> the hatchet for some small number of minutes. If you get the hatchet
> repeatedly, you'll be cut off for an indeterminate period of time.
>
> There are four main reasons to have your connection closed:
> * Duplicate client logins (earlier connections terminated)
> * Hosebird server restarts (code deploys)
> * Lagging connection getting thrown off (client too slow, or
> insufficient bandwidth)
> * General Twitter network maintenance (Load balancer restarts, network
> reconfigurations, other very very rare events)
>
> We plan to have enough spare capacity on the surviving servers to
> absorb the load from server restarts. You must ensure that your client
> is fast enough and that you have sufficient bandwidth and a stable
> enough connection to consume your stream. I usually see connections
> that survive for a few days before mysteriously being dropped. Just
> reconnect in these cases.
>
> -John Kalucki
> Services, Twitter Inc.
>
>
> On Jun 14, 3:31 pm, AJ  wrote:
> > The streaming api is great, but it sometimes closes the connection for
> > whatever reason. my realtime system must figure out when to reconnect
> > automatically.  the auto-reconnection can't blindly request a
> > connection whenever it is not connected, otherwise it will floor the
> > api and may cause the api to ban or refuse the user's request. it's
> > bad to bombard the api server with repeated connection requests.
> > Could the api team recommend some best practice for dealing with auto-
> > reconnection?
> >
> > maybe certain error code or error message can indicate the cause of
> > dropping connection and wait time for next connection request. I just
> > a long list of exceptions from streaming api as a result of repeated
> > connection, and the different messages are:
> >
> > twitter4j.TwitterException: Address already in use: connect
> > twitter4j.TwitterException: Authentication credentials were missing or
> > incorrect.
> > twitter4j.TwitterException: Connection refused: connect
> > twitter4j.TwitterException: No route to host: connect
> > twitter4j.TwitterException: Stream closed.
> > twitter4j.TwitterException: The request is understood, but it has been
> > refused.  An accompanying error message will explain why.
> > twitter4j.TwitterException: connect timed out
> >
> > How to prevent such situation of repeated connections requests?
> >
> > thanks,
> > aj
>



-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
Technical Architect, healthline.com
http://web2express.org
Palo Alto, CA
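
A minimal Java sketch of the reconnect delays John describes above: start with a short delay, double it after each failed attempt, and stop growing at a cap. The class and method names (ReconnectBackoff, forNetworkErrors, forHttpErrors) are made up for illustration and are not part of twitter4j or the Streaming API; only the 20 ms / 15 s and 250 ms / 120 s figures come from the message. A connection that was valid and then dropped would skip this and reconnect immediately.

```java
/** Sketch: doubling reconnect delays with a cap, one instance per failure type. */
class ReconnectBackoff {
    private final long startMs;
    private final long capMs;
    private long nextMs;

    ReconnectBackoff(long startMs, long capMs) {
        this.startMs = startMs;
        this.capMs = capMs;
        this.nextMs = startMs;
    }

    /** TCP/IP-level failures: start at 20 ms, cap at 15 s (figures from the quoted advice). */
    static ReconnectBackoff forNetworkErrors() {
        return new ReconnectBackoff(20, 15_000);
    }

    /** HTTP 4XX failures: start at 250 ms, cap at 120 s (figures from the quoted advice). */
    static ReconnectBackoff forHttpErrors() {
        return new ReconnectBackoff(250, 120_000);
    }

    /** Returns the delay to wait before the next attempt, then doubles it up to the cap. */
    long nextDelayMs() {
        long delay = nextMs;
        nextMs = Math.min(capMs, nextMs * 2);
        return delay;
    }

    /** Call once a connection succeeds so the next failure starts from the short delay again. */
    void reset() {
        nextMs = startMs;
    }

    public static void main(String[] args) {
        ReconnectBackoff http = forHttpErrors();
        for (int attempt = 1; attempt <= 12; attempt++) {
            System.out.println("attempt " + attempt + ": wait " + http.nextDelayMs() + " ms");
        }
    }
}
```

The reconnect loop itself would sleep for nextDelayMs() before each new attempt and call reset() after a successful connect, so there is never a retry loop without a sleep in it.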


[twitter-dev] best practice for dealing with streaming api connection close

2009-06-14 Thread AJ

The streaming api is great, but it sometimes closes the connection for
whatever reason. My realtime system must figure out when to reconnect
automatically.  The auto-reconnection can't blindly request a
connection whenever it is not connected, otherwise it will flood the
api and may cause the api to ban or refuse the user's request. It's
bad to bombard the api server with repeated connection requests.
Could the api team recommend some best practice for dealing with
auto-reconnection?

Maybe a certain error code or error message can indicate the cause of
the dropped connection and the wait time before the next connection request.
I just got a long list of exceptions from the streaming api as a result of
repeated connection attempts, and the different messages are:

twitter4j.TwitterException: Address already in use: connect
twitter4j.TwitterException: Authentication credentials were missing or
incorrect.
twitter4j.TwitterException: Connection refused: connect
twitter4j.TwitterException: No route to host: connect
twitter4j.TwitterException: Stream closed.
twitter4j.TwitterException: The request is understood, but it has been
refused.  An accompanying error message will explain why.
twitter4j.TwitterException: connect timed out

How can I prevent this situation of repeated connection requests?

thanks,
aj
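
One way to act on messages like the ones listed above is to sort them into a few kinds of failure before deciding how long to wait. The Java sketch below does that by matching on the message text. The grouping is an assumption on my part (the two "credentials" and "refused" messages look like HTTP-level responses, the connect failures look like TCP/IP-level problems), and the class and enum names are hypothetical, not part of twitter4j. Matching on message strings is brittle; if the library exposes a status code, that would be the better thing to test.

```java
import java.util.Locale;

/** Sketch: group streaming failure messages into kinds that call for different waits. */
class StreamFailureClassifier {
    enum Kind { DROPPED_STREAM, NETWORK_ERROR, HTTP_ERROR, UNKNOWN }

    static Kind classify(String message) {
        if (message == null) {
            return Kind.UNKNOWN;
        }
        String m = message.toLowerCase(Locale.ROOT);
        // An established stream that ended.
        if (m.contains("stream closed")) {
            return Kind.DROPPED_STREAM;
        }
        // Texts that read like HTTP-level refusals (assumed, not confirmed by the list).
        if (m.contains("authentication credentials") || m.contains("has been refused")) {
            return Kind.HTTP_ERROR;
        }
        // Failures while trying to open the socket.
        if (m.contains("connection refused") || m.contains("no route to host")
                || m.contains("timed out") || m.contains("address already in use")) {
            return Kind.NETWORK_ERROR;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Address already in use: connect",
            "Authentication credentials were missing or incorrect.",
            "Connection refused: connect",
            "Stream closed.",
            "connect timed out"
        };
        for (String s : samples) {
            System.out.println(classify(s) + " <- " + s);
        }
    }
}
```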


[twitter-dev] Re: WWDC Twitter developer meetup at Twitter HQ: RSVP!

2009-06-08 Thread AJ Chen
I'm coming, too. could someone provide the exact address for the meetup?
thanks,
-aj

On Sun, Jun 7, 2009 at 10:35 AM, Mark Paine  wrote:

>
> I'm in.
>
> -Mark
>
>
> On May 21, 2:18 pm, Alex Payne  wrote:
> > Hi all,
> >
> > There's great crossover between Twitter API developers and Mac/iPhone
> > developers. Andrew Stone, developer of Twittelator Pro, suggested that
> > we all get together during WWDC and coordinate around the Apple Push
> > Notification Service and other issues of mutual interest. Twitter's
> > offices are just a few blocks from Moscone, so it should be easy for
> > any interested coders to make it over here.
> >
> > Please RSVP with a reply to this thread and let us know what dates and
> > times work for you. Andrew was thinking early one morning, but not
> > being much of a morning person, I'd prefer something later in the day.
> > We'll let group consensus decide.
> >
> > Thanks, and hope to see you in early June.
> >
> > --
> > Alex Payne - API Lead, Twitter, Inc. http://twitter.com/al3x
>



-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
Technical Architect, healthline.com
http://web2express.org
Palo Alto, CA


[twitter-dev] interesting community events

2009-05-22 Thread AJ Chen
I thought the following events may be interesting to some of you:

1.  "hacking the semantics of Twitter". 6:30-9pm, July 1, palo alto, ca.
Agenda:
  - Doug will introduce the twitter API
  - presentations from developers and companies.  You are encouraged to
present your twitter app. There are proposals from five speakers already,
but I'd like to accommodate as many speakers as possible. If you think your
twitter app is cool, drop me an email.
The detailed program is on my web2express.org website
<http://web2express.org/openlab/sdforum-semantic-web-sig/> (it will be moved
to the sdforum.org website after June 16).

2. 2009 Semantic Technology Conference <http://semtech2009.com/>, June
14-18, San Jose, California. This is the largest annual conference on
semantic technology. Lots of information on content analysis, social
networking, search, intelligence, etc.  I'm organizing the panel "entering
web intelligence under cover", which I'll moderate as well. I plan to
highlight intelligent twitter applications in my opening remarks. There is
free exhibit registration prior to the panel. Enter the coupon code ST9A4
and use this link to register:
https://www.regonline.com/?eventID=677058&rTypeID=131026
Save $300 on conference registration fees when using the coupon code
ST9A4 for a paid registration.  To register for conference sessions:
http://www.semtech2009.com/2009/registration/

regards,
-aj
-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
Technical Architect, healthline.com
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: The Data Mining Feed has troubles?

2009-05-22 Thread AJ Chen
The data mining feed has not been functional for several days, and it will be
phased out shortly according to Alex's email yesterday. The better
replacement is the streaming API.
-aj

On Fri, May 22, 2009 at 5:45 AM, junki  wrote:

>
> Hi, there.
>
> I got the rights to use "The Data Mining Feed" about 1 month ago.
> Since then, I could obtain 600 recent public statuses per minute
> with my shell script.
>
> But for the last few days, I could get only 20 tweets. "20" is what the
> normal API method returns.
> Does it mean that I was banned by the Twitter API Team?
> The account allowed to use "The Data Mining Feed" is http://twitter.com/jonki_bot
>
> Thanks for your help.
>
> ---
> Junki OHMURA (http://twitter.com/jonki)
>



-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
Technical Architect, healthline.com
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: Twitter4J 2.0.4 released - added Streaming API support & fixed OAuth compatibility issue

2009-05-19 Thread AJ Chen
Yusuke, thank you for providing twitter4j api. it's very useful.
-aj

On Tue, May 19, 2009 at 2:07 AM, Yusuke Yamamoto  wrote:

> Hi all,
>
> Twitter4J 2.0.4 is available for download.
> http://yusuke.homeip.net/twitter4j/en/index.html#download
> It's also available at the Maven central repository.
> http://repo1.maven.org/maven2/net/homeip/yusuke/twitter4j/
>
> Previous versions have a compatibility issue with OAuth since May 13th.
> Projects requiring OAuth support need to migrate to this version.
>
> Compatibility notes:
> - retirement of ExtendedUser class
> The following methods return User or List<User> instead of ExtendedUser or
> List<ExtendedUser>:
> getUserDetail()
> verifyCredentials()
> updateProfile()
> updateProfileColors()
> getBlockingUsers()
> getAuthenticatedUser()
>
> The method signatures of TwitterListener and TwitterAdapter are changed
> accordingly.
>
> "ExtendedUser" and "UserWithStatus" class are retired(deleted) since the
> API returns extended user information with all methods.
> Use "User" class instead.
>
> - Streaming API support
> Now Twitter4J supports the Streaming API which is in alpha test phase.
> http://apiwiki.twitter.com/Streaming-API-Documentation
> Please read the above document from top to bottom carefully before you dive
> into TwitterStream
> <http://yusuke.homeip.net/twitter4j/en/javadoc/twitter4j/TwitterStream.html>.
> Note that the Streaming API is subject to change.
>
> Release Notes - Twitter4J - Version 2.0.4 - HTML format
>
> Bug
>
>    - [TFJ-142 <http://yusuke.homeip.net/jira/browse/TFJ-142>] -
>      DocumentBuilder.parse is not thread safe : NullPointerException at
>      AbstractDOMParser.startElement
>    - [TFJ-145 <http://yusuke.homeip.net/jira/browse/TFJ-145>] -
>      twitter4j.http.Response shouldn't be Serializable
>    - [TFJ-146 <http://yusuke.homeip.net/jira/browse/TFJ-146>] -
>      getUserDetail should be invocable from unauthenticated Twitter instances
>    - [TFJ-149 <http://yusuke.homeip.net/jira/browse/TFJ-149>] - OAuth
>      fails with "Invalid / expired Token" after May 13, 2009
>
> Improvement
>
>    - [TFJ-147 <http://yusuke.homeip.net/jira/browse/TFJ-147>] - retire
>      ExtendedUser and UserWithStatus
>
> New Feature
>
>    - [TFJ-139 <http://yusuke.homeip.net/jira/browse/TFJ-139>] - streaming
>      API support beta
>    - [TFJ-144 <http://yusuke.homeip.net/jira/browse/TFJ-144>] - Add
>      methods to retrieve blocking information
>
> Task
>
>    - [TFJ-143 <http://yusuke.homeip.net/jira/browse/TFJ-143>] -
>      Deprecation of following and notification elements
>
>
> Have fun!
> --
> Yusuke Yamamoto
> yus...@mac.com
> follow me at http://twitter.com/yusukeyamamoto
>
>


-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
Technical Architect, healthline.com
http://web2express.org
Palo Alto, CA


[twitter-dev] Re: twitter digest not available

2009-05-14 Thread AJ Chen
Yes, the daily hot topics may surprise many people. I could not believe what
I saw when the system went online for the first time a few months ago. If
you are used to reading tech news or WSJ, you may get a shock. The daily
conversations on twitter, and probably other social networking sites, are
mostly about TV shows, movies, games, and other entertainment stuff. But, on
the other hand, this also makes sense. People are talking about their lives
on social networking sites, and life is not all about technology and stock
market, at least for most ordinary people.

Web2express Digest does not cut or select topics. It just shows
whatever comes out of the ongoing conversations from millions of people. I
think we can learn a lot from this information in addition to becoming more
effective in navigating through the twitter sphere.

-aj
-- 
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA

On Thu, May 14, 2009 at 10:03 AM, Patrick Burrows
wrote:

>
> That's awesome, AJ.
>
> Though it hurts me in the opinion-of-humanity part of my brain to learn how
> heavily represented American Idol is on that list.
>
>
> --
> Patrick Burrows
> http://Categorical.ly (the Best Twitter Client Possible)
> @Categorically
>
> -Original Message-
> From: twitter-development-talk@googlegroups.com
> [mailto:twitter-development-t...@googlegroups.com] On Behalf Of AJ
> Sent: Wednesday, May 13, 2009 6:36 PM
> To: Twitter Development Talk
> Subject: [twitter-dev] twitter digest not available
>
>
> Hi, thanks to twitter's api and the api team, the data feed for data
> mining is just wonderful. I have put together a real time system that
> takes in the feed and does some NLP analysis on tweets using open
> tools like Open Calais and openNLP.  The results are freely available
> on http://web2express.org/.  Using this twiiter web app, you can
> spot   daily hot topics and for each hot topic, quickly find the top
> contributing twitter users. I hope this real time information will
> help users to understand the popular topics at any given moment and
> easily identify who to follow.
>
> Please let me know if you have any comment.
>
> -aj
> AJ Chen, PhD
> Co-Chair, Semantic Web SIG, sdforum.org
> http://web2express.org
> Palo Alto, CA
>
>


[twitter-dev] twitter digest not available

2009-05-13 Thread AJ

Hi, thanks to twitter's api and the api team, the data feed for data
mining is just wonderful. I have put together a real time system that
takes in the feed and does some NLP analysis on tweets using open
tools like Open Calais and openNLP.  The results are freely available
on http://web2express.org/.  Using this twitter web app, you can
spot daily hot topics and, for each hot topic, quickly find the top
contributing twitter users. I hope this real time information will
help users to understand the popular topics at any given moment and
easily identify who to follow.

Please let me know if you have any comment.

-aj
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
http://web2express.org
Palo Alto, CA


[twitter-dev] xml feed for data mining is not fresh

2009-05-09 Thread AJ

problem with data mining xml feed: the first call to api returns a
fresh feed; but subsequent calls return the same feed. maybe a cache.
did the server forget to return to normal operation?
-aj


[twitter-dev] Re: invalid xml char in public timeline

2009-05-06 Thread AJ Chen

the example xml feed I'm looking at has status ID from 1718273418 to
1718264182
-aj

On May 6, 7:21 pm, Cameron Kaiser  wrote:
> > I'm getting this xml parsing error all day long. I'm using jdom.jar
> > and pass twitter api xml response directly to build jdom document.
> > Looking at the xml file right now, but hope you can take at look at it
> > as well. I expect other jdom users may see the same error.
>
> > 2009-05-06 19:11:49,401 ERROR feed.XmlFetcher (XmlFetcher.java:run
> > (136)) - failed to fetch http://twitter.com/statuses/public_timeline.xml;
> > org.jdom.input.JDOMParseException: Error on line 8148: An invalid XML
> > character (Unicode: 0x19) was found in the element content of the
> > document.
>
> The problem is that the view moves so fast that it's unlikely it's still
> there. When you dump the raw data, what is the last status ID you see before
> it bugs out?
>
> --
>  personal:http://www.cameronkaiser.com/--
>   Cameron Kaiser * Floodgap Systems *www.floodgap.com* ckai...@floodgap.com
> -- It's the car, right? Chicks dig the car. -- "Batman Forever" 
> ---
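
To answer the question quoted above (which status was being read when the parser gave up), the raw response can be scanned as text instead of being parsed. The Java sketch below finds the first character that XML 1.0 does not allow and then looks up the id of the status element around it. It assumes each status is wrapped in a status element whose own id element comes before any nested user id; that layout, the class name, and the sample data in main are illustrative assumptions, not taken from the feed documentation.

```java
/** Sketch: find the first illegal XML character and the status it sits in. */
class InvalidCharLocator {
    /** XML 1.0 Char production; surrogates pass through so supplementary characters survive. */
    static boolean isLegalXmlChar(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || Character.isSurrogate(c);
    }

    /** Index of the first character XML 1.0 does not allow, or -1 if there is none. */
    static int firstIllegalIndex(String xml) {
        for (int i = 0; i < xml.length(); i++) {
            if (!isLegalXmlChar(xml.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    /** Id of the status element that starts at or before pos, or null if none is found. */
    static String enclosingStatusId(String xml, int pos) {
        int status = xml.lastIndexOf("<status>", pos);
        if (status < 0) {
            return null;
        }
        int open = xml.indexOf("<id>", status);
        int close = open < 0 ? -1 : xml.indexOf("</id>", open);
        if (close < 0) {
            return null;
        }
        return xml.substring(open + "<id>".length(), close).trim();
    }

    public static void main(String[] args) {
        // Made-up sample: the second status carries a 0x19 control character.
        String xml = "<statuses>"
                + "<status><id>1001</id><text>fine</text></status>"
                + "<status><id>1002</id><text>bad" + (char) 0x19 + "</text>"
                + "<user><id>42</id></user></status>"
                + "</statuses>";
        int pos = firstIllegalIndex(xml);
        System.out.println("first illegal char at index " + pos
                + ", status id " + enclosingStatusId(xml, pos));
    }
}
```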


[twitter-dev] Re: invalid xml char in public timeline

2009-05-06 Thread AJ Chen

I switched to using http://twitter.com/statuses/public_timeline.xml, and
jdom does not throw the same error.

I checked the xml spec and here is what I think:
the data mining feed
"http://twitter.com/statuses/public_timeline_partners_#.xml" returns an xml
document containing an invisible "end of medium" control character (see
http://www.fileformat.info/info/unicode/char/0019/index.htm) at line
8148.  This character has unicode 0x19, which is not a legal XML 1.0
char (see http://www.w3.org/TR/REC-xml/#charsets). The jdom parser
complains about this invalid char.

I think you can reproduce the problem by using any data mining feed
(xml format) and testing it with the jdom parser (code in my first email).

To solve the problem, please make sure no invalid xml char like the "end
of medium" character is inserted into the xml when the xml feed is
created. This error occurred often before, but today every data mining
xml feed has the invalid char.

thanks for helping,

-aj
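
Until the feed is fixed on the server side, a client-side workaround is to drop the offending characters before the parser sees them. The Java sketch below is one way to do that with only the standard library: a Reader wrapper that passes through the characters allowed by the XML 1.0 Char production and silently discards the rest. The class name is hypothetical. Wrapping the connection's stream in an InputStreamReader (assuming UTF-8) and then in this filter should give a Reader that can be handed to SAXBuilder.build(Reader) in place of the raw InputStream, though I have not tested it against the feed.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Sketch: a Reader that drops characters XML 1.0 does not allow. */
class XmlCharFilterReader extends FilterReader {
    XmlCharFilterReader(Reader in) {
        super(in);
    }

    /** XML 1.0 Char production; surrogates pass through so supplementary characters survive. */
    static boolean isLegalXmlChar(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || Character.isSurrogate(c);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n; // -1 at end of stream, 0 only when len is 0
            }
            int kept = off;
            for (int i = off; i < off + n; i++) {
                if (isLegalXmlChar(buf[i])) {
                    buf[kept++] = buf[i]; // compact legal chars to the front
                }
            }
            if (kept > off) {
                return kept - off;
            }
            // Everything in this chunk was dropped; read the next chunk.
        }
    }

    @Override
    public int read() throws IOException {
        char[] one = new char[1];
        return read(one, 0, 1) == -1 ? -1 : one[0];
    }

    /** Convenience for demos and tests: filter a whole string. */
    static String clean(String xml) {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[1024];
        try (Reader r = new XmlCharFilterReader(new StringReader(xml))) {
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String dirty = "<text>bad" + (char) 0x19 + "char</text>";
        System.out.println(clean(dirty));
    }
}
```

Dropping the character changes the tweet text slightly, so it may be worth logging how often the filter removes something.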


On May 6, 7:21 pm, Cameron Kaiser  wrote:
> > I'm getting this xml parsing error all day long. I'm using jdom.jar
> > and pass twitter api xml response directly to build jdom document.
> > Looking at the xml file right now, but hope you can take at look at it
> > as well. I expect other jdom users may see the same error.
>
> > 2009-05-06 19:11:49,401 ERROR feed.XmlFetcher (XmlFetcher.java:run
> > (136)) - failed to fetch http://twitter.com/statuses/public_timeline.xml;
> > org.jdom.input.JDOMParseException: Error on line 8148: An invalid XML
> > character (Unicode: 0x19) was found in the element content of the
> > document.
>
> The problem is that the view moves so fast that it's unlikely it's still
> there. When you dump the raw data, what is the last status ID you see before
> it bugs out?
>
> --
>  personal:http://www.cameronkaiser.com/--
>   Cameron Kaiser * Floodgap Systems *www.floodgap.com* ckai...@floodgap.com
> -- It's the car, right? Chicks dig the car. -- "Batman Forever" 
> ---


[twitter-dev] Re: invalid xml char in public timeline

2009-05-06 Thread AJ Chen

Alex,
I'm getting this xml parsing error all day long. I'm using jdom.jar
and passing the twitter api xml response directly to build a jdom document.
I'm looking at the xml file right now, but hope you can take a look at it
as well. I expect other jdom users may see the same error.

2009-05-06 19:11:49,401 ERROR feed.XmlFetcher (XmlFetcher.java:run
(136)) - failed to fetch http://twitter.com/statuses/public_timeline.xml;
org.jdom.input.JDOMParseException: Error on line 8148: An invalid XML
character (Unicode: 0x19) was found in the element content of the
document.

thanks,
-aj


On Apr 2, 11:45 am, Alex Payne  wrote:
> Following up: if you can point us to a status (by ID) that has this
> unwanted control character, we'll track down the source of the issue.
>
>
>
> On Wed, Apr 1, 2009 at 23:48, AJ  wrote:
>
> > I'm using the public timeline feed (for data mining) and frequently
> > see xml paring error like this:
> > org.jdom.input.JDOMParseException: Error on line 3016: An invalid XML
> > character (Unicode: 0x1) was found in the element content of the
> > document.
>
> > the error comes from building JDom document in the following code.
> >        SAXBuilder builder = new SAXBuilder();
> >        URL u = new URL( url );
> >        URLConnection conn = u.openConnection();
> >        if(agent != null){
> >            conn.setRequestProperty("User-Agent", agent);
> >        }
> >        doc = builder.build( conn.getInputStream() );
>
> > is this a know issue with public timeline feed? any good way to fix
> > this error?
> > -aj
>
> --
> Alex Payne - API Lead, Twitter, Inc. http://twitter.com/al3x


[twitter-dev] inviting developers to showcase twitter apps on SDForum July event

2009-04-27 Thread AJ

After just finalizing our next event program, which will be focused on
web intelligence and co-hosted with the semtech09 conference in June,
it’s time to plan on the July event for our SIG. I’d like to do
something different this time, that is, to allow as many developers as
possible to showcase their projects of using NLP/semantics for cool
applications.

Since Twitter is the most popular topic these days and they are so
generously opening their data stream to third-party developers, I
think it will serve as the best theme for this developer-focused
event.  I’d like to invite all developers to submit a proposal to
speak at the event. Any project using NLP or semantic technology to
make use of twitter data/api is welcome, including personal projects
actively pursued by developers or entrepreneurs.  Of course, anyone
from big companies or startups is encouraged to share his or her
twitter app.  The only requirement is that you should be able to show
how your code works. This is similar to what people do at codecamp.

If you are doing some cool stuff with twitter data, please send me a
description of what you would like to talk about. Everyone gets only
10 min to present/demo/code so that more projects can be
accommodated in the 2-hour event.

If the twitter API team has someone available, we could get them to
introduce twitter API to kick off the event, which I’d like to call
“Hacking the semantics of twitter”.

This semantic web SIG event is scheduled for July 1st in palo alto,
CA.  For more information about the SIG, please visit SDForum.org
website.

Best,
-aj

--
AJ Chen, PhD
Co-Chair, Semantic Web SIG, sdforum.org
Technical Architect, healthline.com
http://web2express.org
Palo Alto, CA


[twitter-dev] invalid xml char in public timeline

2009-04-02 Thread AJ

I'm using the public timeline feed (for data mining) and frequently
see xml parsing errors like this:
org.jdom.input.JDOMParseException: Error on line 3016: An invalid XML
character (Unicode: 0x1) was found in the element content of the
document.

the error comes from building JDom document in the following code.
    SAXBuilder builder = new SAXBuilder();
    URL u = new URL( url );
    URLConnection conn = u.openConnection();
    if(agent != null){
        conn.setRequestProperty("User-Agent", agent);
    }
    doc = builder.build( conn.getInputStream() );

Is this a known issue with the public timeline feed? Any good way to fix
this error?
-aj


[twitter-dev] Re: Twitter IM: AIM, GTalk, Jabber, Etc..

2009-02-20 Thread AJ McKee

>
> I can't see anything about IM integration on the site, and all
> information is in old blog entries that seem to no longer apply. Does
> anyone know the *exact* status as to what's going to be done with IM
> integration or if this has been tabled?
>

IM is gone for the moment according to the twitter blog. However, you
could use a service such as GNIP and have them send you the tweets via
POST and from there have them sent to your bot.

Aj


data mining feeds

2008-11-17 Thread AJ

Hi Alex, I'm exploring the opencalais web service for analyzing tweets,
trying to see what topics are popular based on statistics. I would need
more data than the current limit allows.  Please let me know how I
could use the data mining feeds.  thanks a lot,
-aj
sdforum semantic web SIG