[twitter-dev] Re: Quick hack: using Twitter with Yahoo Placemaker to geolocate tweets

Brendan O'Connor Wed, 27 May 2009 19:03:25 -0700

On Wed, May 27, 2009 at 5:04 PM, Christian Heilmann <
[email protected]> wrote:


>
> http://isithackday.com/hacks/placemaker/tweet-locations.php?user=codepo8
>
> What do you think?


Hey, nicely done.  I like the maps.

Are you sending the raw tweet texts to the Yahoo Placemaker service?  Do you
try to use the tweet['user']['location'] data at all?
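
To make that second question concrete, here is a rough Python sketch of the idea. It assumes the tweet is already parsed into a dict shaped like the API's status object, with an optional free-text tweet['user']['location']. The helper name and the "profile first" ordering are my own assumptions, not anything the hack or Placemaker is known to do, and nothing here calls a geocoding service.

```python
def location_candidates(tweet):
    """Return (source, text) pairs that could be handed to a geocoder.

    Hypothetical helper: `tweet` is assumed to be a dict shaped like a
    Twitter status object. The profile location is listed first on the
    assumption that it is a more deliberate statement of place than the
    tweet body; that ordering is a guess, not a measured result.
    """
    candidates = []

    # The user-supplied profile location is free text and may be missing.
    user = tweet.get("user") or {}
    profile_location = (user.get("location") or "").strip()
    if profile_location:
        candidates.append(("profile", profile_location))

    # The raw tweet text is the noisier source of place names.
    text = (tweet.get("text") or "").strip()
    if text:
        candidates.append(("text", text))

    return candidates
```

Keeping the source label next to each string would let the caller treat a place found in the profile differently from one found in the tweet body.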

It's interesting to look at the quality level of this Yahoo service.
Unfortunately, it makes lots of mistakes.  I was looking at my own feed
(since I know what I was trying to talk about):
http://isithackday.com/hacks/placemaker/tweet-locations.php?user=brendan642

Out of 10 identifications, 5 of them are errors.

   - "#scala"   !=   "Monte Scala, Switzerland"
      - I meant the programming language.


   - "middle-of-the-street *valencia* parking"   !=   "valencia, CA"
      - that's a street name (in San Francisco).


   - "go easy on the *cancun*"   !=   "cancun, MX"
      - minor error: name of a (Mexican) restaurant.


   - "sports, *mission*, *bay* bridge"   !=   "mission bay, SF, CA"
      - that's a list of several things.  The "Mission Bay" neighborhood is
      not one of them; "bay" is part of the multiword "Bay Bridge".


and most humorously,

   - "giant *ec2* nodes"   !=   "EC2 area code, London, England"


... I haven't used this Yahoo service before, but I bet that, if it's any
good at all, it's probably optimized for web pages or big documents, where
there are many more context words to help disambiguate and safely identify.
There hasn't been a ton of NLP research on really short Twitter-length
messages, and I suspect the problem is harder, and might require somewhat
different algorithms, than document-sized NLP problems.
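
As one tiny illustration of a short-message heuristic, the "#scala" case could be caught by refusing any place match that overlaps a hashtag. This is only a sketch under assumptions: the (start, end, place_name) span format is hypothetical and is not Placemaker's actual output, and the rule would not help with the other four errors above.

```python
import re

# A hashtag is "#" followed by word characters, e.g. "#scala".
HASHTAG = re.compile(r"#\w+")


def drop_hashtag_matches(text, matches):
    """Keep only place matches that do not overlap a hashtag in `text`.

    `matches` is a list of (start, end, place_name) character spans.
    That format is an assumption made for this sketch.
    """
    hashtag_spans = [m.span() for m in HASHTAG.finditer(text)]
    kept = []
    for start, end, place in matches:
        # Two half-open spans overlap when each starts before the other ends.
        overlaps = any(start < h_end and end > h_start
                       for h_start, h_end in hashtag_spans)
        if not overlaps:
            kept.append((start, end, place))
    return kept
```

A filter like this trades recall for precision: a tweet tagged with a real place name as a hashtag would lose that match too.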

Are there any applications for this where a 50% error rate is OK?

-Brendan

-- 
Brendan O'Connor - http://anyall.org
