[twitter-dev] Re: Search queries not working

2009-04-02 Thread feedbackmine

Hi Matt,

I have tried the language parameter of Twitter search and found the
results to be very unreliable. For example,
http://search.twitter.com/search?lang=all&q=tweetjobsearch returns 10
results (all in English), but
http://search.twitter.com/search?lang=en&q=tweetjobsearch returns only
3.
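
For anyone reproducing the comparison, here is a minimal sketch that builds the two URLs. The host, path, and the `lang` and `q` parameter names are taken from the URLs quoted above; the helper name `search_url` is only illustrative.

```ruby
require 'uri'

# Build a search URL with the language filter and query term.
# encode_www_form joins the pairs with "&" and escapes the values.
def search_url(query, lang)
  params = URI.encode_www_form('lang' => lang, 'q' => query)
  "http://search.twitter.com/search?#{params}"
end

puts search_url('tweetjobsearch', 'all')
puts search_url('tweetjobsearch', 'en')
```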

I searched this list and it seems you are using an n-gram based algorithm
(http://groups.google.com/group/twitter-development-talk/msg/565313d7b36e8d65).
I have found that n-gram algorithms work very well for language detection,
but the quality of the training data can make a big difference.

I recently developed a language detector (in Ruby) myself:
http://github.com/feedbackmine/language_detector/tree/master
It uses Wikipedia data for training, and in my limited experience it
works well. Using Wikipedia data was not my idea; all credit goes to
Kevin Burton
(http://feedblog.org/2005/08/19/ngram-language-categorization-source/).
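
To illustrate the general idea of n-gram language detection, here is a toy sketch. It is not the code from language_detector or from Twitter's identifier. It ranks character trigrams by frequency and compares profiles with a rank-order ("out of place") distance. The two training sentences are hypothetical stand-ins for a real corpus such as Wikipedia text, and every name in it is made up for the example.

```ruby
# Toy training data (hypothetical); a real detector needs far more text.
TRAINING = {
  'en' => 'the quick brown fox jumps over the lazy dog and then runs away from the house',
  'es' => 'el rapido zorro marron salta sobre el perro perezoso y luego corre lejos de la casa'
}

PROFILE_SIZE = 300 # keep only the most frequent trigrams per profile

# Return the text's character trigrams, most frequent first.
# This toy version only looks at ASCII letters.
def trigram_profile(text)
  normalized = text.downcase.gsub(/[^a-z]+/, ' ').strip
  counts = Hash.new(0)
  normalized.chars.each_cons(3) { |gram| counts[gram.join] += 1 }
  counts.sort_by { |gram, count| [-count, gram] }.first(PROFILE_SIZE).map(&:first)
end

# Sum of rank differences between the two profiles; a trigram missing
# from the language profile gets the maximum penalty.
def out_of_place(doc_profile, lang_profile)
  ranks = {}
  lang_profile.each_with_index { |gram, i| ranks[gram] = i }
  doc_profile.each_with_index.sum do |gram, i|
    ranks.key?(gram) ? (ranks[gram] - i).abs : PROFILE_SIZE
  end
end

PROFILES = TRAINING.transform_values { |text| trigram_profile(text) }

# Pick the language whose profile is closest to the text's profile.
def detect_language(text)
  doc = trigram_profile(text)
  PROFILES.min_by { |_lang, profile| out_of_place(doc, profile) }.first
end

puts detect_language('the lazy dog runs away')
puts detect_language('el perro corre lejos')
```

With training text this small the result is only meaningful for inputs that overlap the samples, which is the point about training data quality made above.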

Just thought you might be interested.

@feedbackmine
http://twitter.com/feedbackmine

On Mar 31, 11:22 am, Matt Sanford m...@twitter.com wrote:
 Hi there,

      Can you provide an example URL where since_id isn't working so I
 can try and reproduce the issue? As for language, the language
 identifier is not 100% accurate and sometimes makes mistakes. Hopefully not
 too many mistakes but it definitely does.

 Thanks;
    — Matt Sanford / @mzsanford

 On Mar 31, 2009, at 08:14 AM, codepuke wrote:

  Hi all;

  I see a few people complaining about the since_id not working.  I too
  have the same issue - I am currently storing the last executed id and
  having to check new tweets to make sure their id is greater than my
  last processed id as a temporary workaround.

  I have also noticed that the filter by language param doesn't
  seem to be working 100% - I notice a few Chinese tweets, as well as
  tweets having a null value for language...
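
The since_id workaround described in the quoted message (store the last processed id and drop anything that is not newer) could be sketched as below. The class name and the shape of a tweet (a hash with an integer 'id' key) are assumptions for illustration; the real API response format is not shown in this thread.

```ruby
# Client-side guard: remember the highest tweet id already processed
# and ignore results whose id is not greater than it.
class TweetFilter
  attr_reader :last_id

  def initialize(last_id = 0)
    @last_id = last_id
  end

  # Return only unseen tweets (oldest first) and advance the stored id.
  def fresh(tweets)
    unseen = tweets.select { |t| t['id'] > @last_id }.sort_by { |t| t['id'] }
    @last_id = unseen.last['id'] unless unseen.empty?
    unseen
  end
end

filter = TweetFilter.new(100)
batch = [{ 'id' => 99 }, { 'id' => 105 }, { 'id' => 101 }]
puts filter.fresh(batch).map { |t| t['id'] }.inspect
puts filter.last_id
```

Calling `fresh` again with the same batch returns an empty array, so a repeated or stale result page does no harm.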


[twitter-dev] Open source twitter job search engine

2009-03-08 Thread feedbackmine

Hello all,
  I developed a job search engine for Twitter over two weekends, and
thought someone might be interested.

 A few highlights:
 1. uses Twitter data mining feeds to collect data
 2. uses a libsvm classifier and a few hardcoded Twitter IDs to identify
job posts
 3. it is a Rails web app and uses Sphinx for search

 Lessons learned:
 There are lots of recruiters out there (way more than I expected!)
using Twitter to re-publish jobs that they have published somewhere
else. Originally I just wanted to identify job posts that are ONLY
available on Twitter.

 Demo site is at: http://tweetjobsearch.com/
 Source code is at: http://github.com/feedbackmine/tweetjobsearch/tree/master
 I have documented the journey at: http://twitter.com/feedbackmine

 Thanks!