Cool! This is a part-of-speech tagging toolkit for Twitter:
http://www.ark.cs.cmu.edu/TweetNLP/

It's great that there is an NLP ecosystem developing around this new "grammar". Are there Twitter monitoring services which use this type of tool to fine-tune relevance? That would be a cool and resume-enhancing technical report.

Lance

On 09/20/2013 10:59 AM, Michael Schmitz wrote:
You might find this package helpful--it's specifically for NER and tweets.

https://github.com/aritter/twitter_nlp

Peace.  Michael

On Fri, Sep 13, 2013 at 3:49 AM, Siva Sakthi <[email protected]> wrote:
Hi,
   We are using OpenNLP to find organizations (code below).

e.g.

1. Find out how Intel Xeon processors help make #EMC number 1 in backup at
#IDF13 going on now in San Francisco. #Speed2Lead Protect your data
OpenNLP returns "Intel" for the above sentence.

2. NYPD Intel Division Chief Lashes Out At FBI Over Failed Terrorist Plot
http://t.co/V0XLKrp3TI
OpenNLP returns "Intel Division Chief Lashes".

Issue 1: I don't understand why it returns a composite string in the second
case, instead of just "Intel".
Issue 2: The "Intel" in the second sentence does not refer to the company
Intel; it is short for "Intelligence".

My code is as follows:

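     import java.io.FileInputStream;
     import java.io.InputStream;
     import opennlp.tools.namefind.NameFinderME;
     import opennlp.tools.namefind.TokenNameFinderModel;
     import opennlp.tools.util.Span;

     // Returns the organization names OpenNLP finds in the message, one per line.
     public static String findOrg(String message) throws Exception {
         String[] words = message.split(" ");
         // try-with-resources closes the model stream even if loading fails.
         try (InputStream orgIs = new FileInputStream("en-ner-organization.bin")) {
             TokenNameFinderModel tnf = new TokenNameFinderModel(orgIs);
             NameFinderME nf = new NameFinderME(tnf);
             Span[] spans = nf.find(words);
             String[] names = Span.spansToStrings(spans, words);
             StringBuilder sb = new StringBuilder();
             for (String name : names) {
                 sb.append(name).append('\n');
             }
             return sb.toString();
         }
     }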
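
One thing that may be worth checking in the code above is the tokenization: splitting on a single space leaves hashtags, URLs and trailing punctuation attached to the words handed to the name finder. The following is only a minimal sketch of a tweet-oriented cleanup step, using the standard library alone. The class name `TweetTokens` and its `tokenize` method are hypothetical, not part of OpenNLP, and it is an assumption (not verified here) that cleaner tokens change the spans returned for either example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: cleans up tweet text before NER.
final class TweetTokens {
    private TweetTokens() {}

    // Splits on runs of whitespace, drops URLs, and strips punctuation
    // (including '#' and '@') from the start and end of each token.
    static String[] tokenize(String message) {
        List<String> tokens = new ArrayList<>();
        for (String raw : message.trim().split("\\s+")) {
            if (raw.startsWith("http://") || raw.startsWith("https://")) {
                continue; // skip links such as t.co URLs
            }
            String tok = raw.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
            if (!tok.isEmpty()) {
                tokens.add(tok);
            }
        }
        return tokens.toArray(new String[0]);
    }
}
```

In findOrg, the result of `TweetTokens.tokenize(message)` could be used in place of `message.split(" ")`. Note that this sketch also discards sentence-final punctuation, which a model may rely on, so it is worth comparing the output both ways.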

Thanks,
Ss
