No, it is not. The right answer would be to explain the features used in the model.
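
To make "the features used in the model" concrete: a MaxEnt name finder scores each token from contextual features, typically things like the token itself, a coarse word-shape class, and the same information for neighbouring tokens. The sketch below is not OpenNLP's feature generator. The class name NerFeatureSketch, the feature labels (w=, wc=, p1w=, n1w=, ...) and the one-token window are all hypothetical, chosen only to show how "Intel" in the two example tweets ends up with different evidence around it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: NOT OpenNLP's feature generator.
// Class name, feature labels and window size are assumptions.
class NerFeatureSketch {

    // Coarse word shape: "ac" = all caps, "ic" = initial cap,
    // "lc" = all lowercase, "other" = mixed case or non-letters.
    static String shape(String token) {
        if (token.isEmpty() || !token.chars().allMatch(Character::isLetter)) {
            return "other";
        }
        if (token.equals(token.toUpperCase(Locale.ROOT))) return "ac";
        if (Character.isUpperCase(token.charAt(0))) return "ic";
        if (token.equals(token.toLowerCase(Locale.ROOT))) return "lc";
        return "other";
    }

    // Features for tokens[i]: the token, its shape, and the same
    // for one token to the left and one to the right.
    static List<String> features(String[] tokens, int i) {
        List<String> f = new ArrayList<>();
        f.add("w=" + tokens[i].toLowerCase(Locale.ROOT));
        f.add("wc=" + shape(tokens[i]));
        if (i > 0) {
            f.add("p1w=" + tokens[i - 1].toLowerCase(Locale.ROOT));
            f.add("p1wc=" + shape(tokens[i - 1]));
        } else {
            f.add("p1=bos"); // beginning of sentence
        }
        if (i + 1 < tokens.length) {
            f.add("n1w=" + tokens[i + 1].toLowerCase(Locale.ROOT));
            f.add("n1wc=" + shape(tokens[i + 1]));
        } else {
            f.add("n1=eos"); // end of sentence
        }
        return f;
    }

    public static void main(String[] args) {
        // Fragments of the two tweets from the original question,
        // split on spaces the same way the posted code does.
        String[] tweet1 = "Find out how Intel Xeon processors help".split(" ");
        String[] tweet2 = "NYPD Intel Division Chief Lashes Out".split(" ");
        System.out.println(features(tweet1, 3)); // "Intel" in tweet 1
        System.out.println(features(tweet2, 1)); // "Intel" in tweet 2
    }
}
```

In both fragments the token and its own shape are identical, so whatever separates the two cases has to come from the context: a lowercase "how" on the left in the first tweet, an all-caps "NYPD" on the left and a run of capitalized words on the right in the second. That may be why the model extends the span in the second tweet, but confirming it would require inspecting the real model's features and weights.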

Sent from my iPhone

On Sep 21, 2013, at 1:41 PM, Lance Norskog <[email protected]> wrote:

> And yet it is the right one. How odd.
>
> On 09/20/2013 11:16 AM, [email protected] wrote:
>> That is such a poor answer
>>
>>
>> Sent from my iPhone
>>
>> On Sep 20, 2013, at 11:11 AM, Jeffrey Mershon <[email protected]> wrote:
>>
>>> Siva,
>>>
>>> I'm assuming there is nothing wrong with your code. OpenNLP's named-entity
>>> recognizer is based on MaxEnt modeling, as opposed to rule-based
>>> programming, to identify named entities. So, the answer to "Why did OpenNLP
>>> return X as an organization" is always going to be "Because it was trained
>>> to do so". If the training set--that is, the set of sentences used to train
>>> the recognition model that you are using--does not possess similar
>>> characteristics to the sentences you are using that model to process, you
>>> are going to get sub-optimal results.
>>>
>>> It looks to me as if you are processing tweets. If you're using the default
>>> recognizer, I doubt very much whether that was trained on tweets, and
>>> tweets possess very different characteristics than regular prose.
>>> Consequently, I suggest that you consider training a model using data that
>>> represents what you want to actually process.
>>>
>>> In the examples you give, Intel is a company name in one case and a slang
>>> term (contraction of Intelligence) in another. You may find that it is not
>>> possible to train just one model to handle all cases. You might need
>>> individual strategies for different industries, depending on what you are
>>> trying to achieve. Good Luck.
>>>
>>> Regards,
>>>
>>> Jeff
>>>
>>>
>>> On Fri, Sep 20, 2013 at 2:59 AM, Siva Sakthi <[email protected]> wrote:
>>>
>>>> Can anyone answer the above question???
>>>>
>>>> Thanks
>>>>
>>>>
>>>> On Fri, Sep 13, 2013 at 4:19 PM, Siva Sakthi <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> We are using OpenNLP for finding organizations (code below).
>>>>>
>>>>> e.g.
>>>>>
>>>>> 1. Find out how Intel Xeon processors help make #EMC number 1 in backup at
>>>>> #IDF13 going on now in San Francisco. #Speed2Lead Protect your data
>>>>> OpenNLP returns "Intel" in the above sentence.
>>>>>
>>>>> 2. NYPD Intel Division Chief Lashes Out At FBI Over Failed Terrorist Plot
>>>>> http://t.co/V0XLKrp3TI
>>>>> OpenNLP returns "Intel Division Chief Lashes".
>>>>>
>>>>> Issue 1: I don't understand why it returns a composite string in the
>>>>> second case, instead of just Intel.
>>>>> Issue 2: The "Intel" in the second sentence is not really "Intel".
>>>>>
>>>>> My code is as follows:
>>>>>
>>>>> public static String findOrg(String message) throws Exception {
>>>>>     String[] words = message.split(" ");
>>>>>     InputStream orgIs = new FileInputStream("en-ner-organization.bin");
>>>>>     TokenNameFinderModel tnf = new TokenNameFinderModel(orgIs);
>>>>>     NameFinderME nf = new NameFinderME(tnf);
>>>>>     Span sp[] = nf.find(words);
>>>>>     String a[] = Span.spansToStrings(sp, words);
>>>>>     StringBuilder sb = new StringBuilder();
>>>>>     int l = a.length;
>>>>>
>>>>>     for (int j = 0; j < l; j++) {
>>>>>         sb = sb.append(a[j] + "\n");
>>>>>     }
>>>>>
>>>>>     return sb.toString();
>>>>> }
>>>>>
>>>>> Thanks,
>>>>> Ss
>>>>>
